Artificial Intelligence and Data Science in Recommendation System: Current Trends, Technologies and Applications 9815136763, 9789815136760


English Pages [319]


Table of contents :
Cover
Title
Copyright
End User License Agreement
Contents
Foreword
Preface
List of Contributors
Study of Machine Learning for Recommendation Systems
Tushar Deshpande1,*, Khushi Chavan1 and Ramchandra Mangrulkar1
INTRODUCTION
Recommendation System
Machine Learning
Supervised learning
Semi-supervised learning
Unsupervised learning
Reinforcement learning
METHODS
Collaborative Filtering
Model-Based
Memory-Based
Content-based Filtering
Hybrid Filtering
Algorithms
Co-clustering
Matrix Factorization
K-Nearest Neighbors
K-means Clustering
Naive Bayes
Random Forest
Evaluation Methods
F1. Measure
RMSE (Root Mean Squared Error)
MAE (Mean Absolute Error)
EXPERIMENTATION
Dataset
Implementation
Result
DISCUSSION
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
Machine Learning Approaches for Text Mining and Spam E-mail Filtering: Industry 4.0 Perspective
An Overview of Deep Learning-Based Recommendation Systems and Evaluation Metrics
Towards Recommender Systems Integrating Contextual Information from Multiple Domains through Tensor Factorization
Douglas Véras1,*, André Nascimento1 and Gustavo Callou1
INTRODUCTION
Problem Statement
CD-CARS Overview
LITERATURE REVIEW
Cross-Domain RS
Definition of Domain
Cross-Domain Recommendation Tasks
Cross-Domain Recommendation Goals
Cross-Domain Recommendation Scenarios
Cross-domain Methods
Context-Aware Recommender Systems
Definition of Context
Obtaining Contextual Information
Contextual Information Relevance and availability
Context-Aware Approaches
“Ad-hoc” Cross-Domain Context-Aware Recommender Systems
SYSTEMATIC CROSS-DOMAIN CONTEXT-AWARE RECOMMENDER SYSTEMS
CD-CARS Problem Formalization
Contextual Information Modelling
Contextual Features Formalization
Obtaining and Choosing Relevant Contextual Information
CD-CARS Algorithms
Base Cross-Domain Algorithms
CD-CARS Evaluation
Evaluation of Data Partitioning
Sensitivity Analysis
Discussion
CONCLUSION AND RESEARCH DIRECTIONS
ACKNOWLEDGMENT
REFERENCES
Developing a Content-based Recommender System for Author Specialization using Topic Modelling and Ranking Framework
Shilpa Verma1,*, Rajesh Bhatia1 and Sandeep Harit1
INTRODUCTION
RELATED WORK
PROBLEM DESCRIPTION
HADOOP-BASED TOPIC MODELLING SYSTEM TO IDENTIFY AUTHOR SPECIALIZATION
Text Vectorization
Mapper
Reducer
INFLUENCE OF NODES AND MULTI-CRITERIA RANKING MODEL
EXPERIMENTAL SETUP AND DISCUSSION
Dataset Used
Pre-processing Step
Results of Hadoop-based Topic Modeling
Result of Ranking Model
CONCLUSION AND FUTURE SCOPE
ACKNOWLEDGEMENT
REFERENCES
Movie Recommendations
Sentiment Analysis for Movie Reviews
Balajee Maram1,*, Suneetha Merugula2 and Santhosh Kumar Balan3
INTRODUCTION
SENTIMENT ANALYSIS
LITERATURE SURVEY
PROPOSED WORK
Sentiment Analysis
Opinion Mining
TECHNICAL DESCRIPTION
Input Dataset
Dataset Description
Data Preprocessing
Deep Learning
Supervised Learning
METHODOLOGY
Random Forest
Long Short-Term Memory
Bi-Directional Long Short-Term Memory
RESULTS AND DISCUSSIONS
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
A Movie Recommender System with Collaborative and Content Filtering
Anupama Angadi1, Padmaja Poosapati1, Satya Keerthi Gorripati2 and Balajee Maram3,*
INTRODUCTION
RELATED WORK
Limitations
Proposals of New Similarity Metrics
Accuracy
BACKGROUND
CATEGORIES OF RECOMMENDER SYSTEMS
Collaborative Recommender Systems
Memory-Based Collaborative Filtering
Model-based Collaborative Filtering
Content Recommender System
ALGORITHMS
Nearest-Neighbors
Matrix Factorization Methods
Clustering-Based RS
SIMILARITY METRICS
User-Based Collaborative Recommender System
Finding Nearest Neighbors using Jaccard Similarity
Finding Nearest Neighbors using Cosine Similarity
Nearest Neighbors using Pearson Similarity
Nearest Neighbors using Mean Square Difference Similarity
Item-Based Collaborative System
Nearest Products using Pearson Similarity
Content-Based Filters
Data Pre-processing
Vectorization
TF-IDF
Word Embeddings
Limitations
Topic Modelling
EVALUATION METRICS
Precision and Recall
MAE
RMSE
CONCLUSION AND FUTURE WORK
ACKNOWLEDGEMENTS
REFERENCES
An Introduction to Various Parameters of the Point of Interest
Shreya Roy1,*, Abhishek Majumder1,* and Joy Lal Sarkar1,*
INTRODUCTION
IMPACT OF VARIOUS PARAMETERS ON POI RECOMMENDATION
Users’ Interest-Based Recommendation
Location Popularity-Based Recommendation
Weather Based Recommendation
Cost Effective Recommendation
SUMMARY
CONCLUSION AND FUTURE SCOPE
ACKNOWLEDGMENTS
REFERENCES
Mobile Tourism Recommendation System for Visually Disabled
Pooja Selvarajan1, Poovizhi Selvan1,*, Vidhushavarshini Sureshkumar1 and Sathiyabhama Balasubramaniam1
INTRODUCTION
PROPOSED WORK
Recommendation Systems
Collaborative Recommender Systems
A Content-based Recommender
Hybrid Recommendation System
MAPPING TECHNOLOGIES
Tipping
Proximo
Geo Notes
Macau Map
Microsoft Planner
Tourist Guide
Cyber Guide
Context-Aware Tourist Information System
Deep Map
Tour Planning Research
Artificial Language Experimental Assistant Internet (ALEXA)
SOLUTION STRATEGY
CONCLUSION
FUTURE WORK
ACKNOWLEDGEMENT
REFERENCES
Point of Interest Recommendation via Tensor Factorization
Shreya Roy1,*, Abhishek Majumder1 and Joy Lal Sarkar1
INTRODUCTION
Influential Factors of POI Recommendation
Pure Check-in Based POI Recommendations
Geographical Influence Enhanced POI Recommendation
Social Influence Enhanced POI Recommendation
Temporal Influence Enhanced POI Recommendation
A Brief Introduction to Tensors
LITERATURE SURVEY ON RECOMMENDATION SYSTEM VIA TENSOR FACTORIZATION
Hotel Recommendation
Advantages
Disadvantages
Recommendation in the Travel Decision-making Process
Advantages
Disadvantages
Location-Based Social Networks for POI Recommendation
Time-Aware Preference Mining
Tensor Factorization
Advantages
Disadvantages
POI Recommendation Based on Weather Context
Context Inference and Modeling
Construction of Tensor and Feature Matrix
Collaborative Tensor Decomposition
POI Recommendation
Advantages
Disadvantages
POI Recommendation with Category Transition and Temporal Influence
Advantages
Disadvantages
CONCLUSION AND FUTURE SCOPE
ACKNOWLEDGMENTS
REFERENCES
Exploring the Usage of Data Science Techniques for Assessment and Prediction of Fashion Retail - A Case Study Approach
Dillip Rout1,*
INTRODUCTION
PREVIOUS WORKS
Goal and Objectives
Proposed Framework
Data Preprocessing
Feature Engineering
Predictive Analysis
Experimental Study
Data Description and Preparation
Issues and Resolution of Data
Exploratory Analysis
Feature Engineering
Impact of Rating on Sales
Impact of Material, Price, Season and Style on Sales
Predictive Analysis
Automation of Recommendations
Sales Forecast
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
Data Analytics in Human Resource Recruitment and Selection
Sumi Kizhakke Valiyatra1,*
INTRODUCTION
RECRUITMENT ANALYTICS
Procedure for Recruitment Analytics
OPERATIONAL REPORTING
Recruiting Metrics
The Number of Days that Have Passed Since Time to Fill
Quality of Hire
Artificial Intelligence in Screening
Artificial Intelligence in Online Assessments
Artificial Intelligence in Job Interviews
Time to Hire
Cost Per Hire
First-year Attrition
Success Ratio Recruiting Metric
Employee Selection
Selection Ratio
Optimum Productivity Level (OPL)
Time to Productivity
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
A Personalized Artificial Neural Network for Rice Crop Yield Prediction
Pundru Chandra Shaker Reddy1,*, Alladi Sureshbabu2, Yadala Sucharitha2 and Goddumarri Surya Narayana3
INTRODUCTION
Traditional Crop Yield Forecasting Methods
Artificial Neural Networks
LITERATURE REVIEW
STUDY AREA AND DATASET DESCRIPTION
Study Area
Dataset Description
PROPOSED METHODOLOGY
P-ANN (Personalization of ANN)
MODEL EXECUTION AND EVALUATION
Comparative Analysis
CONCLUSION AND FUTURE WORKS
ACKNOWLEDGEMENTS
REFERENCES
Subject Index
Back Cover

Citation preview

Artificial Intelligence and Data Science in Recommendation System: Current Trends, Technologies and Applications

Edited By

Abhishek Majumder

Tripura University, Tripura, India

Joy Lal Sarkar

Tripura University, Tripura, India

& Arindam Majumder

NIT Agartala, Tripura 799046, India

Artificial Intelligence and Data Science in Recommendation System: Current Trends, Technologies and Applications

Editors: Abhishek Majumder, Joy Lal Sarkar and Arindam Majumder

ISBN (Online): 978-981-5136-74-6
ISBN (Print): 978-981-5136-75-3
ISBN (Paperback): 978-981-5136-76-0

© 2023, Bentham Books imprint. Published by Bentham Science Publishers Pte. Ltd. Singapore. All Rights Reserved. First published in 2023.

BSP-EB-PRO-9789815136746-TP-301-TC-14-PD-20230816

BENTHAM SCIENCE PUBLISHERS LTD.

End User License Agreement (for non-institutional, personal use) This is an agreement between you and Bentham Science Publishers Ltd. Please read this License Agreement carefully before using the ebook/echapter/ejournal (“Work”). Your use of the Work constitutes your agreement to the terms and conditions set forth in this License Agreement. If you do not agree to these terms and conditions then you should not use the Work. Bentham Science Publishers agrees to grant you a non-exclusive, non-transferable limited license to use the Work subject to and in accordance with the following terms and conditions. This License Agreement is for non-library, personal use only. For a library / institutional / multi user license in respect of the Work, please contact: [email protected].

Usage Rules: 1. All rights reserved: The Work is the subject of copyright and Bentham Science Publishers either owns the Work (and the copyright in it) or is licensed to distribute the Work. You shall not copy, reproduce, modify, remove, delete, augment, add to, publish, transmit, sell, resell, create derivative works from, or in any way exploit the Work or make the Work available for others to do any of the same, in any form or by any means, in whole or in part, in each case without the prior written permission of Bentham Science Publishers, unless stated otherwise in this License Agreement. 2. You may download a copy of the Work on one occasion to one personal computer (including tablet, laptop, desktop, or other such devices). You may make one back-up copy of the Work to avoid losing it. 3. The unauthorised use or distribution of copyrighted or other proprietary content is illegal and could subject you to liability for substantial money damages. You will be liable for any damage resulting from your misuse of the Work or any violation of this License Agreement, including any infringement by you of copyrights or proprietary rights.

Disclaimer: Bentham Science Publishers does not guarantee that the information in the Work is error-free, or warrant that it will meet your requirements or that access to the Work will be uninterrupted or error-free. The Work is provided "as is" without warranty of any kind, either express or implied or statutory, including, without limitation, implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the results and performance of the Work is assumed by you. No responsibility is assumed by Bentham Science Publishers, its staff, editors and/or authors for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products instruction, advertisements or ideas contained in the Work.

Limitation of Liability: In no event will Bentham Science Publishers, its staff, editors and/or authors, be liable for any damages, including, without limitation, special, incidental and/or consequential damages and/or damages for lost data and/or profits arising out of (whether directly or indirectly) the use or inability to use the Work. The entire liability of Bentham Science Publishers shall be limited to the amount actually paid by you for the Work.

General: 1. Any dispute or claim arising out of or in connection with this License Agreement or the Work (including non-contractual disputes or claims) will be governed by and construed in accordance with the laws of Singapore. Each party agrees that the courts of the state of Singapore shall have exclusive jurisdiction to settle any dispute or claim arising out of or in connection with this License Agreement or the Work (including non-contractual disputes or claims). 2. Your rights under this License Agreement will automatically terminate without notice and without the need for a court order if at any point you breach any terms of this License Agreement. In no event will any delay or failure by Bentham Science Publishers in enforcing your compliance with this License Agreement constitute a waiver of any of its rights. 3. You acknowledge that you have read this License Agreement, and agree to be bound by its terms and conditions. To the extent that any other terms and conditions presented on any website of Bentham Science Publishers conflict with, or are inconsistent with, the terms and conditions set out in this License Agreement, you acknowledge that the terms and conditions set out in this License Agreement shall prevail.

Bentham Science Publishers Pte. Ltd. 80 Robinson Road #02-00 Singapore 068898 Singapore Email: [email protected]


CONTENTS
FOREWORD
PREFACE
LIST OF CONTRIBUTORS
CHAPTER 1 STUDY OF MACHINE LEARNING FOR RECOMMENDATION SYSTEMS
Tushar Deshpande, Khushi Chavan and Ramchandra Mangrulkar
INTRODUCTION
Recommendation System
Machine Learning
Supervised learning
Semi-supervised learning
Unsupervised learning
Reinforcement learning
METHODS
Collaborative Filtering
Model-Based
Memory-Based
Content-based Filtering
Hybrid Filtering
Algorithms
Co-clustering
Matrix Factorization
K-Nearest Neighbors
K-means Clustering
Naive Bayes
Random Forest
Evaluation Methods
F1. Measure
RMSE (Root Mean Squared Error)
MAE (Mean Absolute Error)
EXPERIMENTATION
Dataset
Implementation
Result
DISCUSSION
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 2 MACHINE LEARNING APPROACHES FOR TEXT MINING AND SPAM EMAIL FILTERING: INDUSTRY 4.0 PERSPECTIVE
Pradeep Kumar, Abdul Wahid and Venkatesh Naganathan
INTRODUCTION
Integration and Interconnection
Data and Digitalization
Refinement and Personalization
Smart Manufacturing
Automated Vehicles and Machines
Quality Control
Predictive Maintenance
Demand Predictions
Chatbots
BACKGROUND & MOTIVATION
Spam Filtering Using Machine Learning Approaches
Data Pre-processing Techniques
Spam Filtering: A Comparative Study of Machine Learning Approaches
Data Repositories
Performance Measurement
MACHINE LEARNING APPROACHES
Decision Tree Modeling
Random Forest
Gradient Boosted Model (GBM)
AdaBoost Method
Naive Bayes Classification
Artificial Neural Network
Support Vector Machines
Tuning Hyper-parameters
EXPLORATORY DATA ANALYSIS
Experimental Inferences and Discussion
CONCLUDING REMARKS
CONSENT FOR PUBLICATION
CONFLICT OF INTEREST
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 3 AN OVERVIEW OF DEEP LEARNING-BASED RECOMMENDATION SYSTEMS AND EVALUATION METRICS
Samudrala Venkatesiah Sheela and Kotrike Rathnaiah Radhika
INTRODUCTION
RECOMMENDATION SYSTEMS
Content-based Recommendation
Collaborative Filtering Recommendation
Hybrid
DEEP LEARNING APPROACHES
Embedding
Generative Approach
Discriminative Approach
Hybrid Approach
DEEP LEARNING-BASED RECOMMENDATION SYSTEMS
Article Citation
Entertainment
E-commerce
Other Applications
EVALUATION METRICS
CONCLUSION
REFERENCES
CHAPTER 4 TOWARDS RECOMMENDER SYSTEMS INTEGRATING CONTEXTUAL INFORMATION FROM MULTIPLE DOMAINS THROUGH TENSOR FACTORIZATION
Douglas Véras, André Nascimento and Gustavo Callou
INTRODUCTION
Problem Statement
CD-CARS Overview
LITERATURE REVIEW
Cross-Domain RS
Definition of Domain
Cross-Domain Recommendation Tasks
Cross-Domain Recommendation Goals
Cross-Domain Recommendation Scenarios
Cross-domain Methods
Context-Aware Recommender Systems
Definition of Context
Obtaining Contextual Information
Contextual Information Relevance and availability
Context-Aware Approaches
“Ad-hoc” Cross-Domain Context-Aware Recommender Systems
SYSTEMATIC CROSS-DOMAIN CONTEXT-AWARE RECOMMENDER SYSTEMS
CD-CARS Problem Formalization
Contextual Information Modelling
Contextual Features Formalization
Obtaining and Choosing Relevant Contextual Information
CD-CARS Algorithms
Base Cross-Domain Algorithms
CD-CARS Evaluation
Evaluation of Data Partitioning
Sensitivity Analysis
Discussion
CONCLUSION AND RESEARCH DIRECTIONS
ACKNOWLEDGMENT
REFERENCES
CHAPTER 5 DEVELOPING A CONTENT-BASED RECOMMENDER SYSTEM FOR AUTHOR SPECIALIZATION USING TOPIC MODELLING AND RANKING FRAMEWORK
Shilpa Verma, Rajesh Bhatia and Sandeep Harit
INTRODUCTION
RELATED WORK
PROBLEM DESCRIPTION
HADOOP-BASED TOPIC MODELLING SYSTEM TO IDENTIFY AUTHOR SPECIALIZATION
Text Vectorization
Mapper
Reducer
INFLUENCE OF NODES AND MULTI-CRITERIA RANKING MODEL
EXPERIMENTAL SETUP AND DISCUSSION
Dataset Used
Pre-processing Step
Results of Hadoop-based Topic Modeling
Result of Ranking Model
CONCLUSION AND FUTURE SCOPE
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 6 MOVIE RECOMMENDATIONS
Anukampa Behera, Chhabi Rani Panigrahi, Abhishek Mishra, Bibudhendu Pati and Sumit Mitra
INTRODUCTION
MOVIE RECOMMENDATION SYSTEM
RECOMMENDER SYSTEM DESIGN VARIANTS
Collaborative Filtering
Content-based Filtering
Demographic Filtering
Knowledge-based Filtering
Utility-based
Hybrid Recommender System
DESIGN OF A MOVIE RECOMMENDER SYSTEM
Machine Learning (ML) Based Approaches
Deep Learning-based Approach
THE NETFLIX RECOMMENDER SYSTEM - A CASE STUDY
Netflix Personalization
Each Row on the Page is Personalized
Ranking
PERFORMANCE METRICS ADOPTED FOR MOVIE RECOMMENDATION
CONCLUSION
REFERENCES
CHAPTER 7 SENTIMENT ANALYSIS FOR MOVIE REVIEWS
Balajee Maram, Suneetha Merugula and Santhosh Kumar Balan
INTRODUCTION
SENTIMENT ANALYSIS
LITERATURE SURVEY
PROPOSED WORK
Sentiment Analysis
Opinion Mining
TECHNICAL DESCRIPTION
Input Dataset
Dataset Description
Data Preprocessing
Deep Learning
Supervised Learning
METHODOLOGY
Random Forest
Long Short-Term Memory
Bi-Directional Long Short-Term Memory
RESULTS AND DISCUSSIONS
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 8 A MOVIE RECOMMENDER SYSTEM WITH COLLABORATIVE AND CONTENT FILTERING
Anupama Angadi, Padmaja Poosapati, Satya Keerthi Gorripati and Balajee Maram
INTRODUCTION
RELATED WORK
Limitations
Proposals of a New Similarity Metrics ...................................................................................

Accuracy
BACKGROUND
CATEGORIES OF RECOMMENDER SYSTEMS
Collaborative Recommender Systems
Memory-Based Collaborative Filtering
Model-based Collaborative Filtering
Content Recommender System
ALGORITHMS
Nearest-Neighbors
Matrix Factorization Methods
Clustering-Based RS
SIMILARITY METRICS
User-Based Collaborative Recommender System
Finding Nearest Neighbors using Jaccard Similarity
Finding Nearest Neighbors using Cosine Similarity
Nearest Neighbors using Pearson Similarity
Nearest Neighbors using Mean Square Difference Similarity
Item-Based Collaborative System
Nearest Products using Pearson Similarity
Content-Based Filters
Data Pre-processing
Vectorization
TF-IDF
Word Embeddings
Limitations
Topic Modelling
EVALUATION METRICS
Precision and Recall
MAE
RMSE
CONCLUSION AND FUTURE WORK
ACKNOWLEDGEMENTS
REFERENCES
CHAPTER 9 AN INTRODUCTION TO VARIOUS PARAMETERS OF THE POINT OF INTEREST
Shreya Roy, Abhishek Majumder and Joy Lal Sarkar
INTRODUCTION
IMPACT OF VARIOUS PARAMETERS ON POI RECOMMENDATION
Users’ Interest-Based Recommendation
Location Popularity-Based Recommendation
Weather Based Recommendation
Cost Effective Recommendation
SUMMARY
CONCLUSION AND FUTURE SCOPE
ACKNOWLEDGMENTS
REFERENCES
CHAPTER 10 MOBILE TOURISM RECOMMENDATION SYSTEM FOR VISUALLY DISABLED
Pooja Selvarajan, Poovizhi Selvan, Vidhushavarshini Sureshkumar and Sathiyabhama Balasubramaniam
INTRODUCTION
PROPOSED WORK
Recommendation Systems
Collaborative Recommender Systems
A Content-based Recommender
Hybrid Recommendation System
MAPPING TECHNOLOGIES
Tipping
Proximo
Geo Notes
Macau Map
Microsoft Planner
Tourist Guide
Cyber Guide
Context-Aware Tourist Information System
Deep Map
Tour Planning Research
Artificial Language Experimental Assistant Internet (ALEXA)
SOLUTION STRATEGY
CONCLUSION
FUTURE WORK
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 11 POINT OF INTEREST RECOMMENDATION VIA TENSOR FACTORIZATION
Shreya Roy, Abhishek Majumder and Joy Lal Sarkar
INTRODUCTION
Influential Factors of POI Recommendation
Pure Check-in Based POI Recommendations
Geographical Influence Enhanced POI Recommendation
Social Influence Enhanced POI Recommendation
Temporal Influence Enhanced POI Recommendation
A Brief Introduction to Tensors
LITERATURE SURVEY ON RECOMMENDATION SYSTEM VIA TENSOR FACTORIZATION
Hotel Recommendation
Advantages
Disadvantages
Recommendation in the Travel Decision-making Process
Advantages
Disadvantages
Location-Based Social Networks for POI Recommendation
Time-Aware Preference Mining
Tensor Factorization
Advantages
Disadvantages
POI Recommendation Based on Weather Context
Context Inference and Modeling
Construction of Tensor and Feature Matrix
Collaborative Tensor Decomposition
POI Recommendation
Advantages
Disadvantages
POI Recommendation with Category Transition and Temporal Influence
Advantages
Disadvantages
CONCLUSION AND FUTURE SCOPE
ACKNOWLEDGMENTS
REFERENCES
CHAPTER 12 EXPLORING THE USAGE OF DATA SCIENCE TECHNIQUES FOR ASSESSMENT AND PREDICTION OF FASHION RETAIL - A CASE STUDY APPROACH
Dillip Rout
INTRODUCTION
PREVIOUS WORKS
Goal and Objectives
Proposed Framework
Data Preprocessing
Feature Engineering
Predictive Analysis
Experimental Study
Data Description and Preparation
Issues and Resolution of Data
Exploratory Analysis
Feature Engineering
Impact of Rating on Sales
Impact of Material Price Season and Style on Sales
Predictive Analysis
Automation of Recommendations
Sales Forecast
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 13 DATA ANALYTICS IN HUMAN RESOURCE RECRUITMENT AND SELECTION
Sumi Kizhakke Valiyatra
INTRODUCTION
RECRUITMENT ANALYTICS
Procedure for Recruitment Analytics
OPERATIONAL REPORTING
Recruiting Metrics
The Number of Days that Have Passed Since Time to Fill
Quality of Hire
Artificial Intelligence in Screening
Artificial Intelligence in Online Assessments
Artificial Intelligence in Job Interviews
Time to Hire
Cost Per Hire
First-year Attrition
Success Ratio Recruiting Metric
Employee Selection
Selection Ratio
Optimum Productivity Level (OPL)
Time to Productivity
CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
CHAPTER 14 A PERSONALIZED ARTIFICIAL NEURAL NETWORK FOR RICE CROP YIELD PREDICTION
Pundru Chandra Shaker Reddy, Alladi Sureshbabu, Yadala Sucharitha and Goddumarri Surya Narayana
INTRODUCTION
Traditional Crop Yield Forecasting Methods
Artificial Neural Networks
LITERATURE REVIEW
STUDY AREA AND DATASET DESCRIPTION
Study Area
Dataset Description
PROPOSED METHODOLOGY
P-ANN (Personalization of ANN)
MODEL EXECUTION AND EVALUATION
Comparative Analysis
CONCLUSION AND FUTURE WORKS
ACKNOWLEDGEMENTS
REFERENCES
SUBJECT INDEX

FOREWORD

I have the pleasant task of writing the foreword for the book Artificial Intelligence and Data Science in Recommendation System: Current Trends, Technologies, and Applications. This work is edited by Abhishek Majumder and Joy Lal Sarkar of Tripura University, India, and Arindam Majumder of NIT Agartala, India. The book covers crucial and current issues in the theory and application of Artificial Intelligence and Machine Learning. One of the most widely used applications is the recommendation system, which millions of people use every day for shopping and entertainment. The methods of AI and NLP have been in development for several decades. Classification methods and neural networks have also existed for a long time. However, the recent advent of large-scale gathering of social and user data has allowed theoretical techniques to be tested and proven in everyday practice. As a student of AI in the late 80s at IISc, I found it difficult to imagine this day. We have seen the progression of pattern recognition and statistical classification methods. There was an interesting twist in the development of AI systems: in the late 60s, it appeared that linear classification systems and Perceptron training algorithms would progress far, but their failure to solve the XOR logic problem led researchers to believe that they would be ineffective. That belief has now been firmly established to be a fallacy. But the twist took AI research into the development of logic and of systems called expert systems. It was imagined that these expert systems would hold knowledge of the real world and of real-world experts. The knowledge acquisition bottleneck and the lack of trainability of expert systems were their downfall. There is now a resurgence of another type of system that is filling this role: the recommender system.
These systems bring together diverse methods and techniques in AI, Data Science, and large data sets into human interfaces. Thus, it gives me immense pleasure to see that this compilation covers various applications, such as Industry 4.0. Going further, applications are presented here on deep learning, developing applications, movie recommendations, and movie reviews. One of the major applications these days uses natural language processing methods to perform sentiment analysis on data from social media. Here it is applied to movie reviews, and other chapters address tourism, the assessment and prediction of fashion retail, and aspects of human resource recruitment and selection. In addition to the very current topics that have been compiled, there is good diversity among the contributors to this volume. I wish this compilation the very best and hope that readers benefit greatly from it.

Atul Negi
Professor, School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India

PREFACE

A recommendation system is an intelligent computer-based system that serves as a guide and makes suggestions according to a person's preferences. It uses state-of-the-art technologies such as Big Data, Machine Learning, and Artificial Intelligence, and benefits both the consumer and the merchant. Recommendation systems are becoming very popular because they guide a person or a group toward the best way of performing a planned activity, given the constraints imposed by the user(s). They are software tools and techniques that provide advice on items to be used by a user. The recommendations are intended to inspire users to buy different products. The development of such systems involves specialists in several fields, including Artificial Intelligence, Human-Computer Interaction, Data Mining, Analytics, Adaptive User Interfaces, and Decision Support Systems. This book presents the major concepts, theories, methodologies, challenges, and advanced applications of recommender systems against this diversity. It comprises various parts: techniques, applications and assessments of recommendation systems, interactions with these systems, and advanced algorithms. The topic of recommendation systems is highly diverse, since recommendations can be made from different types of user-preference and user-requirement data. Collaborative filtering, content-based methods, and knowledge-based methods are the most common methods in recommender systems. These three approaches are the basic foundations of recommendation systems. In recent years, specialized methods have been developed for different data domains and contexts, such as time, place, and social information. Many advances for specific scenarios have been proposed, and techniques have been adapted to different fields of use.

Abhishek Majumder
Tripura University, Tripura, India

Joy Lal Sarkar
Tripura University, Tripura, India

&

Arindam Majumder
National Institute of Technology Agartala, Tripura 799046, India

List of Contributors

Abdul Wahid

Amity Global Institute, Singapore 238466, Singapore

Abhishek Mishra

Indian Institute of Technology, Bhubaneswar, India

Abhishek Majumder

Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India

Anukampa Behera

Department of Computer Science & Engineering, ITER, S'O'A (Deemed to be) University, Bhubaneswar, India

Alladi Sureshbabu

Department of CSE, JNTUA College of Engineering, Ananthapur, Andhra Pradesh, India

Anupama Angadi

Department of Information Technology, Anil Neerukonda Institute of Technology & Science, Visakhapatnam, Andhra Pradesh, India

André Nascimento

Department of Computing, Federal Rural University of Pernambuco, Recife, Brazil

Balajee Maram

Department of Computer Science and Engineering, GMR Institute of Technology (Autonomous), Rajam, Andhra Pradesh, India

Bibudhendu Pati

Department of Computer Science, Rama Devi Women's University, Odisha, India

Dillip Rout

Department of Computer Science and Engineering, Centurion University of Technology and Management, Odisha, India

Chhabi Rani Panigrahi

Department of Computer Science, Rama Devi Women's University, Odisha, India

Douglas Véras

Department of Computing, Federal Rural University of Pernambuco, Recife, Brazil

Gustavo Callou

Department of Computing, Federal Rural University of Pernambuco, Recife, Brazil

Goddumarri Surya Narayana

Department of CSE, Vardhaman College of Engineering, Hyderabad, TS, India

Joy Lal Sarkar

Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India

Khushi Chavan

Department of Computer Engineering, Dwarkadas J. Sanghvi college of Engineering, Mumbai, Maharastra, India

Kotrike Rathnaiah Radhika

Department of Information Science and Engineering, BMS College of Engineering, Bangalore, India

Padmaja Poosapati

Department of Information Technology, Anil Neerukonda Institute of Technology & Science, Visakhapatnam, Andhra Pradesh, India

Pundru Chandra Shaker Reddy

Department of CSE, CMR College of Engineering & Technology, Hyderabad, TS, India

Pooja Selvarajan

Department of Computer Science and Engineering, Sona College of Technology, India

Poovizhi Selvan

Department of Computer Science and Engineering, Sona College of Technology, India

Pradeep Kumar

Department of CS&IT, Maulana Azad National Urdu University, Hyderabad, India

Ramchandra Mangrulkar

Department of Computer Engineering, Dwarkadas J. Sanghvi college of Engineering, Mumbai, Maharastra, India

Rajesh Bhatia

Department of Computer Science & Engineering, Punjab Engineering College, Chandigarh, India

Satya Keerthi Gorripati

Computer Science and Engineering, Gayatri Vidya Parishad College of Engineering (Autonomous), Visakhapatnam, Andhra Pradesh, India

Sumi Kizhakke Valiyatra

Institute of Management in Kerala, University of Kerala, Kerala 695034, India

Suneetha Merugula

Department of Information Technology, GMR Institute of Technology, Rajam, Andhra Pradesh, India

Sandeep Harit

Department of Computer Science & Engineering, Punjab Engineering College, Chandigarh, India

Sathiyabhama Balasubramaniam

Department of Computer Science and Engineering, Sona College of Technology, Tamilnadu, India

Santhosh Kumar Balan

Department of Computer Science & Engineering, Guru Nanak Institute of Technology, Hyderabad, Telangana

Shilpa Verma

Department of Computer Science & Engineering, Punjab Engineering College, Chandigarh, India

Shreya Roy

Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India

Sumit Mitra

Managing Partner, Citizen, Odisha, India

Samudrala Venkatesiah Sheela

Department of Information Science and Engineering, BMS College of Engineering, Bangalore

Tushar Deshpande

Department of Computer Engineering, Dwarkadas J. Sanghvi college of Engineering, Mumbai, Maharastra, India

Vidhushavarshini Sureshkumar

Department of Computer Science and Engineering, Sona College of Technology, India

Venkatesh Naganathan

Amity Global Institute, Singapore, Singapore

Yadala Sucharitha

Department of CSE, CMR Institute of Technology, Hyderabad, TS, India

Artificial Intelligence and Data Science, 2023, 1-24


CHAPTER 1

Study of Machine Learning for Recommendation Systems

Tushar Deshpande1,*, Khushi Chavan1 and Ramchandra Mangrulkar1

1 Department of Computer Engineering, Dwarkadas J. Sanghvi college of Engineering, Mumbai, Maharastra, India

Abstract: This study provides an overview of recommendation systems and machine learning and their types. It briefly outlines the types of machine learning: supervised, unsupervised, semi-supervised, and reinforcement learning. It explores how to implement recommendation systems using three types of filtering techniques: collaborative filtering, content-based filtering, and hybrid filtering. The machine learning techniques explained are clustering, co-clustering, and matrix factorization methods, such as Singular value decomposition (SVD) and Non-negative matrix factorization (NMF). It also discusses the K-nearest neighbors (KNN), K-means clustering, Naive Bayes, and Random Forest algorithms. These algorithms are evaluated on the basis of three metrics: F1 measure, Root mean squared error (RMSE), and Mean absolute error (MAE). For the experimentation, this study uses the BookCrossing dataset and compares the algorithms on these metrics. Finally, it depicts the metrics graphically and shows the best and the worst techniques to incorporate into a recommendation system. This study will help researchers gain an overview of machine learning in recommendation systems.

Keywords: F1-measure, K-nearest neighbors (KNN), Machine learning, Mean absolute error (MAE), Non-negative matrix factorization (NMF), Recommendation system, Root mean squared error (RMSE), Singular value decomposition (SVD).

* Corresponding author Tushar Deshpande: Department of Computer Engineering, Dwarkadas J. Sanghvi college of Engineering, Mumbai, Maharastra, India; Tel: +91-07599029823; E-mail: [email protected]

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

INTRODUCTION

Recommendation System

The recommendation system [1] is a central part of digitization, as it analyses the interests of users and recommends items based on those interests [2 - 5]. The aim of these systems is to reduce information overload by retrieving the most similar items depending on the customer's interest [6 - 10]. The primary uses of these systems are decision making, maximizing profits, and reducing risks. They reduce the customer's effort and time spent searching for information. A recommendation system works as a filter that suggests alternatives based on massive data. Moreover, it acts as a multiplier that contributes to the expansion of the client's options [11 - 22]. Over the last few years, the enthusiasm for recommendation systems has increased tremendously [23]. They are among the most widely used services on high-end websites like Amazon, Google, YouTube, Netflix, IMDb, TripAdvisor, Kindle, etc. A number of media companies develop these systems as a service model for their clients. Furthermore, the implementation of such systems at commercial and non-profit sites attracts the attention of the customer [24 - 32]. They also leave clients more satisfied with online search results. These systems help customers find the items they like faster and obtain more reliable predictions, leading to higher sales at an eCommerce site. There are various undergraduate and graduate courses on these systems at institutions around the world, and conferences, workshops, and contests are organized around them [33 - 47]. One of the competitions was the Netflix Prize, organized around machine learning and data mining. In this competition, participants were required to develop a movie recommendation system at least 10% more accurate than the existing system, known as Cinematch. After a year of hard work, the Korbell team won first place using two main algorithms: matrix factorization (Singular value decomposition (SVD)) and Restricted Boltzmann machines (RBM). Real applications [2] employ different ML algorithms, such as K-nearest neighbor (KNN), Naive Bayes, Random Forest, Adaboost, Singular value decomposition (SVD), and many others.
The evolution of recommendation systems has led to the application of ML and AI algorithms for effective and accurate prediction. In addition, the results provided by some ML algorithms are promising. Because ML algorithms are so broadly classified, choosing one can be a challenge, depending on the situation in which the recommendation system is needed. To select an effective ML algorithm, the researcher or programmer should have a thorough knowledge of ML and recommendation systems [48, 49]. This knowledge enables the researcher to create a model appropriate to a specific problem. The study therefore first gives a brief overview of ML.

Machine Learning

Machine learning imitates human learning in computers: the machine learns from experience and applies what it has learned to newly encountered situations.


ML originated in the 1950s but became more popular in the 1990s. Humans learn through understanding, whereas the computer uses algorithms. Machine learning is classified into four categories:

1. Supervised learning
2. Semi-supervised learning
3. Unsupervised learning
4. Reinforcement learning

Supervised learning

This type of learning deals with algorithms that are given training data consisting of a set of features together with the correct prediction for those features. The task of the model is to learn from this data and apply what it has learned to new data, predicting the outcome from the input features. An example would be predicting the price of a house according to its area.

Semi-supervised learning

In this type of learning, the model learns from training data in which some information is missing, typically because only part of the data is labeled. These algorithms focus on drawing conclusions from incomplete data. An example is the evaluation of movies, where not all viewers give a review, but the model works with the reviews that are provided.

Unsupervised learning

This type of learning focuses on algorithms that do not require labeled training data. These algorithms learn by themselves from real-world information. They focus primarily on relations hidden in the given data. An example is YouTube, which parses the viewed videos and recommends similar videos to the user.

Reinforcement learning

This type of learning involves algorithms that learn from feedback from an external body. It is similar to a student and a teacher, where the teacher may give lower grades (negative feedback) or higher grades (positive feedback). An example is offering a treat to a dog for a positive response and withholding the treat for a negative one.


METHODS

The idea of recommendation systems is to provide recommendations to users according to their behavior or profile. The system analyzes the user's interests dynamically so that, as the user carries out actions, it recommends items according to the user's tastes. Other types of recommendation are based on trust, context, and risk. The types discussed in this chapter are shown in Fig. (6). The recommendation system [4] is mainly divided into three categories:

1. Collaborative filtering
2. Content-based filtering
3. Hybrid filtering

Collaborative Filtering

In this approach [5], recommendation systems work according to user information. The system compares users with similar preferences and recommends items that those other users have tried, as shown in Fig. (1). An example is a book application in which the model searches for users with similar preferences and recommends what those users purchased to the current user. This type of system is further divided into memory-based and model-based approaches. The difference between the memory-based and model-based methods is shown in Fig. (2).

Fig. (1). Example of Collaborative filtering [6].

Model-Based

In this method [7], the information base is the set of past ratings, from which the model learns in order to make better future predictions. This method works on items that the user has not yet seen or used, and it increases the accuracy of the system. Model-based approaches include matrix factorization, clustering, association techniques, Bayesian networks, and many more.


Memory-Based

In this method, the information base is the likes and dislikes of other users whose profiles are similar to that of the user who requires recommendations. This approach analyzes the similarity between user interests to predict an item for the target user. It is divided into two subtypes, user-based and item-based methods; Fig. (3) shows the difference between them.

User-Based

This approach analyzes the similarity among users to make predictions. It can also predict based on the target user's behavioral patterns. For example, if a user purchases a book, the system analyzes other users' preferences for that book and recommends new items to the user.

Item-Based

This approach analyzes the similarity between the items searched for or purchased by users to make predictions. In other words, it computes the similarities between items unknown to the user and items known to the user, and displays the unknown items whose similarity value is high. For example, if a user buys an item, the system looks for items with features similar to the purchased item and recommends them to the user.
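The user-based and item-based ideas can be illustrated with a minimal sketch. The ratings dictionary, the user and book names, and the choice of cosine similarity over co-rated entries are assumptions made for illustration only; they are not taken from the chapter's experiments.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}
ratings = {
    "alice": {"book_a": 5, "book_b": 3, "book_c": 4},
    "bob":   {"book_a": 4, "book_b": 3, "book_c": 5, "book_d": 5},
    "carol": {"book_a": 1, "book_b": 5, "book_d": 2},
}

def cosine(u, v):
    """Cosine similarity of two sparse rating dicts, restricted to shared keys."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(u[k] ** 2 for k in shared))
    norm_v = sqrt(sum(v[k] ** 2 for k in shared))
    return dot / (norm_u * norm_v)

def predict_user_based(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += sim
    return num / den if den else None

def item_vector(item):
    """All ratings given to one item, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def predict_item_based(user, item):
    """Similarity-weighted average of the user's own ratings on other items."""
    num = den = 0.0
    target = item_vector(item)
    for other_item, rating in ratings[user].items():
        sim = cosine(target, item_vector(other_item))
        num += sim * rating
        den += sim
    return num / den if den else None

user_based = predict_user_based("alice", "book_d")
item_based = predict_item_based("alice", "book_d")
print(round(user_based, 2), round(item_based, 2))
```

In the user-based case the prediction leans toward the rating of the most similar user; in the item-based case it leans toward the user's own ratings of the most similar items.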

Fig. (2). Difference between memory-based [8] and model-based [9].


Fig. (3). Difference between user-based and item-based [10].

Content-based Filtering

In this approach, the recommendation system works from the data of the item the user is looking for. The model finds other items with attributes similar to those in the search and recommends them to the user. An example, shown in Fig. (4), is online shopping, where the user searches for an item with specific features and the system recommends similar items.
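As a rough illustration of content-based filtering, the sketch below ranks catalogue items by how many attributes they share with the searched item. The catalogue, the attribute sets, and the use of Jaccard similarity are hypothetical choices for this example.

```python
# Hypothetical catalogue: item -> set of descriptive attributes
catalogue = {
    "running_shoe": {"footwear", "sport", "lightweight"},
    "trail_shoe": {"footwear", "sport", "waterproof"},
    "leather_boot": {"footwear", "formal", "waterproof"},
    "yoga_mat": {"sport", "lightweight"},
}

def jaccard(a, b):
    """Share of attributes two items have in common (intersection over union)."""
    return len(a & b) / len(a | b)

def recommend_similar(query, top_n=2):
    """Return the top_n items whose attributes are most similar to the query item."""
    scores = [
        (jaccard(catalogue[query], features), name)
        for name, features in catalogue.items()
        if name != query
    ]
    scores.sort(reverse=True)
    return [name for _, name in scores[:top_n]]

similar = recommend_similar("running_shoe")
print(similar)
```

Only item attributes are used here; no information about other users is needed, which is the main contrast with collaborative filtering.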

Fig. (4). Example of Content-based filtering [6].

Hybrid Filtering

This approach is a combination of the two earlier methods, as illustrated in Fig. (5). This means that these recommendation systems are based on both item data and user information. The first step consists of analyzing the user information. The second step is to analyze the data of the item the user is looking for or using. Finally, the relevant results of the first two steps are presented in the form of recommendations (Fig. 6).

Fig. (5). Mechanism of Hybrid filtering.

Fig. (6). Tree diagram of Filtering Techniques.

Algorithms

This chapter includes a detailed explanation of Singular value decomposition (SVD), Non-negative matrix factorization (NMF), K-means clustering, K-nearest neighbors (KNN), Co-clustering, Naive Bayes, and Random Forest algorithms.

Co-clustering

Co-clustering, also known as bi-clustering [11], is a method in which the rows and columns of a matrix are clustered simultaneously. This matrix represents information as a function of user characteristics and item characteristics. In other words, co-clustering can be visualized as grouping two different kinds of entities according to their similarity. The result of a co-clustering algorithm is commonly termed a bi-cluster [12, 13]. The kinds of bi-clustering are classified according to the nature of these bi-clusters, which depends mainly upon constant and coherent values.

1) Bi-cluster with constant values: Rows and columns within a clustering block have the same constant value.
2) Bi-cluster with constant values in rows or columns: Every row or column in a clustering block has the same constant value.
3) Bi-cluster with coherent values: These bi-clusters identify more complex similarities between rows and columns (genes and conditions, in gene-expression analysis) using an additive or multiplicative method.

Co-clustering is used across a wide variety of applications. Rege et al. [14] use co-clustering for clustering documents and topics. Chen et al. [15] and Felzenszwalb and Huttenlocher [16] use image co-clustering for image processing. It also helps to identify interaction networks [17, 18] and serves as an analytical tool for election data. The clustering technique is implemented through a variety of matrix factorization techniques.

Matrix Factorization

Matrix factorization is a class of algorithms that decompose the user-item interaction matrix into the product of two rectangular matrices. This is usually done by minimizing a cost function such as the RMSE (Root mean squared error) using gradient descent. Because of its effectiveness, this method became more popular during the Netflix Prize challenge (as discussed above).
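A minimal sketch of matrix factorization by gradient descent is given below. It minimizes the squared error on the observed ratings (which also lowers the RMSE) with a small L2 penalty. The rating matrix, the number of latent factors, the learning rate, and the penalty weight are all assumed values chosen for illustration.

```python
import numpy as np

# Hypothetical 4x4 user-item rating matrix; 0 marks a missing rating.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
observed = R > 0

rng = np.random.default_rng(0)
k = 2                                            # number of latent factors (assumed)
P = rng.normal(scale=0.1, size=(R.shape[0], k))  # user factors
Q = rng.normal(scale=0.1, size=(R.shape[1], k))  # item factors

def rmse(P, Q):
    """RMSE over the observed entries only."""
    err = (R - P @ Q.T)[observed]
    return np.sqrt(np.mean(err ** 2))

initial_rmse = rmse(P, Q)
lr, reg = 0.01, 0.02                             # learning rate and L2 penalty (assumed)
for _ in range(2000):
    # Error on observed entries; missing entries contribute nothing.
    err = np.where(observed, R - P @ Q.T, 0.0)
    P_step = err @ Q - reg * P
    Q_step = err.T @ P - reg * Q
    P += lr * P_step
    Q += lr * Q_step
final_rmse = rmse(P, Q)

predicted = P @ Q.T                              # includes estimates for missing ratings
print(initial_rmse, final_rmse)
```

The product of the two rectangular factor matrices fills in the entries that were missing in the original matrix, and those filled-in values serve as the predicted ratings.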
Recommendation systems use different matrix factorization techniques. A detailed study of Singular value decomposition (SVD) and Non-negative matrix factorization (NMF) is given below.

Singular Value Decomposition

This method comes from linear algebra and is increasingly popular within ML algorithms. It is mainly applied in recommendation systems for e-commerce, music, or video streaming sites. SVD refers to the decomposition of a single matrix into three other matrices. The general form is:

$$M = X S Y \qquad (1)$$

where M is the given m x n matrix, X is an m x n orthogonal matrix that denotes the relation between the users and the latent factors, S is an n x n diagonal matrix that denotes the strength of these latent factors, and Y is an n x n orthogonal matrix that represents the similarity between the items and the latent factors. The steps involved in SVD are given below:

1. Represent the data as a matrix with users as rows and items as columns.
2. If there are any empty entries in the matrix, fill them with the average of the other entries so that there is no major error in the calculation.
3. Compute the SVD (done using the numpy and surprise libraries).
4. Reduce the decomposition to obtain the expected matrix, which is then used for prediction by looking up the appropriate user/item pair.

The primary benefit of SVD is that it simplifies the data set and eliminates noise from it. It works with numerical data sets and can improve precision. There are also several issues related to SVD. One of the most important is data scarcity, also called the cold start problem [20]. This occurs when a new community, user, or item is added: the recommendation system will not work properly for it due to a lack of information. Black sheep are also an issue: these are customers who both agree and disagree with the same group of people, which makes it impossible to make recommendations for them. SVD also suffers from scalability issues, because the cost of computing a full decomposition grows faster than linearly with the size of the matrix.
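The four SVD steps can be sketched with numpy as follows. The small rating matrix, the choice of column means for filling empty entries, and the number of retained factors k are assumptions made for this example; the chapter's own experiments may differ.

```python
import numpy as np

# Step 1: hypothetical user-item matrix (rows = users, columns = items); NaN = missing.
M = np.array([[5.0, 3.0, np.nan, 1.0],
              [4.0, np.nan, np.nan, 1.0],
              [1.0, 1.0, np.nan, 5.0],
              [np.nan, 1.0, 5.0, 4.0]])

# Step 2: fill each missing entry with the mean of the observed entries in its column
# (one possible reading of "the average of the other entries").
col_means = np.nanmean(M, axis=0)
filled = np.where(np.isnan(M), col_means, M)

# Step 3: compute the SVD. numpy returns the third factor already in the
# orientation needed for the product X @ diag(s) @ Y.
X, s, Y = np.linalg.svd(filled, full_matrices=False)

# Step 4: keep only the k strongest latent factors and rebuild the matrix.
k = 2
approx = X[:, :k] @ np.diag(s[:k]) @ Y[:k, :]

# Predicted rating for user 1, item 1 (a cell that was missing in M).
prediction = approx[1, 1]
print(prediction)
```

Keeping only the largest singular values is what removes noise: the discarded factors carry the least information about the ratings.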


There are different applications of SVD. The most common are computing the pseudo-inverse, solving homogeneous linear equations, total least squares minimization, finding the range, null space, and rank, and low-rank matrix approximation. In addition, it is used for signal processing, image processing, and big data.

Non-negative Matrix Factorization

This is also a matrix factorization technique [21]. As with SVD, the idea of this approach is to break down, or factorize, a given matrix. The difference is that the matrix is split into two parts, called W and H. The W matrix holds the weights, and each of its columns represents a basic element. These are the building blocks from which predictions of the original data items are obtained. The H matrix is hidden and represents the coordinates of the data items in terms of W. In other words, it guides the conversion from the building blocks of W back to the original data item. The order of execution in NMF is given below:

1. Import the NMF model using the surprise library.
2. Load the dataset and pass it to the model.
3. Clean the data and create a function to pre-process it.
4. Create the document-term matrix 'V' (the given matrix).
5. Create a function to display the model features.
6. Run NMF on the document-term matrix 'V'.
7. Continue checking and iterating until useful features are found.

The advantage of NMF is that it breaks down the given matrix into two smaller matrices whose dimensions can be controlled. It differs from other matrix factorization algorithms because it works only on non-negative numbers, which makes the data interpretable. The dataset can become smaller if W and H are represented sparsely. The issue with semi-supervised NMF is that, depending on the number of data points available, there is a reduction in the fitted data points.
Applications of the NMF include the processing of audio spectrograms, document clustering, recommendation systems, chemometrics, and many others. It is also used for dimensionality reduction in astronomy, statistical data imputation, as well as nuclear imaging.
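The factorization of a non-negative matrix V into W and H can be sketched with the classical multiplicative update rules. This is a stand-in written in plain numpy rather than the surprise library's NMF model mentioned in the steps above; the random data matrix, the number of components, and the iteration count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((6, 5))      # hypothetical non-negative data matrix
k = 2                       # number of components (assumed)
W = rng.random((6, k))      # building blocks (weights)
H = rng.random((k, 5))      # coordinates of each column of V in terms of W
eps = 1e-9                  # guards against division by zero

def reconstruction_error(W, H):
    """Frobenius norm of the difference between V and its approximation W @ H."""
    return np.linalg.norm(V - W @ H)

initial_error = reconstruction_error(W, H)
for _ in range(200):
    # Multiplicative updates keep every entry of W and H non-negative.
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
final_error = reconstruction_error(W, H)

print(initial_error, final_error)
```

Because the updates only multiply by non-negative ratios, W and H never become negative, which is the property that makes the factors easy to interpret.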


Difference between SVD and NMF

As stated above, both SVD and NMF are matrix factorization techniques. There are, however, some differences between them, which can help in choosing the better of the two algorithms for a given situation.

1. SVD factors include both negative and positive values, while NMF factors are strictly non-negative. That makes NMF useful because its factors are easier to interpret and connections are easier to make.
2. SVD factors can be related to the eigenfunctions of a system, where the original matrix denotes a system of interest from a signal processing perspective. This makes SVD easier to use for that purpose. NMF can also be used for the same purpose, but because the association is indirect in this approach, it becomes more tedious.
3. The factors of SVD are unique, whereas the factors of NMF are not. As a result, NMF is better for algorithms with privacy protection.
4. SVD factors into three matrices, of which the sigma matrix gives the information stored in the vector, whereas NMF factors into only two matrices, which do not include the sigma matrix.

K-Nearest Neighbors

KNN is a simple machine learning algorithm based on supervised learning. It finds similar items based on the distance between the test data and the individual training data, using a variety of distance measures. In this algorithm, predictions are mainly made by calculating the Euclidean distance to the nearest neighbors. Jaccard similarity or the Minkowski, Manhattan, or Hamming distance can be used instead of the Euclidean distance. KNN is a non-parametric algorithm that assumes nothing about the given data. It is also referred to as a lazy learning algorithm: it does not build a model from the data in advance, but instead stores the data and performs its computations at prediction time. The steps involved in KNN are given below:

1. Load the dataset and preprocess it.
2. Fit the KNN algorithm to the training dataset. In the sklearn library it is defined as NearestNeighbors; in the surprise library it is defined as KNNBasic.
3. Predict the test result.
4. Create the confusion matrix and find the test accuracy of the result.
5. Visualize the test result.

This algorithm is used because its results are easy to interpret. It also has great predictive power and requires little computing time on small datasets. The main issue with KNN is that it becomes much slower as the volume of data increases, and it does not give good accuracy with large datasets. It is also highly sensitive to missing values, outliers, and noise in the dataset. KNN is primarily used for classification and regression problems. The result of a classification problem is a discrete value, while the result of a regression problem is a real number. KNN is commonly used for text extraction. It is used in finance for stock prediction, management of loans, and analysis of money laundering, in agriculture for weather forecasting and estimation of soil water parameters, and in medicine to predict different diseases.

K-means Clustering

The k-means algorithm is the most widely known clustering algorithm. It is the simplest unsupervised learning method for solving the clustering problem, and it can be viewed as a special case of the Expectation-Maximization procedure. The algorithm receives a value k that represents the number of clusters. It then partitions the data set into that number of clusters of similar characteristics/preferences. The similarity is calculated using the distance between two items, measured with the Euclidean, squared Euclidean, Manhattan, or cosine distance. The quality of the clustering is evaluated using the elbow method or silhouette analysis [22, 23, 24].

Euclidean:

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \qquad (2)$$

Squared Euclidean:

$$d^2(x, y) = (x_1 - y_1)^2 + (x_2 - y_2)^2 \qquad (3)$$

Manhattan:

$$d(x, y) = |x_1 - y_1| + |x_2 - y_2| \qquad (4)$$

Cosine distance:

$$\cos\theta = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}|\,|\vec{b}|} \qquad (5)$$

where x1, x2 and y1, y2 are the coordinates of the data points x and y, and θ is the angle between the vectors a and b that represent the two data points.
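The distance measures of equations (2) to (5), which serve both KNN and k-means, can be written as small functions, followed by a bare-bones k-means loop. The sample points, the starting centroids, and the fixed number of iterations are assumptions for illustration. The cosine function returns the similarity cos θ; a cosine distance is commonly taken as one minus this value.

```python
from math import sqrt

def euclidean(x, y):
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))

def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid to its cluster mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups; k = 2 starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

Swapping `squared_euclidean` for another of the functions changes the notion of similarity used for the assignment step without changing the rest of the loop.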




Naive Bayes

Naive Bayes [3] is a probabilistic ML algorithm based on the Bayes theorem. It treats each pair of features as independent of each other: the assumption is that every feature makes an independent and equal contribution to the outcome. To start, the Bayes theorem is given below [26].

$$P(X \mid Y) = \frac{P(Y \mid X)\, P(X)}{P(Y)} \qquad (6)$$

where P(X|Y) is the probability of X given that event Y has occurred, P(Y|X) is the probability of Y given that event X has occurred, P(X) is the probability of event X, and P(Y) is the probability of event Y. The types of naive Bayes are Bernoulli, Multinomial, and Gaussian naive Bayes.

Bernoulli naive Bayes: This is a binary algorithm that considers whether a feature is present or not. It is used when the feature vectors are binary (i.e., ones and zeroes). One of its applications is the bag of words model for text classification [27]. It follows the rule:

$$P(x_i \mid y) = P(i \mid y)\, x_i + (1 - P(i \mid y))(1 - x_i) \qquad (7)$$

where y is the class, i is a feature, and $x_i$ is the binary value indicating whether feature i is present.

Multinomial naive Bayes: The feature vectors represent frequencies generated by a multinomial distribution. It is used efficiently for working with texts in natural language processing.

Gaussian naive Bayes: The values associated with each feature are assumed to be generated by a Gaussian (Normal) distribution, which gives a bell-shaped curve when plotted. The equation is as follows:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right) \qquad (8)$$

where $\mu_y$ and $\sigma_y^2$ are the mean and variance of the feature for class y.
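Equations (6) to (8) translate directly into short functions. The sketch below is only a worked illustration; the probabilities in the example (the share of readers who like a genre and the share who rated a book highly) are made-up numbers.

```python
from math import exp, pi, sqrt

def bayes(p_y_given_x, p_x, p_y):
    """Equation (6): P(X|Y) = P(Y|X) * P(X) / P(Y)."""
    return p_y_given_x * p_x / p_y

def bernoulli_likelihood(p_i_given_y, x_i):
    """Equation (7) for a binary feature value x_i in {0, 1}."""
    return p_i_given_y * x_i + (1 - p_i_given_y) * (1 - x_i)

def gaussian_likelihood(x_i, mean_y, var_y):
    """Equation (8): normal density with the class mean and variance."""
    return exp(-((x_i - mean_y) ** 2) / (2 * var_y)) / sqrt(2 * pi * var_y)

# Hypothetical numbers: 20% of readers like a genre (X), 50% of all readers
# rated a given book highly (Y), and 90% of the genre's fans rated it highly.
posterior = bayes(p_y_given_x=0.9, p_x=0.2, p_y=0.5)
print(posterior)  # probability that a reader who rated the book highly likes the genre
```

A naive Bayes classifier multiplies such per-feature likelihoods together for each class, which is where the independence assumption enters.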


The steps involved in naive Bayes are given below:

1. Preprocess the dataset.
2. Fit naive Bayes to the training data.
3. Predict the outcome for the test data.
4. Create the confusion matrix and get the accuracy of the model.
5. Visualize the result on the test set.

The advantage of naive Bayes is that it is quick and precise in its predictions. It also reduces the complexity of the computations. It can be used for problems with multiple classes as well as binary ones. The algorithm works best if the variables are discrete rather than continuous. The main disadvantage of naive Bayes is the assumption that features are independent of each other, which rarely holds in real life. Moreover, if a particular feature value never occurs with a class in the training set, the posterior probability becomes zero. This is known as the zero-frequency problem. There are a variety of applications of naive Bayes. A major one lies in recommendation systems: if collaborative filtering and naive Bayes are integrated, the system can make predictions from unseen information regardless of preferences. Text classification is another popular application. Naive Bayes is also applied to real-time predictions and multiclass predictions in classification problems, and it can be used for facial recognition, medical testing, and weather forecasting.

Random Forest

The random forest algorithm [29] is a common supervised machine learning technique based on the ensemble learning concept. Ensemble learning is a method of combining various classifiers to improve model accuracy. In this algorithm, the dataset is split into several subsets, and each subset is used to build its own decision tree. Instead of depending on a single decision tree, the algorithm combines the predictions of all the decision trees, for example by averaging them. This makes the predictions more accurate.
The steps involved in implementing a random forest algorithm are given below:

1. Load the dataset and preprocess it by splitting the data into a training and a testing set.
2. Apply feature scaling to the training and testing data.
3. Fit the random forest algorithm (defined as RandomForestClassifier in the sklearn library) to the training set.
4. Predict the test result using a new prediction vector.
5. Create a confusion matrix, which gives the counts of correct and incorrect predictions.
6. Visualize the test result.

The main advantage of this algorithm is its versatility. It has increased predictive power, which makes it a handy algorithm to use. It also overcomes the major problem of overfitting. It can handle a large dataset and needs less time to train. The major drawback is that a large number of decision trees can slow the algorithm down, making it inefficient for real-world use. It is used for both classification and regression, although it is less well suited to regression. There are various application domains of the random forest method. In banking, it is used for fraud detection, loan risk identification, and other identification and detection tasks based on banking services. In medicine, it is used to find combinations of medications and to predict the risk and patterns of disease. In commerce, it can be used to predict stock prices and trends. It is also used in satellite imagery and in object and multiclass detection.

Evaluation Methods

Various methods are used in the evaluation of machine learning methods. Commonly used are the error- and accuracy-based evaluation methods, such as RMSE (Root mean squared error), MSE (Mean squared error), and MAE (Mean absolute error). There are decision support methods like precision, recall, F1-measure, and the ROC (Receiver operating characteristic) curve. In addition, there are ranking-based evaluation methods, such as nDCG (Normalized discounted cumulative gain), MRR (Mean reciprocal rank), mean precision, and Spearman rank correlation. Moreover, other metric-based methods complement the assessment of prediction, decision, and ranking power; examples include coverage, popularity, novelty, diversity, and temporal evaluation. Finally, business metrics can be used to check whether the system reaches its business objective. The above-mentioned algorithms will be evaluated using F1-measure, RMSE, and MAE.


F1 Measure

This measure combines precision and recall as their harmonic mean and is used to measure the accuracy of the model. The formula for the F1 measure is F1 = 2*P*R/(P+R), where P and R are the precision and recall of the model.

Precision: This measure, also known as the positive predictive value, is defined as the ratio of TP (True positives) to the sum of TP and FP (False positives).

Recall: This measure, also known as sensitivity, is defined as the ratio of TP to the sum of TP and FN (False negatives).

$$\text{Precision} = \frac{|\text{Recommended items that are suitable}|}{|\text{Recommended items}|} = \frac{|TP|}{|TP + FP|} \qquad (9)$$

$$\text{Recall} = \frac{|\text{Recommended items that are suitable}|}{|\text{Suitable items}|} = \frac{|TP|}{|TP + FN|} \qquad (10)$$
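Equations (9) and (10) and the F1 formula can be checked on a small example. The lists of recommended and suitable items below are hypothetical.

```python
def precision_recall_f1(recommended, suitable):
    """Precision, recall, and F1 for one list of recommendations."""
    recommended, suitable = set(recommended), set(suitable)
    tp = len(recommended & suitable)   # recommended items that are suitable
    fp = len(recommended - suitable)   # recommended but not suitable
    fn = len(suitable - recommended)   # suitable but never recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical example: four books recommended, three books actually suitable.
precision, recall, f1 = precision_recall_f1(
    recommended=["b1", "b2", "b3", "b4"],
    suitable=["b2", "b3", "b5"],
)
print(precision, recall, f1)
```

Here two of the four recommendations are suitable (precision 0.5) and two of the three suitable books are found (recall 2/3), so the harmonic mean lies between the two values.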

The F1 measure is preferred over plain accuracy, which is less robust, because it takes different types of errors into account. It is effective whenever FP (False positives) and FN (False negatives) carry different costs. It can also be useful if there is an imbalance between the classes because, in such cases, accuracy can be very misleading. The weakness of the F1 measure is that the value calculated for one feature is independent of the others. In other words, it cannot compute the effectiveness of two features combined or based on each other's information. The applications of the F1 measure include information retrieval in NLP (Natural Language Processing), where it is most frequently used in search engine systems. It is also commonly used in binary classification systems.

RMSE (Root Mean Squared Error)

RMSE is a performance measure for ML models that is primarily calculated to see how well the model fits (i.e., less error means more accuracy). It is used for models that predict quantitative data. It is defined as:

Study of Machine Learning

Artificial Intelligence and Data Science 17



ͳ ܴ‫ ܧܵܯ‬ൌ ඩ ෍ሺ‫ݕ‬௝ െ ‫ݕ‬ෝఫ ሻଶ ݊

(11)

௝ୀଵ

In the above RMSE equation, $y_j$ is the original data, $\hat{y}_j$ is the predicted data, and $n$ is the number of samples. This measure is used because it is easy to differentiate, which makes it convenient to work with methods such as gradient descent. It is also a good estimate of the standard deviation of the distribution of the errors. Because RMSE squares the errors, even a few large errors can affect the value immensely, which pushes the model to yield as little error as possible: before averaging, an error of 10 counts 100 times as much as an error of 1. RMSE can be harder to interpret because it contains squared values, whereas MAE is easier to understand because it uses absolute values.

MAE (Mean Absolute Error)

This measure is used as an alternative to RMSE. MAE is the average of the absolute differences between the original data and the predicted data. If the absolute value is not taken, the measure becomes the mean bias error (MBE). MAE is represented mathematically as:

\[ \mathrm{MAE} = \frac{1}{n} \sum_{j=1}^{n} |y_j - \hat{y}_j| \tag{12} \]

In the above MAE equation, $y_j$ is the original data and $\hat{y}_j$ is the predicted data.

MAE is more stable than RMSE as the variance of the error distribution increases, because each error contributes in proportion to its magnitude: an error of 10 counts exactly 10 times as much as an error of 1. MAE is generally preferable when the cost of an error grows linearly with its size, whereas RMSE is preferable when the cost grows non-linearly. MAE is not useful when no absolute value is required; in such cases, RMSE is preferable.
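The contrast between the two measures can be illustrated with a short sketch of the RMSE and MAE definitions given in this section. The ratings and both prediction lists are made-up values chosen so that the two predictors have the same MAE but different RMSE; they are not results from the experiments.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n


# Made-up ratings on a 1-10 scale.
actual = [8, 6, 9, 7]
even_errors = [7, 7, 8, 8]  # every prediction is off by 1
one_outlier = [8, 6, 9, 3]  # three exact predictions, one off by 4

print(mae(actual, even_errors), rmse(actual, even_errors))
print(mae(actual, one_outlier), rmse(actual, one_outlier))
```

Both predictors have the same total absolute error, so their MAE is identical, but the single large miss makes the RMSE of the second predictor larger. This is the sense in which RMSE penalizes large errors more heavily.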


EXPERIMENTATION

Dataset

The BookCrossing dataset [34] was built by Cai-Nicolas Ziegler from the Book-Crossing community, with book information obtained from Amazon Web Services. It covers roughly 270,000 books read by 90,000 users, with 1.1 million ratings. The data consist of three tables, which hold information about ratings, books, and users. The data were downloaded from Kaggle.

The ratings table provides a list of book ratings given by the users. It includes 1,149,780 rating records containing 3 fields: userID, ISBN, and bookRating. The ratings are either explicit, expressed on a scale of 1 to 10, or implicit, expressed by 0. As shown in Fig. (7), the vast majority of ratings are 0, and the ratings are distributed very unevenly.

The books table provides book information. It includes 271,360 book records containing 8 fields: 5 fields of content-based information (ISBN, Book-Title, Book-Author, Year-Of-Publication, Publisher) and 3 image-URL fields (Image-URL-S, Image-URL-M, Image-URL-L). These 3 URLs link to cover images of the book in different sizes.

The users table provides demographic information about users. It includes 278,858 user records and 3 fields: user id, Location, and Age. Fig. (8) shows that the majority of active users are between the ages of 20 and 30.
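Since implicit ratings are stored as 0 while explicit ratings use a 1 to 10 scale, a first inspection step is often to separate the two. The sketch below shows one way to do this on rows in the (userID, ISBN, bookRating) layout described above. The rows are invented for illustration, and the chapter does not state whether implicit ratings were filtered out before training, so this step is an assumption rather than the authors' procedure.

```python
from collections import Counter

# Invented rows in the (userID, ISBN, bookRating) layout; not real records.
ratings = [
    (1, "isbn-a", 0),
    (1, "isbn-b", 8),
    (2, "isbn-a", 0),
    (2, "isbn-c", 10),
    (3, "isbn-b", 0),
    (3, "isbn-c", 7),
    (4, "isbn-a", 0),
]


def split_ratings(rows):
    """Separate explicit ratings (1-10) from implicit ones (0)."""
    explicit = [row for row in rows if 1 <= row[2] <= 10]
    implicit = [row for row in rows if row[2] == 0]
    return explicit, implicit


explicit, implicit = split_ratings(ratings)
distribution = Counter(row[2] for row in ratings)  # rating value -> count
print(len(explicit), len(implicit), distribution[0])
```

Counting rating values in this way is also how a distribution such as the one in Fig. (7) can be tabulated.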

Fig. (7). Rating Distribution of the books in dataset.


Fig. (8). Age Distribution of users in user-data.

Implementation

The book recommendation system was built using item-based and user-based collaborative filtering, implemented in Python in a Jupyter Notebook. After evaluating the RMSE scores of the user-based and item-based methods, the system was optimized by integrating other algorithms: the co-clustering, SVD, NMF, KNNBasic, KNNwithMeans, and KNNwithZScore models from the Surprise library.

Result

In this work, the BookCrossing dataset was used. It contains three tables: the first contains user information, the second contains information on books, and the third contains the book rating information. The experiments first employed user-based and item-based collaborative filtering for the desired recommendation system. The RMSE score of these methods varied from 7 to 8. To improve on this, the co-clustering, SVD, NMF, KNNBasic, KNNwithMeans, and KNNwithZScore models were used, which gave a dramatic improvement in RMSE and MAE. Table 2 shows the RMSE value, the MAE value, and the F1 score for the implemented algorithms.

DISCUSSION

The values obtained from the above analysis are compared graphically in Fig. (9) and Fig. (10). A suitable algorithm should have smaller RMSE and MAE values and a greater F1 measure. Fig. (9) also shows that RMSE is higher than MAE for every algorithm. This is due to the differences listed in Table 1. Thus, as a measure of error for comparison, RMSE is much better than MAE. In addition, the F1 measure is used to summarize the confusion matrix. Table 2 shows the comparison of the different techniques in terms of RMSE, MAE, and F1.

Fig. (9). RMSE and MAE comparison of implemented algorithms. (Error(%) on y-axis and Algorithms on x-axis).

Fig. (10). F1 measure of implemented algorithms. (F1-measure(%) on y-axis and Algorithms on x-axis).

Table 1. Difference between RMSE and MAE [33].

| MAE | RMSE |
| --- | --- |
| It doesn't consider the sign of the input; if the input is negative, it takes the positive value. | It considers the sign of the input whether it is positive or negative. |
| It is less biased towards large values. Thus, when it comes to a large error, it does not reflect the result of the algorithm. | When it comes to large errors, it reflects in the result of the algorithm. Thus, it is much better than MAE. |
| The MAE value is comparatively smaller as the sample size increases. | RMSE is comparatively higher than MAE for increasing sample size. |
| MAE restricts larger errors. | RMSE does not restrict large errors. |
| MAE is preferred where there is a proportion between overall performance and an increase in error. | RMSE is preferred where the overall performance and the increase in error are disproportionate. |

Table 2. RMSE, MAE, and F1 measure of algorithms implemented on dataset.

| Algorithm | RMSE | F1 measure | MAE |
| --- | --- | --- | --- |
| Co-clustering | 1.8393 | 0.4289 | 1.4274 |
| SVD | 1.5726 | 0.4428 | 1.2046 |
| NMF | 2.4767 | 0.4202 | 2.0717 |
| KNNBasic | 1.9473 | 0.4434 | 1.5263 |
| KNNwithMeans | 1.7994 | 0.4404 | 1.3925 |
| KNNwithZScore | 1.7967 | 0.4402 | 1.3819 |
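The comparison in Table 2 can be checked mechanically. The sketch below copies the table values into a dictionary (the variable names are illustrative) and picks the best algorithm under each measure, using the rule that lower RMSE and MAE and higher F1 are better.

```python
# Values copied from Table 2: (RMSE, F1 measure, MAE) per algorithm.
results = {
    "Co-clustering": (1.8393, 0.4289, 1.4274),
    "SVD": (1.5726, 0.4428, 1.2046),
    "NMF": (2.4767, 0.4202, 2.0717),
    "KNNBasic": (1.9473, 0.4434, 1.5263),
    "KNNwithMeans": (1.7994, 0.4404, 1.3925),
    "KNNwithZScore": (1.7967, 0.4402, 1.3819),
}

best_rmse = min(results, key=lambda name: results[name][0])  # lower is better
best_mae = min(results, key=lambda name: results[name][2])   # lower is better
best_f1 = max(results, key=lambda name: results[name][1])    # higher is better
worst_rmse = max(results, key=lambda name: results[name][0])

# RMSE is never smaller than MAE on the same predictions.
rmse_at_least_mae = all(rmse >= mae for rmse, _, mae in results.values())

print(best_rmse, best_mae, best_f1, worst_rmse, rmse_at_least_mae)
```

On these figures SVD has the lowest RMSE and MAE, NMF has the highest RMSE, and KNNBasic has a marginally higher F1 measure than SVD (0.4434 against 0.4428), so the ranking depends slightly on which measure is given priority.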

CONCLUSION

This work concludes that SVD is the most preferred technique among the algorithms implemented. Fig. (9) and Fig. (10) show that NMF has larger RMSE and MAE values and a lower F1 measure than the others, so NMF alone is not suitable for this dataset. Moreover, the KNN variants (KNNBasic, KNNwithMeans, KNNwithZScore) perform much better than NMF, primarily based on RMSE and MAE values. In addition, this work concludes that RMSE is a much better evaluation measure than MAE.

ACKNOWLEDGEMENT

We would like to express our gratitude to the Department of Computer Engineering at Dwarkadas J Sanghvi College of Engineering, who motivated us to dive into research and guided us when we faced any difficulty. Also, the assistance provided by our senior classmate Onkar Thorat is greatly appreciated.

REFERENCES

[1]

T. Silveira, M. Zhang, X. Lin, Y. Liu, and S. Ma, "How good your recommender system is? A survey on evaluations in recommendation", Int. J. Mach. Learn. Cybern., vol. 10, 2019. [http://dx.doi.org/10.1007/s13042-017-0762-9]

[2]

G. Shani, and A. Gunawardana, "Tutorial on application-oriented evaluation of recommendation systems", AI Commun., vol. 26, pp. 225-236, 2013. [http://dx.doi.org/10.3233/AIC-130551]

[3]

Sang Nguyen, “Model-Based Book Recommender Systems using Naïve Bayes enhanced with Optimal Feature Selection”, 217-222. [http://dx.doi.org/10.1145/3316615.3316727]

[4]

F.O. Isinkaye, Y.O. Folajimi, and B.A. Ojokoh, "Recommendation systems: Principles, methods and evaluation", Egyptian Informatics Journal, vol. 16, no. 3, pp. 261-273, 2015.

[5]

P. Valdiviezo-Diaz, F. Ortega, E. Cobos, and R. Lara-Cabrera, "A Collaborative Filtering Approach Based on Naïve Bayes Classifier", IEEE Access, vol. 7, pp. 108581-108592, 2019. [http://dx.doi.org/10.1109/ACCESS.2019.2933048]

[6]

S. Doshi, Brief on recommender systems, 2019. https://miro.medium.com/max/1064/1*mz9tzP1L jPBhmiWXeHyQkQ.png

[7]

Do, Minh-Phung Thi, D. V. Nguyen, and Loc Nguyen. "Model-based approach for collaborative filtering." In 6th International Conference on Information Technology for Education, pp. 217-228. 2010.

[8]

A. Laishram, Novelty in Recommender Systems, 2019. https://miro.medium.com/max/449/0*IU9e 4BZiaflPb_iL.png

[9]

W. Johnson, Recommender Systems with Apache Spark’s ALS function, 2016. https://image.slidesh arecdn.com/20160503mkebdrecosys-160501205055/95/recommender-systems-with-apache-sarks-als-function-10-638.jpg?cb=1462136016

[10]

Ayse Yaman, CodeX, “Hybrid Recommender System-Netflix Prize Dataset”, Medium, Retrieved https://miro.medium.com/max/1370/0*PCZeW5TphSgtkIqm.png

[11]

B. Pontes, R. Giráldez, and J.S. Aguilar-Ruiz, "Biclustering on expression data: A review", Journal of Biomedical Informatics, vol. 57, pp. 163-180, ISSN 1532-0464, 2015. [http://dx.doi.org/10.1016/j.jbi.2015.06.028]

[12]

X. Gan, A.W-C. Liew, and H. Yan, "Discovering biclusters in gene expression data based on high-dimensional linear geometries", BMC Bioinformatics, vol. 9, p. 209, 2008. [http://dx.doi.org/10.1186/1471-2105-9-209] [PMID: 18433477]

[13]

Das, Joydeep & Mukherjee, Partha & Majumder, Subhashis & Gupta, Prosenjit, “Clustering-Based Recommender System Using Principles of Voting Theory”, Proceedings of 2014 International Conference on Contemporary Computing and Informatics, IC3I 2014. 10.1109/IC3I.2014.7019655, 2014.

[14]

M. Rege, M. Dong, and F. Fotouhi, "Co-clustering documents and words using bipartite isoperimetric graph partitioning", Proc. Int. Conf. Data Mining, pp. 532-541, 2006.

[15]

Y. Chen, M. Dong, and W. Wan, "Image co-clustering with multi-modality features and user feedbacks", Proc. Int. Conf. Multimedia, pp. 689-692, 2009.

[16]

P.F. Felzenszwalb, and D.P. Huttenlocher, "Efficient graph-based image segmentation", Int. J. Comput. Vis., vol. 59, no. 2, pp. 167-181, 2004.

[17]

J. Luo, B. Liu, B. Cao, and S. Wang, "Identifying miRNA-mRNA regulatory modules based on overlapping neighborhood expansion from multiple types of genomic data", Proc. Int. Conf. Intell. Comput, pp. 234-246, 2016.

[18]

G. Pio, M. Ceci, C. Loglisci, D. D’Elia, and D. Malerba, "Hierarchical and overlapping co-clustering of mRNA: miRNA interactions", Proc. Eur. Conf. Artif. Intell, pp. 654-659, 2012.

[19]

Deep Learning Book Series · 2.8 Singular Value Decomposition, 2019. https://hadrienj. github.io/assets/images/2.8/singular-value-decomposition.png

[20]

L.V.P. Andre, and E.R. Hruschka, “Simultaneous co-clustering and learning to address the cold start problem in recommender systems”, Knowledge-Based Systems, Vol. 82, pp. 11-19, ISSN 0950-7051, 2015. [http://dx.doi.org/10.1016/j.knosys.2015.02.016]

[21]

Hosseinzadeh Aghdam, Mehdi & Analoui, Morteza & Kabiri, Peyman. (2012). “Application of nonnegative matrix factorization in recommender systems”, 873-876, 2012. [http://dx.doi.org/10.1109/ISTEL.2012.6483108]


[22]

S. Gupta, Top 5 Distance Similarity Measures implementation in Machine Learning, 2019. https://miro.medium.com/max/875/1*L1pWK9foGvUIT7uQM9f_yQ

[23]

https://upload.wikimedia.org/wikipedia/commons/thumb/5/55/Euclidean_distance_2d.svg/450pxEuclidean_distance_2d.svg.png

[24]

S. Gupta, Top 5 Distance Similarity Measures implementation in Machine Learning, 2019. https://miro.medium.com/max/790/1*dMv1HKYgFLlcCR5-ejLwPw.png

[25]

V. Karbhari, What is a cosine similarity matrix?, 2020. https://miro.medium.com/max/625/ 1*dGWOzgAYv9NUkWvkETQUTQ.png

[26]

T. Ahadli, Naive Bayes Classifier: Bayesian Inference, Central Limit Theorem, Python/C++ Implementation, 2020. https://miro.medium.com/max/875/1*HfG1PY5-VSILokC66mLtsA.png

[27]

N. Mutha, Bernoulli Naive Bayes. https://iq.opengenus.org/bernoulli-naive-bayes/#:~:text=Bernoulli% 20Naive%20Bayes%20is%20used%20for%20discrete%20data,or%20failure%2C%200%20or%201% 20and%20so%20on

[28]

R. Gandhi, Naive Bayes Classifier, 2018. https://miro.medium.com/max/1576/1*0If5Mey7FnW _RktMM5BkaQ.png

[29]

A Ajesh, “A random forest approach for rating-based recommender system”. pp. 1293-1297. 2016. [http://dx.doi.org/10.1109/ICACCI.2016.7732225]

[30]

A. Al-Molegi, I. Alsmadi, N. Hassan, and H. Al-bashiri, "Automatic Learning of Arabic Text Categorization", International Journal of Digital Contents and Applications., vol. 2, pp. 1-16, 2015. [http://dx.doi.org/10.21742/ijdcasd.2015.2.1.01]

[31]

Performance measures: RMSE and MAE. https://thedatascientist.com/performance-measures-rme-mae/

[32]

MAE and RMSE - Which Metric is Better?, 2016. https://miro.medium.com/max/630/1*OVlFLnMwHDx08PHzqlBDag.gif

[33]

https://akhilendra.com/evaluation-metrics-regression-mae-mse-rmse-rmsle/

[34]

http://www2.informatik.uni-freiburg.de/~cziegler/BX/

[35]

B. Gipp, J. Beel, and C. Hentschel, "Scienstein: A Research Paper Recommender System", 2009.

[36]

T. Li, J. Wang, H. Chen, X. Feng, and F. Ye, "A NMF-based Collaborative Filtering Recommendation Algorithm", 6th World Congress on Intelligent Control and Automation, pp. 6082-6086, 2006.

[37]

S. Sahu, A. Nautiyal, and M. Prasad, "Machine Learning Algorithms for Recommender System - a comparative analysis", International Journal of Computer Applications Technology and Research., vol. 6, pp. 97-100, 2017. [http://dx.doi.org/10.7753/IJCATR0602.1005]

[38]

I. Portugal, P. Alencar, and D. Cowan, "The Use of Machine Learning Algorithms in Recommender Systems: A Systematic Review", Expert Syst. Appl., vol. 97, 2015. [http://dx.doi.org/10.1016/j.eswa.2017.12.020]

[39]

Sang Nguyen, "Model-Based Book Recommender Systems using Naïve Bayes enhanced with Optimal Feature Selection". pp. 217-222. 2019. [http://dx.doi.org/10.1145/3316615.3316727]

[40]

H. Gaudani, "A Review Paper on Machine Learning Based Recommendation System", Development, vol. 2, pp. 3955-3961, 2014.

[41]

A. Lampropoulos, and G. Tsihrintzis, "Review of Previous Work Related to Recommender Systems", Intelligent Systems Reference Library., vol. 92, pp. 13-30, 2015. [http://dx.doi.org/10.1007/978-3-319-19135-5_2]

[42]

A. Nawrocka, A. Kot, and M. Nawrocki, "Application of machine learning in recommendation systems", 19th International Carpathian Control Conference (ICCC), pp. 328-331, 2018.

[43]

M. Babaee, S. Tsoukalas, M. Babaee, G. Rigoll, and M. Datcu, “Discriminative Nonnegative Matrix Factorization for dimensionality reduction”, Neurocomputing, Vol. 173, Part 2, PP. 212-223, ISSN 0925-2312, 2016. [http://dx.doi.org/10.1016/j.neucom.2014.12.124]

[44]

K. Anwar, J. Siddiqui, and S. Sohail, "Machine Learning Techniques for Book Recommendation: An Overview", SSRN, 2019.

[45]

Joeran Beel, Bela Gipp, Stefan Langer & Corinna, “Research-paper recommender systems: a literature survey”, Int J Digit Libr 17, 305–338, [http://dx.doi.org/10.1007/s00799-015-0156-0]

[46]

“A new point-of-interest approach based on multi-itinerary recommendation engine, Expert Systems with Applications”, Vol. 181, 115026, ISSN 0957-4174, 2021. [http://dx.doi.org/10.1016/j.eswa.2021.115026]

[47]

K. Al Fararni, B. Aghoutane, J. Riffi, A. Sabri, and A. Yahyaouy, "Comparative Study on Approaches of Recommendation Systems", In: Embedded Systems and Artificial Intelligence. Advances in Intelligent Systems and Computing., V. Bhateja, S. Satapathy, H. Satori, Eds., vol. 1076. Springer: Singapore, 2020.

[48]

P. Piletskiy, D. Chumachenko, and I. Meniailov. "Development and Analysis of Intelligent Recommendation System Using Machine Learning Approach", In: Integrated Computer Technologies in Mechanical Engineering. Advances in Intelligent Systems and Computing., In: Nechyporuk M., Pavlikov V., Kritskiy D. (eds)., Vol. 1113. Springer: Cham, 2020

[49]

Javed, U., Shaukat, K., A. Hameed, I., Iqbal, F., Mahboob Alam, T. & Luo, S, "A Review of Content-Based and Context-Based Recommendation Systems". International Journal of Emerging Technologies in Learning (iJET), 16(3), 274-306. Kassel, Germany: International Journal of Emerging Technology in Learning. Retrieved June 24, 2021. https://www.learntechlib.org/p/219036/

Artificial Intelligence and Data Science, 2023, 25-52


CHAPTER 2

Machine Learning Approaches for Text Mining and Spam E-mail Filtering: Industry 4.0 Perspective

Pradeep Kumar1,*, Abdul Wahid2 and Venkatesh Naganathan2

1 Department of CS&IT, Maulana Azad National Urdu University, Hyderabad, India
2 Amity Global Institute, Singapore 238466, Singapore

Abstract: The Industry 4.0 revolution will affect every domain of people's lives, directly or indirectly. Several new complex applications that are hard to foresee in the current scenario will be developed in the days to come. With the help of machine learning approaches and intelligent IoT devices, people will be relieved of the overhead of redundant work currently being performed. Industry 4.0 has become a significant catalyst for innovation and development in various industrial sectors, such as production processes and quality improvement with greater flexibility. This chapter applies different machine learning algorithms to spam detection, classifying emails as legitimate or spam. Seven classification models are applied: Decision Trees, Random Forest, Artificial Neural Network, Gradient Boosting Machines, AdaBoost, Naive Bayes, and Support Vector Machines. Three benchmark spam datasets are extracted from standard repositories to conduct the experiments. The chapter also presents a quantitative performance analysis. The results from rigorous experiments reveal that the ensemble methods, Gradient Boosting and AdaBoost, outperformed other methods with an overall accuracy of 98.70% and 98.18%, respectively. The ensemble models are effective on a large dataset with more extensive features. The non-ensemble methods, ANN and Naïve Bayes, proved a viable alternative on large datasets, with an overall accuracy of 98.38% and 97.63% on test data.

Keywords: Cross-validation, Industrial revolution, Machine learning methods, Parameter optimization, Performance measurement, Preprocessing techniques.

* Corresponding author Pradeep Kumar: Department of CS&IT, Maulana Azad National Urdu University, Hyderabad, India; E-mail: [email protected]

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

INTRODUCTION

With the advancement of Information and Communication Technologies (ICT), the Fourth Industrial Revolution (Industry 4.0) embodies several aspects of cutting-edge technology, ranging from service robots attending to patients during the ever-challenging COVID-19 epidemic, to Unmanned Aerial Vehicles (UAVs), auto-piloted planes and cars roaming the skies and roads, and entertainment through audio-visuals. Industry 4.0 is a new technological revolution that leverages cyber-physical systems to open new areas of development, building on traditional industrial technology and services through the combination of information and industrialization. The growth and diffusion of Industry 4.0-related technology, such as augmented reality, have provided novel audio-visuals and facilitated several online training programs for professionals like doctors, front-line workers, and volunteers during lockdown periods across the globe. The Internet of Things (IoT) can comprehensively monitor appliances through intelligent sensors for smart cities, highways, and agriculture. Industrial cyber security platforms can perform intelligent monitoring of corporate networks and take countermeasures against different attacks. The prevalence of the Industry 4.0 revolution promises to standardize and streamline product manufacturing.

Text categorization plays a significant role in text retrieval, information extraction, and question answering. Intelligent classifiers are found to be more promising for the automatic filtering of text documents. Email is one of the most widely used means of digital communication for personal and business purposes. Therefore, there is a substantial need to categorize emails as spam or ham. Spam filtering is a technique that detects unsolicited emails and prevents them from being delivered to the user's inbox. A typical Fourth Industrial Revolution scenario/platform comprises data and machine learning techniques for a better understanding of the user, product manufacturing, monitoring of product quality, and distribution logistics with user feedback.
Data is captured through sensors, transferred to a cloud server over the internet, and analyzed through machine learning algorithms (supervised, unsupervised, and reinforcement). Intelligent decisions are then made for Industry 4.0 users based on factors such as frequency of use, preferences, modes of use, and other related schedules. In addition to statistical data analysis, machine learning and machine vision technology can be applied for automatic, large-scale, highly accurate product inspections, particularly for identifying complex defects that are not easily visible to the human eye. One of the most diverse fields of the Industry 4.0 revolution is the auto-pilot system, which will play a crucial role in logistics distribution. Within the next decade, it is expected that machine learning, computer vision, and control technology will be fully commercialized for automated driving, making delivery and logistics much more straightforward while significantly reducing costs. Fig. (1) shows the various phases of the industrial revolution.


Fig. (1). Industrial revolutions.

Current knowledge about the capabilities needed in Industry 4.0 is inadequate. This research aims to outline email spam filtering from an Industry 4.0 perspective, employing text mining on readily available email spam data, since email is frequently used as a channel for gathering potential information. The distinguishing characteristics of the fourth industrial revolution primarily include:

Integration and Interconnection

From the Industry 4.0 perspective, sensors are integrated and embedded into the hardware. With the help of machine learning engines, everything becomes interconnected: it becomes possible to seamlessly connect people to people (P2P), machines to machines (M2M), people to machines (P2M), and services to services (S2S). Through integration and interconnection, everything from production to services, such as equipment, production lines, factories, and services, can be linked together efficiently and effectively.

Data and Digitalization

Data components include various aspects of production and associated services, such as equipment data, product data, supply chain data, operational data, R&D, and user-related data. Machine learning algorithms must be trained and tested with sufficient data for efficient outcomes. Deploying machine learning algorithms requires data generation to control the production processes. Thus, through the digitization of data, the associated processes can be automated.

Refinement and Personalization

The Fourth Industrial Revolution accommodates relatively specific and customized requirements that are becoming increasingly refined. Each part of the production line becomes more modular and refined, leading to personalized production as far as possible.

Some of the significant applications of machine learning in Industry 4.0 include:

Smart Manufacturing

Applying machine learning techniques to automate production processes has turned them into smart manufacturing.

Automated Vehicles and Machines

Automated, driverless vehicles and machines are transforming global logistics and allow the replacement of operators in tasks that entail a physical risk.

Quality Control

Traditionally, products are evaluated late in the production process. With the advancement of intelligent IoT devices, like smart sensors, quality can be controlled effectively during the production phase through machine learning algorithms.

Predictive Maintenance

Smart sensors can be deployed to capture valuable information about the status of machines, making machine management increasingly affordable. The recorded data can be used later to train machine learning-based models that predict the incorrect operation of individual components of different manufacturing machines.

Demand Predictions

Adapting production to demand, that is, handling the supply-demand chain efficiently, is a common problem in industry. If the production conditions are favorable, saving surpluses is not advisable. With machine learning models, the expected demand can be predicted optimally.

Chatbots

Smart chatbots can be designed and developed to ease business, particularly from the customer's perspective. Chatbots are automated systems that converse with users through text and voice. They can be used to obtain vital information and to perform various customer service tasks.


BACKGROUND & MOTIVATION Handling unsolicited bulk emails has become a tough challenge over the internet. According to Statista's online report, email users have amounted to 3.9 billion globally, and by 2024 this Fig. will grow by 4.48 billion globally. Presently almost 55 percent of incoming emails are delivered as spam emails, including promotional emails in daily life routines [1 - 6]. These malicious emails can consume a lot of vital network resources. Automatic text classification is a popularly applied approach to spam filtering. Significant issues like time and space complexity, including suspicious delivery of e-contents to the receiver, have posed severe concerns. Spam emails are becoming an overhead consumption of computational resources and human efforts to manage them effectively [3, 8]. The growing percentage of spam emails and messages has opened a new dimension between spammers and researchers to handle them effectively. Therefore, intelligent email filtering techniques and tools are much needed for safe and secure public, private, and commercial communication. Alternatively, knowledge engineering is also widely employed for email classification using a set of rules. Guzella and Caminhas (2009) elaborated on the classification rules that may be created by third-party vendors [10]. The major drawback of the rule-based approach is that these rules must be updated frequently in a dynamic context. Machine learning is a practical and viable alternative approach since no specific rules are generated explicitly; instead, a set of training samples is utilized [11 - 13]. Machine learning algorithms develop multiple models using different data sets collected from some standard repository or corpus, and each model is analogous to an experience [38]. Machine learning algorithms can be categorized as parametric and non-parametric. 
Different data distributions, such as probability distribution, exponential distribution, Poisson distribution, etc., are applied in various predictive models. Non-parametric models do not require any assumption of the data distribution. Data sets have the values of input variables and associated outcomes as target or response variables. The algorithm learns from training data sets and makes predictions for unknown samples based on the type of training provided to the model. The flow diagram for spam mail filtering using state-of-the-art learning techniques employed in our study is summarized in Fig. (2). Spam Filtering Using Machine Learning Approaches This section explores various approaches adopted by industry leaders dealing with spam email classification using different techniques. Various ensembles and nonensembles are applied for text classification effectively [40]. Widely used supervised learning methods like K-nearest neighbor, Naive Bayes, DTs, Neural

30 Artificial Intelligence and Data Science

Kumar et al.

Networks, SVMs, Bagging, and Boosting are of significant concern for classification problems [1 - 8]. Unsupervised machine learning can be utilized to discover the hidden pattern for detecting anomalies like network intrusion or spam messages. All the leading ISPs (Internet Service Providers) like Gmail, Yahoo, Outlook, and AOL have incorporated the best practices for spam filtering to prevent legitimate emails from being received as spam or delivering spam emails to the user's inbox. Anti-spam tools and firewall security are also deployed to address the unpleasant delivery of suspicious emails and possible threats. From the industry 4.0 perspective, spam filters have been deployed by all the leading Internet Service Providers (ISPs) at various network layers incorporating firewalls [8]. Data Pre-processing Techniques The internet is the most significant source of information in today's world, where enormous data is available. Finding insights from this unstructured text requires extensive data preprocessing (data cleaning, integration, transformation, reduction, and discretization). To categorize emails as spam or ham, we require extensive preprocessing of text data for effective categorization (Forman, 2003). Various filtering techniques are applied for spam email categorization [36, 37]. Table 1 presents the most widely used approaches. Table 1. Filtering techniques for email spam filtering. Filtering Techniques

Description with Limitations

Content-based filtering method

Filtering rules are created to categorize the emails using different classifiers like Bayesian classification, Naïve Bayes, KNN, ANNs, and SVMs. This method analyzes the contents of emails, such as words, frequency, and distribution of words and phrases. Then it generates the procedures to filter the received emails [14].

Case-based spam filtering

It is another widely used filtering approach. In this method, spam/non-spam emails are fetched from the user's inbox, which may contain some irrelevant details that need to be preprocessed [39].

Heuristic-based spam filtering

The heuristic method applies previously generated procedures to evaluate the patterns of text data comprising of some regular expressions versus selected messages. Such patterns improve the message's score. Contrarily, the score gets reduced when the patterns do not match. If the message's score exceeds a threshold, it is filtered as legitimate mail; otherwise treated as invalid. However, the ranking rules do not perform constantly. These rules need to be updated regularly to capture the spammers effectively.

Machine Learning Approaches

Artificial Intelligence and Data Science 31

(Table 1) cont.....

Filtering Techniques

Description with Limitations

Previous likenessbased spam filtering

This approach classifies the incoming emails with known training text based on a resemblance known as memory-based learning. Email attributes are applied to generate a multidimensional space vector to create new instances. These instances are assigned to the most frequent nearest kth class of training instances. This method employs k-NN for spam filtering [16].

Adaptive-spam filtering

This method filters spam emails by arranging them into different classes. Text emails are divided into various groups sharing common characteristics. Each incoming email is compared with the individual groups to determine the group to which it most probably belongs [15].

Fig. (2). Flow sequence of ML approaches for spam mail classification.

32 Artificial Intelligence and Data Science

Kumar et al.

Spam Filtering: A Comparative Study of Machine Learning Approaches

A set of emails, including legitimate and non-legitimate emails, is used to train the classification model with labeled emails; the model is then validated with previously unseen emails, which are categorized as spam or ham (non-spam or legitimate). For a given dataset D containing N observations classified with label C, automatic text classification may be represented as X={x1, x2, ..., xN} categorized with label C={c1, c2,..., cN}. A text document can also be assigned to multiple types [7]. In this study, we address binary categorization only, i.e., legitimate or non-legitimate emails (ham or spam).

Data Repositories

The model should be validated for performance analysis on large-size updated datasets from public, private, and commercial domains. Three benchmark multivariate datasets were collected from standard repositories for predictive modeling to classify emails as spam and non-spam. These datasets, divided into training and testing, are defined in Table 2.

Table 2. Summary of spam dataset utilized.

Spam mail/message dataset          Criteria    Training data   Testing data
DS-I. UCI Spambase                 Spam        1269            544
                                   Non-spam    1952            836
                                   Total       3221            1380
DS-II Spam dataset                 Spam        523             224
                                   Non-spam    3377            1447
                                   Total       3900            1671
DS-III corpus Ham-spam emails      Spam        958             410
                                   Non-spam    3052            1308
                                   Total       4010            1718

DS-I is a multivariate dataset collected from the UCI ML Repository http://archive.ics.uci.edu/ml [33]. DS-II is a set of 5574 SMS messages tagged as legitimate or spam, with one message per line. DS-III is a collection of emails for spam classification extracted from Kaggle. It has more than 5000 email samples labeled spam or ham (https://www.kaggle.com/rushirdx/spam-and-ham-dataset?select=spam.csv) [34, 35].

The dependent variable represents whether an input email was considered spam or non-spam. In DS-II and DS-III, 1 and 0 represent the spam and non-spam emails, respectively. In DS-I, most of the input attributes represent the occurrence of a particular word or character. Datasets were split into three parts: training, testing, and validation. The training component comprised 70%, and 30% were reserved for testing and validation.
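The 70/30 partition described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `train_test_split`, the fixed seed, and the toy `emails` list are assumptions made for the example and are not taken from the chapter, which does not state how the split was implemented.

```python
import random


def train_test_split(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of `samples` and split it into training and testing parts.

    Hypothetical helper: the 0.7 default mirrors the 70% training share
    mentioned in the text; the seed only makes the shuffle repeatable.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Toy data: (text, label) pairs with 1 = spam and 0 = non-spam.
emails = [(f"message {i}", i % 2) for i in range(10)]
train, test = train_test_split(emails)
print(len(train), len(test))
```

The held-out 30% could be divided once more with the same helper if a separate validation part is wanted.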


Performance Measurement

Table 3 shows the criteria for performance evaluation. A confusion matrix is generated by comparing the actual values with the predicted values for the observations of a given dataset. The confusion matrix is also referred to as the error matrix or classification table [20]. The confusion matrix is applied to a categorical response variable, whereas error metrics are applied to continuous variables.

Table 3. Criteria for performance evaluation.

Performance-assessment criteria: Metrics-type. Metrics-description.

Classifier's accuracy: Accuracy = (TP + TN) / Total observations. The percentage of all correctly classified instances.

Classifier error: Misclassified = 1 - Accuracy. The percentage of incorrectly classified emails.

Recall / TPR: TPR = TP / (TP + FN). The percentage of spam emails correctly classified as spam, i.e., the share of spam emails prevented from entering the email inbox by the predictive model.

Specificity / TNR: TNR = TN / (TN + FP) = 1 - FPR. The ratio of predicted negative outcomes to the total negative observations. It represents the ability of a predictive model to categorize legitimate emails.

Spam Precision: Precision = TP / (TP + FP). The ratio of correctly predicted spam emails to all emails predicted as spam.

FPR: FPR = FP / (TN + FP). The ratio of negative observations predicted as positive to the total negative observations.

FNR: FNR = 1 - TPR. The ratio of spam emails predicted as negative to the total number of spam emails.

NPV: NPV = TN / (TN + FN). The ratio of correctly predicted negative results to the total predicted negative observations.

F-Measure (F): F = (2 x Recall x Precision) / (Recall + Precision). The harmonic mean of recall and precision.

TP: The number of spam emails predicted as positive (spam).

FP: The number of emails predicted as positive that are not spam emails. FP is also known as the Type I Error.

FN: The number of emails predicted as negative that are spam emails. FN is also known as the Type II Error.

TN: The number of legitimate emails predicted as negative.

Rate of error: The proportion of overall incorrect samples to the total number of observations. It can be defined as (N - correct) / N, where N is the total number of observations.
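The criteria in Table 3 follow directly from the four confusion-matrix counts. The sketch below computes them with the standard textbook definitions; the function name `classification_metrics` and the example counts are illustrative assumptions and do not come from the chapter's experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive the Table 3 criteria from confusion-matrix counts.

    Spam is treated as the positive class. The counts are assumed to make
    every denominator non-zero; no zero-division guard is included.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # TPR
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),  # TNR = 1 - FPR
        "fpr": fp / (fp + tn),
        "fnr": 1 - recall,
        "npv": tn / (tn + fn),
        "f_measure": 2 * recall * precision / (recall + precision),
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```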

MACHINE LEARNING APPROACHES

This section quantitatively explains the Bagging and Boosting (Decision Trees, Random Forest, Gradient Boosting, AdaBoost) and non-ensemble (Naive Bayes, ANN, and SVM) machine learning techniques when applied to spam datasets (DS-I, DS-II, DS-III). These seven potential classification techniques are used for spam mail filtering and text classification as follows:

Decision Tree Modeling

The decision tree induction algorithm is a popular supervised learning algorithm suitable for binary classification, such as spam or non-spam. The model learns from the training datasets and makes predictions for unknown samples based on the training provided. DTs are built with a greedy method that constructs the tree recursively in a divide-and-conquer way. The tree starts with a root node and then iteratively applies a statistical measure, such as the Gini impurity index or entropy, to split each node into branches. The original datasets are divided into subsets using input features [17, 18]. The samples are tested to classify incoming emails as legitimate or spam, and each unique path is maintained from the root to a leaf node. The DTs generate a rule for each leaf node of the tree. Based on these rules, incoming emails are classified into legitimate and spam emails. DTs work well with both numerical and categorical data [19, 20]. Table 4 presents the results of the DT classifier on the three datasets.

Table 4. Performance measures of Decision Tree Classifier.

Decision Trees   Classification   Precision   Recall
DS-I             Spam             0.88        0.87
                 Non-spam         0.92        0.93
DS-II            Spam             0.83        0.99
                 Non-spam         0.92        0.45
DS-III           Spam             0.69        0.99
                 Non-spam         0.99        0.85
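The two split criteria named in the decision tree discussion, Gini impurity and entropy, can be sketched as below. The helper names (`gini`, `entropy`, `split_gini`) and the toy labels are assumptions for illustration; the chapter does not say which criterion or library its experiments used.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a candidate binary split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


mixed = ["spam", "ham", "spam", "ham"]
print(gini(mixed), entropy(mixed))
# A split that separates the classes perfectly has zero weighted impurity.
print(split_gini(["spam", "spam"], ["ham", "ham"]))
```

A greedy tree builder would evaluate `split_gini` (or the analogous entropy-based gain) for each candidate feature threshold and keep the split with the lowest impurity.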


Random Forest

Random forest is an ensemble technique that takes a group of estimators and builds multiple models to generate improved accuracy. RF provides more accurate results than a single classifier since different models are combined to produce an overall outcome based on majority voting. Bagging (Bootstrap Aggregating) and Boosting are popular ensemble techniques. Bagging combines the results of multiple base models to get improved results. Bootstrapping is a sampling technique with replacement, where subsets of samples are generated from the original dataset. Bagging applies these subsets, known as bags, to get the distribution of a complete set. The size of the subsets may differ from that of the original set. RF is generated using DTs as base estimators. RF randomly picks a set of attributes to decide the best split at each node of every decision tree. RF selects these random data points and features to build multiple trees. Random subsets are created from the given spam datasets and corpus. A DT model is fitted on each subset containing random features, and the final prediction is computed by aggregating the predictions from all individual DTs [17 - 20]. Table 5 presents the results of the RF classifier on the three datasets.

Table 5. Performance measures of Random Forest Classifier.

Random Forest    Classification   Precision   Recall
DS-I             Spam             0.96        0.90
                 Non-spam         0.94        0.97
DS-II            Spam             0.99        0.86
                 Non-spam         0.98        1.00
DS-III           Spam             0.99        0.67
                 Non-spam         0.90        1.00
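The two ingredients of bagging described in the Random Forest section, bootstrap sampling with replacement and aggregation by majority vote, can be sketched as follows. The helper names and the toy data are assumptions for the example; each bag here has the same size as the original set, which is one common choice and not something the chapter specifies.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from `data` with replacement (one 'bag')."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Return the label predicted by the largest number of base models."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
emails = ["e1", "e2", "e3", "e4", "e5"]
bags = [bootstrap_sample(emails, rng) for _ in range(3)]
print(bags)

# Hypothetical outputs of three trees for a single incoming email.
print(majority_vote(["spam", "ham", "spam"]))
```

In a random forest, each bag would train one decision tree that also considers only a random subset of features at every split.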

Boosting is also widely used as an ensemble technique. Boosting is employed to turn weak classifiers into a more robust model. A weak model is one with a significant error rate, though usually below 50%, so that it is only marginally better than random guessing. The primary purpose of boosting is to emphasize the samples that are hard to classify accurately. Boosting builds multiple models sequentially, initially assigning equal weights to all samples and then targeting the misclassified samples in subsequent models. Two popular algorithms are Gradient Boosting and AdaBoost.


Gradient Boosted Model (GBM)

GBM is another useful ensemble prediction model that applies a decision tree as a base classifier. Gradient boosting focuses on the residuals from earlier classifiers and fits each new model to those residuals. The gradient descent algorithm is employed to minimize the error. GBM applies boosted machine learning to the emails extracted from each spam dataset. GBM constructs one tree at a time, where each new tree helps to rectify the errors made by the previously trained trees, unlike the RF classification model, where each tree is built independently of the previously constructed trees [21 - 23]. The training instances {(x1, y1), ..., (xi, yi)} are extracted from the spam datasets and corpus. Here xi ∈ Rn and yi ∈ {+1, -1} denotes the outcome for the ith training sample, with +1 indicating spam and -1 non-spam email. A voted combination of classifiers F(X) can be represented as:

$$F(X) = \sum_{t=1}^{T} w_t f_t(x) \qquad (1)$$

where ft(x): Rn → {+1, -1} are the base classifiers and wt ∈ R are the weights of the base classifiers in the combined classifier. A positive value of F(X) represents spam mail, and a negative value corresponds to legitimate mail (non-spam). Table 6 presents the results of the Gradient Boosting classifier on the three datasets.

Table 6. Performance measures of Gradient Boosting Classifier.

GBM              Classification   Precision   Recall
DS-I             Spam             0.95        0.92
                 Non-spam         0.95        0.97
DS-II            Spam             0.97        1.00
                 Non-spam         0.98        0.88
DS-III           Spam             0.97        0.96
                 Non-spam         0.99        0.99

AdaBoost Method

The AdaBoost method repeatedly applies a sequence of weak learners (decision trees and logit) to modified versions of the text data. For a given dataset of emails/messages, the training samples can be represented as X={(x1, y1), ..., (xn, yn)}, where xi is the input feature and yi ∈ {+1, -1} is the corresponding target variable, with +1 indicating spam mail and -1 legitimate mail. The AdaBoost method boosts the accuracy of a weak learner by simulating multiple distributions over the training samples and takes the weighted majority vote of the resulting outcomes. Initially, a set of weights is applied to the training samples and updated after every round of training. The weights of the incorrectly classified samples are increased, whereas the correctly classified samples are assigned lower weights. Through this weight-update mechanism, the base learner concentrates on the harder samples during the training process. The overall model prediction is the sign of the weighted sum of all classifiers [24]:

$$F(X_i) = \operatorname{sign}\left( \sum_{k=1}^{K} \alpha_k f_k(X_i) \right) \qquad (2)$$

where K represents the total number of classifiers utilized, fk(Xi) is the outcome of weak classifier k for the corresponding feature Xi, and αk is the weight assigned to classifier k, computed as:

$$\alpha_k = \frac{1}{2} \ln\left( \frac{1 - \varepsilon_k}{\varepsilon_k} \right) \qquad (3)$$

where εk represents the error rate of classifier k, that is, the number of incorrectly classified training samples divided by the total number of training samples, and F(Xi) indicates the combination of all the weak classifiers. Table 7 and Table 8 present the results of the AdaBoost classifier using a logit model and a DT model as the base learner, respectively.

Table 7. Performance measures of AdaBoost Classifier when the base learner is a logit model.

AdaBoost         Classification   Precision   Recall
DS-I             Spam             0.91        0.89
                 Non-spam         0.93        0.94
DS-II            Spam             0.95        1.00
                 Non-spam         0.99        0.63
DS-III           Spam             0.98        0.96
                 Non-spam         0.99        0.99


Table 8. Performance measures of AdaBoost Classifier when the base learner is Decision Tree.

AdaBoost         Classification   Precision   Recall
DS-I             Spam             0.95        0.94
                 Non-spam         0.96        0.96
DS-II            Spam             0.98        0.99
                 Non-spam         0.93        0.90
DS-III           Spam             0.95        0.92
                 Non-spam         0.97        0.98
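The AdaBoost quantities in Equations (2) and (3) can be sketched as follows. The classifier weight and the signed vote follow the equations in the text; the exponential sample re-weighting rule in `update_weights` is the standard AdaBoost form and is an assumption here, since the chapter describes the re-weighting only in words. All names and numbers are illustrative.

```python
import math


def classifier_weight(error_rate):
    """Equation (3): alpha_k = 0.5 * ln((1 - eps_k) / eps_k), for 0 < eps_k < 1."""
    return 0.5 * math.log((1 - error_rate) / error_rate)


def update_weights(weights, y_true, y_pred, alpha):
    """Raise the weights of misclassified samples, lower the others, renormalise.

    Assumed standard rule: w_i <- w_i * exp(-alpha * y_i * f(x_i)).
    Labels and predictions are +1 (spam) or -1 (legitimate).
    """
    updated = [w * math.exp(-alpha * y * p)
               for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return [w / total for w in updated]


def ensemble_predict(alphas, votes):
    """Equation (2): sign of the alpha-weighted sum of the weak-learner votes."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1


# One round on four equally weighted samples; the second one is misclassified.
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]
alpha = classifier_weight(0.25)
new_weights = update_weights([0.25] * 4, y_true, y_pred, alpha)
print(alpha, new_weights)
print(ensemble_predict([0.55, 0.2, 0.2], [1, -1, -1]))
```

With an error rate of 0.25, the single misclassified sample ends the round holding half of the total weight, which is what steers the next weak learner towards it.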

Naive Bayes Classification

Naive Bayes is a simple probabilistic classification technique broadly applied in text classification. The probabilistic approach of the NB model can evaluate the probability of each class of legitimate and non-legitimate emails [9]. It is based on Bayes' theorem and assumes that the features are statistically independent. For given datasets D={D1, D2, ..., Dn} of emails with input X={X1, X2, ..., Xn} labeled with class C={C1, C2, ..., Cn}, Bayes' theorem can be represented as:

$$P(C \mid X) = \frac{P(X \mid C)\, P(C)}{P(X)} \qquad (4)$$

From the spam datasets, if an email contains specific keywords such as stop-words, non-words, or lemmatized words, then it is considered spam, and non-spam otherwise [27]. The problem of spam email classification can be formulated as follows:

$$P(\text{class} = \text{spam} \mid \text{contains "keywords"}) = \frac{P(\text{contains "keywords"} \mid \text{class} = \text{spam})\, P(\text{class} = \text{spam})}{P(\text{contains "keywords"})} \qquad (5)$$

P(class=spam | contains="keywords") is the probability of an email being spam, given that this email contains any word from the predefined set of keywords.

P(contains="keywords" | class=spam) is the probability of an email containing any words from the set of keywords given that this email has been recognized as spam. It is estimated from the training data, which correlates the emails considered spam with the keywords associated with the mail.

P(class=spam) is the probability of an email being spam without prior knowledge of its keywords. It is the proportion of emails being spam in our entire training set.

P(contains="keywords") is the probability of an email containing any words from the set of keywords. It is the proportion of emails containing the keywords in the training set.

Table 9 presents the performance of the Naïve Bayes model applied to the three datasets.

Table 9. Performance measures of Naïve Bayes Classifier.

Naive Bayes      Classification   Precision   Recall
DS-I             Spam             0.88        0.78
                 Non-spam         0.87        0.93
DS-II            Spam             0.97        0.90
                 Non-spam         0.99        1.00
DS-III           Spam             0.75        0.99
                 Non-spam         1.00        0.89
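Equation (5) can be worked through with simple counts from a training set. The sketch below estimates each term as a proportion, as described above; the function name `spam_posterior` and the counts are hypothetical and serve only to show the arithmetic.

```python
def spam_posterior(n_spam, n_ham, n_spam_with_kw, n_ham_with_kw):
    """P(spam | contains keywords) from training-set counts, per Equation (5)."""
    total = n_spam + n_ham
    p_spam = n_spam / total                              # P(class=spam)
    p_kw_given_spam = n_spam_with_kw / n_spam            # P(contains | spam)
    p_kw = (n_spam_with_kw + n_ham_with_kw) / total      # P(contains)
    return p_kw_given_spam * p_spam / p_kw


# Hypothetical training set: 40 spam and 60 legitimate emails;
# the keywords occur in 30 of the spam and 6 of the legitimate emails.
posterior = spam_posterior(n_spam=40, n_ham=60, n_spam_with_kw=30, n_ham_with_kw=6)
print(round(posterior, 4))
```

Here the result equals 30/36: of the 36 training emails that contain the keywords, 30 are spam.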

Artificial Neural Network

ANNs have a massive capability of mapping and parallel computations. ANNs deal with a vast number of processing elements referred to as neurons. The ANN is comprised of several interconnecting artificial neurons. It is flexible in adapting its structure according to the inputs associated with weights and bias flowing in the network during learning [28]. Mathematically, the single-layer ANN can be represented as:

$$net_k = b_k + \sum_{i=1}^{n} x_i w_{ki} \quad \text{and} \quad y_k = f(net_k) \qquad (6)$$

where n is the number of incoming emails categorized as spam or ham, i.e., X={x1, x2, x3, ..., xn}, and f is the activation function applied to get the generalized results. ∑ represents the summation function, and wki represents the links with the corresponding weights. yk denotes the output of the neural network. The classification function using ANN for mapping the response variable to explanatory inputs to predict the email as spam or non-spam can be written as C=f(X) [29, 30]. The architecture of the ANN model using a Multilayer Perceptron (MLP) is shown in Fig. (3). The ANN is trained using Gradient Descent algorithms by adjusting the weights. The output of the ANN model is taken as a combination to generate the overall outcomes of the model. The purpose of the MLP is to achieve the linear function of a given feature vector, f(x) = wT x + b, such that f(x) > 0 for vectors of legitimate emails and f(x) < 0 for the other category, spam emails.

..., in this case, there is no transition between the two categories. Although the HITS (Hypertext Induced Topic Search) algorithm is better than the other POI recommendation algorithms, it suffers from recommending irrelevant recommendations to the users.

[19]

The main goal is to recommend the different locations of interest along with changes in time and current locations of the user.

Factorized Personalized Markov Chain

Pair-wise interaction tensor factorization, Time Aware Factorized Personalized Markov Chain, Distance weighted Hyperlink-Induced Topic Search

The pair-wise interaction tensor factorization (PITF) predicts the category list. A fourth-order tensor factorization approach integrates users' short-term and long-term influence, including time variant preferences. In the second phase, the location ranking list is obtained according to the categorical ranking list fetched from the first step.

[20]

The aim is to recommend POIs based on the user's interest in specific

Long Short-Term Memory (LSTM)

Multi-source topical package LSTM.

Social media data from multiple sources was used to measure space and recommend travel routes.

It may not always outperform other state-of-the-art methods and will use more computational resources than the traditional LSTM method.

Introduction to Various Parameters

Artificial Intelligence and Data Science 203

CONCLUSION AND FUTURE SCOPE

In this chapter, several parameters of recommendations are discussed. A few of them are highlighted, such as user preference, popularity, and weather details. The heterogeneous features make the prediction better in the case of the recommendation of POIs. Many techniques have been proposed by other researchers that are based on the above-mentioned contexts. Along with these research works, some problems or loopholes are left out, which are challenging. For that, new methods need to be developed, which can be considered as the improved or extended versions of the existing works that can reduce the challenges and provide a recommendation process in a wide range of applications, considering the better quality and accuracy aspects.

ACKNOWLEDGMENTS

The work described in this paper was fully supported by Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India.

REFERENCES

[1]

C.I. Eke, A.A. Norman, L. Shuib, and H.F. Nweke, "A Survey of User Profiling: State-of-the-Art, Challenges, and Solutions", IEEE Access, vol. 7, pp. 144907-144924, 2019. [http://dx.doi.org/10.1109/ACCESS.2019.2944243]

[2]

R. Logesh, and V. Subramaniyaswamy, "A Reliable Point of Interest Recommendation based on Trust Relevancy between Users", Wirel. Pers. Commun., vol. 97, no. 2, pp. 2751-2780, 2017. [http://dx.doi.org/10.1007/s11277-017-4633-1]

[3]

M. Gao, K. Liu, and Z. Wu, "Personalisation in web computing and informatics: Theories, techniques, applications, and future research", Inf. Syst. Front., vol. 12, no. 5, pp. 607-629, 2010. [http://dx.doi.org/10.1007/s10796-009-9199-3]

[4]

G. Liao, S. Jiang, Z. Zhiheng, C. Wan, and X. Liu, "POI Recommendation of Location-Based Social Networks Using Tensor Factorization", 19th IEEE International Conference on Mobile Data Management, pp. 116-124, 2018.

[5]

J. Chen, W. Zhang, P. Zhang, P. Ying, K. Niu, and M. Zou, "Exploiting Spatial and Temporal for Point of Interest Recommendation", Complexity, vol. 2018, pp. 1-16, 2018. [http://dx.doi.org/10.1155/2018/6928605]

[6]

S. Zhao, I. King, and M.R. Lyu, "Aggregated Temporal Tensor Factorization Model for Point-of-Interest Recommendation", Neural Process. Lett., vol. 47, no. 3, pp. 975-992, 2018. [http://dx.doi.org/10.1007/s11063-017-9681-8]

[7]

M. Debnath, P.K. Tripathi, A.K. Biswas, and R. Elmasri, "Preference Aware Travel Route Recommendation with Temporal Influence", 2nd ACM SIGSPATIAL Workshop on Recommendations for Location-based Services and Social Networks, pp. 1-9, 2018. [http://dx.doi.org/10.1145/3282825.3282829]

[8]

L. Cai, W. Wen, B. Wu, and X. Yang, "A coarse-to-fine user preferences prediction method for point-of-interest recommendation", Neurocomputing, vol. 422, pp. 1-11, 2021. [http://dx.doi.org/10.1016/j.neucom.2020.09.034]

204 Artificial Intelligence and Data Science

Roy et al.

[9]

42nd IEEE International Conference on Computer Software Applications, pp. 57-62, 2018.

[10]

R. Li, Y. Shen, and Y. Zhu, "Next Point-of-Interest Recommendation with Temporal and Multi-level Context Attention", IEEE International Conference on Data Mining, pp. 1110-1115, 2018. [http://dx.doi.org/10.1109/ICDM.2018.00144]

[11]

Z. Yao, Y. Fu, B. Liu, Y. Liu, and H. Xiong, "POI Recommendation: A Temporal Matching between POI Popularity and User Regularity", [http://dx.doi.org/10.1109/ICDM.2016.0066]

[12]

Sarkar, A. Majumder, C.R. Panigrahi, and S. Roy, "Multitour: A multiple itinerary tourists recommendation engine", Electron. Commerce Res. Appl., vol. 40, pp. 1-20, 2020.

[13]

C. Trattner, A. Oberegger, L. Marinho, and D. Parra, "Investigating the utility of the weather context for point of interest recommendations", Inf. Technol. Tour., vol. 19, no. 1-4, pp. 117-150, 2018. [http://dx.doi.org/10.1007/s40558-017-0100-9]

[14]

C. Trattner, A. Oberegger, L. Eberhard, D. Parra, and L.B. Marinho, "Understanding the impact of weather for POI recommendations", RecTour, pp. 16-23, 2016.

[15]

J. Lu, and M.A. Indeche, "Multi-Context-Aware Location Recommendation Using Tensor Decomposition", IEEE Access, vol. 8, pp. 61327-61339, 2020. [http://dx.doi.org/10.1109/ACCESS.2020.2983555]

[16]

W. Luan, G. Liu, C. Jiang, and L. Qi, "Partition-based collaborative tensor factorization for POI recommendation", IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 3, pp. 437-446, 2017.

[17]

T. Kurashima, T. Iwata, G. Irie, and K. Fujimura, "Travel route recommendation using geotags in photo sharing sites", Proceedings of the 19th ACM international conference on information and knowledge management, pp. 579-588, 2010. [http://dx.doi.org/10.1145/1871437.1871513]

[18]

C. Lucchese, R. Perego, F. Silvestri, H. Vahabi, and R. Venturini, "How random walks can help tourism", European Conference on Information Retrieval, pp. 195-206, 2012.

[19]

X. Li, M. Jiang, H. Hong, and L. Liao, "A Time-Aware Personalized Point-of-Interest Recommendation via High-Order Tensor Factorization", ACM Trans. Inf. Syst., vol. 35, no. 4, pp. 123, 2017. [TOIS]. [http://dx.doi.org/10.1145/3057283]

[20]

G. Hu, Y. Qin, and J. Shao, "Personalized travel route recommendation from multi-source social media data", Multimedia Tools Appl., vol. 79, no. 45-46, pp. 33365-33380, 2020. [http://dx.doi.org/10.1007/s11042-018-6776-9]

[21]

T. Qian, B. Liu, Q.V.H. Nguyen, and H. Yin, "Spatiotemporal representation learning for translation-based POI recommendation", ACM Trans. Inf. Syst., vol. 37, no. 2, pp. 1-24, 2019. [TOIS]. [http://dx.doi.org/10.1145/3295499]

[22]

Q. Wang, H. Yin, T. Chen, Z. Huang, H. Wang, Y. Zhao, and N.Q.V. Hung, "Next point-of-interest recommendation on resource-constrained mobile devices", The Web Conference, pp. 906-916, 2020.

[23]

S. Safavi, and M. Jalali, "RecPOID: POI Recommendation with Friendship Aware and Deep CNN", Future Internet, vol. 13, no. 3, p. 79, 2021. [http://dx.doi.org/10.3390/fi13030079]

[24]

Sarkar, A. Majumder, "A new point-of-interest approach based on multi-itinerary recommendation engine", Expert Syst. Appl., vol. 181, no. 115026, 2021.

Artificial Intelligence and Data Science, 2023, 205-215

205

CHAPTER 10

Mobile Tourism Recommendation System for Visually Disabled

Pooja Selvarajan1, Poovizhi Selvan1,*, Vidhushavarshini Sureshkumar1 and Sathiyabhama Balasubramaniam1

1 Department of Computer Science and Engineering, Sona College of Technology, Tamilnadu, India

Abstract: A Mobile Tourism Recommendation System recommends to a tourist the best attractions in a particular place according to their preferences, profile and interests. First, a recommender system offers a list of the city places likely to interest the user. This list is estimated from the user's demographic classification, likes in former trips, and preferences for the current visit. Second, a planning module schedules the list of recommended places according to their characteristics and user limitations. The planning system decides how and when to perform the recommended activities. For implementing these recommender methods, we have applied different machine learning algorithms, which are the K-nearest neighbors (K-NN) for both content-based (CB) and collaborative filtering (CF) and the decision tree for demographic filtering (DF). Thus, a recommendation system for tourists helps them with user-friendly planning. Blind people can also use this application. It provides complete voice assistance for easy navigation via a simple button click. Vibratory and voice feedback is provided for accurate crash alerts for visually challenged people. The application draws its smartness from Android and Internet of Things (IoT) support. Since blind-support applications and devices are expensive and many blind people cannot afford them, we aim to put forth a novel, low-cost and reliable approach to help the blind explore the possibilities and power of smartphone technology in navigation. We additionally aim to identify the static factors that should be addressed, such as food, cleanliness, and opening times, and that are useful for suggesting a tourist place depending on the travel history of the user. In this study, we propose a cross-mapping table approach based on the location's popularity, ratings, latent topics, and sentiments. An objective function for recommendation optimization is formulated based on these mappings. Our results show that the combined features of Latent Dirichlet Allocation (LDA), Support vector machines (SVM), ratings, and cross mappings are helpful for enhanced performance. The fundamental motivation of this study was to help businesses related to tourism.

* Corresponding author Poovizhi Selvan: Department of Computer Science and Engineering, Sona College of Technology, Tamilnadu, India; E-mail: [email protected]

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

206 Artificial Intelligence and Data Science

Selvarajan et al.

Keywords: Requirement Portal, Hybrid Recommendation System, Personalized Recommendation System, Deep map, KNN Algorithm, Ultrasonic Sensor.

INTRODUCTION

The tourism industry has always welcomed new technologies. With the usage of the web and electronics, mobile tourism (e-tourism) has grown significantly. At the same time, tourists play a more active role in the process of content production. They actively publish information using web technologies, such as social networks, blogs, and wikis. The users also post dynamic content about visited destinations or relevant information about their visit, which can be helpful to other tourists [1 - 3]. Extending the notion of electronic tourism to meet the vision of tourist services and provisioning of migrant users with no spatio-temporal limitations is expected to become a reality within the next few years. In this context, the 'Mobile Tourism' field has emerged, wherein tourist information and services are accessed via mobile devices.

Blind mobility is one of the main challenges faced worldwide. The number of visually challenged people is around 285 million, of whom 39 million are blind. Since blind support devices are expensive, many blind people cannot afford them. This work also aims to develop a cheap blind guidance system for developing countries. This chapter aims to develop a low-cost intelligent system for guiding the visually challenged by providing information about the environmental scenario of static and dynamic objects around them. The main functions of this system are path indication and environment recognition. The system consists of a smartphone and a small embedded circuit. Smartphone software is developed to recognize the destination place through a voice command and draw a route from the current position to the desired destination [4]. Ultrasonic sensors provide information about obstacles that are within their range. The ultrasonic sensors are connected to a microcontroller circuit, and a microcontroller timer module is used to regulate the sensor outputs and send them to a smartphone via Bluetooth.

Mobile tourism represents a recent trend in the tourism field that involves the use of tourist applications offering services and tours with multimedia content executed on mobile electronic devices [5]. The Tourism Recommendation System aims to enhance tourist decision-making. The tourist needs to understand how the system generates recommendations [6, 7]. It reduces users' efforts and preserves their privacy. The main purpose of this application is to develop a recommendation system based on the user's enhanced profiles [8]. These profiles will be composed of functionality levels regarding accessibility issues. Locomotion, vision, and phobias like acrophobia, agoraphobia, and claustrophobia are some accessibility issues. The user's basic information, like age, gender, and nationality, is also provided by the user

Mobile Tourism Recommendation

Artificial Intelligence and Data Science 207

profiles. This research assumes a vital role in tourism recommendation systems and other areas where individual user knowledge is a key factor. One of the most important goals of this proposal is to fulfill the user's needs while respecting their physical and psychological limitations [8, 9]. Our primary contributions are summarized as follows:

(a) An on-site travel behaviour data collection method is designed to automatically sense on-site travel behaviour in indoor and outdoor tourism scenarios. It relies on tourists' smartphones and Bluetooth Low Energy (BLE) beacons.

(b) A Tourist-Behaviour PrefixSpan algorithm is proposed to generate frequent travel routes efficiently from recorded tourist-behaviour pattern sequences.

(c) A travel route ranking method is proposed to recommend specific travel routes according to the querying tourist's profile and constraints. It ensures the route value and feasibility of the final travel routes.

The centralization of the analysis has been complemented by the growing popularity of portability and worldwide innovation. A suitable travel guide should help sightseers plan their schedule. An approach has been proposed in this work to suggest a schedule plan for travelers. When sightseers travel to an unfamiliar city or country, they may be interested in specific spots they want to visit. Initially, the travel spots are clustered depending on their geographic area and the preferences of the travelers using the K-means clustering algorithm. The next stage is to find the ideal itinerary for each cluster. For this purpose, the greedy and 2-opt algorithms are used in this work. In this way, a suitable schedule plan is suggested for travelers.

PROPOSED WORK

Recommender systems map user needs and constraints through algorithms and convert them into product selections. The framework of the proposed technique has been presented in Fig. (1).

Recommendation Systems

In this section, some of the existing recommendation systems have been presented along with their capabilities:


Fig. (1). The framework of the Tourism Recommendation System.

Collaborative Recommender Systems

They aggregate recommendations of tourist places, recognize commonalities between the users based on their ratings and generate new recommendations based on inter-user comparisons. Collaborative filtering assumes that people who agreed in the past will agree in the future and that they will like spots similar to those they liked in the past.

A Content-based Recommender

It learns the profile of the new user's interests based on the features present in the tourist places the user has rated. In this recommender system, the algorithms recommend items similar to those the user has liked in the past or is examining currently.
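The inter-user comparison behind collaborative filtering can be sketched as follows. This is a minimal user-based example that assumes cosine similarity over co-rated places and a similarity-weighted average for prediction; the chapter does not specify a similarity measure, and the users, places, and ratings are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two users over the places both have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[p] * b[p] for p in common)
    norm_a = math.sqrt(sum(a[p] ** 2 for p in common))
    norm_b = math.sqrt(sum(b[p] ** 2 for p in common))
    return dot / (norm_a * norm_b)


def predict_rating(target, others, place):
    """Similarity-weighted average of other users' ratings for `place`."""
    numerator = denominator = 0.0
    for ratings in others:
        if place in ratings:
            sim = cosine_similarity(target, ratings)
            numerator += sim * ratings[place]
            denominator += abs(sim)
    return numerator / denominator if denominator else None


# Hypothetical ratings (1 to 5) of tourist places.
alice = {"museum": 5, "temple": 3}
bob = {"museum": 5, "temple": 3, "beach": 4}
carol = {"museum": 1, "temple": 5, "beach": 2}

print(cosine_similarity(alice, bob), cosine_similarity(alice, carol))
print(predict_rating(alice, [bob, carol], "beach"))
```

Because Alice's past ratings agree with Bob's more than with Carol's, the predicted rating for the beach is pulled towards Bob's rating of 4.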


Hybrid Recommendation System

It is the most sought-after type of recommender system among companies, as it combines the strengths of two or more recommender systems and mitigates the weaknesses that arise when only one recommender system is used.

MAPPING TECHNOLOGIES

Tipping

Tipping [10] provides recommendation services for tourism through mobile devices. These services implement various algorithms to calculate tourist preferences using the defined tourist profile and location data (location awareness).

Proximo

Proximo [11] is a location-aware mobile recommender. It guides the tourist through tours within buildings using Java and Bluetooth technologies. The mobile device tracks the tourist’s location and provides the system with this information. The application constantly monitors the user’s location and accordingly displays the active areas of the building where users are present.

Geo Notes

The Geo Notes [12] system strives to make digital space more social (collaborative filtering, social navigation, etc.) by allowing users to participate in the design of the information space. In the context of recommendation, Geo Notes is a location-based information system that gives users access to information related to their position in geographical space. With this mobile application, tourists can get updates regarding their current location. Tourists who visit a place can leave geo-referenced notes there, which other tourists visiting the same place can read. An example of Geo Notes is presented in Fig. (2).

Macau Map

Macau Map [13] is a tourism-oriented mobile GIS application for the city of Macau that displays the user’s current location and focuses especially on the bus network. Since it holds information about the public bus network, it can also calculate optimal bus routes. It provides information about museums, churches, temples, hotels, restaurants, and other places of interest, along with their location on the map.


Fig. (2). Geo Notes.

Microsoft Planner

It is a mobile planning assistant system [14] that supports personalized tourism. Using the Microsoft planner (et planner) on a mobile device, the customer’s stay is planned intelligently. In this way, the user can be assisted before, during, and after the journey. The system reacts in real time to special destination offers and to relevant events such as flight delays or weather issues.

Tourist Guide

This system [15] is a location-based tourist guide application for the outdoor environment. It was implemented for visitors to the Mawson Lakes campus (of the University of South Australia) and the North Terrace precinct in the Adelaide city center. The user interacts with the system using a Personal Digital Assistant (PDA). The PDA displays the user’s current position and detailed information about nearby Points of Interest (POI). It offers a self-guided tour of a specific area, covering buildings, attractions, and nearby utilities such as public telephones and toilets.


Cyber Guide

The Cyber Guide [16] system was developed at the Georgia Institute of Technology (GIT), Atlanta, United States of America (USA). This system is based on the ubiquitous computing concept and focuses on a mobile context-aware tour guide. The system was created to assist a tourist on a tour of the Georgia Institute of Technology (GIT) and helps the user obtain information about the demos on display. Knowledge of the user’s current location and a history of past locations is used to provide more of the kinds of services that can be expected from a real tour guide.

Context-Aware Tourist Information System

The Context-Aware Tourist Information System (CATIS) [17] is a context-aware tourist information system with a Web service-based architecture. The context elements considered in this work are location, time of day, speed, direction of travel, and the personal preferences of the tourist. Like the system mentioned above, it provides the user with relevant information according to the user’s location and the current time. For example, suppose the user is traveling in the morning or at noon. In that case, a simple integration of the time context, the location, and the respective user preferences, such as restaurants, results in a list of restaurants for breakfast or lunch, as the user wishes.

Deep Map

Deep Map [18] realizes the vision of a future tourist guidance system that works both as a mobile guide and as a web-based planning tool that plans and makes decisions intelligently. Deep Map is a mobile system that helps tourists navigate the city of Heidelberg by generating personal guided tours. Such a tour takes into account personal interests and needs, social and cultural background such as age, education, and gender, the type of transportation such as car, foot, bike, or wheelchair, and other circumstances ranging from season, weather, and traffic conditions to time and financial resources.

Tour Planning Research

A proposed tour plan is intended to help tourists find a personalized tour plan that allows them to use their time efficiently and promotes culture and national tourism [19]. Therefore, this research focuses on tour planning support. It defines and adapts a visit plan considering the most critical tourism characteristics, namely places of interest, attractions, restaurants, and accommodation, according to the tourist’s specific preferences and profile (which includes interests,


personal values, wishes, constraints, and disabilities). The availability of transportation modes between the selected POIs is also considered.

Alexa

Alexa [20] is a voice assistant capable of voice interaction, music playback, setting alarms, streaming podcasts, playing audiobooks, and providing weather, location updates, and other real-time information. It can also control several smart devices as a home automation system. It can make the world easier for blind people.

SOLUTION STRATEGY

Designing an efficient and accurate navigation system for visually impaired people is very important. Recently, different approaches and techniques have been proposed, such as Electronic Travel Aids (ETAs) [8], Electronic Orientation Aids (EOAs) [9], Position Locator Devices (PLDs) [22] and Microsoft Seeing AI [23]. Each technique has its own advantages and disadvantages. Table 1 compares the existing systems in terms of cost, voice support, ability to run on a mobile phone, hardware/sensors used, real-time capability, and technology used. It can be inferred that there is a gap between each of the existing systems and an outdoor navigation system suited to visually impaired people.

Table 1. Comparison Table.

Techniques | Cost | Voice Awareness | Implemented on Mobile | Hardware Dependency | Implemented Technology | Real-time Awareness
Electronic Travel Aids (ETAs) | Very high | Not fully | Yes | Very high | Multi sensors | Yes
Position Locator Devices (PLDs) | Very high | Yes | Yes | Not much | GPS | In some conditions
Deep Map Architecture | High | Yes | Yes | No | GPS | No
Microsoft Seeing AI | Cheap | Yes | Yes | No | CNN | No

Microsoft Seeing AI [23] is perhaps the best choice for converting content into voice; it can help blind people identify a product and its price. It also provides a helpful way to read a scanned page of a book or to learn about the objects and people nearby. Radio Frequency Identification (RFID) chips are another solution [24] used for navigation systems. Most of the existing systems depend on extensive peripheral devices, and using these devices is, for the most part, difficult.

CONCLUSION

In this chapter, we proposed the framework of a tourist recommender system with three search processes: location-based, item-based, and combined location- and item-based search. The motivation for developing this application is the sustainable growth and development of the tourism industry. The proposed framework analyzes user interests and recommends attractions using check-in information with little effort. This data is helpful for analyzing users’ attraction preferences and significantly benefits the tourism industry. The application targets one of the significant problems of blind people, namely finding tourist places. Until now, this problem has not been properly addressed, and visually disabled people still suffer from a lack of physical freedom. It is also possible to use more sensors to detect obstacles of different shapes precisely. This Tourist Recommendation System can work effectively in both indoor and outdoor environments. It can therefore serve as an assistive device for better understanding the environment and reducing the difficulties faced by blind people.

FUTURE WORK

The work can be extended by (1) using the framework in a real-time POI, (2) utilizing more types of mobile phone sensors to obtain more information about the locations visited and so learn the tourists’ preferences correctly, (3) incorporating real-time congestion data at each spot of a popular region to produce more sensible travel routes and further improve the tourists’ travel experience, and (4) utilizing the current location and recorded travel sequence of the querying tourist to produce a real-time route recommendation when the tourist requests suggestions at an arbitrary location inside a POI, improving the flexibility of the framework.
The framework can also be extended to incorporate a larger number of factors with a larger dataset for handling both outdoor and indoor scenarios. Moreover, blind users must be able to learn which objects are nearby and can be discovered. The errors in determining the distance between the POIs also need to be minimized.

ACKNOWLEDGEMENT

I would like to express my special thanks to Dr. SATHIYABHAMA and MRS. VIDHUSHAVARSHINI, who gave me the opportunity to carry out this project on the topic (Mobile Tourism Recommendation System for Visually Disabled) and also helped me greatly with the research. I learned many new things and am genuinely grateful to them.

REFERENCES

[1]

M.A. Awal, J. Rabbi, S.I. Hossain, and M.M.A. Hashem, "A hybrid approach to plan itinerary for tourists", 5th International Conference on Informatics, Electronics and Vision (ICIEV), pp. 219-223, May 2016. [http://dx.doi.org/10.1109/ICIEV.2016.7759999]

[2]

H. Hasija, and D. Chaurasia, "Recommender system with web usage mining based on fuzzy c means and neural networks", 1st International Conference on Next Generation Computing Technologies (NGCT), pp. 768-772, 2015. [http://dx.doi.org/10.1109/NGCT.2015.7375224]

[3]

W. Shafqat, and Y.C. Byun, "A recommendation mechanism for under-emphasized tourist spots using topic modeling and sentiment analysis", Sustainability (Basel), vol. 12, no. 1, p. 320, 2019. [http://dx.doi.org/10.3390/su12010320]

[4]

N. Harris, "The Design and Development of Assistive Technology", IEEE Potentials, vol. 36, no. 1, pp. 24-28, 2017. [http://dx.doi.org/10.1109/MPOT.2016.2615107]

[5]

A. Gregoriades, M. Pampaka, and M. Georgiades, "A holistic approach to requirements elicitation for mobile tourist recommendation systems", Future of Information and Communication Conference, pp. 857-873, February 2019.

[6]

C. Bin, T. Gu, Y. Sun, L. Chang, W. Sun, and L. Sun, "Personalized POIs travel route recommendation system based on tourism big data", Pacific Rim International Conference on Artificial Intelligence, pp. 290-299, 2018. [http://dx.doi.org/10.1007/978-3-319-97310-4_33]

[7]

K. Kesorn, W. Juraphanthong, and A. Salaiwarakul, "Personalized attraction recommendation system for tourists through check-in data", IEEE Access, vol. 5, pp. 26703-26721, 2017. [http://dx.doi.org/10.1109/ACCESS.2017.2778293]

[8]

Hamid, A.S. Albahri, J.K. Alwan, Z.T. Al-qaysi, O.S. Albahri, A.A. Zaidan, A. Alnoor, A.H. Alamoodi, and B.B. Zaidan, "How smart is e-tourism? A systematic review of smart tourism recommendation system applying data management", Comput. Sci. Rev., vol. 39, no. 100337, 2021.

[9]

F. Santos, A. Almeida, C. Martins, P. Oliveira, and R. Gonçalves, "Tourism Recommendation System based in user’s profile and functionality levels", In Ninth International C* Conference on Computer Science & Software Engineering, 2016, pp. 93-97. [http://dx.doi.org/10.1145/2948992.2948995]

[10]

P. Irvine, M. Lipson, and A. Puckett, "Tipping", Rev. Financ. Stud., vol. 20, no. 3, pp. 741-768, 2007. [http://dx.doi.org/10.1093/rfs/hhl027]

[11]

E. Parle, and A. Quigley, "Proximo, location-aware collaborative recommender", School of Computer Science and Informatics, University College Dublin, Ireland, 1251-1253, 2006.

[12]

F. Espinoza, P. Persson, A. Sandin, H. Nyström, E. Cacciatore, and M. Bylund, "Geonotes: Social and navigational aspects of location-based information systems", International Conference on Ubiquitous Computing, Springer: Berlin, Heidelberg, pp. 2-17, 2001. [http://dx.doi.org/10.1007/3-540-45427-6_2]

[13]

R.P. Biuk-Aghai, J.A.C. Hoi, and K.I.U. Pui, "MacauMap: Handheld Digital Map of Macau", Symposium on Technological Innovation in Macau, pp. 149-158, 2002.

[14]

https://tasks.office.com/ [Accessed on 20.04.2021]


[15]

T. Simcock, S.P. Hillenbrand, and B.H. Thomas, "Developing a location based tourist guide application", Conferences in Research and Practice in Information Technology, vol. 21, 2003.

[16]

D. Salber, A.K. Dey, and G.D. Abowd, "Ubiquitous computing: Defining an hci research agenda for an emerging interaction paradigm", GVU Technical Report, 1998.

[17]

A. Pashtan, R. Blattler, A.H. Andi, and P. Scheuermann, "CATIS: a context-aware tourist information system", 4th International Workshop of Mobile Computing, 2003.

[18]

R. Malaka, and A. Zipf, "Deep Map: Challenging IT research in the framework of a tourist information system", Information and communication technologies in tourism, Springer: Vienna, pp. 15-27, 2000.

[19]

P. Phillips, and L. Moutinho, "Critical review of strategic planning research in hospitality and tourism", 2014. [http://dx.doi.org/10.1016/j.annals.2014.05.013]

[20]

K. Micak, "The Alexa Experiment", Master’s Thesis, OCAD University, 2018.

[22]

D-Y. Yeh, and C-H. Cheng, "Recommendation system for popular tourist attractions in Taiwan using Delphi panel and repertory grid techniques", Tour. Manage., vol. 46, pp. 164-176, 2015. [http://dx.doi.org/10.1016/j.tourman.2014.07.002]

[23]

V. Parikh, M. Keskar, D. Dharia, and P. Gotmare, "A Tourist Place Recommendation and Recognition System", Second International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 218-222, 2018. [http://dx.doi.org/10.1109/ICICCT.2018.8473077]

[24]

K. Hajar, A. Abatal, and M. Bahaj, "Ontology-based context awareness for smart tourism recommendation system", International Conference on Learning and Optimization Algorithms: Theory and Applications, pp. 1-5, 2018.

216

Artificial Intelligence and Data Science, 2023, 216-238

CHAPTER 11

Point of Interest Recommendation via Tensor Factorization

Shreya Roy1,*, Abhishek Majumder1 and Joy Lal Sarkar1

1 Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India

Abstract: In recent years, recommendation systems have left their mark on the travel industry and changed the way it works. A recommendation system processes massive amounts of data to identify users’ interests, making location search easier. Many methods have been used so far to make predictions that better match users’ interests by collecting information from a large set of other users. The main objective of this chapter is to present the various methods and techniques used for generating recommendations. These recommendation processes are classified into different forms, such as traditional methods and tensor-based methods. A brief review of these methods is given, together with some of the challenges faced by recommendation systems. In addition, their advantages and disadvantages are discussed, along with highlights of future directions.

Keywords: Point of Interest, Tensor factorization, Recommendation, Collaborative Filtering, Check-in, Tucker decomposition, Preference.

* Corresponding author Shreya Roy: Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India; E-mail: [email protected]

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

INTRODUCTION

Locations are critical in tour and travel planning, which involves selecting suitable locations and scheduling them accordingly [1 - 4]. Finding the best location and recommending it to the user is known as location recommendation. Recommendation systems for countries are built on users’ differing choices and interests across cities. A Point Of Interest (POI) recommendation system offers personalized recommendations to the user based on user-defined data or open data. It is well known that human behavior is consistent [5, 6], which makes learning and predicting patterns of human behavior much easier [7 - 19]. With the rapid spread of intelligent mobile devices and web connections, Location-Based Social Networks (LBSNs) help users share their locations [20 - 23]. Several typical LBSNs are available. Among them, Foursquare [24], Yelp [25], Facebook, Geolife, Gowalla [25], etc., help users build connections, upload photos, and share locations via check-in data [23]. LBSNs need rich and up-to-date information about user preferences to recommend new places that users may be interested in visiting [26 - 34].

Recommendation systems have been widely adopted by many online services, such as Amazon, Netflix, Facebook, etc. Nowadays, POI recommendation is one such trend. POI recommendation needs accurate and prompt results. It differs from traditional recommendation in several respects because it has some unique characteristics. According to Tobler, the First Law of Geography states, “Everything is related to everything else, but near things are more related than distant things” [19]. It implies that users prefer nearby places to distant ones [35 - 44]. Hence, POI recommendation can be much more effective in predicting user preferences than traditional recommendation techniques, as it uses geographical influence to recommend new places to the user.

Traditional recommendation methods take the user ratings of an item or place, which are then converted into a user-item rating matrix [29]. The ratings are mainly based on a numerical range, e.g., 1 to 5. A score of 5 is the highest rating and indicates the greatest satisfaction. In POI recommendation, by contrast, user preferences are inferred from check-ins, which form the user-location check-in frequency matrix [29]. The range of frequencies is larger than that of ratings, but the sparsity [29] of the matrix is very high (e.g., 90% sparse means that 90% of its cells are either not filled with data or are zeros). This makes POI recommendation very challenging.
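The check-in frequency matrix and its sparsity can be illustrated with a minimal sketch. The check-in log, the number of users, and the number of locations below are hypothetical values chosen only for illustration; they do not come from any LBSN dataset discussed in this chapter.

```python
import numpy as np

# Hypothetical check-in log: one (user index, location index) pair per visit.
checkins = [(0, 1), (0, 1), (0, 3), (1, 0), (2, 3), (2, 3), (2, 3), (3, 2)]
n_users, n_locations = 4, 5  # assumed sizes for this toy example

# Build the user-location check-in frequency matrix.
freq = np.zeros((n_users, n_locations), dtype=int)
for user, loc in checkins:
    freq[user, loc] += 1

# Sparsity is the share of cells that hold no check-in at all.
sparsity = 1.0 - np.count_nonzero(freq) / freq.size
print(freq)
print(f"sparsity = {sparsity:.0%}")
```

Unlike a 1 to 5 rating, an entry here is an unbounded count, and most cells stay at zero even in this small example.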
Apart from that, social influence is another way to learn user preferences. Most users tend to follow the choices of their family and peers. Hence, traditional recommendation systems combine user preferences and ratings with social influence to improve recommendations, as found in previous studies [45, 8]. In the case of POI recommendation, early research studies indicate that users share significantly fewer common interests, which means social influence is less effective for check-in behavior.

Influential Factors of POI Recommendation

Different user modeling approaches or algorithms are available for POI recommendation according to different types of LBSN data.

218 Artificial Intelligence and Data Science

Roy et al.

Pure Check-in Based POI Recommendations

Traditional recommendation systems rely on ratings given by users to different items or places, but such ratings are not available in LBSNs [5]. In the case of POI recommendation, the frequency of check-ins at different locations is taken as the user preference. A user visits only a few locations in an LBSN, so the check-in data is encoded as a sparse matrix [14]. With this information, recommendation approaches are employed for POI recommendation. The two most popular POI recommendation processes are user-based collaborative filtering [28] and item-based collaborative filtering [28]. Collaborative Filtering (CF) [1, 29] based recommendations can be achieved by Matrix Factorization (MF) [21]. In user-based POI recommendation, the shared tastes of similar users are considered, which makes POI recommendation effective. In item-based POI recommendation, similar kinds of POIs are considered.

The user’s check-in history can also be represented with binary values, ‘1’ or ‘0’. If the user has visited a particular location in some category (e.g., restaurants, cinema halls, residences, shops), the value of the matrix entry is ‘1’; otherwise, it is ‘0’. This representation ignores the frequency of check-ins. Each user visits only a few preferred locations, so many entries in the matrix are left as ‘0’. This creates data sparsity, and to overcome this problem, both the user-based and item-based POI approaches are applied. Various studies [29] found that user-based POI recommendation performs better than the item-based one, since item similarity tends to be less accurate than user similarity. Apart from memory-based collaborative filtering [28], model-based collaborative filtering [29] can also be adopted for POI recommendation.
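As a minimal sketch of user-based collaborative filtering on a binary check-in matrix, the example below scores unvisited POIs for a target user from the check-ins of the most similar users. The matrix, the choice of cosine similarity, and the function names are assumptions made for illustration; the cited studies may use different similarity measures and weighting schemes.

```python
import numpy as np

# Hypothetical binary check-in matrix: rows are users, columns are POIs.
R = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
], dtype=float)

def cosine_similarity_matrix(M):
    """Pairwise cosine similarity between the rows of M."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty profiles
    unit = M / norms
    return unit @ unit.T

def recommend(R, user, k=2, top_n=2):
    """Rank unvisited POIs for `user` using the k most similar users."""
    sim = cosine_similarity_matrix(R)
    sim[user, user] = 0.0                       # ignore self-similarity
    neighbours = np.argsort(sim[user])[::-1][:k]
    weights = sim[user, neighbours]
    scores = weights @ R[neighbours]            # similarity-weighted votes
    if weights.sum() > 0:
        scores = scores / weights.sum()
    scores[R[user] > 0] = -np.inf               # drop already visited POIs
    ranked = np.argsort(scores)[::-1]
    return [int(p) for p in ranked if scores[p] > 0][:top_n]

print(recommend(R, user=0))
```

An item-based variant would compute the same similarity between the columns of `R` instead of its rows.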
After being used by Google and Netflix, MF-based CF modeling has gained popularity for POI recommendation, as it is effective in dealing with large user-item rating matrices [14]. Moreover, the regularized MF-based POI recommendation proposed by B. Berjani and T. Strufe [2] dealt with the lack of explicit ratings. Another POI recommendation approach, presented by Wang et al. [22], uses a decomposition into two low-rank latent feature matrices to model the importance of ‘venue semantics’ in the check-in behavior of the user. Knowledge about content and context (e.g., POI category, user context, sentiment indication, and time stamp) in LBSNs also helps to reveal other characteristics of the check-in behavior of users. The advantages of such approaches are that they reduce dimensionality and lessen data sparsity. The drawback of this approach is that it does not consider any geographical, temporal, or social influence on check-in data.

Geographical Influence Enhanced POI Recommendation

Earlier traditional recommendation approaches did not consider geographical influence, which is a unique characteristic of POI recommendation. With the help of geographical check-in data, the spatial distribution of people’s daily movements [14] can be captured, enhancing the performance of the recommendation system. Moreover, the spatial clustering phenomenon [14] of users’ check-ins reflects each user’s tendency to visit nearby rather than distant places on a regular basis. Bayesian and probabilistic methods [7, 13] have been introduced to deal with the geographical influence in POI recommendation. Furthermore, power-law distribution models [29] and collaborative POI recommendation [29, 14] have been established, which are also based on the Bayesian rule. A new factor, the check-in probability of a new location, has been incorporated for recommending POIs using a multi-center Gaussian model [3] that combines user preferences, geographical influence, and personalized ranking. Another recommendation algorithm, aimed at out-of-town regions, considers the spatial influence between POIs to enhance recommendation performance. The primary disadvantage of these approaches is their inability to deal with the user cold-start problem. Apart from the matrix factorization technique, another method for geographical information is the latent factor model [29]. Here, the core geographical influence is defined by the inherent spatial features of POIs and the spatial clustering phenomenon.

Social Influence Enhanced POI Recommendation

In social influence-enhanced POI recommendation, users are assumed to be inspired by their peers and family and usually share common interests with them. The POI recommendation system tries to improve its recommendations by using social influence [12].
Another approach is friend-based collaborative filtering (FCF). FCF considers the preferences of friends and family. FCF mainly focuses on the efficiency rather than the effectiveness [28] of the POI recommendation system. Thus, it provides an improvement over user-based POI recommendation.
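The idea behind FCF can be sketched as follows: instead of comparing the target user with every other user, only the user’s friends are consulted, which keeps the number of similarity computations small. The check-in matrix, the friendship lists, and the use of Jaccard overlap as the friend weight are all assumptions made for this illustration; the exact weighting in the cited work may differ.

```python
import numpy as np

# Hypothetical binary check-in matrix (users x POIs) and friendship lists.
R = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
])
friends = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}

def jaccard(a, b):
    """Overlap of two binary check-in vectors (assumed similarity measure)."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def fcf_scores(R, friends, user):
    """Score POIs for `user` using only the check-ins of the user's friends."""
    scores = np.zeros(R.shape[1])
    total = 0.0
    for f in friends.get(user, []):
        w = jaccard(R[user], R[f])
        scores += w * R[f]       # each friend votes with weight w
        total += w
    if total > 0:
        scores /= total
    scores[R[user] > 0] = 0.0    # do not re-recommend visited POIs
    return scores

scores = fcf_scores(R, friends, user=0)
print(scores)
```

Because only the entries of `friends[user]` are visited, the cost per user grows with the number of friends rather than with the total number of users.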


Apart from that, one more approach, named Probabilistic MF (PMF) with social regularization (PMFSR), was developed; it learns the latent preferences of users for POIs [4] and integrates social influence [4] into PMF. Social influence has much more impact on traditional recommendation than on POI recommendation.

Temporal Influence Enhanced POI Recommendation

In traditional recommendation techniques, temporal influence acts as a factor that decays the weight of ratings over time. In the case of POI recommendation, the temporal influence is used for a specific temporal state [29]. Yuan et al. assumed that users tend to visit different, unrelated locations in different time slots. Therefore, they proposed a time-aware POI recommendation algorithm in which user-based POI recommendation is extended with a time factor by considering user similarity together with check-in times. The time factor is treated as a contextual factor [14]. The Markov Chain model [14] is often used to model sequential patterns for POI recommendation in LBSNs. A third-order tensor was further proposed for modeling successive check-in behavior; it fuses [14] a Personalized Markov Chain with latent patterns. Apart from that, other models are available in which check-ins are modeled as sequential patterns. Some such models are matrix factorization [14], tensor factorization [14], the pairwise ranking model [14], and recurrent neural networks [34]. Some content-based POI recommendation systems use the sequential influence of check-in data to model spatio-temporal [14] preferences for POIs. But these systems do not work well if users do not check in often or are new users [14].

A Brief Introduction to Tensors

A tensor can be regarded as a multi-way collection of numbers [10]. It can have multiple dimensions [10], the number of which is known as its order. Scalars are interpreted as zero-order tensors, vectors as first-order tensors, and matrices as second-order tensors.
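The notion of tensor order can be made concrete with NumPy arrays, whose `ndim` attribute gives the number of dimensions. The user × POI × time-slot check-in tensor below is a hypothetical example with made-up sizes and values, used only to show a third-order tensor in a POI setting.

```python
import numpy as np

scalar = np.array(5.0)                # order 0
vector = np.array([1.0, 2.0, 3.0])    # order 1
matrix = np.zeros((3, 4))             # order 2

# Hypothetical third-order tensor: 3 users x 4 POIs x 2 time slots.
checkin_tensor = np.zeros((3, 4, 2))
checkin_tensor[0, 1, 0] = 2           # user 0 visited POI 1 twice in time slot 0

orders = [a.ndim for a in (scalar, vector, matrix, checkin_tensor)]
print(orders)
```

Adding a further context, such as weather, would simply add one more axis and raise the order to four.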
POI recommendation now makes use of third-order and higher-order tensors. In a third-order tensor, the fibers [18] come in three kinds, namely row, column, and tube fibers, and the tensor can be sliced in frontal, lateral, and horizontal patterns. The elements of a tensor are usually of the same type. The main benefit of using tensors is their ability to represent user-item interactions of various natures. A tensor fiber is the higher-order analogue of a matrix row or column: it is the sequence of elements obtained by fixing all indices except one [42]. Tensor slices are obtained by fixing all indices except two, which yields a two-dimensional matrix. For a third-order tensor, there are three kinds of slices, known as horizontal, lateral, and frontal. In tensor factorization, matricization is the key operation used to reshape a tensor of any order into a matrix [15, 16]; it is also called unfolding the tensor. Tensors have an impact on dimensionality reduction [42]. Since a tensor generalizes a vector to several dimensions, it can be built from outer products of vectors. A tensor that can be written exactly as the outer product of vectors is a rank-one tensor. The minimum number of rank-one tensors whose sum forms the entire tensor is called the rank of the tensor. In the field of recommendation systems, flexibility is most important, and it is provided by tensor-based data. In personalized search, relevant search results are generated according to each user’s choices, taking such insight information into account.

Different types of tensor decomposition are available, mainly based on the two most important decomposition techniques: the canonical polyadic decomposition (CPD) and the Tucker decomposition (TD). Both are outer product decompositions [18] but have different structural properties [20]. The CPD is a rank decomposition [17, 18] that expresses a tensor as the sum of a finite number of rank-one tensors [18]. Canonical decomposition (CANDECOMP) and parallel factors (PARAFAC) decomposition are the most widely used names for this rank decomposition. Algorithms for computing the CPD include Jennrich’s algorithm, the Alternating Least Squares (ALS) algorithm, and the tensor power method; uniqueness and the peculiarities of tensor rank are related considerations. Like the CPD, the Tucker decomposition [32] can be computed with a few algorithms, namely ‘Higher-Order Singular Value Decomposition’ (HOSVD) [18] and ‘Higher Order Orthogonal Iteration’ (HOOI) [18]. All these decomposition techniques are applied in several tensor-based models. The earliest work exploring such insight data is CubeSVD [42], which uses a third-order tensor to represent the level of association between the user and the webpage [42].
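Fibers, slices, unfolding, and a Tucker decomposition computed by HOSVD can be sketched in a few lines of NumPy. This is a minimal illustration on a random tensor under assumed sizes, not the implementation used by CubeSVD or any other cited model; the helper names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are introduced here only for the example, and the column ordering of the unfolding follows NumPy’s default reshape rather than any particular textbook convention.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along axis `mode`."""
    moved = np.moveaxis(T, mode, 0)
    out = np.tensordot(M, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one SVD per unfolding, then project onto the factors."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])          # leading r left singular vectors
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core tensor by every factor matrix."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T

rng = np.random.default_rng(0)
T = rng.random((4, 5, 3))                 # hypothetical user x POI x time tensor

column_fiber = T[:, 1, 2]                 # all indices fixed except the first
frontal_slice = T[:, :, 0]                # all indices fixed except the first two

core, factors = hosvd(T, ranks=T.shape)   # full ranks: exact reconstruction
approx = reconstruct(core, factors)
small_core, _ = hosvd(T, ranks=(2, 2, 2)) # truncated ranks: compressed core
print(np.allclose(approx, T), small_core.shape)
```

With full ranks, the factor matrices are orthogonal and the tensor is recovered exactly; choosing smaller ranks gives a compressed core tensor, which is the source of the dimensionality reduction mentioned above. HOOI would refine these factors iteratively, starting from the HOSVD result.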
In TOPHITS [42], users are not stored explicitly; instead, their behavior is recorded as a co-occurrence of resources. It also extends the adjacency matrix of interlinked web pages by combining it with the collected keyword information to build an adjacency tensor. The idea behind this decomposition is different from that of CubeSVD [42]. Another notable technique is social tagging systems (STS) [42], where recommendations are based on social behavior, including common choices, which helps to categorize and organize items according to user preference. The tensor values are binary, as a tag is applied to an item only once, without repetition. Ranking Tensor Factorization and Pairwise Interaction Tensor Factorization [42] deal with positive feedback. They treat the remaining entries as negative feedback and unknown values. Both of them work with pairwise values. Some of the algorithms are discussed in the next section.

LITERATURE SURVEY ON RECOMMENDATION SYSTEM VIA TENSOR FACTORIZATION

Hotel Recommendation

Personalized POI recommendation has drawn attention in every field. Among its applications, hotel recommendation is very popular nowadays. Liu et al. used Time-semantic-aware Poisson tensor factorization (TS-PTF) [34] to learn from temporal dynamics, multi-aspect ratings, and review texts for hotel recommendation. This work captures the periodic effect of user movement and uses the model to provide scalable hotel recommendations. The framework of this model consists of a set of users’ reviews of hotels, a set of features extracted from the multi-aspect ratings [34] and review texts, a set of cities to which the hotels belong, and finally, a set of periods. The model uses a tensor over users, hotels, time, and features, whose entries are the feature values that were found. The data gathered from multiple sources for hotel recommendation are the following [34]:

Temporal behavior of users: The check-in time information is used to improve accuracy and also captures periodic effects. The temporal context [34] directly extends two-dimensional models to a three-dimensional representation corresponding to user-hotel-time [34]. The timestamp is used in TS-PTF to denote the pattern of hotel bookings.

Multi-aspect preferences: Here, along with the five-level rating information, some other aspects are considered, such as the cleanliness of the hotel, room service, location of the hotel, and value, to capture preferences better.

Semantic topic model through LDA (latent Dirichlet allocation): LDA is used for semantic topic modeling of the text reviews. It processes a large collection of documents to find the latent (hidden) topic information.
User reviews modeled as a tensor: The data extracted from the reviews are represented by a four-dimensional user-hotel-time-feature tensor that combines the aspect ratings, overall preferences, and semantic topics. Tensor factorization exposes the hidden structure of these data sources through latent factors. Once the four-dimensional tensor is built, a Poisson distribution [34] was

Point of Interest

applied to model the interaction between user and hotel, and Poisson Tensor Factorization [34] was then used to characterize the feature tensor. TS-PTF decomposes the four-dimensional tensor into four latent factor matrices to find the latent features [46]. A posterior distribution is used to infer the latent factors. TS-PTF recommends to users hotels they have not yet booked that are located in the selected city and available in the requested period. This is done once the posterior expectation of the Poisson parameters is fixed, which yields a preference ranking of hotels. Finally, the model is generalized to cold-start users by predicting their aspect preferences for hotels from review texts. As soon as the latent factors are inferred, they are divided into two groups: aspect features and topic distribution features. The main focus was to achieve scalability. The model recommends a fresh variety of hotels to ordinary users and predicts the aspects preferred by cold-start users even when their review texts are short.

Advantages
1. Temporal dynamics and semantic information are used to improve hotel recommendation performance.
2. The data sparsity problem is alleviated.

Disadvantages
1. The model has a gap regarding random real-time feedback.

Recommendation in the Travel Decision-making Process

Point-of-interest recommendation is an active topic, and advanced methodologies have been developed for it. Here, a POI recommendation system is presented in which user and POI factors are incorporated to provide a unified recommendation. The framework is shown in Fig. (1). Two kinds of factors simulate the user's travel decision-making process: preference factors and geographical factors. The following steps are used to model the recommendation system:
1. Tensor factorization is introduced to model user check-ins and to predict preferences dynamically.
2.
To characterize the geographical factors of each user, a personalized similarity among users is calculated, taking into account the fitted curves of the target user, so that the relation between distance and travel probability can be established.

3. Combining the above factors, the final list of POIs to recommend is generated.

Fig. (1). Framework Architecture [27].
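Step 2 relates travel distance to travel probability through a fitted curve. The sketch below illustrates one way this could be done. It is not the authors' implementation: the power-law form p(d) = a * d^b is an assumption made here for illustration (the source states only that least squares is used), counting check-ins per ring rather than per full circle is also an assumption, and the names `ring_probabilities` and `fit_power_law` are hypothetical.

```python
import math

def ring_probabilities(distances, checkins, radii):
    """Share of all check-ins in each concentric ring around a virtual
    starting point. distances[i] is the distance of POI i from that point,
    checkins[i] its check-in count, radii the increasing outer radii."""
    total = sum(checkins)
    probs = []
    inner = float("-inf")  # the first ring starts at the center itself
    for outer in radii:
        in_ring = sum(c for d, c in zip(distances, checkins) if inner < d <= outer)
        probs.append(in_ring / total)
        inner = outer
    return probs

def fit_power_law(radii, probs):
    """Least-squares fit of log p = log a + b * log d (assumed power law).
    Rings with zero probability are skipped because log(0) is undefined."""
    pts = [(math.log(r), math.log(p)) for r, p in zip(radii, probs) if p > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    b = sxy / sxx                      # slope of the regression line
    a = math.exp(mean_y - b * mean_x)  # intercept mapped back from log space
    return a, b
```

With the fitted pair (a, b), the travel probability for a candidate POI at distance d from the virtual starting point would be estimated as a * d ** b under this assumed form.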

Preference dynamic prediction [27] predicts user preferences at different time slots to improve accuracy. User temporal preferences are modeled by tensor factorization, which recovers missing data through tensor decomposition and also isolates and analyzes hidden patterns. A third-order tensor is used to model the user preferences, and each entry is the sum of the user's check-in frequencies. The tensor is decomposed by Tucker decomposition; since all entries are non-negative, non-negative tensor decomposition is applied. After decomposition, the geographical influence curve fitting [27] phase follows. The curve for each user measures the probability of a POI visit within the travel time concerned. The authors state that whenever the check-in data of a particular user are too sparse, check-in data from similar users are added, and the curve is built from the combined data. Finding similar users requires a similarity measure. According to the authors, their similarity calculation differs from the traditional method because it does not ignore auxiliary information needed for the calculation and it resolves the lack of directionality. In this method, the virtual common access sequence [27] between two users is created with the following steps: A sequence of check-ins is taken from the check-in history. Among these check-ins, the longest subsequence of continual check-ins is selected and

removed from the historical data, and this is repeated until the sub-sequence length is less than 2. Finally, a set is created that consists of the lengths of the respective continual check-in sequences. After the set of continual sequences is obtained, the time interval is calculated by artificially creating conjoint time intervals that correspond to every check-in time of each continual sequence. For each check-in in a particular continual check-in sequence, the time interval is calculated in one of three cases:
1. For the first check-in of the set [27], the first time interval is calculated.
2. For the last check-in of the set [27], the last time interval is determined.
3. Otherwise, the time interval is calculated from the check-in time of each element of the sequence set.
The POIs with the same preferences are then clustered by latitude and longitude. Finally, a common virtual sequence is constructed for two users: a checked-in POI location cluster of one user is chosen according to the POI clustering result, and the corresponding time interval is calculated from the check-in time at that POI location. Then other POIs that the other user visits in the same time interval are chosen from the cluster. With this information, the similarity between two users is calculated using the following three factors:
Sequential property: The length of the virtual common access sequence is considered. The longer the sequence, the more closely the two users are related, which reflects continuity.
Common access property: POIs that both users have accessed are considered; more common POIs indicate more similar users.
Virtual common access collection: The visit popularity of the virtual common access collection captures the case in which very few people visit the places that are common to, or popular with, the two users.
This method, proposed by the authors [27], calculates similarity as the sum of the virtual common access sequence scores, weighted by length. The final scores of the virtual common access sequence are then calculated. With the similarity calculation in place, the authors move on to the geographical curve fitting technique, which includes the following steps:
1. Two users are treated as similar when their similarity, based on the weighted sub-sequence length [27], is greater than a predefined threshold value.

2. Curve fitting gives the probability of a user visit as distance increases during different time slots. A curve represents the relationship between the distance traveled by the user and the travel probability. To obtain it, discrepancies among users are first avoided by normalizing the check-in times and the check-in frequencies of the user and of similar users. The resulting check-in frequencies of a specific POI are then gathered and summed, and the same is done for the other POIs. Along with the check-in frequency of all POIs, other factors such as the heat map, the center of the hottest zone, and the virtual starting point are considered. The Euclidean distance [27] is calculated to find the maximum distance from each POI to the above-mentioned point. Concentric circles [27] are drawn with the virtual starting point as their center. The check-in frequency of all POIs that fall within each circle is computed, and the travel probability is calculated. The curve relating the circle radius to the travel probability is fitted by the least squares method.

Finally, combining the preference and geographical factors generates a list for the next POI recommendation given the current time and location. Dynamic preference predictions give the users' preferences at the current time. The available POIs that a user has not visited before the current time form the recommendation candidate set. The rank score of each POI in this set is calculated, and the POIs are sorted in decreasing order of score, which provides the n-POI list to users.

Advantages
1. Sparseness is reduced.
2. Dynamic prediction is achieved.

Disadvantages
1. User similarity calculation could be made more accurate by using heterogeneous context.
2. Prediction of user preference on different time scales needs more attention.
3.
Estimating the implicit transition probability of new POIs from sparse data is quite challenging.
4. Precision is also low compared with general POI recommendation.

Location-Based Social Networks for POI Recommendation

A Location-Based Social Network is an essential component of Point of Interest (POI) recommendation because it contains many check-ins and comments, which help in obtaining more accurate preference results. Guoqiong et al. [35] proposed a strategy based on tensor factorization to achieve accurate recommendations. First, Latent Dirichlet Allocation (LDA) is applied to all user comments to extract information about POIs. All check-in data are then split into slices corresponding to each hour of the day. A user-topic-time tensor is constructed by combining the topic distributions of the visited POIs. Finally, a higher-order singular value decomposition [35] algorithm decomposes the third-order tensor to obtain denser preferences. Fig. (2) describes the entire framework with a user set and a POI set. Given the POIs a user visits at a time point, a specific list of POIs can be selected.

Fig. (2). The overall POI framework [35].
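The decomposition step can be illustrated with a small sketch. It uses the classical SVD-based truncated HOSVD (one SVD per mode unfolding), which is a stand-in chosen here for illustration and not necessarily the training procedure of [35]. The function names `unfold`, `truncated_hosvd`, and `reconstruct` are hypothetical. Reconstructing from truncated factors typically replaces the zeros of a sparse user-topic-time tensor with low-rank estimates, which is the sense in which the result is denser.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along `mode` by `matrix` (shape: new_dim x old_dim)."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def truncated_hosvd(tensor, ranks):
    """Return (core, factors) for the requested multi-linear ranks."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])      # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each subspace
    return core, factors

def reconstruct(core, factors):
    """Product of the core tensor with the factor matrices along every mode."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

For a user-topic-time tensor X of shape (N, K, O), calling `reconstruct(*truncated_hosvd(X, (r1, r2, r3)))` with ranks smaller than the tensor dimensions gives a low-rank approximation of X; the ranks are tuning parameters that the source does not specify.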

Guoqiong et al. [35] constructed the User-Topic-Time tensor, also known as UZT, and decomposed it to obtain a denser User-Topic-Time tensor. The user-topic distributions of each time slice are then converted into user-POI distributions for POI recommendation.

Time-Aware Preference Mining

Here, LDA is a language model for natural-language text that extracts topics and the distribution of those topics for every POI. The model contains hidden variables known as the topic-word distribution and the document-topic distribution. As Fig. (2) shows, the time-aware user-topic distribution is obtained by combining the POI-topic distribution with the check-in data.

Tensor Factorization

The time-aware user-topic distribution [35] is sparse, just like the user check-ins, and is therefore decomposed using higher-order singular value decomposition (HOSVD). The third-order user-topic-time tensor is represented as

$$UZT: X \in M^{N \times K \times O}$$

where N denotes the number of users, K the number of topics, and O the number of time slices. Each element of X is the standard topic distribution value of a user for the kth topic at the mth time slice; X is a sparse tensor that measures user preferences over the topics. HOSVD decomposes this tensor with the following steps:
1. The UZT tensor is decomposed into three factor matrices.
2. A core tensor is generated to let those matrices interact with each other.
3. A new third-order tensor is formed by the product of the core tensor and the factor matrices.
The model is trained by minimizing an objective function that consists of a loss function and an associated regularization term to avoid over-fitting. The algorithm updates the factor matrices and the core tensor together, and the objective function is minimized using Stochastic Gradient Descent. After minimization, the final phase of POI recommendation starts, in which the users' topic distributions are converted into users' POI preferences. The user-POI distribution is obtained as follows:

$$G = P_k(u_i) \times Q^{T}$$

where P is the K-dimensional vector representing the user's preferences over all topics at a specific time, and Q is the POI-topic distribution matrix of dimension $M \times K$. G, the M-dimensional vector of the user's preferences over all POIs at the mth slice, is the product of P and the transpose of Q. Thus, N POIs are recommended to users.

Advantages
1. Being a fully probabilistic generative model, LDA has a clear internal structure, and its parameters can be estimated with efficient probabilistic inference algorithms [33].
2. The size of the LDA parameter space does not depend on the number of training documents, which makes it suitable for large-scale data [33].
3. Using the SVD factors for computations rather than the original matrix is more robust to numerical error [37].
4. The HOSVD, which is directly generalized from the classical SVD of matrices,

defines the multi-linear rank, or Tucker rank, instead of the matrix rank. It consists of the ranks of the unfolding matrices along the different modes, which can reduce tensor problems from multi-linear space to linear space [36].

Disadvantages
1. The topics of LDA cannot be influenced. When some words in the documents are already known to be relevant, one expects them to appear in the topics after LDA is applied, but LDA may not surface them, and there is no way to tell the model which words should be grouped together.
2. Topic models can easily be misused if they are misunderstood (e.g., treated as an objective representation of the meaning of a text).
3. The topics generated by the general LDA model are synthetic, i.e., they do not necessarily conform to the topics that humans identify for the same corpus [38].

POI Recommendation Based on Weather Context

According to Lu et al. [39], Multi-Context-Aware Location Recommendation models users' check-in history together with multiple contexts, i.e., time and weather, at different granularities. The temporal context is modeled as the hour slot of the day, the day of the week, and the season of the year. A four-mode tensor captures the relationship among user, location, time, and weather. Four feature matrices are produced and collaboratively decomposed with the tensor to reduce data sparsity. Recommending the top-K POI list to users consists of the following parts:

Context Inference and Modeling

This stage deals with context. Time and weather are selected as contexts because the user check-in distribution shows high variance across them, and the time context is split into season, day of the week, and hour of the day. The unique timestamp of each user visit makes the data set huge.
Hence, the check-in timestamp is encoded into time slots representing a season, a day of the week, and an hour slot, which reduces dimensionality. Different users tend to visit different locations in different seasons of the year, so locations also show varying check-in distributions. Users also prefer visiting places at different hours of the day; hence, the day is divided into five slots. The longer the period covered by the visits, the more check-in information is available to form the tensor, which improves performance.
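The timestamp encoding can be sketched as follows. Several details are assumptions made only for illustration: the hour-slot boundaries in `HOUR_SLOT_STARTS` are hypothetical (the source states only that there are five slots), seasons follow a northern-hemisphere month grouping, and the day of the week is reduced to a weekday/weekend flag so that 4 x 2 x 5 = 40 slots result, which matches the 40 time slots reported for the time-category matrix. The function names are hypothetical as well.

```python
from datetime import datetime

# Hypothetical hour-slot boundaries (start hour, inclusive). The source says
# only that the day is divided into five slots.
HOUR_SLOT_STARTS = (0, 6, 11, 14, 19)

def season_index(month):
    """0 = winter (Dec-Feb), 1 = spring, 2 = summer, 3 = autumn (assumed)."""
    return (month % 12) // 3

def hour_slot_index(hour):
    """Index of the last slot whose start hour is not after `hour`."""
    return max(i for i, start in enumerate(HOUR_SLOT_STARTS) if start <= hour)

def encode_time_slot(ts):
    """Map a check-in timestamp to one slot id in range(40)."""
    season = season_index(ts.month)
    day_type = 1 if ts.weekday() >= 5 else 0  # assumed weekday/weekend split
    hour_slot = hour_slot_index(ts.hour)
    return (season * 2 + day_type) * len(HOUR_SLOT_STARTS) + hour_slot
```

Replacing raw timestamps with such slot ids keeps the time mode of the tensor at a fixed, small size regardless of how many distinct check-in times occur in the data.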

Weather context: Lu et al. [39] also state that weather conditions influence users' check-in behavior. In most scenarios, users check in at different location categories under different weather conditions. The temperature values are divided into four intervals and combined with seven daily weather summary tags to reduce data sparsity. The objective is to obtain the weather context dimension for tensor construction.

Construction of Tensor and Feature Matrix

The four-mode tensor factorization considers the user, the POI, the time slot of the check-in, and the weather conditions during the check-in. Alongside the sparse tensor, the following feature matrices are factorized collaboratively:

Time-category matrix: All check-ins are grouped into 144 categories, and the check-ins of every category are further grouped into 40 unique time slots.

Location similarity matrix: The similarity between two locations is based on user check-in behavior in different time slots and is measured by cosine similarity. This information is used to construct the location-location matrix.

Location-weather matrix: Check-ins are grouped into 28 weather contexts, and each element of the matrix stores the total number of check-ins that happened in a specific weather context. The goal is to find the popularity of a location in particular weather.

User-category matrix: For each user and category, the check-in frequency is counted and stored in the matrix.

Collaborative Tensor Decomposition

In this work [39], Canonical Polyadic (CP) decomposition is applied to factorize the tensor into a sum of rank-one tensors [39] over four modes: users, locations, time slots, and weather. Four feature matrices, whose rows represent the dimension and whose columns represent the rank, are factorized using matrix factorization.
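The CP model for a four-mode tensor can be written down in a few lines. The sketch below only shows how a tensor is expressed as a sum of rank-one tensors from four factor matrices; it does not include the collaborative factorization with the feature matrices. The names `U`, `L`, `T`, `W` (user, location, time-slot, and weather factors, each with R columns) and the two functions are hypothetical.

```python
import numpy as np

def cp_reconstruct(U, L, T, W):
    """Dense 4-mode tensor from CP factors.

    U: (n_users, R), L: (n_locations, R), T: (n_slots, R), W: (n_weather, R).
    Entry [u, l, t, w] equals sum_r U[u, r] * L[l, r] * T[t, r] * W[w, r],
    i.e. the tensor is a sum of R rank-one tensors.
    """
    return np.einsum("ur,lr,tr,wr->ultw", U, L, T, W)

def cp_entry(U, L, T, W, u, l, t, w):
    """One predicted entry, computed without building the full tensor."""
    return float(np.sum(U[u] * L[l] * T[t] * W[w]))
```

In practice the full tensor is rarely materialized for large data; computing individual entries with `cp_entry` is enough for fitting on observed check-ins.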

With collaborative factorization, when an entity participates in multiple relations, the factor matrices share parameters, so knowledge is gathered from the feature matrices. The optimization is done by Stochastic Gradient Descent (SGD) [9] over the non-zero elements, and its output is the four factor matrices.

POI Recommendation

In this phase, missing values of the sparse tensor are recovered using the outer product of the four output factor matrices, which yields a denser tensor. The new tensor measures the associations among users, locations, time slots, and weather. The candidate locations are ranked by their predicted weights in the recovered tensor, and the locations with the highest values are returned as the top-K recommendations for users. The framework of the technique is presented in Fig. (3).

Fig. (3). Framework [39].
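The final ranking step can be sketched under the assumption that four CP factor matrices are already available. For a given user, time slot, and weather context, every location receives a predicted weight, and the K largest are returned. The function name `top_k_locations` and the factor names are hypothetical, and ties are broken by location index, which is a choice made here and not taken from the source.

```python
import numpy as np

def top_k_locations(U, L, T, W, user, slot, weather, k):
    """Indices of the k locations with the highest predicted weight.

    U, L, T, W are CP factor matrices with R columns for users, locations,
    time slots, and weather contexts respectively.
    """
    context = U[user] * T[slot] * W[weather]     # elementwise product, length R
    scores = L @ context                         # one predicted weight per location
    order = np.argsort(-scores, kind="stable")   # descending, ties by index
    return order[:k].tolist()
```

Scoring through the factors avoids building the whole recovered tensor when only one context is queried.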

Advantages
1. Considering multiple contextual factors from heterogeneous sources improves prediction.
2. Sparsity problems are handled effectively.

Disadvantages
1. Check-in data alone are sometimes insufficient to describe the exact contextual scenario.
2. Incorporating multiple contexts with high-level modeling gives better information, but it may affect cost, efficiency, or both.
3. Scalability with multiple contexts needs more attention.

POI Recommendation with Category Transition and Temporal Influence

A two-stage coarse-to-fine POI recommendation algorithm was designed by Lin et al. [31]. The algorithm considers the categories of POIs, which is effective for recommendation because it simplifies successive check-in preferences. The authors proposed two phases for personalized POI recommendation: category prediction and POI recommendation. Four factors are considered: category transition, temporal influence, user preference, and geographical influence. The system architecture is presented in Fig. (4).

Fig. (4). System Architecture [31].

The category transition is derived from check-in sequences $S_u = \langle (l_1, t_1), (l_2, t_2), \ldots, (l_n, t_n) \rangle$ with time threshold $\tau$. The category transition is modeled by Matrix Factorization (MF) [31, 11]. The category transition matrix can be defined as $K \in M^{m \times n}$, where m denotes the number of category transitions and n is the total number of users; $O \in M^{m \times f}$ and $P \in M^{n \times f}$ are the matrices that represent the hidden factors of category transitions and the users' preferences on those factors.

$$\arg\min_{O,P} \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} \left( O_i P_j^{T} - k_{ij} \right)^{2} + \lambda_k \lVert O \rVert^{2} + \lambda_k \lVert P \rVert^{2}$$

$$w_{ij} = \begin{cases} 1 & \text{if any transition} \\ 0 & \text{otherwise} \end{cases}$$

where $\lambda_k$ is the regularization constant and $w_{ij}$ is the indicator function. Usually, a user visits different POIs that fall under several categories at different periods. The time-category pair can be denoted as $(C_{l_i}, t_i)$, and MF is again used to model the user and time-category pair matrix. The time-category matrix can be denoted as $V \in M^{o \times n}$, with $W \in M^{o \times f}$ and $H \in M^{n \times f}$. The learning algorithm adopted by the authors is Alternating Least Squares (ALS) [31], which computes each latent factor by fixing the other factors in the two MF models. The next phase is POI recommendation, where all POIs of the predicted categories are treated as candidates. Collaborative filtering (CF) is applied with two approaches: User-based Collaborative Filtering (UCF) [31] and User-Time-based Collaborative Filtering (UTCF) [31]. UCF considers a user's preferences on particular items and aggregates the users who have item preferences in common. If two users share the same preferences on POIs, their similarity is higher; on that basis, the similarities of all users are computed. UTCF integrates time influence with user preferences. Here, time is divided into hourly slots, and UCF is extended with temporal influence. In UTCF, two users have higher similarity if both tend to visit two or more POIs of the same type in the same time slot, which makes those POIs preferable for the users. After the preferable POIs are obtained, the second part of POI recommendation, geographical influence, is applied.

Advantages
1. The key advantage of a Collaborative Filtering recommendation system is that it does not rely on machine-analyzable content and is therefore capable of making accurate recommendations [40].
2. Matrix factorization handles the problems of data set size and data sparsity in the recommendation system [41, 40].
3. In matrix factorization, the Alternating Least Squares (ALS) technique is relatively easy to implement and scales well to large amounts of data.

4. ALS alternately minimizes two loss functions: one with respect to the item matrix and one with respect to the user matrix.

Disadvantages
1. The key disadvantage of the user-based collaborative filtering model is data sparsity.
2. If there are many users and the users change frequently, the user-based collaborative filtering model requires too many calculations.
3. In collaborative filtering, it is challenging to include side features of the item (e.g., POI) in the recommendation system. Side features are not easy to include in Weighted ALS (WALS), although a generalization of WALS makes it possible.

A summary of the recommendation methodology is shown in Table 1.

Table 1. Summary table of recommendation methodology

References | Methodology | Features | Data description | Results
[34] | Time-aware Poisson Tensor Factorization | Multi-aspect reviews and ratings of hotels, user profiles, time of check-in, and other features of hotels. | Two data sets were taken from TripAdvisor, which include 1,800,059 reviews written by 447,555 users. | TS-PTF performs well for cold-start users and deals with data sparsity as well.
[27] | Dynamic prediction with tensor decomposition, curve fitting module. | User's current location, preferences over dynamic times, geographical influence of POIs, and correlation among POIs based on similarity. | Foursquare data set: it includes two cities, Tokyo and New York. Gowalla data set contains raw data from New York. | PRIME performed very well, using two latent spaces that embed user preferences and sequential patterns, and also exploits the metric embedding method for the recommendation by avoiding drawbacks of Matrix Factorization.
[39] | Multi-context-aware tensor decomposition | User details, location details, the time when the user checked in, and weather details of the location. | Yelp and Foursquare data sets used for New York and Tokyo. | Mean Reciprocal Rank (MRR) is used in the experiment, which gives better quality to ranked POIs.
[28] | Matrix Factorization, Collaborative Filtering. | Category transition pair matrix with category transition and user preferences, time-category pair matrix with user preferences. | LBSN dataset, Gowalla (12,459 users, 2,561,710 check-ins, 25,460 locations, 133 categories), used for the experiment. | UTCF method outperforms other baseline methods based on an increased time threshold.
[30] | Gaussian mixture model, inductive matrix completion for the recommendation. | Trajectory history extracted from check-in information, the latitude and longitude data to present a location. | Two datasets, Foursquare and Gowalla, are used in the experiment. The Foursquare dataset has 194,108 check-ins by 2,321 users at 5,596 POIs, and the Gowalla dataset has 456,988 check-ins from 10,162 users at 24,250 POIs. | The GAIMC (Geography Aware Inductive Matrix Completion) method gives the best performance when compared with a few baseline methods over all fractions of the training set. The AUC of GAIMC reaches its peak.
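The weighted matrix factorization objective and the ALS solver described for the category-transition model can be sketched as follows. This is a minimal illustration, not the implementation of [31]: the function names, the random initialization, and the default values of `rank`, `reg`, and `iters` are assumptions. `K` plays the role of the category transition matrix, `W` the 0/1 indicator weights, and `O` and `P` the two latent factor matrices.

```python
import numpy as np

def weighted_als(K, W, rank, reg=0.1, iters=20, seed=0):
    """Minimize sum_ij W_ij (O_i . P_j - K_ij)^2 + reg (||O||^2 + ||P||^2)
    by alternating exact least-squares updates of O and P."""
    m, n = K.shape
    rng = np.random.default_rng(seed)
    O = rng.normal(scale=0.1, size=(m, rank))
    P = rng.normal(scale=0.1, size=(n, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        for i in range(m):   # P fixed: ridge regression for row i of O
            A = (P * W[i][:, None]).T @ P + ridge
            O[i] = np.linalg.solve(A, P.T @ (W[i] * K[i]))
        for j in range(n):   # O fixed: ridge regression for row j of P
            A = (O * W[:, j][:, None]).T @ O + ridge
            P[j] = np.linalg.solve(A, O.T @ (W[:, j] * K[:, j]))
    return O, P

def objective(K, W, O, P, reg):
    """Value of the regularized weighted squared-error objective."""
    err = W * (O @ P.T - K) ** 2
    return float(err.sum() + reg * (np.sum(O ** 2) + np.sum(P ** 2)))
```

Each inner update solves its sub-problem exactly, so the objective cannot increase from one sweep to the next; this is the property that makes ALS simple to implement compared with tuning a gradient step size.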

CONCLUSION AND FUTURE SCOPE

A POI recommendation system is an information system that connects users with POIs and allows users to discover places to visit based on their interests. This chapter presented introductory information about recommendation systems and discussed the use of tensor-based methods for recommendation. A brief history of tensors, including the various decompositions applied to models in recommendation systems, was presented. Different tensor factorization approaches for recommending top POIs to users, together with their features, were also presented. The tensor-based recommendation system is a powerful tool that helps the tourism sector. A summary of the existing tensor-based recommendation systems was provided.

ACKNOWLEDGMENTS

The work described in this paper was fully supported by Mobile Computing Lab, Department of Computer Science and Engineering, Tripura University, Tripura, India.

REFERENCES

[1]

A. Karatzoglou, X. Amatriain, and L. Baltrunas, "Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering", 4th ACM conference on Recommender systems, pp. 79-86, 2010. [http://dx.doi.org/10.1145/1864708.1864727]

[2]

B. Berjani, and T. Strufe, "A recommendation system for spots in location online social networks", 4th Workshop on Social Network System, pp. 1-6, 2011. [http://dx.doi.org/10.1145/1989656.1989660]

[3]

C. Cheng, H. Yang, I. King, and M.R. Lyu, "A unified point-of-interest recommendation framework in location-based social networks", ACM Trans. Intell. Syst. Technol., vol. 8, no. 1, pp. 1-21, 2016. [PMID: 28344853]

[4]

C. Cheng, H. Yang, I. King, and M.R. Lyu, "Fused matrix factorization with geographical and social influence in location-based social networks", AAAI Conference on Artificial Intelligence, vol. 26, no. 1, pp. 17-23, 2012.

[5]

D.C. Funder, and C.R. Colvin, "Explorations in behavioral consistency: Properties of persons, situations, and behaviors", J. Pers. Soc. Psychol., vol. 60, no. 5, pp. 773-794, 1991. [http://dx.doi.org/10.1037/0022-3514.60.5.773] [PMID: 2072255]

[6]

S.A. Khan, S. Arif, and L. Bölöni, "Emulating the Consistency of Human Behavior with an Autonomous Robot in a Market Scenario", AAAI Workshop: Plan, Activity, and Intent Recognition, pp. 17-23, 2013.

[7]

L. Guo, H. Jiang, X. Wang, and F. Liu, "Learning to Recommend Point-of-Interest with the Weighted Bayesian Personalized Ranking Method in LBSNs", Information (Basel), vol. 8, no. 1, p. 20, 2017. [http://dx.doi.org/10.3390/info8010020]

[8]

H. Ma, H. Yang, M.R. Lyu, and I. King, "Sorec: social recommendation using probabilistic matrix factorization", 17th ACM conference on Information and knowledge management, pp. 931-940, 2008.

[9]

M. Pozo, and R. Chiky, "An Implementation Of A Distributed Stochastic Gradient Descent For Recommender Systems Based On Map-Reduce", International Workshop On Computational Intelligence for Multimedia Understanding (IWCIM), pp. 1-5, 2015. [http://dx.doi.org/10.1109/IWCIM.2015.7347074]

[10]

P. Zhou, C. Lu, Z. Lin, and C. Zhang, "Tensor Factorization for Low-Rank Tensor Completion", IEEE Trans. Image Process., vol. 27, no. 3, pp. 1152-1163, 2018. [http://dx.doi.org/10.1109/TIP.2017.2762595] [PMID: 29028199]

[11]

A.P. Singh, and G.J. Gordon, "Relational Learning via Collective Matrix Factorization", 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 650-658, 2008.

[12]

R. Vinodha, and R. Parvathi, "Location Based Point-Of-Interest Recommendation System Using CoPear Similarity Measure", International Journal of Scientific & Technology Research, vol. 8, no. 12, pp. 3689-3696, 2019.

[13]

S. Rendle, C. Freudenthaler, Z. Gantner, and L.S. Thieme, "BPR: Bayesian personalized ranking from implicit feedback", 25th Conference on Uncertainty in Artificial Intelligence, pp. 452-461, 2009.

[14]

S. Liu, "User Modeling for Point-of-Interest Recommendations in Location-Based Social Networks: The State of the Art", Mob. Inf. Syst., vol. 2018, pp. 1-13, 2018. [http://dx.doi.org/10.1155/2018/7807461]

[15]

P. Symeonidis, and A. Zioupos, "Matrix and Tensor Factorization Techniques for Recommender Systems", Springer International Publishing: New York, pp. 1-101, 2017.

[16]

S. Zhao, I. King, and M.R. Lyu, "Aggregated Temporal Tensor Factorization Model for Point-of-Interest Recommendation", Neural Process. Lett., vol. 47, no. 3, pp. 975-992, 2018. [http://dx.doi.org/10.1007/s11063-017-9681-8]

[17]

S. Rendle, and L.S. Thieme, "Pairwise Interaction Tensor Factorization for Personalized Tag Recommendation", 3rd ACM International Conference on Web Search and Data Mining, pp. 81-90, 2010.

[18]

S. Rabanser, O. Shchur, and S. Günnemann, "Introduction to Tensor Decompositions and their applications in Machine Learning", arXiv:1711.10781

[19]

W.R. Tobler, "A Computer Movie Simulating Urban Growth in the Detroit Region", Econ. Geogr., vol. 46, pp. 234-240, 1970. [http://dx.doi.org/10.2307/143141]

[20]

T.A.N. Pham, X. Li, and G. Cong, "A general model for out-of-town region recommendation", 26th International Conference on World Wide Web, pp. 401-410, 2017.

[21]

W. Luan, G. Liu, and C. Jiang, "Collaborative Tensor Factorization and its application in POI Recommendation", 13th International Conference on Networking, Sensing, and Control, Mexico, pp. 1-6, 2016. [http://dx.doi.org/10.1109/ICNSC.2016.7478984]

[22]

X. Wang, Y.L. Zhao, L. Nie, Y. Gao, W. Nie, Z. Zha, and T. Chua, "Semantic-based location recommendation with multimodal venue semantics", IEEE Trans. Multimed., vol. 17, no. 3, pp. 409-419, 2015. [http://dx.doi.org/10.1109/TMM.2014.2385473]

[23]

W., Luan, G., Liu, C., Jiang, and L., Qi, “Partitioned based Collaborative Tensor Factorization for POI Recommendation”, IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 3, 437−446, 2017.

[24]

X. Li, M. Jiang, H. Hong, and L. Liao, “A time-aware personalized point-of-interest recommendation via high-order tensor factorization”, ACM Transactions on Information Systems (TOIS), vol. 35, no. 4., Article No., vol. 31, pp. 1-23, 2017.

[25]

X. Long, and J. Joshi, "HITS-based POI Recommendation Algorithm for Location-Based Social Networks", IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 642-647, 2013. [http://dx.doi.org/10.1145/2492517.2492652]

[26]

X., Zhong, Y., Zhang, D., Yan, Q., Wu, Y., T., Yan, and W., Li, “Recommendations For Mobile Apps Based On The HITS Algorithm Combined With Association Rules”, IEEE Access, no. 7, 105572 – 105582, 2019.

[27]

X., Jiao, Y. Xiao, W.Zheng, H. Wang, and C.H. Hsu, “A novel next new point-of-interest recommendation system based on simulated user travel decision-making process”, Future Generation Computer Systems, vol. 100, 2019, 982-993, 2019.

[28]

M., Ye, P., Yin, and W., C., Lee, “Location recommendation for location-based social networks”, In 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2010, pp 458– 461.

[29]

Y. Yu, and X. Chen, "A Survey of Point-of-Interest Recommendation in Location-Based Social Networks", Workshops at the twenty-ninth AAAI conference on artificial intelligence, pp. 53-58, 2015.

[30]

W. Wang, J. Chen, J. Wang, J. Chen, and Z. Gong, "Geography-Aware Inductive Matrix Completion for Personalized Point-of-Interest Recommendation in Smart Cities", IEEE Internet Things J., vol. 7, no. 5, pp. 4361-4370, 2020. [http://dx.doi.org/10.1109/JIOT.2019.2950418]

[31]

I., C., Lin, Y., S., Lu, W., Y., Shih, and J., L., Huang, “Successive POI Recommendation with Category Transition and Temporal Influence”, In IEEE 42nd Annual Computer Software and

238 Artificial Intelligence and Data Science

Roy et al.

Applications Conference (COMPSAC), 2018, pp 57-62. [32]

Q., Yuan, G., Cong, Z., Ma, A., Sun, and N., M., Thalmann, “Time-aware Point-of-interest Recommendation”, In 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2013, pp. 363–372.

[33]

Z., Liu, “High Performance Latent Dirichlet Allocation for Text Mining”, Brunel University School of Engineering and Design. Ph.D. Theses, 2013, pp. 1-187.

[34]

S. Liu, Z. Chen, and X. Li, "Time-semantic-aware Poisson tensor factorization approach for scalable hotel recommendation", Inf. Sci., vol. 504, pp. 422-434, 2019. [http://dx.doi.org/10.1016/j.ins.2019.07.068]

[35]

G. Liao, S. Jiang, Z. Zhou, C. Wan, and X. Liu, "POI Recommendation of Location-Based Social Networks Using Tensor Factorization", 19th IEEE International Conference on Mobile Data Management, pp. 116-124, 2018. [http://dx.doi.org/10.1109/MDM.2018.00028]

[36]

S., A., Asl, S., Abukhovich, M., G., A., Mensah, A., Cichocki, A., H., Phan, T. Tanaka, I., Oseledets, “Randomized Algorithms for Computation of Tucker decomposition and Higher Order SVD (HOSVD)”., IEEE Access, vol. 9, pp. 28684-28706, 2020.

[37]

S., Leach, “Singular Value Decomposition A Primer”, Unpublished Manuscript, Department of Computer Science, Brown University, Providence, RI, USA , 1995, pp 1-8.

[38]

V., H., Thuc, P., Srinivasan, “A Latent Dirichlet Framework for Relevance Modeling”, In Asia Information Retrieval Symposium, 2009, pp 13-25.

[39]

J. Lu, and M.A. Indeche, "Multi-Context-Aware Location Recommendation Using Tensor Decomposition", IEEE Access, vol. 8, pp. 61327-61339, 2020. [http://dx.doi.org/10.1109/ACCESS.2020.2983555]

[40]

D., K., Bokde, S., Girase, and D., Mukhopadhyay, “Role of Matrix Factorization Model in Collaborative Filtering Algorithm: A Survey”., International Journal of Advance Foundation and Research in Computer, vol. 1, no. 12, pp. 111-118, 2014.

[41]

Y. Koren, R. Bell, and C. Volinsky, "Matrix Factorization Techniques for Recommender Systems", Computer, vol. 42, no. 8, pp. 30-37, 2009. [http://dx.doi.org/10.1109/MC.2009.263]

[42]

E. Frolov, and I. Oseledets, Tensor Methods and Recommender Systems, 2017. [http://dx.doi.org/10.1002/widm.1201]

[43]

J.L. Sarkar, and A. Majumder, "A new point-of-interest approach based on multi-itinerary recommendation engine", Expert Syst. Appl., vol. 181, 2021.115026 [http://dx.doi.org/10.1016/j.eswa.2021.115026]

[44]

J., L., Sarkar, A., Majumder, C., R., Panigrahi, and S., Roy, “Multitour: A multiple itinerary tourists recommendation engine”., Electron. Commerce Res. Appl., vol. 40, no. 100943, 2020.

[45]

M. Jamali, and M. Ester, "A matrix factorization technique with trust propagation for recommendation in social networks", 4th ACM conference on Recommender systems, pp. 135-142, 2010. [http://dx.doi.org/10.1145/1864708.1864736]

[46]

J. He, X. Li, L. Liao, and S. Song, "Inferring a personalized next point-of-interest recommendation model with latent behavior patterns", AAAI Conference on Artificial Intelligence (AAAI), vol. vol. 30, pp. 137-143, 2016. [http://dx.doi.org/10.1609/aaai.v30i1.9994]

Artificial Intelligence and Data Science, 2023, 239-261


CHAPTER 12

Exploring the Usage of Data Science Techniques for Assessment and Prediction of Fashion Retail: A Case Study Approach

Dillip Rout1,*

1 Department of Computer Science and Engineering, Centurion University of Technology and Management, Odisha, India

Abstract: This article studies a garment retail store through the attributes of its dresses and the associated sales information. Each dress in fashion retail has several attributes or features, and these features play a critical role in what customers select. The study establishes the relationships among these features and evaluates the importance of each attribute with respect to sales. Furthermore, it automates the recommendation of dresses from these attributes; this is merely a binary classification, but a useful one for retail sales. Moreover, the demand for sales is estimated over a period. All these objectives are achieved using one or more data science techniques. The case study shows that data science algorithms are helpful in decision-making for fashion retail.

Keywords: Categorical Features, Data Wrangling, Feature Engineering, Market Basket Analysis, Time Series Forecasting.

* Corresponding author Dillip Rout: Department of Computer Science and Engineering, Centurion University of Technology and Management, Odisha, India, Email- dil- [email protected]

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

INTRODUCTION

The fashion and garment industry is expanding rapidly. The underlying reason is the diversity in demand, which in turn changes quickly as customer choices follow trends [1]. Consequently, the business model evolves over time and needs regular revision to match the swinging demand and choice [2, 3]. In such a scenario, rules of thumb and general perception are not enough to keep trading operations running smoothly. Hence, some manufacturing units and retail outlets in the fashion industry have already automated their businesses. Nevertheless, automation is a difficult and critical aim to achieve, as it requires an ample understanding of the underlying business logic, cost, attributes of fashion, demand, choice of customers, etc. [4]. Therefore, it is not surprising that many of these businesses are yet to be automated, and those already automated struggle to maintain consistent profit. The major bottlenecks are the characteristics of the industry, which differ from case to case, in addition to the varying demand and choice of customers [5]. Thus, understanding the data is vital in this situation, and the formulation of models is a necessity for the fashion retail industry, possibly at an individual level.

In this regard, this paper explores the usage of data science techniques for the automation of a fashion retail outlet. A dataset of retail information is given, which contains the attributes (features) and sales of a set of garments. The primary aim is to find the importance of the garment features that influence sales, listing the major features that lead to higher sales. Furthermore, the user ratings are inspected to gauge their importance for total sales. In addition, the analysis forecasts the future demand for each garment based on its sales history. Each of these questions is addressed with one or more machine learning algorithms, such as Linear Regression, Logistic Regression, Decision Tree, and Random Forest. Note that a single algorithm may not be sufficient to find a definite solution, as the approach is data-driven; hence, a combination of algorithms is used to recommend the best solution. Besides measuring the features, the algorithms themselves are evaluated for how well they fit the objectives and the data. All the applied algorithms are demonstrated through a case study consisting of 15 features, including attributes and sales data, for 500 garments.

The remainder of the paper is organized as follows.
The rest of this section presents a literature review followed by the list of objectives of the study. Then, Section 2 discusses the theoretical background of the applied machine learning techniques. Next, the stated case study is described in Section 3. Lastly, the concluding remarks are presented in Section 4.

PREVIOUS WORKS

In this subsection, a review of previous works is presented. This paper focuses on fashion or garment retail; however, very few papers share this theme. Hence, the review is extended to articles that address general retail problems. The literature shows that the retail industry is studied in three facets, namely, the behavior of consumers, decision-making, and demand forecasting. Most of the literature focuses on demand estimation for future production, whereas only a few papers address the other facets. Here, a review is presented for all three facets as follows.

The behavior of consumers plays a crucial role in determining the production needs of the retail industry [1 - 5]. Note that this covers both offline and online consumer behavior. In general, several classifiers are used to fit consumer behavior and build a recommender system. For instance, the decision table classifier provides the highest accuracy for consumer behavior on online shopping data [6], whereas the filtered classifier shows the lowest accuracy. Finally, the authors suggest a recommender system through which consumers can easily select the items they require. Furthermore, the authors state that clustering techniques, association rules, random trees, random forests, etc., have also been attempted in the past to understand the behavior of consumers. Another article reveals that a combination of sentiment analysis and neural networks gives better precision in fixing the price of products [7]. Thus, it is particularly important to analyze the product data as well as the needs of the consumers in order to set the price of a product. Furthermore, the purchase decision and the situations in which consumers buy are studied with digital signage to predict behavior [8]. In this case, the support vector machine is reported to give the best output with high accuracy. In a different direction, the usage of the Internet of Things in smart stores is studied through applications of indoor positioning, augmented reality, facial recognition, and interactive displays [9]. Furthermore, attitudes and subjective norms are found to be the key predictors for online fashion renting, as established through confirmatory factor analysis and structural path analysis [5]. Ease of shopping is the most influential factor in consumer behavior [10].
It is concluded that these applications improve the experience of consumers and their buying behavior in offline stores. Overall, multiplexing technology has played a critical role in both online and offline shopping, and sales have increased.

Decision-making in retail management is extremely difficult due to the variety of features (characteristics) to be processed [11]. So, the dimensionality is first reduced using correlation analysis, and then a random forest classifier is applied to the weakly correlated features. Note that highly correlated features are discarded, as they are redundant for classification. Likewise, machine learning techniques are used together with non-parametric statistical tools to determine the performance of a sales strategy [12]. It is found that discount offers are effective in increasing sales, possibly with combined products. Nevertheless, there is a saturation point beyond which a larger discount does not increase sales. The manufacturing of Zara is shared across a couple of sub-continents because of its popularity [13]. Zara started its manufacturing in Europe but spread over other sub-continents due to the liking of its tailor-made fashion. Competition across countries has made this challenging, yet sales have grown. Hence, it is necessary to study the elements affecting the sales and popularity of the fashion industry. Clustering-based algorithms are applied to assess the sales of retail stores [14]. Four such algorithms are studied, namely, the K-Means, Density-based, Filtered, and Farthest First clustering algorithms, to classify the retail data. It is found that the Farthest First clustering algorithm is robust and correctly classifies the retail sales in order to derive insights.

Big data helps characterize retail analytics, which can reveal the underlying facts along the dimensions of customers, products, time, location, and the channels used for sales [15]. Big data is combined with statistical and theoretical insights to guide systematic answers to retailing questions, and it is also useful in streamlining analysis. It provides a new source of data and enables the study of large-scale correlations. Note that it overcomes issues in traditional predictive analytics tools and has a larger impact with less bias. Similarly, a study measures the impact of big-data analytics in retailing, which provides additional opportunities in terms of retail management and research [4]. Furthermore, social networking, participation, and feedback of consumers are studied from a retail perspective to enhance decision-making in retail management [16]. Big data has provided a platform to establish relationships between online and offline consumers, manufacturers and retailers, and suppliers and consumers. Nonetheless, some facts remain the same; for example, consumers continue to buy products from a channel where they have had a safe and happy experience. Therefore, the literature indicates that the retail industry has more scope for analysis and research on sales and decision-making in order to sustain and grow.

Most of the research articles found on retail concern demand forecasting.
A comprehensive review is provided for fashion retail in view of operational issues on both the demand and supply sides [17]. Forecasting is cumbersome and challenging because uncertainties are involved on both sides. In particular, products with a short life cycle and high demand uncertainty are difficult to formulate as mathematical models. Hence, researchers identify risk management as the bottleneck to sales forecasting, which needs micro-level estimation of demand with respect to season, pricing delays, distinction in products, and product design and manufacturing decisions. For instance, for an online retailer, pricing is optimized based on demand forecasting [18]. In particular, non-parametric machine learning techniques are used to assess historical lost sales. Further, the demand for new products is predicted in view of the competitive market, which helps in setting appropriate prices for the respective products. It is found that the regression tree model combined with the bagging method is the most suited for demand-dependent pricing. Moreover, combining operations research with machine learning forms a robust model for pricing, which needs further investigation.

For predicting demand, a comparative study is made using machine learning techniques such as normal regression and boosting algorithms [19]. It is found that the Gradient Boost algorithm is the best model for prediction, as its Root Mean Square Error (RMSE) is the lowest and its R2 value is the highest. It is also revealed that hyperparameter tuning is required for high performance, especially for the AdaBoost algorithm. Similarly, machine learning techniques are used for demand estimation in retail [20]. These models are believed to establish the inherent relationships among parametric models, user-selected covariates, and completely non-parametric approaches. Furthermore, deep learning methods are reported to have high prediction accuracy for retail sales [21]. In particular, they outperform logistic regression, which has low accuracy because of the presence of multiple attributes. In general, machine learning methods are suitable for handling large data, and classification methods are more efficient than regression in this context [3]. Overall, it can be stated that machine learning and deep learning methods are suitable for demand prediction for retail sales.

A handful of articles have developed models for retail management. The majority of these focus on consumer behavior and demand forecasting, and only a few papers are helpful for decision-making in in-store retail management. In particular, the relationships among the attributes (features) are given less importance. Moreover, where they are considered, they are modeled only in terms of correlation. Note that correlation analysis can only be applied to numeric attributes, whereas a retail store may have categorical attributes. Hence, it is important to look into this aspect in more detail, since the relationships among the attributes provide a base for further analysis of sales and pricing.
Thus, apart from demand forecasting, the inherent relationships among attributes must be studied.

Goal and Objectives

The goal of the study is to automate the decision-making process in fashion retail through data science techniques. The spectrum of this decision-making is wide: for instance, which garments are liked by the customers, and what underlying factors drive the sale of the garments. This study will be beneficial for any high-end fashion retail store looking to expand its business, as it helps to understand the market and find the current trends in the industry. However, the available data may not be readily usable for prediction, as it is likely to contain many errors due to manual entry. Hence, a careful procedure is needed to prepare the data and predict the needs, ultimately providing end-to-end automation.


In this context, the particular objectives of the study are as follows.

1. To process the given data into a format that is helpful in decision-making.
2. To identify the garments having high profits based on the given attributes of the product, like the style, season, etc.
3. To analyze the attributes of dresses and find the leading attributes affecting the sale of a dress.
4. To find whether the rating of a dress affects the total sales, in order to regularize the rating procedure and find its efficiency.
5. To build a model to predict the recommendation of products for future stocks.
6. To build a predictive model following the trend of total sales for each dress, so that stocks can be extended for a period of three more alternate days.

The remainder of the article is organized as follows. In the next section, a framework of analysis is established. Then, a case study on the set of available input data is described. Thereafter, the descriptive and predictive analyses are applied and discussed with the results.

Proposed Framework

The proposed framework for analyzing garment retail is presented in this section. Note that this paper studies garment retail through a case study; however, it is important to show the theoretical background relevant to the study, which is explained as follows. The proposed framework is threefold. The first part is the preparation of the data before processing, as there might be errors in the given data (Section 2.1). The second shows how the features are related to each other and to the target variable (Section 2.2). Finally, the basis for the required prediction is discussed in Section 2.3.

Data Preprocessing

The preprocessing of the data includes dealing with errors, which corresponds to the first objective. Note that the errors are of various types. For instance, there may be several not applicable (NA) values in the given data. These NAs need to be filled with appropriate values.
This article deals with both numerical and categorical variables. For numerical variables, missing values are filled with the average, while for categorical variables they are filled with the most frequent (mode) value. Furthermore, there can be spelling mistakes made during data entry, which are dealt with by mapping them to the correct values. The rest of the procedure is covered in the discussion of the case study.

Feature Engineering

This approach depicts the relationships among the features, as required by the second, third, and fourth objectives. Further, it identifies the important features with or without the target variable. However, note that numerical and categorical variables need to be treated differently in this case. For numerical variables, correlation can be used to show the relationships among the independent variables. Similarly, the significance of a variable with respect to the output can be read from its p-value in a regression analysis. On the other hand, the categorical variables are processed with the Random Forest technique to determine their importance. Furthermore, the coupling among the categorical variables can be established with the help of market basket analysis. This is a bit tricky, as market basket analysis has some preconditions which may not be readily met by the data. Hence, the data must be prepared before applying market basket analysis.

Predictive Analysis

Here, the predictive analysis is discussed in the context of garment retail. This refers to the fifth and sixth objectives, as mentioned earlier. Again, the prediction may be a regression (numerical) or a classification (categorical), as needed. In this case, classification is required, since predicting the recommendation of a dress is nothing but a binary classification. In addition, sales forecasting is needed as per the sixth objective; for this, a time series analysis is used to predict the sales. The discussion in this section forms a basis for the approaches used to achieve the defined objectives. The particular techniques are discussed while going through the case study, which is explained in the following section.
Experimental Study

The objectives and the theoretical background were presented in the previous sections. This section discusses the techniques applied to achieve each of the stated objectives, in the following sequence: preprocessing of the given data, feature engineering, and finally, predictive analysis. The experiments are carried out using the R programming language.


Data Description and Preparation

The preprocessing of the data includes data wrangling and exploratory analysis. First, the given data and its attributes are described, followed by the cleaning of the data and the exploratory analysis. Two files are provided: (i) attributes of dresses and (ii) sales of dresses. The attributes and their descriptions are given in Table 1. The sales file records the sales of each dress on a particular date. The dates range from 29/8/2013 to 12/10/2013, and the sales are registered on alternate days.

Table 1. Attributes of garments with respective descriptions.

Attribute | Description
Dress_ID | A unique identifier for each dress
Style | Style of dress; can belong to one of 12 styles
Price | Price category of the dress
Rating | A number between 0 and 5, rating of the dress
Size | Size of the dress (small, medium, large, XL, and free)
Season | Season category of the dress, i.e., summer, spring, etc.
Neckline | Type of neckline, for example, V-neck, collar, etc.
Sleeve length | Length of the sleeve: full, three-quarters, etc.
Waistline | The waistline of the dress
Material | The material of the dress, for example, silk, cotton, etc.
Fabric type | Fabric type of dress
Decoration | The decoration of the dress, like ruffles, embroidery, etc.
Pattern Type | Pattern type of the dress: dot, animal print, etc.
recommendation | A binary value indicating whether the dress is recommended (1) or not (0)
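To make the time axis of the sales file concrete, the following minimal sketch enumerates the observation dates implied by the range and the alternate-day spacing stated above. The chapter's experiments use R; Python is used here only for illustration, and the function name alternate_days and the assumption that no observation date is missing are hypothetical.

```python
from datetime import date, timedelta


def alternate_days(start: date, end: date, step_days: int = 2) -> list:
    """Return every step_days-th date from start up to end (inclusive if end is on the grid)."""
    days = []
    current = start
    while current <= end:
        days.append(current)
        current += timedelta(days=step_days)
    return days


# Assumption: no observation date is missing between the two end points.
sales_dates = alternate_days(date(2013, 8, 29), date(2013, 10, 12))
print(len(sales_dates), sales_dates[0].isoformat(), sales_dates[-1].isoformat())
```

Under this assumption, the range corresponds to 23 observation dates per dress, which is the number of sales columns a wide-format sales table would need.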

Issues and Resolution of Data

The data received is not clean, so loading and correcting the data are discussed here. The data given in the Excel sheet was converted into sales and attribute datasets as CSV files. The datasets had the following issues, which were adjusted. In the sales data, the date field was recorded in several different formats; it was converted to a consistent date type in the program's data frame.
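The date conversion described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study. The list of candidate formats is an assumption, since the chapter does not state which formats occurred, and the name parse_sales_date is hypothetical.

```python
from datetime import date, datetime

# Assumed candidate formats; the chapter does not list the formats actually found.
CANDIDATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%Y-%m-%d", "%d.%m.%Y")


def parse_sales_date(raw: str) -> date:
    """Try each candidate format in turn and return the first successful parse."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format does not match, try the next one
    raise ValueError(f"unrecognized date format: {raw!r}")


# Three spellings of dates that all end up as the same date type.
for raw in ("29/8/2013", "2013-10-12", "12.10.2013"):
    print(raw, "->", parse_sales_date(raw).isoformat())
```

Raising an error for unknown formats, instead of silently guessing, keeps unparseable entries visible so that they can be corrected by hand.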


In the case of numeric values, all NAs or non-numeric entries are replaced with 0s. In the case of categorical values, NAs and undefined or unrecognizable types are replaced by the most frequently occurring value. The consistency of the sales and attribute datasets is checked by ensuring that all dress codes in the attribute dataset have sales and vice versa. The consistency of the names in the attribute dataset is also verified: for the categorical variables, differences in letter case are resolved by converting everything to lowercase, and hyphens in observations are replaced with underscores, since hyphens create issues when dummy variables are generated for factors. Rows belonging to low-frequency levels, say fewer than 5 occurrences, are removed, since they create trouble in fitting models when the data is split into training and test sets. Finally, a combined dataset, referred to as total, is generated by including the features from the attribute dataset and the total sales of each item. It gives an overview as a linked dataset and is useful for predicting and/or highlighting the impact of one or more features on another set of features.

Exploratory Analysis

In this subsection, a preliminary exploratory analysis is conducted to understand the patterns of a few features and the relationships among them. In particular, the pattern of a feature is tested for normality, and the relationships are examined through the impact of one feature on another, as described below.

Normality Check: Here, the interest is to check the normality of Rating and TotalSales (obtained by adding the Sales of each day for a particular Dress_ID) in the total dataset. Before performing any normality test, the summary of these features is given in Table 2. Clearly, the mean is not equal to the median for either feature, so they might not follow a normal distribution. Also, note that there is a huge gap between the third quartile and the maximum value for TotalSales.
On the other hand, the first and third quartiles of Rating lie very close together.

Table 2. Distribution summary of TotalSales and Rating attributes of garments.

Feature | Minimum | 1st Quartile | Median | Mean | 3rd Quartile | Maximum
TotalSales | 19 | 1064 | 3002 | 6336 | 6664 | 91556
Rating | 0 | 4.025 | 4.6 | 3.589 | 4.8 | 5
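A summary of the kind shown in Table 2 can be sketched as follows, assuming plain Python instead of the R tooling used in the study. The toy_sales values are invented for illustration and are not the study's data, and the name summarize is hypothetical. The "inclusive" quantile method is chosen because it interpolates between order statistics in the same way as the default quantile rule in R.

```python
import statistics


def summarize(values):
    """Six-number summary: minimum, quartiles, mean and maximum of a numeric feature."""
    data = sorted(values)
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": med,
        "mean": statistics.fmean(data),
        "q3": q3,
        "max": data[-1],
    }


# Invented, right-skewed toy values (not the study's data).
toy_sales = [19, 400, 1064, 2100, 3002, 4500, 6664, 12000, 91556]
summary = summarize(toy_sales)
print(summary)

# A mean far above the median hints at right skew, as argued for TotalSales.
print("mean exceeds median:", summary["mean"] > summary["median"])
```

Comparing the mean with the median is only a first indication; the formal test and the plots that follow in the text provide the actual evidence.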


The next test for normality is done with the shapiro.test method (the Shapiro-Wilk test). Note that the null hypothesis of this test is that the population is normally distributed. The reported p-value is 2.20E-16 for both Rating and TotalSales, far below any conventional significance level, so the null hypothesis is rejected: these two features are not normally distributed. To cross-check this, a Quantile-Quantile (Q-Q) plot is drawn for TotalSales and Rating, as shown in Fig. (1). It is clearly visible that a straight line cannot fit either curve; hence, the data is not normally distributed for these features.

Fig. (1). Q-Q plot for Total Sales and Rating: (a) Q-Q plot of Total Sales; (b) Q-Q plot of rating. Both panels are Normal Q-Q plots with Sample Quantiles on the vertical axis.
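The construction behind a normal Q-Q plot can be sketched as follows, assuming plain Python rather than the R plotting routine used for Fig. (1). Each sorted observation is paired with the standard normal quantile at the plotting position (i - 0.5)/n, which is one common convention and is assumed here. The correlation of the resulting points is used as a rough straightness score. The names qq_points and qq_straightness and the toy data are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def qq_points(values):
    """Pair each sorted observation with its theoretical standard normal quantile."""
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, data))


def qq_straightness(values):
    """Pearson correlation of the Q-Q points; values near 1 mean a nearly straight line."""
    points = qq_points(values)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented right-skewed values: a few very large observations bend the Q-Q curve.
toy_sales = [19, 150, 400, 800, 1064, 2100, 3002, 4500, 6664, 12000, 40000, 91556]
print(round(qq_straightness(toy_sales), 3))
```

The score is only a numeric companion to the visual check; it does not replace the Shapiro-Wilk test used in the text.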

Again, Kernel-Density plots are drawn to further examine the Total Sales and Rating patterns, as shown in Fig. (2). As expected from Table 2 and the normality test, neither feature looks normally distributed. In particular, the Total Sales data is right-skewed, meaning that most of the sales counts lie on the lower side, although a few lie on the extremely high side. Furthermore, the Rating data is clustered in two places: almost 25% on the extremely low side, and the rest on the extremely high side. All these experiments show a lot of variation in the Rating and Total Sales data. It implies that some dresses sell far more than others, and the same holds for ratings. Ultimately, this makes prediction and analysis difficult.

Fig. (2). Kernel-Density plot for Total Sales and Rating: (a) Kernel-Density plot of Total Sales, density.default(x = total_df$TotalSales), N = 402, Bandwidth = 1134; (b) Kernel-Density plot of rating, density.default(x = total_df$Rating), N = 402, Bandwidth = 0.1569.
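A kernel density estimate of this kind can be sketched from first principles as follows, in Python rather than R. The bandwidth rule below is Silverman's rule of thumb, which is the default bandwidth selector of R's density function. The Gaussian kernel, the function names, and the toy ratings are assumptions made for illustration, and the sketch does not attempt to reproduce the bandwidths reported in Fig. (2).

```python
import math
import statistics


def nrd0_bandwidth(values):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(values)
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0:
        spread = sd or abs(values[0]) or 1.0  # fallbacks for degenerate data
    return 0.9 * spread * n ** (-0.2)


def gaussian_kde(values, bandwidth):
    """Return a function x -> estimated density using a Gaussian kernel."""
    scale = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return scale * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


# Invented ratings with one group at 0 and one group near the top of the scale.
toy_ratings = [0.0] * 5 + [4.4, 4.5, 4.5, 4.6, 4.6, 4.6, 4.7, 4.7,
                           4.8, 4.8, 4.8, 4.9, 4.9, 5.0, 5.0]
bandwidth = nrd0_bandwidth(toy_ratings)
density = gaussian_kde(toy_ratings, bandwidth)
for x in (0.0, 2.5, 4.7):
    print(x, round(density(x), 4))
```

On such two-cluster data the estimate is high near 0 and near the top ratings and close to zero in between, which is the bimodal shape described for Rating.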


Relationships: Here, the relationships among the Total Sales, Rating, Price, and Style features are established, since these are important for the objectives of this article. The distributions of Total Sales and Rating have already been described, but those of Price and Style have not, which would make the analysis inconvenient. So, first, the distributions of Price and Style are shown in Fig. (3). It can be observed that the average Price type dominates the other Price types, whereas the casual Style is the most frequent Style. In addition, the relationship between Price and Style is presented in Fig. (4). It is evident that the most common Price types are average and low, which is most clearly visible in the most frequent Style, casual.

Fig. (3). Distribution of Price and Style data: (a) Pie chart of Price (levels: low, average, medium, high, very_high); (b) Histogram of Style, titled "Distribution of Dresses Style" (x-axis: Style Name; y-axis: Count).

Fig. (4). Price variation with respect to styles.


In addition to the aforementioned relationships, the approximate impact of Price and Style on Rating and Total Sales is investigated. First, the impact of Price on Rating and Total Sales is shown in Table 3; specifically, the mean Rating and mean Total Sales are calculated for each Price type. Low Price dresses have the highest Rating as well as the highest Total Sales. On the other hand, a very high Price leads to a good Rating but very low Total Sales. Further, average Price dresses have decent Ratings and Total Sales. Overall, customers evidently do not go for high or very high Price dresses; rather, they are satisfied with low and average Price dresses. Similarly, the impact of the various Style types on Rating and Total Sales is shown in Table 4. As seen earlier in Fig. (3b), casual is the most frequent Style; it has a decent Rating, but its Total Sales are below average. In contrast, vintage Style has an average Rating but the highest Total Sales. In general, dresses of highly rated Styles (except brief) do not have high Total Sales. Rather, several Styles with moderate or low Ratings, such as vintage, achieve high Total Sales.

Table 3. Impact of Price types on Rating and Total Sales.

Price        Avg Rating   Avg Total Sales
average      3.5          6618
high         2.79         3886
low          3.85         7110
medium       3.31         2975
very high    3.75         1322

Table 4. Impact of Style types on Rating and Total Sales.

Style        Avg Rating   Avg Total Sales
bohemian     3.74         3120
brief        4.08         9540
casual       3.64         5697
cute         2.99         6070
novelty      2.01         2678
party        3.84         3959
sexy         3.61         9492
vintage      3.28         10084
work         4.06         5398
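The per-level averages reported in Tables 3 and 4 are group means. A minimal Python sketch of that computation is given below; the helper `group_means` and the four records are hypothetical and serve only to illustrate the calculation, not to reproduce the chapter's data or its R code.

```python
from collections import defaultdict
from statistics import mean

def group_means(rows, key, fields):
    """Average each field in `fields` over the rows that share a value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        level: {f: mean(r[f] for r in members) for f in fields}
        for level, members in groups.items()
    }

# Hypothetical records; the values are illustrative only.
dresses = [
    {"Price": "low", "Rating": 4.0, "TotalSales": 8000},
    {"Price": "low", "Rating": 3.0, "TotalSales": 6000},
    {"Price": "high", "Rating": 2.5, "TotalSales": 3000},
    {"Price": "high", "Rating": 3.5, "TotalSales": 5000},
]
summary = group_means(dresses, key="Price", fields=["Rating", "TotalSales"])
print(summary)
```

Replacing `key="Price"` with `key="Style"` would give the analogue of Table 4 for records that carry a Style field.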


Feature Engineering

As per the second to fourth objectives, the impact of each feature on Total Sales needs to be studied. In other words, this is a superset of the question asked in the previous subsection. Here, feature importance is computed with various models to measure the overall influence of every feature on the sales of each dress. One additional model, mRMRE, is run on the original data as well as on the data with dummy variables (mRMRE-D). Note that categorical features have several levels, so a model may report an importance for each level instead of for the feature itself. In such cases, the median of the level importances is taken as the importance of that feature. In the case of Linear Regression, however, the complement of the mean p-value (1 - mean p-value) is taken as the importance of the corresponding feature. The importance of each feature with respect to Total Sales, as given by the various predictive models, is compared in Table 5. To get a rough idea of which features have more impact, the geometric mean of each feature's importance values across the models is taken. Note that the geometric mean does not admit negative values. So, each negative value is transformed into a positive number by dividing it by 10 and multiplying it by -1, so that a negative value contributes less than a positive value of the same magnitude. The features are then sorted in descending order of geometric mean and grouped to give a sense of their relative importance. It can be verified that Rating is clearly dominant among all features, which means that Rating has the greatest influence on customers' decisions to purchase a dress. Along with Rating, Pattern.Type and Recommendation have a higher impact on Total Sales. As seen earlier and observed in this case, Style has a moderate impact, whereas Price has a low impact on TotalSales.

Impact of Rating on Sales

This task is a special case of the question addressed in the previous subsection. Here, the impact of a single feature, Rating, on the Total Sales of each dress is evaluated. Rating was observed in Table 5 to be the most influential feature. Nevertheless, further experiments, namely Linear Regression and Associativity, are carried out in this subsection to illustrate its dominance. To demonstrate the influence of Rating, three other high-impact features, namely Pattern.Type, Recommendation, and Style, are included for comparison purposes.


Table 5. Comparison of the impact of Features on TotalSales using various models.

Feature          RF(1)     mRMRE     mRMRE-D   Boruta    LR(2)    GM
Rating           0.381     0.236     0.236     13.226    1.000    0.775
Pattern.Type     0.454     0.098     0.035     4.633     0.813    0.358
Recommendation   -1.395    0.033     0.033     1.958     0.708    0.185
Style            0.139     0.029     -0.016    3.670     0.356    0.097
Size             -0.180    0.065     0.001     4.746     0.707    0.075
NeckLine         -0.729    -0.086    -0.044    1.140     0.406    0.066
Season           -0.357    0.002     0.004     2.885     0.597    0.057
waiseline        -0.501    0.054     0.010     -0.441    0.429    0.055
Fabric Type      -0.451    -0.059    -0.024    0.706     0.558    0.048
Decoration       0.214     0.027     -0.013    -0.552    0.339    0.042
SleeveLength     0.044     -0.096    -0.008    0.351     0.308    0.032
Price            -1.251    -0.021    -0.048    0.107     0.173    0.030
Material         -0.214    -0.098    -0.004    -0.573    0.529    0.019

RF = Random Forest, LR = Linear Regression, GM = GeoMean.
(1) The values are obtained by taking the natural logarithm and subtracting 22.
(2) The values are obtained by taking the complement of the mean p-value of all the levels per feature.
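The GM column of Table 5 can be illustrated with a short Python sketch. It assumes that GM is the geometric mean of the five model scores (RF, mRMRE, mRMRE-D, Boruta, LR) after each negative score is divided by 10 and multiplied by -1, as the text describes. The function names `to_positive` and `geometric_mean` are hypothetical; the input values are copied from Table 5. Under this assumption, the results for the four rows used here agree with the tabulated GM values to within rounding.

```python
import math

def to_positive(value):
    """Map a negative importance to a positive one: divide by 10, multiply by -1."""
    return -value / 10.0 if value < 0 else value

def geometric_mean(values):
    """Geometric mean of the transformed importance values (all must be > 0)."""
    positives = [to_positive(v) for v in values]
    return math.exp(sum(math.log(p) for p in positives) / len(positives))

# RF, mRMRE, mRMRE-D, Boruta and LR importances copied from Table 5.
table5 = {
    "Rating": [0.381, 0.236, 0.236, 13.226, 1.000],
    "Pattern.Type": [0.454, 0.098, 0.035, 4.633, 0.813],
    "Recommendation": [-1.395, 0.033, 0.033, 1.958, 0.708],
    "Style": [0.139, 0.029, -0.016, 3.670, 0.356],
}
gm = {feature: geometric_mean(scores) for feature, scores in table5.items()}
for feature, score in gm.items():
    print(f"{feature}: {score:.3f}")
```

Note that a score of exactly zero would make the logarithm undefined; the sketch does not handle that case because none of the rows used here contains a zero.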

The impact of Rating, along with that of Pattern.Type, Recommendation, and Style, is compared in Table 6 using the Linear Regression model. Two sets of data are used for the computation: the original data (lm_fit) and the data with dummy variables (lm_fit_D). The significance of the features is measured in terms of the p-value and its significance code. Recall that the null hypothesis states that a feature has no impact on the response. Thus, if the p-value is lower than the threshold (0.05 in general), the null hypothesis is rejected in favour of the alternative hypothesis (the feature has an impact on the response). In other words, the lower the p-value, the higher the importance of the feature. In this context, Rating is far ahead of the other features, since it has the highest significance code (***). For the other features, only a couple of levels reach, at most, the second-highest code (**). Clearly, Rating has the highest influence on Total Sales.


Table 6. Comparison of the impact of some Features, including Rating, on Total Sales using Linear Regression.

Feature                  lm_fit_pvals   Sign(a)   lm_fit_D_pvals(b)   Sign(a)
Recommendation           0.23807        -         0.23807
Stylebrief               0.08771        .         0.43719
Stylecasual              0.29322        -         0.41232
Stylecute                0.19721        -         0.92402
Stylenovelty             0.75490        -         0.78727
Styleparty               0.49631        -         0.77110
Stylesexy                0.02608        *         0.80790
Stylevintage             0.03228        **        0.30868
Stylework                0.43719        -         0.25072             -
Pattern.Typeanimal       0.02122        *         -                   -
Pattern.Typedot          0.46702        -         0.02122             -
Pattern.Typegeometric    0.11406        -         0.15902             **
Pattern.Typenull         0.00410        **        0.00148             -
Pattern.Typepatchwork    0.01139        -         0.86433             -
Pattern.Typeprint        0.06684        .         0.71228             -
Pattern.Typesolid        0.04057        *         0.24502             -
Pattern.Typestriped      0.02122        *         0.26062             -
Rating                   0.00001        ***       0.00001             ***

(a) Significance codes for p-value: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1.
(b) The input data contains dummy variables for categorical features.
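The significance codes listed under Table 6 are a simple mapping from p-value ranges to symbols. The Python sketch below encodes that mapping; the function name `significance_code` is hypothetical, the interval boundaries are treated as inclusive on the upper side (an assumption), and the example p-values are copied from the lm_fit column of Table 6.

```python
def significance_code(p_value):
    """Map a p-value to the significance codes listed under Table 6."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    if p_value <= 0.1:
        return "."
    return " "

# p-values copied from the lm_fit column of Table 6.
examples = [("Rating", 0.00001), ("Pattern.Typenull", 0.00410),
            ("Stylesexy", 0.02608), ("Stylebrief", 0.08771),
            ("Recommendation", 0.23807)]
for name, p in examples:
    print(f"{name:18s} {p:.5f} {significance_code(p)!r}")
```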

Again, for demonstration purposes, the associativity of Rating, Pattern.Type, Recommendation, and Style with respect to TotalSales is compared in Table 7. Note, however, that Rating is a numeric variable, to which the associativity test does not apply. Hence, Rating is converted into a categorical variable, RatingType, with the levels Excellent, Super, Good, Bad, and Poor (high to low), based on heuristics and quartile values. TotalSales is likewise converted into a categorical variable, as discussed previously. Further, each feature has several levels, and the associativity is actually calculated for each level, so the overall associativity of a categorical variable is taken as the median of the values obtained for its levels. It can be observed that RatingType scores well on every metric: it has the highest Confidence and Lift and is second only to Recommendation in Support, Coverage, and Count. Lift is the most relevant metric here, as it measures the strength of the relationship between the attributes.


Table 7. Comparison of the associativity of features including Rating on TotalSales.

Feature          Support   Confidence   Coverage   Lift      Count
RatingType       0.10075   0.33471      0.29104    1.33863   40.5
Pattern.Type     0.02488   0.31534      0.07214    1.26134   10.0
Style            0.02239   0.30278      0.06965    1.21109   9.0
Recommendation   0.14925   0.25171      0.57214    1.00185   60.0
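For readers unfamiliar with the columns of Table 7, the sketch below computes the same five metrics for a single rule LHS -> RHS in Python, using the standard definitions from association rule mining. The function `rule_metrics` and the eight toy records are hypothetical; the chapter's own figures are medians over many such rules and are not reproduced here.

```python
def rule_metrics(transactions, lhs, rhs):
    """Support, confidence, coverage, lift and count for the rule lhs -> rhs.

    `transactions` is a list of sets of items; `lhs` and `rhs` are single items.
    """
    n = len(transactions)
    count_lhs = sum(1 for t in transactions if lhs in t)
    count_rhs = sum(1 for t in transactions if rhs in t)
    count_both = sum(1 for t in transactions if lhs in t and rhs in t)
    support = count_both / n
    coverage = count_lhs / n             # support of the left-hand side
    confidence = count_both / count_lhs  # P(rhs | lhs)
    lift = confidence / (count_rhs / n)  # > 1 means positive association
    return {"support": support, "confidence": confidence,
            "coverage": coverage, "lift": lift, "count": count_both}

# Hypothetical dresses, each described by a rating level and a sales level.
dresses = [
    {"RatingType=Super", "SalesType=MOST"},
    {"RatingType=Super", "SalesType=MOST"},
    {"RatingType=Super", "SalesType=LOW"},
    {"RatingType=Poor", "SalesType=LOW"},
    {"RatingType=Poor", "SalesType=LOW"},
    {"RatingType=Good", "SalesType=MOST"},
    {"RatingType=Good", "SalesType=LOW"},
    {"RatingType=Poor", "SalesType=LOW"},
]
metrics = rule_metrics(dresses, "RatingType=Super", "SalesType=MOST")
print(metrics)
```

The sketch assumes that both sides of the rule occur at least once; otherwise the confidence or lift would involve a division by zero.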

To show the relationship pictorially, the impact of Rating on TotalSales is presented in Fig. (5). Fig. (5a) shows the impact using the numerical data for both Rating and TotalSales. It can be seen that the higher the Rating, the higher the sales. Fig. (5b), on the other hand, shows the impact with both Rating and TotalSales as categorical variables. Here, the MOST and HIGH sales types account for a larger share of TotalSales for the Super (high) and Good (moderate) types of Rating, respectively. Again, this indicates that Rating has a strong positive influence on TotalSales, i.e., a higher Rating leads to higher TotalSales.

Fig. (5). Impact of Rating on TotalSales. (a) Box plot for TotalSales with respect to Rating (x-axis: Rating; y-axis: Total Sales). (b) Stacked bar plot for SalesType with respect to RatingType.


Impact of Material, Price, Season, and Style on Sales

The objective of this task is to find the impact of the features Material, Price, Season, and Style on TotalSales. Essentially, it comes down to feature importance and associativity methods. Feature importance with respect to sales is found by three approaches, namely the Random Forest, Boruta, and Linear Regression models. In the case of Linear Regression, features are given importance according to their p-values. Recall that the null hypothesis states that the response variable is independent of the features, so a low p-value indicates a stronger relationship between the dependent and independent variables. Accordingly, the complement (1 - p-value) is taken so that a higher value indicates higher relevance. For Random Forest, the values are obtained by taking the natural logarithm and subtracting 22. This transformation, informally a standardization, brings the values into a range comparable with those of the other models. For measuring importance through associativity, on the other hand, Lift (which indicates the associativity between two attributes) is used from among the Support, Confidence, Coverage, Count, and Lift metrics. TotalSales is a numeric variable, however, and associativity can only be tested on categorical variables. Hence, TotalSales is converted into a categorical variable with the sales types MOST, HIGH, AVERAGE, and LOW, based on quartile ranges and heuristic observation (see code for details). The MOST and HIGH sales types are the ones of interest for demonstrating associativity with Material, Price, Season, and Style. These sales types are taken as the RHS, and one of the features Material, Price, Season, and Style is taken as the LHS of each rule. Following the above discussion, a comparison of the impact of these features on TotalSales is given in Table 8. It is observed that Style is the most prominent feature for sales, because it ranks consistently high across the models. Price, on the other hand, has good significance except under the Random Forest model. In comparison, however, Style dominates Price in influencing TotalSales.

Table 8. Comparison of the impact of Material, Price, Season, and Style on TotalSales using various models.

Feature    RF        Associativity-Lift   Boruta   LR
Material   0.008     1.117                0.307    0.630
Price      -0.647    1.072                0.810    0.634
Season     -0.167    1.102                0.288    0.253
Style      0.175     1.211                1.021    0.554
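The conversion of TotalSales into the MOST, HIGH, AVERAGE, and LOW sales types can be sketched as quartile binning. The Python example below is only an approximation of the idea: the chapter combines quartile ranges with heuristic observation and does not list its cutoffs, so the function `sales_type`, the pure-quartile rule, and the toy totals are all assumptions.

```python
from statistics import quantiles

def sales_type(total_sales, cutoffs):
    """Label a sales value using three ascending cutoffs (Q1, Q2, Q3)."""
    q1, q2, q3 = cutoffs
    if total_sales <= q1:
        return "LOW"
    if total_sales <= q2:
        return "AVERAGE"
    if total_sales <= q3:
        return "HIGH"
    return "MOST"

# Hypothetical sales totals; the chapter's actual cutoffs are not reproduced here.
totals = [1, 2, 3, 4, 5, 6, 7, 8]
cutoffs = quantiles(totals, n=4)  # three quartile cut points
labels = [sales_type(t, cutoffs) for t in totals]
print(cutoffs, labels)
```

The resulting labels can then serve as the RHS items of the association rules, with a feature level such as a Style or Price type as the LHS.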


Predictive Analysis

In this subsection, models are fitted to achieve the goals given in Section 1.2. In particular, this serves the fifth and sixth objectives. The fifth objective is to build a model for classifying recommendations, whereas the sixth is to forecast sales. The fifth objective, which can be regarded as the automation of recommendations, is discussed first, followed by sales forecasting.

Automation of Recommendations

This is a classification problem: predict Recommendation automatically from the features. Note that Recommendation is given as a binary variable, which is converted into a categorical variable before the models are applied. Four models, namely Logistic Regression, Decision Tree, Random Forest, and SVM, are chosen for classification. These models are tested on three variations of the data: (i) the original data, referred to as Raw; (ii) the data restricted to the selected important features, referred to as Feature; and (iii) the data with dummy variables for each categorical variable, referred to as Dummy. The models are trained on 60% of the data, and the remaining 40% is held out for comparison. The comparison of the models for classifying recommendations is shown in Table 9 in terms of Sensitivity, Specificity, and Accuracy. In terms of Accuracy, Random Forest and SVM perform best with Raw data, whereas Logistic Regression and Decision Tree perform best with Feature data. Overall, Logistic Regression and Random Forest perform better than the other models, but neither dominates across the Raw, Feature, and Dummy data types. Thus, either of these can be used for the automation of recommendations, as per the business logic.

Table 9. Comparison of classification models for predicting recommendation.

Metrics       Logistic Regression   Decision Tree   RF      SVM
Raw
Sensitivity   0.640                 0.631           0.703   0.631
Specificity   0.479                 0.487           0.610   0.551
Accuracy      0.569                 0.563           0.669   0.606
Feature
Sensitivity   0.652                 0.647           0.636   0.614
Specificity   0.563                 0.507           0.509   0.522
Accuracy      0.625                 0.581           0.594   0.588
Dummy
Sensitivity   0.697                 0.596           0.649   0.610
Specificity   0.393                 0.459           0.565   0.483
Accuracy      0.538                 0.544           0.625   0.563
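The three metrics in Table 9 all derive from a confusion matrix. As a minimal Python sketch, the function below computes them from the four counts. The chapter does not report the underlying confusion matrices, so the counts used here are hypothetical: they describe an assumed 160-row test set and were chosen so that the rounded results match the Logistic Regression entries for Raw data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of actual positives predicted positive
    specificity = tn / (tn + fp)   # share of actual negatives predicted negative
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts (not reported in the chapter) for an assumed 160-row test set.
sens, spec, acc = classification_metrics(tp=57, fn=32, tn=34, fp=37)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Reporting sensitivity and specificity alongside accuracy is useful here because a model can reach moderate accuracy while being poor on one of the two classes, as the Dummy column for Logistic Regression suggests.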

Sales Forecast

This task is a time-series analysis, i.e., predicting the sales of dresses over the next three alternate days. To achieve it, the sales dataset is first converted into time-series data. The prediction is then made using the auto.arima() model of the forecast package. However, this model performs univariate forecasting only. Hence, each Dress_ID is forecasted separately in an iterative manner. A snippet of the forecast data (the first 20 Dress_ID values) is given in Table 10. It is observed that there is not much variation among the alternate days. Moreover, the forecasts behave roughly like a moving average, although this could not be verified formally.

Table 10. Sales forecasting of dresses for three alternate days.

Dress_ID      14/10/2013   16/10/2013   18/10/2013
1006032852    4110         4173         4235
1212192089    4464         4652         4839
1190380701    11           11           11
966005983     1967         1971         1975
876339541     2815         2894         2973
1068332458    27           27           28
1220707172    575          598          621
1219677488    274          286          298
1113094204    34           35           36
985292672     14           14           14
1117293701    143          149          154
898481530     210          218          226
957723897     3058         3137         3216
749031896     4246         4322         4398
1055411544    53           53           53
1162628131    229          239          249
624314841     2568         2611         2654
830467746     19           19           19
840857118     17           18           19
1113221101    667          691          715
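The per-Dress_ID iteration described above can be sketched in Python as follows. This is not auto.arima(): a random-walk-with-drift forecast is swapped in as a simple stand-in so that the example stays self-contained, and the function `drift_forecast`, the dress identifiers, and the sales series are all hypothetical. Only the structure (one univariate forecast per dress, three steps ahead) follows the text.

```python
def drift_forecast(history, steps):
    """Random-walk-with-drift forecast: extend the series by its average step."""
    if len(history) < 2:
        return [float(history[-1])] * steps  # no trend can be estimated
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + drift * h for h in range(1, steps + 1)]

# Hypothetical sales per dress on alternate days (not the chapter's data).
sales_by_dress = {
    "dress_A": [100, 120, 140, 160],
    "dress_B": [11, 11, 11, 11],
}

# Univariate models handle one series at a time, so iterate over the IDs.
forecasts = {dress_id: drift_forecast(series, steps=3)
             for dress_id, series in sales_by_dress.items()}
print(forecasts)
```

A drift forecast produces evenly spaced values, a pattern broadly similar to the near-constant increments visible in Table 10, though the chapter does not state which ARIMA orders were selected.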

The fit of the model is reported in terms of the Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) metrics, collected over all Dress_ID values, as shown in Table 11. The RMSE values are not satisfactory overall, although the median is arguably acceptable. The MAPE values, however, are good, although a few outliers with higher percentages exist.

Table 11. Fitting accuracy of forecasting model - auto.arima.

Metrics    RMSE      MAPE
Min.       0.00      0.00
1st Qu.    5.61      3.29
Median     62.29     5.33
Mean       119.99    5.32
3rd Qu.    148.32    6.64
Max.       1956.56   23.64
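For reference, the two error metrics summarised in Table 11 can be computed as in the Python sketch below. The function names and the sample series are hypothetical, and skipping zero actual values in MAPE is an assumption made to avoid division by zero; the chapter does not say how such cases were handled.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; zero actuals are skipped (assumption)."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(terms) / len(terms)

# Hypothetical actual and fitted sales for one dress.
actual = [100.0, 200.0, 400.0]
fitted = [110.0, 190.0, 400.0]
error_rmse = rmse(actual, fitted)
error_mape = mape(actual, fitted)
print(error_rmse, error_mape)
```

RMSE is expressed in sales units, so it grows with the sales volume of a dress, whereas MAPE is relative. This is one plausible reason why the RMSE values in Table 11 spread so widely while the MAPE values stay within a narrow band.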

So, overall, this approach is acceptable in terms of predicting with reasonable accuracy.

CONCLUSION

A dataset containing retail information on a set of dresses is investigated in this article to find the attributes of dresses that lead to higher sales. The dataset needed preprocessing, since it contained many null values, ambiguities, and typographical errors, which may be due to misinterpretation of the data and/or manual entry. Furthermore, the analysis is made more difficult by the presence of categorical (text and label) features in addition to numerical ones. Thus, market basket analysis is applied to understand the relationships among the features, as correlation cannot be used for categorical data. Based on the experiments and results, it is evident that data science is a powerful tool for gaining insights, such as extracting information from, interpreting, assessing, and predicting data.


Analysis of the features of the retail data on dresses shows that Rating, Style, Pattern.Type, and Recommendation play a major role in increasing sales. Furthermore, sales are forecasted for a few more days, given approximately two months of sales data. This objective turned out to be a time-series analysis conducted per Dress_ID. The forecasting was able to predict future sales, but it is not very informative, since it behaves roughly like a moving average. The evaluation used several models, of which Linear Regression, Logistic Regression, and Random Forest proved useful for the classification and/or prediction of this dataset. The retail dataset contains both numeric and categorical features, which makes it difficult to analyze. Hence, apart from the models themselves, various techniques such as dummy variables, scaling, feature selection, and standardization need to be applied. The impact of scaling was not significant, as per the observations. Similarly, using dummy variables has little effect on classification for this dataset. Feature selection, however, has a moderate impact on classification and prediction, although it has not changed the scores much. Standardization is used to compare the results, since different models give different scores for the same feature. This helped in selecting good features by comparing the standardized results, although it is a relatively less significant indicator. Above all, the performance of the models must also be compared in terms of metrics such as errors, some of which are discussed in this article. However, more metrics of adequate relevance need to be demonstrated in order to establish the efficiency of the models. Furthermore, multi-attribute results need to be converted into an appropriate score (a single scalar value) for comparison; e.g., the geometric mean is used in this report (Table 5) to obtain the overall score. This is a non-standard technique, so a set of appropriate methods or frameworks is still required to demonstrate its validity.

ACKNOWLEDGEMENT

The author would like to thank Mr. Kapil Muley, who taught us Data Science with R at Simplilearn, which formed the basis for this paper.

REFERENCES

[1]

W.D. Dahana, Y. Miwa, and M. Morisada, "Linking lifestyle to customer lifetime value: An exploratory study in an online fashion retail market", Journal of Business Research, vol. 99, no. 1, pp. 319-331, 2019. [http://dx.doi.org/10.1016/j.jbusres.2019.02.049]

[2]

A. Khakpour, Data Science for Decision Support: Using Machine Learning and Big data in Sales Forecasting for Production and Retail, 2020.

[3]

J. Huber, and H. Stuckenschmidt, "Daily retail demand forecasting using machine learning with emphasis on calendric special days", Int. J. Forecast., vol. 36, no. 4, pp. 1420-1438, 2020. [http://dx.doi.org/10.1016/j.ijforecast.2020.02.005]

[4]

M.G. Dekimpe, "Retailing and retailing research in the age of big data analytics", International Journal of Research in Marketing, vol. 37, no. 1, pp. 3-14, 2020. [http://dx.doi.org/10.1016/j.ijresmar.2019.09.001]

[5]

S.H.N. Lee, and P.S. Chow, "Investigating consumer attitudes and intentions toward online fashion renting retailing", J. Retailing Consum. Serv., vol. 52, 101892, 2020. [http://dx.doi.org/10.1016/j.jretconser.2019.101892]

[6]

R.A.E.D. Ahmed, M.E. Shehab, S. Morsy, and N. Mekawie, "Performance study of classification algorithms for consumer online shopping attitudes and behavior using data mining",

[7]

N. Kalaiselvi, K.R. Aravind, S. Balaguru, and V. Vijayaragul, "Retail price analytics using backpropogation neural network and sentimental analysis", International Conference on Signal Processing, Communication and Networking, ICSCN 2017, 2017, pp. 16-21. [http://dx.doi.org/10.1109/ICSCN.2017.8085696]

[8]

R. Ravnik, F. Solina, and V. Zabkar, Modelling In-Store Consumer Behaviour Using Machine Learning and Digital Signage for Audience Measurement Data, 2014. http://link.springer.com/10.1007/978-3-319-12811-5

[9]

H. Hwangbo, Y.S. Kim, and K.J. Cha, "Use of the Smart Store for Persuasive Marketing and Immersive Customer Experiences: A Case Study of Korean Apparel Enterprise", Mobile Information Systems, vol. 2017

[10]

D. Suleman, I. Zuniarti, R. Marginingsih, I.H. Susilowati, I. Sari, S. sabil, and E. Nurhayaty, "The effect of decision to purchase on shop fashion product in Indonesia mediated by attitude to shop", Management Science Letters, vol. 11, pp. 111-116, 2021. [http://dx.doi.org/10.5267/j.msl.2020.8.024]

[11]

N.V. Razmochaeva, and D.M. Klionskiy, "Data presentation and application of machine learning methods for automating retail sales management processes", Proceedings of the 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering, ElConRus 2019, 2019, pp. 1444-1448. [http://dx.doi.org/10.1109/EIConRus.2019.8657077]

[12]

C. Soguero-Ruiz, F.J. Gimeno-Blanes, and I. Mora-Jimenez, "On the differential benchmarking of promotional efficiency with machine learning modeling Principles and statistical comparison", Expert Systems with Applications, vol. 39, no. 17, pp. 12772-12783, 2012. [http://dx.doi.org/10.1016/j.eswa.2012.04.017]

[13]

N. Tokatli, "Global sourcing: insights from the global clothing industry the case of Zara, a fast fashion retailer", J. Econ. Geogr., vol. 8, no. 1, pp. 21-38, 2007. [http://dx.doi.org/10.1093/jeg/lbm035]

[14]

V. Shrivastava, and P. Narayan Arya, A Study of Various Clustering Algorithms on Retail Sales Data, 2012. http://warse.org/pdfs/ijccn04122012.pdf

[15]

E.T. Bradlow, M. Gangwar, P. Kopalle, and S. Voleti, "The Role of Big Data and Predictive Analytics in Retailing", Journal of Retailing, vol. 93, no. 1, pp. 79-95, 2017. [http://dx.doi.org/10.1016/j.jretai.2016.12.004]

[16]

C. Lorenzo-Romero, M-E. Andrés-Martínez, M. Cordente-Rodríguez, and M.Á. Gómez-Borja, "Active Participation of E-Consumer: A Qualitative Analysis From Fashion Retailer Perspective", SAGE Open, vol. 11, no. 1, 2021. [http://dx.doi.org/10.1177/2158244020979169]

[17]

X. Wen, T.M. Choi, and S.H. Chung, "Fashion retail supply chain management: A review of operational models", Int. J. Prod. Econ., vol. 207, pp. 34-55, 2019. [http://dx.doi.org/10.1016/j.ijpe.2018.10.012]

[18]

K.J. Ferreira, B.H.A. Lee, and D. Simchi-Levi, "Analytics for an Online Retailer: Demand Forecasting and Price Optimization", Manuf. Serv. Oper. Manag., vol. 18, no. 1, pp. 69-88, 2016. [http://dx.doi.org/10.1287/msom.2015.0561]

[19]

A. Krishna, V. Akhilesh, A. Aich, and C. Hegde, "Sales-forecasting of Retail Stores using Machine Learning Techniques", Proceedings 2018 3rd International Conference on Computational Systems and Information Technology for Sustainable Solutions, CSITSS 2018, 2018, pp. 160-166. [http://dx.doi.org/10.1109/CSITSS.2018.8768765]

[20]

P. Bajari, D. Nekipelov, S.P. Ryan, and M. Yang, "Machine learning methods for demand estimation", Am. Econ. Rev., vol. 105, no. 5, pp. 481-485, 2015. [http://dx.doi.org/10.1257/aer.p20151021]

[21]

Y. Kaneko, and K. Yada, "A Deep Learning Approach for the Prediction of Retail Store Sales", IEEE International Conference on Data Mining Workshops, ICDMW, vol. 16, 2016, pp. 531-537. [http://dx.doi.org/10.1109/ICDMW.2016.0082]

262

Artificial Intelligence and Data Science, 2023, 262-271

CHAPTER 13

Data Analytics in Human Resource Recruitment and Selection

Sumi Kizhakke Valiyatra1,*

1 Institute of Management in Kerala, University of Kerala, Kerala 695034, India

Abstract: Human resource data analytics is more important now than ever before. An increasing number of businesses are delving ever deeper into the data they collect about their employees, their success, and their well-being. Recruitment analytics can assist in making smarter, data-driven selection, recruiting, and sourcing decisions. This technology can scan hundreds of resumes at once to find the best possible fit for a particular job opening. Using modern recruiting software, an organisation can send automated emails with an interview appointment that automatically syncs with the work calendar. The organisation may also use automated disqualification to screen application forms and exclude candidates who are not qualified. Effective recruitment is a mix of science and art. It necessitates the implementation of repeatable processes that produce consistent results.

Keywords: Recruiting analytics, Time to fill, Artificial Intelligence screening, Optimum Productivity Level, Selection Ratio, Attrition rate, Employee Life Cycle.

* Corresponding author Sumi Kizhakke Valiyatra: Institute of Management in Kerala, University of Kerala, Kerala 695034, India; E-mail: [email protected]

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

INTRODUCTION

Data analytics has gained much traction in this age of technological transformation [1]. Data is obtained in its raw form, analysed according to a company's requirements, and then used for decision-making purposes. This procedure aids companies in growing and expanding their operations [2 - 5]. Data analysis is a constantly changing discipline that places a lot of emphasis on new predictive modelling techniques. The methods and techniques used to analyse data in order to improve efficiency and profitability are known as data analytics [6 - 9]. To analyse different behavioural patterns, data is collected from various sources, cleaned, and classified. Depending on the organisation or entity, different methods and techniques are used. As a result of technological advancements, data analytics has a far-reaching influence on various fields of human development. Employee recruitment is a critical activity of every organisation. Any flaw in the selection process results in exorbitant costs [10, 11]. ‘Data analytics in human resources is more important than ever before to an enterprise [2].' Although human resources managers continue to place a premium on people skills, a growing number of companies and non-profits are crunching numbers to evaluate everything from talent development and retention to efficiency and job structure [12 - 18]. The shift to analytics is both cost-effective and time-saving. The amount of data an organisation can obtain to increase efficiency and output appears to be limitless. An increasing number of businesses are delving ever deeper into the data they collect about their employees, their success, and their well-being. Both companies and organisations prioritise retention and recruiting, which are also heavily targeted by analytics. Analysis of large amounts of data helps us understand the huge volume of information our world generates [20].

RECRUITMENT ANALYTICS

In today's dynamic job market, recruitment analytics has a significant effect. Recruitment analytics can assist in making smarter, data-driven selection, recruiting, and sourcing decisions. A case in point is a newly hired employee who leaves within the first three months, which may be due to a mismatch between the job description and the actual position or to a bad onboarding process. Recruiting analytics can answer the following questions. How much does it cost to employ someone for a job? Which sourcing method yields the most qualified candidates? What is the recruiting success rate? It is important to be able to address these questions in order to enhance recruitment decision-making. Applicant tracking systems, satisfaction surveys, brand data, Customer Relationship Management systems, Human Resources Information System data, and data from job advertising platforms are all popular data sources for recruitment analytics.

Procedure for Recruitment Analytics

Before you source, employ, and hire highly qualified talent, create a candidate persona to outline the talents, strengths, experience, and tendencies of the ideal candidate. A well-defined and accurate candidate persona will aid the talent team in tailoring its plans and approach to the talent that the organisation is trying to recruit. This is particularly significant for high-level recruitment. Using existing data from the Applicant Tracking System, the customer relationship management system, and databases on prospective applicants, a candidate persona must answer several main questions. Fig. (1) shows an example of a perfect candidate persona.

Fig. (1). Flow of a perfect persona.

OPERATIONAL REPORTING

Recruitment data is descriptive in nature. It consists of the well-known fundamental hiring metrics. Examples include the cost of recruiting, the source of recruitment, the number of prospective candidates per job opening, the quality ratio, the time to fill, the time to hire, and hiring manager satisfaction.

Recruiting Metrics

Recruiting metrics are measures used to track recruiting performance and success rates and to optimise the process of hiring candidates within an organisation. When used correctly, these metrics aid in evaluating the recruitment process and in determining whether the organisation is hiring the right people. Fig. (2) displays the different recruiting metrics used to track performance.

Fig. (2). Recruiting metrics to track performance.

Time to Fill

Time to fill is the total amount of time required to fill a specific job role. This recruiting metric aids planning and acts as an alert when an organisation's recruitment process takes too long. In basic terms, it is the time taken to find and recruit a new employee, usually calculated as the number of days between posting a job opening and hiring a new employee. It is a useful metric for business planning because it gives the manager a reasonable idea of how long it will take to replace a departing employee.

Quality of Hire

The value a new recruit brings to a company is determined by how much they contribute to the organisation's long-term success in terms of


job performance and tenure. A candidate's first-year success is often used to gauge quality of hire, which is typically measured by the employee's performance rating. Candidates with high performance scores are a good sign.

\[ \text{Success Ratio} = \frac{\text{Number of Hired Candidates Who Perform Satisfactorily}}{\text{Total Number of Candidates Hired}} \tag{1} \]
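As a minimal sketch of the success ratio in Eq. (1), the calculation can be driven by first-year performance ratings. The 1-5 rating scale, the `satisfactory_threshold` cut-off, the function name, and the sample ratings are illustrative assumptions rather than part of the chapter.

```python
def success_ratio(first_year_ratings, satisfactory_threshold=3.0):
    """Share of hired candidates whose first-year rating is satisfactory.

    Assumption: a hire counts as satisfactory when the rating is at or
    above `satisfactory_threshold` on a hypothetical 1-5 scale.
    """
    if not first_year_ratings:
        raise ValueError("at least one hired candidate is required")
    satisfactory = sum(
        1 for rating in first_year_ratings if rating >= satisfactory_threshold
    )
    return satisfactory / len(first_year_ratings)


# Hypothetical first-year ratings for eight hires.
ratings = [4.5, 2.0, 3.5, 3.0, 1.5, 4.0, 3.8, 2.9]
print(f"Success ratio: {success_ratio(ratings):.3f}")
```

Five of the eight hypothetical hires meet the threshold, so the ratio is 5/8. In practice the threshold would have to follow the organisation's own definition of a satisfactory hire.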

Bad hires are identified by low first-year performance scores. A single bad hire can cost a large sum of money, both directly and indirectly. As a minimum benchmark for a quality hire, the value of the contributions an individual makes while working at the organisation must be greater than the cost of hiring them. Quality of hire can be a useful measure for improving recruiting practice and, ultimately, hiring better employees. Technology helps improve the quality of hire in the following ways.

Artificial Intelligence in Screening

By applying artificial intelligence (AI) to an existing resume database, intelligent screening software automates the screening process. This technology can scan hundreds of resumes to find the best possible fit for a job opening. The software learns which expertise, qualifications, and other attributes of current employees are needed to fill a position and applies this information to the hiring process.

Artificial Intelligence in Online Assessments

For decades, personality tests, EQ, IQ, and other psychometric measures have been used in recruiting. Online tests enable an organisation to evaluate several candidates at once, and most platforms on the market today offer automated reports and ranking systems, reducing the time and cost of evaluating and interviewing a candidate.

Artificial Intelligence in Job Interviews

Many video interviewing tools now claim to use artificial intelligence to evaluate candidates' word choices, speech patterns, and facial expressions in order to assess personality and motivation and decide whether they are fit for the job.

Time to Hire

Time to hire is the time between the moment an employer first approaches a job applicant and the moment the applicant accepts the job offer. While the


concept of time to hire is simple, the metric says a great deal about the productivity of the HR recruitment team. Fast recruitment allows an organisation to secure better applicants more often, which helps to avoid costly mistakes.

Cost Per Hire

It costs money to bring in money, and finding the right candidate is not cheap. Every business should estimate, compute, and track its cost per hire. Note, however, that cost per hire alone should not be used to judge whether a company's recruiting capabilities are successful. There are many ways of lowering the cost per hire:

a. Social media
b. Talent pools
c. Employee referral programs

\[ \text{Cost per Hire} = \frac{\sum \text{Cost of Recruitment}}{\sum \text{Number of Hires}} \tag{2} \]
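The cost-per-hire formula in Eq. (2) can be sketched as a sum over cost categories divided by the number of hires in the same period. The category names and amounts below are hypothetical; the chapter does not specify which costs an organisation should include.

```python
def cost_per_hire(recruitment_costs, number_of_hires):
    """Total recruitment cost divided by the number of hires, as in Eq. (2)."""
    if number_of_hires <= 0:
        raise ValueError("number_of_hires must be positive")
    return sum(recruitment_costs.values()) / number_of_hires


# Hypothetical cost categories for one hiring period (any currency).
costs = {
    "job_advertising": 12000.0,
    "agency_fees": 20000.0,
    "recruiter_salaries": 45000.0,
    "referral_bonuses": 3000.0,
}
print(cost_per_hire(costs, number_of_hires=16))
```

Keeping the costs in a dictionary makes it easy to see which category dominates before looking for savings through social media, talent pools, or referral programs.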

First-year Attrition

This metric indicates how successful the hiring process has been. Employees who resign within their first year of employment end up costing the company a lot of money.

Success Ratio

Recruiting utility analysis uses the success ratio as an input. This analysis allows the company to measure the return on investment (ROI) of various selection instruments.

Employee Selection

Selection entails more than just choosing the most qualified candidate. It entails identifying the appropriate set of knowledge, skills, and abilities (KSAs) in prospective applicants who are the best match for the organisation. Fig. (3) presents the procedure for identifying knowledge, skills, and abilities in the selection of applicants. The aim of a selection process is to identify the most


qualified candidate who meets the organisation's job requirements. To achieve this aim, the organisation collects and evaluates information about applicants, such as age, credentials, abilities, and experience. Job requirements are matched against candidate profiles. After unsuitable candidates are eliminated through successive stages of the selection process, the most appropriate candidate is chosen. The quantity and quality of an employee's work are directly influenced by how well they are suited to the job. Morale and efficiency improve only if the employee is happy with and accepts the workload. Any misalignment in this regard can cost a company a lot of money, time, and effort, particularly in training and operating costs. A dissatisfied employee is a threat to an organisation and harms the company's goodwill.

Selection Metrics

Fig. (3). Identification of knowledge, skills, and abilities in the selection procedure.

Selection Ratio

The selection ratio is the number of hired candidates divided by the total number of candidates.

\[ \text{Selection Ratio} = \frac{\text{Number of Hired Candidates}}{\text{Total Number of Candidates}} \tag{3} \]
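A short sketch of the selection ratio follows, evaluated for a fixed number of hires and a growing candidate pool. The function name and the pool sizes are illustrative assumptions.

```python
def selection_ratio(hired, candidates):
    """Number of hired candidates divided by the total number of candidates."""
    if candidates <= 0:
        raise ValueError("candidates must be positive")
    if not 0 <= hired <= candidates:
        raise ValueError("hired must be between 0 and candidates")
    return hired / candidates


# Hypothetical example: five hires drawn from pools of increasing size.
for pool_size in (10, 100, 1000):
    print(pool_size, selection_ratio(5, pool_size))
```

With the number of hires held constant, the ratio shrinks as the pool grows, which is why a low ratio signals a highly selective process.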


When the number of candidates is very large relative to the number of hires, the ratio approaches zero. The selection ratio can be used to determine how effective a particular selection and recruitment process is by providing details such as the value of various evaluation and recruitment methods.

Optimum Productivity Level (OPL)

The Optimum Productivity Level is the point at which a new hire is fully up to speed. The overall cost of bringing someone up to speed is known as the cost of getting to Optimum Productivity Level (OPL). It includes onboarding, induction, training, the cost of managers and co-workers involved in on-the-job training, employee salaries, and other fringe benefits.

Time to Productivity

Time to productivity, also known as time to Optimum Productivity Level, measures how long it takes to get people up to speed and make them productive. It is the period between the employee's first day on the job and the point at which he or she is contributing fully to the business.
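The time-based metrics in this section (time to fill, time to hire, and time to productivity) all reduce to the number of days between two milestones. The sketch below assumes hypothetical milestone dates and treats the offer-acceptance date as the end point of time to fill, which is an assumption made for illustration.

```python
from datetime import date


def days_between(start, end):
    """Number of calendar days from `start` to `end`."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return (end - start).days


# Hypothetical milestones for a single vacancy.
job_posted = date(2023, 1, 9)
first_contact = date(2023, 1, 23)
offer_accepted = date(2023, 2, 20)
first_day = date(2023, 3, 6)
fully_productive = date(2023, 6, 5)

time_to_fill = days_between(job_posted, offer_accepted)
time_to_hire = days_between(first_contact, offer_accepted)
time_to_productivity = days_between(first_day, fully_productive)

print(time_to_fill, time_to_hire, time_to_productivity)
```

Averaging these values over all vacancies in a period gives the figures that would typically be reported to management.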

Fig. (4). Diminishing productivity level.


The chart in Fig. (4) depicts the productivity level of employees. As the years pass, productivity levels decline. The main reason is the advancement of technology and the obsolescence of existing technology. As a result, the company should support refresher programmes to upgrade the knowledge and skills of employees.

CONCLUSION

Modern recruiting software tracks and measures time to hire automatically. As a result, an organisation can learn how long it takes to recruit for a specific job opening and make adjustments as needed. Email automation can save businesses a lot of time: with modern recruitment software, an organisation can automatically send emails containing interview appointments. The organisation may also use automated disqualification to screen application forms and exclude candidates who are not qualified, so that unsuitable applicants are removed easily and effectively at the outset of the selection process. Effective recruitment is a mix of science and art: beyond the metrics, it takes sophistication to think outside the box in order to find the perfect candidate.

ACKNOWLEDGEMENT

I would like to express my sincere gratitude to all those who helped along the way in the completion of this chapter. Every author owes a debt of gratitude to those who were a constant source of inspiration for the completion of the task. Each contributed wholeheartedly, and their cooperation could not have been greater.

272

Artificial Intelligence and Data Science, 2023, 272-295

CHAPTER 14

A Personalized Artificial Neural Network for Rice Crop Yield Prediction

Pundru Chandra Shaker Reddy1,*, Alladi Sureshbabu2, Yadala Sucharitha2 and Goddumarri Surya Narayana3

1 Department of CSE, CMR College of Engineering & Technology, Hyderabad, TS, India
2 Department of CSE, JNTUA College of Engineering, Ananthapur, Andhra Pradesh, India
3 Department of CSE, Vardhaman College of Engineering, Hyderabad, TS, India

Abstract: Early and accurate crop yield estimates at the local and national levels are essential for industry and trade planning and for mitigating price speculation. A major challenge for farmers is selecting an appropriate crop to plant. Crop selection depends on several factors, such as climate, soil type, and market conditions, and crop yield depends mainly on weather conditions and soil type. Yield prediction is therefore essential for farmers, as it contributes significantly to selecting the right crop for sowing. A framework is needed to recommend to farmers which crops to grow. Making the right farming decisions while balancing future cost and yield is both essential and challenging. This chapter proposes an Artificial Neural Network (ANN) model for rice crop yield prediction that uses weather parameters such as rainfall, temperature, sunshine hours, and evapotranspiration. A default ANN generally has only one hidden layer. In this work, a Personalized Artificial Neural Network (P-ANN) is designed by varying the number of hidden layers, the number of neurons, and the learning rate. The accuracy of the P-ANN model is measured using R-square (R2) and Percentage Forecast Error (PFE). The results show that the P-ANN model achieves a higher R2 and a smaller PFE than existing methods. The seasonal (Kharif and Rabi) weather data and rice yield data of Guntur district, Andhra Pradesh, India, from 1997 to 2014 were used for this research. The P-ANN approach produced better paddy yield forecasts.

Keywords: Rice yield, Agriculture, Prediction, Crop, P-ANN.

INTRODUCTION

Rice accounts for 80% of national food supplies, making crop yield prediction necessary to guide the worldwide commodity market. Basic to global

Corresponding author Pundru Chandra Shaker Reddy: Department of CSE, CMR College of Engineering & Technology, Hyderabad, TS, India; E-mail: [email protected] Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

A Personalized Artificial

Artificial Intelligence and Data Science 273

food security, the agriculture industry readily embraces innovations that make work processes more effective and raise overall harvest yield [1]. Precision Agriculture (PA) refers to the application of innovative techniques in agriculture. If farmers can forecast crop yield and identify the regions of a field where the yield is lower than expected, they can in principle act to counter the decline in yield at those locations. Yield forecasting is one of the most demanding problems in agriculture. Farmers' lack of awareness about yield surplus, the uncertainty of climate conditions, seasonal precipitation, depletion of soil nutrients, fertilizer availability, price, pest control, post-harvest damage, and other factors lead to a decline in crop production [2].

Rice is the fundamental food crop and, being a tropical plant, grows easily in a hot and humid climate. It is primarily grown in rain-fed regions that receive heavy annual precipitation. Farming is one of the fundamental parts of the national economy [3]. Agriculture, especially rice cultivation, has faced various problems for decades; for example, extreme weather conditions can cause crop failure. Even after many improvements in agriculture, water management, pesticides, fertilization, and hybrid crop development, the returns and long-term benefits of agriculture-related sectors remain unpredictable [4]. For a seasonal crop like rice, climate plays a significant role in determining yield. The basic requirements for rice are adequate and well-distributed rainfall together with suitable soils and temperatures. In addition, rice production involves various sectors, such as farming, transportation, milling, and marketing.

In such a complex situation, early information on seasonal supplies can support selling policies and industry competitiveness, besides providing vital information for planning milling activities and rice shipments [5]. Finally, accurate predictions are made accessible to the public. Early warning is very helpful in avoiding instability in food prices and sudden falls in production due to the adverse conditions that frequently affect staple food items. These considerations prompted the development of a timely and accurate crop yield prediction framework. Yield forecasting is a complex and challenging farming task [6]. Every farmer is keen to know how much yield to expect. Previously, yield prediction relied on the farmer's past experience with a particular crop. Various data analytic (machine learning) techniques are now used to explore massive datasets and discover useful classifications and patterns. The effect of observed seasonal climate conditions, for example, precipitation and temperature

274 Artificial Intelligence and Data Science

Reddy et al.

variability, on expected crop yield has motivated efforts to build an accurate crop management framework [7]. Fig. (1) shows the contribution of farming to national income and its share in exports over 60 years (1950-2010); the share of agriculture in national income has been declining steadily. Farming contributed just around 33% of national income, against 54% in 1950-51. Similarly, the share of agricultural goods in exports fell from 52.5% in 1950-51 to just 14% in 2010-11 [8]. Crop production should be increased to improve the agricultural contribution to the country's income. A possible reason for the poor contribution of the farming sector to the GDP of Andhra Pradesh (AP) is the absence of good crop management by farmers as well as by the government.

Fig. (1). Agricultural sector contribution to GDP (Courtesy: Wikimedia Commons).

Since the conventional approach to cultivation is still followed, a glut or shortage of crops arises whenever a specific requirement is not met. Farmers are unaware of trading conditions in the current farming economy, which results in losses for farmers in the cultivation sector [9]. The reported causes of farmer suicides, in order of significance, were climate conditions, low production costs, poor water management, and escalation in the price of farming. Various application areas have been developed with enhanced constraints and models in


the present scenario. Information technology (IT) has become a part of daily life and is expanding into agriculture. Crop yield depends on the geology of an area, climate change, soil variations, soil composition, and the cultivation process [10]. Machine learning (ML) methods are an emerging technology for information extraction; they relate input and output parameters whose relationships are difficult to capture statistically. Conventional methods deal only with structured data, whereas ML can also handle unstructured data. ML techniques are well suited to the complex, non-linear nature of crop yield forecasting, and predictive analytics is used to obtain precise results. This chapter aims to build a scientific, advanced, and adequate framework for forecasting rice yield in various districts of AP state, taking into account growth that is influenced by weather conditions and other factors [11]. With this model, farmers can decide which crops to grow on their land, and in which season, to increase their returns. Using the model, the people concerned can effectively predict the balance between income and large-scale agricultural effects. This chapter presents the development of a crop yield prediction model for rice based on Artificial Neural Networks (ANN). The study assesses the impact of environmental changes on the yields of the main food crops and investigates adaptation measures that are perceived and adopted by smallholder farmers in India [12].

Traditional Crop Yield Forecasting Methods

Agricultural surveys and crop exploration are the primary methods for yield forecasting. However, these procedures are subjective and lack consistency. Since the mid-1990s, frameworks based on data retrieved from agro-climatic indicators, remote sensing, and crop simulation techniques have been used [13]. These approaches also have some persistent limitations that undermine the precision of model-based yield estimation frameworks:

i. To a great extent, the uncertainty in input data at the local scale (i.e., farming methods, soil properties, climate data, crop varieties, crop distribution) depends on aggregation assumptions.
ii. The complexity of simulating each of the variables that fundamentally affect crop yields, such as nutrient availability, competing weeds, diseases, pests, and severe climate conditions, is very high.
iii. The uncertainty in parameterizations.

To reduce the effect of these factors on the uncertainty of the estimates, a timely and precise crop yield prediction model is required to help agriculture-related stakeholders plan their activities.


The motivation for the proposed work is to design and customize a forecast model that can predict crop yields from the climate parameters on which crop yield depends. In this chapter, we propose a Personalized Artificial Neural Network (P-ANN) model for predicting the yield of rice crops. In this model, we vary the number of hidden layers, the number of neurons, and the learning rate to obtain optimal results. The climate dataset and rice yield data were collected from the Indian Meteorological Department and the Agricultural Department, respectively, for Nellore District, AP state, India, from 1997 to 2014. The dataset was preprocessed by removing outliers and redundant, inconsistent, and missing values. We ran experiments on the rice dataset using the P-ANN model with four input attributes for rice yield prediction: rainfall, sunshine, temperature, and evapotranspiration. The dataset consists of 36 records covering two seasons (Kharif and Rabi) per year from 1997 to 2014. Based on the experimental results, we found that the P-ANN model outperforms the existing models, with a higher R2 value.

Artificial Neural Networks

Soft computing uses approximation methods to give inexact yet usable answers to complex problems. Soft computing frameworks are flexible enough to adapt to the changes they encounter. They are highly tolerant of uncertain data and able to respond reasonably to events [14]. The adaptability, robustness, non-linearity, and fast response of soft computing make it an attractive technology for crop yield prediction. Soft computing has three components: Genetic Algorithms (GA), ANN, and Fuzzy Logic (FL). The ANN, a part of AI, has a long history of interaction with robotics. AI refers to the use of computers to exhibit intelligent behavior with minimal human intervention [15]. ANNs are biologically inspired data processing techniques modelled on the human brain. They offer a way of programming a computer to learn from observational data. An ANN can be trained to become proficient at a task using observational data, gaining knowledge through a learning process. The strengths of the inter-neuron connections store the information [16, 17]. Learning rules enable a network to acquire knowledge from the available data and apply that knowledge to train the network. A trained network can be considered an expert in the class of data it has been given to analyse. ANNs are now used as an ML-based technique for solving various problems, including language translation, image processing, image generation, and yield prediction [18]. ANN models base their forecasts on intelligently examining patterns in large existing datasets. ANN gives users a model that


can create the input-output mapping for any dataset. Using an ANN, complicated pattern recognition can be attempted without making any initial hypothesis [19]. Several significant features of ANNs attract many researchers:

• ANNs can detect complex non-linear relationships between the predictor and target variables.
• ANNs are data-driven. They are self-adaptive methods that learn from examples and do not require restrictive assumptions about the form of the model.
• ANNs can detect relationships in data that might be too complex to define explicitly.
• ANNs are highly parallel structures capable of learning and generalizing from examples and experience to produce effective solutions even when the input data contain errors or are incomplete. This makes the ANN a powerful tool for problems such as prediction. Because of its parallel processing ability, a neural network is also efficient at training on very large data samples.
• ANNs can also predict patterns that were not presented during training.

The ANN processes data, represents knowledge, and exhibits intelligence such as pattern recognition and learning. It uses the processing of the brain as a basis for designing algorithms that can model complex patterns and prediction problems [20]. Fig. (2) shows the ANN structure, which consists of three layers.
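To make the layered structure concrete, the following minimal sketch propagates four scaled weather inputs through a feed-forward network whose number of hidden layers and neurons is configurable, in the spirit of the P-ANN idea of varying the architecture. It is not the authors' implementation: the sigmoid hidden activation, the linear output, the random initial weights, the layer sizes, and all function names are assumptions, and no training step is shown.

```python
import math
import random


def init_network(layer_sizes, seed=0):
    """Create random weights and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)
        ]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def sigmoid(z):
    """Logistic activation used for the hidden layers (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(network, inputs):
    """Propagate inputs through sigmoid hidden layers to a linear output layer."""
    activations = list(inputs)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        sums = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        activations = sums if index == last else [sigmoid(s) for s in sums]
    return activations


# Four scaled inputs (rainfall, temperature, sunshine hours,
# evapotranspiration), two hidden layers, and one output (yield).
net = init_network([4, 6, 3, 1])
prediction = forward(net, [0.42, 0.77, 0.63, 0.35])
print(prediction)
```

Changing the list passed to `init_network`, for example to `[4, 8, 1]`, changes the number of hidden layers and neurons without touching the forward pass.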

Fig. (2). Structure of ANN (inputs, hidden layer, output).


Input layer: It has input neurons that accept the inputs and send them to the hidden layer.
Hidden layer: It takes the data from the input layer, processes it, and sends it to the output layer.
Output layer: It accepts the processed data from the hidden layer and produces the output.

Several hidden layers may be used to process complex data and produce the desired output. Weights are used to transform the data during computation. The structure of a neuron is shown in Fig. (3). A neuron's function is defined as f(∑ wi xi), where x1, x2, …, xn are the inputs and w1, w2, …, wn are the weights, whose combination forms the output function of the network. It can be represented mathematically as in Eq. (1),

Fig. (3). Structure of a neuron.

\[ y = \theta\left( \sum_{i=1}^{n} w_i x_i - \mu \right) \tag{1} \]
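Reading Eq. (1) as a weighted sum of the inputs, reduced by μ and passed through a unit step function, a single neuron can be sketched as follows. Treating μ as a firing threshold, the convention that the step function returns 1 at exactly zero, and the example weights are assumptions made for illustration.

```python
def unit_step(z):
    """Unit step function: 1 when z >= 0, otherwise 0 (assumed convention)."""
    return 1 if z >= 0 else 0


def neuron_output(inputs, weights, threshold):
    """Compute y = step(sum_i(w_i * x_i) - threshold) for one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return unit_step(weighted_sum - threshold)


# Hypothetical weights and threshold that make the neuron act as a logical AND.
weights = [1.0, 1.0]
threshold = 1.5
for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), neuron_output([x1, x2], weights, threshold))
```

Only the input pair (1, 1) pushes the weighted sum past the threshold, so the neuron fires for that pair alone.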

Here, θ(·) is a unit step function and wi is the weight associated with the ith input.

LITERATURE REVIEW

The worldwide demand for food grains is increasing every year. Improving the production of food crops is the only feasible way to satisfy this demand. Crop growth and productivity are determined by various factors, such as the capability of the grower, the soil, and essential weather variables like rainfall,


humidity, cloud cover, and sunshine hours. Early and precise prediction of crop yield is important for planning and for choosing which crops to plant. This section presents various crop yield prediction techniques proposed by researchers.

Niedbała [21] proposed a novel yield prediction technique for winter-sown crops that allows the simulation to be run on June 30 of the current year, immediately before harvest, using an Artificial Neural Network with a Multilayer Perceptron (ANN-MLP). Experiments were carried out on a large scale, and the performance of the proposed models was assessed in terms of forecast errors. The experimental results show that the ANN-MLP model produces the lowest MAPE (9.43%) compared with the baseline crop prediction models.

Teresa et al. [22] presented a neural-network-based algorithm for estimating crop yield that uses machine learning strategies to improve the model's accuracy. The proposed method uses back-propagation, computing the error derivative to find suitable weight values. Satellite images and a CNN model are used to predict yields in all locations, achieving better performance than the ANN model.

Oliveira et al. [23] designed a pre-season soya bean yield prediction framework that takes satellite-derived soil properties, rainfall, and other climate variables as input data. The framework is built from neural networks in which the inputs are handled separately: soil datasets are processed by fully connected layers and weather datasets by recurrent LSTM layers. The experimental results show that agricultural stakeholders can obtain the information they need with lower data requirements while maintaining good accuracy.

Srikamdee et al. [24] presented three prediction techniques for forecasting sugarcane yield and quality using Deep Neural Networks (DNN), adaptive evolution techniques, and Back-Propagation Neural Networks (BPNN). Datasets were gathered over five years (2010-14) from sugarcane farmers in Thailand. The three proposed models were run, and their performances were compared. The results show that the DNN-based approach produced promising accuracy in some cases and that its forecast error was lower than that of the other techniques.

Crane-Droesch [25] proposed a crop yield prediction model for corn that uses a semi-parametric variant of a deep neural network. The results show that this technique outperformed conventional approaches in forecasting yields.

Mohan et al. [26] developed a crop yield and weather forecasting framework based on a Self-Organizing Map (SOM) incorporating the Latent Dirichlet Allocation (LDA) technique. The proposed method is the appropriate dimensionality-reduction approach to


emphasize the self-organizing map. Deep Neural Network (DNN) classification is used to identify suitable crops season by season. Experiments were conducted, and model performance was evaluated in terms of accuracy. The experimental results show that the proposed model improved crop and climate prediction accuracy by up to 23% compared with existing models.

Wang et al. [27] developed a soya bean crop yield prediction technique for Argentina and Brazil using deep learning methods. Remote sensing data were used to run the model. The experimental results show that the proposed method achieved reasonable accuracy in soya bean yield prediction compared with traditional approaches. The ability to improve predictive performance with incomplete data by using transfer learning is encouraging.

You et al. [28] introduced a novel deep learning approach for timely, accurate, and affordable prediction of crop yield ahead of harvest using remote sensing data. The method improves on existing strategies in three ways. First, it uses remote sensing data and presents a framework based on modern learning ideas. Second, a new dimensionality reduction method allows an LSTM network to be trained and to learn useful features automatically even when labeled training data are limited. Finally, it includes a Gaussian Process component to model the spatio-temporal structure of the data explicitly and further improve accuracy. The framework was evaluated on local soya bean yield forecasting in the US and outperformed existing approaches.

Pandey et al. [29] proposed two crop yield estimation models for predicting potato yield: a Radial Basis Function Neural Network (RBFNN) and a Generalized Regression Neural Network (GRNN). The models are trained with inputs that include crop attributes such as plant height, leaf area index, and biomass, with potato yield as the output used to train and test the neural networks. Both RBFNN and GRNN produce accurate forecasts of potato yield. Nevertheless, owing to its fast learning ability and smaller spread constant (0.5), the GRNN was a better predictor than the RBFNN.

Ravichandran et al. [30] proposed a framework that helps agricultural stakeholders understand the condition of their land and directs their attention to the crops that could benefit them. The method uses an Artificial Neural Network to forecast crop yield effectively and accurately, and it also recommends fertilizers that might improve productivity. The experimental results were compared with existing systems, and the proposed model was found to achieve 90% forecast accuracy.

Bose et al. [31] introduced a Spiking Neural Network (SNN) model for crop yield forecasting, which differs from the vegetation index image technique. The designed technique was trained and tested using historical wheat crop yields in

A Personalized Artificial


different regions of China, based on spatially accumulated time series data. The authors found that the developed model forecasts yield around one month before the harvest with a good accuracy of 95.64% and a mean prediction error of 0.236 tonnes/hectare. Shastry et al. [32] designed a wheat crop yield prediction system named Customized Artificial Neural Networks (CANN) based on various climate and agricultural parameters. In CANN, the number of hidden layers, the number of neurons in each hidden layer, and the learning rate (LR) are varied, in contrast to the default ANN. The experimental outcomes show that the CANN performs better, with a higher R-square (97%) and a lower percentage forecast error (0.52%) than regression and default ANN models. Matsumura et al. [33] designed two approaches for maize crop yield prediction: a non-linear ANN and Multiple Linear Regression (MLR). Forty-two years (1962-2004) of climate conditions and fertilizer usage were used as predictors of maize yield. The two models were trained and tested, and the prediction skill scores were calculated under retroactive validation and cross-validation. The ANN was found to perform better than the MLR model. As the data were non-stationary, retroactive validation was found to be more consistent than cross-validation in evaluating the prediction skill. Jabjone et al. [34] implemented a crop forecasting model for estimating rice yield in Phimai, Thailand, using back-propagation with a multi-layer feed-forward neural network framework. The input dataset consists of six climate parameters over 10 years (2002-12) collected from the Thailand Meteorological Station. The experimental outcomes showed that the proposed model works satisfactorily with an ANN [8, 17, 19] structure and that the predicted values are very close to the fitted values of the model.
The designed model produces a higher R-square (0.99) and a smaller RMSE (9.94) than regression models in crop yield estimation.

STUDY AREA AND DATASET DESCRIPTION

Study Area

Guntur (15°18’0” -16°50’0” N 15°18’0” -16°50’0” E), an administrative district in the Coastal Andhra (CA) region, Andhra Pradesh state, India, is shown in Fig. (4). It has a coastline of around 100 km, and the Krishna River forms the northeastern and eastern boundaries of the district, separating Guntur from Krishna district. The district is bounded on the southeast by the Bay of Bengal, on the south by Prakasam District, on the west by Mahbubnagar District, and on the northwest by Nalgonda District. It covers an area of 4,398 sq mi and is the second most populous district in the state, with a population of 4,889,230 according to the 2011 Census of India.


Fig. (4). Geographical diagram of Guntur District, AP, India (Source: en.wikipedia.org).

The Guntur district is often referred to as the Land of Chillies. It is also a significant centre for farming, training, and education. Agriculture-related activities are the main contributors to the district’s Gross Value Added (GVA). The normal annual rainfall in the district is 830 mm. Rainfall usually decreases from east to west. Rain is received during both the southwest monsoon and the retreating monsoon. October is the rainiest month of the year.

Dataset Description

This section describes the dataset used to conduct experiments for the rice crop yield prediction model. The historical climate and crop information were gathered from the meteorological department and the Department of Agriculture, Government of AP, respectively. Rice crop yield data and four types of weather parameters (rainfall, sunshine, temperature, and evapotranspiration) were used in this research. For this investigation, we used season-wise weather data and rice yield data for the Guntur district from 1997 to 2014. Once preprocessing is completed, the dataset is split 80:20 into training and test sets. To predict crop yield, we must identify the attributes that directly or indirectly influence it. The input variables for yield estimation are rainfall (mm), temperature (°C), sunshine (hours), and evapotranspiration (mm). The complete dataset structure is shown in Table 1.


Table 1. List of Input and Output parameters.

S. No.   Variable   Name of the Attribute                    Short Name   Type
1        X1         Temperature (°C)                         Temp         Input
2        X2         Rainfall (mm)                            Rainfall     Input
3        X3         Sunshine (hours)                         SH           Input
4        X4         Evapotranspiration (mm)                  ET           Input
5        Y          Rice Yield Production (Kgs / Hectare)    Yield        Output
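The cleaning and 80:20 split described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the chapter's own implementation is in R and is not shown, so the class name `SeasonRecord`, the helper names, the use of `None` for missing values, and the random shuffle before splitting are all hypothetical choices, not the authors' procedure.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SeasonRecord:
    """One seasonal observation, following the attribute layout of Table 1."""
    temp: Optional[float]         # X1: temperature (deg C)
    rainfall: Optional[float]     # X2: rainfall (mm)
    sh: Optional[float]           # X3: sunshine (hours)
    et: Optional[float]           # X4: evapotranspiration (mm)
    yield_kg_ha: Optional[float]  # Y: rice yield (kg/hectare)


def preprocess(records: List[SeasonRecord]) -> List[SeasonRecord]:
    """Drop records with missing values and exact duplicates, keeping order."""
    seen = set()
    clean = []
    for r in records:
        values = (r.temp, r.rainfall, r.sh, r.et, r.yield_kg_ha)
        if any(v is None for v in values):
            continue  # missing value
        if r in seen:
            continue  # duplicate record
        seen.add(r)
        clean.append(r)
    return clean


def split_80_20(records: List[SeasonRecord],
                seed: int = 0) -> Tuple[List[SeasonRecord], List[SeasonRecord]]:
    """Shuffle (an assumption) and split into 80% training, 20% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Whether the original split was random or chronological is not stated in the text; for seasonal data a chronological split may be preferable, and `split_80_20` would then simply skip the shuffle.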

The weather data were collected from http://www.imdhyderabad.gov.in. Temperature, Rainfall, SH, and ET are the input values, and yield is the output parameter, which depends on the input attributes. We considered only the relevant attributes required for designing the forecasting model.

PROPOSED METHODOLOGY

The main challenge for farmers is choosing a proper crop for sowing. Generally, crop yield depends on climate conditions and soil types. Yield prediction is crucial for farmers because it informs the selection of suitable crops for planting. There is no system that recommends a particular crop to farmers. Making the right cultivation choices, balancing future cost and yield, is a fundamental and challenging task. In this section, we propose an ANN model for rice crop yield forecasting that uses weather parameters (rainfall, temperature, sunshine hours, and evapotranspiration) and apply it to the Guntur district, Andhra Pradesh (AP), India. Usually, the default ANN has just one hidden layer. In this work, we designed a Personalized Artificial Neural Network (P-ANN) approach that varies the number of hidden layers, the number of neurons per layer, and the learning rate. The architecture of the P-ANN is presented in Fig. (5).

P-ANN (Personalization of ANN)

This study customizes the ANN to develop the P-ANN model by changing the number of hidden layers, the number of neurons in each hidden layer, and the learning rate. The algorithm of P-ANN used for rice crop forecasting is presented in Fig. (6).


Fig. (5). Flow diagram of P-ANN Model.

1. Input: Empirical dataset of climate data and crop data
2. Output: Forecasted rice yield
3. Preprocess both the weather and crop datasets; ensure that there are no redundant, missing, or inconsistent values in the dataset, and eliminate any that are found.
4. Divide the dataset into the training set (80%) and the test set (20%).
5. Use the Quasi-Newton method for training.
6. Customize the feed-forward back-propagation network by changing the parameters below:
   • Number of hidden layers (1-5)
   • Number of neurons in the hidden layer (20-100)
   • Learning rate (0.25, 0.50)
   • Network weights (arbitrary)
7. Repeat the above step until an ANN model with high test accuracy and a low percentage forecast error is obtained.

Fig. (6). Algorithm for P-ANN.
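Steps 6 and 7 of the algorithm amount to a search over network configurations. The sketch below shows one way such a search could be organised in Python; it is not the authors' R implementation. The names `Config`, `candidate_configs`, and `select_best` are hypothetical, and `train_and_score` stands for a user-supplied function (not defined here) that trains a network for a given configuration and returns its test-set R-square. For brevity the sketch selects on R-square only, whereas the algorithm also considers the percentage forecast error.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, NamedTuple, Optional, Tuple


class Config(NamedTuple):
    hidden_layers: Tuple[int, ...]  # number of neurons in each hidden layer
    learning_rate: float


def candidate_configs(layer_counts: Iterable[int],
                      neuron_options: Iterable[int],
                      learning_rates: Iterable[float]) -> Iterator[Config]:
    """Enumerate every combination of depth, layer sizes, and learning rate."""
    neuron_options = list(neuron_options)
    learning_rates = list(learning_rates)
    for n_layers in layer_counts:
        for sizes in product(neuron_options, repeat=n_layers):
            for lr in learning_rates:
                yield Config(sizes, lr)


def select_best(configs: Iterable[Config],
                train_and_score: Callable[[Config], float]
                ) -> Tuple[Optional[Config], float]:
    """Return the configuration with the highest score and that score."""
    best_cfg, best_score = None, float("-inf")
    for cfg in configs:
        score = train_and_score(cfg)  # hypothetical: trains and returns test R-square
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

For example, `candidate_configs([1, 2], range(20, 101, 10), (0.25, 0.50))` covers one and two hidden layers with 20 to 100 neurons in steps of 10 at both learning rates. Deeper networks multiply the number of candidates quickly, which is one reason to restrict the search as the chapter does.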


Two hidden layers have been considered. The number of neurons in each layer varies from 20 to 100. The P-ANN configuration that yields the highest R-square value and the lowest percentage forecast error on the test set was selected. The best ANN model is selected based on the highest R-square value for the test set, and it has the following configuration:

Number of hidden layers: 2
Number of neurons: 50 in the first layer and 20 in the second layer
Learning rate: 0.25

Fig. (5) represents the methodology followed by the P-ANN model. The dataset of 100 records is preprocessed by eliminating noisy, missing, and duplicate values. It is divided into training and test sets in the ratio 80:20. The training set was used to train the P-ANN until the maximum R-square value was reached, and the test set was used to evaluate the performance of the model on new values. In this way, data analytics were used to assess the impact of the significant weather attributes, in particular temperature, rainfall, sunshine hours, and evapotranspiration, on rice yield over the crop growing period. Accordingly, the research aimed to build a precise crop yield model based on climate factors using the P-ANN approach. Fig. (7) represents the trained neural network. Our model has two hidden layers with different combinations of neurons (10 to 100 in steps of 10 in the second layer and 20, 30, 40, or 50 in the first layer). In the figure, the input parameters are rainfall, sunshine hours (SH), temperature (Temp), and evapotranspiration (ET), and the output parameter is rice yield production. The black lines represent the connections with weights. The weights are computed using the backpropagation algorithm, and the blue line displays the bias term.

Fig. (7). P-ANN model execution diagram.


MODEL EXECUTION AND EVALUATION

This section describes the execution of the P-ANN model for rice yield forecasting. It is implemented in R on the Windows 7 operating system with 4 GB of RAM and a 500 GB hard disk. The model’s accuracy is measured using the Percentage Forecast Error (PFE) and the R2 value. R2 is a statistical measure of how close the data are to the fitted regression line, and it is computed by Eq. (2).

$$R^2 = 1 - \left(\frac{m-1}{m-p}\right)\frac{SSE}{SST} \qquad (2)$$

Where SSE is the sum of squared errors, SST is the total sum of squares, m is the number of observations, and p is the number of regression coefficients. A higher value of R2 indicates that the model is better suited for prediction. The percentage forecast error of a method is measured by Eq. (3).

$$PFE = \left(\frac{|X - Y|}{|X|}\right) \times 100 \qquad (3)$$

Where X is the actual rice yield and Y is the forecast rice yield. A lower PFE value indicates better prediction accuracy. Here, P-ANN is customized by varying the number of hidden layers from 1 to 2. Each hidden layer consists of 20-100 neurons, and the model is tested for learning rates of 0.25 and 0.50. Table 2 displays the outcomes for two hidden layers with LR=0.25. The number of neurons in the first layer is fixed at 20, and the number of neurons in the second layer varies from 20 to 100 in 9 intervals (20, 30, 40, 50, 60, 70, 80, 90, and 100). This model execution is repeated with 20, 30, 40, and 50 neurons in the first layer, each time varying the second layer from 20 to 100 in 9 intervals. We observed that the best R-square value is 0.97, with 40 neurons in the first layer and 80 neurons in the second layer.
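The two evaluation measures of Eqs. (2) and (3) are straightforward to compute. The following Python sketch assumes the usual adjusted form of R-square, with the factor (m - 1)/(m - p); the function names are hypothetical, and the value of p for a neural network is left to the caller because the chapter does not say how it was chosen.

```python
from typing import Sequence


def adjusted_r2(actual: Sequence[float], predicted: Sequence[float], p: int) -> float:
    """Eq. (2): R2 = 1 - ((m - 1) / (m - p)) * (SSE / SST)."""
    m = len(actual)
    if m != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if m <= p:
        raise ValueError("need more observations than coefficients")
    mean_actual = sum(actual) / m
    sse = sum((a - f) ** 2 for a, f in zip(actual, predicted))  # squared errors
    sst = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    if sst == 0:
        raise ValueError("SST is zero: all actual values are identical")
    return 1.0 - ((m - 1) / (m - p)) * (sse / sst)


def pfe(actual: float, predicted: float) -> float:
    """Eq. (3): percentage forecast error for a single observation."""
    return abs(actual - predicted) / abs(actual) * 100.0


def mean_pfe(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average PFE over a set of observations."""
    errors = [pfe(a, f) for a, f in zip(actual, predicted)]
    return sum(errors) / len(errors)
```

With illustrative numbers (not taken from the chapter), an actual yield of 200 and a forecast of 190 give a PFE of 5%.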


Table 2. P-ANN outcomes for two hidden layers with LR=0.25.

No. of neurons     No. of neurons     Training   Testing
in the 1st layer   in the 2nd layer   (R2)       (R2)
20                 10                 0.1        0.56
20                 20                 0.99       0.87
20                 30                 0.96       0.25
20                 40                 0.87       0.36
20                 50                 0.97       0.45
20                 60                 0.1        0.88
20                 70                 0.86       0.68
20                 80                 0.87       0.24
20                 90                 0.99       0.82
20                 100                0.1        0.64
30                 10                 0.97       0.26
30                 20                 0.1        0.81
30                 30                 0.86       0.54
30                 40                 0.87       0.91
30                 50                 0.99       0.61
30                 60                 0.1        0.64
30                 70                 0.1        0.87
30                 80                 0.99       0.95
30                 90                 0.88       0.90
30                 100                0.87       0.12
40                 10                 0.99       0.28
40                 20                 0.97       0.72
40                 30                 0.87       0.86
40                 40                 0.97       0.56
40                 50                 0.1        0.91
40                 60                 0.86       0.87
40                 70                 0.87       0.21
40                 80                 0.99       0.97
40                 90                 0.1        0.91
40                 100                0.87       0.96
50                 10                 0.99       0.89
50                 20                 0.97       0.51
50                 30                 0.1        0.34
50                 40                 0.86       0.71
50                 50                 0.87       0.25
50                 60                 0.99       0.26
50                 70                 0.1        0.62
50                 80                 0.97       0.73
50                 90                 0.1        0.91
50                 100                0.86       0.81
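The selection rule stated in the text (pick the configuration with the highest testing R-square) can be checked against Table 2 with a short Python sketch. The four rows below are copied from the table and are, for each first-layer size, the row with the highest testing R-square; the names `Run` and `best_by_test_r2` are hypothetical.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    first_layer: int
    second_layer: int
    train_r2: float
    test_r2: float


# Rows copied from Table 2 (LR = 0.25): the best testing R-square
# for each first-layer size.
runs = [
    Run(20, 60, 0.1, 0.88),
    Run(30, 80, 0.99, 0.95),
    Run(40, 80, 0.99, 0.97),
    Run(50, 90, 0.1, 0.91),
]


def best_by_test_r2(rows: List[Run]) -> Run:
    """Return the run with the highest testing R-square."""
    return max(rows, key=lambda r: r.test_r2)


best = best_by_test_r2(runs)
print(best.first_layer, best.second_layer, best.test_r2)
```

The selected row has 40 neurons in the first layer and 80 in the second, matching the best testing R-square of 0.97 reported for this learning rate.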

Table 3 displays the outcomes for two hidden layers with LR=0.50. The best R-square value obtained is 0.94, with 30 neurons in the first layer and 60 neurons in the second layer.

Table 3. P-ANN outcomes for two hidden layers with LR=0.50.

No. of neurons     No. of neurons     Training   Testing
in the 1st layer   in the 2nd layer   (R2)       (R2)
20                 10                 0.88       0.88
20                 20                 0.99       0.56
20                 30                 0.1        0.54
20                 40                 0.87       0.26
20                 50                 0.76       0.48
20                 60                 0.71       0.36
20                 70                 0.63       0.91
20                 80                 0.87       0.65
20                 90                 0.88       0.23
20                 100                0.99       0.12
30                 10                 0.74       0.86
30                 20                 0.99       0.85
30                 30                 0.56       0.71
30                 40                 0.61       0.76
30                 50                 0.87       0.81
30                 60                 0.88       0.94
30                 70                 0.99       0.63
30                 80                 0.1        0.29
30                 90                 0.1        0.68
30                 100                0.87       0.46
40                 10                 0.76       0.82
40                 20                 0.71       0.79
40                 30                 0.63       0.47
40                 40                 0.87       0.90
40                 50                 0.61       0.56
40                 60                 0.87       0.90
40                 70                 0.88       0.93
40                 80                 0.61       0.54
40                 90                 0.87       0.91
40                 100                0.88       0.56
50                 10                 0.99       0.12
50                 20                 0.86       0.89
50                 30                 0.1        0.91
50                 40                 0.87       0.27
50                 50                 0.76       0.38
50                 60                 0.76       0.81
50                 70                 0.71       0.83
50                 80                 0.63       0.90
50                 90                 0.87       0.69
50                 100                0.82       0.48

From Table 4, we compute the average PFE for P-ANN using Eq. (3) as 0.5172, which is smaller than that of the other existing prediction models.


Table 4. Percentage forecast error for test-set using P-ANN.

Actual Yield (x)   Predicted Yield (y)   Difference D=abs(x-y)   PE
3835.6             3772.4                63.2                    2.875
3547.5             3547.5                0                       0
3479               3479                  0                       0
3576               3558.9                17.1                    0.595
3259.4             3259.4                0                       0
3810.5             3810.5                0                       0
3912.7             3890.9                21.8                    0.548
3762               3762                  0                       0

Fig. (8) compares the fitted yield values with the predicted yield values of the test dataset using the P-ANN model. It shows that the margin of error between the two sets of values is small, which indicates that the proposed model produces superior results.

[Chart: Actual vs Predicted Rice Yield. Vertical axis: Rice Yield Production (0 to 4500); horizontal axis: 1 to 8; series: Actual Value, Predicted Value.]

Fig. (8). Actual vs. fitted values for test data.


Comparative Analysis

This section compares the performance of the P-ANN model with other crop yield prediction models, Multiple Linear Regression (MLR) and Regression Analysis (RA), in terms of R2 and PFE. The proposed model (P-ANN) is designed and implemented on a vast scale. The empirical outcomes show that the P-ANN produces superior results compared to the existing crop yield prediction models. The performance of the P-ANN is assessed in terms of R-square (0.97) and PFE (0.5172), and the experimental results show that it produces better statistical values than the existing methods. Table 5 compares the MLR and RA models with the P-ANN. From Table 5, we observe that the proposed P-ANN model produces a higher R2 and a lower PFE than the other prediction models, which means it gives better results than the other methods.

Table 5. Forecast results based on R2 and PFE.

Model Name                        R-Square   PFE
MLR (2018) [35]                   0.5457     13.01
Regression Analysis (2016) [36]   0.7242     -
P-ANN                             0.97       0.5172

CONCLUSION AND FUTURE WORKS

Crop yield forecasting is essential in the farming field. Developing techniques for predicting crop yield continuously under various agro-climatic conditions could improve agricultural management decisions. Agricultural production is, to a great extent, influenced by variability in climate. Because they make it possible to study the effects of variable inputs, such as climate events, on crop yield, crop models have been used effectively to support decision-making in farming. Early warning information on crop yield production is critical for agriculture-related stakeholders. In this work, the P-ANN models were built and applied to forecast rice yield using climate parameters: rainfall, temperature, sunshine hours, and evapotranspiration. The Guntur district’s seasonal (Kharif and Rabi) dataset from 1997 to 2014 was used to build the model. The outcome demonstrated that the P-ANN approach could effectively predict rice crop yield, and the predicted values were close to the fitted data. The results are compared using R2 and PFE values. The proposed rice yield forecasting model, P-ANN, produced a higher R2 (0.97) and a lower PFE (0.5172) than the other existing models. From the results, it can be concluded that the P-ANN approach is suitable for timely and accurate rice crop yield prediction. The results can help agriculture-related stakeholders make the right decisions. In the future, more layers will be incorporated into the proposed model to enhance its performance.

ACKNOWLEDGEMENTS

The authors would like to thank Mr. A Vivekan and Assistant Professor of the CSE Dept. & Rosy Matilda P, Professor of H&S in CMRCET, Hyderabad, TS, India, for their great support, and N Sujatha, Meteorologist ‘A,’ for Director–In-Charge, IMD, Hyderabad, for providing weather datasets for this research work.

REFERENCES

[1]

S.N. Mandal, A. Ghosh, J.P. Choudhury, and S.R.B. Chaudhuri, "Prediction of productivity of mustard plant at maturity using harmony search", 2012 1st International Conference on Recent Advances in Information Technology (RAIT), IEEE, 2012, pp. 933-938.

[2]

P.C.S. Reddy, and A. Sureshbabu, "An applied time series forecasting model for yield prediction of agricultural crop", International Conference on Soft Computing and Signal Processing, Jun 21, 2019, Springer, Singapore, pp. 177-187.

[3]

H. Lee, and A. Moon, "Development of yield prediction system based on real-time agricultural meteorological information", 16th International Conference on Advanced Communication Technology, 2014, pp. 1292-1295 [http://dx.doi.org/10.1109/ICACT.2014.6779168]

[4]

P.C. Reddy, and A.S. Babu, "Survey on weather prediction using big data analystics", Second International Conference on Electrical, Computer and Communication Technologies (ICECCT), IEEE, Feb 2017, pp. 1-6.

[5]

S.S. Dahikar, and V.R. Sandeep, "Agricultural crop yield prediction using artificial neural network approach", International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, vol. 2, no. 1, pp. 683-686, 2014.

[6]

A. Choudhury, and J. James, "Crop yield prediction using time series models", Journal of Economics and Economic Education Research, vol. 15, no. 3, pp. 53-67, 2014.

[7]

Y. Sucharitha, Y. Vijayalata, and V.K. Prasad, "Predicting Election Results from Twitter Using Machine Learning Algorithms", Recent Advances in Computer Science and Communications (Formerly: Recent Patents on Computer Science), vol. 14, no. 1, pp. 246-256, 2021.

[8]

M. van der Velde, and L. Nisini, "Performance of the MARS-crop yield forecasting system for the European Union: Assessing accuracy, in-season, and year-to-year improvements from 1993 to 2015", Agric. Syst., vol. 168, pp. 203-212, 2019. [http://dx.doi.org/10.1016/j.agsy.2018.06.009] [PMID: 30774183]

[9]

P. Reddy, and A. Sureshbabu, "An Adaptive Model for Forecasting Seasonal Rainfall Using Predictive Analytics", International Journal of Intelligent Engineering and Systems, vol. 12, no. 5, pp. 22-32, 2019. [http://dx.doi.org/10.22266/ijies2019.1031.03]

[10]

P.C. Shaker Reddy, and A. Sureshbabu, "An Enhanced Multiple Linear Regression Model for Seasonal Rainfall Prediction", Int. J. Sensors Wirel. Commun. Control, vol. 10, no. 4, pp. 473-483, 2020. [http://dx.doi.org/10.2174/2210327910666191218124350]

[11]

A. Garg, and B. Garg, "A robust and novel regression based fuzzy time series algorithm for prediction of rice yield", International Conference on Intelligent Communication and Computational Techniques (ICCT), 2017, pp. 48-54 [http://dx.doi.org/10.1109/INTELCCT.2017.8324019]

[12]

Y. Sucharitha, V.K. Prasad, and Y. Vijayalatha, "Emergent Events Identification in Micro-Blogging Networks Using Location Sensitivity", Journal of Advanced Research in Dynamical and Control Systems, vol. 11, pp. 596-607, 2019.

[13]

Y. Zhang, and Q. Qin, "Winter Wheat Yield Estimation with Ground Based Spectral Information", IEEE International Geoscience and Remote Sensing Symposium, 2018, pp. 6863-6866 [http://dx.doi.org/10.1109/IGARSS.2018.8519582]

[14]

V. Shelia, J. Hansen, V. Sharda, C. Porter, P. Aggarwal, C.J. Wilkerson, and G. Hoogenboom, "A multi-scale and multi-model gridded framework for forecasting crop production, risk analysis, and climate change impact studies", Environ. Model. Softw., vol. 115, pp. 144-154, 2019. [http://dx.doi.org/10.1016/j.envsoft.2019.02.006]

[15]

N.A. Charaniya, and S.V. Dudul, "Focused time delay neural network model for rainfall prediction using indian ocean dipole index", Fourth International Conference on Computational Intelligence and Communication Networks, IEEE, 2012, pp. 851-855.

[16]

S.K. Biswas, L. Marbaniang, B. Purkayastha, M. Chakraborty, H.R. Singh, and M. Bordoloi, "Rainfall forecasting by relevant attributes using artificial neural networks-a comparative study", International Journal of Big Data Intelligence, vol. 3, no. 2, pp. 111-121, 2016. [http://dx.doi.org/10.1504/IJBDI.2016.077362]

[17]

A. Haidar, and B. Verma, "Monthly rainfall forecasting using one-dimensional deep convolutional neural network", IEEE Access, vol. 6, pp. 69053-69063, 2018. [http://dx.doi.org/10.1109/ACCESS.2018.2880044]

[18]

Z. Beheshti, M. Firouzi, S.M. Shamsuddin, M. Zibarzani, and Z. Yusop, "A new rainfall forecasting model using the CAPSO algorithm and an artificial neural network", Neural Comput. Appl., vol. 27, no. 8, pp. 2551-2565, 2016. [http://dx.doi.org/10.1007/s00521-015-2024-7]

[19]

Y. Sucharitha, Y. Vijayalata, and V.K. Prasad, "Analysis of Early Detection of Emerging Patterns from Social Media Networks: A Data Mining Techniques Perspective", Soft Computing and Signal Processing, Springer, Singapore, 2019, pp. 15-25.

[20]

S.K. Nanda, D.P. Tripathy, S.K. Nayak, and S. Mohapatra, "Prediction of Rainfall in India using Artificial Neural Network (ANN) Models", Int. J. Intell. Syst. Appl., vol. 5, no. 12, pp. 1-22, 2013. [http://dx.doi.org/10.5815/ijisa.2013.12.01]

[21]

G. Niedbała, "Simple model based on artificial neural network for early prediction and simulation winter rapeseed yield", J. Integr. Agric., vol. 18, no. 1, pp. 54-61, 2019. [http://dx.doi.org/10.1016/S2095-3119(18)62110-0]

[22]

S.S. Dahikar, and S.V. Rode, "Agricultural crop yield prediction using artificial neural network approach", International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, vol. 2, no. 1, pp. 683-686, 2014.

[23]

R.L.F. Cunha, B. Silva, and M.A.S. Netto, "A scalable machine learning system for pre-season agriculture yield forecast", 14th International Conference on e-Science (e-Science), 2018, pp. 423-430.

[24]

S. Srikamdee, S. Rimcharoen, and N. Leelathakul, "Sugarcane Yield and Quality Forecasting Models: Adaptive ES vs. Deep Learning", 2nd International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, 2018, pp. 6-11 [http://dx.doi.org/10.1145/3206185.3206190]

[25]

A. Crane-Droesch, "Machine learning methods for crop yield prediction and climate change impact assessment in agriculture", Environ. Res. Lett., vol. 13, no. 11, 114003, 2018. [http://dx.doi.org/10.1088/1748-9326/aae159]

[26]

P. Mohan, and K. Patil, "Deep Learning Based Weighted SOM to Forecast Weather and Crop Prediction for Agriculture Application", International Journal of Intelligent Engineering and Systems, vol. 11, no. 4, pp. 167-176, 2018. [http://dx.doi.org/10.22266/ijies2018.0831.17]

[27]

A.X. Wang, C. Tran, N. Desai, D. Lobell, and S. Ermon, "Deep transfer learning for crop yield prediction with remote sensing data", 1st ACM SIGCAS Conference on Computing and Sustainable Societies, 2018, pp. 1-5.

[28]

J. You, X. Li, M. Low, D. Lobell, and S. Ermon, "Deep gaussian process for crop yield prediction based on remote sensing data", Thirty-First AAAI Conference on Artificial Intelligence, 2017 [http://dx.doi.org/10.1609/aaai.v31i1.11172]

[29]

A. Pandey, and A. Mishra, "Application of artificial neural networks in yield prediction of potato crop", Russ. Agric. Sci., vol. 43, no. 3, pp. 266-272, 2017. [http://dx.doi.org/10.3103/S1068367417030028]

[30]

G. Ravichandran, and R.S. Koteeshwari, "Agricultural crop predictor and advisor using ANN for smartphones", International Conference on Emerging Trends in Engineering, Technology and Science (ICETETS), 2016, pp. 1-6 [http://dx.doi.org/10.1109/ICETETS.2016.7603053]

[31]

P. Bose, N.K. Kasabov, L. Bruzzone, and R.N. Hartono, "Spiking neural networks for crop yield estimation based on spatiotemporal analysis of image time series", IEEE Trans. Geosci. Remote Sens., vol. 54, no. 11, pp. 6563-6573, 2016. [http://dx.doi.org/10.1109/TGRS.2016.2586602]

[32]

K.A. Shastry, H.A. Sanjay, and A. Deshmukh, "A parameter based customized artificial neural network model for crop yield prediction", Journal of Artificial Intelligence, vol. 9, no. 1, pp. 23-32, 2016.

[33]

K. Matsumura, C.F. Gaitan, K. Sugimoto, A.J. Cannon, and W.W. Hsieh, "Maize yield forecasting by linear regression and artificial neural networks in Jilin, China", J. Agric. Sci., vol. 153, no. 3, pp. 399-410, 2015. [http://dx.doi.org/10.1017/S0021859614000392]

[34]

S. Jabjone, and C. Jiamrum, "Artificial neural networks for predicting the rice yield in Phimai District of Thailand", International Journal of Electrical Energy, vol. 1, no. 3, pp. 177-181, 2013. [http://dx.doi.org/10.12720/ijoee.1.3.177-181]

[35]

G. Niedbała, "Application of multiple linear regression for multi-criteria yield prediction of winter wheat", J. Res. Appl. Agric. Eng., vol. 63, no. 4, pp. 125-131, 2018.

[36]

V. Sellam, and E. Poovammal, "Prediction of crop yield using regression analysis", Indian J. Sci. Technol., vol. 9, no. 38, pp. 1-5, 2016. [http://dx.doi.org/10.17485/ijst/2016/v9i38/91714]

296

Artificial Intelligence and Data Science, 2023, 296-301

SUBJECT INDEX A Adaptive evolution techniques 279 Aggregate diversity 147 Agricultural 274, 275, 292 production 292 related stakeholders 275 sector contribution 274 Agriculture 12, 26, 272, 273, 274, 275, 280, 282, 292 government 282 related stakeholders 280, 292 Agriculturists 273, 274 Algorithms 2, 3, 11, 12, 13, 14, 15, 19, 20, 21, 55, 95, 96, 97, 98, 101, 102, 121, 172, 221, 240, 242, 243, 277 back-propagation 281 binary 13 boosting 243 genetic 55 Amazon web services 18 Analysis 101, 151, 242 data mining 151 sensitivity 101 streamlining 242 Audio spectrograms 10 Automatic 26, 58 discovery 58 filtration 26 Automating production processes 28

B Back-propagation neural networks (BPNN) 279 Bayesian theory 38 Bayes theorem 13, 38 Bi-directional long 58, 62, 161, 162, 163 short-term memory 161, 162, 163 LSTM (BiLSTM) 58, 62 Bidirectional long short-term memory (BLSTM) 138, 139, 162

systems 161 Binary function vectors 13 Bluetooth technologies 209 Boosting method 44 Business growth 147

C Canonical polyadic decomposition (CPD) 221, 230 CARS frameworks 83 Cloud servers 200 Cluster data sets 201 Clustering 8, 119, 136, 241 methods 136 network 119 techniques 8, 241 Codebook-transfer (CTB) 98 Collaborative 170, 172, 208, 230 filtering methods 170 recommender systems 170, 172, 208 tensor decomposition 230 Content 110, 112, 113, 171, 179, 180, 184, 187 based recommender system 110, 112, 113, 179, 180 recommender system 171, 184, 187 Content-based filtering (CBF) 1, 4, 6, 30, 112, 119, 126, 131, 132, 134, 139, 171, 174, 179 method 30, 134 Context 72, 74, 75, 76, 77, 78, 83, 84, 85, 86, 87, 88, 89, 99, 101, 211 awareness (CA) 72, 76 aware recommender system (CARS) 72, 74, 75, 76, 77, 78, 83, 84, 85, 86, 87, 88, 89, 99, 101 aware tourist information system (CATIS) 211 Contextual user-rating tensors 96 Convolutional neural networks (CNN) 58, 59, 60, 61, 63, 64, 201

Abhishek Majumder, Joy Lal Sarkar & Arindam Majumder (Eds.) All rights reserved-© 2023 Bentham Science Publishers

Subject Index


Crop yield prediction techniques 279 Cross-domain 72, 73, 75, 76, 77, 78, 79, 80, 81, 83, 89, 90, 97, 100, 101, 103 context-aware recommender systems 72, 76, 77, 78 methods 78, 81 recommendation methods 83 recommender systems (CDRS) 72, 73, 75, 76, 78, 79, 80, 81, 89, 100 recommender systems performance 101 techniques 90, 97, 103 Customized artificial neural networks (CANN) 281

F

D

G

Data preprocessing techniques 43 Decomposition 200, 216, 221, 224 tensor-train 200 tucker 216, 221, 224 techniques 221 Deep 61 cooperative neural networks 61 Deep learning 53, 56, 57, 58, 59, 60, 61, 63, 66, 126, 128, 136, 137, 138, 140, 151, 158, 159, 243, 280 approaches 53, 56, 57, 58, 59 based Approach 137 framework 59 methods 61, 243, 280 network 60 systems 59, 159 techniques 63, 137 technologies 128 Deep neural network(s) (DNN) 58, 59, 60, 200, 279, 280 based recommender (DNNRec) 60

GAN network 63 Gaussian process 280 Generalized regression neural network (GRNN) 280 Generative adversarial networks (GAN) 58, 61, 64, 66 Geographical curve fitting technique 225

Factorization machine methods 61 Filtering 53, 96, 123, 126, 132, 165, 166, 167, 168, 235 demographic 126, 132 system 53, 132 Filtering techniques 1, 7, 30, 31, 187 neighborhood-based 187 Food crops 278 Forecasting 82, 240, 257, 259, 279, 280, 283 rice crop 283 Fuzzy logic (FL) 276

H High order single value decomposition (HOSVD) 221, 227, 228

I Intelligent 168, 169 search engine technology 168 video services 169 Internet 26, 30, 62, 168, 205, 241 of things (IoT) 26, 62, 168, 205, 241 service providers (ISPs) 30

E J Electronic travel aids (ETAs) 212 Jaccard similarity 11, 175 Jennrich’s algorithm 221


K Kernel functions, polynomial 43

L Location-based social networks (LBSNs) 189, 190, 197, 217, 218, 220, 226 Long short-term memory (LSTM) 58, 62, 64, 138, 139, 140, 159, 160, 161, 162, 163, 202 systems 159, 160 LSTM-based neural network techniques 61

M
Machine learning 1, 11, 15, 25, 26, 27, 28, 29, 34, 36, 45, 55, 56, 61, 62, 99, 205, 241, 243, 279
  algorithms 11, 25, 26, 27, 28, 29, 45, 205
  boosted 36
  methods 15, 25, 55, 61, 243
  strategies 279
  tasks 99
  techniques 1, 26, 28, 34, 56, 62, 241, 243
Machines, reduced Boltzmann 59
Matrix factorization (MF) 1, 2, 7, 8, 10, 11, 62, 63, 96, 97, 172, 173, 193, 199, 218, 219, 232, 233
  algorithms 10, 96
  approaches 199
  methods 1, 173
  techniques 8, 11, 219
Matthews correlation coefficient (MCC) 99
Mean 17, 258
  absolute percentage error (MAPE) 258
  bias error (MBE) 17
Mean squared 1, 15, 16, 65, 99, 139, 146, 172, 174
  distance (MSD) 174, 178
  error (MSE) 1, 15, 16, 65, 99, 139, 146
Mobile 127, 200, 206, 209, 210, 211
  application 209
  devices 200, 206, 209, 216
  phone 127
  system 211
Model-based 4, 172
  algorithms 172
  approach 4
Model consistency 194
Multi 114, 124, 134, 136, 138
  attribute utility theory (MAUT) 134
  criteria decision-making (MCDM) 114, 124, 136, 138
Multiple linear regression (MLR) 281, 292
Multiplicative method 8

N
Naive Bayes 14, 137, 138, 153
  algorithm 137, 138
  applications of 14
  and Logistic Regression 153
  theory 137
National economy 273
Natural language 13, 16, 44, 153, 180
  processing (NLP) 13, 16, 44, 153
  toolkit (NLTK) 153, 158, 180
Natural logarithm 252, 255
Neural 60, 139
  collaborative filtering (NCF) 60
  network-based system 139

O
Object-role modeling (ORM) 84

P
Pairwise interaction tensor factorization (PITF) 193, 202, 221
Personalized video ranker (PVR) 144, 145
Position locator devices (PLDs) 212
Pre-filtering algorithm 96
Pre-processing ventures 153
Pricing delays 242
Probabilistic 38, 99, 172
  approach 38
  classification technique 38
  latent semantic analysis 172
  measures 99
Probability 13, 38, 39, 116, 118, 153, 190, 197, 219, 224, 226, 228
  computed 116
  implicit transition 226
  inference algorithms, efficient 228
Procedure, language preparation 151

R
Radio frequency identification (RFID) 213
Recurrent neural networks (RNN) 58, 59, 60, 61, 62, 63, 64, 66, 161, 220
Research questions (RQ) 76
Restricted Boltzmann machines (RBM) 2, 57, 61, 64, 141
Rice yield forecasting 287
Risk management 242
Root mean squared error (RMSE) 1, 15, 16, 17, 19, 20, 21, 65, 99, 138, 139, 146, 147, 186, 258

S
Sensors 26, 27, 28, 49, 85, 206, 213
  intelligent 26
  mobile device GPS 85
  smart 28, 49
Sentiment analysis techniques 163
Service(s) 128, 141, 168, 200
  feeling reckless 141
  mobile 128, 200
  operational data 168
Singular value decomposition (SVD) 1, 2, 7, 8, 9, 10, 11, 19, 21, 54, 55, 60, 141
Social 202, 221
  media data 202
  tagging systems (STS) 221
Software, modern recruitment 270
Soil 272, 275
  nature 272
  properties 275
Soya bean crop yield prediction technique 280
Spatiotemporal translation 198
Spearman rank correlation 15
Spiking neural networks (SNNs) 280
Standard recurrent neural networks 161
Streaming services 129
Support vector machines 25, 40, 88, 154, 158, 159, 205, 241
SVD technique 21

T
Taxonomy of cross-domain 83
  recommendation methods 83
Technologies, deep-learning 54
Technology 26, 62, 158, 205, 241, 266
  architecture 62
  machine vision 26
  mechanical 158
  multiplexing 241
  4.0-related 26
  smartphone 205
  traditional industrial 26
  video 266
Telecom utilities 127
Telephones, public 210
Temporal influence 196, 220, 233
  enhanced POI recommendation 220
  extension 196, 233
Tensor(s) 201, 221, 224, 228, 229, 231, 234
  adjacency 221
  core 228
  decomposition 201, 221, 224, 234
  denser 231
  factorization 224
  power method 221
  problems 229
Tensor factorization 96, 103
  applying 103
  method 96
Third-order tensor 220, 224, 227
TOPSIS 119, 122
  algorithm 122
  matrix for weighted normalized criteria 119
Total sales and rating patterns 248
Tourism 83, 127, 205, 206, 208, 209, 211
  electronic 206
  national 211
  recommendation system 206, 208
Tourist 206, 211
  guidance system 211
  services 206
Tours 206, 209, 211, 216
  generating personal guided 211
Tour planning 211
  research 211
  support 211
TRACER’s knowledge-transferring process 98
Traditional crop yield forecasting methods 275
Training 3, 11, 14, 32, 37, 38, 44, 45, 48, 58, 157
  accuracy 44, 45
  component 32
  data 3, 11, 14, 38, 48, 58, 157
  process 37
  strategy 58
Transformation 30, 59, 255
  non-linear 59
Translated-based recommendation framework 198
Travel 213, 223
  courses, sensible 213
  decision-making process 223
Trees, single regression 42
Trustworthiness issues 63
Tucker 216, 221, 224, 229
  decomposition (TD) 216, 221, 224
  rank 229

U
Understanding disclosure 158
Unmanned aerial vehicle (UAV) 26
User(s) 78, 129, 134, 137, 171, 173, 175, 196, 198, 206, 223, 233
  clustering technique 137
  cold-start 198, 223
  dynamic 134
  item filtering (UIF) 171, 174
  migrant 206
  smartphone 129
  social network 78
  time-based collaborative filtering (UTCF) 196, 233
User-based collaborative 19, 175, 184, 196, 218, 233, 234
  filtering (UCF) 19, 184, 196, 218, 233, 234
  recommender system 175
User-centric multi-platform 96
  recommendation framework 96

V
Validation 32, 281
  retroactive 281
Values 244, 247, 255, 287
  forecast 287
  numeric 247, 255
  numerical 244
Vegetation index image technique 280
Vehicles, global logistics 28
Video streaming services 126
Virtual marketplace 134
Visiting 229, 235
  area 235
  contrasting locations 229

W
Wasserstein GAN 58
  gradient penalty 58
Weather 189, 230, 272, 275, 279
  conditions 189, 230, 272, 275
  forecasting framework 279
Web 96, 127, 151, 182, 184, 190, 206, 211, 216
  connections 190, 216
  mining 182
  series 127
  technologies 206
Wheat crop yield prediction system 281
Wistful analysis 151
Work, learning-based 136

Y
Yield forecast 273, 280
  local soya bean 280
