This book is an extension of the previous book “Computer Vision and Machine Learning in Agriculture” for academicians, re
English Pages 273 [269] Year 2022
Table of contents :
Preface
Contents
Editors and Contributors
1 Harvesting Robots for Smart Agriculture
1 Introduction
2 Related Literature
3 Basics of Harvesting Robots
4 Robots in Harvesting Applications
4.1 Harvesting Robots in Path Navigation
4.2 Crops and Vegetable Harvesting
5 Commercialization and Current Challenges of Harvesting Robots
6 Conclusion
References
2 Drone-Based Weed Detection Architectures Using Deep Learning Algorithms and Real-Time Analytics
1 Introduction
2 Overview of UAVs, Artificial Intelligence and Spark Architecture
2.1 Evolution of UAVs
2.2 Applications of UAVs
2.3 UAVs in Agriculture
2.4 UAVs in Weed Detection and Management
2.5 Evolution of Distributed and Parallel Computing for Real-Time Performance
2.6 Spark Architecture
2.7 Spark Streaming Architecture
2.8 Artificial Neural Networks and Deep Learning
3 Proposed Architectures
3.1 Model 1—Conventional Server-Based Architecture
3.2 Model 2—Single-Tier UAV-Based Architecture
3.3 Model 3—Double-Tier UAV-Based Architecture
3.4 Model 4—Hybrid UAV Architecture
4 Conclusion
References
3 A Deep Learning-Based Detection System of Multi-class Crops and Orchards Using a UAV
1 Introduction
2 Related Work
3 Materials and Methods
3.1 Data Collection
3.2 Data Preprocessing and Data Enhancement
3.3 Optimized Faster-RCNN
3.4 Proposed Real-Time Framework Description
4 Results and Discussion
5 Conclusion
References
4 Real-Life Agricultural Data Retrieval for Large-Scale Annotation Flow Optimization
1 Introduction
2 Previous Work
3 Optimized Annotation Flow
4 Crop Identification
4.1 Data Preparation
4.2 Model and Hyperparameters
4.3 Results
5 Emergence Analysis
5.1 Data Preparation
5.2 Model Versions and Hyperparameters
5.3 Results
5.4 Feature Vectors Clustering
6 Conclusion
References
5 Design and Analysis of IoT-Based Modern Agriculture Monitoring System for Real-Time Data Collection
1 Introduction
2 Proposed System Model
2.1 IoT Sensor Nodes
2.2 Controllers and Processing Units
2.3 Communication Technologies
2.4 Cloud Storage and Local Database
2.5 Energy Solution
3 Experimental Setup
4 Simulation Results and Discussion
5 Conclusion
References
6 Estimation of Wheat Yield Based on Precipitation and Evapotranspiration Using Soft Computing Methods
1 Introduction
2 Materials and Methods
2.1 Crop Water Requirement
2.2 Description of the Methods
3 Results
4 Conclusions
References
7 Coconut Maturity Recognition Using Convolutional Neural Network
1 Introduction and Related Works
2 Development of Coconut Image Database
2.1 Image Acquisition and Preprocessing
2.2 Image Categorization
2.3 Ground Truth Image Creation
2.4 Image Augmentation
3 Materials and Methods
3.1 Convolutional Neural Networks
4 Experimental Results
5 Performance Evaluation Analyses
5.1 Coconut Maturity Stages Recognition
6 Conclusion
References
8 Agri-Food Products Quality Assessment Methods
1 Introduction
2 Techniques for Food Quality Analysis
3 Imaging Techniques for Quality Assessment
4 Spectroscopic Methods for Food Quality Analysis
5 Role of Machine Learning and Deep Learning in Food Quality Evaluation
6 Blockchain-Based Grading Mechanism
7 Conclusion
References
9 Medicinal Plant Recognition from Leaf Images Using Deep Learning
1 Introduction
2 Literature Review
3 Research Methodology
4 Data Collection and Preprocessing
5 Experimental Evaluation
6 Comparative Analysis of Results
7 Conclusion and Future Work
References
10 ESMO-based Plant Leaf Disease Identification: A Machine Learning Approach
1 Introduction
2 Methodology
2.1 Spider Monkey Optimization
2.2 Proposed ESMO: a Plant Disease Identification Approach
2.3 Exponential SMO
2.4 Feature Extraction
2.5 Feature Selection
2.6 Performance Comparison
2.7 Classifier
3 Results
4 Conclusions and Future Work
References
11 Deep Learning-Based Cauliflower Disease Classification
1 Introduction
2 Related Works
3 Materials and Methods
3.1 Dataset
3.2 Convolutional Neural Network (CNN)
3.3 State-of-the-Art CNN Architectures
3.4 Computational Environment
3.5 Training
4 Result Analysis
5 Conclusion
References
12 An Intelligent System for Crop Disease Identification and Dispersion Forecasting in Sri Lanka
1 Introduction
2 Background Study
3 Research Methodology
3.1 Solution Design
3.2 Selection of Neural Network Models
3.3 Data Gathering for the Image Pools
3.4 Implement the Neural and Transfer Learning Models
3.5 Identification of Disease Progression
3.6 Visualization of Disease Propagation
4 Results and Discussion
5 Conclusion
References
13 Apple Leaves Diseases Detection Using Deep Convolutional Neural Networks and Transfer Learning
1 Introduction
2 Literature Review
3 Proposed Methodology
3.1 Dataset
3.2 Data Preprocessing
3.3 Data Augmentation
3.4 Models and Methods
4 Experimental Results
4.1 Experimental Setup
4.2 Evaluation Metrics
4.3 Performance Evaluation
5 Conclusion
References
14 A Deep Learning Paradigm for Detection and Segmentation of Plant Leaves Diseases
1 Introduction
2 Related Work
3 Methodology
3.1 Proposed Object Detection Architecture-1
3.2 Proposed Instance Segmentation Architecture 2
4 Image Dataset Acquisition
5 Experimental Setup and Results Analysis
5.1 Performance Measures
5.2 Experimental Setup and Results Discussions
6 Conclusion and Future Work
References
15 Early Stage Prediction of Plant Leaf Diseases Using Deep Learning Models
1 Introduction
2 Literature Survey
3 Types of Plants Diseases
4 Preliminary Overview
4.1 Convolutional Neural Network
4.2 Support Vector Machines
4.3 Extreme Gradient Boosting (XGBoost)
4.4 Proposed Method
4.5 Multiple Feature Extraction
5 Result and Discussion
5.1 Dataset
5.2 Performance Evaluation
6 Conclusion
References
Algorithms for Intelligent Systems Series Editors: Jagdish Chand Bansal · Kusum Deep · Atulya K. Nagar
Mohammad Shorif Uddin Jagdish Chand Bansal Editors
Computer Vision and Machine Learning in Agriculture, Volume 2
Algorithms for Intelligent Systems
Series Editors
Jagdish Chand Bansal, Department of Mathematics, South Asian University, New Delhi, Delhi, India
Kusum Deep, Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Atulya K. Nagar, School of Mathematics, Computer Science and Engineering, Liverpool Hope University, Liverpool, UK
This book series publishes research on the analysis and development of algorithms for intelligent systems with their applications to various real world problems. It covers research related to autonomous agents, multi-agent systems, behavioral modeling, reinforcement learning, game theory, mechanism design, machine learning, meta-heuristic search, optimization, planning and scheduling, artificial neural networks, evolutionary computation, swarm intelligence and other algorithms for intelligent systems. The book series includes recent advancements, modification and applications of the artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, fuzzy system, autonomous and multi agent systems, machine learning and other intelligent systems related areas. The material will be beneficial for the graduate students, post-graduate students as well as the researchers who want a broader view of advances in algorithms for intelligent systems. The contents will also be useful to the researchers from other fields who have no knowledge of the power of intelligent systems, e.g. the researchers in the field of bioinformatics, biochemists, mechanical and chemical engineers, economists, musicians and medical practitioners. The series publishes monographs, edited volumes, advanced textbooks and selected proceedings. All books published in the series are submitted for consideration in Web of Science.
More information about this series at https://link.springer.com/bookseries/16171
Editors Mohammad Shorif Uddin Department of Computer Science and Engineering Jahangirnagar University Dhaka, Bangladesh
Jagdish Chand Bansal Department of Mathematics South Asian University New Delhi, India
ISSN 2524-7565 ISSN 2524-7573 (electronic) Algorithms for Intelligent Systems ISBN 978-981-16-9990-0 ISBN 978-981-16-9991-7 (eBook) https://doi.org/10.1007/978-981-16-9991-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
In recent years, computer vision, a non-contact and nondestructive technique involving the development of theoretical and algorithmic tools for automatic visual understanding and recognition, has found numerous applications in agricultural production. Applying machine learning techniques to computer vision algorithms is boosting this sector's productivity by enabling more precise systems. Computer vision and machine learning (CV-ML) help in plant disease assessment and crop condition monitoring, limiting the degradation of yield and quality and the resulting severe financial loss for farmers. Significant scientific and technological advances have been made in defect assessment, quality grading, and disease recognition of various agricultural plants, crops, leaves, and fruits. In addition, intelligent robots and drones built with CV-ML can help farmers perform tasks such as planting, weeding, harvesting, and plant health monitoring.
This book is written as an extension of our previous book (Computer Vision and Machine Learning in Agriculture, ISBN: 978-981-33-6424-0, https://doi.org/10.1007/978-981-33-6424-0) as Vol. 2 for academicians, researchers, and professionals interested in solving the problems of agricultural plants and products. The topics covered include plant, leaf, and fruit disease detection, crop health monitoring, applications of robots and drones in agriculture, precision farming, and assessment of product quality and defects. It contains 15 chapters. Chapter 1 highlights the automated harvesting of some common fruits and vegetables, such as tomatoes, apples, litchi, sweet peppers, and kiwifruit, with the help of robots. Chapter 2 focuses on developing drone-based real-time weed detection architectures using deep learning strategies. Chapter 3 describes a UAV-based crop and orchard monitoring framework using an optimized Faster-RCNN.
Chapter 4 concentrates on the performance evaluation of deep image retrieval approaches for semantic clustering of agricultural crop identification and crop emergence. Chapter 5 discusses an IoT-based real-time agricultural crop monitoring system through collecting and analyzing sensor data. Chapter 6 shows the performance evaluation of different soft computing methods for the estimation of wheat yield based on precipitation and evapotranspiration. Chapter 7 focuses on finding an appropriate deep learning strategy
for the autonomous recognition of coconut maturity. Chapter 8 presents a critical review of spectroscopic and imaging techniques, combined with deep learning, IoT, and blockchain, for food quality analysis. Chapter 9 deals with the development of leaf data sets for 11 medicinal plants and investigates diverse deep learning models to find an optimum one. Chapter 10 presents a machine learning-based methodology for disease detection of rice and cotton leaves through optimized features using exponential spider monkey optimization (ESMO). Chapter 11 investigates different deep learning strategies to find the optimum one for the detection of four common diseases in cauliflower. Chapter 12 presents a fungal disease identification and dispersion forecasting system for potato, tomato, and bean plants using multiple images. Chapter 13 presents the detection and identification of five common apple leaf diseases using deep learning strategies. Chapter 14 reports the development of a multitasking automated plant leaf disease detection and segmentation framework based on EfficientDet and Mask RCNN deep learning models. Chapter 15 investigates early-stage prediction of plant leaf diseases using deep learning models such as CNN, CNN-SVM, and XGBoost with CNN-SVM to improve classification accuracy.
We hope the topics covered in the current volume, along with the previous volume, will serve as comprehensive literature for both beginners and experienced readers, including researchers, academicians, and students, who wish to explore the applications of computer vision and machine learning systems in the agricultural sector for boosting production. We sincerely appreciate the time, effort, and contribution of the authors and esteemed reviewers in maintaining the quality of the papers. Special thanks to the supporting team of Springer for helping in publishing this book.
Mohammad Shorif Uddin, Dhaka, Bangladesh
Jagdish Chand Bansal, New Delhi, India
Contents
1 Harvesting Robots for Smart Agriculture
  Sk. Fahmida Islam, Mohammad Shorif Uddin, and Jagdish Chand Bansal
2 Drone-Based Weed Detection Architectures Using Deep Learning Algorithms and Real-Time Analytics
  Y. Beeharry and V. Bassoo
3 A Deep Learning-Based Detection System of Multi-class Crops and Orchards Using a UAV
  Shahbaz Khan, Muhammad Tufail, Muhammad Tahir Khan, and Zubair Ahmad Khan
4 Real-Life Agricultural Data Retrieval for Large-Scale Annotation Flow Optimization
  Hiba Najjar, Priyamvada Shankar, Jonatan Aponte, and Marek Schikora
5 Design and Analysis of IoT-Based Modern Agriculture Monitoring System for Real-Time Data Collection
  Bekele M. Zerihun, Thomas O. Olwal, and Murad R. Hassen
6 Estimation of Wheat Yield Based on Precipitation and Evapotranspiration Using Soft Computing Methods
  Abdüsselam Altunkaynak and Eyyup Ensar Başakın
7 Coconut Maturity Recognition Using Convolutional Neural Network
  Parvathi Subramanian and Tamil Selvi Sankar
8 Agri-Food Products Quality Assessment Methods
  Sowmya Natarajan and Vijayakumar Ponnusamy
9 Medicinal Plant Recognition from Leaf Images Using Deep Learning
  Md. Ariful Hassan, Md. Sydul Islam, Md. Mehedi Hasan, Sumaita Binte Shorif, Md. Tarek Habib, and Mohammad Shorif Uddin
10 ESMO-based Plant Leaf Disease Identification: A Machine Learning Approach
  H. K. Jayaramu, Dharavath Ramesh, and Sonal Jain
11 Deep Learning-Based Cauliflower Disease Classification
  Md. Abdul Malek, Sanjida Sultana Reya, Nusrat Zahan, Md. Zahid Hasan, and Mohammad Shorif Uddin
12 An Intelligent System for Crop Disease Identification and Dispersion Forecasting in Sri Lanka
  Janaka L. Wijekoon, Dasuni Nawinna, Erandika Gamage, Yashodha Samarawickrama, Ryan Miriyagalla, Dharatha Rathnaweera, and Lashan Liyanage
13 Apple Leaves Diseases Detection Using Deep Convolutional Neural Networks and Transfer Learning
  Shivam Kejriwal, Devika Patadia, and Vinaya Sawant
14 A Deep Learning Paradigm for Detection and Segmentation of Plant Leaves Diseases
  R. Kavitha Lakshmi and Nickolas Savarimuthu
15 Early Stage Prediction of Plant Leaf Diseases Using Deep Learning Models
  N. Rajathi and P. Parameswari
Editors and Contributors
About the Editors
Prof. Mohammad Shorif Uddin completed his Ph.D. at Kyoto Institute of Technology, Japan, in 2002, his Master of Technology Education at Shiga University, Japan, in 1999, his Bachelor of Electrical and Electronic Engineering at Bangladesh University of Engineering and Technology (BUET) in 1991, and his MBA at Jahangirnagar University in 2013. He began his teaching career as a Lecturer in 1991 at Chittagong University of Engineering and Technology (CUET). In 1992, he joined the Computer Science and Engineering Department of Jahangirnagar University, where he is at present a Professor. He is also the Teacher-in-Charge of the ICT Cell of Jahangirnagar University. He served as the Chairman of the Computer Science and Engineering Department of Jahangirnagar University from June 2014 to June 2017 and as an Adviser of ULAB from September 2009 to October 2020. He undertook postdoctoral research at Bioinformatics Institute, Singapore; Toyota Technological Institute, Japan; Kyoto Institute of Technology, Japan; Chiba University, Japan; Bonn University, Germany; and Institute of Automation, Chinese Academy of Sciences, China. His research is motivated by applications in the fields of artificial intelligence, imaging informatics, and computer vision. He holds two patents for his scientific inventions and has published more than 160 research papers in international journals and conference proceedings. He has delivered a remarkable number of keynotes and invited talks and has acted as General Chair, TPC Chair, or Co-Chair of many international conferences, such as IJCACI 2021, IJCACI 2020, ICAEM 2019, ICIMSAT 2019, IJCCI 2019, IJCCI 2018, and IWCI 2016. He received the Best Paper Award at the International Conference on Informatics, Electronics & Vision (ICIEV2013), Dhaka, Bangladesh, and the Best Presenter Award from the International Conference on Computer Vision and Graphics (ICCVG 2004), Warsaw, Poland. He was the Coach of the Jahangirnagar University ACM ICPC World Finals Teams in 2015 and 2017 and has supervised a good number of doctoral and Master's theses. He is a Senior Member of IEEE and an Associate Editor of IEEE Access.
Dr. Jagdish Chand Bansal is an Associate Professor at South Asian University, New Delhi, and Visiting Faculty at Maths and Computer Science, Liverpool Hope University, UK. Dr. Bansal obtained his Ph.D. in Mathematics from IIT Roorkee. Before joining SAU New Delhi, he worked as an Assistant Professor at ABV-Indian Institute of Information Technology and Management Gwalior and at BITS Pilani. His primary area of interest is swarm intelligence and nature-inspired optimization techniques. Recently, he proposed a fission-fusion social structure-based optimization algorithm, Spider Monkey Optimization (SMO), which is being applied to various problems in the engineering domain. He has published more than 70 research papers in various international journals and conferences. He is the Editor-in-Chief of the journal MethodsX, published by Elsevier. He is the series editor of the book series Algorithms for Intelligent Systems (AIS) and Studies in Autonomic, Data-driven and Industrial Computing (SADIC), published by Springer. He is the Editor-in-Chief of the International Journal of Swarm Intelligence (IJSI), published by Inderscience. He is also an Associate Editor of Array, published by Elsevier. He is the General Secretary of the Soft Computing Research Society (SCRS). He also received Gold Medals at the UG and PG levels.
Contributors
Md. Abdul Malek, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Abdüsselam Altunkaynak, Faculty of Civil Engineering, Hydraulics and Water Resource Engineering Division, Istanbul Technical University, Istanbul, Turkey
Jonatan Aponte, BASF Digital Farming GmbH, Cologne, Germany
Md. Ariful Hassan, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Jagdish Chand Bansal, Department of Mathematics, South Asian University, New Delhi, India
V. Bassoo, Department of Electrical and Electronic Engineering, Faculty of Engineering, University of Mauritius, Réduit, Mauritius
Eyyup Ensar Başakın, Faculty of Civil Engineering, Hydraulics and Water Resource Engineering Division, Istanbul Technical University, Istanbul, Turkey
Y. Beeharry, Department of Electrical and Electronic Engineering, Faculty of Engineering, University of Mauritius, Réduit, Mauritius
Sk. Fahmida Islam, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh
Erandika Gamage, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Murad R. Hassen, School of Electrical and Computer Engineering, Addis Ababa Institute of Technology, Addis Ababa University, Addis Ababa, Ethiopia
Sonal Jain, Indian Institute of Technology (Indian School of Mines), Dhanbad, India
H. K. Jayaramu, Indian Institute of Technology (Indian School of Mines), Dhanbad, India
R. Kavitha Lakshmi, Department of Computer Applications, NIT Trichy, Trichy, India
Shivam Kejriwal, Information Technology, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India
Muhammad Tahir Khan, Department of Mechatronics Engineering, University of Engineering and Technology Peshawar, Peshawar, Pakistan; Advanced Robotics and Automation Laboratory, National Center of Robotics and Automation (NCRA), Peshawar, Pakistan
Shahbaz Khan, Department of Mechatronics Engineering, University of Engineering and Technology Peshawar, Peshawar, Pakistan
Zubair Ahmad Khan, Department of Mechatronics Engineering, University of Engineering and Technology Peshawar, Peshawar, Pakistan
Lashan Liyanage, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Md. Mehedi Hasan, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Ryan Miriyagalla, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Hiba Najjar, MINES NANCY, Campus Artem, Nancy, France
Sowmya Natarajan, SRM Institute of Science and Technology, Kattankulathur, India
Dasuni Nawinna, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Thomas O. Olwal, Department of Electrical Engineering, Facility of Engineering and Built Environment, Tshwane University of Technology, Pretoria, South Africa
P. Parameswari, MCA Department, Kumaraguru College of Technology, Coimbatore, India
Devika Patadia, Information Technology, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India
Vijayakumar Ponnusamy, SRM Institute of Science and Technology, Kattankulathur, India
N. Rajathi, Information Technology Department, Kumaraguru College of Technology, Coimbatore, India
Dharavath Ramesh, Indian Institute of Technology (Indian School of Mines), Dhanbad, India
Dharatha Rathnaweera, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Sanjida Sultana Reya, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Yashodha Samarawickrama, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Tamil Selvi Sankar, Department of ECE, National Engineering College, Kovilpatti, Tamil Nadu, India
Nickolas Savarimuthu, Department of Computer Applications, NIT Trichy, Trichy, India
Vinaya Sawant, Information Technology, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India
Marek Schikora, BASF Digital Farming GmbH, Cologne, Germany
Priyamvada Shankar, BASF Digital Farming GmbH, Cologne, Germany
Sumaita Binte Shorif, Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
Parvathi Subramanian, Department of ECE, National Engineering College, Kovilpatti, Tamil Nadu, India
Md. Sydul Islam, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Md. Tarek Habib, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Muhammad Tufail, Department of Mechatronics Engineering, University of Engineering and Technology Peshawar, Peshawar, Pakistan; Advanced Robotics and Automation Laboratory, National Center of Robotics and Automation (NCRA), Peshawar, Pakistan
Mohammad Shorif Uddin, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh
Janaka L. Wijekoon, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka
Nusrat Zahan, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Md. Zahid Hasan, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Bekele M. Zerihun, School of Electrical and Computer Engineering, Addis Ababa Institute of Technology, Addis Ababa University, Addis Ababa, Ethiopia
Chapter 1
Harvesting Robots for Smart Agriculture
Sk. Fahmida Islam, Mohammad Shorif Uddin, and Jagdish Chand Bansal
Sk. Fahmida Islam (corresponding author) · M. S. Uddin: Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh
J. C. Bansal: Department of Mathematics, South Asian University, New Delhi, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. In: M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_1

1 Introduction
Even in the modern day, crises such as food shortages continue to occur. The food security crisis affected 183 million people in 2019, spread across nearly 50 countries around the globe [1]. Stressors such as local strife, harsh weather, and a restricted amount of land available for cultivating food can drive this number higher. Due to a lack of production, food supplies are becoming increasingly scarce, resulting in a lack of nutrients and energy for people. Because of labor shortages and/or high wages during the harvest season, farming methods are not sufficient to achieve optimal productivity [2]. Precision agriculture, if used properly, has the potential to maximize output while improving product quality. Additionally, this technique makes use of cutting-edge technology to improve efficiency while conserving resources in an environmentally responsible manner [3]. Precision agriculture has a positive impact on operating costs because it eliminates manual work and human error. As a result, quality and production improve, and crop waste is reduced [4, 5]. Interest in robotic agriculture systems has increased significantly, which has led to the development of more adaptable agricultural vehicles. Agricultural robots combine the latest technologies in communication, sensor systems, global positioning systems, and geo-information. Researchers apply this technology to create
new autonomous vehicles to achieve better crops. Agricultural robots facilitate the work of farmers and increase the productivity and quality of agricultural products. On-farm mobile robots are used to reach a desired position or location without damaging crops, for example when applying pesticide or fertilizer, picking crops, or planting, while keeping the desired accuracy in the amount of material sprayed or cut [6]. Furthermore, the application of artificial intelligence has furthered the automation of agricultural robots. Several studies, such as those on citrus picking robots [7], tomato collection robots [8], and strawberry grasping robots [9], have demonstrated the feasibility of developing autonomous agricultural robots. In horticulture, the use of fruit harvesting robots reduces labor costs and harvesting electricity costs and increases efficiency [10, 11]. Manual fruit harvesting requires a lot of time and effort and is not very cost-effective; the tedious manual labor it demands drives up expenses. Hence, mechanization appears to be the efficient solution, because it is the only way to reduce harvesting labor costs, allowing growers to remain competitive in the future and even expand markets [12]. Picking robots have been introduced to deal with labor shortages and to provide a financially viable response to rapidly rising labor costs [13–16]. Numerous review articles have been published on fruit harvesting. However, they do not address cost, performance, and technical details concurrently; they cover these aspects only in part. The contributions of this chapter are as follows:
• We analyze various fruit and vegetable (tomato, apple, litchi, sweet pepper, and kiwifruit) harvesting robots in detail, considering their cost, performance, and technical specifications.
• Special consideration is given to the benefits and drawbacks of common fruit and vegetable harvesting robots.
• The challenges and probable future trends of harvesting robots are discussed.
The chapter is organized as follows. This first section provides the introduction along with the contributions. Section 2 discusses the related literature. The basics of harvesting robots are presented in Sect. 3. Section 4 describes current harvesting robots for some common fruits and vegetables, such as tomato, apple, litchi, sweet pepper, and kiwifruit, and examines their performance, cost, and technical aspects. The commercialization and challenges of existing harvesting robots are pointed out in Sect. 5. Finally, Sect. 6 concludes this chapter.
2 Related Literature
Diverse work has been conducted on harvesting robots, which are being used to harvest different kinds of fruits and vegetables. The relevant literature is reviewed below.
Citrus fruit harvesting robots were first proposed by Schertz and Brown [17, 18] in the 1960s. Since then, a wide range of research on fruit and vegetable harvesting robot technology has been carried out. Davidson et al. [19] reported an apple harvesting robot prototype with two robotic arms. GRAPE [20] is a ground-based autonomous robot with a robotic arm for monitoring plant health through a simple user interface. It demonstrated that a robot on the ground can monitor the health of plants, track their progress through the vineyard, and apply micronutrients to the grapes. Early multi-purpose agricultural robots [21] included harvesting, berry thinning, spraying, and bagging functions. The system had a manipulator, a visual sensor, a mobile device, and interchangeable end-effectors. A cost-effective robot for crop monitoring tasks in mountain vineyards was presented in [22]. A newer version of the robot [23] can traverse steep-slope vineyards and carry out monitoring and harvesting duties. Semiautonomous spraying robot designs were discussed in [24]. For automated pruning of grapevines, a stereo vision system extracts each tree's three-dimensional (3D) model, and a robotic arm does the trimming [25]. Rahmadian and Widyartono [26] explained two important components in the development of autonomous harvesting. The first is machine vision for detecting the crops and guiding the robot through the field, and the second is an actuator to grab or pick the crops or fruits. They also showed [27] that agricultural robots can be developed with sensors and actuators and guided by GPS and vision-based navigation systems. These robots can be used in soil analysis systems to inform farmers about the suitability of the land for producing certain plants and crops. The design of a dual-arm harvesting robot is addressed in [28].
The goal of this design was to create a modular torso that could be customized for different kinds of plants, allowing users to adapt it to their work and maximize harvesting efficiency. An apple harvesting robot and its harvesting principle were briefly introduced in [29]. Reference [30] summarizes the current state of robotic grippers and sensor-based control approaches, as well as their use in agricultural robots that handle brittle and easily damaged products. In [31], agricultural robots are examined for cotton harvesting; the review covers the status and progress of cotton harvesting robot research and the challenges in the commercial deployment of agricultural robotic systems. A robot named SWEEPER [32] was developed, tested, and validated for harvesting sweet pepper fruit in greenhouses. Reference [33] discusses the use of harvesting robots and vision technologies in fruit picking for localization and target identification. The evolution of robotic harvesting end-effectors for fruit and vegetable harvesting is discussed in [34], with a focus on harvesting apples, tomatoes, sweet peppers, and cucumbers using robotic end-effectors. Mechanical harvesting has yet to catch up with human pickers in harvesting speed, success rate, and cost, so substantial research effort is needed for further development. Using the principles of 3D sensing, manipulation, and an end-effector, the authors in [35] propose an efficient tomato harvesting robot. In this robot, deep learning is used to detect tomatoes, and the detections are then used to extract the 3D coordinates of the target tomato. A review of current agricultural automation and robotics applications is given in [36], which lists difficulties and recommendations for building
an effective autonomous robotic system for agricultural applications. Kootstra et al. [37] reviewed the current literature on selective harvesting robots in three distinct production settings (greenhouse, orchard, and open field). Although harvesting robots are needed in the agricultural sector for higher productivity, there is no in-depth literature that analyzes harvesting robots for various fruits and vegetables, such as apples, sweet peppers, tomatoes, kiwifruit, and litchi, in terms of their cost, performance, and technical specifications. The current study tries to fill this gap.
Fig. 1 General block diagram of a typical harvesting robot
3 Basics of Harvesting Robots
Harvesting robots have been introduced to replace traditional manual harvesting, increasing production and improving quality with less effort. Figure 1 shows the typical block diagram of a harvesting robot, and Table 1 lists the required hardware, sensors, and software. Harvesting robots can work in both outdoor (open field) and indoor (greenhouse) environments. They are classified according to their harvesting tasks, as shown in Table 2.
4 Robots in Harvesting Applications
Over the last 20 years, robots have performed a variety of tasks, such as autonomous path navigation, harvesting, and grafting. These are discussed below.
Table 1 System requirements of a harvesting robot
Module | Specification
Hardware | 6-DoF (degrees of freedom) manipulator; custom end-effector; embedded board; RGB-D camera; gripper; calibration board; oscillation board; robot arm
System | Dual-core NVIDIA/quad-core ARM-based CPU; LPDDR4 memory
System software | JetPack SDK; Ubuntu and CUDA; ROS (Robot Operating System); motion control software
Sensors | Temperature and humidity sensor; temperature sensor; rainfall sensor; wind sensor; orientation sensor; ultrasonic sensor; LIDAR (light detection and ranging); NIR (near infrared); LRF (laser range finder); GDS (geometric direction sensor); FOG (fiber optic gyroscope); IMU (inertial measurement unit)

Table 2 Classification of harvesting robots
Harvesting robot | Function
Picking robot | Identifies and collects ripe fruits and vegetables
Grafting robot | Normally used for grafting
Sorting robot | Classifies fruits and vegetables
4.1 Harvesting Robots in Path Navigation
Automatic path navigation is the most challenging task in high-value crop fields. Localization, mapping, tracking, and path planning are needed for path navigation [38, 39]. Robots usually find the path from pictures of the track (taken with an RGB-D camera) by detecting the color of objects. For example, if the color of a fruit is green, it is identified as immature; otherwise, it is identified as ripe. Research on automatic path navigation is growing steadily. Different path navigation mechanisms [38] are described in items (a)–(c) below.
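The color rule mentioned above can be sketched in a few lines. This is only an illustration and not the method of any cited robot: the function name `classify_ripeness`, the green hue band (75 to 165 degrees), and the saturation floor are assumptions chosen for the example, and a real system would calibrate them for the crop and the lighting.

```python
import colorsys


def classify_ripeness(rgb, green_hue_range=(75.0, 165.0), min_saturation=0.25):
    """Label a mean fruit color as 'immature' (green) or 'ripe' (anything else).

    rgb is an (R, G, B) tuple with channels in 0..255. The hue band and the
    saturation floor are illustrative assumptions, not calibrated values.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    hue_degrees = hue * 360.0
    low, high = green_hue_range
    # A pixel counts as green only if it is inside the hue band and not washed out.
    is_green = low <= hue_degrees <= high and saturation >= min_saturation
    return "immature" if is_green else "ripe"


print(classify_ripeness((40, 180, 60)))   # a saturated green sample
print(classify_ripeness((220, 40, 30)))   # a saturated red sample
```

With these assumed thresholds, the green sample is labelled immature and the red sample ripe. Note that the rule as stated labels every non-green color as ripe, including grey background pixels, so in practice it would only be applied to regions already segmented as fruit.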
(a) GPS-based navigation
He et al. [40] use the intersection points of fruit trees and the ground as feature points for automatic path navigation. These feature points need to be detected quickly and accurately. The robot's navigation path is then generated by extracting the center points of the junction line. In [41], an autonomous navigation algorithm using visual cues and fuzzy control is proposed for wolfberry orchards. The authors proposed a new weighting for converting a color image into a grayscale image to better identify wolfberry trunks; the minimum positive circumscribed rectangle is used to describe the contours. The least squares method is then used to fit the navigation line to the outline points, and a region of interest (ROI) is computed, which improves the real-time accuracy of the system.
(b) Vision-based navigation
In [42], Wang et al. proposed a binocular stereo vision system based on disparity estimation for fruit picking robots. The target fruit recognition algorithm was implemented in Python using the OpenCV library. Color segmentation was used in this study to identify mature target fruits.
(c) Computational method-based navigation
The computational algorithms described below are used to extract essential information in autonomous navigation systems.
(i) Kalman filter
Blok et al. [43] discussed the applicability of two probabilistic localization algorithms. The first is a Kalman filter (KF) and the second is a particle filter (PF), evaluated along with a line detection algorithm. They concluded that the PF with the laser beam model outperformed the line-based KF in navigation accuracy and was also more robust.
(ii) Fuzzy logic
Hagras et al. [44] explained a patented fuzzy genetic system for online learning and adaptation of outdoor agricultural robots. The online self-learning technique makes the system user-friendly. Its most significant feature is that the robot continuously adapts itself to dynamic environments.
(iii) Neural network and genetic algorithm
Binary particle swarm optimization and a genetic algorithm (GA) [45] were used to solve the routing problem of an agricultural mobile robot.
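The least squares step mentioned in item (a) for the wolfberry robot [41] can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fit_navigation_line` is hypothetical, the input is assumed to be a list of (x, y) pixel coordinates of outline points, and the line is parameterized as x = slope * y + intercept on the assumption that a navigation line between crop rows is close to vertical in the image.

```python
def fit_navigation_line(points):
    """Ordinary least-squares fit of x = slope * y + intercept.

    points is a list of (x, y) pixel coordinates. Regressing x on y keeps the
    fit well defined for near-vertical lines (an assumption of this sketch).
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations in y and cross-deviations of x and y.
    s_yy = sum((y - mean_y) ** 2 for _, y in points)
    if s_yy == 0:
        raise ValueError("points must span more than one image row")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = s_xy / s_yy
    intercept = mean_x - slope * mean_y
    return slope, intercept


# Hypothetical outline points lying exactly on x = 2 * y + 1.
slope, intercept = fit_navigation_line([(1, 0), (3, 1), (5, 2), (7, 3)])
print(slope, intercept)
```

For the sample points, which lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With noisy outline points the same formulas return the line that minimizes the squared horizontal error.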
4.2 Crops and Vegetable Harvesting
Robots are increasingly used to harvest many fruits and vegetables. Here, we describe robots used for some common fruits and vegetables: tomato, apple, litchi, sweet pepper, and kiwifruit.
(a) Tomato harvesting robot
Jun et al. [35] applied the principles of 3D perception and manipulation in designing a tomato harvesting robot with a scissor-type end-effector. They used the deep learning-based YOLO (You Only Look Once) method to achieve high detection accuracy at high speed. Figure 2 shows such a robot in the field.
Fig. 2 Tomato harvesting robot [46]
(b) Apple harvesting robot
De-An et al. [47] designed a robot consisting of a manipulator, an end-effector, and image-based visual servo control. The manipulator has a 5-DoF structure. The spoon-shaped end-effector with a pneumatically actuated gripper was designed to meet the requirements of apple harvesting. The robot performed its harvesting tasks autonomously using a vision-based module. The control system includes an industrial computer and an AC servo driver. The success rate of apple harvesting is 77%, and the average harvesting time is approximately 15 s per apple. Figure 3 shows such a robot [48].
Fig. 3 Apple harvesting robot [48]
(c) Litchi harvesting robot
Li et al. [49] designed a vision-based harvesting robot using a Kinect V2, a low-cost RGB-D camera. The camera was placed 500–800 mm away from the litchi trees so that its field of view covered the picking robot's reachable area. The picking system also included a 6-DoF manipulator with a liftable walking platform. DeepLab V3 semantic image segmentation was used in this robot. Figure 4 shows such a harvesting robot in the field; its experimental accuracy was found to be 83.33%.
Fig. 4 Litchi harvesting robot [49]
(d) Sweet pepper harvesting robot
Lehnert et al. [50] designed an autonomous sweet pepper (capsicum) harvester whose operation includes five distinct steps: sweet pepper segmentation, peduncle segmentation, grasp selection, attachment, and detachment. The robot arm first moves to a long-range perspective, where an eye-in-hand RGB-D camera captures a 3D color image of the entire scene. Peduncle segmentation performance is improved by locating a sweet pepper target from a distance and then moving the camera to a close-range perspective on that sweet pepper. Figure 5 shows such a harvesting robot in the field. It exhibited an accuracy of 76%.
Fig. 5 Sweet pepper (capsicum) harvesting robot [51]
(e) Kiwifruit harvesting robot
Mu et al. [52] designed a kiwifruit picking robot consisting of five components: an end-effector, a Cartesian coordinate manipulator, a programmable logic controller with a human–machine interface, an electric vehicle, and a machine vision device. A Kinect sensor installed on the harvesting robot's rack allows the machine vision system to recognize the kiwifruit. Figure 6 shows such a harvesting robot in the field. This robot has two bionic fingers and performs grabbing, picking, and unloading. It exhibited an accuracy of 94.2%.
Fig. 6 Kiwifruit harvesting robot [53]
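The apple figures quoted in item (b) can be turned into an hourly throughput with simple arithmetic. The sketch below is a back-of-the-envelope illustration only: the function name is hypothetical, and it assumes that the reported 15 s applies to every picking attempt (successful or not) and that the robot picks continuously, neither of which is stated in the source.

```python
def successful_picks_per_hour(seconds_per_attempt, success_rate):
    """Expected number of fruits harvested in one hour of continuous picking.

    Assumes every attempt takes the same time and succeeds independently
    with probability success_rate (simplifying assumptions of this sketch).
    """
    if seconds_per_attempt <= 0:
        raise ValueError("seconds_per_attempt must be positive")
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    attempts_per_hour = 3600.0 / seconds_per_attempt
    return attempts_per_hour * success_rate


# Figures reported for the apple harvesting robot [47]: 15 s per apple, 77% success.
apple_rate = successful_picks_per_hour(seconds_per_attempt=15.0, success_rate=0.77)
print(f"{apple_rate:.1f} apples per hour")
```

Under these assumptions the robot would harvest roughly 185 apples per hour (240 attempts, of which 77% succeed). The same function could be applied to the other robots only if their cycle times were known; the source reports accuracy but not cycle time for them.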
5 Commercialization and Current Challenges of Harvesting Robots
Some research difficulties may be underestimated or missed because of the scarcity of in-depth field data. Feedback from end-effectors, visual processing efficiency, and economic aspects such as the primary income and expenditure are critical issues for the effective use of harvesting robots. A benchmark for commercial viability can be determined by comparing a robotic system's annual cost, efficiency, harvest success rate, and damage rate with those of a human picker. For harvesting robots to be widely used, when picking the same number of fruits, the overall yearly robot cost should not exceed the annual cost of human labor, and the robot's daily harvest should not be less than that of a human picker. Many companies are developing harvesting robots. The expected cost of a single apple harvesting robot ranges from $300,000 to $350,000. The estimated cost of a sweet pepper harvesting robot ranges from approximately 75,000 to 100,000 euros, and that of a single kiwifruit harvesting robot from 100,000 to 150,000 euros. Agrobot developed a strawberry harvesting robot whose approximate price is $250,000 per robot [54–57]. These prices are high. We hope that the widespread use of harvesting robots will help reduce unit prices.
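The two viability conditions stated above (yearly robot cost no greater than yearly human labor cost, and daily harvest no less than a human's) can be written as a simple check. The sketch below is illustrative only: both function names are hypothetical, the straight-line spreading of the purchase price over a service life is an assumption of this example, and every number except the $300,000 purchase price is made up for demonstration.

```python
def annualized_cost(purchase_price, service_life_years, yearly_running_cost):
    """Straight-line yearly cost of a robot (an assumed, simplified cost model)."""
    if service_life_years <= 0:
        raise ValueError("service_life_years must be positive")
    return purchase_price / service_life_years + yearly_running_cost


def is_commercially_viable(robot_yearly_cost, human_yearly_cost,
                           robot_daily_picks, human_daily_picks):
    """Apply both conditions from the text: cost not higher, harvest not lower."""
    cost_ok = robot_yearly_cost <= human_yearly_cost
    throughput_ok = robot_daily_picks >= human_daily_picks
    return cost_ok and throughput_ok


# $300,000 is the lower apple-robot price quoted in the text; the 10-year life,
# running cost, labor cost, and daily pick counts are hypothetical.
robot_cost = annualized_cost(300_000, service_life_years=10, yearly_running_cost=15_000)
print(robot_cost)
print(is_commercially_viable(robot_cost, human_yearly_cost=40_000,
                             robot_daily_picks=4_000, human_daily_picks=5_000))
```

With these made-up inputs the yearly robot cost is 45,000 and the check fails on both conditions. The point of the sketch is only that viability depends jointly on cost and throughput, so a lower purchase price alone may not be enough.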
6 Conclusion
This chapter described successful uses of harvesting robots for some common fruits and vegetables: tomato, apple, litchi, sweet pepper, and kiwifruit. The basic design of these harvesting robots and the challenges of commercialization were explained. The study finds that the current harvest success rate, harvesting speed, and damage rate are not yet satisfactory. More research and in-depth experiments are therefore essential to persuade farmers to adopt harvesting robots widely. Reliable software and algorithms are needed to increase the success rate and efficiency of harvesting robots. To pick fruits of various sizes and shapes, the robots' mechanical structure must be improved, for example, by adding an adaptable, flexible manipulator and end-effector.
References
1. FAO global report on food crisis 2020. Online source: https://www.fao.org/emergencies/resources/documents/resources-detail/en/c/1272014/. Accessed 9 July 2020
2. Yadav V, Rathod R, Nair J (2015) Big data meets small sensors in precision agriculture. Int J Comput Appl 1–4
3. Bechar A (2021) Agricultural robotics for precision agriculture tasks: concepts and principles. In: Progress in precision agriculture, pp 17–30
4. Bongiovanni R, Lowenberg-Deboer J (2004) Precision agriculture and sustainability. Precis Agric 5(4):359–387
5. Sishodia RP, Ray RL, Singh SK (2020) Applications of remote sensing in precision agriculture: a review. Remote Sens 12(19):3136
6. Aishwarya BV, Archana G, Umayal C (2015) Agriculture robotic vehicle-based pesticide sprayer with efficiency optimization. In: IEEE technological innovation in ICT for agriculture and rural development (TIAR), July 2015
7. Aloisio C, Mishra RK, Chang C, English J (2012) Next-generation image-guided citrus fruit picker. In: IEEE international conference on technologies for practical robot applications (TePRA), pp 37–41. https://doi.org/10.1109/TePRA.2012.6215651
8. Jun W, Zhou Z, Xiaodong D (2012) Design and co-simulation for tomato harvesting robots. In: Proceedings of the 31st Chinese control conference, pp 5105–5108
9. Qingchun F, Wengang Z, Quan Q, Kai J, Rui G (2012) Study on strawberry robotic harvesting system. In: IEEE international conference on computer science and automation engineering (CSAE), May 2012
10. Bachche S (2015) Deliberation on design strategies of automatic harvesting systems: a survey. Robotics 4(2):194–222
11. Zhang T, Huang Z, You W, Lin J, Tang X, Huang H (2019) An autonomous fruit and vegetable harvester with a low-cost gripper using a 3D sensor. Sensors 20(1):93
12. Charlton D, Castillo M (2020) Potential impacts of a pandemic on the US farm labor market. Appl Econ Perspect Policy 43(1):39–57
13. Bac CW, van Henten EJ, Hemming J, Edan Y (2014) Harvesting robots for high-value crops: state-of-the-art review and challenges ahead. J Field Robot 31(6):888–911
14. Zhao Y, Gong L, Huang Y, Liu C (2016) A review of key techniques of vision-based control for harvesting robot. Comput Electron Agric 127:311–323
15. Jia B, Zhu A, Yang SX, Mittal GS (2009) Integrated gripper and cutter in a mobile robotic system for harvesting greenhouse products. In: IEEE international conference on robotics and biomimetics (ROBIO), pp 1778–1783. https://doi.org/10.1109/ROBIO.2009.5420430
16. Megalingam RK, Vignesh N, Sivanantham V, Elamon N, Sharathkumar MS, Rajith V (2016) Low-cost robotic arm design for pruning and fruit harvesting in developing nations. In: International conference on intelligent systems and control (ISCO), pp 1–5. https://doi.org/10.1109/ISCO.2016.7727016
17. Schertz CE, Brown GK (1966) Determining fruit-bearing zones in citrus. Trans ASAE 9(3):0366–0368
18. Schertz CE, Brown GK (1968) Basic considerations in mechanizing citrus harvest. Trans ASAE 11(3):0343–0346
19. Davidson JR, Hohimer CJ, Mo C, Karkee M (2017) Dual robot coordination for apple harvesting, Spokane, 16–19 July 2017
20. Roure F, Moreno G, Soler M, Faconti D, Serrano D, Astolfi P, Bardaro G, Gabrielli A, Bascetta L, Matteucci M (2017) GRAPE: ground robot for vineyard monitoring and protection. In: Advances in intelligent systems and computing, Nov 2017, pp 249–260
21. Fountas S, Mylonas N, Malounas I, Rodias E, Hellmann Santos C, Pekkeriet E (2020) Agricultural robotics for field operations. Sensors 20(9):2672
22. Neves Dos Santos F, Sobreira HMP, Campos DFB, Morais R, Moreira APGM, Contente OMS (2015) Towards a reliable monitoring robot for mountain vineyards. In: 2015 IEEE international conference on autonomous robot systems and competitions, pp 37–43. https://doi.org/10.1109/ICARSC.2015.21
23. Neves Dos Santos F, Sobreira HMP, Campos DFB, Morais R, Moreira APGM, Contente OMS (2016) Towards a reliable robot for steep slope vineyards monitoring. J Intell Robot Syst 83:429–444. https://doi.org/10.1007/s10846-016-0340-5
24. Adamites G, Katsanos C, Constantinou G, Xenos M, Hadzilacos T, Edan Y (2017) Design and development of a semi-autonomous agricultural vineyard sprayer: human-robot interaction aspects. J Field Robot 34(5)
25. Botterill T, Paulin S, Green R, Williams S, Lin J, Saxton V, Mills S, Chen XQ, Corbett-Davies S (2017) A robot system for pruning grape vines. J Field Robot 34(6). https://doi.org/10.1002/rob.21680
26. Rahmadian R, Widyartono M (2019) Harvesting system for autonomous robotic in agriculture: a review. Indones J Electr Eletron Eng 2(1):1
27. Rahmadian R, Widyartono M (2020) Autonomous robotic in agriculture: a review. In: 2020 third international conference on vocational education and electrical engineering (ICVEE), Oct 2020
28. Sepúlveda D, Fernández R, Navas E, Armada M, González-De-Santos P (2020) Robotic aubergine harvesting using dual-arm manipulation. IEEE Access 8:121889–121904. https://doi.org/10.1109/ACCESS.2020.3006919
29. Jia W, Zhang Y, Lian J, Zheng Y, Zhao D, Li C (2020) Apple harvesting robot under information technology: a review. Int J Adv Robot Syst. https://doi.org/10.1177/1729881420925310
30. Zhang B, Xie Y, Zhou J, Wang K, Zhang Z (2020) State-of-the-art robotic grippers, grasping and control strategies, as well as their applications in agricultural robots: a review. Comput Electron Agric 177:105694
31. Fue K, Porter W, Barnes E, Rains G (2020) An extensive review of mobile agricultural robotics for field operations: focus on cotton harvesting. Agri Eng 2(1):150–174
32. Arad B, Balendonck J, Barth R, Ben-Shahar O, Edan Y, Hellström T, Hemming J, Kurtser P, Ringdahl O, Tielen T, van Tuijl B (2020) Development of a sweet pepper harvesting robot. J Field Robot 37(6):1027–1039
33. Tang Y, Chen M, Wang C, Luo L, Li J, Lian G, Zou X (2020) Recognition and localization methods for vision-based fruit picking robots: a review. Front Plant Sci 11
34. Morar CA, Doroftei IA, Doroftei I, Hagan MG (2020) Robotic applications on agricultural industry. A review. IOP Conf Ser Mater Sci Eng 997
35. Jun J, Kim J, Seol J, Kim J, Son HI (2021) Towards an efficient tomato harvesting robot: 3D perception, manipulation, and end-effector. IEEE Access 9:17631–17640
36. Azimi S, Abidin MSZ, Emmanuel AA, Hasan HS (2020) Robotics and automation in agriculture: present and future applications. Appl Model Simul 4:130–140
37. Kootstra G, Wang X, Blok PM, Hemming J, van Henten E (2021) Selective harvesting robotics: current research, trends, and future directions. Curr Robot Rep 2(1):95–104
38. Wang L, Liu M (2020) Path tracking control for autonomous harvesting robots based on improved double arc path planning algorithm. J Intell Rob Syst 100(3–4):899–909
39. Basri R, Islam F, Shorif SB, Uddin MS (2021) Robots and drones in agriculture: a survey. In: Computer vision and machine learning in agriculture, pp 9–29
40. He B, Liu G, Ji Y, Si Y, Gao R (2011) Auto recognition of navigation path for harvest robot based on machine vision. In: IFIP advances in information and communication technology, pp 138–148
41. Ma Y, Zhang W, Qureshi WS, Gao C, Zhang C, Li W (2021) Autonomous navigation for a wolfberry picking robot using visual cues and fuzzy control. Inf Process Agric 8(1):15–26
42. Wang P, Ma Z, Du X, Lu W, Xing W, Du F, Wu C (2020) A binocular stereo vision system of fruits picking robots based on embedded system. In: 2020 ASABE annual international virtual meeting, 13–15 July 2020
43. Blok PM, van Boheemen K, van Evert FK, IJsselmuiden J, Kim G-H (2019) Robot navigation in orchards with localization based on particle filter and Kalman filter. Comput Electron Agric 157:261–269
44. Hagras H, Callaghan V, Colley M, Clarke G, Duman H (2003) Online learning and adaptation for intelligent embedded agents operating in domestic environments. In: Studies in fuzziness and soft computing, pp 293–322
45. Mahmud MSA, Abidin MSZ, Mohamed Z (2018) Solving an agricultural robot routing problem with binary particle swarm optimization and a genetic algorithm. Int J Mech Eng Robot Res 521–527
46. Readers’ choice 2020: age of agriculture robots fruit-picking robots and drones to take over farms. Available online: https://www.google.com/search?q=tomato+harvesting+robot&tbm. Accessed 17 Nov 2021
47. De-An Z, Jidong L, Wei J, Ying Z, Yu C (2011) Design and control of an apple harvesting robot. Biosyst Eng 110(2):112–122
48. Abundant robotics shuts down the fruit harvesting business. Available online: https://www.therobotreport.com/abundant-robotics-shuts-down-fruit-harvesting-business. Accessed 17 Nov 2021
49. Li J, Tang Y, Zou X, Lin G, Wang H (2020) Detection of fruit-bearing branches and localization of litchi clusters for vision-based harvesting robots. IEEE Access 8:117746–117758
50. Lehnert C, McCool C, Sa I, Perez T (2020) Performance improvements of a sweet pepper harvesting robot in protected cropping environments. J Field Robot
51. Lehnert C, English A, McCool C, Tow AW, Perez T (2017) Autonomous sweet pepper harvesting for protected cropping systems. IEEE Robot Autom Lett 2(2):872–879
52. Mu L, Cui G, Liu Y, Cui Y, Fu L, Gejima Y (2020) Design and simulation of an integrated end-effector for picking kiwifruit by the robot. Inf Process Agric 7(1):58–71
53. Kiwifruit harvester robot. Available online: https://www.sciencelearn.org.nz/images/2566-kiwifruit-harvester-robot. Accessed 18 Nov 2021
54. Sakarkar G, Baitule R (2021) Vegetable plucking machine using object detection: a case study. Int J Sci Res Comput Sci Eng Inf Technol 501–508
55. Follow the food. Available online: https://www.bbc.com/future/bespoke/follow-the-food/therobots-that-can-pick-kiwifruit.html. Accessed 20 Nov 2021
56. Food and farming technology. Available online: https://www.foodandfarmingtechnology.com/news/autonomous-robots/japanese-agtech-company-launches-robotic-tomato-harvester.html. Accessed 19 Nov 2021
57. TechCrunch+. Available online: https://www.foodandfarmingtechnology.com/news/autonomous-robots/japanese-agtech-company-launches-robotic-tomato-harvester.html. Accessed 19 Nov 2021
Chapter 2
Drone-Based Weed Detection Architectures Using Deep Learning Algorithms and Real-Time Analytics
Y. Beeharry and V. Bassoo
Y. Beeharry (B) · V. Bassoo: Department of Electrical and Electronic Engineering, Faculty of Engineering, University of Mauritius, Réduit, Mauritius
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_2

1 Introduction
Curious by nature and inspired by birds, humans have long put immense effort into the field of flight. From the ornithopter of Leonardo da Vinci in 1485 to the first aeroplane of the Wright brothers in 1903–1905 [1], innovation then shifted to unmanned aerial vehicles (UAVs) for military purposes. With the issue of commercial drone permits by the Federal Aviation Administration (FAA), several drones started to reach the market. Some examples are the Parrot AR Drone and the Phantom 4, the latter conceptualised and built by DJI, which is one of the best drone makers [2]. The Phantom 4 embeds smart computer vision and machine learning technology [3] used in autonomous path planning. UAV technology has reached a stage in its evolution that makes it attractive and suitable for various critical applications. The applications of UAVs in the commercial as well as personal domains are numerous, including the provision of wireless coverage [4], real-time monitoring, search and rescue [5], goods delivery [6], remote sensing [7], security and surveillance [8], civil infrastructure inspection [9, 10] and precision agriculture [11, 12]. However, research into the different aspects of UAVs and their applications is still ongoing, with a view to developing better and more robust architectures and solutions. Other technologies such as data processing and machine learning algorithms have also experienced substantial upgrades. The most extraordinary aspect is the interoperability of these evolved technologies once migrated to UAVs, boosting their acceptance for various real-life applications [13, 14]. Data analytics and machine learning algorithms are successfully being employed for autonomous path planning and navigation of drones tasked with the identification of hazardous vegetation overgrowth, using imaging and location data [15]. In renewable energy fields such as wind farms, drones use deep learning algorithms for real-time image recognition and detection of anomalies in wind turbine blades [16], thereby impacting positively on maintenance costs and scheduling.
Apache Spark [17] is an open-source platform that is well suited to time series and streaming analytics. It supports parallel computing at high speed, together with machine learning algorithms, in several popular programming languages. An example of the use of Spark with R for time series analytics is provided in [18] for German day-ahead energy forecasting. Effective real-time sequencing technologies have also been developed using the Apache Spark platform [19].
The work in this chapter focuses on the weed detection branch of precision agriculture. Four architectures, comprising two conventional ones and two new proposals, are presented for real-time imaging and processing, weed detection and insights relaying. The major contribution of this work is the proposal of an in-flight hybrid architecture based on that of Apache Spark. This proposed architecture is a futuristic approach that would improve computational speeds and be advantageous given UAVs’ battery lifetime constraints.
The chapter is organised as follows. Section 2 gives an overview of the evolution of UAVs, their different applications with some focus on weed detection, the Apache Spark framework and artificial neural networks (ANNs), which are the building blocks of deep learning algorithms. The proposed architectures are described in Sect. 3, and the work is concluded in Sect. 4.
2 Overview of UAVs, Artificial Intelligence and Spark Architecture
2.1 Evolution of UAVs
UAVs were initially used by the military for various applications. However, the rapid evolution of technology, with lowered costs and increased reliability, has made UAVs accessible to civilian domains [20–22]. Drones can be controlled from a remote station or console and can even operate autonomously with the aid of machine learning-based path planning algorithms. In the USA, the FAA predicted that the market for commercial drones would triple in size between 2019 and 2023 [23]. Goldman Sachs Research forecast that the market for commercial and civil government drone applications would grow to around 100 billion dollars by 2020 [24]. The regulatory environment for drone operation has also evolved in a positive direction. According to recent rules adopted by the European Commission, most drones will merely need to be registered and have electronic identification. Moreover, the EU is also developing different types of frameworks to enable intricate drone manoeuvres with a higher level of automation [25, 26].
2.2 Applications of UAVs
The applications of UAVs in the commercial as well as civilian domains are numerous [27], including the provision of wireless coverage [4], real-time monitoring, search and rescue [5], goods delivery [6], remote sensing [7], security and surveillance [8], civil infrastructure inspection [9, 10] and precision agriculture [11, 28, 29].
Future wireless network architectures involve the deployment of UAVs in inaccessible serving areas to provide seamless coverage at high data rates [4]. Common examples of such scenarios are the rapid recovery of communication in key regions after natural calamities, or when cellular towers are temporarily out of service or no longer capable of serving ground users [30].
Road traffic management is of utmost concern in any country, and UAVs have been considered as the new-age technology for traffic monitoring and data collection, which were traditionally achieved using loop detectors, microwave sensors and video cameras [31]. UAVs distinguish themselves from the traditional techniques through their cost-effectiveness and their ability to focus on specific road segments or to monitor voluminous traffic flows on large continuous road segments.
UAVs are used intensively in search and rescue operations in two main formations: solitary or swarm [5]. The solitary formation consists of a single drone sent out for the search operation, while the swarm involves multiple drones hovering over sections of the specified search area and communicating with each other. In both scenarios, the idea is to have real-time images or videos sent to the ground station to be analysed by the rescue team.
The goods delivery paradigm has adopted UAVs for the transportation of food and different types of packages. One critical field is health care, where ambulance drones are flown in and out of unreachable areas to deliver blood samples, medicines and immunisations [6]. Additionally, in health emergencies, drones can transport crucial instruments in a shorter time than other forms of transportation.
UAVs used in the field of remote sensing [7] can perform either active or passive sensing. Devices used in active remote sensing applications must emit signals that can propagate through the atmosphere under various conditions, which explains the use of microwaves in active sensing. Systems for active sensing include laser altimeters, radars, ranging instruments, LiDAR, sounders and scatterometers. Passive sensors, on the other hand, function in the thermal infrared, visible and microwave portions of the electromagnetic spectrum. Passive remote sensing systems comprise radiometers, imaging radiometers, accelerometers, hyperspectral radiometers, sounders, spectrometers and spectroradiometers.
Studies have demonstrated a growing interest in the use of UAVs in infrastructure and construction inspection applications [9, 10]. Their deployment ranges from monitoring large construction projects to the inspection of power lines, GSM towers and gas pipelines. Project managers benefit greatly from this capability, since they can monitor progress and gain better visibility of it without having to physically access the site.
2.3 UAVs in Agriculture
The authors of [32] proposed a prototype system involving a swarm of UAVs with adaptive and intelligently re-configurable flight plans for survey missions related to precision agriculture. A series of simulations and flight tests was conducted to evaluate the effectiveness of the proposed distributed mechanism based on multi-agent technology.
To address questions about how UAV innovations can help Indian agriculture with sustainability and how to handle the governance issues of civil UAV innovations in crop insurance applications, the authors of [33] performed an in-depth study with a number of interviewees selected using a snowball sampling technique. The results showed that values such as safety, autonomy, environmental friendliness, trust and transparency assume very high significance in the governance of emerging UAV technology.
The work in [34] presents a hyperspectral imaging system and describes, with detailed examples, the processing and analysis of the acquired hyperspectral images. The major applications for this UAV-based hyperspectral imaging system are in precision agriculture and forest management.
With the extensive adoption of technology and UAVs for precision farming, the work in [35] proposes a prototype in which UAVs hover over tomato fields collecting visual data. These data are in turn processed by a system capable of extracting insights that farmers can use to maximise their crop yields. Results demonstrated the viability of the proposed system, with an accuracy of 90% in the detection of rotten tomatoes.
Agriculture is one of the pillars of the economy of Armenia, and the authors of [36] investigated the possibility of employing UAVs in that sector. The researchers used DJI Phantom 3 drones to obtain digital models of three hectares of fields, providing up-to-date information and making it possible to perform various analyses and projections.
In Southern Finland, researchers presented a new technique for determining the locations of soil samples. The system is based on wearable augmented reality (AR) technology and a soil map generated from drone imaging [37].
The analysis performed in [38] proposed a workflow for drone image processing and flight planning able to deliver accurate measurements of tree crops, including the height and plant projective cover of individual trees within a commercial avocado orchard.
2 Drone-Based Weed Detection Architectures Using Deep Learning …
In [39], the downwash flow field of a quad-rotor drone was simulated using a computational fluid dynamics approach based on the lattice Boltzmann method (LBM). Additionally, the Lagrangian discrete phase particle tracking method was used to simulate the trajectories of droplets with different particle sizes. Studies were carried out by altering parameters such as flight speed, altitude and lateral distance between the nozzles to analyse the effects on droplet deposition and drift behind the fuselage. The authors of [40] tackled the problem faced by farmers who manually apply pesticides and fertilisers in their fields and suffer severe health issues from the chemicals used. In line with the widespread use of drones in areas such as agricultural field monitoring and analysis, a UAV-based semi-automatic system has been proposed for agricultural spraying.
2.4 UAVs in Weed Detection and Management
Weed detection and management using conventional techniques on very large crop fields involve a certain level of time and dedicated resources. With the adoption of UAVs for these tasks in agriculture, considerable reductions in cost and time have been achieved, making UAVs very suitable and desirable. Plant density estimation and tree pruning in field experiments, which would have been time-consuming and costly when performed manually, have been successfully undertaken using UAV platforms and imagery [41–43].
With the aim of building a system for early detection and mapping of Johnson grass patches, and thereby launching herbicide treatment based on the weed coverage, the authors of [44] developed an object-based image analysis (OBIA) procedure. Images from visible and multi-spectral cameras fitted on UAVs flying at altitudes of 30, 60 and 100 m were collected from 2 maize fields, and weed maps were developed for the weed management program. The results demonstrated that potential savings on herbicides ranged from 85 to 96%.
The potential of using images captured from UAVs has been assessed in the study presented in [45]. Simulations were performed to study the proposed end-to-end model and two binary classifications: soil versus vegetation, and monocotyledon versus dicotyledon. The results demonstrated that the proposed model is able to assess crop and weed discrimination. To detect Gramineae weed in rice fields from images taken by a fixed-wing UAV at altitudes of 60 and 70 m, the authors of [46] proposed a novel method that fuses high-resolution RGB and low-resolution multi-spectral images to obtain better weed discrimination features.
The work in [47] proposes a system for weed surveying. The system operates on image pattern recognition from pictures taken by UAVs. This alternative can take pictures very close to the plants, allowing species recognition at lower infestation levels and without cloud interference. Preliminary tests demonstrate that an overall accuracy of 82% can be achieved.
Y. Beeharry and V. Bassoo
2.5 Evolution of Distributed and Parallel Computing for Real-Time Performance
The goal of processing large-scale data on a fault-tolerant system built from commodity hardware gave birth to the Google File System (GFS) [48] in 2003. Later, in 2007, engineers at Yahoo!, working on the Hadoop sub-project, set up a cluster of 1000 nodes using the MapReduce programming model [9, 49]. The Hadoop Distributed File System (HDFS) is mainly used for batch data processing. However, the evolution of technology and demands in diverse fields required more real-time, on-the-fly data processing. To meet these requirements, the Apache Spark project [17] arose. Speed tests revealed that Apache Spark could run workloads 100 times faster than Hadoop, a key characteristic promoting its usage for streaming systems. In addition to speed, Apache Spark supports multiple programming languages and advanced analytics.
2.6 Spark Architecture
The Spark architecture comprises three main components: the master node, the cluster manager and the worker node(s). The overall operating architecture of Apache Spark [17] is shown in Fig. 1.
Fig. 1 Spark architecture (diagram: a master node hosting the driver program and Spark Context, a cluster manager, and three worker nodes, each with a task and a cache)
The master node hosts the driver program that controls the application. A Spark Context, which can be considered the gateway to all the functionalities in Spark, is created inside the driver program. The Spark Context works in collaboration with another entity, the cluster manager, to manage the different jobs to be executed within the cluster. A job can be split into several tasks and distributed over the worker nodes. The worker nodes are essentially slave nodes that only execute the tasks they have been given. The abstractions used for the different tasks by the worker nodes in Spark are resilient distributed datasets (RDDs). Once the tasks have been successfully executed, the results are returned to the Spark Context in the driver program of the master node. In summary, the Spark Context breaks the job at hand into tasks and distributes them to the worker nodes through the cluster manager. The results are then collected in the Spark Context once the operations have been successfully completed on the worker nodes. From a logical point of view, the execution speed of the Spark framework can be enhanced by increasing the number of worker nodes, since jobs can then be split into more tasks that are executed in parallel.
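The split, distribute and collect cycle described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not Spark code: the thread pool stands in for the worker nodes, and the function names (`split_job`, `run_task`, `run_job`) and the sum-of-squares workload are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_job(items, num_tasks):
    """Split a job (a list of items) into at most num_tasks contiguous tasks."""
    size, extra = divmod(len(items), num_tasks)
    tasks, start = [], 0
    for i in range(num_tasks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty tasks when there are few items
            tasks.append(items[start:end])
        start = end
    return tasks


def run_task(task):
    """Work done by one worker on its task: a sum of squares (placeholder)."""
    return sum(x * x for x in task)


def run_job(items, num_workers):
    """Driver role: split the job, hand tasks to workers, collect the results."""
    tasks = split_job(items, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(run_task, tasks))
    return sum(partial_results)


total = run_job(list(range(10)), num_workers=3)
print(total)
```

In this sketch, adding workers only helps while there are enough items to give each of them a task, which mirrors the point made in the text about splitting jobs into more tasks.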
2.7 Spark Streaming Architecture
The requirements of real-time applications have brought about an extension of Spark known as Spark Streaming [50]. The architecture of Spark Streaming is shown in Fig. 2. It consists of a receiver that ingests the incoming records, groups them into batches of RDDs and sends them to different Spark nodes for execution. Thus, the only difference from Spark is that Spark Streaming has an additional mechanism to process data on the fly and pass properly formatted batches to the Spark nodes for computation. Additionally, Spark Streaming is built on the discretised streams (D-Streams) model, which improves efficiency through a parallel recovery mechanism [51].
Fig. 2 Spark streaming architecture (diagram: a receiver turns incoming records into batches (RDDs), which Spark nodes process with tasks)
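The idea of discretising a stream of records into batches can be sketched in a few lines. This is a simplified illustration in plain Python, not the Spark Streaming API: the function name `discretise`, the record format (timestamp, value) and the one-second batch interval are all assumptions.

```python
from collections import defaultdict


def discretise(records, batch_interval):
    """Group (timestamp, value) records into consecutive fixed-length batches.

    Records whose timestamps fall in the same interval end up in the same
    batch. Intervals with no records produce no batch in this sketch.
    """
    batches = defaultdict(list)
    for timestamp, value in records:
        batches[int(timestamp // batch_interval)].append(value)
    return [batches[key] for key in sorted(batches)]


# Hypothetical image records arriving from a drone, timestamps in seconds.
records = [(0.2, "img_a"), (0.9, "img_b"), (1.1, "img_c"), (2.5, "img_d")]
batches = discretise(records, batch_interval=1.0)

# Each batch would then be handed to the processing nodes as one unit.
batch_sizes = [len(batch) for batch in batches]
print(batches, batch_sizes)
```

A shorter batch interval lowers latency but produces more, smaller batches to schedule; the interval used here is arbitrary.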
2.8 Artificial Neural Networks and Deep Learning
An artificial neural network (ANN) is an information processing paradigm inspired by the operating mechanism of biological nervous systems. The key component of this paradigm is its novel information processing structure, which consists of a large number of highly interconnected neurons working on a specific problem. Adjustments in the synaptic connections between neurons are key to the learning process in biological systems. Neural networks have been applied in many industries, given their wide-ranging applicability to real-world problems [52, 53]. The configuration of a neural network depends heavily on the problem at hand. Therefore, it is left to the designer's experience to select an appropriate configuration in terms of the number of input, output and hidden layer nodes. A typical neural network consists of layers, and in a single-layered network, there is one input, one hidden and one output layer of neurons, as shown in Fig. 3. A biological neuron is a distinctive unit that carries and transfers information to other neurons in the network chain. Artificial neurons mimic these functionalities together with their distinctive learning process [54]. An artificial neuron is depicted in Fig. 4 [55]. A multi-layer network has more than one hidden layer, and its structure is shown in Fig. 5. Additional hidden neurons increase the capability of the network by extracting
Fig. 3 Artificial neural network (diagram: input parameters feed an input layer, a hidden layer and an output layer of neurons joined by weights, producing the output categories)
Fig. 4 Artificial neuron. The inputs $X_1, X_2, \ldots, X_N$ (dendrites) are scaled by the synaptic weights $W_1, W_2, \ldots, W_N$, which have an inhibitory or excitatory effect on the input signals, resulting in the activation or not of the neuron. The activation considers the set of input values and their respective synaptic weights, $\mathrm{Net} = \sum_{i=1}^{N} W_i X_i + \mathrm{bias}$. The activation function $\mathrm{Fct}(\mathrm{Net})$ regulates the output value, and the output (axon/synapse) connects to other units.
Fig. 5 Multi-layer neural network (diagram: input parameters feed an input layer, first and second hidden layers, and an output layer of neurons joined by weights, producing the output categories)
higher-order statistics from the input. Furthermore, if every node in each layer of the network is connected to every node in the adjacent forward layer, the network is said to be fully connected [56]. The network “learns” by modifying the weights between layers. When adequately trained, a network is able to generalise, that is, to produce relevant outputs for previously unseen input data, which is a valuable property [57]. A different flavour of ANN is the convolutional neural network (CNN) [58], which is a type of feed-forward network. CNNs are capable of classifying 2-dimensional and 3-dimensional images [59] and of identifying objects. The architecture consists of three basic components, namely the convolutional layer [60], the pooling layer [60] and the fully connected layer [61]. Each unit of the architecture applies filters over overlapping regions, from the convolutional layers through to the fully connected layers, to extract a maximum number of features [62]. The optimiser is the method that determines how the model is updated based on the loss function of the network. Some of the commonly used optimisation methods are stochastic gradient descent (SGD) [63], root-mean-square propagation (RMSProp) [64] and adaptive moment estimation (ADAM) [65]. An activation function is a mathematical function that determines whether the output of a node should fire based on its input [66]. Some examples are the unipolar sigmoid function, the hyperbolic tangent (tanh) function, the rectified linear unit (ReLU), leaky ReLU, SoftMax and the exponential linear unit (ELU) [66]. An example of a complete CNN flow that processes an input image and classifies the objects based on values is shown in Fig. 6. Deep learning networks are based on the principle of ANNs and comprise more than three layers.
This is achieved by increasing the number of hidden layers in the architecture. Artificial neural networks have gained momentum in the field of agriculture with their ability to process different data types: textual, numerical, audio and images. The work in [68] presents an ANN-based model for the prediction of traction force using experimental data collected from a soil bin employing a single-wheel tester. Soil textures and tyre types were classified using different training algorithms and the
Fig. 6 Convolutional neural network [67]
inputs: slippage, velocity, wheel load and inflation pressure. The results demonstrated the suitability of ANNs for the modelling and prediction of traction force. ANNs combined with a Geographical Information System (GIS) have also been investigated for assessing the suitability of land portions for the cultivation of selected crops [69]. The assessments and tests demonstrated that an ANN with four input parameters, six neurons in the hidden layer and one output could reach an accuracy level of 83.43%, which is acceptable and consistent for a real-world application. ANNs can be augmented by increasing the number of hidden layers, giving rise to a class of ANNs called deep neural networks [70]. These networks are being actively investigated due to their higher accuracy and suitability for various applications. Plant diseases threaten farmers in terms of both cost and productivity. In anticipation of the obstacles farmers encounter due to plant diseases, an image and deep neural network-based technique for plant disease identification is proposed in [71]. Experimental results demonstrate a maximal accuracy of 87.7% with the proposed algorithm.
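The artificial neuron of Fig. 4 and some of the activation functions listed earlier in this section can be written out as a short sketch. It uses only the Python standard library; the weights, biases and layer sizes are arbitrary illustrative values, and the helper names (`neuron`, `dense_layer`) are assumptions rather than anything defined in the chapter.

```python
import math


def sigmoid(net):
    """Unipolar sigmoid: squashes Net into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))


def relu(net):
    """Rectified linear unit."""
    return max(0.0, net)


def leaky_relu(net, slope=0.01):
    """Leaky ReLU: a small slope for negative inputs."""
    return net if net > 0 else slope * net


def softmax(values):
    """SoftMax over a list of values (shifted by the max for stability)."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def neuron(inputs, weights, bias, activation):
    """Net = sum_i W_i * X_i + bias, followed by the activation function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)


def dense_layer(inputs, weight_rows, biases, activation):
    """A fully connected layer: every neuron sees every input."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]


# A single neuron: Net = 0.5*1.0 + 0.25*2.0 - 1.0 = 0.0
single = neuron([1.0, 2.0], [0.5, 0.25], -1.0, sigmoid)

# A tiny two-layer forward pass with arbitrary (untrained) weights.
x = [0.2, 0.7]
hidden = dense_layer(x, [[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1], math.tanh)
logits = dense_layer(hidden, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], lambda net: net)
probabilities = softmax(logits)
print(single, probabilities)
```

Training would adjust the weights and biases with an optimiser such as SGD; that step is left out of this sketch.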
3 Proposed Architectures
This chapter considers four architectures, which are presented in this section:
Model 1—Conventional server-based architecture
Model 2—Single-tier UAV-based architecture
Model 3—Double-tier UAV-based architecture
Model 4—Hybrid UAV architecture
3.1 Model 1—Conventional Server-Based Architecture
Figure 7 depicts the proposed Model 1, which is based on a conventional architecture with data being sent to an application and database server for processing and storage. The feeder drones capture images from the field and transmit them wirelessly to the application server through an antenna. The application server treats the received data with localisation details, runs the deep learning algorithms for weed detection and sends relevant information back to the user application. The application and database servers could operate either locally or, as in this case, on the cloud. The transceiving antenna and the application server (if run locally) are the single points of failure in this architecture. If either of them experienced issues, the whole system would go down. One way to overcome this hurdle would be to use a backup antenna
Fig. 7 Conventional server-based architecture (Model 1) (diagram elements: feeder drone, antenna, application and database server, user application, upstream data and downstream data links)
and offload the application to the cloud, where fault-tolerant services are incorporated. These would improve the resiliency of the architecture but add substantially to the costs of ownership.
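The backup-antenna idea can be sketched as a simple failover routine. The chapter does not specify how failover would be implemented, so everything below is an assumption for illustration: the exception `LinkDown`, the function `send_with_failover` and the two stand-in link functions are hypothetical names.

```python
class LinkDown(Exception):
    """Raised by a link function when its antenna cannot be reached."""


def send_with_failover(payload, links):
    """Try each (name, send_function) link in order.

    Returns the name of the first link that accepts the payload and raises
    LinkDown if every link fails.
    """
    errors = []
    for name, send in links:
        try:
            send(payload)
            return name
        except LinkDown as exc:
            errors.append((name, str(exc)))
    raise LinkDown(f"all links failed: {errors}")


received = []


def primary_antenna(payload):
    # Simulate the single point of failure described in the text.
    raise LinkDown("primary antenna unreachable")


def backup_antenna(payload):
    received.append(payload)


used_link = send_with_failover(
    "image-001", [("primary", primary_antenna), ("backup", backup_antenna)]
)
print(used_link, received)
```

The same pattern would apply to the application server, with a cloud-hosted instance taking the place of the backup link.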
3.2 Model 2—Single-Tier UAV-Based Architecture
The proposed Model 2, shown in Fig. 8, replaces the antenna and the application and database server of Model 1 with a master drone. The function of the master drone is to run the application, treat the received data containing localisation details, run the deep learning algorithms for weed detection and send relevant information to the user application. The computational load in this model is fully on the master drone. A major concern with this architecture is the battery life of the drone, since the intense computations would drain the battery of the master drone much faster. This model would be suitable in the event that the master drone is equipped with a self-charging ability using renewable energy such as solar [72], or a protocol for in-flight recharging [73–75] is put in place.
Fig. 8 Single-tier UAV architecture (Model 2) (diagram elements: master drone, user application)
3.3 Model 3—Double-Tier UAV-Based Architecture
The idea behind the proposed Model 3, shown in Fig. 9, is to amalgamate the architectures of Apache Spark and Spark Streaming for running deep learning algorithms in real time on UAVs in modern agricultural applications. The first tier, in this case, is represented by the master and feeder drones. The cluster manager and worker nodes represent the second tier. In-flight drones are assigned the jobs of the main nodes in the Spark architecture. In Model 3, feeder drones hover over assigned sections of agricultural fields and communicate wirelessly with the drone operating as the master node. Each feeder drone captures images in real time and transfers them to the master node through discretised streams. The proposed framework for executing the DNN uses different drones to perform the operations of the input, hidden and output layers separately. The data flow between the different layers is then controlled by the cluster manager. Once the results are obtained at the output layer, they are sent back to the master node for relaying to the user application. The in-flight UAVs operating on the modified Spark architecture for DNN could be ones that are already being used for specific tasks such as weather parameter monitoring, pesticide spraying [39] or irrigation [76]. Ongoing work on UAV flight time and recharge scheduling [77] would contribute towards the realisation of the proposed model.
Fig. 9 Double-tier UAV architecture (Model 3) (diagram elements: Tier 1 and Tier 2; master node with driver program, Spark Streaming Context and user application; cluster manager; worker nodes for the DNN input layer, hidden layer(s) and output layer; discretised streams (RDDs from each source); downstream data from the cluster manager; upstream data from the worker nodes to the cluster manager/master node)
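The data flow of Model 3 can be mimicked with a small, single-process sketch: the master node receives a batch from the feeder drones, and the cluster manager passes each image through one worker per DNN layer. The chapter gives no implementation, so the class names, the toy layer functions and the pixel values are all assumptions, and the "layers" are placeholders rather than a trained network.

```python
# Toy stand-ins for the three DNN stages; a real deployment would run
# trained layers on separate drones. All names here are hypothetical.

def input_layer_worker(image):
    """Worker for the input layer: scale 8-bit pixel values to [0, 1]."""
    return [pixel / 255.0 for pixel in image]


def hidden_layer_worker(features):
    """Worker for the hidden layer(s): a shifted ReLU as a placeholder."""
    return [max(0.0, value - 0.5) for value in features]


def output_layer_worker(activations):
    """Worker for the output layer: threshold the mean activation."""
    score = sum(activations) / len(activations)
    return "weed" if score > 0.1 else "no weed"


class ClusterManager:
    """Controls the data flow between the layer workers (second tier)."""

    def __init__(self, stages):
        self.stages = stages

    def run(self, data):
        for stage in self.stages:
            data = stage(data)
        return data


class MasterNode:
    """Receives batches from feeder drones and relays results (first tier)."""

    def __init__(self, cluster_manager):
        self.cluster_manager = cluster_manager

    def handle_batch(self, batch):
        # batch: list of (drone_id, image) pairs from the discretised stream
        return {drone_id: self.cluster_manager.run(image) for drone_id, image in batch}


manager = ClusterManager([input_layer_worker, hidden_layer_worker, output_layer_worker])
master = MasterNode(manager)
batch = [("feeder-1", [255, 255, 0, 0]), ("feeder-2", [0, 10, 20, 30])]
results = master.handle_batch(batch)
print(results)
```

In the proposed architecture each stage would run on its own drone and exchange data wirelessly; the function calls here only show the order in which data moves.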
3.4 Model 4—Hybrid UAV Architecture
Given the number of drones required for the proposed Model 3, an improved architecture is proposed, as shown in Fig. 10. The idea is to avoid having separate drones for different functions by merging the functionalities of the two tiers of Model 3. Instead, the drones hovering over the fields for data collection could share the functions of worker nodes for the different DNN layers. Additionally, the master node can run the function of the cluster manager in parallel. This function-merging strategy would affect the energy consumption of the different drones as well as their flight time. However, this would be compensated by the speed gains
Fig. 10 Hybrid UAV architecture (Model 4) (diagram elements: master node/cluster manager with driver program, Spark Streaming Context and user application; source/worker nodes for the DNN layers; discretised streams (RDDs from each source); downstream data from the cluster manager; upstream data from the worker nodes to the cluster manager/master node)
obtained when using the Spark architecture, and further offset if an in-flight recharging mechanism for the drones can be put in place or solar-powered drones are used.
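One way to picture the function merging of Model 4 is as an assignment of DNN layers to the drones that are already collecting data. The chapter does not prescribe a scheduling policy; the round-robin rule, the function name `assign_layers` and the drone and layer labels below are assumptions chosen only to make the idea concrete.

```python
from itertools import cycle


def assign_layers(drone_ids, layer_names):
    """Round-robin assignment of DNN layers to the drones already in the field.

    Returns a mapping from drone id to the list of layers it hosts in
    addition to its data collection role.
    """
    assignment = {drone_id: [] for drone_id in drone_ids}
    for layer, drone_id in zip(layer_names, cycle(drone_ids)):
        assignment[drone_id].append(layer)
    return assignment


drones = ["drone-1", "drone-2"]
layers = ["input", "hidden-1", "hidden-2", "output"]
assignment = assign_layers(drones, layers)
print(assignment)
```

A practical policy would probably also weigh remaining battery charge and link quality, since the merged roles raise the energy consumption of each drone.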
4 Conclusion
This chapter proposes four different architectures for weed detection in agriculture. The main contributions are the proposed in-flight hybrid architectures based on the Apache Spark and Spark Streaming platforms to better support the components of deep learning algorithms. These novel architectures are made feasible by the higher operational speeds that Spark's framework brings. These futuristic solutions can prove valuable to large-scale farmers, as they provide quick localisation of weedy sectors and thus enable efficient time and resource management. The proposed models could be further extended by employing solar-powered drones or in-flight charging mechanisms to improve battery lifetime and flight time.
References
1. NASA, Shaw RJ (2014) History of flight. NASA, 12 June 2014. [Online]. Available: https://www.grc.nasa.gov/www/k-12/UEET/StudentSite/historyofflight.html. Accessed 09 Mar 2020
2. DJI (2020) PHANTOM 4. DJI. [Online]. Available: https://www.dji.com/phantom-4. Accessed 09 Mar 2020
3. Dormehl L (2018) The history of drones in 10 milestones. Digital Trends, 11 Sept 2018. [Online]. Available: https://www.digitaltrends.com/cool-tech/history-of-drones/. Accessed 09 Mar 2020
4. Gupta L, Jain R, Vaszkun G (2015) Survey of important issues in UAV communication networks. IEEE Commun Surv Tutor 18(2):1123–1152
5. Silvagni M, Tonoli A, Zenerino E, Chiaberge M (2017) Multipurpose UAV for search and rescue operations in mountain avalanche events. Nat Hazards Risk 8(1):18–33
6. Margaret E, Evens E, Stankevitz K, Parkera C (2019) Using the unmanned aerial vehicle delivery decision tool to consider transporting medical supplies via drone. Glob Health Sci Pract 7(4):500–506
7. Niedzielski T (2019) Applications of unmanned aerial vehicles in geosciences. Birkhäuser, Basel
8. Wada A, Yamashita T, Maruyama M, Arai T, Adachi H, Tsuji H (2015) A surveillance system using small unmanned aerial vehicle. NEC Tech J 8(1):68–72
9. Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: OSDI’04: sixth symposium on operating system design and implementation, San Francisco
10. Ham Y, Han KK, Lin JJ, Golparvar-Fard M (2016) Visual monitoring of civil infrastructure systems via camera-equipped unmanned aerial vehicles (UAVs): a review of related works. Vis Eng 4(1):1
11. Khanal S, Fulton J, Shearer S (2017) An overview of current and potential applications of thermal remote sensing in precision agriculture. Comput Electron Agric 139:22–32
12. Comba L, Biglia A, Aimonino DR, Gay P (2018) Unsupervised detection of vineyards by 3D point-cloud UAV photogrammetry for precision agriculture. Comput Electron Agric 155:84–95
13. Rowley MJ (2016) How real-time drone data is saving lives. Cisco, 15 Nov 2016. [Online]. Available: https://newsroom.cisco.com/feature-content?type=webcontent&articleId=1801379. Accessed 10 Mar 2020
14. Deng C, Wang S, Huang Z, Tan Z, Liu J (2014) Unmanned aerial vehicles for power line inspection: a cooperative way in platforms and communications. J Commun 9(9):687–692
15. Shukla M, Chen Z, Lu C (2018) DIMPL: a distributed in-memory drone flight path builder system. J Big Data 5(24)
16. Wang Y, Yoshihashi R, Kawakami R, You S, Harano T, Ito M, Komagome K, Lida M, Naemura T (2019) Unsupervised anomaly detection with compact deep features for wind turbine blade images taken by a drone. IPSJ Trans Comput Vis Appl 11(3)
17. Apache Spark (2018) Apache Spark: lightning-fast unified analytics engine. [Online]. Available: https://spark.apache.org/. Accessed 02 Mar 2020
18. Krome C, Sander V (2018) Time series analysis with apache spark and its applications to energy informatics. Energy Inform 1(40)
19. Rathee S, Kashyap A (2018) StreamAligner: a streaming based sequence aligner on Apache Spark. J Big Data 5(8)
20. Hayat S, Yanmaz E, Muzaffar R (2016) Survey on unmanned aerial vehicle networks for civil applications: a communications view-point. IEEE Commun Surv Tutor 18(4):2624–2661
21. Gupta L, Jain R, Vaszkun G (2016) Survey of important issues in UAV communication networks. IEEE Commun Surv Tutor 18(2):1123–1152
22. Bassoo V, Hurbungs V, Fowdur TP, Beeharry Y (2020) 5G connectivity in the transport sector: vehicles and drones use cases
23. Federal Aviation Administration (2019) FAA aerospace forecast—fiscal years 2019–2020
24. Goldman Sachs Research (2019) [Online]. Available: https://www.goldmansachs.com/insights/technology-driving-innovation/drones/. Accessed 24 Sept 2019
25. European Commission (2019) [Online]. Available: https://ec.europa.eu/transport/modes/air/news/2019-05-24-rules-operating-drones_en
26. Chandhar P, Larsson E (2019) Massive MIMO for connectivity with drones: case studies and future directions. IEEE Access 7:94677–94691
27. Radoglou-Grammatikis P, Sarigiannidis P, Lagkas T, Moscholios I (2020) A compilation of UAV applications for precision agriculture. Comput Netw 172
28. Elmokadem T (2019) Distributed coverage control of quadrotor multi-UAV systems for precision agriculture. IFAC-PapersOnLine 52(30):251–256
29. Srivastava K, Bhutoria AJ, Sharma JK, Sinha A, Pandey PC (2019) UAVs technology for the development of GUI based application for precision agriculture and environmental research. Remote Sens Appl Soc Environ 16
30. Deruyck M, Wyckmans J, Joseph W, Martens L (2018) Designing UAV-aided emergency networks for large-scale disaster scenarios. EURASIP J Wirel Commun Netw 79:2018
31. Elloumi M, Dhaou R, Escrig B, Idoudi H, Saidane LA (2018) Monitoring road traffic with a UAV-based system. In: IEEE wireless communications and networking conference (WCNC), Barcelona
32. Skobelev P, Budaev D, Gusev N, Voschuk G (2018) Designing multi-agent swarm of UAV for precise agriculture. In: Highlights of practical applications of agents, multi-agent systems, and complexity: the PAAMS collection. Springer, Cham, pp 47–59
33. Chamuah A, Singh R (2020) Securing sustainability in Indian agriculture through civilian UAV: a responsible innovation perspective. SN Appl Sci 2(106)
34. Kurihara J, Ishida T, Takahashi Y (2018) Unmanned aerial vehicle (UAV)-based hyperspectral imaging system for precision agriculture and forest management. In: Unmanned aerial vehicle: applications in agriculture and environment. Springer, Cham, pp 25–38
35. Altınbaş MD, Serif T (2019) Detecting defected crops: precision agriculture using haar classifiers and UAV. In: Mobile web and intelligent information systems, MobiWIS. Springer, Cham, pp 27–40
36. Hovhannisyan T, Efendyan P, Vardanyan M (2018) Creation of a digital model of fields with application of DJI phantom 3 drone and the opportunities of its utilization in agriculture. Ann Agrar Sci 16(2):177–180
37. Huuskonen J, Oksanen T (2018) Soil sampling with drones and augmented reality in precision agriculture. Comput Electron Agric 154:25–35
38. Tu Y-H, Phinn S, Johansen K, Robson A, Wu D (2020) Optimising drone flight planning for measuring horticultural tree crop structure. ISPRS J Photogramm Remote Sens 160:83–96
39. Wen S, Han J, Ning Z, Lan Y, Yin X, Zhang J, Ge Y (2019) Numerical analysis and validation of spray distributions disturbed by quad-rotor drone wake at different flight speeds. Comput Electron Agric 166
40. Rao VPS, Rao GS. Design and modelling of an affordable UAV based pesticide sprayer in agriculture applications
41. Koh J, Hayden M, Daetwyler H, Kant S (2019) Estimation of crop plant density at early mixed growth stages using UAV imagery. Plant Methods 15(64)
42. Li B, Xu X, Han J, Zhang L, Bian C, Jin L, Liu J. The estimation of crop emergence in potatoes by UAV RGB imagery. Plant Methods
43. Zaman-Allah M, Vergara O, Tarekegne A, Magorokosho C, Zarco-Tejada PJ, Hornero A, Alba HA, Das B, Olsen M, Prasanna BM, Cairns J (2015) Unmanned aerial platform-based multi-spectral imaging for field phenotyping of maize. Plant Methods 11(35)
44. López-Granados F, Torres-Sánchez J, De Castro A, Serrano-Pérez A, Mesas-Carrascosa F, Peña J (2016) Object-based early monitoring of a grass weed in a grass crop using high resolution UAV imagery. Agron Sustain Dev 36(67)
45. Louargant M, Villette S, Jones G, Vigneau N, Paoli JN, Gée C (2018) Weed detection by UAV: simulation of the impact of spectral mixing in multispectral images. Precis Agric 18:932–951
46. Barrero O, Perdomo SA (2018) RGB and multispectral UAV image fusion for Gramineae weed detection in rice fields. Precis Agric 19:809–822
47. Yano IH, Alves JR, Santiago WE, Mederos BJT (2016) Identification of weeds in sugarcane fields through images taken by UAV and Random Forest classifier. IFAC-PapersOnLine 49(16):415–420
48. Ghemawat S, Gobioff H, Leung S (2003) The Google file system. In: Proceedings of the 19th ACM symposium on operating systems principles. ACM, Bolton Landing, NY
49. Shvachko K, Kuang H, Radia S, Chansler R (2010) The Hadoop distributed file system. In: IEEE 26th symposium on mass storage systems and technologies (MSST), Incline Village, NV
50. Apache Spark. Spark streaming. [Online]. Available: https://spark.apache.org/docs/latest/streaming-programming-guide.html. Accessed 03 Mar 2020
51. Zaharia MA, Das T, Li H, Hunter T, Shenker S, Stoica I (2013) Discretized streams: fault-tolerant streaming computation at scale. In: SOSP 2013—proceedings of the twenty-fourth ACM symposium on operating systems principles
52. Dastorani MT, Afkhami H, Sharifidar H, Dastorani M (2010) Application of ANN and ANFIS models on dryland precipitation prediction. J Appl Sci 10(20):2387–2394
53. Santhanam T, Subhajini AC (2011) An efficient weather forecasting system using radial basis function neural network. J Comput Sci 7(7):962–966
54. Fausett L (1994) Fundamental of neural networks. Prentice Hall, New York
55. Zurada JM (1992) Fundamental of neural networks. West Publishing Company, Saint Paul, MN
56. Khan MS, Coulibaly P (2010) Assessing hydrologic impact of climate change with uncertainty estimates: back propagation neural network approach. J Hydrometeorol 11:482–495
57. Haykin S (1994) Neural networks: a comprehensive foundation. Macmillan, Prentice Hall, New York
58. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
59. Tan T, Li Z, Liu H, Zanjani FG, Ouyang Q, Tang Y, Hu Z, Li Q (2018) Optimize TL for lung diseases in bronchoscopy using a new concept: sequential fine-tuning. IEEE J Transl Eng Health Med 6(1800808):1–8
60. Adam Gibson JP (2020) Deep learning. O'Reilly
61. Tan J, NourEldeen N, Mao K, Shi J, Li Z, Xu T, Yuan Z (2019) Deep learning convolutional neural network for the retrieval of land surface temperature from AMSR2 data in China. Sensors 19(2987)
62. Jung C, Zhou K, Feng J (2016) FusionNet: multispectral fusion of RGB and NIR images using two stage convolutional neural networks. IEEE Access 4:1–8
63. Benois-Pineau AZ (2020) Deep learning in mining of visual content. In: Springer briefs in computer science. Springer, Cham, pp 49–58
64. Misra I (2015) Optimization for deep networks, 11 Nov 2015. [Online]. Available: http://www.cs.cmu.edu/~imisra/data/Optimization_2015_11_11.pdf. Accessed 13 Mar 2020
65. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. In: ICLR 2015, San Diego, CA
66. Karlik B, Olgac AV (2011) Performance analysis of various activation functions in generalized MLP architectures of neural networks. Int J Artif Intell Expert Syst (IJAE) 1(4):111–122
67. Prabhu R (2018) Understanding of convolutional neural network (CNN)—deep learning. Medium.com, 4 Mar 2018. [Online]. Available: https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148. Accessed 13 Mar 2020
68. Taghavifar H, Mardani A (2014) Use of artificial neural networks for estimation of agricultural wheel traction force in soil bin. Neural Comput Appl 24:1249–1258
69. Ahmadi FF, Layegh NF (2015) Integration of artificial neural network and geographical information system for intelligent assessment of land suitability for the cultivation of a selected crop. Neural Comput Appl 26:1311–1320
70. Kamilaris A, Prenafeta-Boldú FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90
71. Cristin R, Kumar BS, Priya C, Karthick K (2020) Deep neural network based Rider-Cuckoo Search Algorithm for plant disease detection. Artif Intell Rev 2020
72. Chow L (2018) 100% solar-powered quadcopter flies without batteries. EcoWatch—Environmental News for Healthier Planet and Life, 24 Aug 2018. [Online]. Available: https://www.ecowatch.com/renewable-energy-innovations-drone-2598719042.html. Accessed 03 Mar 2020
73. Kyle B (2018) Wireless, in-flight charging allows drones to stay in the air for an infinite amount of time. DroneDJ, 18 Sept 2018. [Online]. Available: https://dronedj.com/2018/09/18/wirelessin-flight-charging-drones/. Accessed 03 Mar 2020
74. Johnson M (2019) Mid-air wireless charging could keep drones aloft indefinitely. itnews, 21 Oct 2019. [Online]. Available: https://www.itnews.com.au/news/mid-air-wireless-charging-couldkeep-drones-aloft-indefinitely-532697. Accessed 03 Mar 2020
75. Michelle H (2020) Drones use radio waves to recharge sensors while in flight. IEEE Spectrum, 17 Apr 2020. [Online]. Available: https://spectrum.ieee.org/tech-talk/sensors/remote-sensing/uavs-prove-usefuldelivering-remote-power-charging-services. Accessed 30 Apr 2020
76. SenseFly Parrot Group (2020) Why use agriculture drones, 27 Jan 2020. [Online]. Available: https://www.sensefly.com/industry/agricultural-drones-industry/. Accessed 03 Mar 2020
77. Hassija V, Saxena V, Chamola V (2020) Scheduling drone charging for multi-drone network based on consensus time-stamp and game theory. Comput Commun 149:51–61
Chapter 3
A Deep Learning-Based Detection System of Multi-class Crops and Orchards Using a UAV Shahbaz Khan, Muhammad Tufail, Muhammad Tahir Khan, and Zubair Ahmad Khan
1 Introduction
Agriculture is pivotal to the development and growth of a country [1–7]. It is a main source of livelihood and contributes significantly to the economy. As the population rises, the consumption of agricultural produce will also increase [8, 9]. However, in the past few years, losses have been recorded due to climate change, land degradation, soil erosion, pests, and diseases. Remedial actions are therefore needed to cope with the rapidly growing population. Advances in ICT services and agricultural robots can help address these challenges [10–15]. One such agricultural robot is the unmanned aerial vehicle (UAV). It can help farmers in places that are difficult to access, in applications such as health monitoring, yield estimation, and spraying, and ultimately supports the growth of crops [16]. Because of their flexible maneuvering ability to accomplish complex missions, UAVs, particularly multirotor aerial robots, have sparked a surge in academic interest in recent years [17]. By 2025, the drone market is expected to be worth $467,401 million [18]. UAV deployment in remote sensing applications such as SAR, ecology, wildlife monitoring, and precision agriculture (PA) has been facilitated by advancements in computer technology and processing algorithms such as deep learning [19–21]. It is believed that the use of UAVs in agriculture will soar up to 80% in the upcoming years [10].
S. Khan (B) · M. Tufail · M. T. Khan · Z. A. Khan
Department of Mechatronics Engineering, University of Engineering and Technology Peshawar, Peshawar, Pakistan
S. Khan · M. Tufail · M. T. Khan
Advanced Robotics and Automation Laboratory, National Center of Robotics and Automation (NCRA), Peshawar, Pakistan
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_3
Remote sensing may help farmers assess crop yield, health, and disease [22, 23]. Various sensors, including thermal, multispectral, RGB, and NIR, are mounted on UAVs to gather data and monitor crop health [24–26]. Image processing and other computer vision techniques are applied to the remotely sensed images for decision-making in multiple remote sensing applications. These decisions are taken either offline or online during the flight of the UAV [21, 27–31]. In aerial robotics, the main focus has been on developing robust controllers rather than reliable systems for detecting targets. In applications such as PA, however, a reliable detection ability is beneficial. For example, a UAV that accurately detects pest-infested crops can apply pesticides in a targeted manner instead of broadcast spraying, which has severe consequences for neighboring crops and the environment [32]. The work in this chapter addresses the main limitations of existing techniques regarding the ability of UAVs to detect targets accurately with minimal processing time. Furthermore, the chapter optimizes the established faster-RCNN framework according to the study requirements, as the study deals with a limited dataset and variable target sizes of multi-class crops and orchards. The remainder of the chapter is organized as follows: related work is explored in Sect. 2; the methodology is detailed in Sect. 3; results are presented and discussed in Sect. 4; and the chapter is concluded in Sect. 5.
2 Related Work
An accurate detection/recognition system is essential for several of the aforementioned precision agriculture applications. Some of the related work in this context is outlined in this section. Hung et al. [33] developed a deep learning-based algorithm for classifying weeds. Small UAVs were used to collect data at a height of 5–10 m, and the algorithm was trained on positive and negative examples of weed images. Feature learning was deployed to distinguish between weeds and background features. The developed system achieved an average F1 score of 94% [33]. In another instance, a hybrid neural network architecture combining histogram and convolutional units was developed [34] for crop recognition based on UAV imagery. The system was evaluated on twenty-three (23) classification tasks and outperformed the conventional histogram and convolutional models [34]. In [35], a system for detecting sugar beet plants and weeds was developed for images collected through a UAV. Deep learning (a convolutional neural network (CNN)) and vegetation detection were combined and evaluated on data collected from different sugar beet fields [35]. A large dataset was required in all these studies, and the computational constraints associated with UAVs were not considered. Lottes et al. [36] proposed a detection system for weeds and crops based on object features, key points, and random forest (RF). Experiments demonstrated that the proposed framework could analyze and classify crops and weeds [36]. In another instance, a deep learning algorithm was
proposed for detecting tobacco plants through UAV imagery [37]. The proposed framework comprises three stages. In the first stage, morphological and watershed segmentation operations were used to extract the tobacco plant regions. In the second stage, a deep convolutional neural network (DCNN) was developed and trained to classify tobacco and non-tobacco plants. In the last stage, post-processing was performed to further remove the non-tobacco plant regions [37]. However, the framework was computationally intensive. de Castro et al. [38] developed a system for classifying weeds between and within crop rows. Object-based image analysis was merged with a digital surface model (DSM); RF was deployed for self-training and the orthomosaic for selecting features [38]. A CNN model was developed in [39] for detecting rice areas from UAV imagery. The characteristics of rice fields were identified through the CNN, which outperformed the conventional detection methods [39]. Valente et al. [40] proposed a deep learning-based detection system for a particular weed called Rumex. Images were captured through a UAV, and the developed framework showed an accuracy of almost 90% [40]. However, the detection was carried out offline and required a large amount of data. In [41], a machine learning mutual subspace method (MSM) was deployed to recognize spraying areas for UAV-based sprayers. Real-world outdoor croplands and orchards were selected for evaluating the system, which achieved an average accuracy of almost 70% [41]. However, the different environmental conditions in which such sprayers operate were not considered. An improved faster RCNN was developed in [28, 29] for recognizing regions of interest (ROI) in two different case studies. The framework performed better than other conventional machine learning and deep learning methods. However, it operated offline and did not consider different lighting conditions.
Weed mapping frameworks have also been developed using various techniques [42, 43]. However, in all these cases, a large amount of data was required and the processing was carried out offline. In summary, most of the frameworks deployed in the literature for detecting targets in different applications are not real time, lack flexibility, and are computationally intensive. Furthermore, a large dataset is required for training and validating the framework, which is tedious to collect. A flexible, real-time detection framework with low computational cost remains largely unexplored. This research aims to fill that gap by developing an optimized deep learning framework that can detect targets accurately and with low processing time.
3 Materials and Methods
3.1 Data Collection
The dataset in this work was collected through flight tests conducted at Charsadda (District Peshawar, Khyber Pakhtunkhwa, Pakistan, coordinates 34.1682° N,
Fig. 1 Multi-class crops and orchards selected for the study. a Peach, b loquat, c garlic, d coriander
71.7504° E) on different days under different lighting and weather conditions in March 2020. Two croplands (garlic and coriander) and two orchards (loquat and peach) were selected for this research, as shown in Fig. 1. The data was collected through a UAV that was developed for performing outdoor experiments. The images were categorized into (a) coriander, (b) peach, (c) garlic, and (d) loquat. The hardware system deployed in the study is illustrated in Fig. 2. It used a Pixhawk open-source autopilot and an X450 mm frame. In addition, a Raspberry Pi 3 onboard computer and a camera were mounted on the system.
3.2 Data Preprocessing and Data Enhancement
Videos were recorded and converted into images with a JPEG converter. The dataset comprised 400 images each for the crops and the orchards, which was insufficient for training. Therefore, it was essential to perform data enhancement. Photometric and geometric distortion techniques were applied in this regard. Brightness, contrast, and hue were adjusted for the photometric distortions, while cropping, flipping, and rotation were performed for the geometric distortions. A sample of the resulting enhanced images is shown in Fig. 3. The final dataset reached 1500 images after completing the data enhancement process. Among these samples, 75% were used for training, 15%
Fig. 2 UAV used for the experiments
Fig. 3 Enhanced image of the sample shown in Fig. 1d
each for validation and testing, respectively. Image labeling was performed through the makesense.ai tool.
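The photometric and geometric distortions named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the chapter does not state which tool or parameter ranges were used, so the function names, the pixel representation (rows of RGB tuples with values in [0, 1]), and the example parameters are all assumptions made for illustration.

```python
import colorsys

# Assumed representation: an image is a list of rows, and each pixel is an
# (r, g, b) tuple of floats in [0, 1]. All names below are hypothetical.

def _clip(value):
    """Keep a channel value inside [0, 1]."""
    return max(0.0, min(1.0, value))

def adjust_brightness(img, delta):
    """Photometric: add a constant offset to every channel."""
    return [[tuple(_clip(c + delta) for c in px) for px in row] for row in img]

def adjust_contrast(img, factor):
    """Photometric: scale every channel around mid-gray (0.5)."""
    return [[tuple(_clip(0.5 + factor * (c - 0.5)) for c in px) for px in row]
            for row in img]

def shift_hue(img, shift):
    """Photometric: rotate the hue in HSV space by `shift` (fraction of a turn)."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            new_row.append(colorsys.hsv_to_rgb((h + shift) % 1.0, s, v))
        out.append(new_row)
    return out

def hflip(img):
    """Geometric: mirror the image left to right."""
    return [list(reversed(row)) for row in img]

def rotate90(img):
    """Geometric: rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def crop(img, top, left, height, width):
    """Geometric: cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

if __name__ == "__main__":
    sample = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
              [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]
    variants = [adjust_brightness(sample, 0.1), adjust_contrast(sample, 1.2),
                shift_hue(sample, 0.05), hflip(sample), rotate90(sample),
                crop(sample, 0, 0, 1, 2)]
    print(f"{len(variants)} augmented variants generated from one image")
```

Note that the geometric operations also move the labeled targets, so in a detection setting the bounding-box annotations would need the same transformation applied to them.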
3.3 Optimized Faster-RCNN
Faster-RCNN is a deep learning framework for detecting targets by utilizing a region proposal network (RPN) [44, 45]. Convolutional features are combined with classification networks to train and test the network in a continuous process. It is comprised
Fig. 4 Optimization block diagram
of two stages: (a) region proposal (RP) and (b) region classifier (RC). The first stage proposes regions through a fully convolutional network, while the second stage uses the proposed regions to determine the region belonging to an object class of interest [44–46]. One of the constraints associated with this framework is that it requires a large amount of data and is computationally expensive. It is usually used for offline detection and where a large dataset is available. Therefore, it was essential to optimize the network according to the specific needs of the study conducted. The technical route employed for optimizing is shown in Fig. 4.
3.3.1 Optimized Backbone (CNN)
The conventional faster-RCNN generally uses VGG-16 as its backbone, with 13 convolutional layers, 13 ReLU layers, and four pooling layers. However, VGG-16 is computationally expensive, and it was essential to replace it with a lighter model that could detect the target accurately and with less processing time. As a result, CNN3 [30], which has five convolutional, five ReLU, and four pooling layers, was used instead of the conventional VGG-16 model.
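The effect of swapping the backbone can be sketched by tracing the spatial size of the feature map through both layer stacks. The kernel sizes, strides, padding, and layer ordering below are assumptions (3 × 3 convolutions with padding 1 and 2 × 2 max pooling with stride 2), since the chapter gives only the layer counts; the 448-pixel input size is the one stated in Sect. 3.3.2.

```python
# Hypothetical layer settings: the chapter states only the number of layers.

def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a pooling layer along one dimension."""
    return (size - kernel) // stride + 1

def build(convs_per_block, pool_after):
    """Stack conv+ReLU pairs per block, pooling after the first `pool_after` blocks."""
    layers = []
    for index, count in enumerate(convs_per_block):
        layers += ["conv", "relu"] * count
        if index < pool_after:
            layers.append("pool")
    return layers

def feature_map_size(input_size, layers):
    """Trace the spatial size through the stack (ReLU leaves it unchanged)."""
    size = input_size
    for layer in layers:
        if layer == "conv":
            size = conv_out(size)
        elif layer == "pool":
            size = pool_out(size)
    return size

CNN3_LIKE = build([1, 1, 1, 1, 1], pool_after=4)    # 5 conv, 5 ReLU, 4 pool
VGG16_LIKE = build([2, 2, 3, 3, 3], pool_after=4)   # 13 conv, 13 ReLU, 4 pool

if __name__ == "__main__":
    for name, stack in (("CNN3-like", CNN3_LIKE), ("VGG-16-like", VGG16_LIKE)):
        print(name, stack.count("conv"), "conv layers,",
              "feature map side:", feature_map_size(448, stack))
```

Under these assumptions both stacks reduce the input by the same factor, so the RPN sees a feature map of the same resolution while the lighter stack evaluates far fewer convolutional layers per image.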
3.3.2 Optimized Anchors in Region Proposal Network (RPN)
During RPN training, preselected boxes called anchors play an important role. The anchor sizes determine the quality of the boxes passed to the subsequent screening process. Variable target sizes were considered in this study, so optimizing the
Fig. 5 Optimized faster RCNN architecture
anchor set in RPN was important. Twelve (12) anchors were used instead of the conventional nine (9) having the size of 64, 128, 256 and a ratio of 0.5, 1, 2. The complete training process of the architecture is as follows [46], and the optimized architecture is shown in Fig. 5:
• After data preprocessing and enhancement, the dataset with an image size of 448 × 448 is passed to the optimized backbone to generate a feature map, which serves as input to the RPN for generating the preselected (PS) boxes through the optimized anchors.
• The intersection over union (IOU) between the ground truth and the PS boxes is calculated to classify positive and negative samples. The non-maximum suppression (NMS) algorithm is then used to screen the PS boxes, and the final PS regions are selected and output.
• The region of interest (ROI) pooling layer receives the feature map and the PS regions. According to the PS regions, the feature map is partitioned into fixed-size feature maps.
• The final feature map is classified through the sigmoid function.
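The anchor, IOU, and NMS steps listed above can be sketched as follows. This is an illustrative outline rather than the authors' code. The function names are hypothetical, and the thresholds (0.7 and 0.3 for labeling, 0.7 for NMS) are the defaults of the original Faster R-CNN paper [44], assumed here because the chapter does not state its own values. The three sizes and three ratios quoted in the text yield nine anchor shapes; the additional size or ratio behind the twelve anchors is not specified, so both lists are left as parameters.

```python
import math

def make_anchor_shapes(sizes, ratios):
    """Return (width, height) pairs with area size**2 and height/width = ratio."""
    shapes = []
    for size in sizes:
        for ratio in ratios:
            shapes.append((size / math.sqrt(ratio), size * math.sqrt(ratio)))
    return shapes

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def label_anchor(anchor, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (positive), 0 (negative), or -1 (ignored during training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1

def nms(boxes, scores, iou_thr=0.7):
    """Greedy non-maximum suppression; returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep

if __name__ == "__main__":
    shapes = make_anchor_shapes([64, 128, 256], [0.5, 1, 2])
    print(len(shapes), "anchor shapes per feature-map location")
```

Adding one more entry to either list (for example a fourth size) would give twelve shapes per location, which is one way the stated anchor count could be reached.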
Fig. 6 Onboard framework for target detection
3.4 Proposed Real-Time Framework Description
The framework deployed for real-time target detection is shown in Fig. 6; it operates on images acquired by the camera mounted on the UAV shown in Fig. 2. The framework comprises a target proposal module that uses computationally efficient color (color histogram) and shape (geometric moments) algorithms to generate proposals, which help the target detection framework detect the target effectively. For real-time inference, a new video is captured, and the targets are detected in real time through the target proposals and the trained, optimized faster RCNN. The framework is implemented on the onboard system (Raspberry Pi 3). It was implemented in Keras running on top of TensorFlow, chosen for its speed and ease of use.
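A color-and-shape proposal step of the kind described above might look like the following sketch. The chapter does not give the details of its target proposal module, so everything here is an assumption for illustration: the hue range, the saturation threshold, the pixel representation (RGB tuples in [0, 1]), and all function names.

```python
import colorsys

def hue_histogram(img, bins=12):
    """Color cue: count pixels per hue bin."""
    hist = [0] * bins
    for row in img:
        for r, g, b in row:
            hue = colorsys.rgb_to_hsv(r, g, b)[0]
            hist[min(int(hue * bins), bins - 1)] += 1
    return hist

def hue_mask(img, lo, hi, min_sat=0.2):
    """Binary mask of sufficiently saturated pixels whose hue lies in [lo, hi)."""
    mask = []
    for row in img:
        mask_row = []
        for r, g, b in row:
            hue, sat, _ = colorsys.rgb_to_hsv(r, g, b)
            mask_row.append(1 if lo <= hue < hi and sat >= min_sat else 0)
        mask.append(mask_row)
    return mask

def raw_moment(mask, p, q):
    """Shape cue: geometric moment m_pq = sum over pixels of x^p * y^q * mask."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(mask) for x, value in enumerate(row))

def centroid(mask):
    """Centroid (m10 / m00, m01 / m00), or None for an empty mask."""
    area = raw_moment(mask, 0, 0)
    if area == 0:
        return None
    return raw_moment(mask, 1, 0) / area, raw_moment(mask, 0, 1) / area

def bounding_box(mask):
    """Proposal box (x_min, y_min, x_max, y_max) around the masked pixels."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)

if __name__ == "__main__":
    gray, green = (0.5, 0.5, 0.5), (0.0, 1.0, 0.0)
    frame = [[gray, gray, gray], [gray, green, green], [gray, gray, gray]]
    mask = hue_mask(frame, 0.2, 0.45)
    print("proposal box:", bounding_box(mask), "centroid:", centroid(mask))
```

The appeal of such cues on a Raspberry Pi class computer is that they need only a single pass over the pixels, so the more expensive network only has to examine the proposed regions.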
4 Results and Discussion
The developed framework was evaluated through extensive experiments on two different croplands and two orchards, and the results are reported in this section. The mean average precision (mAP) and processing time were used as metrics for evaluating the framework. First, the mAP was computed from the per-class average precision (AP) based on intersection over union (IoU). True positive, false positive, and false negative counts from the confusion matrix [41, 47] were used to obtain the precision (Eq. 1) and recall rate (Eq. 2), from which the mAP was calculated (Eq. 3).
$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{1}$$

$$\text{Recall (Rec)} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{2}$$

$$\text{mAP} = \frac{1}{C} \sum_{k=i}^{N} \text{Precision}(k)\,\text{Recall}(k) \tag{3}$$
C represents the number of categories. The AP is the area under the precision–recall curve of the detection task and is calculated by averaging the precision values at equally spaced recall levels [0, 0.1, 0.2, …, 1] for IOU values above 0.5 [47, 48]. Averaging the AP values then gives the desired mAP. The mAP for training and testing is shown in Tables 1 and 2. It is evident from Table 1 that the developed framework performed better with the enhanced dataset than with the original data. A sample detected image with a confidence score is shown in Fig. 7. In addition to the mAP of the framework, it was essential to measure the processing time because of the computational constraints associated with a UAV. Therefore, the onboard computer measured the processing time as the time per image (average of 256) required for feature extraction and detection. The result of the developed framework is shown in Table 2. The overall mAP for the croplands was 90.92%, with a processing time of 0.23 s, while the mAP for the orchards was 91.63%, with a processing time of 0.24 s. To understand the capabilities of the developed method thoroughly, its performance was compared with other deep learning-based detection methods. The methods considered were fast-RCNN [49], faster-RCNN, Yolo v5 [50], and SSD [51]. The overall performance of the frameworks is shown in Table 3.

Table 1 Training results of the developed framework: mean average precision (mAP, %) per experimental area

| Data enhancement | Peach | Loquat | Coriander | Garlic |
|---|---|---|---|---|
| No | 82.14 | 83.11 | 81.65 | 82.02 |
| Yes | 95.56 | 96.07 | 95.33 | 95.78 |

Table 2 Testing results of the developed framework per experimental area

| Metric | Peach | Loquat | Coriander | Garlic |
|---|---|---|---|---|
| Mean average precision (mAP, %) | 91.16 | 92.11 | 90.84 | 91.08 |
| Processing time (s) | 0.22 | 0.26 | 0.22 | 0.24 |
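The evaluation metrics can be made concrete with a short sketch of Eqs. 1 and 2 and of the AP averaging over the eleven recall levels [0, 0.1, …, 1] described in the text. It assumes that each detection has already been matched against the ground truth at IOU > 0.5 and that detections are sorted by descending confidence; the function names and this input format are illustrative and not taken from the chapter.

```python
def precision(tp, fp):
    """Eq. 1: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Eq. 2: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def average_precision_11pt(is_tp, num_gt):
    """AP from the best precision at recall levels 0, 0.1, ..., 1.

    is_tp  -- one boolean per detection, sorted by descending confidence;
              True when the detection matched a ground-truth box (IOU > 0.5).
    num_gt -- number of ground-truth boxes of this class.
    """
    tp = fp = 0
    curve = []  # (recall, precision) after each detection
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        curve.append((recall(tp, num_gt - tp), precision(tp, fp)))
    total = 0.0
    for step in range(11):
        level = step / 10
        # Best precision achievable at this recall level or beyond.
        total += max((p for r, p in curve if r >= level), default=0.0)
    return total / 11

def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

if __name__ == "__main__":
    # Per-class test mAP values from Table 2 (peach, loquat, coriander, garlic).
    table2 = [91.16, 92.11, 90.84, 91.08]
    print(f"overall mAP: {mean_average_precision(table2):.2f} %")
```

Averaging the four per-class values of Table 2 in this way is consistent with the overall mAP of 91.3% reported in the conclusion.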
Fig. 7 Detected sample with a confidence score
Table 3 Results from different detection frameworks

| Framework | Peach mAP (%) | Loquat mAP (%) | Coriander mAP (%) | Garlic mAP (%) | Average processing time (s) |
|---|---|---|---|---|---|
| Fast-RCNN | 84.2 | 85.3 | 83.14 | 82.28 | 3.4 |
| Faster-RCNN (original) | 91.2 | 92.3 | 91.01 | 91.09 | 1.3 |
| Yolo v5 | 92.08 | 93.21 | 91.98 | 93.03 | 0.8 |
| SSD | 88.31 | 87.17 | 86.32 | 87.68 | 1.6 |
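The processing-time margins implied by Table 3 can be checked with a short calculation. The baseline times are copied from Table 3, and 0.235 s is the overall processing time the chapter reports for the developed method; the helper name `margins` is hypothetical.

```python
# Average processing time per image in seconds, from Table 3.
BASELINE_TIMES = {
    "Fast-RCNN": 3.4,
    "Faster-RCNN (original)": 1.3,
    "Yolo v5": 0.8,
    "SSD": 1.6,
}
DEVELOPED_TIME = 0.235  # overall processing time reported for the developed method

def margins(baselines, ours):
    """Map each baseline to (seconds saved per image, speed-up factor)."""
    return {name: (t - ours, t / ours) for name, t in baselines.items()}

if __name__ == "__main__":
    for name, (saved, speedup) in margins(BASELINE_TIMES, DEVELOPED_TIME).items():
        print(f"{name}: {saved:.3f} s saved per image, {speedup:.1f}x faster")
```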
The developed method was clearly superior to all other methods in average processing time, with an advantage of between about 0.57 s (over Yolo v5) and 3.17 s (over fast-RCNN) per image (Fig. 9). For peach, the developed method achieved a better mAP than fast-RCNN and SSD (by 6.96 and 2.85 percentage points, respectively), while its performance was similar to that of Yolo v5 and faster-RCNN (original), both of which were marginally better. For loquat, the developed method again offered a better mAP than fast-RCNN and SSD (by 6.81 and 4.94 percentage points), and its mAP was again similar to that of Yolo v5 and faster-RCNN (original). For coriander, the developed method offered a better mAP than fast-RCNN and SSD (by 7.70 and 4.52 percentage points), while its mAP was similar to that of Yolo v5 and faster-RCNN (original). The results for garlic followed the same pattern as the other three: the developed method offered a better mAP than fast-RCNN and SSD (by 8.80 and 3.40 percentage points), while its mAP was again similar to that of Yolo v5 and faster-RCNN (original). These inferences are evident from Figs. 8a–d and 9, which compare the mAP and average processing time of the developed method (shown on the right) with those of the four other methods, with mAP being
Fig. 8 Performance comparison (mAP, %) of the developed method and other frameworks: a peach, b loquat, c coriander, d garlic
Fig. 9 Average processing time (s) for the croplands and orchards
preferable at higher values along the Y-axis and processing time (s) preferable at lower values. The developed method proved superior in processing time while yielding similar or better precision than at least two of the other methods under study. Although the mAP of faster-RCNN (original) and Yolo v5 was marginally better, statistical analysis revealed that the average mAP of the developed method was overall superior to that of the other techniques once the average across the different croplands and orchards is considered. A technique is of practical value only if it offers superior mAP across various croplands and orchards; thus, the developed method appears to be more valuable
and precise. It can be confidently inferred that the developed method offers superior performance, particularly in terms of processing time and mAP, especially once employed for detection across varying croplands and orchards.
5 Conclusion
In this study, a deep learning framework based on an optimized faster RCNN was developed for multi-class crops and orchards. The study's main contribution was to optimize the structure of the backbone (CNN) and the RPN to deal with the challenges of a limited dataset, variable target sizes, and the computational constraints associated with UAVs. A light network, CNN3, with five convolutional layers, five ReLU layers, and four pooling layers, was deployed instead of the conventional VGG-16, resulting in a shorter processing time. Furthermore, the number of anchors in the RPN was increased for detecting targets of variable size. The developed framework achieved an overall mAP of 91.3% and an overall processing time of 0.235 s. A comparison with other detection methods showed that the developed method is superior in terms of processing time, which is of utmost importance in UAVs because of their limited computational capabilities. The developed framework can be readily deployed on autonomous systems to detect various croplands and orchards and for different precision agriculture applications.
References
1. Tian Y, Zhao C, Lu S, Guo X (2011) Multiple classifier combination for recognition of wheat leaf diseases. Intell Autom Soft Comput 17(5):519–529. https://doi.org/10.1080/10798587.2011.10643166
2. Escalante HJ, Rodríguez-Sánchez S, Jiménez-Lizárraga M, Morales-Reyes A, De La Calleja J, Vazquez R (2019) Barley yield and fertilization analysis from UAV imagery: a deep learning approach. Int J Remote Sens 40(7):2493–2516. https://doi.org/10.1080/01431161.2019.1577571
3. Di Girolamo-Neto C et al (2019) Assessment of texture features for Bermudagrass (Cynodon dactylon) detection in sugarcane plantations. Drones 3(2):36. https://doi.org/10.3390/drones3020036
4. Srivastava K, Bhutoria AJ, Sharma JK, Sinha A, Chandra Pandey P (2019) UAVs technology for the development of GUI based application for precision agriculture and environmental research. Remote Sens Appl Soc Environ 16:100258. https://doi.org/10.1016/j.rsase.2019.100258
5. Ahmed F, Al-Mamun HA, Bari ASMH, Hossain E, Kwan P (2012) Classification of crops and weeds from digital images: a support vector machine approach. Crop Prot 40:98–104. https://doi.org/10.1016/j.cropro.2012.04.024
6. Pajares G (2015) Overview and current status of remote sensing applications based on unmanned aerial vehicles (UAVs). Photogramm Eng Remote Sens 81(4):281–329. https://doi.org/10.14358/PERS.81.4.281
7. Tri NC et al (2017) A novel approach based on deep learning techniques and UAVS to yield assessment of paddy fields. In: Proceedings—2017 9th international conference on knowledge and systems engineering, KSE 2017, vol 2017, pp 257–262. https://doi.org/10.1109/KSE.2017.8119468
8. Mekuria W (2018) The link between agricultural production and population dynamics in Ethiopia: a review. Adv Plants Agric Res 8(4):348–353. https://doi.org/10.15406/apar.2018.08.00336
9. Long S. Drones and precision agriculture: the future of farming. [Online]. Available: https://www.microdrones.com/en/content/drones-and-precision-agriculture-the-futureof-farming/. Accessed 25 May 2021
10. Radoglou-Grammatikis P, Sarigiannidis P, Lagkas T, Moscholios I (2020) A compilation of UAV applications for precision agriculture. Comput Netw 172:107148. https://doi.org/10.1016/j.comnet.2020.107148
11. Lv Z (2019) The security of Internet of drones. Comput Commun 148:208–214. https://doi.org/10.1016/j.comcom.2019.09.018
12. Tokekar P, Vander Hook J, Mulla D, Isler V (2016) Sensor planning for a symbiotic UAV and UGV system for precision agriculture. IEEE Trans Robot 32(6):1498–1511. https://doi.org/10.1109/TRO.2016.2603528
13. Stefas N, Bayram H, Isler V (2019) Vision-based monitoring of orchards with UAVs. Comput Electron Agric 163:104814. https://doi.org/10.1016/j.compag.2019.05.023
14. Lottes P, Behley J, Milioto A, Stachniss C (2018) Fully convolutional networks with sequential information for robust crop and weed detection in precision farming. IEEE Robot Autom Lett 3(4):2870–2877. https://doi.org/10.1109/LRA.2018.2846289
15. Ahmadi A, Nardi L, Chebrolu N, Stachniss C (2019) Visual servoing-based navigation for monitoring row-crop fields, no iii
16. Hayat S, Yanmaz E, Muzaffar R (2016) Survey on unmanned aerial vehicle networks for civil applications: a communications viewpoint. IEEE Commun Surv Tutor 18(4):2624–2661. https://doi.org/10.1109/COMST.2016.2560343
17. Sampedro C, Rodriguez-Ramos A, Bavle H, Carrio A, de la Puente P, Campoy P (2018) A fully-autonomous aerial robot for search and rescue applications in indoor environments using learning-based techniques. J Intell Robot Syst Theory Appl 95(2):601–627. https://doi.org/10.1007/s10846-018-0898-1
18. More A. Drone market size 2021 to 2025 segmentation at region level Incl—WBOC TV. [Online]. Available: https://www.wboc.com/story/44081263/drone-market-size-2021to-2025-segmentation-at-region-level-including-market-revenue-share-and-price-analysis. Accessed 19 Aug 2021
19. Choi H, Geeves M, Alsalam B, Gonzalez F (2016) Open source computer-vision based guidance system for UAVs onboard decision making. In: IEEE aerospace conference proceedings, Mar 2016. https://doi.org/10.1109/AERO.2016.7500600
20. Ward S, Hensler J, Alsalam B, Gonzalez LF (2016) Autonomous UAVs wildlife detection using thermal imaging, predictive navigation and computer vision. In: IEEE aerospace conference proceedings, pp 1–8. https://doi.org/10.1109/AERO.2016.7500671
21. Hazim B, Alsalam Y, Campbell D, Morton K, Gonzalez F (2017) Autonomous UAV with vision based on-board decision making for remote sensing and precision agriculture. In: IEEE aerospace conference
22. Hunt ER, Cavigelli M, Daughtry CST, McMurtrey JE, Walthall CL (2005) Evaluation of digital photography from model aircraft for remote sensing of crop biomass and nitrogen status. Precis Agric 6(4):359–378. https://doi.org/10.1007/s11119-005-2324-5
23. Xiang H, Tian L (2011) Method for automatic georeferencing aerial remote sensing (RS) images from an unmanned aerial vehicle (UAV) platform. Biosyst Eng 108(2):104–113. https://doi.org/10.1016/j.biosystemseng.2010.11.003
24. Lelong CCD, Burger P, Jubelin G, Roux B, Labbé S, Baret F (2008) Assessment of unmanned aerial vehicles imagery for quantitative monitoring of wheat crop in small plots. Sensors 8(5):3557–3585. https://doi.org/10.3390/s8053557
25. Gonzalez-Dugo V et al (2013) Using high resolution UAV thermal imagery to assess the variability in the water status of five fruit tree species within a commercial orchard. Precis Agric 14(6):660–678. https://doi.org/10.1007/s11119-013-9322-9
26. Felderhof L, Gillieson D (2012) Near-infrared imagery from unmanned aerial systems and satellites can be used to specify fertilizer application rates in tree crops. Can J Remote Sens 37(4):376–386. https://doi.org/10.5589/m11-046
27. Von Bueren S, Yule I (2013) Multispectral aerial imaging of pasture quality and biomass using unmanned aerial vehicles (UAV)
28. Khan S, Tufail M, Khan MT (2021) Deep learning-based identification system of weeds and crops in strawberry and pea fields for a precision agriculture sprayer. Precis Agric 0123456789. https://doi.org/10.1007/s11119-021-09808-9
29. Khan S, Tufail M, Khan MT, Khan ZA, Anwar S (2021) Deep learning based spraying area recognition system for unmanned aerial vehicle based sprayers. Turk J Electr Eng Comput Sci 29(2021):241–256. https://doi.org/10.3906/elk-2004-4
30. Khan S, Tufail M, Khan MT, Khan A, Iqbal J, Wasim A (2021) Real-time recognition of spraying area for UAV sprayers using a deep learning approach. PLoS ONE 16(4):1–17. https://doi.org/10.1371/journal.pone.0249436
31. Khan S, Tufail M, Khan MT, Khan ZA, Iqbal J, Wasim A (2021) A novel framework for multiple ground target detection, recognition and inspection in precision agriculture applications using a UAV. Unmanned Syst 10(1):1–12. https://doi.org/10.1142/S2301385022500029
32. Bah MD, Hafiane A, Canals R (2020) CRowNet: deep network for crop row detection in UAV images. IEEE Access 8:5189–5200. https://doi.org/10.1109/ACCESS.2019.2960873
33. Hung C, Xu Z, Sukkarieh S (2014) Feature learning based approach for weed classification using high resolution aerial images from a digital camera mounted on a UAV. Remote Sens 6(12):12037–12054. https://doi.org/10.3390/rs61212037
34. Rebetez J et al (2016) Augmenting a convolutional neural network with local histograms—a case study in crop classification from high-resolution UAV imagery. In: ESANN 2016—24th European symposium on artificial neural networks, Apr 2016, pp 515–520
35. Milioto A, Lottes P, Stachniss C (2017) Real-time blob-wise sugar beets vs weeds classification for monitoring fields using convolutional neural networks. ISPRS Ann Photogramm Remote Sens Spat Inf Sci 4:41–48. https://doi.org/10.5194/isprs-annals-IV-2-W3-41-2017
36. Lottes P, Khanna R, Pfeifer J, Siegwart R, Stachniss C (2017) UAV-based crop and weed classification for smart farming. In: Proceedings—IEEE international conference on robotics and automation, pp 3024–3031. https://doi.org/10.1109/ICRA.2017.7989347
37. Fan Z, Lu J, Gong M, Xie H, Goodman ED (2018) Automatic tobacco plant detection in UAV images via deep neural networks. IEEE J Sel Top Appl Earth Obs Remote Sens 11(3):876–887. https://doi.org/10.1109/JSTARS.2018.2793849
38. de Castro AI, Torres-Sánchez J, Peña JM, Jiménez-Brenes FM, Csillik O, López-Granados F (2018) An automatic random forest-OBIA algorithm for early weed mapping between and within crop rows using UAV imagery. Remote Sens 10(2):1–21. https://doi.org/10.3390/rs10020285
39. Wei H, Mao J (2018) The recognition of rice area images by UAV based on deep learning. MATEC Web Conf 232:1–5. https://doi.org/10.1051/matecconf/201823202057
40. Valente J, Doldersum M, Roers C, Kooistra L (2019) Detecting Rumex obtusifolius weed plants in grasslands from UAV RGB imagery using deep learning. ISPRS Ann Photogramm Remote Sens Spat Inf Sci 4:179–185. https://doi.org/10.5194/isprs-annals-IV-2-W5-179-2019
41. Gao P, Zhang Y, Zhang L, Noguchi R, Ahamed T (2019) Development of a recognition system for spraying areas from unmanned aerial vehicles using a machine learning approach. Sensors (Switzerland) 19(2). https://doi.org/10.3390/s19020313
42. Pérez-Ortiz M, Peña JM, Gutiérrez PA, Torres-Sánchez J, Hervás-Martínez C, López-Granados F (2016) Selecting patterns and features for between- and within-crop-row weed mapping using UAV-imagery. Expert Syst Appl 47(2016):85–94. https://doi.org/10.1016/j.eswa.2015.10.043
43. Huang H, Deng J, Lan Y, Yang A, Deng X, Zhang L (2018) A fully convolutional network for weed mapping of unmanned aerial vehicle (UAV) imagery. PLoS ONE 13(4). https://doi.org/10.1371/journal.pone.0196302
44. Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, Montréal, QC, pp 91–99. https://doi.org/10.1109/TPAMI.2016.2577031
45. Sa I, Ge Z, Dayoub F, Upcroft B, Perez T, McCool C (2016) DeepFruits: a fruit detection system using deep neural networks. Sensors (Switzerland) 16(8). https://doi.org/10.3390/s16081222
46. Deng X, Tong Z, Lan Y, Huang Z (2020) Detection and location of pine wilt disease induced dead pine trees based on faster R-CNN. AgriEngineering 51(7):228–236. https://doi.org/10.6041/j.issn.1000-1298.2020.07.026
47. Zheng YY, Kong JL, Jin XB, Wang XY, Su TL, Zuo M (2019) CropDeep: the crop vision dataset for deep-learning-based classification and detection in precision agriculture. Sensors (Switzerland) 19(5). https://doi.org/10.3390/s19051058
48. Ammar A, Koubaa A, Ahmed M, Saad A (2019) Aerial images processing for car detection using convolutional neural networks: comparison between faster R-CNN and YoloV3, pp 1–28
49. Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision, vol 2015, pp 1440–1448. https://doi.org/10.1109/ICCV.2015.169
50. Thuan D (2021) Evolution of Yolo algorithm and Yolov5: the state-of-the-art object detection algorithm
51. Liu W et al (2016) SSD: single shot multibox detector. In: European conference on computer vision, pp 21–27. https://doi.org/10.1007/978-3-319-46448-0
Chapter 4
Real-Life Agricultural Data Retrieval for Large-Scale Annotation Flow Optimization

Hiba Najjar, Priyamvada Shankar, Jonatan Aponte, and Marek Schikora

H. Najjar: MINES NANCY, Campus Artem, 92 rue du Sergent Blandan, 54042 Nancy, France. Hiba was doing her internship at BASF while working on this book chapter.
P. Shankar (corresponding author), J. Aponte, M. Schikora: BASF Digital Farming GmbH, Im Zollhafen 24, 50678 Cologne, Germany. URL: http://www.xarvio.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_4

1 Introduction

Computer vision (CV) has emerged as a technique to solve multiple problems in the context of crop management [1]. Crop identification, crop plant emergence identification, identification of multiple weeds, land cover classification, fruit counting, disease identification, and crop type classification are typical problems solved through CV techniques [2, 3]. Deep learning strategies such as convolutional neural networks (CNN) or transformer models have been shown to achieve the best performance on automatic detection, recognition, and classification, which are essential tasks for the aforementioned use cases [4]. However, these techniques are all data-hungry, and their performance is strongly related to the size of the training set. To overcome this problem, large datasets for pre-training such as ImageNet [5], iNat2017 [6], and MS-COCO [7] have been created, but they lack enough agriculturally relevant images and are therefore not effective for domain-specific computer vision use cases. The available plant-specific datasets such as PlantVillage, Flowers 102 [8], and LeafSnap [9] are relatively small, with about 1–30 K images, and are aimed at image classification rather than detection, since annotations are not provided. A recent work called CropDeep [10] tried to bridge this gap by providing a classification and detection dataset consisting of 31,147 images with over 49,000 annotated instances in real-time settings.

However, most of these datasets were collected for research purposes and normally under specific conditions such as lighting, distance to the object, and camera resolution, and, depending on the use case, also with scene manipulations. Therefore, such datasets tend to fail at representing the conditions experienced by farmers on a normal day of work. In such datasets, data imbalance is compensated by collecting more images from less frequent classes. In reality, data imbalance is a major challenge to overcome. In use cases like disease identification or weed identification, it is normal to encounter long-tailed class distributions. As a result, the model is mostly influenced during training by the classes over-represented in the dataset, which deteriorates the performance for the rest of the classes. Furthermore, annotation is a laborious process, taking in some instances up to 40 s per object [11], which makes it expensive and slow. To mitigate this problem, strategies like image segmentation [12] have been studied and used in the industry as tools for speeding up the annotation workflow.

With the growing popularity of artificial intelligence-based mobile applications that assist farmers with concerns in the field, such as identification of unknown crops, diseases, and weeds [13–15], large annotated datasets are crucial for providing accurate and reliable models for these use cases. Data collected from such apps also need to be continuously annotated to find new classes, thus updating the knowledge base to serve users better.

Fig. 1 General flow of exploiting agricultural images to provide smart digital farming services
However, finding and annotating relevant images from such big data, captured by inexperienced users and in uncontrolled settings, is extremely challenging and requires advanced agronomic expertise. In this study, a strategy to speed up the annotation workflow while reducing annotated class imbalance is presented. The strategy is based on image prioritization and semantic similarity. As shown in Fig. 1, after prioritization, images are added to an annotation queue for annotation by an expert. The annotated images are then used for training models. In this chapter, we present the prioritization process for two use cases: crop identification and emergence analysis. The major contributions of this chapter are:

1. An analysis of the applicability of image retrieval and similarity search approaches for enhancing annotation flow in the field of agriculture is presented.
2. Crop identification and leaf damage identification form a significant field in which vision-based solutions can benefit farmers, as such systems help them identify the cause of a plant defect. To obtain a relevant dataset for training those models, an agronomist has to annotate at least parts of the data by hand. Usually, human annotators perform much better if they can focus on a particular crop plant. Hence, we present how crop images can be retrieved from a new dataset of real field images.
3. A more challenging problem arises when not a single plant but many crop plants are visible in the image at hand. Here, approaches derived from image classification are not suitable. To solve the task in this context, an approach to identify individual plants and derive their feature vectors is described. However, some images contain up to 1000 plants, which requires an additional step. Here, a clustering method is analyzed to find the most important feature vector candidates, i.e., those that describe the image best.
2 Previous Work

Finding relevant images in a large database based on their content with the use of computer vision techniques is called Content-Based Image Retrieval (CBIR) or Query by Content. When deep learning is used for image retrieval, it is called Deep Image Retrieval [16]. Images are mapped to feature vectors in a latent space. Given a query image, a sorted list of semantically similar images is retrieved by calculating the distance between the feature vector of the query image and the feature vectors of the images in the database. In the past, techniques that capture low-level handcrafted visual features such as color, texture, and local shape and convert them to global feature representations have been used [17, 18].

Deep learning-based image retrieval techniques are the present state of the art for CBIR. Deep convolutional neural networks are commonly used as feature extractors to capture subtle visual features within images and create rich feature representations. The captured features are then fine-tuned using specialized metric learning networks such as Siamese or triplet networks, which minimize the distance between similar images and maximize the distance between dissimilar images. This process is called verification-based supervised learning and is the most common method used in deep image retrieval. Chen et al. [19] present a detailed technical account of it. These methods have been extremely popular in the last decade and have been used for various applications such as face recognition [20, 21], product recommendation [22], medical image retrieval [23], remote sensing [24], security video surveillance [25], and forensics [26].

Content-based image retrieval has also been used in agriculture before. Retrieval tasks based on low-level features are quite popular [27, 28]. Common techniques used for low-level feature detection and description are as follows:

1. Color: color histograms and color co-occurrence matrices
2. Shape: SIFT (scale-invariant feature transform), saliency maps
3. Texture: Local Binary Pattern (LBP) and Gabor filter.

Some works also suggested unique techniques like local angle co-occurrence histograms [29] to derive co-occurrence descriptors, but deep convolutional neural networks are the most efficient feature extractors [30]. However, deep retrieval methods are not very common in the field of agriculture. Our literature survey on the topic yielded just a handful of results, mostly presented as a next step after image classification. Trong et al. [31] train a VGG model for a weed classification task and obtain descriptive feature vectors of weed images with a global average pooling layer on this model. An autoencoder is then used for dimensionality reduction of the feature vectors. Similar images are retrieved based on the shortest Euclidean distance between the query image and the other images in the gallery. A similar approach is used by [32] for vegetable image retrieval, but PCA is used instead of an autoencoder for dimensionality reduction of the feature vectors. Loddo et al. [33] use a similar approach for seed classification and retrieval: they propose a specialized CNN model called SeedNet and compare it against other commonly used CNN-based architectures such as AlexNet [34], VGG16 [35], ResNet [36], and Inception-ResNet [37] on both tasks. Although the proposed method worked best for the retrieval task, the overall performance was low, with a mean average precision (mAP) of 30%. Yin et al. [38] use VGG with a K-nearest neighbor model for retrieval of hot pepper diseases and pests. Triplet networks and Siamese networks have shown good performance for similarity matching and retrieval, but these methods have not been used extensively in agriculture yet. Gao et al. [39] is an exception, since it uses a Siamese network on features extracted by a recurrent neural network for plant image retrieval.

Our literature survey yielded no work that uses triplet loss or object detection for retrieval tasks in agriculture, and this work bridges the gap.
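To make the low-level retrieval idea from this section concrete, the following minimal sketch builds a color histogram descriptor and ranks a tiny gallery by Euclidean distance to a query. It is an illustration only, not taken from any of the cited works: the function names, the number of bins, and the toy pixel lists are all assumptions.

```python
from math import dist


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram of an iterable of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins-1.
        rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(rb * bins + gb) * bins + bb] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def rank_by_similarity(query_hist, gallery):
    """Return gallery keys sorted by Euclidean distance to the query histogram."""
    return sorted(gallery, key=lambda name: dist(query_hist, gallery[name]))


# Toy "images" (assumed data): flat pixel lists, mostly green leaf vs. brown soil.
query = color_histogram([(20, 150, 30)] * 8 + [(120, 80, 40)] * 2)
gallery = {
    "soil": color_histogram([(120, 80, 40)] * 10),
    "leaf": color_histogram([(25, 170, 35)] * 9 + [(110, 70, 50)] * 1),
}
ranking = rank_by_similarity(query, gallery)
print(ranking)  # closest gallery image first
```

Deep retrieval replaces the handcrafted histogram with a learned feature vector, while the distance-based ranking step stays the same.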
3 Optimized Annotation Flow

In this study, a strategy to speed up the annotation workflow while reducing annotated class imbalance is presented. Figure 2a describes a standard annotation flow. Initially, agronomically relevant images go through a manual pre-annotation process, in which annotators prioritize the relevant images for each use case within the main annotation queue. The downside of this approach is that the additional prioritization step consumes effort and annotators often end up reviewing the same image twice, which makes the task partially repetitive and inefficient. After images are annotated for each use case, models are trained. In this workflow, the manual prioritization is a bottleneck for annotation. Hence, improvements to the annotation flow may significantly reduce the time spent on training models.

Figure 2b depicts the strategy proposed to speed up the annotation flow. The prioritization model sorts the queue so that annotators receive images semantically similar to the targeted scope, which allows them to concentrate on specific annotation tasks and work faster. The first step in the process is the conversion of images into expressive feature vectors by a trained CNN or transformer model, just as in the image classification task but with the last layer replaced by a 128- or 256-dimensional output layer. Feature vectors are then semantically clustered in the latent space with the use of the triplet loss function. Finally, similar images are sorted into the queue for annotation.

Fig. 2 a A standard annotation flow. b Optimized annotation flow that relies on similarity image retrieval

The process of building such a model is detailed as follows:

1. For each image in the dataset, features are extracted by means of the deep feature extractor.
2. For each set of features, a feature vector is extracted by the similarity model.
3. Clustering is applied over the vectors produced by the similarity model.
4. n clusters are randomly selected (where n varies according to the use case).
5. The selected vectors are stored in the image database for fast searching, together with their relationship to the initial image.
6. When a certain type of task is to be prioritized, a relevant image (the query image) is chosen and steps 1–4 are replicated over it.
7. The Euclidean distance between the feature vectors of the query image and the feature vectors of all other images in the dataset is computed.
8. Based on a threshold on the distance or on the number of images, the images associated with the nearest neighbors are prioritized in the annotation queue, as shown in Fig. 3.

The deep feature extraction and the similarity model are described in detail in the following sections.
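Steps 7 and 8 of the flow above can be sketched as follows. This is a minimal illustration, not the implementation used in the chapter: the function name, the dictionary-based "database", and the choice of scoring an image by the smallest distance between any of its stored vectors and any query vector are assumptions made for the example.

```python
from math import dist


def prioritize_queue(query_vectors, image_vectors, max_images=None, max_distance=None):
    """Rank images by Euclidean distance to the query and apply optional thresholds.

    image_vectors maps an image id to the list of feature vectors stored for it.
    Assumption: an image is scored by its closest (query vector, stored vector) pair.
    """
    scored = []
    for image_id, vectors in image_vectors.items():
        best = min(dist(q, v) for q in query_vectors for v in vectors)
        if max_distance is None or best <= max_distance:
            scored.append((best, image_id))
    scored.sort()  # nearest images first
    ranked = [image_id for _, image_id in scored]
    return ranked if max_images is None else ranked[:max_images]


# Toy 2-D feature vectors (real ones would be 128- or 256-dimensional).
database = {
    "img_a": [(0.0, 0.1)],
    "img_b": [(0.9, 0.9), (0.2, 0.0)],
    "img_c": [(1.0, 1.0)],
}
query_vectors = [(0.0, 0.0)]

queue = prioritize_queue(query_vectors, database, max_distance=0.5)
top_one = prioritize_queue(query_vectors, database, max_images=1)
print(queue, top_one)
```

A linear scan like this is only practical for small databases; a production system would typically rely on an indexed nearest-neighbor search.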
Fig. 3 An example of the search engine targeting chili pepper crops
Deep feature extraction with CNN-based architectures

The vast success of AlexNet [34] has prompted the use of DCNN-based architectures for image retrieval tasks. A simple CNN is able to capture details of an input image due to its three building blocks: local receptive fields, shared weights and biases, and pooling.

Local receptive fields: A local receptive field is a small window over the larger image used to capture information pixel by pixel. It moves over the entire image with a specified stride length and thus captures details from the whole image.

Shared weights: A specific feature can occur in multiple places within an image. Therefore, by sharing weights and biases, the network learns to identify similar features regardless of where they occur in the image. Another advantage of shared weights and biases is the reduced number of parameters to learn, which speeds up the process and reduces complexity.

Pooling: The information generated by the first two building blocks is summarized by a pooling layer, which further reduces the number of parameters.

To create more powerful CNN models, more layers can be added to obtain deeper models. However, this makes models computationally expensive: the number of parameters increases, gradients need to be backpropagated through many layers, and the problem of vanishing gradients may occur as well. Architectures such as ResNet and InceptionNet have been proposed to overcome this problem. ResNet introduced identity shortcut connections, which mitigate the vanishing gradient problem by adding identity mappings that allow layers to be skipped during backpropagation without an adverse effect on performance. InceptionNet, on the other hand, introduced techniques for flexible filter sizes depending on the feature to be learned, thus reducing the number of parameters significantly. The two architectures can also be merged to form Inception-ResNet [37], which is one of the architectures tested in our work.
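The three building blocks can be illustrated with a few lines of plain Python. The sketch below slides one shared kernel (the local receptive field) over a toy image with a given stride and then summarizes the result with max pooling. The function names, the square-kernel restriction, and the toy data are assumptions for illustration; real models use optimized library layers.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide one shared square kernel over a 2-D image (no padding)."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    return [
        [
            sum(
                kernel[i][j] * image[r * stride + i][c * stride + j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Summarize each non-overlapping size x size window by its maximum."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# The same 2x2 weights are reused at every position (shared weights).
vertical_edge_kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, vertical_edge_kernel)  # 4x4 response map
pooled = max_pool2d(feature_map)                         # 2x2 summary
print(feature_map)
print(pooled)
```

The kernel responds only where the edge lies, wherever it occurs in the image, and pooling keeps that response while shrinking the map from 16 to 4 values.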
Deep feature extraction with transformer blocks

After witnessing the outstanding results achieved by transformer models [40] on natural language processing tasks, the vision community successfully incorporated them into computer vision problems such as image recognition [41], object detection [42], and segmentation [43]. In general, two main components have contributed to the development of transformer models. The first one is the self-attention mechanism, which captures 'long-term' information and dependencies between sequence elements. The second one is that they are pre-trained on a large corpus and can later be fine-tuned for a specific task with a small labeled dataset. Convolutions operate on a fixed-size window and are thus unable to capture long-range dependencies, such as arbitrary relations between pixels within spatial domains; this shortcoming and a few others are addressed by using transformer blocks. A survey that explains and illustrates the different transformer vision models in detail can be found in [44]. For the second use case addressed in this chapter, we use the Convolutions to Vision Transformers (CvT) [45] model. While the well-known vision transformer (ViT) [41] model adapts the transformer architecture initially introduced for language processing with minimal modifications, CvT improves on ViT in both performance and efficiency by introducing convolutions into it, to yield the best of both designs.

Triplet loss

A popular technique for image similarity using deep learning is a three-headed Siamese network in combination with the triplet loss function. Triplets form the basis of this architecture. They are a combination of three images: query, positive, and negative. The positive image is chosen from the same class as the query image, whereas the negative image belongs to another class.
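As a small worked illustration of triplets and of the loss this section defines in Eq. (1), the sketch below samples a (query, positive, negative) triplet from labeled embeddings and evaluates the hinge-style loss with the Euclidean distance. The helper names, the toy 2-D embeddings, and the uniform random sampling are assumptions; actual training draws triplets from mini-batches of learned feature vectors.

```python
import random
from math import dist


def sample_triplet(labels, rng):
    """Return (query, positive, negative) indices: the positive shares the
    query's label, the negative has a different label."""
    query = rng.randrange(len(labels))
    positives = [i for i, lab in enumerate(labels) if lab == labels[query] and i != query]
    negatives = [i for i, lab in enumerate(labels) if lab != labels[query]]
    if not positives or not negatives:
        raise ValueError("need at least two classes and two samples of the query class")
    return query, rng.choice(positives), rng.choice(negatives)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0), Euclidean d."""
    return max(dist(anchor, positive) - dist(anchor, negative) + margin, 0.0)


# Toy embeddings (assumed values): two well-separated classes.
embeddings = [(0.0, 0.0), (0.1, 0.0), (3.0, 0.0), (3.0, 0.2)]
labels = ["TRZAX", "TRZAX", "ZEAMX", "ZEAMX"]

rng = random.Random(0)
q, p, n = sample_triplet(labels, rng)
loss_separated = triplet_loss(embeddings[q], embeddings[p], embeddings[n])

# A triplet whose negative is too close to the anchor is still penalized:
loss_violating = triplet_loss((0.0, 0.0), (1.0, 0.0), (1.5, 0.0))  # 1 - 1.5 + 1
print(loss_separated, loss_violating)
```

The first triplet already satisfies the margin, so it contributes no loss; the second one does not, so minimizing the loss would pull the positive closer or push the negative away.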
Based on the selected triplets, the triplet loss is used as the loss function that a deep learning architecture learns to minimize, in order to organize the high-dimensional space so that similar images are mapped closer to each other. The triplet loss function is defined as:

$$L(\delta_+, \delta_-) = \max(\delta_+ - \delta_- + \mu,\ 0) \tag{1}$$

where $\delta_x = \lVert f(a) - f(x) \rVert_2$ is the distance in representation space between the anchor $a$ and an example $x$ (positive (+) or negative (−)). $\mu$ is called the margin and is a hyperparameter often set equal to 1. A minimizer of this loss will favor representations where positive examples are closer to the anchor than negative examples, thus clustering similar samples. The parameter $\mu$ controls the scale of separation.

Evaluation metrics

The similarity models are evaluated on classification tasks by using the Euclidean distance, as in the triplet loss function, and by varying the number k of nearest neighbors to be considered, k ∈ {5, 10, 20}. The following metrics are computed over a whole evaluation subset:

– Average Correct Proposals: given an anchor image with a specific label, the number of images within its k nearest neighbors that are correct proposals, i.e., have the same label as the anchor, is calculated first. The metric is the total number of correct proposals divided by the product of k and the number of anchors, which is typically the size of the subset.
– Correct Classifications: the number of images that have the same label as the most frequent label within their k nearest neighbors, conflicts excluded.
– Number of Conflicts: a conflict happens when there are two or more equally most frequent labels in the k nearest neighbors and one of them matches the anchor image's label.
– mean Adapted Average Precision (mAAP): the average of the AAP over all images. The AAP of an image is the product of its average precision and the ratio of correct labels within the k-nn. This metric was constructed here as an adaptation of the mean average precision (mAP) metric used for information retrieval [46], which measures the quality of retrieving documents relevant to the query while also assessing their rank. The adapted average precision can be calculated for each query using the following formula:

$$\mathrm{AAP@}k = \frac{1}{k} \sum_{i=1}^{k} P@i \times rel@i \tag{2}$$

where rel@i is a relevance function returning 1 if the image at rank i is relevant and 0 otherwise, and P@i is the precision at i, which assesses the retrieval's precision by considering the first i neighbors only. The precision in this context is defined as the ratio of positives within the considered retrieved images.
– Confusion Matrix: the confusion matrix of the classification based on the k-nn is calculated and normalized over each row (true labels), including a conflict prediction label whenever it occurs.
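The per-anchor quantities behind these metrics can be computed directly from the ranked list of neighbor labels. The sketch below is one possible reading of the definitions above; the function name and the returned dictionary keys are assumptions, and the crop codes are used only as example labels.

```python
from collections import Counter


def knn_metrics(anchor_label, neighbor_labels):
    """Per-anchor metrics for one ranked list of k nearest-neighbor labels."""
    k = len(neighbor_labels)
    relevant = [1 if lab == anchor_label else 0 for lab in neighbor_labels]

    # Majority vote over the k neighbors, keeping track of ties.
    counts = Counter(neighbor_labels)
    top = max(counts.values())
    winners = [lab for lab, c in counts.items() if c == top]
    conflict = len(winners) > 1 and anchor_label in winners
    correct = winners == [anchor_label]  # conflicts are excluded

    # AAP@k = (1/k) * sum_i P@i * rel@i
    hits, aap = 0, 0.0
    for i, rel in enumerate(relevant, start=1):
        hits += rel
        aap += (hits / i) * rel
    aap /= k

    return {
        "correct_proposals": sum(relevant),
        "correctly_classified": correct,
        "conflict": conflict,
        "aap": aap,
    }


clear = knn_metrics("HORVX", ["HORVX", "TRZAX", "HORVX", "HORVX", "TRZAX"])
tied = knn_metrics("HORVX", ["HORVX", "TRZAX", "TRZAX", "HORVX"])
print(clear)
print(tied)
```

Averaging `correct_proposals / k` and `aap` over all anchors of a subset gives the Average Correct Proposals and the mAAP, while summing the two boolean fields gives the Correct Classifications and the Number of Conflicts.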
4 Crop Identification

Crop images provided for this task are mainly disease images used to train a disease identification model, but all images have both crop and disease labels. The challenging part is that each crop has different common diseases associated with it. Thus, when the model is trained with supervised learning on the crop label, the triplet loss groups images of the same crop, rather than of the same disease, in the same region of the feature space. This outcome is very important for the annotation process, as it helps prioritize the queue so that expert annotators can annotate diseases specific to one crop. Therefore, crop identification is targeted rather than a specific disease, even if the data will later be used for disease identification.
4.1 Data Preparation

Labeled dataset: The crop-annotated dataset provided for the image retrieval trials has around 60 K images. The labels are EPPO codes, encoded identifiers used by the European and Mediterranean Plant Protection Organization in a system designed to uniquely identify organisms, namely plants, pests, and pathogens. Crops belong to different taxonomy levels; in our dataset, a label is either a species or a genus. A mapping has been applied to the labels to group species into the common genus they belong to. Some crops have very few images in our dataset, and therefore only those with more than 400 annotated samples were used for training. The resulting dataset has about 44 K images, with the following 22 classes: TRZAX, 1BEAG, 1BRSG, 1CPSG, LYPES, ZEAMX, GLXMA, 1PRNG, HELAN, VITVI, 1ALLG, MABSD, SOLTU, 1CUMG, 1CIDG, ORYSA, PYUCO, HORVX, 1CUUG, PIBSX, MNGIN, and FRAAN.

Splitting the data: For better training, having balanced subsets is essential. Given that the number of images per class ranges from 460 to 11 K, it is important to set a threshold on the number of images per class and then either randomly sample that number from large classes or fill in the missing images by augmentation for classes with fewer images. First, for the validation and test sets, 40 images per class were sampled, resulting in 880 images per set. The augmentation and down-sampling approach was applied to get 1000 images per class for the training set, resulting in 22 K images. An additional fourth set, the unbalanced test set, was created by merging the test set with all images left over from the random sampling. This set contains at least 40 images per selected crop while being as unbalanced as the real-time dataset might be, since the similarity model will later operate in such an environment (Table 1).

Table 1 Dataset splits with size of each set

| Subset | Number of samples | Samples per class |
|---|---|---|
| Train | 22 k | 1000 |
| Validation | 880 | 40 |
| Balanced test | 880 | 40 |
| Unbalanced test | 25 k | ≥40 |
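The splitting procedure can be sketched as follows. This is a simplified illustration under stated assumptions: the function name is hypothetical, augmentation is represented by a placeholder that re-draws existing samples and tags them, and every class is assumed to have more samples than the validation and test sets need (which the 400-sample threshold guarantees here).

```python
import random


def balanced_split(samples_by_class, n_eval=40, n_train=1000, seed=0):
    """Build balanced train/validation/test sets plus an unbalanced test set."""
    rng = random.Random(seed)
    train, val, test, unbalanced_test = {}, {}, {}, {}
    for label, samples in samples_by_class.items():
        pool = list(samples)
        rng.shuffle(pool)
        val[label] = pool[:n_eval]
        test[label] = pool[n_eval:2 * n_eval]
        rest = pool[2 * n_eval:]
        if len(rest) >= n_train:
            # Large class: down-sample, the remainder goes to the unbalanced test set.
            train[label] = rest[:n_train]
            leftover = rest[n_train:]
        else:
            # Small class: placeholder for augmentation (assumption: a real pipeline
            # would apply image transforms instead of tagging a re-drawn sample).
            extra = [("augmented", rng.choice(rest)) for _ in range(n_train - len(rest))]
            train[label] = rest + extra
            leftover = []
        unbalanced_test[label] = test[label] + leftover
    return train, val, test, unbalanced_test


# Toy run with small numbers instead of 40 / 1000.
toy = {"TRZAX": list(range(30)), "FRAAN": list(range(100, 108))}
train, val, test, unbalanced = balanced_split(toy, n_eval=2, n_train=5)
print({label: len(items) for label, items in train.items()})
print({label: len(items) for label, items in unbalanced.items()})
```

In the toy run both classes end up with five training samples, while the unbalanced test set keeps the original skew between the large and the small class.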
4.2 Model and Hyperparameters

For the similarity model, an Inception-ResNet V2 model was used, with weights pre-trained on the ImageNet classification task. To obtain the image retrieval model, the last classification layer was removed and replaced with a dense layer with as many units as the dimension of the feature space, which was set to 128. For training, the following parameters were fixed: a small learning rate (0.0001), a batch size of 32, which fits in the memory of a single GPU, and a semi-hard triplet loss function imported from the tensorflow-addons package. Training lasted for 75 epochs, since beyond that the model started overfitting.

Table 2 Classification task metrics evaluation on the balanced test set

| k | Avg correct proposals (%) | Correct classifications | % | #conflicts | % | mAAP (%) |
|---|---|---|---|---|---|---|
| 5 | 57.86 | 539 | 61 | 28 | 3.2 | 51.53 |
| 10 | 56.10 | 539 | 61 | 36 | 4.1 | 47.59 |
| 20 | 53.08 | 556 | 63 | 30 | 3.4 | 42.59 |

Table 3 Classification task metrics evaluation on the unbalanced test set

| k | Avg correct proposals (%) | Correct classifications | % | #conflicts | % | mAAP (%) |
|---|---|---|---|---|---|---|
| 5 | 87.75 | 22,601 | 90 | 183 | 0.7 | 85.26 |
| 10 | 87.37 | 22,628 | 90 | 187 | 0.7 | 84.08 |
| 20 | 86.90 | 22,699 | 90 | 118 | 0.5 | 83.02 |
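The "semi-hard" qualifier of the loss mentioned in Sect. 4.2 refers to which negatives are used to form triplets. The sketch below follows the common definition (a negative that is farther from the anchor than the positive, but by less than the margin); it is an illustration of that definition only, and how the tensorflow-addons implementation handles batches and edge cases is not reproduced here. The function name and toy points are assumptions.

```python
from math import dist


def semi_hard_negatives(anchor, positive, negatives, margin=1.0):
    """Return the negatives n with d(a, p) < d(a, n) < d(a, p) + margin."""
    d_pos = dist(anchor, positive)
    return [n for n in negatives if d_pos < dist(anchor, n) < d_pos + margin]


anchor, positive = (0.0, 0.0), (1.0, 0.0)
candidates = [
    (0.5, 0.0),  # hard: closer to the anchor than the positive
    (1.5, 0.0),  # semi-hard: farther than the positive, but within the margin
    (3.0, 0.0),  # easy: already beyond the margin, contributes zero loss
]
selected = semi_hard_negatives(anchor, positive, candidates)
print(selected)
```

Only the middle candidate is kept: it still produces a non-zero loss, yet it is not so hard that it would dominate the gradient early in training.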
4.3 Results

The performance of the image retrieval model was assessed through the evaluation of the classification tasks. Computing these metrics on the validation set guided the selection of the hyperparameters leading to the best performance. The final model was then evaluated in the same way on both the balanced and unbalanced test sets, and the results are summarized in Tables 2 and 3 and Figs. 4 and 5. The metrics and the confusion matrices suggest that the similarity model performs equally well on all classes on balanced sets, with some exceptions for classes that look too similar, even for expert annotators, as is the case for HORVX and TRZAX. On a large unbalanced set, the metrics reach high values, while the confusion matrix shows that fewer classes succeed in being distinguished by the model. Figure 6 plots the 10-nearest-neighbors of 4 examples, all taken from the test set.

Fig. 4 Confusion matrix of classification task on test set through the 20-nearest-neighbors (classification accuracies above 10% are highlighted)

5 Emergence Analysis

Crop emergence is the first predictor of crop success, as the number of seeds that germinate and grow directly correlates with the total yield and harvest outcome. Hence, providing a smart service able to detect, identify, and analyze emerging crops is of prime importance. As the annotation step is essential to label the data used in training such models, this section details an image retrieval model adapted to this use case.

Emergence data has a challenging characteristic: each image contains numerous plants, which may also belong to different species. This can be due to the presence of weeds in the field or to crop rotation, where some seeds from the previous season emerge in the next one. To address the diversity of plants in each image, an object detection model is used to detect the plants, and the images are cropped around the resulting bounding boxes. The training of this model is not detailed in this chapter, as it follows the general process of training any object detection model. The data used to train the similarity model has its bounding boxes created by annotators, whereas the object detection model prepares images from the database that are not annotated yet. Since this approach results in numerous feature vectors associated with each image, a clustering approach is applied to select five representative vectors per image. A summary of the image retrieval flow for the emergence analysis images is shown in Fig. 7.
Fig. 5 Confusion matrix of classification task on unbalanced test set through the 20-nearest-neighbors (classification accuracies above 10% are highlighted)
5.1 Data Preparation

The annotated emergence images provided for this task contain 13 different plant types, 8 of which have enough samples to be used for training. First, the sub-images of interest were extracted: up to 1000 per class for the training set and 125 per class for each of the validation and test sets. A larger dataset was also prepared, with up to 3000 samples per class for the training set and 500 for each of the validation and test sets. Extracting too many sub-images of the same crop from the same image was avoided, as they look very similar (same growth stage and soil color in the background). This is an important step, since some images contain more than 50 sub-images of the same crop, which does not add any value to the training but increases processing complexity. Also, sub-images must be resized into a square before being fed to the model, which can distort the leaves' shape. To preserve it, an adjusted version of the large dataset was created, in which the bounding boxes were squared up to the largest edge. Small boxes, defined as having a largest edge smaller than 40 pixels, were also disregarded in this version.
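The box adjustment described above can be sketched as follows. The function name and the box format are assumptions, as is the choice to grow the box symmetrically around its center; clipping at the image borders is deliberately left out, since the chapter does not state how it was handled.

```python
def square_box(box, min_edge=40):
    """Square a bounding box up to its largest edge.

    box is (x_min, y_min, x_max, y_max) in pixels. Returns the squared box,
    or None when the largest edge is below min_edge (box is disregarded).
    Assumption: the box grows symmetrically around its center, unclipped.
    """
    x_min, y_min, x_max, y_max = box
    edge = max(x_max - x_min, y_max - y_min)
    if edge < min_edge:
        return None
    center_x, center_y = (x_min + x_max) / 2, (y_min + y_max) / 2
    half = edge / 2
    return (center_x - half, center_y - half, center_x + half, center_y + half)


wide_box = square_box((100, 50, 160, 90))  # 60 x 40 box becomes 60 x 60
tiny_box = square_box((0, 0, 30, 20))      # largest edge 30 < 40, dropped
print(wide_box, tiny_box)
```

Because the crop is already square, the later resize to the model's input resolution scales both axes equally and keeps the leaf proportions intact.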
An evaluation of the model on classes not seen during training is also intended; therefore, all available sub-images from the small classes were prepared and later added to the validation set. It was not possible to allocate a portion of them to the test set, as they were already very few. Tables 4, 5 and 6 summarize the content of each set. Some issues while loading the data affected the classes TRZAX and HORVX, but this did not affect the training, as these two crops look very similar and thus complement each other.

Table 4 Classes distribution over the training, validation and test sets

| Subset | WEED | TRZAX | BEAVX | HORVX | BRSNN | GLXMA | ZEAMX | 1GOSG | Total crops |
|---|---|---|---|---|---|---|---|---|---|
| Training | 1000 | 779 | 1000 | 380 | 1000 | 1000 | 1000 | 1000 | 7159 |
| Validation | 125 | 125 | 125 | 125 | 125 | 125 | 125 | 125 | 1000 |
| Test | 125 | 125 | 125 | 125 | 125 | 125 | 125 | 125 | 1000 |

Table 5 Classes distribution over the training, validation, and test large sets

| Subset | WEED | TRZAX | BEAVX | HORVX | BRSNN | GLXMA | ZEAMX | 1GOSG | Total crops |
|---|---|---|---|---|---|---|---|---|---|
| Training | 3000 | 751 | 3000 | 380 | 3000 | 1751 | 3000 | 3000 | 17,097 |
| Validation | 500 | 500 | 500 | 379 | 500 | 500 | 500 | 500 | 3879 |
| Test | 500 | 500 | 500 | 371 | 500 | 500 | 500 | 500 | 3873 |

Table 6 Small classes available sub-images count

| Subset | HELAN | PIBSX | BRSOX | SORVU | VICFX | Total crops |
|---|---|---|---|---|---|---|
| Validation | 605 | 220 | 35 | 33 | 15 | 908 |

Fig. 6 Four examples of fetching the 5-nearest-neighbors within the test set

Fig. 7 Flow steps of preparing and storing feature vectors for emergence analysis images
5.2 Model Versions and Hyperparameters

For this second task, the PyTorch framework was preferred, along with more recent model architectures. A Convolutions to Vision Transformers (CvT) model was thus chosen. One of its promising aspects is that it combines the desirable properties of convolutional neural networks (invariance to shift, scale, and distortion) with the merits of Transformers (dynamic attention, global context, and better generalization). Out of the different CvT models, we trained the CvT-21 with a 224 × 224 resolution; since the sub-images are often very small, a higher resolution is not needed in our case. The weights pre-trained on ImageNet-1k were also used. The model was trained for 80 epochs on the small dataset and for 100 epochs in all other versions, using the cosine annealing learning rate scheduler, with the learning rate decreasing from $5 \cdot 10^{-5}$ to $10^{-10}$ at the last epoch, except for the model with frozen stages, for which the learning rate was set to decrease to $10^{-7}$. The stochastic gradient descent optimizer was used, with a momentum of 0.9 and a weight decay of 0.1. Table 7 summarizes the four versions of the model whose performance is compared in the next section. More variants were actually tested, but only the results of the most relevant versions are shared and compared below.

Table 7 Characteristics of the tested model versions

| Model ID | Dataset | Bounding boxes | Frozen stages | Epochs |
|---|---|---|---|---|
| 1 | Small | Normal | 0 | 80 |
| 2 | Large | Normal | 0 | 100 |
| 3 | Large | Normal | 2 | 100 |
| 4 | Large | Adjusted | 0 | 100 |

Table 8 Classification task metrics evaluation of the four models on the validation set, through the 20-nearest-neighbors

| Model | Avg correct proposals (%) | Correct classifications | % | #conflicts | % | mAAP (%) |
|---|---|---|---|---|---|---|
| 1 | 61.91 | 704 | 70 | 6 | 0.6 | 51.96 |
| 2 | 73.04 | 3043 | 78 | 30 | 0.8 | 65.54 |
| 3 | 70.18 | 2980 | 77 | 24 | 0.6 | 61.82 |
| 4 | 72.72 | 3025 | 78 | 38 | 1.0 | 64.98 |
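The cosine annealing schedule described in Sect. 5.2 has a simple closed form, sketched below with the start and end values quoted in the text. The function name is an assumption, and so is the per-epoch update; the chapter does not state whether the rate was stepped per epoch or per iteration.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max=5e-5, lr_min=1e-10):
    """lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * epoch / total_epochs))."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))


# Schedule for a 100-epoch run (assumption: one update per epoch).
schedule = [cosine_annealing_lr(epoch, 100) for epoch in range(101)]
print(schedule[0], schedule[50], schedule[100])
```

The rate starts at the maximum, passes roughly the midpoint halfway through training, and flattens out as it approaches the minimum, so the last epochs make only very small weight updates.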
5.3 Results Using the same metrics as presented in the previous section, the different versions of our CvT model were evaluated. On Table 8 are the classification metrics results, evaluated on the validation set, within the 20-nearest-neighbors. It is noticeable that training on a large dataset gave better results, which is not surprising, and freezing the weights of the first two stages out of four negatively affected the training. When looking closer at the effect of adjusting the bounding boxes at the last model, it does not bring meaningful improvements to the second model, and this is explained by the fact that convolutional layers are highly invariant to translation, scaling, or other forms of deformation. Given that models 2 and 4 achieved the best results, they were further evaluated on the validation set to which we added the small classes unseen during the training. The
Table 9 Classification task metrics evaluation of the second and fourth models on the validation + new classes set, through the 20-nn

Model  Avg correct proposals (%)  Correct classifications  %      #conflicts  %    mAAP (%)
2      67.35                      3555                     74.26  46          1.0  59.00
4      66.55                      3548                     74.21  48          1.0  57.88
Fig. 8 Confusion matrix of classification task through the 20-nn of model 2 on the validation + new classes set. Highlighted by a yellow box are the new classes
classification task metrics through the 20-nn are shown in Table 9, and the confusion matrices for models 2 and 4 are plotted in Figs. 8 and 9, respectively. Both models perform similarly on the crops seen during training. For instance, 1GOSG and GLXMA are the best-classified crops for both models, with more than 90% true positives, and ZEAMX and BRSNN come second for both models as well. TRZAX and HORVX have very similar accuracy and are often confused with each other, mainly because they look alike; merging these two classes makes perfect sense from an agronomist's point of view. Among the new classes, HELAN and PIBSX are the best classified, which can be explained by the large number of samples available for them compared with the other three new classes. To sum up, the generalization performance of both models is very good, since the gap between the average true positives scored on the new classes and the
Fig. 9 Confusion matrix of classification task through the 20-nn of model 4 on the validation + new classes set. Highlighted by a yellow box are the new classes
training classes does not exceed 10% (9.41% to be exact for model 2) as long as new classes with very few samples are disregarded. Indeed, the annotation process benefiting from this study rarely targets crops with very few samples in the database.
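To make the "20-nn" evaluation more concrete, the sketch below shows one plausible way of turning nearest-neighbor retrieval into a class decision: the query feature vector takes the majority label of its k closest gallery vectors. This is an assumption for illustration only. The chapter's own metrics are defined in an earlier section, and the function name, the Euclidean distance, and the toy two-dimensional vectors are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(query, gallery, labels, k=20):
    """Label a query embedding by majority vote among its k nearest gallery vectors."""
    # Rank gallery indices by Euclidean distance to the query.
    order = sorted(range(len(gallery)), key=lambda i: math.dist(query, gallery[i]))
    votes = Counter(labels[i] for i in order[:k])
    # Ties are broken in favor of the label met first, i.e. the nearest one.
    return votes.most_common(1)[0][0]


# Toy gallery with two crop codes taken from the text (vectors are made up).
gallery = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (1.0, 1.0), (1.1, 0.9)]
labels = ["ZEAMX", "ZEAMX", "ZEAMX", "GLXMA", "GLXMA"]
predicted = knn_predict((0.05, 0.05), gallery, labels, k=3)
```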
5.4 Feature Vectors Clustering
Emergence images often contain many plants, and mapping each detected plant to a feature vector results in a large set of feature vectors associated with each image. Storing them is necessary within the whole annotation flow optimization process and thus becomes problematic as the database grows. To reduce the number of feature vectors per image, multiple clustering algorithms were tested. Under the assumption that those vectors form multiple clusters, each of which groups a plant type present in the image, the goal is to keep at least one feature vector per cluster. To test different clustering algorithms, small validation and test sets were first prepared, as detailed in Table 10: the images contain between one and four different plant types, the last case being very uncommon, and the same number of images was sampled for each number of plant types, apart from the rare four-type case.
Table 10 Datasets arrangement for validating and testing different clustering algorithms

#different plant types  1   2   3   4
Validation set          30  30  30  5
Test set                30  30  30  5
Table 11 Summary of best metric results for each tested clustering algorithm

Columns: algorithm, parameters, ARI, HI, nbr clusters, fitting time, skipped img

Algorithm                 Parameters
K-Means                   n_clusters=15
Affinity Propagation      damping=0.9
Mean Shift                cluster_all=True, bin_seeding=True
Spectral Clustering       n_clusters=10, affinity="cosine"
Spectral Clustering       n_clusters=10, gm=2, affinity="rbf"
Spectral Clustering       n_clusters=15, gm=0.5, affinity="rbf"
Agglomerative Clustering  distance_th=1, linkage="single"
Agglomerative Clustering  n_clusters=10, linkage="complete"
Agglomerative Clustering  n_clusters=15, linkage="complete"
DBSCAN                    eps=0.5, min_samples=2

Values (in extraction order): 0, 0.74, 13.73, 11.99 s; 0.01, 0.41, 5.72, 3.71 s; 56.24 s, 33; 8.38, 62.9 s, 11; 8.95, 28.98 s, 20; 0.03, 0.79, 12.54, 54.88 s, 13; 0.15, 0.41; 0.02, 0.44; 3.15; 0.05, 0.67; 0.01, 0.71; 4.25, 0.95 s, 0; 0.66, 9.69, 0.98 s, 0; 0.75, 13.73, 0.98; 0.14, 0.37; 2.42; 12.16
Many different clustering algorithms are implemented in the scikit-learn library for Python, and most of them were fitted on the validation set with different parameter values. For the evaluation, we compared the clusters produced by each algorithm with the real clusters defined by the plant types in each image. We averaged the adjusted Rand index [47] and the homogeneity metric over all the images. Other metrics exist, but these two are the most relevant to this study. Table 11 gathers the best results achieved by each tested clustering algorithm and some of the key parameters that led to these results. Algorithms that are prone to skipping images, mainly because of the large number of samples, are not reliable here. The same goes for algorithms that need to create too many clusters to perform well. The agglomerative clustering algorithm gives the best compromise between the two metrics, as expected: creating clusters using the Euclidean distance along with a distance threshold of 1 matches how the triplet loss function with a margin equal to 1 grouped plants of the same type during the training of the similarity model. Figure 10 supports the assumptions that the feature vectors are clustered by plant type and that those belonging to the same image form clusters as well, so it makes sense to retrieve representatives of these clusters, the agglomerative
Fig. 10 t-SNE embeddings plotted in a two-dimensional space: 1950 feature vectors colored by image source on the left and by crop type on the right
algorithm being a good approach to identify them. Five feature vectors are selected as follows: when an image has more than 5 sub-images and 2 or more clusters returned by the algorithm, the sizes of the largest and second-largest clusters are compared to decide how many feature vectors are randomly selected from each cluster as representatives. The possible situations for an image with more than 1 cluster and more than 5 sub-images are listed below, where L is the size of the largest cluster and M the size of the second-largest one.
Case of 2 clusters only:
– M/L < 0.2: randomly sample 4 representatives from the largest cluster and 1 from the second.
– M/L > 0.2: randomly sample 3 representatives from the largest cluster and 2 from the second.
Case of 3 or more clusters:
– M/L ≤ 0.1: randomly sample 3 representatives from the largest cluster and 2 from all the rest combined.
– 0.1 < M/L ≤ 0.3: randomly sample 3 representatives from the largest cluster, 1 from the second, and 1 from the rest.
– 0.3 < M/L: randomly sample 2 representatives from the largest cluster, 2 from the second, and 1 from the rest.
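The selection rules above can be sketched as a small function that takes the cluster label of every sub-image and returns the indices of the chosen representatives. This is an illustrative reading of the rules, not the authors' code. The function name is hypothetical, the boundary case M/L = 0.2 (which the text leaves open) is assumed to behave like M/L < 0.2, and clusters smaller than their quota simply contribute fewer vectors.

```python
import random
from collections import defaultdict


def select_representatives(cluster_labels, rng=None):
    """Return indices of up to five representative sub-images for one image.

    cluster_labels[i] is the cluster id of sub-image i. Only the situation
    covered by the rules is handled: more than 5 sub-images, at least 2 clusters.
    """
    rng = rng or random.Random()
    clusters = defaultdict(list)
    for index, label in enumerate(cluster_labels):
        clusters[label].append(index)
    if len(cluster_labels) <= 5 or len(clusters) < 2:
        raise ValueError("rules cover only images with >5 sub-images and >=2 clusters")

    # Order clusters from largest to smallest; L and M are the two biggest sizes.
    ordered = sorted(clusters.values(), key=len, reverse=True)
    largest, second = ordered[0], ordered[1]
    ratio = len(second) / len(largest)  # M / L

    if len(ordered) == 2:
        # Assumption: M/L == 0.2 is treated like M/L < 0.2.
        quotas = [(largest, 4), (second, 1)] if ratio <= 0.2 else [(largest, 3), (second, 2)]
    else:
        beyond_second = [i for members in ordered[2:] for i in members]
        if ratio <= 0.1:
            quotas = [(largest, 3), (second + beyond_second, 2)]
        elif ratio <= 0.3:
            quotas = [(largest, 3), (second, 1), (beyond_second, 1)]
        else:
            quotas = [(largest, 2), (second, 2), (beyond_second, 1)]

    chosen = []
    for members, count in quotas:
        # Guard against clusters smaller than their quota.
        chosen.extend(rng.sample(members, min(count, len(members))))
    return chosen


# Ten sub-images in three clusters of sizes 6, 3 and 1, so M/L = 0.5.
example_labels = [0] * 6 + [1] * 3 + [2]
example_picks = select_representatives(example_labels, random.Random(0))
```

Passing a seeded `random.Random` makes the sampling reproducible, which is convenient when the stored representatives need to be regenerated.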
6 Conclusion
This chapter proposes a novel methodology for improving the annotation flow of big datasets in agriculture by means of deep-learning-driven content-based image retrieval. The proposed approach is applied to two different use cases, crop identification and crop emergence, and can be further adapted to other agricultural tasks. For the second use case, the methodology makes use of a previously trained object detection model to address its specific challenges and spot important image features (plants, weeds). Clustering is used to map multiple feature vectors to a few representative feature vectors. This chapter shows that models trained with the triplet loss function can be applied effectively to real-life datasets in order to build efficient similarity models. Moreover, agricultural data from use cases not presented in this chapter can introduce new challenges, which may call for additional pre- or post-processing steps to make the overall similarity search flow efficient and customized to the use case addressed.
References
1. Benos L, Tagarakis AC, Dolias G, Berruto R, Kateris D, Bochtis D (2021) Machine learning in agriculture: a comprehensive updated review. Sensors 21(11):3758
2. Cai E, Baireddy S, Yang C, Crawford M, Delp EJ (2020) Deep transfer learning for plant center localization. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 62–63
3. Kamilaris A, Prenafeta-Boldú FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90
4. Hafiz AM, Parah SA, Bhat RUA (2021) Attention mechanisms and deep learning for machine vision: a survey of the state of the art
5. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
6. Van Horn G, Mac Aodha O, Song Y, Cui Y, Sun C, Shepard A, Adam H, Perona P, Belongie S (2018) The iNaturalist species classification and detection dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8769–8778
7. Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Lawrence Zitnick C (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, pp 740–755
8. Nilsback M-E, Zisserman A (2006) A visual vocabulary for flower classification. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06), vol 2. IEEE, pp 1447–1454
9. Kumar N, Belhumeur PN, Biswas A, Jacobs DW, John Kress W, Lopez IC, Soares JVB (2012) Leafsnap: a computer vision system for automatic plant species identification. In: European conference on computer vision. Springer, pp 502–516
10. Zheng Y-Y, Kong J-L, Jin X-B, Wang X-Y, Su T-L, Zuo M (2019) CropDeep: the crop vision dataset for deep-learning-based classification and detection in precision agriculture. Sensors 19(5):1058
11. Ling H, Gao J, Kar A, Chen W, Fidler S (2019) Fast interactive object annotation with curveGCN. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5257–5266
12. Lin Z, Zhang Z, Chen L-Z, Cheng M-M, Lu S-P (2020) Interactive image segmentation with first click attention. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13339–13348
13. Barbosa JZ, Prior SA, Pedreira GQ, Motta ACV, Poggere GC, Goularte GD (2020) Global trends in apps for agriculture. Multi-Sci J 3(1):16–20. ISSN 2359-6902
14. Shankar P, Werner N, Selinger S, Janssen O (2020) Artificial intelligence driven crop protection optimization for sustainable agriculture. In: 2020 IEEE/ITU international conference on artificial intelligence for good (AI4G). IEEE, pp 1–6
15. Birner R, Daum T, Pray C (2021) Who drives the digital revolution in agriculture? A review of supply-side trends, players and challenges. Appl Econ Perspect Policy
16. Gordo A, Almazan J, Revaud J, Larlus D (2017) End-to-end learning of deep visual representations for image retrieval. Int J Comput Vis 124(2):237–254
17. Latif A, Rasheed A, Sajid U, Ahmed J, Ali N, Ratyal NI, Zafar B, Dar SH, Sajid M, Khalil T (2019) Content-based image retrieval and feature extraction: a comprehensive review. Math Probl Eng 2019
18. Smeulders AWM, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
19. Chen W, Liu Y, Wang W, Bakker E, Georgiou T, Fieguth P, Liu L, Lew MS (2021) Deep image retrieval: a survey. arXiv preprint arXiv:2101.11282
20. Schroff F, Kalenichenko D, Philbin J (2015) FaceNet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 815–823
21. Zheng L, Shen L, Tian L, Wang S, Wang J, Tian Q (2015) Scalable person re-identification: a benchmark. In: Proceedings of the IEEE international conference on computer vision, pp 1116–1124
22. Liu Z, Luo P, Qiu S, Wang X, Tang X (2016) DeepFashion: powering robust clothes recognition and retrieval with rich annotations. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1096–1104
23. Qayyum A, Anwar SM, Awais M, Majid M (2017) Medical image retrieval using deep convolutional neural network. Neurocomputing 266:8–20
24. Tong X-Y, Xia G-S, Fan H, Zhong Y, Datcu M, Zhang L (2019) Exploiting deep features for remote sensing image retrieval: a systematic investigation. IEEE Trans Big Data 6(3):507–521
25. Cheng H, Yang W, Tang R, Mao J, Luo Q, Li C, Wang A (2015) Distributed indexes design to accelerate similarity based images retrieval in airport video monitoring systems. In: 2015 12th international conference on fuzzy systems and knowledge discovery (FSKD). IEEE, pp 1908–1912
26. Liu Y, Huang Y, Gao Z (2014) Feature extraction and similarity measure for crime scene investigation image retrieval. J Xi'an Univ Posts Telecommun 19(6):11–16
27. Bakhshipour A, Jafari A (2018) Evaluation of support vector machine and artificial neural networks in weed detection using shape features. Comput Electron Agric 145:153–160
28. Tóth BP, Tóth MJ, Papp D, Szücs G (2016) Deep learning and SVM classification for plant recognition in content-based large scale image retrieval. In: CLEF (working notes), pp 569–578
29. Chen X, You J, Tang H, Wang B, Gao Y (2021) Fine-grained plant leaf image retrieval using local angle co-occurrence histograms. In: 2021 IEEE international conference on image processing (ICIP), pp 1599–1603
30. Babenko A, Lempitsky V (2015) Aggregating deep convolutional features for image retrieval. arXiv preprint arXiv:1510.07493
31. Trong VH, Yu G-H, Vu DT, Lee J-H, Toan NH, Kim J-Y (2020) A study on weeds retrieval based on deep neural network classification model. J Korean Inst Inf Technol 18(8):19–30
32. Yang Z, Yue J, Li Z, Zhu L (2018) Vegetable image retrieval with fine-tuning VGG model and image hash. IFAC-PapersOnLine 51(17):280–285
33. Loddo A, Loddo M, Di Ruberto C (2021) A novel deep learning based approach for seed image classification and retrieval. Comput Electron Agric 187:106269
34. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097–1105
35. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
36. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
37. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, inception-ResNet and the impact of residual connections on learning. In: Thirty-first AAAI conference on artificial intelligence
38. Yin H, Gu YH, Park C-J, Park J-H, Yoo SJ (2020) Transfer learning-based search model for hot pepper diseases and pests. Agriculture 10(10):439
39. Gao Z-Y, Xie H-X, Li J-F, Liu S-L (2018) Spatial-structure Siamese network for plant identification. Int J Pattern Recogn Artif Intell 32(11):1850035
40. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
41. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: International conference on learning representations (ICLR)
42. Zhu X, Su W, Lu L, Li B, Wang X, Dai J (2020) Deformable DETR: deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159
43. Ye L, Rochan M, Liu Z, Wang Y (2019) Cross-modal self-attention network for referring image segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10502–10511
44. Khan S, Naseer M, Hayat M, Zamir SW, Khan FS, Shah M (2021) Transformers in vision: a survey. arXiv preprint arXiv:2101.01169
45. Wu H, Xiao B, Codella N, Liu M, Dai X, Yuan L, Zhang L (2021) CvT: introducing convolutions to vision transformers. arXiv preprint arXiv:2103.15808
46. Sakai T (2007) Alternatives to bpref. In: Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval, pp 71–78
47. Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193–218
Chapter 5
Design and Analysis of IoT-Based Modern Agriculture Monitoring System for Real-Time Data Collection
Bekele M. Zerihun, Thomas O. Olwal, and Murad R. Hassen
B. M. Zerihun (✉) · M. R. Hassen: School of Electrical and Computer Engineering, Addis Ababa Institute of Technology, Addis Ababa University, Addis Ababa, Ethiopia
T. O. Olwal: Department of Electrical Engineering, Faculty of Engineering and Built Environment, Tshwane University of Technology, Pretoria, South Africa
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_5

1 Introduction
Agriculture is one of the most important sectors in the world, not only as a source of food but also as a supplier of raw materials for industries and textile factories. The world population is growing rapidly and is estimated to reach about 9.8 billion by 2050. Therefore, modern agricultural technologies must be implemented to significantly improve production in order to satisfy the huge global demand for food and industrial raw materials [1]. Agriculture plays a major role in the Ethiopian economy: it accounts for more than 81% of the labor force, half of the gross domestic product, and 84% of cash crops and goods for export [2]. Although Ethiopia has a large portion of fertile land, sufficient rainfall, convenient weather, and a large potential labor force, its agricultural system is still underdeveloped and characterized by low productivity. This is due to the lack of soil nutrient monitoring, proper irrigation systems, expert knowledge, real-time weather information, and optimal resource management. The sector is also plagued by periodic drought, soil degradation, exposure to major climate changes, and a lack of modern farming technologies [3]. With the low productivity of the sector and the rapid growth of the population, ensuring food security is a major challenge. Exploiting a modern agriculture monitoring system for real-time data collection about the environmental conditions of the farm will greatly improve the productivity and efficiency of the farming process [4].
In modern agriculture, most tasks such as sowing, irrigation, fertilizing, and harvesting are carried out by heavy machinery and equipment such as combines, harvesters, and tractors, which are partially or fully operated and controlled by remote monitoring systems using sensors and actuators. In this regard, the Internet of Things (IoT) plays an important role in developing a modern agriculture system that automates the farming process and improves productivity. IoT is an intelligent network system that connects massive numbers of low-power communication devices and sensors through the Internet. Therefore, in this chapter, we propose an IoT-based modern agriculture monitoring system using LoRaWAN and ThingSpeak to improve crop productivity and the efficiency of resource management. The main contributions of this chapter are:
• Explore current trends in IoT-based agriculture research and identify a number of critical concerns that must be addressed in order to transform the agriculture industry by leveraging recent IoT breakthroughs.
• Design and develop a low-cost IoT-based modern farming system to collect real-time data about the field and send it to cloud storage for further analysis.
• Evaluate the performance of the proposed system by implementing a ThingSpeak platform to accurately collect different field parameters pertaining to the growth of crops.
The rest of this chapter is structured as follows. In Sect. 2, the proposed system model is presented. The experimental setup of the proposed system is described in Sect. 3. In Sect. 4, the results of the proposed system are presented and discussed. Finally, the conclusion of the chapter is presented in Sect. 5.
2 Proposed System Model
The objective of this chapter is to design and test a modern farm monitoring system based on IoT technology. The proposed system consists of IoT sensor nodes, a microcontroller, a communication technology/Wi-Fi, a cloud-based data storage platform, and a local database system, as depicted in Fig. 1. Various types of IoT sensor nodes are deployed in different regions of the field to collect data about weather conditions, soil quality, crop growth, and more. The data is then sent to the Arduino IoT cloud, where valuable analytics can be retained. Based on the environmental conditions, the microcontroller controls different types of farming equipment and actuators without human intervention. The data collected from the field can also be stored in a ThingSpeak channel using the Write API Key for further analysis. The microcontroller, connected to the communication module, uploads the sensed parameter values to the data storage system and receives commands from the central control station.
Fig. 1 Structure of the proposed system
The major components and technologies applied for the implementation of modern agriculture systems are discussed as follows.
2.1 IoT Sensor Nodes
Sensor nodes are among the key components of an IoT-based smart agriculture system; they capture different environmental parameters, crop conditions, and other information in the farmland [5]. The most common sensor nodes used to collect farm data, such as soil moisture, humidity, temperature, pH level, soil nutrient, light intensity, and water-level sensors, are deployed in appropriate locations of the field to collect and send real-time data to the central control station.
2.2 Controllers and Processing Units
There are various low-cost development boards, such as Arduino, Raspberry Pi, and Gen X, that can be deployed in the field for controlling and processing the sensed data [6]. These microcontrollers interface with the IoT sensor nodes and a communication module in order to process the data collected by the sensor nodes and forward instructions from the central station to the actuators. The main criteria for selecting a microcontroller are simplicity, availability, and cost. In this context, Arduino is a low-cost, open-source electronics platform based on easy-to-use hardware and software. Thus, the Arduino UNO is selected for the practical implementation of the proposed system.
2.3 Communication Technologies
Communicating and reporting the environmental conditions with up-to-date real-time data can be considered the main element of a modern farming system. In recent years, a number of low-power, long-range, and low-cost IoT communication technologies have been developed in order to achieve wide-coverage connectivity with limited energy sources [7]. The most common long-range IoT communication technologies are LoRaWAN, Sigfox, and narrowband IoT (NB-IoT). NB-IoT is a licensed technology and requires cellular infrastructure. Considering its cost and availability in rural areas, NB-IoT is not a feasible solution for the proposed system. Similarly, Sigfox requires a subscription fee. Unlike Sigfox and NB-IoT, the LoRaWAN specification provides seamless interoperability among smart things with low power, long range, and low infrastructure cost [8]. Therefore, we choose LoRaWAN as the communication technology in the proposed system.
2.4 Cloud Storage and Local Database
Cloud storage platforms make it possible to access the collected data anytime and anywhere using any type of cloud view tool [9]. The proposed system uses the ThingSpeak cloud platform to record the different environmental data collected from the field. ThingSpeak is an open-source cloud platform for collecting environmental parameters at a remote location and visualizing them instantly. It allows users to aggregate, visualize, and analyze live data streams collected at different locations of the farmland.
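As an illustration of how readings might reach a ThingSpeak channel, the sketch below builds the URL of ThingSpeak's HTTP update endpoint, which takes the channel's Write API Key and up to eight numbered fields. It only constructs the request and sends nothing. The key, the helper name, and the field layout (1 = soil moisture, 2 = temperature, 3 = humidity, 4 = light intensity) are hypothetical, and in the proposed system the request would be issued by whichever component behind the LoRaWAN link has Internet access.

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, readings):
    """Build a ThingSpeak channel-update URL.

    readings maps field numbers (1 to 8) to sensor values.
    """
    params = {"api_key": write_api_key}
    for field_number, value in sorted(readings.items()):
        if not 1 <= field_number <= 8:
            raise ValueError("ThingSpeak channels have fields 1 to 8")
        params[f"field{field_number}"] = value
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"


# Hypothetical key and field layout (see the lead-in above).
url = build_update_url("YOUR_WRITE_API_KEY", {1: 41.5, 2: 24.0, 3: 63.0, 4: 512})
```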
2.5 Energy Solution
In most rural regions of Ethiopia, the availability of power grids is limited. Therefore, an alternative energy solution is of paramount importance for sustainable operation and a longer network lifetime [10]. The proposed system is required to include energy harvesting technologies such as solar, thermal, vibration, or acoustic harvesting in order to supply power to the IoT nodes and to the controller and data processing units. Solar energy harvesting is the most applicable option in rural regions where the main power grid is not available. Using a solar panel, up to 100 mW/cm² can be harvested and accumulated in the battery storage system [11]. This provides a renewable power supply, which significantly increases the network lifetime and energy efficiency of the proposed system.
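A rough daily energy balance shows how the quoted power density translates into a node budget. The sketch below is back-of-the-envelope arithmetic rather than a sizing method from the chapter: the panel area, the hours of full sun, the average load, and the `efficiency` factor are all hypothetical inputs, and the default of 100 mW/cm² is the upper bound quoted in the text.

```python
def harvested_energy_mwh(panel_area_cm2, sun_hours, power_density_mw_per_cm2=100.0,
                         efficiency=1.0):
    """Energy harvested per day in mWh (efficiency=1.0 reproduces the upper bound)."""
    return panel_area_cm2 * power_density_mw_per_cm2 * efficiency * sun_hours


def daily_energy_balance_mwh(panel_area_cm2, sun_hours, load_mw, efficiency=1.0):
    """Harvested minus consumed energy over 24 h; positive means the battery charges."""
    consumed = load_mw * 24
    return harvested_energy_mwh(panel_area_cm2, sun_hours, efficiency=efficiency) - consumed


# Hypothetical node: 50 cm2 panel, 5 h of full sun, 250 mW average load.
balance = daily_energy_balance_mwh(panel_area_cm2=50, sun_hours=5, load_mw=250)
```

With these made-up numbers the panel harvests 25,000 mWh per day against 6,000 mWh consumed; lowering `efficiency` shows how quickly the margin shrinks for a real panel.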
3 Experimental Setup
In this section, to realize and evaluate the performance of the proposed system, we implement a cloud-based smart agriculture system in which LoRaWAN is used as the communication technology to upload the data collected from the field to the cloud platform. The complete hardware implementation is given in Fig. 2. The system includes soil monitoring sensors (temperature, moisture, light, and humidity), an Arduino UNO, a LoRaWAN transceiver module, actuators (fan and pump), and an LCD display. The Arduino board serves as the processing unit, and the soil monitoring sensors are interfaced with it to report the environmental conditions. A simple and cheap digital temperature and humidity sensor is used to measure the humidity and the temperature in the field. The soil humidity is measured by calculating the resistance difference between two electrodes, and a negative temperature coefficient thermistor is used to measure the temperature. A YL69 moisture sensor is used to measure the volumetric water content of the soil; the moisture content corresponds to the resistance between its two electrodes. A BH1750 light sensor is used to measure the light intensity. The measured values from these sensors are fed to the processing unit, and according to the sensed results and predefined threshold levels, the Arduino board automatically controls the operation of the actuators. In addition, the Arduino board sends the sensed data to the ThingSpeak platform via the LoRaWAN transceiver. The LCD display is interfaced through an I2C adapter to visualize the ongoing process performed by the system. For design and simulation, we used the Arduino IDE and Proteus software. A sample simulation setup is depicted in Fig. 3.
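The threshold-based control described above can be sketched as a pure decision function. It is written in Python for readability rather than as Arduino firmware, and several details are assumptions: the reading names, the threshold values, and in particular the pairing of the fan with a maximum temperature, which the chapter does not spell out.

```python
def control_actuators(readings, thresholds):
    """Return the desired on/off state of the pump and the fan.

    The pump runs while soil moisture is below its minimum threshold.
    Assumption: the fan runs while temperature exceeds its maximum threshold.
    """
    return {
        "pump": readings["soil_moisture"] < thresholds["soil_moisture_min"],
        "fan": readings["temperature"] > thresholds["temperature_max"],
    }


# Hypothetical thresholds (percent moisture, degrees Celsius).
thresholds = {"soil_moisture_min": 30.0, "temperature_max": 28.0}
dry_and_hot = control_actuators({"soil_moisture": 25.0, "temperature": 31.0}, thresholds)
wet_and_cool = control_actuators({"soil_moisture": 45.0, "temperature": 22.0}, thresholds)
```

Keeping the decision separate from sensor reading and pin control makes it easy to check against recorded values before deploying it on the board.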
Fig. 2 Hardware setup for collecting real-time data
Fig. 3 Simulation setup to evaluate the performance of the proposed system
4 Simulation Results and Discussion
To test and observe the performance of the proposed system, the field monitoring parameters are simulated and compared with their respective threshold values. Accordingly, the Arduino board controls the operation of the actuators depending on the environmental parameters that directly affect the crop yield. For example, if the soil moisture content is less than the threshold value, the pump turns on automatically and starts watering the crops. Apart from automatic controlling of actuators, the
processing unit uploads the field parameter values to the cloud storage platform for further data analysis, visualization, and actions to be taken in the central control station. The readings of these environmental parameters are stored in the ThingSpeak cloud storage, so that the data can be fetched from the ThingSpeak channel using the Read API Key. The data can also be visualized using the free ThingView application on a smartphone. The proposed system collected different field parameters based on the experimental setup. These sensed values are then plotted in the ThingSpeak channel implemented in the cloud environment and updated every 15 s. Figure 4 shows the soil moisture level gathered by a moisture sensor deployed in the field to measure the volumetric water content of the soil. Based on the moisture readings, the pump is turned on and off automatically to water the field optimally. The light intensity measured in one region of the field is shown in Fig. 5. The temperature, collected by a digital temperature sensor, is depicted in Fig. 6. Figure 7 shows the soil humidity curve measured by a digital humidity sensor called DHT11. The data collected from the sensors can also be stored in a local database. A typical snapshot of the soil condition monitoring system is displayed as a bar chart in Fig. 8. The data displayed in the local database supports further visualization, analysis, and manipulation of the data according to the threshold values.
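Fetching the stored readings with the Read API Key can be sketched in the same spirit. The snippet below builds the URL of ThingSpeak's channel feed endpoint and works out how many entries cover a given time window at the 15 s update period mentioned above. It performs no network access, and the channel ID, the key, and both helper names are placeholders.

```python
from urllib.parse import urlencode

UPDATE_PERIOD_S = 15  # update interval reported in the text


def entries_for_window(window_s, period_s=UPDATE_PERIOD_S):
    """Number of feed entries needed to cover a time window."""
    return window_s // period_s


def build_read_url(channel_id, read_api_key, results):
    """Build the URL that returns the latest `results` entries of a channel as JSON."""
    params = urlencode({"api_key": read_api_key, "results": results})
    return f"https://api.thingspeak.com/channels/{channel_id}/feeds.json?{params}"


# Hypothetical channel: request the last hour of readings.
last_hour = entries_for_window(3600)
read_url = build_read_url(123456, "YOUR_READ_API_KEY", last_hour)
```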
Fig. 4 Moisture level of soil
Fig. 5 Light intensity measured at the field
Fig. 6 Field temperature value
5 Conclusion
In this chapter, we design and implement an IoT-based modern agriculture monitoring system to collect, process, and store real-time data. The proposed system deploys four low-cost IoT sensor nodes to capture different environmental parameters that pertain to the growth of crops and affect the crop yield. The system provides an automated field monitoring mechanism by sending real-time agricultural data to the Arduino board. Based on the environmental conditions, the Arduino board automatically controls the operation of actuators and farming equipment. This
Fig. 7 Soil humidity level
Fig. 8 A sample data recorded on the local database
method is cost-effective and enables better decisions about equipment efficiency, plant growth, and staff productivity, or even the automation of processes such as fertilization, irrigation, and pest control, which boosts the quality of crops and significantly reduces human effort. Furthermore, the sensed environmental parameters are uploaded to the cloud platform via LoRaWAN for further data analysis, visualization, and informed decision-making in the central control station, which improves the efficiency of resource management and crop productivity.
References
1. Abhishek D et al (2019) Estimates for world population and global food availability for global health. In: The role of functional food security in global health, vol 2, pp 3–24
2. Sisay BA (2016) Evaluating and enhancing irrigation water management in the upper Blue Nile basin, Ethiopia: the case of Koga large scale irrigation scheme. Agric Water Manag 170(2016):6–35
3. Zhang L, Dabipi IK, Brown WL (2018) Internet of Things applications for agriculture. In: Hassan Q (ed) Internet of Things A to Z: technologies and applications
4. Navulur S, Sastry ASCS, Giri Prasad MN (2017) Agricultural management through wireless sensors and Internet of Things. Int J Electr Comput Eng (IJECE) 7(6):3492–3499
5. Raut R, Varma H, Mulla C, Pawar V (2018) Soil monitoring, fertigation, and irrigation system using IoT for agricultural application. In: Intelligent communication and computational technologies. Singapore
6. Saini H, Thakur A, Kumar N (2016) Arduino base automatic wireless weather station with remote graphical application and alerts. In: 3rd international conference, Sept 2016. ISBN 978-1-4673-9197-9
7. Ahmad M, Ishtiaq A, Habib MA, Ahmed SH (2019) A review of Internet of Things (IoT) connectivity techniques. In: Recent trends and advances in wireless and IoT-enabled networks. Cham
8. Marais JM, Malekian R, Abu-Mahfouz AM (2019) Evaluating the LoRaWAN protocol using a permanent outdoor testbed. IEEE Sens J 19(12):4726–4733
9. TongKe F (2013) Smart agriculture based on cloud computing and IoT. J Converg Inf Technol 8(2)
10. Tang X, Cattley R, Gu F, Ball AD (2018) Energy harvesting technologies for achieving self-powered wireless sensor networks in machine condition monitoring: a review. Sensors 18(12):4113–4152
11. Pozo B, Garate JI, Araujo JÁ, Ferreiro S (2019) Energy harvesting technologies and equivalent electronic structural models. Electronics 8(5):486–517
Chapter 6
Estimation of Wheat Yield Based on Precipitation and Evapotranspiration Using Soft Computing Methods
Abdüsselam Altunkaynak and Eyyup Ensar Başakın
A. Altunkaynak · E. E. Başakın (✉) Faculty of Civil Engineering, Hydraulics and Water Resource Engineering Division, Istanbul Technical University, Istanbul 34469, Turkey e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_6
1 Introduction
Wheat is a significant factor in human nutrition; therefore, wheat cultivation has to be planned adequately. Wheat (Triticum aestivum and Triticum turgidum) is the third most cultivated plant in the world, grown on an area of 214 million hectares (Mha) in 2018. The total amount of wheat obtained from this area was approximately 734 million tons (Mt), and the average production was 3425 kg/ha [20]. Efficient wheat cultivation requires relatively cool climates; nevertheless, wheat is cultivated in many places in the world regardless of the climate of the region. The leading wheat-producing countries in the world are China (131 Mt), India (99 Mt), Russian Federation (72 Mt), USA (51 Mt), France (35 Mt), Canada (31 Mt), Pakistan (25 Mt), Ukraine (24 Mt), Australia (20.9 Mt), Germany (20.2 Mt) and Turkey (20 Mt), in descending order [20]. Factors affecting the wheat yield should be well defined during wheat cultivation, because efficient cultivation can be planned in the light of the parameters affecting the yield. A strong correlation is observed between the required amount of water and the wheat yield when crop yield is predicted per unit of applied water. Determining the water productivity is crucial for evaluating the crop yield in arid and semiarid lands, where water is an important factor [2, 17, 19, 25]. These studies investigated the effects of precipitation on the winter wheat yield for consecutive years in experimental areas. Studies revealed that drought seasons and seasons with heavy precipitation have significant effects on the crop yield [26]. Pirttioja et al. [40] investigated the change in winter wheat yield with respect to change in seasons at multiple sites in different countries. For the range of climate change
investigated in their study, the model results indicate that the yield is more dependent on variations in temperature than on changes in precipitation at the Finnish site, whereas the sensitivity of the wheat yield to temperature and precipitation variations is more balanced at the German and Spanish sites. According to their model results, under current CO2 levels, the average wheat yield decreases with relatively high temperatures and low precipitation; in particular, wheat yield would benefit from an increase in precipitation. Their findings show that precipitation is the most significant parameter in some regions, while temperature is the most significant one in other regions. Clearly, temperature is one of the important climate parameters that affect wheat growth and yield [24, 52, 56]. According to Asseng et al. [10], the temperature during wheat growth is the most significant parameter and has a dominant role in the duration of phenological stages. Crop yield may decrease due to an increase in summer temperature during grain filling. A decrease of 5% in crop yield has been observed per increase of 1 °C [37]. In that study, the reaction of the plants to water stress was measured in order to maximize the crop yield and quantify the water productivity in areas lacking sufficient water resources. The effect of crop water requirement on plant growth and crop yield was investigated by Bouazzama et al. [12]. It was shown that there was a linear correlation between the crop yield and the actual evapotranspiration using two years' average values. It was found that lack of irrigation affected the growth length, expedited the aging of leaves and decreased the leaf area index. The evapotranspiration value (ET) of wheat can be low in non-irrigated, drought-prone areas, while it can reach 600–800 mm under heavy irrigation conditions. The ET value usually varies between 200 and 500 mm.
The slope of grain yield against ET can be taken as the water use efficiency. There is a positive correlation between plant evapotranspiration and crop yield. For this reason, water stress is inevitably a factor that reduces grain yield. Musick et al. [33] observed a linear relationship between ET values and wheat yield in Texas, USA. Therefore, in this study, the ET variable is used as an input to the multilinear regression (MLR), support vector machine (SVM) and fuzzy models to predict wheat yield. Su et al. [47] developed the SVM-based open crop model (SBOCM) for prediction of rice yield from daily meteorological data and reported that the SVM model can be used to predict annual rice yield. Nitze et al. [36] compared SVM, artificial neural network, random forest and maximum likelihood models for agricultural field classification using remote sensing techniques. After the investigation of 500 agricultural fields in the Canadian Prairies, SVM classification models were found to be more accurate than the others. Rumpf et al. [42] used SVM to identify harmful weeds that may cause economic loss in agricultural production fields. In that study, where image processing methods based on shape properties were used, SVM provided classification with superior accuracy. Kaundal et al. [27] developed a new SVM model for plant disease prediction using weather data. The authors showed that SVM was superior for predicting plant diseases compared to commonly used machine learning techniques and traditional regression approaches. Sujjaviriyasup and Pitiruek [48] used the SVM approach to predict Thailand’s orchid exports.
That study, using time series, also developed a hybrid model with autoregressive integrated moving average (ARIMA) and compared it with stand-alone ARIMA and SVM models. The hybrid model gave better prediction results than the other models. Ahmed et al. [1] used SVM for weed identification. They used 224 samples in their study and differentiated weeds from crops with an accuracy of 0.97. The fuzzy logic (FL) approach has been used in many studies. Mazloumzadeh et al. [31] developed an FL model to predict date tree yield, with changes in the tree features used as input. An accuracy ratio of 0.87 was obtained when their results were compared with expert opinions. Parker et al. [38] used an FL model to investigate the management and performance of wheat, rapeseed, barley and maize under the influence of possible future climate changes. Kumar and Kumar [30] predicted the rice yield for the upcoming year by using the production values of past years. The predicted values were found to be very close to the observed data. A study was conducted in India to specify the most appropriate herb to plant, with the highest yield ratio, under a particular condition. The authors used MATLAB to develop a fuzzy model that identifies which herb is suitable for agriculture in a selected area using land features as inputs [51]. Nisar et al. [35] developed a fuzzy-based geographical information system model to analyze the suitability of areas for wheat cultivation. They showed that wheat grew best on the most suitable agricultural land, which was designated by combining geographical information systems with fuzzy classifications that allow partial membership, and that wheat yield increased as a consequence.
A study compared a fuzzy logic model with discriminant analysis for prediction of peanut maturation, and the fuzzy logic model yielded prediction accuracies of 0.45, 0.63 and 0.73 for 6 classes, 5 classes and 3 classes, respectively [43]. Another study evaluated the quality of apples using a fuzzy logic model, where color, size and surface defects were used as inputs, and the classification results of the fuzzy model were compared with those obtained from experts [28]. In another study, where a greenhouse climate controller was developed with a fuzzy model, the researchers stated that classical methods were insufficient to control air conditions in a greenhouse [46]. Sicat et al. [44] developed a fuzzy model for classifying agricultural areas. The investigators selected agricultural areas in a particular region of India for their study and used only the information given by the farmers. Citakoglu et al. [14] predicted evapotranspiration values with ANN and ANFIS. Turkey was chosen as the study area, and various models were created with different input combinations. It was concluded that the ANFIS model had better prediction accuracy than the ANN method. In this study, fuzzy, support vector machine (SVM) and multilinear regression (MLR) models are developed to predict the wheat yield from precipitation (P) data and deficit of crop water requirement (DCWR) data for Konya and Kirikkale, the provinces with the highest wheat crop production in Turkey. The objectives of the present study are threefold: (1) investigate the effects of precipitation and evapotranspiration on wheat yield in detail via multilinear regression, support vector machine and fuzzy-based models, (2) improve the prediction accuracy of the annual wheat yield in Kirikkale and Konya provinces, and (3) evaluate the performances of the developed multilinear regression,
support vector machine and fuzzy models quantitatively by means of root-mean-square error (RMSE) and coefficient of efficiency (CE). The main contribution of this study is that it is the first attempt to predict wheat yield for the Konya and Kirikkale regions. This research also developed predictive models for wheat yield based on a comparative analysis of MLR, SVM and fuzzy models. The rest of this study is organized as follows: Sect. 2 Materials and Methods, Sect. 3 Results and Sect. 4 Conclusions.
2 Materials and Methods
2.1 Crop Water Requirement
Germination, emergence, tillering, floral initiation or double ridge, terminal spikelet, first node or beginning of stem elongation, boot, spike emergence, anthesis and maturity are the stages in the physiological growth of wheat. These stages are grouped as follows: germination to emergence (E); growth stage 1 (GS1), from emergence to double ridge; growth stage 2 (GS2), from double ridge to anthesis; and growth stage 3 (GS3), from anthesis to maturity. Note that GS3 includes the grain filling period, as shown in Fig. 1 [22]. In previous research, various methods have been used to calculate the reference crop evapotranspiration, namely the Blaney–Criddle [11], Penman [39], Pan Evaporation, Hargreaves–Samani [21] and FAO Penman–Monteith [3] methods. In this study, the FAO Penman–Monteith method was selected to calculate the reference crop evapotranspiration.
Fig. 1 Schematic diagram of wheat growth and development stages
$$ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \tag{1}$$
where $ET_o$ is the reference evapotranspiration [mm/day]; $R_n$ is the net radiation at the crop surface [MJ/m²/day]; $G$ is the soil heat flux density [MJ/m²/day]; $T$ is the mean daily air temperature at 2 m height [°C]; $u_2$ is the wind speed at 2 m height [m/s]; $e_s$ is the saturation vapor pressure [kPa]; $e_a$ is the actual (observed) vapor pressure [kPa]; $e_s - e_a$ is the saturation vapor pressure deficit [kPa]; $\Delta$ is the slope of the vapor pressure curve [kPa/°C]; and $\gamma$ is the psychrometric constant [kPa/°C]. In order to optimize irrigation on a regional scale, the reference grass evapotranspiration and the crop coefficient $K_c$ are required to determine $ET_c$, which represents the crop-specific water requirement and is needed for accurate estimation of irrigation requirements [54]:
$$ET_c = ET_o \times K_c \tag{2}$$
where $ET_c$ is the crop water requirement, $ET_o$ is the reference evapotranspiration and $K_c$ is the crop coefficient. In this study, $ET_c$ and $ET_o$ values were computed using the CROPWAT software, taking into consideration the average of maximum and minimum temperatures, relative humidity, wind speed, sunshine duration and precipitation.
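As a rough illustration of Eqs. (1) and (2), and of the deficit of crop water requirement that the chapter later obtains by subtracting precipitation from the crop water requirement, the calculation can be sketched in Python. The chapter itself used CROPWAT; the function names, the example inputs and the crop coefficient below are hypothetical and only show how the terms combine.

```python
def reference_et(delta, r_n, g, gamma, t_mean, u2, e_s, e_a):
    """Reference evapotranspiration ETo [mm/day] following Eq. (1)."""
    radiation_term = 0.408 * delta * (r_n - g)
    aerodynamic_term = gamma * (900.0 / (t_mean + 273.0)) * u2 * (e_s - e_a)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


def crop_et(et_o, k_c):
    """Crop water requirement ETc following Eq. (2)."""
    return et_o * k_c


def water_deficit(et_c_total, precipitation):
    """Deficit of crop water requirement: ETc minus precipitation (same units)."""
    return et_c_total - precipitation


# Illustrative (assumed) daily inputs, not data from the study.
et_o = reference_et(delta=0.246, r_n=14.33, g=0.14, gamma=0.0674,
                    t_mean=30.2, u2=2.0, e_s=4.42, e_a=2.85)
et_c = crop_et(et_o, k_c=1.15)          # k_c = 1.15 is a hypothetical value
deficit = water_deficit(450.0, 300.0)   # hypothetical seasonal totals in mm
print(round(et_o, 2), round(et_c, 2), deficit)
```

Keeping the radiation and aerodynamic terms separate makes it easier to check each against the definitions given for Eq. (1).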
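2.2 Description of the Methods
Fuzzy Model
A typical rule for a Takagi–Sugeno [50] fuzzy inference system (FIS) can be shown as [4, 7, 8, 55]:
$$\text{IF } (x_1 = A_{11}) \text{ AND } (x_2 = A_{12}) \text{ THEN } (z_1 = p_{10} + p_{11}x_1 + p_{12}x_2) \tag{3}$$
$$\text{IF } (x_1 = A_{21}) \text{ AND } (x_2 = A_{22}) \text{ THEN } (z_2 = p_{20} + p_{21}x_1 + p_{22}x_2) \tag{4}$$
or
$$\text{IF } (x_1 = A_{11}) \text{ AND } (x_2 = A_{12}) \text{ THEN } (z_1 = c_1) \tag{5}$$
$$\text{IF } (x_1 = A_{21}) \text{ AND } (x_2 = A_{22}) \text{ THEN } (z_2 = c_2) \tag{6}$$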
Here, the linear function $z_1 = p_{10} + p_{11}x_1 + p_{12}x_2$ is the consequent part of the fuzzy rule, and the IF clause is its antecedent part. The consequent is usually a linear function of one or more variables ($x_1$ and $x_2$ in Eqs. 3 and 4) or, as in Eqs. 5 and 6, a constant. In general, the consequent part of each rule is similar to
a linear model. Each rule gives a numerical output value. The final result is therefore computed by a weighted average, which yields a numerical output using simple algebra. For this reason, in this study, the Takagi–Sugeno [50] method is used to predict wheat yield from precipitation and deficit of crop water requirement values. The following three steps are required to implement the Takagi–Sugeno [50] method:
1. The consequent part (the output) of each rule $R_r$ is represented as a linear function $z_r$ of the inputs:
$$z_r = f_r(x_1, x_2, x_3, \ldots, x_n) = p_{r0} + p_{r1}x_1 + p_{r2}x_2 + \cdots + p_{rn}x_n \tag{7}$$
where $n$ is the number of inputs and $r$ indexes the fuzzy rules.
2. The membership degrees of the fuzzy rules are selected as weights:
$$c_r = (MD_{r1} \wedge MD_{r2} \wedge MD_{r3} \wedge \cdots \wedge MD_{rk}) \times w_r \tag{8}$$
where $MD_{r1}, MD_{r2}, MD_{r3}, \ldots, MD_{rk}$ are the membership degrees read from the membership functions of the input variables for the $r$th rule, and $w_r$ is the weight of the rule, taken as 1 for simplicity. The symbol $\wedge$ denotes the min operation.
3. The final result of wheat yield, WY, is determined as the weighted average of the $z$ functions:
$$WY = \frac{\sum_{i=1}^{4} c_i z_i}{\sum_{i=1}^{4} c_i} \tag{9}$$
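The three steps above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the membership functions are assumed to be half-triangles over a hypothetical input range, and the rule coefficients in `HYPOTHETICAL_RULES` are invented for the example. The last lines apply Eq. (9) to the rounded $c_i$ and $z_i$ values listed in Table 2 of this chapter.

```python
def low_mf(x, lo, hi):
    """'Low' set: membership 1 at lo, falling linearly to 0 at hi (assumed shape)."""
    return min(1.0, max(0.0, (hi - x) / (hi - lo)))


def high_mf(x, lo, hi):
    """'High' set: membership 0 at lo, rising linearly to 1 at hi (assumed shape)."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def weighted_average(weights, outputs):
    """Eq. (9): weighted average of the rule outputs."""
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fired for this input")
    return sum(c * z for c, z in zip(weights, outputs)) / total


def ts_predict(p, dcwr, p_range, d_range, rules):
    """Takagi-Sugeno inference for two inputs with two fuzzy sets each."""
    mu_p = {"L": low_mf(p, *p_range), "H": high_mf(p, *p_range)}
    mu_d = {"L": low_mf(dcwr, *d_range), "H": high_mf(dcwr, *d_range)}
    weights, outputs = [], []
    for (set_p, set_d), (p0, p1, p2) in rules.items():
        weights.append(min(mu_p[set_p], mu_d[set_d]))   # Eq. (8) with w_r = 1
        outputs.append(p0 + p1 * p + p2 * dcwr)         # Eq. (7)
    return weighted_average(weights, outputs)


# Hypothetical rule coefficients (p_r0, p_r1, p_r2); not taken from the chapter.
HYPOTHETICAL_RULES = {
    ("L", "L"): (120.0, 0.20, -0.10),
    ("L", "H"): (100.0, 0.15, -0.15),
    ("H", "L"): (150.0, 0.30, -0.05),
    ("H", "H"): (110.0, 0.25, -0.20),
}

estimate = ts_predict(228.0, 300.0, (150.0, 400.0), (200.0, 450.0), HYPOTHETICAL_RULES)

# Eq. (9) applied to the rounded c_i and z_i values listed in Table 2.
table2_estimate = weighted_average([0.30, 0.14, 0.39, 0.18],
                                   [273.94, 115.84, 358.21, 65.60])
print(round(estimate, 2), round(table2_estimate, 2))
```

Because the "low" and "high" memberships are complementary here, at least one rule always has a non-zero weight, so the guard in `weighted_average` only matters for other membership shapes.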
The FIS constructs a connection between the input and output variables. In this study, precipitation (P) and deficit of crop water requirement (DCWR) data were used as inputs into a FIS as depicted in Fig. 2.
Fig. 2 Inputs and output diagram of fuzzy inference system
Support Vector Machine
Support vector machine (SVM) is a machine learning algorithm that relies on optimization and works on the principle of structural risk minimization. SVM can be used for both classification and regression problems [15]. In classification, the purpose is to separate the data correctly. A boundary line that separates the data linearly is used for this purpose. Infinitely many such lines can be drawn. A space called the “margin” is used to find the line providing the best separation, as shown in Fig. 3. The two parallel lines that maximize the width of the margin provide the optimal solution. The data samples from both classes that lie on these lines, and thereby define the separation, are called support vectors.
Fig. 3 Separating hyperplane
Let the training data for binary classification be $\{x_i, y_i\}$, $i = 1, 2, 3, \ldots, l$, $y_i \in \{-1, 1\}$, $x_i \in \mathbb{R}^d$. The machine that uses training data during the learning phase can
find a discriminant, that is, it can determine the parameters of the decision function $d(x, w, b)$, namely $w = [w_1, w_2, \ldots, w_n]^T$ and $b$ (with $x, w \in \mathbb{R}^n$), as shown in Eq. 10:
$$d(x, w, b) = w^T x + b = \sum_{i=1}^{n} w_i x_i + b \tag{10}$$
where $x$ is a point in the input space; $w$ is the weight vector; and $b$ is the bias, which helps to determine the location of the hyperplane. If $d(x, w, b) > 0$, then $x$ belongs to class +1 ($y = +1$), and if $d(x, w, b) < 0$, then $x$ belongs to class −1 ($y = -1$). If $w$ is scaled so that the functional margin is 1 at the closest positive point $x^+$ and the closest negative point $x^-$, then
$$w \cdot x^{+} + b = +1 \tag{11}$$
$$w \cdot x^{-} + b = -1 \tag{12}$$
and the geometrical margin, that is, the distance between these two parallel hyperplanes, is $2/\lVert w \rVert$. For all training points, these two equations can be expressed as:
$$y_i (w \cdot x_i + b) \ge +1, \quad \forall i \tag{13}$$
To find the maximum margin:
$$\text{Minimization:} \quad \min \frac{1}{2}\lVert w \rVert^2 \tag{14}$$
$$\text{Constraint:} \quad y_i (w \cdot x_i + b) \ge +1, \quad \forall i \tag{15}$$
A nonlinear optimization problem is encountered when solving these equations. Such constrained problems can be solved by using the Lagrange function: a Lagrange multiplier is defined for each constraint, and the constraint equations are multiplied by their Lagrange multipliers and subtracted from the objective function:
$$L_P \equiv \frac{1}{2}\lVert w \rVert^2 - \sum_{i=1}^{l} \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right] \tag{16}$$
$$\alpha_i \ge 0 \tag{17}$$
where $\alpha_i$ are the Lagrange multipliers. $L_P$ should be minimized with respect to $w$ and $b$, while its derivatives with respect to $\alpha_i$ are required to be equal to 0. This is a convex quadratic programming problem, and it is equivalent to the following dual problem, which is known as the Wolfe dual. The dual form is
$$\max L_D \equiv \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} y_i y_j \alpha_i \alpha_j \, x_i \cdot x_j \tag{18}$$
$$\alpha_i \ge 0, \quad i = 1, 2, 3, \ldots, l \tag{19}$$
$$\sum_{i=1}^{l} y_i \alpha_i = 0 \tag{20}$$
Because the problem is a convex quadratic program, the optimization process always ends at a global maximum. Therefore, the maximum margin classifier can be derived by using the following equation:
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{l} y_i \alpha_i \, x_i \cdot x + b \right) \tag{21}$$
Generalization capacity increases when the number of support vectors is low [16]. A margin that requires perfect linear separation cannot produce appropriate solutions for many problems encountered in engineering. To overcome this issue, a formulation that tolerates errors, called the soft margin, has been developed. Maximizing the margin is combined with minimizing the number of false classifications, and the trade-off between the margin and the false classifications is controlled by a positive constant $C$ that is defined in advance. In the linearly inseparable case, the hyperplane is defined as the one that maximizes the margin and minimizes the errors $\xi_i$. Then, the optimization problem is:
$$\min \; \frac{\lVert w \rVert^2}{2} + C \sum_{i=1}^{l} \xi_i \tag{22}$$
subject to $y_i (w \cdot x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$.
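Accordingly, the Lagrange function is rewritten as:
$$L_P = \frac{1}{2}\lVert w \rVert^2 + C \sum_{i} \xi_i - \sum_{i} \alpha_i \left\{ y_i (x_i \cdot w + b) - 1 + \xi_i \right\} - \sum_{i} \mu_i \xi_i \tag{23}$$
The Lagrange multipliers $\mu_i$ are introduced to keep the $\xi_i$ positive (in Eq. 23); the solution is then obtained by converting the objective function to a dual problem:
$$L_D = \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j \tag{24}$$
$$\text{Constraints:} \quad \sum_{i} \alpha_i y_i = 0 \quad \text{and} \quad 0 \le \alpha_i \le C, \quad \forall i \tag{25}$$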
The decision function can be converted to the following form by using duality and by expressing $w$ as a linear combination of the terms $y_i x_i$:
$$f(x) = \sum_{i=1}^{l} \alpha_i y_i (x_i \cdot x) + b \tag{26}$$
$$y = \operatorname{sgn}(f(x)) \tag{27}$$
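Equations (26) and (27) can be illustrated with a small Python sketch. The two support vectors, their multipliers and the bias below are a hand-made toy example (not values from the study), chosen so that the conditions in Eqs. (11), (12) and (20) hold.

```python
def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def decision_value(x, support_vectors, labels, alphas, b):
    """Eq. (26): f(x) = sum_i alpha_i * y_i * (x_i . x) + b."""
    return sum(a * y * dot(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b


def classify(x, support_vectors, labels, alphas, b):
    """Eq. (27): class label from the sign of f(x); f(x) = 0 is mapped to +1 here."""
    return 1 if decision_value(x, support_vectors, labels, alphas, b) >= 0 else -1


# Toy problem (assumed): one support vector per class.
SUPPORT_VECTORS = [(1.0, 1.0), (-1.0, -1.0)]
LABELS = [1, -1]
ALPHAS = [0.25, 0.25]   # satisfies sum_i alpha_i * y_i = 0, as in Eq. (20)
BIAS = 0.0

for point in [(1.0, 1.0), (-1.0, -1.0), (2.0, 0.0), (-3.0, 1.0)]:
    print(point,
          decision_value(point, SUPPORT_VECTORS, LABELS, ALPHAS, BIAS),
          classify(point, SUPPORT_VECTORS, LABELS, ALPHAS, BIAS))
```

Only the support vectors enter the sum, which is why a small number of them keeps prediction cheap.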
When the data cannot be separated linearly in the original space, classification becomes possible by using a kernel function that carries the data to a high-dimensional space. In general, the kernel function is described as:
$$K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j) \tag{28}$$
With this adjustment, the problem can be solved in the dual space of the Lagrange multipliers rather than in the primal weight space. Finally, the equation takes the following form:
$$L_D = \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{29}$$
The most commonly used kernel functions are as follows:
Linear kernel function: $K(x_i, x_j) = x_i^T \cdot x_j$
Polynomial kernel function: $K(x_i, x_j) = (\gamma \, x_i^T \cdot x_j + r)^d, \; \gamma > 0$
Sigmoidal kernel function: $K(x_i, x_j) = \tanh(\gamma \, x_i^T \cdot x_j + r)$
Radial-based kernel function: $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^2), \; \gamma > 0$
Here, $\gamma$, $r$ and $d$ are the kernel parameters.
Multilinear Regression Model
The aim of the multilinear regression models is to predict the dependent variable from independent variables and to determine which independent variables have more effect on the dependent variable. Here, $y$ represents the dependent variable, and $x_1, x_2, \ldots, x_p$ represent the independent variables in the multilinear regression model, which can be defined as:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_j x_j + \cdots + \beta_p x_p + \varepsilon \tag{30}$$
where $\beta_0, \beta_1, \beta_2, \ldots, \beta_j, \ldots, \beta_p$ are the unknown model parameters, and any model parameter $\beta_j$ represents the expected amount of change in the variable $y$ corresponding to a unit change in the variable $x_j$. In other words, $\beta_1, \beta_2, \ldots, \beta_p$ are the weight coefficients of the independent variables for the prediction of the dependent variable, and for that reason the parameters $\beta_j$ ($j = 1, 2, \ldots, p$) are called the regression coefficients. The parameter $\beta_0$ is called the intercept or constant value of the regression model, and it shows the value of the dependent variable when all variable values ($x_j$) are zero. The $\varepsilon$ term in Eq. 30 represents the white noise (error term) [9, 29]. A comparison of fuzzy and support vector machine models with the multilinear regression model is summarized below [6]:
• The unknown model parameters, namely $\beta_0, \beta_1, \beta_2, \ldots, \beta_j, \ldots, \beta_p$, are usually obtained by the least squares method in regression analysis, which involves restrictive assumptions such as linearity, normality and homoscedasticity [5–7]. Furthermore, these assumptions are required to be checked before applying the regression model. The fuzzy model does not have such assumptions and requirements.
• Fuzzy models can handle large amounts of complex nonlinear data easily. There may be more than one output in fuzzy models, while only one output is allowed in MLR.
• Regression analysis depends on numerical data; fuzzy models, however, can be employed for verbal data in addition to numerical data [6, 23]. For example, flood damage, earthquakes and hurricanes can be predicted by talking with people in the field, face to face, individually.
• Fuzzy models have the ability to establish a nonlinear relationship between input and output variables with constructed IF–THEN fuzzy rules.
• SVM is one of the widely used kernel learning algorithms and is implemented for linear and nonlinear pattern identification owing to its convex optimization approach [41].
• Among the disadvantages of support vector machines, the choice of kernel function comes first. There is no clear view in the literature on which kernel function to choose [13].
• Another disadvantage of SVM is that the training process takes too long if a large amount of data is used.
• The most important factor in the duration of the training phase of an SVM model is the process of finding the optimum parameters of the kernel function.
• If radial-based kernel functions are used, the penalty ($C$), gamma ($\gamma$) and epsilon ($\varepsilon$) parameters must be determined a priori. These parameters are not unique, and they take different values for each data structure [49].
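The least squares estimation of the parameters in Eq. (30) can be sketched in plain Python by solving the normal equations $(X^T X)\beta = X^T y$. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the small data set is synthetic, generated exactly from $y = 50 + 0.5P - 0.2\,DCWR$ so that the recovered coefficients can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return [beta_0, beta_1, ..., beta_p] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]          # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, row):
    """Evaluate Eq. (30) without the error term."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Synthetic (P, DCWR) pairs in mm; the yields follow the assumed linear law exactly.
ROWS = [(200.0, 400.0), (250.0, 320.0), (300.0, 310.0), (350.0, 200.0), (280.0, 260.0)]
YIELDS = [50.0 + 0.5 * p - 0.2 * d for p, d in ROWS]

BETA = fit_mlr(ROWS, YIELDS)
print([round(b, 4) for b in BETA], round(predict_mlr(BETA, (228.0, 300.0)), 2))
```

For real data a library routine based on QR or SVD is usually preferable to the normal equations, which can be poorly conditioned when the predictors are strongly correlated.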
Fig. 4 Map of study area for developed models
Description of the Study Area
Konya is located at the coordinates of 32° E and 38° N, as shown in Fig. 4. Konya includes the Hotamis (including swamp), Karapinar, Karaman and Eregli (including Ayranci lowland) lowlands, with areas of 800, 700, 500 and 2500 km², respectively. Meteorological data of Konya from 1992 through 2015, consisting of monthly maximum average temperature, minimum average temperature, relative humidity, average wind speed, average sunshine duration and total precipitation values, were obtained from station 17244. Kirikkale is located at the coordinates of 33°31′ E and 39°50′ N (Fig. 4). The State Meteorological General Administration also provided 24 years of data, from 1992 to 2015, observed at station 17135. Temperature, humidity, wind speed and precipitation values were obtained monthly, and evapotranspiration (ET) values were calculated for the growth period of the plants. Eventually, the annual yield in Konya and Kirikkale provinces was predicted (Fig. 5). Wheat yield values for Konya and Kirikkale provinces were obtained from the webpage of the Turkish Statistical Institute; the average, maximum and minimum yield values for Konya are 218, 349 and 138 kg/da, respectively. Similarly, the average, maximum and minimum yield values for Kirikkale are 190, 277 and 99 kg/da, respectively [53]. The majority of wheat cultivation in the Konya and Kirikkale areas is rain-fed (dry farming) due to the lack of sufficient water resources. The amount of water required during the growth period of wheat was calculated. The water deficit was found by subtracting precipitation from the crop water requirement.
Fig. 5 Scatter diagrams of observed data (a) precipitation versus corresponding wheat yield (b) deficit of crop water requirement versus corresponding wheat yield for Konya region
Yield is found to decrease when this deficit increases and to increase as the deficit decreases. These two variables, precipitation and the deficit of crop water requirement, are used as inputs into the fuzzy, SVM and MLR models for predicting wheat yield. For development of the models, the data were separated into two parts. The first part, 14 years with 14 data points, was used for the calibration (training) phase, and the remaining 10 years with 10 data points were used for validation (testing) of the models. After testing triangular and trapezoidal membership functions by trial and error, the triangular membership function was found to give better prediction results. Therefore, triangular membership functions are used with two fuzzy subsets for the development of the fuzzy model. Four fuzzy rules (decision matrix) are constructed for the precipitation and deficit of crop water requirement data in order to predict wheat yield values, as illustrated in Fig. 6.
Fig. 6 Perfect model line for wheat yield of Konya: a fuzzy model prediction, b SVM model prediction, c multilinear regression model prediction
Performance Evaluation Criteria
The performance of models has, in general, been investigated in the literature using various evaluation criteria. Root-mean-square error (RMSE) and coefficient of
efficiency (CE) [34] are the two most extensively used parameters among the existing evaluation criteria according to [45]. Therefore, in this study, model performance was investigated in terms of RMSE and CE values. The RMSE, mean absolute error (MAE) and mean absolute percentage error (MAPE) are calculated by using the following equations:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( V_{pi} - V_{oi} \right)^2} \tag{31}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| V_{pi} - V_{oi} \right| \tag{32}$$
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{V_{pi} - V_{oi}}{V_{oi}} \right| \tag{33}$$
where $n$ is the number of observed data, $V_{pi}$ is the predicted value and $V_{oi}$ is the observed value. The RMSE of an ideal model is zero. Another effective performance evaluation parameter is CE, which measures the consistency between the observed data and the predicted values. The CE parameter is defined as:
$$\mathrm{CE} = 1 - \frac{\sum_{i=1}^{n} \left( V_{pi} - V_{oi} \right)^2}{\sum_{i=1}^{n} \left( V_{oi} - \overline{V}_o \right)^2} \tag{34}$$
where $\overline{V}_o$ is the mean value of the observed data, and $n$, $V_{pi}$ and $V_{oi}$ are defined as for Eq. (33). The prediction performance of a model is very good when the CE value approaches 1. In general, the performance of a model is considered to be acceptable when the CE value is greater than 0.50, based on [32]. In addition, according to [18], CE values are categorized into three classes: the model is fair when the CE value is between 0.65 and 0.75, good when it is between 0.75 and 0.85, and very good when it is greater than 0.85. As a result, the model with the largest CE and smallest RMSE value is considered the best model for prediction of yearly wheat yield for the Konya and Kirikkale regions.
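Equations (31) to (34) translate directly into code. The sketch below is a plain-Python illustration with hypothetical function names and made-up sample values; `ce_class` encodes the three classes quoted from [18], and both the handling of the boundary values and the label for CE below 0.65 are assumptions, since the text does not specify them.

```python
import math


def rmse(pred, obs):
    """Eq. (31): root-mean-square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mae(pred, obs):
    """Eq. (32): mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def mape(pred, obs):
    """Eq. (33): mean absolute percentage error, in percent."""
    return 100.0 / len(obs) * sum(abs((p - o) / o) for p, o in zip(pred, obs))


def ce(pred, obs):
    """Eq. (34): coefficient of efficiency."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((p - o) ** 2 for p, o in zip(pred, obs))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def ce_class(value):
    """Classes quoted from [18]; boundary handling is an assumption."""
    if value > 0.85:
        return "very good"
    if value > 0.75:
        return "good"
    if value >= 0.65:
        return "fair"
    return "below fair"   # label assumed; the text defines no class here


# Made-up observed and predicted yields (kg/da), for illustration only.
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
print(rmse(predicted, observed), mae(predicted, observed),
      round(mape(predicted, observed), 2), ce(predicted, observed),
      ce_class(ce(predicted, observed)))
```

Note that CE compares the model against simply predicting the observed mean, so a model that always returns the mean scores exactly 0.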
3 Results
In this study, for predicting wheat yield, a FIS with two linguistic inputs, namely precipitation and deficit of crop water requirement, and one linguistic output, namely
wheat yield, was constructed, and each input was assigned to two fuzzy sets, namely low (L) and high (H), as indicated in Fig. 2. To determine the optimal solution, the margin was used to detect the line that provides the best separation, as depicted in Fig. 3. The two straight lines that maximize the width of the margin provided the optimal solution. In this study, the SVM training data were mapped to a higher-dimensional space by utilizing radial-based kernel functions, and linear separation of the data was accomplished there. The penalty parameter, C, acts as a regularization parameter. When C has a small value, the tolerance of error is very high. Conversely, if C is high, the tolerance is low, and the model suffers from overlearning and low generalization. The gamma value, on the other hand, is a multiplier that allows SVM to gain flexibility. When this value is too high, hypersensitive learning takes place and the model suffers from overfitting. Epsilon (ε) is the parameter of the insensitive loss function, and it adjusts the desired error level when developing the regression model. The developed fuzzy and MLR models were run for the Konya and Kirikkale regions shown in Fig. 4. The scatter diagrams in Fig. 5 illustrate the relationships between the observed precipitation and deficit of crop water requirement and the corresponding wheat yield data. As can be seen from Fig. 5a, there is a linear relationship between precipitation and the corresponding wheat yield, whereas Fig. 5b shows that there is an inverse linear relationship between the deficit of crop water requirement and wheat yield for Konya region. The same trends are also observed for Kirikkale region, as depicted in Fig. 5a–c. Two linguistic variables, precipitation and deficit of crop water requirement, and two fuzzy sets resulted in 2 × 2 = 4 fuzzy rules (Fig. 2). The fuzzy rules used to predict wheat yield are described in Table 1.
Table 2 illustrates an example of the overall fuzzy inference system that was used to obtain a real numerical value. The precipitation and deficit of crop water requirement values were selected as 228 and 300 mm, respectively. With the low and high fuzzy subsets evaluated at these values (228 and 300 mm), the product of the membership degrees of the precipitation and deficit of crop water requirement variables was calculated for each rule and combined with the consequent part, as indicated in Table 2.

Table 1 Fuzzy rule base
| Rule | Antecedent: Precipitation (P) | Antecedent: Deficit of crop water requirement (DCWR) | Consequent: Output, z |
| R1 | Low | Low | z1 = 1.019P + 0.6299(DCWR) + 0.00646 |
| R2 | Low | High | z2 = −0.705P + 0.431(DCWR) − 0.00052 |
| R3 | High | Low | z3 = 1.259P − 0.213(DCWR) + 0.00504 |
| R4 | High | High | z4 = 0.192P + 0.529(DCWR) − 0.00141 |

Table 2 An example of the overall fuzzy inference system for P = 228 mm and DCWR = 300 mm (membership function panels omitted)
| Rule | Antecedent: P | Antecedent: DCWR | ci | zi | ci × zi |
| R1 | Low | Low | 0.30 | 273.94 | 80.90 |
| R2 | Low | High | 0.14 | 115.84 | 15.73 |
| R3 | High | Low | 0.39 | 358.21 | 139.62 |
| R4 | High | High | 0.18 | 65.60 | 11.76 |

The C, γ and ε parameters vary over a wide range,
and, based on expert judgment, a trial-and-error procedure was used to determine the epsilon value. After the trial-and-error process, the appropriate parameter values were found to be C = 20, γ = 0.23 and ε = 0.4444. Subsequently, the mapping obtained in the training phase was applied to the test data for prediction of wheat yield. The precipitation and deficit of crop water requirement data were used as inputs into the fuzzy model to develop the predictive model, which was compared with the MLR model. Table 3 presents the RMSE, MAE, MAPE and CE values calculated in the validation (testing) phase of the fuzzy, SVM and MLR models.

Table 3 Performance of fuzzy logic, support vector machine and multilinear regression models in terms of root-mean-square error (RMSE), coefficient of efficiency (CE), mean absolute error (MAE) and mean absolute percentage error (MAPE)
| Performance indicator | Konya: Fuzzy model | Konya: SVM model | Konya: MLR model | Kirikkale: Fuzzy model | Kirikkale: SVM model | Kirikkale: MLR model |
| RMSE (kg/da) | 9.72 | 8.53 | 12.00 | 15.31 | 14.25 | 17.47 |
| CE | 0.91 | 0.93 | 0.87 | 0.82 | 0.84 | 0.76 |
| MAE (kg/da) | 6.97 | 6.68 | 8.51 | 13.11 | 12.48 | 15.24 |
| MAPE (%) | 3.40 | 3.23 | 4.11 | 8.08 | 6.80 | 8.28 |

The RMSE and CE values of the MLR model were found to be the highest and the smallest,
respectively, for both Konya and Kirikkale regions. This implies that the performance of the SVM model is better than fuzzy and MLR models for the implemented regions. The RMSE values of fuzzy, SVM and MLR models were found as 9.72 (kg/da), 8.53 (kg/da) and 12 (kg/da), respectively, for Konya region and 15.31 (kg/da), 14.25 (kg/da) and 17.47 (kg/da), for Kirikkale region, respectively. The MAE values of fuzzy, SVM and MLR models were found as 6.97 (kg/da), 6.68 (kg/da) and 8.51 (kg/da), respectively, for Konya region and 13.11 (kg/da), 12.48 (kg/da) and 15.24 (kg/da), for Kirikkale region, respectively. The MAPE values of fuzzy, SVM and MLR models were found to be 3.40, 3.23, 4.11 and 8.08, 6.80, 8.28 for Konya and Kirikkale regions, respectively. On the other hand, the CE values of fuzzy, SVM and MLR models were found to be 0.91 and 0.93, 0.87 and 0.82, and 0.84 and 0.76 for Konya and Kirikkale regions, respectively. The fuzzy and SVM models predicted the wheat yield better than MLR model based on the values of RMSE, 9.72 (kg/da), 8.53 (kg/da) and 12 (kg/da), MAE, 6.97 (kg/da), 6.68 (kg/da) and 8.51 (kg/da), MAPE, 3.40, 3.23 and 4.11 and CE, 0.91, 0.93 and 0.87, respectively, for Konya region. In the same fashion, the RMSE, MAE, MAPE and CE values were found to be 15.31 (kg/da), 14.25 (kg/da) and 17.47 (kg/da); 13.11 (kg/da), 12.48 (kg/da) and 15.24 (kg/da); 8.08 (kg/da), 6.80 (kg/da) and 8.28 (kg/da); and 0.82, 0.84 and 0.76, respectively, for Kirikkale region. According to the categorization of [18], MLR models performed fair, whereas fuzzy and SVM models performed good for prediction of wheat yield at Konya and Kirikkale regions based on CE values. Besides, the SVM model performed slightly better than the fuzzy model according to CE values for both regions, respectively, as presented in Table 3. 
On the other hand, according to [32], the performance of the MLR model is within the acceptable level for the Konya and Kirikkale regions, as the corresponding CE values are greater than 0.50. The results of this study indicate that the fuzzy and SVM models can be used for accurate prediction of wheat yield. Tables 4 and 5 compare the results of the fuzzy, SVM and MLR models with the corresponding observed wheat yield data for the Konya and Kirikkale regions, respectively. For visual examination of model performance, the predictions of the fuzzy, SVM and MLR models were plotted against the observations, together with the 1:1 line, for the Konya and Kirikkale regions in Figs. 6 and 7, respectively. The wheat yield predicted by the fuzzy model and the corresponding observed data are scattered tightly around the 1:1 line for the Konya region (Fig. 6a), which implies that the prediction performance of the fuzzy model is very good. Similarly, Fig. 6b depicts the scatter diagram of the SVM predictions of wheat yield and the corresponding observed data for the Konya region. As can be seen from this figure, the SVM results are also scattered closely around the 1:1 line. The fuzzy and SVM models thus performed very well in predicting wheat yield for the Konya region. The wheat yield values predicted by the fuzzy model are highly correlated with the corresponding observed data, as shown in Fig. 7a. This means that the fuzzy model performed well in predicting wheat yield for the Kirikkale region. Figure 7b illustrates that the predictions of the SVM model follow the corresponding observed data closely for the Kirikkale region. The predictions of the MLR model were more widely scattered around the 1:1 line, as shown in Fig. 7c.
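The roles of the three SVM hyperparameters reported for this study (C = 20, γ = 0.23, ε = 0.4444) can be made concrete with a small sketch. It assumes a radial basis function (RBF) kernel and the standard ε-insensitive loss of support vector regression; this part of the chapter does not state the kernel or any input scaling, so the sample points below are purely hypothetical.

```python
import math

# Hyperparameter values reported in the text.
C = 20.0          # penalty on errors that fall outside the epsilon tube
GAMMA = 0.23      # width parameter of the (assumed) RBF kernel
EPSILON = 0.4444  # half-width of the tube inside which errors cost nothing


def rbf_kernel(x, y, gamma=GAMMA):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=EPSILON):
    """Zero inside the tube, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def penalty(y_true, y_pred):
    """Contribution of one sample to the C-weighted error term."""
    return C * epsilon_insensitive_loss(y_true, y_pred)


# Hypothetical, already-scaled inputs (precipitation, DCWR).
a = (0.5, 0.2)
b = (0.7, 0.1)
print(rbf_kernel(a, a))   # similarity of a point with itself
print(rbf_kernel(a, b))   # similarity of two nearby points
print(penalty(1.0, 1.3))  # error smaller than epsilon
print(penalty(1.0, 2.0))  # error larger than epsilon
```

A larger γ makes the kernel more local, a larger C punishes deviations outside the tube more strongly, and a larger ε tolerates bigger errors before any penalty applies, which is why the three values are usually tuned together.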
Table 4 Comparison of fuzzy, SVM and MLR models' results for wheat yield of Konya

| Year | Precipitation (mm) | Deficit of crop water requirement (mm) | Observed wheat yield (kg/da) | Fuzzy model (kg/da) | SVM model (kg/da) | MLR model (kg/da) |
|---|---|---|---|---|---|---|
| 1992 | 139 | 379 | 195 | 190.56 | 193.82 | 194.42 |
| 1993 | 132 | 427 | 183 | 177.50 | 178.63 | 173.29 |
| 1994 | 183 | 396 | 193 | 196.89 | 198.19 | 202.99 |
| 1997 | 238 | 322 | 222 | 246.71 | 239.52 | 250.83 |
| 2003 | 174 | 387 | 201 | 197.81 | 198.93 | 203.39 |
| 2005 | 106 | 433 | 173 | 173.16 | 173.08 | 161.97 |
| 2007 | 103 | 438 | 159 | 171.77 | 171.50 | 158.99 |
| 2010 | 185 | 323 | 217 | 218.92 | 224.48 | 232.14 |
| 2012 | 228 | 300 | 257 | 248.00 | 245.51 | 255.95 |
| 2014 | 250 | 285 | 263 | 267.17 | 258.03 | 269.39 |
| RMSE | | | | 9.72 | 8.53 | 12.00 |
| CE | | | | 0.91 | 0.93 | 0.87 |
| MAE | | | | 6.97 | 6.68 | 8.51 |
| MAPE | | | | 3.40 | 3.23 | 4.11 |
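The performance indicators can be re-derived from the tabulated yields. The sketch below does this for the fuzzy model in the Konya region using the observed and predicted columns of Table 4. The formulas are the standard definitions and are assumptions here, since this part of the chapter does not restate them; in particular, CE is taken to be the Nash–Sutcliffe efficiency, which the chapter's reference list suggests [34].

```python
import math

# Konya columns of Table 4: observed yield and fuzzy-model prediction (kg/da).
observed = [195, 183, 193, 222, 201, 173, 159, 217, 257, 263]
fuzzy = [190.56, 177.50, 196.89, 246.71, 197.81,
         173.16, 171.77, 218.92, 248.00, 267.17]


def rmse(obs, pred):
    """Root-mean-square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(p - o) / o for o, p in zip(obs, pred)) / len(obs)


def ce(obs, pred):
    """Coefficient of efficiency, assumed here to be Nash-Sutcliffe."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((p - o) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


for name, metric in (("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("CE", ce)):
    print(name, round(metric(observed, fuzzy), 2))
```

Run on these ten years, the functions give values close to the fuzzy-model entries at the foot of Table 4; swapping in the SVM or MLR column allows the same check for the other two models.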
In short, fuzzy, SVM and MLR models were developed to predict wheat yield from precipitation and deficit of crop water requirement data. It was found that the fuzzy model can predict wheat yield accurately using four simple fuzzy rules. The fuzzy model can be implemented without complicated mathematical procedures or restrictive assumptions such as linearity, normality and homoscedasticity. The predictions of the fuzzy and SVM models were compared with those of the MLR model using RMSE, MAE, MAPE and CE as performance evaluation criteria. The predictive performance of the fuzzy and SVM models was found to be better than that of the MLR model, which involves restrictive assumptions. Of the two, the SVM model performed slightly better than the fuzzy model.
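How four rule outputs combine into a single yield estimate can be sketched as a weighted average of the rule consequents, in the style of a Takagi–Sugeno system [50]. The numbers below are the ci and zi values legible in Table 2 for P = 228 mm and DCWR = 300 mm; pairing each weight with a consequent in reading order is an assumption, as is the normalization by the sum of the weights.

```python
def firing_strength(mu_precipitation, mu_dcwr):
    """Rule weight as the product of the two antecedent membership degrees."""
    return mu_precipitation * mu_dcwr


def sugeno_output(weights, consequents):
    """Weighted average of the rule consequents."""
    total = sum(weights)
    if total == 0:
        raise ValueError("no rule fired")
    return sum(w * z for w, z in zip(weights, consequents)) / total


# Values read from Table 2 (pairing assumed, see the note above).
c = [0.30, 0.14, 0.39, 0.18]            # rule weights c_i
z = [273.94, 115.84, 358.21, 65.60]     # rule consequents z_i

yield_estimate = sugeno_output(c, z)
print(round(yield_estimate, 2))
```

With these rounded inputs the estimate lands within about 1 kg/da of the 248.00 kg/da that Table 4 lists for the fuzzy model in 2012, the year with the same precipitation and deficit values.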
Table 5 Comparison of fuzzy, SVM and MLR models' results for wheat yield of Kirikkale

| Year | Precipitation (mm) | Deficit of crop water requirement (mm) | Observed wheat yield (kg/da) | Fuzzy model (kg/da) | SVM model (kg/da) | MLR model (kg/da) |
|---|---|---|---|---|---|---|
| 1993 | 230.4 | 225.6 | 212 | 216.82 | 195.29 | 199.68 |
| 1996 | 200.2 | 285.8 | 193 | 183.94 | 218.02 | 179.39 |
| 1997 | 286.0 | 157.0 | 193 | 205.42 | 177.15 | 227.75 |
| 2000 | 289.9 | 129.1 | 210 | 203.93 | 220.19 | 234.79 |
| 2001 | 157.4 | 332.6 | 153 | 132.27 | 153.03 | 159.11 |
| 2005 | 225.6 | 279.4 | 206 | 196.09 | 187.89 | 186.73 |
| 2006 | 252.1 | 254.9 | 189 | 207.61 | 200.20 | 198.31 |
| 2007 | 137.1 | 453.9 | 111 | 142.09 | 126.46 | 127.73 |
| 2009 | 348.2 | 139.8 | 238 | 242.76 | 232.98 | 246.06 |
| 2012 | 311.8 | 208.2 | 215 | 228.67 | 222.29 | 222.52 |
| RMSE | | | | 15.31 | 14.25 | 17.47 |
| CE | | | | 0.81 | 0.84 | 0.76 |
| MAE | | | | 13.11 | 12.48 | 15.24 |
| MAPE | | | | 8.08 | 6.80 | 8.28 |

4 Conclusions
Predictive models were developed for predicting wheat yield using fuzzy, SVM and MLR approaches and were implemented for the Konya and Kirikkale regions, which are the largest wheat-producing regions of Turkey. The fuzzy, SVM and MLR models were evaluated quantitatively using RMSE and CE as the performance evaluation criteria, and the fuzzy and SVM models were found to perform well in predicting wheat yield. This study showed that the fuzzy model can be used successfully for yield prediction. Furthermore, while most previous studies were carried out on fully (100%) irrigated areas, this study investigated the relationship between crop water need and natural precipitation in a dry agriculture area. A sensitive model was developed by considering not only precipitation but also the deficit of crop water requirement.
Fig. 7 Perfect model line for wheat yield of Kirikkale: a fuzzy model prediction, b SVM model prediction, c multilinear regression model prediction
Acknowledgements We would like to thank Meteorological General Institution and Turkish Statistical Institution for providing meteorological and wheat yield data, respectively.
References
1. Ahmed F, Al-Mamun HA, Bari ASMH, Hossain E, Kwan P (2012) Classification of crops and weeds from digital images: a support vector machine approach. Crop Prot 40:98–104
2. Ali S, Xu Y, Ma X, Ahmad I, Kamran M, Dong Z, Jias Q, Ren X, Zhang P, Jia Z (2017) Planting patterns and deficit irrigation strategies to improve wheat production and water use efficiency under simulated rainfall conditions. Front Plant Sci 8:1408
3. Allen RG, Pereira LS, Raes D, Smith M (1998) Crop evapotranspiration: guidelines for computing crop water requirements. Irrig Drain Pap 1–56
4. Altunkaynak A, Chellam S (2010) Prediction of specific permeate flux during crossflow microfiltration of polydispersed colloidal suspensions by fuzzy logic models. Desalination 253:188–194
5. Altunkaynak A (2009) Sediment load prediction by genetic algorithms. Adv Eng Softw 40:928–934
6. Altunkaynak A (2010) A predictive model for well loss using fuzzy logic approach. Hydrol Process 24:2400–2404
7. Altunkaynak A (2014) Predicting water level fluctuations in Lake Michigan-Huron using wavelet-expert system methods. Water Resour Manag 28:2293–2314
8. Altunkaynak A, Özger M, Çakmakci M (2005) Water consumption prediction of Istanbul City by using fuzzy logic approach. Water Resour Manag 19:641–654
9. Altunkaynak A, Özger M, Sen Z (2003) Triple diagram model of level fluctuations in Lake Van, Turkey. Hydrol Earth Syst Sci 7:235–244
10. Asseng S, Foster I, Turner NC (2011) The impact of temperature variability on wheat yields. Glob Chang Biol 17:997–1012
11. Blaney HF, Criddle WD (1962) Determining consumptive use and irrigation water requirements. U.S. Department of Agriculture Research Service, Tech Bull 1275, pp 1–59
12. Bouazzama B, Xanthoulis D, Bouaziz A, Ruelle P, Mailhol JC (2012) Effect of water stress on growth, water consumption and yield of silage maize under flood irrigation in a semi-arid climate of Tadla (Morocco). Biotechnol Agron Soc Environ 16:468–477
13. Burges CJC (1996) Simplified support vector decision rules. ICML 96:71–77
14. Citakoglu H, Cobaner M, Haktanir T, Kisi O (2014) Estimation of monthly mean reference evapotranspiration in Turkey. Water Resour Manag 28:99–113
15. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
16. David A, Lerner B (2005) Support vector machine-based image classification for genetic syndrome diagnosis. Pattern Recognit Lett 26:1029–1038
17. Dong Z, Zhang X, Li J, Zhang C, Wei T, Yang Z, Cai T, Zhang P, Ding R, Jia Z (2019) Photosynthetic characteristics and grain yield of winter wheat (Triticum aestivum L.) in response to fertilizer, precipitation, and soil water storage before sowing under the ridge and furrow system: a path analysis. Agric For Meteorol 273:12–19
18. Donigian AS, Love JT (2003) Sediment calibration procedures and guidelines for watershed modeling. Proc Wat Env Fed 20:728–747
19. Farooq M, Hussain M, Siddique KHM (2014) Drought stress in wheat during flowering and grain-filling periods. CRC Crit Rev Plant Sci 33:331–349
20. Food and Agriculture Organization of the United Nations. FAOSTAT Statistical Database. [Rome]. 2018.
21. Hargreaves GH, Samani ZA (1985) Reference crop evapotranspiration from temperature. Appl Eng Agric 1:96–99
22. Hanft JM, Wych RD (1982) Visual indicators of physiological maturity of hard red spring wheat. Crop Sci 22:584–588
23. Hatiboglu MA, Altunkaynak A, Ozger M, Iplikcioglu AC, Cosar M, Turgut N (2010) A predictive tool by fuzzy logic for outcome of patients with intracranial aneurysm. Expert Syst Appl 37:1043–1049
24. Ji H, Xiao L, Xia Y, Song H, Liu B, Tang L, Cao W, Zhu Y, Liu L (2017) Effects of jointing and booting low temperature stresses on grain yield and yield components in wheat. Agric For Meteorol 243:33–42
25. Johnson BL, Henderson TL (2002) Water use patterns of grain amaranth in the Northern Great Plains. Agron J 94:1437–1443
26. Jolánkai M, Birkás M (2013) Precipitation impacts on yield quantity and quality of wheat crop. In: 48. Hrvatski i 8. Medunarodni Simpozij Agronoma, Dubrovnik, Hrvatska, 17–22. Zbornik Radova, pp 489–493
27. Kaundal RA, Kapoor A, Raghava GPS (2006) Machine learning techniques in disease forecasting: a case study on rice blast prediction. BMC Bioinform 7:485
28. Kavdir I, Guyer DE (2003) Apple grading using fuzzy logic. Turkish J Agric 27:375–382
29. Kisi O (2005) Suspended sediment estimation using neuro-fuzzy and neural network approaches. Hydrol Sci J 50:683–696
30. Kumar S, Kumar N (2012) A novel method for rice production forecasting using fuzzy time series. Int J Comput Sci 9:455–459
31. Mazloumzadeh S, Shamsi MM, Nezamabadi-Pour H (2010) Fuzzy logic to classify date palm trees based on some physical properties related to precision agriculture. Precis Agric 11:258–273
32. Moriasi DN, Arnold JG, Van Liew MW, Bingner RL, Harmel RD, Veith TL (2007) Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans ASABE 50:885–900
33. Musick JT, Jones OR, Stewart BA, Dusek DA (1994) Water-yield relationships for irrigated and dryland wheat in the U.S. southern plains. Agron J 86:980–986
34. Nash E, Sutcliffe V (1970) River flow forecasting through conceptual models part I: a discussion of principles. J Hydrol 10:282–290
35. Nisar Ahamed TR, Gopal Rao K, Murthy JSR (2000) GIS-based fuzzy membership model for crop-land suitability analysis. Agric Syst 63:75–95
36. Nitze I, Schulthess U, Asche H (2012) Comparison of machine learning algorithms random forest, artificial neuronal network and support vector machine to maximum likelihood for supervised crop type classification. In: Proceedings of the 4th conference on geographic object-based image analysis, pp 35–40
37. Olesen JE, Grevsen K (2000) A simulation model of climate effects on plant productivity and variability in cauliflower (Brassica oleracea L. botrytis). Sci Hortic (Amsterdam) 83:83–107
38. Parker P, Ingwersen J, Högy P, Priesack E, Aurbacher J (2016) Simulating regional climate-adaptive field cropping with fuzzy logic management rules and genetic advance. J Agric Sci 154:207–222
39. Penman HL (1948) Natural evaporation from open water, bare soil and grass. Proc R Soc Lond A Math Phys Sci 193:120–145
40. Pirttioja N, Carter TR, Fronzek S, Bindi M, Hoffmann H, Palosuo T, Ruiz-Ramos M, Tao F, Trnka M, Acutis M, Asseng S, Baranowski P, Basso B, Bodin P, Buis S, Cammarano D, Deligios P, Destain MF, Dumont B, Ewert F, Ferrise R, François L, Gaiser T, Hlavinka P, Jacquemin I, Kersebaum KC, Kollas C, Krzyszczak J, Lorite IJ, Minet J, Minguez MI, Montesino M, Moriondo M, Müller C, Nendel C, Öztürk I, Perego A, Rodríguez A, Ruane AC, Ruget F, Sanna M, Semenov MA, Slawinski C, Stratonovitch P, Supit I, Waha K, Wang E, Wu L, Zhao Z, Rötter RP (2015) Temperature and precipitation effects on wheat yield across a European transect: a crop model ensemble analysis using impact response surfaces. Clim Res 65:87–105
41. Rivas-Perea P, Cota-Ruiz J, Chaparro DG, Venzor JAP, Carreón AQ, Rosiles JG (2013) Support vector machines for regression: a succinct review of large-scale and linear programming formulations. Int J Intell Sci 3:5–14
42. Rumpf T, Römer C, Weis M, Sökefeld M, Gerhards R, Plümer L (2012) Sequential support vector machine classification for small-grain weed species discrimination with special regard to Cirsium arvense and Galium aparine. Comput Electron Agric 80:89–96
43. Shahin MA, Verma BP, Tollner EW (2000) Fuzzy logic model for predicting peanut maturity. Trans Am Soc Agric Eng 43:483–490
44. Sicat RS, Carranza EJM, Nidumolu UB (2005) Fuzzy modeling of farmers' knowledge for land suitability classification. Agric Syst 83:49–75
45. Solomatine DP, Shrestha DL (2009) A novel method to estimate model uncertainty using machine learning techniques. Water Resour Res 45:1–16
46. Sriraman A, Mayorga RV (2007) Climate control inside a greenhouse: an intelligence system approach using fuzzy logic programming. J Environ Inform 10:68–74
47. Su YX, Xu H, Yan LJ (2017) Support vector machine-based open crop model (SBOCM): case of rice production in China. Saudi J Biol Sci 24:537–547
48. Sujjaviriyasup T, Pitiruek K (2013) Role of hybrid forecasting techniques for transportation planning of broiler meat under uncertain demand in Thailand. Eng Ap Sci Res 41:427–435
49. Suykens JAK, Horvath G, Basu S, Micchelli C, Vandewalle J (2003) Advances in learning theory: methods, models and applications. In: 190 NATO-ASI Series III: Computer and Systems Sciences, IOS Press
50. Takagi T, Sugeno M (1985) Fuzzy identification of systems and its applications to modeling and control. IEEE Trans Syst Man Cybern SMC-15:116–132
51. Thakare VR, Baradkar HM (2013) Fuzzy system for maximum yield from crops. In: Proceedings on National Level Technical Conference. XPLORE, pp 4–9
52. Thakur P, Kumar S, Malik JA, Berger JD, Nayyar H (2010) Cold stress effects on reproductive development in grain crops: an overview. Environ Exp Bot 67:429–443
53. TSI (Turkish Statistical Institute). Production Statistics in Turkey, 2017. Available online at https://biruni.tuik.gov.tr/medas/?kn=92&locale=tr
54. Tyagi NK, Sharma DK, Luthra SK (2000) Evapotranspiration and crop coefficients of wheat and sorghum. J Irrig Drain Eng 126:215–222
55. Uyumaz A, Altunkaynak A, Özger M (2006) Fuzzy logic model for equilibrium scour downstream of a dam's vertical gate. J Hydraul Eng 132:1069–1075
56. Venzhik YV, Titov AF, Talanova VV, Frolova SA, Talanov AV, Nazarkina YA (2011) Influence of lowered temperature on the resistance and functional activity of the photosynthetic apparatus of wheat plants. Biol Bull 38:132–137
Chapter 7
Coconut Maturity Recognition Using Convolutional Neural Network
Parvathi Subramanian and Tamil Selvi Sankar

P. Subramanian (B) · T. S. Sankar, Department of ECE, National Engineering College, K.R. Nagar, Kovilpatti, Tamil Nadu 628503, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_7

1 Introduction and Related Works
In recent years, smart farming has played a key role and provided attractive solutions to overcome the challenges associated with agriculture. As agriculture plays a vital role in the economy of a country like India, traditional agricultural methods must be complemented by innovative technologies that raise production more accurately and support high-quality, high-yield agriculture. The use of computer vision technology has become an essential tool in agricultural operations during the past few decades. Expert and intelligent agricultural systems based on computer vision and machine learning algorithms are increasingly used to improve productivity and efficiency in agricultural production. The development of artificial intelligence, computer vision, and machine learning algorithms has provided many implications and insights for decision support and farming practice. Therefore, computer vision and machine learning technologies will be increasingly applied to automation in agriculture and will steadily move agriculture toward the era of intelligent agriculture. Intelligent robots in smart farming are drawing more attention worldwide, particularly in harvesting operations [16]. The accurate recognition of fruits and their location is crucial for a fruit-picking robot. Fruit recognition methods fall into two major categories: traditional image processing methods and deep learning methods. Conventional image processing methods have been widely used to recognize fruits, such as the Otsu algorithm [30], the circular Hough transform [35], the k-means algorithm [20], and so on. These fruit recognition methods used color,
shape, and texture features. The recognition accuracy of these methods drops under uneven illumination and when fruits occlude one another. Deep learning methods are more robust for fruit recognition under challenging real-world conditions such as varying illumination, brightness, resolution, size, scale, and orientation. These methods provide better results for recognizing fruits against complex backgrounds that contain similarly colored and occluded objects. In banana leaf classification [2], the model provided better results on images with illumination variation and complex backgrounds. Also, in fruit counting work [6, 24], the models achieved better results under occlusion and variations in illumination and scale. The recognition of fruits using computer vision is the first critical step for an autonomous fruit harvesting robot. Computer vision is a field that uses a camera and a computer to make a machine see, identify, track, and measure targets in the agricultural field for further image processing. The challenges involved in autonomous fruit recognition from an image are varying illumination, fruit density distribution, overlapping and occlusion, the reflective properties of the object, and fluctuating factors such as color, size, and shape. Developing an autonomous vision system for harvesting robots requires intensive effort, particularly in the development of large datasets.

Recent developments in agriculture using computer vision include crop growth and health monitoring for precision agriculture; prevention and control of crop diseases, pests, and weeds; autonomous harvesting of crops; grading of fruits and vegetables; and smart farm automation management. Traditional monitoring methods in the agricultural field mainly rely on subjective human decisions, which are neither timely nor accurate. Crop monitoring to capture information about the different growth stages of crops is an essential aspect of precision agriculture. Understanding the growth environment and making the necessary arrangements helps improve production efficiency [19]. Compared with manual checking, computer vision technology applied to monitoring crop growth stages can accurately detect early changes in crops caused by malnutrition [8]. Computer vision technology has the advantages of low cost, small error, high efficiency, and good robustness in monitoring crop growth stages. The prevention and control of crop diseases, insects, and weeds are essential for high-quality, pollution-free, high-yield agricultural production. Computer vision can be used to diagnose pests and diseases quickly, accurately, and automatically, and to estimate the severity of a disease. Rising production costs and labor shortages are the key reasons for introducing automated harvesting techniques into agricultural production management. The development of autonomous harvesting robots needs multidisciplinary support from horticultural engineering, computer science, mechatronics, deep learning, and intelligent system design. Quality inspection of agricultural products, such as grading, helps to judge and determine the quality of fruits and vegetables and thereby promotes commercialization. At present, computer vision technology is mainly applied to grading agricultural products such as fruits and vegetables in order to improve their economic benefit. Smart farm management using computer vision and machine learning provides farmers with knowledge-based support and insights for decision-making and practice. The use of computer vision can save human
resources and reduce material costs, making farmers' work simpler, more scientific, and more persuasive while eliminating much of the hard labor involved [17].

Anami et al. [3] presented a deep convolutional neural network (DCNN) framework for automatically recognizing and classifying various biotic and abiotic paddy crop stresses using field images. They adopted the pre-trained VGG-16 CNN model for the automatic classification of stressed paddy crop images captured during the booting growth stage and achieved an average accuracy of 92.89%. Ye et al. [31] recognized vegetable pest images using the pre-trained models VGG16, VGG19, Inception V3, and ResNet50. Their experimental results revealed that the recognition accuracy of VGG16 and VGG19 was higher than that of Inception V3 and ResNet50, with test accuracies of 99.90% and 99.99% for the two models, respectively. Helwan et al. [10] used ResNet-50 to grade bananas and achieved an accuracy of 99%. Perez et al. [22] worked with eight different CNNs, i.e., VGG-16, VGG-19, ResNet-50, ResNet-101, ResNet-152, AlexNet, Inception V3, and a CNN trained from scratch, for the identification and sorting of ripe Medjool dates. Among them, the VGG-19 model produced the best performance with an accuracy of 99.32%. Qi et al. [23] proposed automatic identification of peanut-leaf diseases using ResNet50 and DenseNet121 with a logistic regression classifier and achieved an accuracy of 97.59% and an F1 score of 90.50. Ashiqul Islam [4] proposed an automated convolutional neural network-based approach for paddy leaf disease detection. Four models were analyzed (VGG-19, Inception-ResNet-V2, ResNet-101, and Xception), and the best accuracy, 92.68%, was obtained with Inception-ResNet-V2. Ai et al. [1] used a convolutional neural network to identify crop diseases automatically. In their work, the Inception-ResNet-v2 model produced an overall recognition accuracy of 86.1% for 27 disease images of 10 crops.

Computer vision is an important technology that collects crop growth information from crop imaging for harvesting operations [21, 34]. Zhang et al. [33] proposed an image-based fruit category classification using a 13-layer convolutional neural network (CNN) with greater accuracy. Fruit segmentation and classification are the essential techniques for fruit detection. Krig [18] noted in his book that extracting color features alone cannot give complete information about an object, since it loses other visual information that is essential for recognizing objects efficiently in an image. Researchers now apply machine learning to fruit detection using computer vision techniques. Ilyas et al. [14] proposed a fruit recognition algorithm using the KNN algorithm with features such as color, shape, size, and texture and recognized fruits with an accuracy of 97%. The performance of the Naive Bayes approach was evaluated against another classifier, SVM, and achieved 90% accuracy. Song et al. [26] proposed a method to recognize and count peppers in a cluttered greenhouse environment using a Bayes classifier and SVM with color and texture features. The use of deep learning (DL) in the agricultural environment, which solves many problems through machine vision and image analysis, has increased in support of smart agriculture [36]. Kamilaris et al. [15] stated that DL provides better performance when compared with other popular image processing techniques in
agricultural applications. Hussain et al. [13] presented a fruit recognition algorithm based on a deep convolutional neural network (DCNN) that recognized fruits of the same color and shape with an accuracy of 99%. Brahimi et al. [5] used DL techniques to classify nine tomato leaf diseases with an accuracy of 99.18%. In agricultural automation, computer vision technology can be used to monitor crops, protect plants, and harvest. However, the application of computer vision and machine learning technology in agriculture is still at an initial stage of development. Currently, there is no large-scale public database in the agricultural sector. Current research results rely on data collected by individual researchers, which are neither universal nor comparable (Tian et al. [29]). Therefore, it is necessary to build a complete agricultural database, since existing computer vision techniques are limited to detecting a single species or variety. The existing object recognition and detection techniques used in agriculture run into difficulties when gathering image information against complex backgrounds and when extracting features under environmental constraints. Hence, a robust algorithm is needed that recognizes objects with high accuracy regardless of color and field environment. In this work, seven CNN architectures (VGG16, VGG19, Xception, MobileNet, InceptionV3, InceptionResNetV2, and ResNet50) were therefore used to recognize two major coconut maturity stages, tender coconut and mature coconut. The rest of the chapter is organized as follows. Section 2 describes the development of the coconut image database. Section 3 presents the materials and methods for recognizing coconut maturity stages using the CNN models. The experimental results are discussed in Sect. 4, and the performance evaluation for coconut recognition is analyzed in Sect. 5. Finally, the conclusions are drawn in Sect. 6.
2 Development of Coconut Image Database
2.1 Image Acquisition and Preprocessing
The coconut images were acquired at various coconut farms under varying illumination conditions and against complex backgrounds. The viewing distance and the camera angle were not fixed; they depended on the visibility of the coconuts. Most coconut bunches are densely distributed, overlapping, and obscured by the petioles and pinnae in the tree crown. These similarly colored objects pose significant challenges for detecting coconuts in real time. The maturity stages of coconuts are identified by differences in size, color, and cluster density. Coconuts are big, green or red, and round when they are young or tender. The maturation process involves a change in color and shape. In the mature coconut stage, the husk is gray or yellow-green with brown patches and stringy, and the nut is oval. In this work, images of nuts in the two main maturity stages, young/tender and mature, were collected and considered for classification and detection. Two thousand coconut images were acquired from various farms in Tamil Nadu, India, to evaluate the proposed algorithm in real time, using a digital single-lens reflex (DSLR) camera (Nikon D810), Samsung, and a mobile phone camera. The steps involved in recognizing coconut maturity stages are shown in Fig. 1. Images acquired under different lighting conditions and cloudy weather conditions are shown in Fig. 2. The collected coconut images were categorized into two classes, and the database developed for this work is summarized in Table 1.
Fig. 1 Steps involved in coconut maturity stages recognition (blocks: image acquisition and preprocessing; images categorization into two classes; feature extraction; recognition of coconut maturity stages)
Fig. 2 Sample images with different lighting conditions
Table 1 Categorization of images for training and testing case 1 and case 2

| Name of class | Total images (100%) | Train set, images (90%) | Test set, images (10%) | Train set, images (70%) | Test set, images (30%) |
|---|---|---|---|---|---|
| Mature coconut | 1000 | 900 | 100 | 700 | 300 |
| Tender coconut | 1000 | 900 | 100 | 700 | 300 |
| Total | 2000 | 1800 | 200 | 1400 | 600 |
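The two train/test partitions in Table 1 (90/10 and 70/30 per class) can be reproduced with a simple per-class random split. The sketch below is illustrative only: the file names and the fixed seed are hypothetical, and the chapter does not say how its split was drawn.

```python
import random


def split_class(items, train_fraction, seed=0):
    """Shuffle one class and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names: 1000 images per class, as in Table 1.
dataset = {
    "mature": [f"mature_{i:04d}.jpg" for i in range(1000)],
    "tender": [f"tender_{i:04d}.jpg" for i in range(1000)],
}

for fraction in (0.9, 0.7):
    for class_name, files in dataset.items():
        train, test = split_class(files, fraction)
        print(class_name, fraction, len(train), len(test))
```

Splitting each class separately keeps the two classes balanced in both the training and the test set, which matches the equal per-class counts in the table.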
Fig. 3 Sample images from coconut database (panels: mature coconut; young/tender coconut)
2.2 Image Categorization
The coconut images acquired for this work are categorized into two classes, tender (young) coconut and mature coconut. The images were selected at random, by visual inspection, from coconuts located at different positions in the tree crown, so that the recognition process covers coconuts both with and without interference from other objects.
2.3 Ground Truth Image Creation
The image preprocessing procedure also involves creating ground truth images from the collected image database by selecting the required objects to highlight the region of interest (ROI). Some examples from our dataset are illustrated in Fig. 3. During database development, images with low resolution were enhanced, and images with small dimensions were excluded to keep the dataset at good quality. Images with high resolution in the ROI area were selected from the database for better feature learning.
2.4 Image Augmentation
Image augmentation is used to enrich the database with different pose variations of the images, which reduces overfitting and sensitivity to noise during training. This work uses image rotation, width shift, height shift, shear, zoom, fill mode, and horizontal flip operations to make the model generalize better when recognizing coconuts. The database, once divided into training and testing sets, is expanded with the image augmentation process, and the extended training set is used for training.
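Two of the listed operations, horizontal flip and width shift with a "nearest" fill mode, can be sketched in plain Python on a small grid of pixel values. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and a real implementation would typically rely on an image library.

```python
def horizontal_flip(image):
    """Mirror every row of the image (a list of pixel rows)."""
    return [row[::-1] for row in image]


def width_shift(image, shift):
    """Shift pixels right by `shift` columns (left if negative).

    Columns that would be empty are filled with the nearest edge
    pixel, which corresponds to a 'nearest' fill mode.
    """
    width = len(image[0])
    shifted = []
    for row in image:
        new_row = []
        for x in range(width):
            source = min(max(x - shift, 0), width - 1)
            new_row.append(row[source])
        shifted.append(new_row)
    return shifted


def augment(image):
    """Return the original image plus a few augmented variants."""
    return [image, horizontal_flip(image), width_shift(image, 1), width_shift(image, -1)]


tiny_image = [[1, 2, 3],
              [4, 5, 6]]
variants = augment(tiny_image)
```

Each training image yields several variants, so the training set grows without new field photographs; the test set is normally left unaugmented.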
7 Coconut Maturity Recognition Using Convolutional Neural Network
3 Materials and Methods
In this work, coconut recognition using convolutional neural networks is carried out on the developed coconut image database, and the results are compared. The workflow of the proposed coconut maturity recognition system is shown in Fig. 4.
3.1 Convolutional Neural Networks
The convolutional neural network (CNN) is a type of artificial neural network whose neurons are similar to those in the primary visual cortex of a biological brain. It comprises weights and biases that are learned. Each neuron receives some input, performs a scalar product, and then applies an activation function to produce its output. Figure 5 presents a CNN structure, which consists of three blocks. The first block is the input, an image. The second is the feature extraction block, which has convolutional and pooling layers. The third is the classification block, which consists of fully connected layers and a softmax. The structure of the CNN is modified by increasing the number of convolution and pooling layers. This study considered seven CNN architectures to recognize coconut maturity stages: VGG-16, VGG-19, Inception V3, ResNet-50, MobileNet, InceptionResNetV2, and Xception.
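The neuron computation and the softmax used in the classification block can be sketched as follows. The weights, biases, and scores are made-up numbers for illustration, and ReLU is assumed as the activation function.

```python
import math


def relu(value):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, value)


def neuron(inputs, weights, bias):
    """Scalar product of inputs and weights, plus bias, then the activation."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(weighted_sum)


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values only: two features feeding two output scores.
features = [1.0, 2.0]
score_mature = neuron(features, [0.5, 0.25], 0.0)
score_tender = neuron(features, [0.5, -1.0], 0.25)
probabilities = softmax([score_mature, score_tender])
```

The class with the largest probability is taken as the prediction, which is the top-1 answer of the network.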
Fig. 4 Workflow of the proposed coconut maturity recognition system (blocks shown: coconut image database; image preprocessing; ground truth image creation; coconut maturity recognition using CNN architectures; feature extraction; classification; classification results: mature coconut, tender coconut)
Fig. 5 CNN architecture
VGG-16 and VGG-19 Architectures
VGG is the abbreviation for the Visual Geometry Group [32]. Simonyan and Zisserman [25] developed the VGG model in 2014. It has 3 × 3 convolutional layers stacked on top of each other in increasing depth, and max-pooling layers handle the reduction of the volume size. Two fully connected layers, each with 4096 nodes, are then followed by a softmax classifier. The number 16 or 19 denotes the number of weight layers in the network.

Inception V3 Architecture (GoogLeNet V3)
This architecture was introduced under the name GoogLeNet, but subsequent updates have been called Inception vN, where N refers to the version number put out by Google. The basic module of Inception [28] consists of four branches concatenated in parallel: a 1 × 1 convolution followed by two 3 × 3 convolutions; a 1 × 1 convolution followed by a 3 × 3 convolution; a pooling followed by a 1 × 1 convolution; and finally a 1 × 1 convolution. Inception consists of 10 modules, although these modules change slightly as the network gets deeper. Five of the modules are changed to reduce the computational cost by replacing each n × n convolution with two convolutions, a 1 × 7 followed by a 7 × 1. The last two modules replace the last two 3 × 3 convolutions of the first branch with two convolutions each, a 1 × 3 and a 3 × 1, this time in parallel. In total, Inception V3 has 42 layers with parameters.

ResNet-50 Architecture (Residual Neural Network)
ResNet [9] does not have a fixed depth; its depth depends on the number of consecutive modules used. However, increasing the network's depth to obtain greater precision makes the network more difficult to optimize. ResNet addresses this problem by fitting a residual mapping instead of the original one and adding several connections between layers. These new connections skip several layers and perform an identity or a 1 × 1 convolution. The base block of this network is called the residual block. When the network has 50 or more layers, the block is composed of three sequential convolutions, a 1 × 1, a 3 × 3, and a 1 × 1, and a connection that links the input of the first convolution to the output of the third convolution.

Xception
Xception was proposed by the creator of the Keras library. It extends the Inception network by replacing the standard Inception modules with depthwise separable convolutions. It reached a 5.5% top-5 error rate on the ImageNet dataset [7] and outputs a 2048-dimensional feature vector.

InceptionResNetV2
The InceptionResNetV2 architecture combines residual connections with the Inception architecture [27]. InceptionResNetV2 consists of three inception blocks. The 3 × 3 standard convolution was replaced by a 3 × 3 depthwise separable convolution, and the 7 × 7 standard convolutional structure of the Inception model was replaced by a 7 × 7 depthwise separable convolution. The increase in the number of convolutional layers and the deepening of the network improved accuracy.

MobileNet
The main motivation behind the MobileNet architecture is the cost of the convolutional layer, which is considerably more expensive in its regular form than in the form MobileNet uses.
Depthwise separable convolution is used in the MobileNet architecture [11]. Depthwise convolution is performed independently for each input channel. The first layer, called the expansion layer, is a 1 × 1 convolution whose purpose is to expand the number of channels in the data. Next is the projection layer, in which a high number of dimensions is reduced to a smaller number. Except for the projection layer, each layer includes batch normalization and a ReLU activation function. In the MobileNet architecture, there is one residual connection between the input and output layers. The residual network tries to learn already learned features; those not useful in decision-making are discarded. This architecture reduces the number of computations and parameters. The MobileNet architecture consists of 17 building blocks in a row, followed by a 1 × 1 convolutional layer, a global average pooling layer, and a classification layer.

The training and testing for the recognition of tender and mature coconuts in this work are performed in Python with OpenCV in Google Colab to implement the CNN-based recognition of coconut maturity stages. Coconut maturity stage identification is carried out by training the CNN with a proper feature matrix and obtaining the target class based on the input class of the dataset.
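The parameter saving that depthwise separable convolution brings to MobileNet and Xception can be made concrete with a short count. The sketch below ignores bias terms, and the layer sizes (3 × 3 kernel, 32 input channels, 64 output channels) are illustrative assumptions rather than values taken from the chapter.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Weights of a regular convolution: one k x k filter per (input, output) channel pair."""
    return kernel * kernel * in_channels * out_channels


def depthwise_separable_params(kernel, in_channels, out_channels):
    """Weights of a depthwise k x k convolution (one filter per input channel)
    followed by a pointwise 1 x 1 convolution that mixes the channels."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Illustrative layer: 3 x 3 kernel, 32 input channels, 64 output channels.
regular = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
reduction = regular / separable
```

For this example layer the separable form needs 2336 weights instead of 18432, roughly an eightfold reduction, which is why such layers suit mobile and embedded deployment.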
4 Experimental Results
In this section, the performance of the proposed method is evaluated and analyzed. The recognition rate of the proposed method is excellent even with fewer images used for training and testing and with variations in color, size, shape, texture, illumination, and brightness. The proposed algorithm detects the two major maturity stages of coconuts in our dataset images. These experimental results show that the algorithm can detect coconuts in images from real-world environments, i.e., coconut farms, and in Google images.
5 Performance Evaluation Analyses
5.1 Coconut Maturity Stages Recognition
The performance of the object detection algorithm depends on the performance of the classifier used. The performance was evaluated by calculating statistical indicators, namely precision, recall (sensitivity), and F1 score.

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \tag{1}$$
Table 2 Performance comparison of the various classifiers with different performance metrics

| Name of the classifier | Classes | Precision | Recall | F1 score |
|---|---|---|---|---|
| DCNN | Mature coconut | 99 | 98 | 98 |
| DCNN | Tender coconut | 100 | 99 | 100 |
| SVM | Mature coconut | 96 | 97 | 98 |
| SVM | Tender coconut | 99 | 98 | 98 |
| KNN | Mature coconut | 90 | 93 | 91 |
| KNN | Tender coconut | 92 | 94 | 93 |
| Decision tree | Mature coconut | 90 | 97 | 94 |
| Decision tree | Tender coconut | 90 | 100 | 95 |
| Naïve Bayes | Mature coconut | 85 | 82 | 80 |
| Naïve Bayes | Tender coconut | 71 | 86 | 77 |
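$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \tag{2}$$

$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$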
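Equations (1) to (3) translate directly into code. The sketch below computes the three indicators from raw counts; the counts in the example are made-up numbers, not results from this study.

```python
def precision(true_positives, false_positives):
    """Eq. (1): share of predicted positives that are correct."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Eq. (2): share of actual positives that are found."""
    return true_positives / (true_positives + false_negatives)


def f1_score(precision_value, recall_value):
    """Eq. (3): harmonic mean of precision and recall."""
    return 2 * precision_value * recall_value / (precision_value + recall_value)


# Hypothetical counts for one class of a two-class test set.
tp, fp, fn = 8, 2, 2
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
```

Because F1 is a harmonic mean, it always lies between the precision and recall values it is computed from.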
The performance comparison of various classifiers, namely DCNN, SVM, KNN, decision tree, and Naïve Bayes, in terms of precision, recall, and F1 score is presented in Table 2. According to the classification metrics obtained from the validation step, the F1 score values of the DCNN for the detection of tender and mature coconuts were 98 and 99, respectively, and the overall F1 score achieved by the algorithm was 99%. The results show that the DCNN classification algorithm performs better than the other classifiers, such as SVM, K-nearest neighbor (KNN), decision tree, and Naïve Bayes, for the classification of the two classes taken from the developed coconut image dataset. Further, the performance comparison of various CNN models in terms of precision, recall, and F1 score is presented in Table 3. According to the classification metrics obtained from the validation step, the F1 score values of ResNet50 for the recognition of tender and mature coconuts were 98 and 100, respectively, and the overall F1 score was 99%. The F1 score values of the InceptionV3 architecture for recognizing tender and mature coconuts were 98 and 98. Although InceptionV3 also gives good recognition results, its classification rate for tender and mature coconuts is lower. The results show that the ResNet50 CNN architecture performs better than the other architectures, namely Xception, InceptionV3, VGG16, VGG19, MobileNet, and InceptionResNetV2, for the classification of the two classes taken from the developed coconut image dataset. The performance evaluation results of the proposed CNN architectures with two different test sizes are given in Tables 4 and 5 in terms of two accuracy
Table 3 Performance comparison of the various CNN models with different performance metrics

| Name of the CNN model | Classes | Precision | Recall | F1 score |
|---|---|---|---|---|
| Xception | Mature coconut | 92 | 96 | 94 |
| Xception | Tender coconut | 100 | 99 | 99 |
| InceptionV3 | Mature coconut | 100 | 95 | 98 |
| InceptionV3 | Tender coconut | 100 | 97 | 98 |
| VGG16 | Mature coconut | 94 | 100 | 97 |
| VGG16 | Tender coconut | 90 | 100 | 93 |
| VGG19 | Mature coconut | 100 | 93 | 96 |
| VGG19 | Tender coconut | 90 | 100 | 95 |
| ResNet50 | Mature coconut | 100 | 100 | 100 |
| ResNet50 | Tender coconut | 100 | 97 | 98 |
| MobileNet | Mature coconut | 91 | 83 | 87 |
| MobileNet | Tender coconut | 85 | 96 | 90 |
| InceptionResNetV2 | Mature coconut | 100 | 93 | 96 |
| InceptionResNetV2 | Tender coconut | 92 | 100 | 96 |
Table 4 Performance results of CNN models for test size: 0.10

| CNN model | Top-1 accuracy (%) | Top-5 accuracy (%) |
|---|---|---|
| Xception | 92.79 | 97.30 |
| InceptionV3 | 97.24 | 99.65 |
| VGG16 | 87.29 | 98.53 |
| VGG19 | 88.24 | 99.26 |
| ResNet50 | 98.32 | 99.85 |
| MobileNet | 76.62 | 90.44 |
| InceptionResNetV2 | 96.06 | 99.26 |

Table 5 Performance results of CNN models for test size: 0.30

| CNN model | Top-1 accuracy (%) | Top-5 accuracy (%) |
|---|---|---|
| Xception | 92.84 | 98.57 |
| InceptionV3 | 97.70 | 99.85 |
| VGG16 | 89.45 | 99.32 |
| VGG19 | 87.34 | 98.77 |
| ResNet50 | 98.53 | 100 |
| MobileNet | 79.80 | 91.52 |
| InceptionResNetV2 | 97.54 | 99.78 |
rates [12]. The top-1 accuracy is the conventional accuracy: the model's answer (the one with the highest probability) must be exactly the expected answer. The top-5 accuracy means that any of the model's five highest-probability answers must match the expected answer. Top-1 and top-5 accuracy may also be called rank-1 and rank-5 accuracy. Furthermore, the results showed that ResNet50 could recognize a coconut or a bunch of coconuts under variations in color, size, shape, illumination, and brightness. This work acts as a base for locating the position of tender and mature coconuts against a complex background in the tree crown for harvesting operations.
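The top-k definition above can be sketched as a small function. The probabilities and labels below are invented for a three-class toy problem and do not come from the chapter's experiments.

```python
def top_k_accuracy(probabilities, labels, k):
    """Fraction of samples whose true label is among the k most probable classes."""
    hits = 0
    for probs, label in zip(probabilities, labels):
        # Class indices ordered from most to least probable.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy predictions for three samples over three classes (illustrative only).
predicted = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
expected = [0, 1, 2]

top_1 = top_k_accuracy(predicted, expected, 1)
top_2 = top_k_accuracy(predicted, expected, 2)
```

A top-k score is only informative when k is smaller than the number of classes, because once k reaches the class count every label falls inside the top k.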
6 Conclusion
Coconut maturity stage recognition using CNN models on the generated dataset was discussed on the basis of our experimental work. The maturity stages of coconuts in different environmental conditions were detected efficiently. We verified real-world challenges with our dataset by checking the algorithm with real-time images captured in coconut farms and images from Google. The ResNet50 model detects coconut maturity stages more efficiently than the other CNN models used in this study, with a top-1 accuracy of 98.32% and a top-5 accuracy of 99.85% for test size 0.10, and a top-1 accuracy of 98.53% and a top-5 accuracy of 100% for test size 0.30.
Acknowledgements This research did not receive specific grants from funding agencies in the public, commercial, or not-for-profit sectors.
Conflict of Interest None.
References
1. Ai Y, Sun C, Tie J, Cai X (2020) Research on recognition model of crop diseases and insect pests based on deep learning in harsh environments. IEEE Access 8:171686–171693. https://doi.org/10.1109/ACCESS.2020.3025325
2. Amara J, Bouaziz B, Algergawy A (2017) A deep learning-based approach for banana leaf diseases classification. In: BTW workshop, Stuttgart, pp 79–88
3. Anami BS, Malvade NN, Palaiah S (2020) Deep learning approach for recognition and classification of yield affecting paddy crop stresses using field images. Artif Intell Agric 4:12–20, ISSN 2589-7217. https://doi.org/10.1016/j.aiia.2020.03.001
4. Ashiqul Islam Md, Nymur Rahman Shuvo Md, Shamsojjaman M, Hasan S, Shahadat Hossain Md, Khatun T (2021) An automated convolutional neural network based approach for paddy leaf disease detection. Int J Adv Comput Sci Appl 12(1)
5. Brahimi M, Boukhalfa K, Moussaoui A (2017) Deep learning for tomato diseases: classification and symptoms visualization. Appl Artif Intell 299–315. https://doi.org/10.1080/08839514.2017.1315516
6. Chen SW, Shivakumar SS, Dcunha S, Das J, Okon E, Qu C, Kumar V (2017) Counting apples and oranges with deep learning: a data-driven approach. IEEE Robot Autom Lett 2(2):781–788
7. Chollet F (2017) Xception: deep learning with depth-wise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1251–1258
8. Choudhury SD, Goswami S, Bashyam S, et al (2017) Automated stem angle determination for temporal plant phenotyping analysis 237:2022–2029
9. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, pp 770–778
10. Helwan A, Ma’aitah MKS, Abiyev RH, Uzelaltinbulat S, Sonyel B (2021) Deep learning based on residual networks for automatic sorting of bananas. J Food Qual 2021:11. Article ID 5516368. https://doi.org/10.1155/2021/5516368
11. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861
12. https://towardsdatascience.com/accuracy-and-loss-things-to-know-about-the-top-1-and-top5-accuracy-1d6beb8f6df3
13. Hussain I, He Q, Chen Z (2018) Automatic fruit recognition based on DCNN for commercial source trace system. Int J Comput Sci Appl (IJCSA) 8(2/3)
14. Ilyas M, Ur Rahman S, Waqas M, Alam F (2018) A robust algorithm for fruits recognition system. Transylvanian Rev XXVI(32):8319–8325
15. Kamilaris A, Prenafeta-Boldu FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90
16. Kapach K, Barnea E, Mairon R, Edan Y, Ben-Shahar O (2012) Computer vision for fruit harvesting robots: state of the art and challenges ahead. Int J Comput Vision Robot 3(1/2):4–34
17. Kim H, Kim J, Choi S-W et al (2016) The study of MP-MAS utilization to support decision-making for climate-smart agriculture in rice farming. Korean J Agric Forest Meteorol 18:378–388
18. Krig S (2016) Computer vision metrics: survey, taxonomy, and analysis of computer vision. In: Visual neuroscience, and deep learning. Berlin, Germany, Springer, p 637
19. Li K, Lian H, Van Deun R et al (2019) A far-red-emitting NaMgLaTeO6:Mn4+ phosphor with perovskite structure for indoor plant growth. Dyes Pigm 162:214–221
20. Luo L, Tang Y, Lu Q, Chen X, Zhang P, Zou X (2018) A vision methodology for harvesting robot to detect cutting points on peduncles of double overlapping grape clusters in a vineyard. Comput Ind 99:130–139. https://doi.org/10.1016/j.compind.2018.03.017
21. Patrício DI, Rieder R (2018) Computer vision and artificial intelligence in precision agriculture for grain crops: a systematic review. Comput Electron Agric 153:69–81
22. Pérez-Pérez BD, García Vázquez JP, Salomón-Torres R (2021) Evaluation of convolutional neural networks’ hyper parameters with transfer learning to determine sorting of ripe medjool dates. Agriculture 11:115. https://doi.org/10.3390/agriculture11020115
23. Qi H, Liang Y, Ding Q, Zou J (2021) Automatic identification of peanut-leaf diseases based on stack ensemble. Appl Sci 11:1950. https://doi.org/10.3390/app11041950
24. Rahnemoonfar M, Sheppard C (2017) Deep count: fruit counting based on deep simulated learning. Sensors 17(4):905
25. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
26. Song Y, Glasbey C, Horgan G, Polder G, Dieleman J, Van der Heijden G (2014) Automatic fruit recognition and counting from multiple images. Biosys Eng 118:203–215
27. Szegedy C, Ioffe S, Vanhoucke V, Alemi A (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence 2017, San Francisco, CA, USA, p 31
28. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, pp 2818–2826
29. Tian H, Wang T, Liu Y, Qiao X, Li Y (2020) Computer vision technology in agricultural automation: a review. Inf Process Agric 7(1):1–19, ISSN 2214-3173. https://doi.org/10.1016/j.inpa.2019.09.006
30. Xiong J, Lin R, Liu Z, He Z, Tang L, Yang Z, et al (2018) The recognition of litchi clusters and the calculation of picking point in a nocturnal natural environment. Biosyst Eng 166:44–57. https://doi.org/10.1016/j.biosystemseng.2017.11.005
31. Ye H, Han H, Zhu L, Duan Q (2019) Vegetable pest image recognition method based on improved VGG convolution neural network. J Phys Conf Ser 1237:032018. https://doi.org/10.1088/1742-6596/1237/3/032018
32. Zaccone G, Karim MR (2018) Deep learning with TensorFlow: explore neural networks and build intelligent systems with Python. Packt Publishing Ltd., Birmingham, UK
33. Zhang YD, Dong Z, Chen X, Jia W, Du S, Muhammad K, Wang SH (2017) Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimedia Tools Appl 1–20
34. Zhao Y, Gong L, Huang Y, Liu C (2016) A review of key techniques of vision-based control for harvesting robot. Comput Electron Agric 127:311–323
35. Zhou WJ, Zha ZH, Wu J (2020) Maturity discrimination of “Red Globe” grape cluster in grapery by improved circle Hough transform. Trans Chin Soc Agric Eng 36:205–213
36. Zhu N, Liu X, Liu Z, Hu K, Wang Y, Tan J, Huang M, Zhu Q, Ji X, Jiang Y, Guo Y (2018) Deep learning for smart agriculture: concepts, tools, applications, and opportunities. Int J Agric Biol Eng 11(4):32–44
Chapter 8
Agri-Food Products Quality Assessment Methods
Sowmya Natarajan and Vijayakumar Ponnusamy
S. Natarajan · V. Ponnusamy (B) SRM Institute of Science and Technology, Kattankulathur, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_8

1 Introduction
Agriculture plays a significant role in producing good-quality food products for consumers around the world. The quality and safety of food products are a major concern in the current economy. Quality is degraded by many factors, such as colorant additives, inclusion of foreign bodies, adulterant mixtures, soil nutrition, and the spraying of harmful pesticides and insecticides. Food analysis determines various characteristics of food products, including their structure, composition, sensory attributes, and physicochemical properties. The foremost concern of food processing analysis is to ensure security and safety. Because food safety has become essential with the rise of contamination scares [1], governments impose policies, recommendations, and regulations designed to sustain the supply of quality food, to ensure the safety of consumers, and to deliver information about nutritional values and food composition. From the labeled data, consumers can learn the dietary information of the food, which also helps to eradicate economic fraud. Food regulation involves many agencies, such as the United States Department of Agriculture (USDA), the Food and Drug Administration (FDA), the Environmental Protection Agency (EPA), and the National Marine Fisheries Service (NMFS) [2]. Traditional monitoring systems rely on destructive biochemical analysis for food quality monitoring. Efficient food quality monitoring is necessary in the food production industry. The latest technology involves online real-time food quality monitoring with emerging deep neural network algorithms. One of the major techniques that addresses quality issues in food products is precision agriculture, which plays a better role and provides better
solutions for some of the current issues. On-site crop monitoring with sensor data enables better prediction of yield reduction and helps to find its root cause. By combining the agriculture system with data analytics and machine learning algorithms, smart agriculture improves the delivery of good agri products to individuals. Precision agriculture takes advantage of the latest technologies, such as unmanned aerial vehicles (UAV), the Internet of Things (IoT), and augmented reality (AR), to predict and assess healthy agri products [3].

Toxicity or hazards may be chemical, physical, or biological, causing adverse effects on consumers’ health. The presence of foreign materials such as stones, buttons, hair, and strings in food products is referred to as a physical hazard. Adulteration can be the intentional or incidental addition of substances that spoil the real product. Incidental adulteration includes the intrusion of pests and chemicals, such as the presence of arsenic in the soil. Intentional adulteration is the addition of chemical substances, fertilizers, and herbicides to agri products to make them appear fresher and longer lasting. The added toxic substances make the product harmful to consumers. Some of the common spectroscopic techniques for adulterant identification are discussed in this work. Ultraviolet, terahertz, near-infrared, and microwave spectral radiation are employed for adulterant determination. It is concluded that Fourier transform infrared spectroscopy (FT-IR) and near-infrared (NIR) spectroscopy predict adulteration when combined with chemometric analysis. The chemometric technique involves processing the acquired spectral data with machine learning algorithms. It covers both solid and fluid agri products. Low-power, low-cost devices will revolutionize the deployment of toxicity determination in the near future. Figure 1 shows the various chemical toxins present in foods.
The presence of some microorganisms (pathogens, insects, food-borne pathogens, bacteria, viruses, molds, and protozoa) leads to biological hazards [4]. Sound wave techniques are deployed non-destructively to determine food quality. Acoustic and ultrasonic waves with frequencies from above 20 kHz to 1 MHz [5] are passed through the food product tissue. During postharvest, the response to the sound vibrations is used to predict the ripening stage, maturity, and internal quality parameters. Ultrasound waves cannot be perceived by human ears. High- and low-intensity mechanical waves play a significant role in food processing applications. Sound waves are used to determine the quality of cow milk and for oil grading, freezing, thawing, dehydration, emulsification, and foreign body identification. For in vitro property determination, a high-frequency microwave component is deployed. This technique is used in large-scale industrial measurement, suits real-time designs, and its experimental setup is free from environmental conditions, temperature error, and biochemical effects.
Fig. 1 Chemical hazards in food substances (items shown: cleaning chemicals; pesticide residues; non-permissible food additives; excess permissible food additives; adulterants; veterinary residues)
2 Techniques for Food Quality Analysis
Current food industries focus on developing smart products that fulfill consumer requirements in the market, with rapid results and better accuracy. Analysis of food products yields information about the presence of nutritional components and the absence of metals and foreign elements. It also verifies that foodstuffs touted as organic are indeed organic. Large-scale industries can thereby ensure compliance with the laws and regulations set by the government. However, some of the existing mechanisms are time-consuming, depend on laboratory-based analysis, are expensive, and are not suitable for real-time analysis. Many non-destructive techniques are deployed for the determination of nutritional composition, but not for adulterant identification. Low-cost, rapid, non-destructive applications still need to be enhanced to identify adulterants in food substances with better accuracy [6]. Food product assessment involves the evaluation of internal and external factors. Table 1 describes the internal and external quality parameters of fruits and vegetables. Nutritional quality and the presence of adulterants need to be determined in both the external and internal parts of food products, without affecting the surface of the products, to deliver better quality to consumers. Table 2 shows some of the non-destructive mechanisms deployed to identify the quality of agricultural food products. Common methods include the electronic nose, hyperspectral imaging, machine vision, near-infrared spectroscopy, and acoustic and ultrasound measurement. These can supply mechanical, structural, chemical, and physical data about food products. Analysis processes are influenced by modifications in those properties of the food substances, and accuracies can be measured.

Table 1 Internal and external factors of vegetables and fruits

| S. No. | Internal factors | External factors |
|---|---|---|
| 1 | Texture: crispness, firmness, juiciness | Shape: depth/diameter, ratio |
| 2 | Defect: water core, rotten, frost damage, internal cavity | Defect: spot, stab, bruise |
| 3 | Nutrition: proteins, vitamins, carbohydrates | Color: intensity, uniformity |
| 4 | Flavor: sourness, astringency, sweetness, aroma | Size: dimension, weight, volume |

Table 2 Non-destructive methods for quality analysis

| Methods | Techniques deployed | Components estimated |
|---|---|---|
| Dynamics | Sonic, ultrasonic, CT and X-ray image | Internal cavity density, viscoelasticity, structure, firmness, ripeness |
| Optics | Reflectance, transmittance, laser and absorbance spectroscopy image and vibrated excitation | Color, internal and external defects, size, shape, chemical constituents, etc. |
| Electromagnetic | NMR, impedance | Sugar, moisture content, oil, structure, internal defects, density |
| Chemical | E-tongue, E-nose | Sugar, acidity |

This chapter intends to contribute in the following areas.
• This work gives more insight into the state of the art in food quality assessment, particularly non-destructive techniques.
• A focused summary of imaging, spectroscopic, and sound wave mechanisms, and of blockchain technology for traceability in the food supply chain, is presented.
• The challenges, research gaps, and future scope of agri-product quality assessment mechanisms are summarized.
The following sections describe imaging and spectroscopic data combined with chemometric analysis. Continuous grading of foods through supply chain activity based on a blockchain mechanism is also discussed.
3 Imaging Techniques for Quality Assessment
With the growing population worldwide, it is necessary to meet expectations for quantitative and qualitative food monitoring standards [7]. Computer vision, a consistent, rapid, and economical inspection technique, is deployed in many industries for food quality analysis. Nowadays, machine vision systems (MVS) assist researchers and industry practitioners in enhancing food processing and analysis. Computer-aided techniques are adapted for
a wide range of agricultural and food materials. The technique can analyze food materials by size, shape, color, and microbial infection, and supports grading, sorting, and defect detection. Image processing can be applied in many large-scale industries, for example to sort fruit by size, shape, color, and texture, and to grade products based on these sorting properties in bakeries, dairy production, meat quality estimation, etc. This non-destructive technique is a completely automated process that provides better accuracy. It is widely deployed in many industries for quality assessment and grading of fruits and nuts, including apples, oranges, strawberries, oil palm fruits, papaya, kiwi, nuts, tomato, peach, pear, and pomegranate. In post-harvest fruit handling, computer vision systems produce a maximum of 70% grading accuracy. The technique has limitations: sensitivity to lighting conditions, incomplete coverage of the product surface during data acquisition, the unstructured nature of food substances, field variation, and biological variation among plants. To alleviate these limitations, three-dimensional computer vision may provide better accuracy and reliability.

Color is also the primary factor that indicates the quality and freshness of meat. Some traditional biochemical experiments have been employed to assess meat quality [8]. Examining meat visually is time-consuming, and the results may be unreliable. Another testing method is colorimetric analysis, which needs trained technicians, requires expensive equipment, and is time-consuming. To alleviate these limitations, the work in [8] proposes a non-destructive, low-cost technique that evaluates meat quality from a color image of the meat sample. Images are acquired with a smartphone; a color card is placed beside the meat sample, and the color information is extracted automatically. Color blocks are used to extract the pixel values and to calculate the color transformation matrix that gives the color correlation. Hierarchical clustering is applied to all the experimental samples and yields three quality levels. Using the resulting cluster centroids, a new sample is assigned to one of the three grades to assess the quality of its color.

Analyzing the nutritional quality of agricultural food products is also one way to help eradicate malnutrition [9]. Researchers have discussed and analyzed several indicators for evaluating nutritional factors. Existing indicators capture compositional elements of nutritional quality at various stages of agricultural production. Food products were followed from the production stage through a chosen aquaculture system, and the limitations and strengths of the indicators were compared. The resulting indicators, ‘potential nutrient adequacy,’ ‘nutritional yields,’ and ‘Rao’s quadratic entropy,’ better capture the ability of a production system to nourish a large number of people, which makes them useful tools for decision-making and for prioritizing investment in the private and government sectors.

Leafy vegetables are part of individual diets worldwide. Nutritional quality analysis is becoming the most important factor for a healthy human diet [10]. The work in [10] focuses on traceability and evaluation methods to track the nutritional quality of leafy vegetables. Hazard Analysis and Critical Control Point (HACCP) combined with Fault Tree Analysis (FTA) is implemented from production to sale to trace leafy vegetable nutritional quality. Spinach, lettuce, celery, and
S. Natarajan and V. Ponnusamy
rape are the four common leafy vegetables employed for the analysis. By monitoring the chain from production through transportation to sale, issues can be resolved in time, and the proposed techniques achieve better reliability. Fuzzy logic allows the leafy vegetables to be sorted or graded by nutritional quality and also enhances the traceability of the system. This information is stored in a two-dimensional code, which enables consumers to learn about the product’s nutritional quality and traceability information efficiently. Post-harvest quality control is one major factor for the food industry to be competitive in markets. The traditional inspection system is based on visual inspection by workers, which can give errors due to poor vision or limited proficiency. Imaging techniques can perform a wider variety of the operations needed for quality control. Foods are evaluated by color, shape, texture, and other physical characteristics with the aid of imaging techniques, which are reported to provide 100% accuracy in detecting nutritional quality as well as the presence of adulterants. With automated assessment, imaging techniques give more sophistication in assessing the quality of food products and better serve consumers as well as industry.
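The final grading step described for meat color evaluation [8], where a new sample is assigned to one of three quality levels using cluster centroids, can be sketched as a nearest-centroid rule. This is only an illustrative sketch: the centroid values, the grade names, and the helper names `mean_color` and `assign_grade` are assumptions introduced here and are not taken from [8]; the color-card correction itself is not shown.

```python
import math

# Hypothetical centroids of three quality grades in color-corrected RGB space.
# The numbers are illustrative only and are not taken from [8].
GRADE_CENTROIDS = {
    "grade_1": (200.0, 120.0, 110.0),
    "grade_2": (170.0, 100.0, 95.0),
    "grade_3": (130.0, 85.0, 80.0),
}


def mean_color(pixels):
    """Average a list of (R, G, B) pixel tuples into one mean color."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def assign_grade(color, centroids=GRADE_CENTROIDS):
    """Return the grade whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda g: math.dist(color, centroids[g]))


# Two corrected pixels from a hypothetical meat sample.
sample = mean_color([(164.0, 101.0, 96.0), (166.0, 103.0, 98.0)])
print(assign_grade(sample))
```

In practice the centroids would come from hierarchical clustering of the corrected training samples rather than being fixed by hand.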
4 Spectroscopic Methods for Food Quality Analysis
Traditional methods for internal quality analysis of agri-food products are laborious, offline, time-consuming, and destructive. Spectroscopic methods help with internal quality assessment of soft-skinned agri-products. Food substances contain various chemical components that absorb light at specific wavelengths. In the visible and infrared range (380–900 nm), some of the chemical compounds present in food materials absorb light energy. The light emitted by the food materials can also be captured using spectrophotometers, revealing data about surface structure and chemical properties [11]. Spectroscopy is thus a promising technique for discrimination, authentication, and quality measurement. Internal solid and liquid characteristics can be assessed better by time-domain nuclear magnetic resonance (TD-NMR). The work in [12] evaluates the in-shell quality of hazelnuts. This technique delivers an inline method, reduces analysis time, and rapidly separates defective hazelnuts from good ones. The signal information is processed by three algorithms that evaluate the presence of mold, kernel development, and moisture content. An accuracy of 95% was achieved in distinguishing good hazelnuts from bad ones. A method combining multispectral imaging with machine learning (ML) algorithms [13] is presented for the assessment and validation of microbiological meat quality. The spectral data are processed with a support vector machine algorithm, which delivers a root mean square error of 1.146 for aerobic storage and 0.886 for modified atmosphere packaging storage. This technique provides good meat quality estimation under various temperature and packaging conditions.
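Several results in this section are reported as a root mean square error (RMSE). As a reminder of what that figure measures, the following minimal sketch computes RMSE between reference measurements and model predictions. The function name `rmse` and the sample values are assumptions made for illustration; they are not data from [13].

```python
import math


def rmse(reference, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical reference values and predictions from a regression model.
reference = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 8.5]
print(rmse(reference, predicted))  # every error is 0.5, so RMSE is 0.5
```

A lower RMSE means the predictions sit closer to the reference measurements, in the same units as the measured quantity.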
8 Agri-Food Products Quality Assessment Methods
A non-destructive fruit internal quality monitoring system is designed using time-resolved reflectance spectroscopy [14]. It identifies the internal quality of the fruit, the optimum harvesting stage, and changes during storage. The reflectance is determined by internal scattering and the mean free path of the light scattered from the sample. Red Star and Granny Smith apples are used for the experiments. A change in the internal pulp pigmentation changes the order of scattering and the mean free path, which affects the scattering of the reflected light. Degradation in quality is correlated with the degree of polarization: changes in the internal pulp of the fruit modify the degree of polarization. It is a low-cost, reliable, and simple model for quality assessment. Many non-destructive and fingerprint techniques are evolving in the field of food quality analysis. Flexible substrates for surface-enhanced Raman spectroscopy (SERS), as an alternative to colloidal and rigid substrates, have received growing interest in this field. The review in [15] examines in-situ destructive methods, swab sampling schemes, and flexible materials for constructing flexible SERS substrates. Cellulose paper and polymer membrane methods are applied to tapes, cotton fabrics, and biomaterials. Tape is sticky, and its viscosity and high fluorescence affect the results. Recent applications of SERS are discussed, including detection of pesticides in vegetable and fruit samples, food adulteration, chemical residues from animal farming, and food-borne infections caused by pathogens. These techniques are used to trace and sense pesticides such as carbaryl, chlorpyrifos, and parathion-methyl. Near-infrared hyperspectral imaging was deployed to analyze the deterioration and shelf life of baked cakes [16]. During storage, the cake is affected by microorganisms.
This method provides non-destructive determination in the range of 935–1720 nm. Partial least squares regression (PLSR) is adopted to predict the storage time of sponge cakes. The technique delivers a correlation coefficient of 83% and a root mean square error (RMSE) of prediction of 1.242. A further analysis distinguishes expired from non-expired sponge cakes using partial least squares discriminant analysis (PLS-DA), with a prediction accuracy of 91.3%. These techniques can therefore be employed to predict cake storage time and to discriminate expired from non-expired cakes. One cause of food quality degradation is an uneven cooking process. Especially in microwave ovens, the temperature distribution is non-uniform, which may affect the quality of baked goods [17]. Studies of conventional oven heating focus on simulation, not on quantitative analysis of the baked goods. This work experimentally identifies temperature uniformity from the baked goods. An image processing model is deployed to visualize the uniformity under various baking conditions, and the temperature field inside the oven is expressed digitally. Digital cameras capture the baked goods. The regions containing baked goods are then extracted, and linear iterative clustering segmentation is applied to obtain the baked states of the goods. Color features are used to validate the temperature uniformity of the oven. A non-invasive acoustic technique [18] can be used to determine the change in quality and the Elasticity Index (EI) of fruits. This work employs kiwi fruit for the
quality and EI analysis with respect to the number of days stored at various storage temperatures. The results show that temperature variation affects the quality indices and EI of the kiwi fruit. The EI correlates with the sensory indices and tissue strength of the fruit. The fruit sample is excited in vibration mode for the acoustic experiment, and the variation of the parameters is extracted. The data are fitted to zero-, first-, and second-order kinetic models to analyze the quality indices. Of the three models, the zero-order kinetic model combined with the Arrhenius model delivers the best accuracy, 99.7%, at the specified temperature. A multispectral spectroscopic sensor system [19] is designed to identify four adulterants: dextrose, ammonium sulfate, hydrogen peroxide, and sodium salicylate. The adulterants are added to milk in various concentrations, and the data acquired through the multispectral sensor system are processed with five machine learning algorithms: Naive Bayes, linear discriminant analysis, decision tree, K-nearest neighbour, and a neural network model. The neural network model achieves the highest accuracy, 100%, in detecting adulterants in milk samples. Figure 2 shows some of the common non-destructive techniques involved in agri-food product analysis. Spectroscopic and imaging techniques play a vital role in most food quality assessments. The strengths and weaknesses of the latest non-destructive examination techniques are discussed below. Currently, hyperspectral imaging (HSI), laser light backscattering imaging (LLBI), magnetic resonance imaging, X-ray CT, Raman spectroscopy, and near-infrared spectroscopy are deployed for non-destructive fruit quality analysis. Hyperspectral imaging combines the features of spectroscopic and imaging techniques. It should deliver better results with precise food quality.
However, there are a few limitations in image acquisition, extraction, and analysis, which limit its use in automatic online fruit quality classification. The system is expensive; cost aside, it can attain better sensitivity. LLBI helps in assessing the chemical and physical properties of fruits. It provides good quality analysis and is low-cost and rapid, but the fruit needs to be rotated while the data are acquired, and it is not suitable for continuous fruit quality assessment. Magnetic resonance imaging (MRI) is another non-destructive technique that can analyze complex, uneven food varieties. Due to its high cost, standard operating procedures, and safety issues, it is limited to a few applications. Given the complex nature of the analysis on fruits, technicians must maintain standard operating procedures so that results can be compared with other fruit quality measures. X-ray CT offers rapid detection, better resolution, strongly penetrating rays, and more intuitive assessment, and thereby takes advantage of non-destructive analysis. It can resolve density differences of less than 1%, which near-infrared techniques are unable to differentiate. X-ray CT can provide three-dimensional CT imaging of food products; however, it may take a long time to acquire the image, like hyperspectral imaging
(HSI), and it cannot be applied to online quality sorting and grading, but it is suitable for quality analysis of thick fruits.
Fig. 2 Block diagram for non-destructive techniques for food quality evaluation (techniques shown: hand-held portable devices, NIR, e-nose, HSI, LC-MS, FTIR, HPLC, Raman spectroscopy, GC-MS, spectrophotometer, NMR, e-tongue, fluorescence spectroscopy, computer vision)
NIR spectroscopy is the common methodology applied for food product analysis. It provides the compositional elements of food, but it fails to detect samples with less than 1% total soluble solids (TSS). This model delivers better results only under certain conditions. It requires a significant number of reference data samples for calibration. The system is cheap, so it can be designed to be portable and can achieve online fruit sorting and grading with precise results. It is more suitable for assessing thin-skinned fruits. Raman spectroscopy is another traditional instrument employed for non-destructive analysis. Undesirable baseline and background noise may cause difficulties for researchers during fruit quality analysis. It is hard to analyze complex fruit structures [20]. The overall focus of deploying spectroscopic determination mechanisms in food quality analysis is non-destructive testing. This helps consumers to know the internal as well as the external qualitative and quantitative measures
about the products. Quality, authentication, discrimination between organic and non-organic varieties, harvesting time, pathogen detection, pesticide quantification, shelf-life determination, and the elasticity index of a food product can be determined with respect to the number of days it has been stored; internal chemical properties can also be identified with these non-destructive spectroscopic methods. All the findings were obtained in a laboratory environment, not in agricultural fields. There is therefore a gap in developing low-cost, portable, real-time, on-field monitoring systems, which needs to be addressed.
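The kinetic modelling mentioned earlier in this section for kiwifruit quality [18] compares zero-, first-, and second-order models and combines the rate constant with the Arrhenius relation. A minimal sketch of that comparison is given below, using the standard textbook forms Q = Q0 − kt, ln Q = ln Q0 − kt, and 1/Q = 1/Q0 + kt, each fitted by a straight line. The function names, the sample data, and the use of R² on the transformed data as the selection criterion are assumptions made here for illustration and are not the procedure of [18].

```python
import math


def linear_fit(x, y):
    """Least-squares line y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, 1.0 - ss_res / ss_tot


# Each model is linear in time after transforming the quality index Q.
TRANSFORMS = {
    "zero": lambda q: q,             # Q = Q0 - k*t
    "first": lambda q: math.log(q),  # ln Q = ln Q0 - k*t
    "second": lambda q: 1.0 / q,     # 1/Q = 1/Q0 + k*t
}


def fit_kinetic_models(days, quality):
    """Return {order: (rate_constant, r_squared)} for a decaying index."""
    results = {}
    for order, transform in TRANSFORMS.items():
        _, slope, r2 = linear_fit(days, [transform(q) for q in quality])
        rate = slope if order == "second" else -slope
        results[order] = (rate, r2)
    return results


def arrhenius_rate(pre_exponential, activation_energy, temperature_k):
    """Arrhenius relation k = A * exp(-Ea / (R*T)), Ea in J/mol, T in K."""
    gas_constant = 8.314  # J/(mol*K)
    return pre_exponential * math.exp(
        -activation_energy / (gas_constant * temperature_k))


# Hypothetical quality index that drops by 0.5 per day.
days = [0, 2, 4, 6, 8]
quality = [10.0, 9.0, 8.0, 7.0, 6.0]
fits = fit_kinetic_models(days, quality)
best_order = max(fits, key=lambda order: fits[order][1])
print(best_order, fits[best_order])
```

Comparing R² across differently transformed data is a simplification; a fuller analysis would compare the back-transformed predictions against the measured values.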
5 Role of Machine Learning and Deep Learning in Food Quality Evaluation
With the increase in the world’s population, food demand is rising and food industries are expanding rapidly. It is now the need of the hour to minimize food waste and to improve supply chain optimization, food delivery, food safety, and food logistics. Machine learning (ML) and deep learning help achieve these objectives by predicting outcomes without being explicitly programmed. These algorithms provide better classification of imaging and spectroscopic data. Spectral responses and images are pre-processed and fed to neural networks and machine learning algorithms to improve classification accuracy. The classified results are communicated through IoT devices, which let industry personnel view the results from anywhere in the world. This section describes the implementation of machine learning and deep learning techniques in food quality assessment with imaging and lists a few challenges and research gaps. Image processing is one component of a machine vision system and uses machine learning and deep learning algorithms for rapid detection of foreign objects. There are many types of images, such as stereo, remote sensing, hyperspectral, thermal, X-ray, magnetic resonance, and terahertz images. Once the images are acquired, they are pre-processed: noise is removed and contrast is enhanced. The images are then segmented using one of various segmentation methods. After that, they are represented, described, recognized, and interpreted at the different levels of processing. Finally, the images are analyzed to determine the parameters of interest [4, 21]. Machine learning and deep learning algorithms are designed to learn patterns in images and produce accurate results rapidly.
Logistic regression, KNN, support vector machine (SVM), K-means clustering, Bayesian networks, random forest, decision tree, and fuzzy C-means are the traditional algorithms applied. These algorithms are used for fruit and vegetable grading, defect monitoring, foreign object detection, and adulterant determination [4].
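The pre-processing and segmentation stages described above can be illustrated with a small sketch: linear contrast stretching followed by Otsu thresholding on a grayscale image held as a list of rows. Otsu's method is used here only as one common choice of segmentation; the source does not name a specific method, and the function names and the 4×4 test image are assumptions made for illustration. Noise removal is omitted for brevity.

```python
def stretch_contrast(image):
    """Linearly rescale grey levels so they span the full 0-255 range."""
    flat = [p for row in image for p in row]
    low, high = min(flat), max(flat)
    if high == low:  # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = 255 / (high - low)
    return [[round((p - low) * scale) for p in row] for row in image]


def otsu_threshold(image):
    """Grey level t maximizing between-class variance (<= t versus > t)."""
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(256):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue
        weight1 = total - weight0
        if weight1 == 0:
            break
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def segment(image, threshold):
    """Binary mask: 1 for pixels brighter than the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Hypothetical low-contrast image: dark background, brighter 2x2 object.
image = [
    [60, 62, 61, 64],
    [63, 120, 125, 62],
    [61, 128, 130, 60],
    [64, 61, 63, 62],
]
enhanced = stretch_contrast(image)
mask = segment(enhanced, otsu_threshold(enhanced))
for row in mask:
    print(row)
```

The resulting mask would then be passed to the representation, description, and classification stages, for example to one of the traditional classifiers listed above.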
Low-cost, real-time moisture content detection is challenging and in demand. The work in [22] uses a less expensive mechanism that characterizes moisture content from terahertz transmittance response data collected with a vector network analyser. The transmittance response data are used to detect the moisture content with ML. Multi-domain features are extracted and processed by three ML algorithms: decision tree, support vector machine, and KNN. The moisture content variation is evaluated over four days, and the classifiers give 100% accuracy for days 1–4. Fruit classification is one of the significant applications in agriculture, especially in markets and supermarkets, to identify the species and its price. It helps to determine the variety of the fruit, not only the species. Two deep learning models are implemented in [23]: the first is built with six convolutional neural networks, whereas the second is framed with a combination of 16 pre-trained deep neural network models. Two image datasets are maintained: one with publicly available images and one with complex images in which both the variety and the species of the fruit are challenging to identify. The classification accuracy reaches 99.75% on the first dataset and 96.75% on the second. The analysis of total volatile organic compounds (TVOCs) from packed food samples is presented in [24]. Fish and pork samples are taken for analysis. The samples are kept for eight days in both refrigerated and ambient environments. A capacitive humidity sensor and a metal-oxide gas sensor are deployed for the TVOC analysis, powered by an energy harvester.
The data acquired through the sensors are processed with three algorithms, namely a convolutional neural network (CNN), SVM, and a multi-layer perceptron (MLP). Compared to pork, the fish sample data deliver a better accuracy of 99.9% with the CNN, which outperforms the other algorithms. A sensor-array-based technique is employed to detect intentional adulteration of black pepper seeds with papaya seeds [25], providing real-time, rapid, and low-cost detection of pepper adulteration. Three gas sensors were employed for the assessment, and papaya seed samples were mixed with commercial black pepper samples in various proportions. The data acquired through the gas sensors are processed by machine learning algorithms, which achieve 100% accuracy in identifying the papaya seed adulterant. With the deployment of an IoT architecture, the results can be accessed from anywhere. Image processing techniques also provide efficient nutritional analysis of food quality. Fruit freshness is another visual indicator that customers use to judge fruit quality. It also influences their desire to buy, their perception of healthiness, and the price of the fruit. Research is therefore also concentrating on designing and developing non-destructive quality analysis of fruit freshness. Apple freshness is determined in [26] using a division of focal plane (DoFP) polarization camera. The images are processed with two machine learning algorithms and validated for external rot conditions, predicting whether the fruit should be consumed or not. First, the images are reconstructed from the angle of polarization and the degree of linear polarization (DoLP) and
correlated with the age of the apple (days); they are then fed as input to the ML algorithms. The result is an accuracy of 92.57% in predicting the age of apples non-invasively. Banana samples were analyzed in [27] for changes in freshness with respect to storage information. This information is processed and evaluated with a transfer learning algorithm. The results show an accuracy of 98.92% in detecting freshness. The method is also applied to various other fruit varieties. Some of the future challenges with respect to imaging techniques are as follows:
1. In a machine vision system, a noisy target image or unsatisfactory illumination makes determination difficult. The reliability and robustness of machine vision systems may not meet food processing and industrial requirements. Because food substances vary in many features, more precise algorithms need to be developed to determine food information.
2. Imaging technology cannot capture smell, which is one of the essential features for food quality and safety evaluation. Online smell determination is difficult with imaging methods.
3. Storage and computation capability is another hard problem in machine vision systems. As the number of images increases, the processing time also rises, which delays the results. As a result, the technique is difficult to apply to large-scale food industry applications, and it sometimes cannot meet real-time requirements.
Future research should focus on eliminating the above-mentioned drawbacks, for example by:
• Enhancing the present algorithms to improve robustness, reliability, and accuracy.
• Implementing machine vision systems (MVS) in large-scale food production with embedded vision systems, which offer rapid processing, small size, and low cost.
• Incorporating MVS with IoT and edge computing to support smart agricultural methods and food processing.
With the aid of this technology, food delivery becomes easier, products can be tracked from agricultural land to the supermarket, wastage of vegetables and fruits is reduced, and consumers can order food through self-ordering kiosk systems. These techniques are applied widely in food processing applications such as food grading and sorting, foreign object identification, monitoring of defect areas on the surface of food items, moisture content estimation, adulteration identification through TVOCs from sensor data, and assessment of fruit freshness and external rot. Machine learning and deep learning algorithms acquire data from the system, by destructive or non-destructive methods, and process the information accordingly. Some drawbacks have been identified in classifying certain data, such as stored product information, very large-scale data that cannot be trained on by these algorithms because of data complexity and long processing times, and
noisy image processing. These problems can be overcome by pre-processing the input data or by increasing the processing speed of the systems.
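The polarization-based freshness analysis summarized in this section [26] works on images of the angle of polarization and the degree of linear polarization. For one pixel of a polarization sensor with 0°, 45°, 90°, and 135° channels, these two quantities follow from the linear Stokes parameters. The sketch below uses the standard definitions; the function names and the sample intensities are assumptions made for illustration, and the reconstruction pipeline of [26] is not reproduced.

```python
import math


def stokes_parameters(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer-channel intensities."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal versus vertical
    s2 = i45 - i135                     # +45 degrees versus -45 degrees
    return s0, s1, s2


def degree_of_linear_polarization(i0, i45, i90, i135):
    """DoLP = sqrt(S1^2 + S2^2) / S0, between 0 and 1 for ideal data."""
    s0, s1, s2 = stokes_parameters(i0, i45, i90, i135)
    if s0 == 0:
        return 0.0  # no light on this pixel
    return math.hypot(s1, s2) / s0


def angle_of_polarization(i0, i45, i90, i135):
    """AoP = 0.5 * atan2(S2, S1), in radians."""
    _, s1, s2 = stokes_parameters(i0, i45, i90, i135)
    return 0.5 * math.atan2(s2, s1)


# Ideal horizontally polarized light of unit intensity.
print(degree_of_linear_polarization(1.0, 0.5, 0.0, 0.5))
print(angle_of_polarization(1.0, 0.5, 0.0, 0.5))
```

Applying these functions pixel by pixel yields the DoLP and AoP images that can then be used as classifier inputs.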
6 Blockchain-Based Grading Mechanism
Food quality and safety can be improved via supply chain traceability with blockchain technology. Deployment of blockchain in the agri-food supply chain is at an initial stage. Retailers and leading companies should adopt this technique for specific objectives such as better transparency in the food supply chain, food quality, and responsiveness. Cryptocurrency is one of the emerging technologies for secure data transactions without the intervention of trusted third parties [28]. Blockchain is the underlying technology of cryptocurrency. It holds a distributed ledger made of blocks of information. It is an open ledger that keeps track of transaction records between two parties in a permanent and verifiable manner. An increasing number of research works now focus on blockchain-based traceability to enhance food safety and quality applications. Three ways in which agri-food traceability applications collect input are discussed in [28]. The first obtains data through GUI elements such as buttons or text fields: the user provides information manually through an input form, which can be used in any part of the supply chain, such as logistics, livestock management, and crop registration. The second obtains input from labels attached to the food items, which provide the traceability data of a product to the customer. The third gathers sensor data such as location, fertilizer concentration in the food, soil, humidity, and temperature [28]. Traditional food traceability involves isotopic techniques, barcode readers, and radio frequency identification (RFID) devices. These techniques provide information from the collection point to the distribution end. Conventional food quality monitoring methods lack automation, are inefficient, and face barriers to reliable monitoring.
Blockchain delivers a tamper-proof, decentralized mechanism which, combined with smart contracts, provides self-verifying, self-executing transactions that are carried out securely between untrusted parties. The work in [29] proposes an intelligent automated food quality monitoring system for the fruit juice production industry. It provides better reliability and automation. Surface models are built from the information collected before production, and production conditions are determined at each stage. With these data as input to the validation models, smart contracts decide whether to resume the production process or not. Quality monitoring of peach juice is implemented on the Ethereum platform using the Remix Integrated Development Environment (IDE). The various collected inputs are formatted according to a pattern. They can be presented in table format with tabulated food properties such as origin
and age of livestock. A timeline view can also be given for oil and egg traceability applications. However, tracking and maintaining the provenance of information is a challenging task in supply chain traceability. The traditional supply chain requires third-party approval for trading. It lacks accountability, auditability, and transparency, and it is time-consuming and complex. Existing blockchain applications for food traceability have drawbacks in scalability, reliability, and data accuracy. One of the latest solutions preferred by researchers for traditional supply chain issues is the deployment of smart contracts over the Ethereum blockchain network. The proposed network uploads the overall transaction data from the blockchain to the InterPlanetary File System (IPFS) [30]. This storage module returns a hash of the data, which ensures a reliable, secure, and efficient solution for information in the blockchain network. Visualization of the interactions between network entities is also provided through the smart contract system. The proposed work also evaluates and simulates the smart contracts, including their vulnerability and security. On the traceability issue in the supply chain, the work in [31] notes that the consensus mechanisms and data flows of blockchains evolved for cryptocurrency are not directly applicable to food supply chain traceability. A blockchain-IoT-based food traceability system is deployed, integrating IoT technology, fuzzy logic, and blockchain to manage the shelf life of perishable food. The integrated consensus mechanism considers shipment transit time, shipment volume, and stakeholder assessment.
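The idea of keeping bulky traceability records off-chain and storing only their hashes in a tamper-evident ledger can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the system of [30]: a Python dictionary stands in for IPFS, a plain SHA-256 hex digest stands in for an IPFS content identifier, there is no consensus or network layer, and all function and field names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder previous-hash for the first block


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def block_hash(block):
    """Hash of every block field except the stored hash itself."""
    body = {k: v for k, v in block.items() if k != "hash"}
    return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))


def append_record(chain, off_chain_store, record):
    """Store the record off-chain and append a block holding its digest."""
    raw = json.dumps(record, sort_keys=True).encode("utf-8")
    digest = sha256_hex(raw)
    off_chain_store[digest] = raw
    block = {
        "index": len(chain),
        "payload_digest": digest,
        "previous_hash": chain[-1]["hash"] if chain else GENESIS_HASH,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def verify(chain, off_chain_store):
    """Check block links, block hashes, and off-chain payload digests."""
    previous = GENESIS_HASH
    for block in chain:
        if block["previous_hash"] != previous:
            return False
        if block["hash"] != block_hash(block):
            return False
        raw = off_chain_store.get(block["payload_digest"])
        if raw is None or sha256_hex(raw) != block["payload_digest"]:
            return False
        previous = block["hash"]
    return True


chain, store = [], {}
append_record(chain, store, {"stage": "harvest", "lot": "A1", "temp_c": 18})
append_record(chain, store, {"stage": "transport", "lot": "A1", "temp_c": 6})
print(verify(chain, store))  # True while nothing has been altered
```

Changing any stored record or any block field makes `verify` return `False`, which is the tamper evidence that traceability applications rely on.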
Though there are many traditional methods for tracing food products and their supply chain, blockchain is an emerging methodology that provides efficiency, reliability, and accountability in traceability and in delivering products to the distribution end. It can be used to track each and every node in the supply chain. Moreover, it enables stakeholders and retailers to hold tamper-proof information. Since it is an emerging technique in the agri-food area, many research topics remain open, such as assuring product quality at each stage for large volumes of food materials, IoT integration, and quantification of food substances. Many applications have been developed using machine learning and deep learning in other areas [32–34]; those techniques can also be applied to food quality analysis.
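The stage-wise decision described earlier in this section for juice production [29], where a smart contract decides whether production may continue, amounts to checking measured conditions against acceptable ranges. The following sketch expresses that rule in plain Python rather than in a contract language. The stage names, parameter names, and limits are hypothetical values chosen for illustration and are not taken from [29].

```python
# Hypothetical acceptable ranges per production stage (illustrative only).
STAGE_LIMITS = {
    "pasteurization": {"temperature_c": (85.0, 95.0), "duration_s": (15.0, 60.0)},
    "filling": {"temperature_c": (2.0, 8.0)},
}


def evaluate_stage(stage, readings, limits=STAGE_LIMITS):
    """Return (ok, violations) for one stage.

    A reading violates the rule when it is missing or lies outside the
    inclusive [low, high] range configured for that stage.
    """
    violations = []
    for name, (low, high) in limits[stage].items():
        value = readings.get(name)
        if value is None or not (low <= value <= high):
            violations.append(name)
    return (not violations, violations)


def may_continue(batch_readings, limits=STAGE_LIMITS):
    """Production continues only if every recorded stage passes."""
    return all(evaluate_stage(stage, readings, limits)[0]
               for stage, readings in batch_readings.items())


batch = {
    "pasteurization": {"temperature_c": 90.0, "duration_s": 30.0},
    "filling": {"temperature_c": 12.0},  # too warm for the assumed limit
}
print(evaluate_stage("filling", batch["filling"]))
print(may_continue(batch))
```

On a blockchain platform the same check would run inside the contract, so that every party sees the same decision and the readings that produced it.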
7 Conclusion
In developing countries, agricultural production is a concern and may cause environmental threats for future generations. Food is a basic necessity for every living being; therefore, it should not cause harm to those who consume it. A wide variety of enhanced techniques are evolving for agri-product quality assessment. With growing expectations for high quality and safety standards in food products, the requirement for rapid and accurate identification of qualitative features
of food products continues to rise. This chapter reviews non-invasive food quality assessment. Evaluation methods for quality analysis are expected to be mostly non-destructive. Traditional imaging and spectroscopic mechanisms are laborious, time-consuming, and not portable enough for real-time assessment. The latest imaging and spectroscopic methodologies are non-destructive, and the collected data are processed with machine learning and deep learning algorithms to obtain good and rapid results. These results can be communicated through IoT devices for visualization anywhere in the world. Blockchain-based food supply chain traceability is described along with its processing and limitations. Microfluidic food quality evaluation methods are emerging as invasive methods. Nanomaterial-based biosensors need to be enhanced for rapid, portable, on-field agri-food quality assessment. Although these technologies have emerged, some limitations still need to be overcome before quality assessment prototypes can be commercialized.
References
1. Food quality analysis. Homepage: http://epgp.inflibnet.ac.in/epgpdata/uploads/epgp_content/S000015FT/P000065/M002606/ET/14619144641ET.pdf
2. Analysis of Food Products. https://people.umass.edu/~mcclemen/581Introduction.html
3. Ponnusamy V, Natarajan S (2021) Precision agriculture using advanced technology of IoT, unmanned aerial vehicle, augmented reality, and machine learning. In: Smart sensors for industrial internet of things, Springer, Cham, pp 207–229
4. Zhu L, Spachos P, Pensini E, Plataniotis KN (2021) Deep learning and machine vision for food processing: a survey. Curr Res Food Sci 4:233–249
5. Natarajan S, Ponnusamy V (2020) A review on the applications of ultrasound in food processing. Mater Today: Proc
6. El-Mesery HS, Mao H, Abomohra AEF (2019) Applications of non-destructive technologies for agricultural and food products quality inspection. Sensors 19(4):846
7. Narendra VG, Hareesha KS (2010) Quality inspection and grading of agricultural and food products by computer vision: a review. Int J Comput Appl 2(1):43–65
8. You M, Liu J, Zhang J, Xv M, He D (2020) A novel chicken meat quality evaluation method based on color card localization and color correction. IEEE Access 8:170093–170100
9. Bogard JR, Marks GC, Wood S, Thilsted SH (2018) Measuring nutritional quality of agricultural production systems: application to fish production. Glob Food Sec 16:54–64
10. Dong Y, Fu Z, Stankovski S, Wang S, Li X (2020) Nutritional quality and safety traceability system for China’s leafy vegetable supply chain based on fault tree analysis and QR code. IEEE Access 8:161261–161275
11. Aboonajmi M, Faridi H (2016) Nondestructive quality assessment of agro-food products. In: Proceedings of the 3rd Iranian international NDT conference
12. Di Caro D, Liguori C, Pietrosanto A, Sommella P (2019) Quality assessment of the inshell hazelnuts based on TD-NMR analysis. IEEE Trans Instrum Meas 69(6):3770–3779
13. Fengou LC, Mporas I, Spyrelli E, Lianou A, Nychas GJ (2020) Estimation of the microbiological quality of meat using rapid and non-invasive spectroscopic sensors. IEEE Access 8:106614–106628
14. Sarkar M, Gupta N, Assaad M (2020) Nondestructive food quality monitoring using phase information in time-resolved reflectance spectroscopy. IEEE Trans Instrum Meas 69(10):7787–7795
15. Zhang D, Pu H, Huang L, Sun DW (2021) Advances in flexible surface-enhanced Raman scattering (SERS) substrates for nondestructive food detection: fundamentals and recent applications. Trends Food Sci Technol
16. Sricharoonratana M, Thompson AK, Teerachaichayut S (2021) Use of near infrared hyperspectral imaging as a nondestructive method of determining and classifying shelf life of cakes. LWT 136:110369
17. Wang C, Hou B, Shi J, Yang J (2020) Uniformity evaluation of temperature field in an oven based on image processing. IEEE Access 8:10243–10253
18. Zhang W, Lv Z, Shi B, Xu Z, Zhang L (2021) Evaluation of quality changes and elasticity index of kiwifruit in shelf life by a nondestructive acoustic vibration method. Postharvest Biol Technol 173:111398
19. Sowmya N, Ponnusamy V (2021) Development of spectroscopic sensor system for an IoT application of adulteration identification on milk using machine learning. IEEE Access 9:53979–53995. https://doi.org/10.1109/ACCESS.2021.3070558
20. Li JL, Sun DW, Cheng JH (2016) Recent advances in nondestructive analytical techniques for determining the total soluble solids in fruits: a review. Compr Rev Food Sci Food Saf 15(5):897–911
21. Natarajan S, Ponnusamy V (2020) Adulterant identification on food using various spectroscopic techniques. Mater Today: Proc
22. Ren A, Zahid A, Zoha A, Shah SA, Imran MA, Alomainy A, Abbasi QH (2019) Machine learning driven approach towards the quality assessment of fresh fruits using non-invasive sensing. IEEE Sens J 20(4):2075–2083
23. Hossain MS, Al-Hammadi M, Muhammad G (2018) Automatic fruit classification using deep learning for industrial applications. IEEE Trans Ind Inf 15(2):1027–1034
24. Lam MB, Nguyen TH, Chung WY (2020) Deep learning-based food quality estimation using radio frequency-powered sensor mote. IEEE Access 8:88360–88371
25. Rao GP (2021) Development of IoT sensor for pepper adulteration detection using sensor arrays. Turk J Comput Math Educ (TURCOMAT) 12(11):5538–5545
26. Takruri M, Abubakar A, Alnaqbi N, Al Shehhi H, Jallad AHM, Bermak A (2021) DoFP-ML: a machine learning approach to food quality monitoring using a DoFP polarization image sensor. IEEE Access 8:150282–150290
27. Ni J, Gao J, Deng L, Han Z (2020) Monitoring the change process of banana freshness by GoogLeNet. IEEE Access 8:228369–228376
28. Tharatipyakul A, Pongnumkul S (2021) User interface of blockchain-based agri-food traceability applications: a review. IEEE Access
29. Yu B, Zhan P, Lei M, Zhou F, Wang P (2020) Food quality monitoring system based on smart contracts and evaluation models. IEEE Access 8:12479–12490
30. Shahid A, Almogren A, Javaid N, Al-Zahrani FA, Zuair M, Alam M (2020) Blockchain-based agri-food supply chain: a complete solution. IEEE Access 8:69230–69243
31. Tsang YP, Choy KL, Wu CH, Ho GTS, Lam HY (2019) Blockchain-driven IoT for food traceability with an integrated consensus mechanism. IEEE Access 7:129000–129017
32. Ponnusamy V, Kottursamy K, Karthick T, Mukeshkrishnan MB, Malathi D, Ahanger TA (2020) Primary user emulation attack mitigation using neural network. Comput Electr Eng 88:106849
33. Ponnusamy V, Coumaran A, Shunmugam AS, Rajaram K, Senthilvelavan S (2020) Smart glass: real-time leaf disease detection using YOLO transfer learning. In: 2020 international conference on communication and signal processing (ICCSP), IEEE, pp 1150–1154
34. Ponnusamy V, Malarvihi S (2017) Hardware impairment detection and pre whitening on MIMO pre-coder for spectrum sharing. Wireless Pers Commun 96(1):1557–1576
Chapter 9
Medicinal Plant Recognition from Leaf Images Using Deep Learning
Md. Ariful Hassan, Md. Sydul Islam, Md. Mehedi Hasan, Sumaita Binte Shorif, Md. Tarek Habib, and Mohammad Shorif Uddin
Md. Ariful Hassan · Md. Sydul Islam · Md. Mehedi Hasan · Md. Tarek Habib (B)
Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
S. B. Shorif · M. S. Uddin
Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_9

1 Introduction
The term “medicinal plant” covers the many kinds of plants used in herbalism (“herbology” or “herbal medicine”). Herbalism is the use of plants for medicinal purposes and the study of such uses. These medicinal plants are used as food, flavonoid, medicine, or perfume, and so on. The World Health Organization (WHO) has estimated that around 21,000 plant species have the potential to be used as medicinal plants [1]. Medicinal plants such as aloe vera, holy basil, neem, turmeric, and ginger cure several common ailments. Plants for the treatment of common ailments such as diarrhea, constipation, high blood pressure, low sperm count, dysentery, erectile weakness, piles, coated tongue, menstrual disorders, respiratory disease, leucorrhea, and fevers are frequently prescribed by traditional medicine practitioners. Medicinal plants thus play a vital role for humans, and this is why we use convolutional neural network (CNN) models to recognize eleven medicinal plants of Bangladesh. In this chapter, we carry out an in-depth exploratory study of medicinal plant recognition following a machine-vision approach based on deep learning. We use four state-of-the-art CNN models, namely, MobileNet, ResNet-50, Xception, and InceptionV3, for the recognition of medicinal plants. For each of the four CNN models, we arrive at a suitable configuration after in-depth experimentation. The performance of each CNN model is then assessed in terms of four indicative performance metrics: accuracy, precision, recall, and F1-score. In essence, the major contributions of this research work are:
• An initial attempt to address the problem of automated medicinal plant recognition.
• Efficient provision of a solid base of discriminative features that lets deep learning models recognize medicinal plants effectively.
• A feasibility study of various classifiers in the context of automated medicinal plant recognition, to support the current research trend.
• A demonstration that our proposed machine-vision approach performs strongly on the data sets of automated medicinal plant recognition.
The remainder of the chapter is organized as follows. Section 2 describes the current state of the solutions to the various problems of automated medicinal plant recognition. Section 3 presents the overall methodology of our research work. Section 4 covers data collection and preprocessing. Sections 5 and 6 describe the experimentation and the comparative analysis of the results, respectively. Finally, the conclusion along with future work is provided in Sect. 7.
2 Literature Review
Automated medicinal plant recognition is a promising research area. Nonetheless, very few attempts have been made at it, especially for Bangladeshi medicinal plants. Amuthalingeswaran et al. [2] used deep CNN models to identify medicinal plants. To train the model, they used around 8,000 images belonging to four classes, and they reached an accuracy of 85% when testing with images taken in open fields. Jayalath et al. [3] worked on recognizing medicinal plants from the visual characteristics of leaves and flowers, using a CNN model. For that research, a database was built from scanned images of leaves and flowers of rare medicinal plants used in Sri Lankan Ayurvedic medicine. Both the front and back sides of the leaves and flowers were captured. The leaves were classified based on a unique combination of features. The recognition rate reached up to 98% when dealing with more than 10 plants. Sivaranjani et al. [4] recognized medicinal plants in real time using machine learning techniques, with logistic regression as the classifier. The excess green minus excess red (ExG − ExR) index identifies a binary plant region of interest. The original color pixels of the binary image serve as the mask that isolates leaves as sub-images. The plant species were classified from the color and texture features of each extracted leaf using a logistic regression classifier, with an accuracy of 93.3%. Arun and Christopher [5] worked on recognizing medicinal plant leaves using textures and optimal color space channels, relying on color space and texture features. Classification on a data set of 250 leaf images belonging to five plant species yielded a recognition rate of 98.7%.
Khairul et al. [6] worked on Bangladeshi plant leaf recognition using a CNN named YOLO, which stands for “You Only Look Once”. The recognition rate was 96%. Begue et al. [7] studied automatic recognition of medicinal plants using machine learning techniques. Leaves from 24 different medicinal plant species were collected and photographed using a smartphone in a laboratory setting. A large number of features were extracted from each leaf, for example, its length, width, perimeter, area, number of vertices, color, and the perimeter and area of the hull. Several derived features were then computed from these attributes. The best results were obtained from a random forest classifier using a tenfold cross-validation procedure. With an accuracy of 90.1%, the random forest classifier outperformed all the other machine learning approaches used. Duong-Trung et al. [8] worked on medicinal plant recognition using transfer learning on a deep neural network. They used a CNN, namely MobileNet, which achieved an accuracy of 98.7%. From the above review, we can conclude that no in-depth and rigorous work has been performed on our local medicinal plants, work that could bring credit to Bangladesh. At present, there are insufficient resources on this topic with which to examine the various relevant issues.
3 Research Methodology
Our machine-vision approach starts with images of medicinal plant leaves that are commonly found in Bangladesh. We have collected about one thousand two hundred and fifty (1250) images of eleven such medicinal plant leaves. Of the data, 80% have been treated as training data and 20% as test data. Our data collection and data preprocessing techniques are laid out in the next section. After preprocessing and augmentation, the data set is ready for the next step, i.e., classification. We have then applied four state-of-the-art CNN models, viz. MobileNet, ResNet-50, Xception, and InceptionV3, and evaluated the performance of each of these classification models based on accuracy and other indicative metrics such as precision, recall, and F1-score. The entire working approach is delineated in the flow diagram shown in Fig. 1.

Fig. 1 The methodology applied for the recognition of medicinal plants (flow: Data Collection → Data Preprocessing → Data Augmentation → Processed Data → Applying Convolution Neural Network Models: MobileNet, Xception, InceptionV3, ResNet-50 → Model Evaluation → Decision Making)

For classification, four recent CNN models have been chosen. ResNet-50 is one of them. ResNet-50 is a variant of the ResNet model that has forty-eight convolution layers together with one max-pool and one average-pool layer. It requires 3.8 × 10^9 floating-point operations. It is a widely used ResNet model, and we have therefore explored the ResNet-50 architecture in depth [9], as shown in Fig. 2.

Fig. 2 The ResNet-50 architecture

MobileNet is an efficient architecture that uses depthwise separable convolutions to build lightweight deep CNNs and provides an efficient model for mobile and embedded vision applications. As the name implies, the MobileNet model is intended for mobile applications, and it is TensorFlow’s first mobile computer vision model. Compared with a network of the same depth built from regular convolutions, depthwise separable convolutions significantly reduce the number of parameters. This results in lightweight deep neural networks [10], as shown in Fig. 3. The Inception architecture is considered more computationally efficient than the VGG16 architecture, in terms of both the number of parameters and resource use [11]. InceptionV3 modifies the earlier Inception architectures to achieve efficiency by using less computational power. InceptionV3 is a CNN architecture from the Inception family that makes several improvements, including label smoothing, factorized 7 × 7 convolutions, and an auxiliary classifier that propagates label information lower down the network [12], as shown in Fig. 4. The Xception model outperforms the Inception model by using a modified depthwise separable convolution. Xception is considered an extreme version of the Inception architecture. The Xception architecture has 36 convolutional layers
forming the feature-extraction base of the network. In short, the Xception architecture is a linear stack of depthwise separable convolution layers with residual connections [13], as shown in Fig. 5.

Fig. 3 The MobileNet architecture
Fig. 4 The InceptionV3 architecture
Fig. 5 The Xception architecture

Accuracy alone may not be sufficient for assessing the performance of the CNN models. Therefore, additional metrics, namely precision, recall, and F1-score, have also been used to assess each of these classification models. Precision is the measure of exactness: the ratio of true positives to all predicted positives. Recall is the measure of completeness: the ratio of true positives to all actual positives, i.e., true positives plus false negatives. The F1-score is the harmonic mean of recall and precision, so it takes both false positives and false negatives into account. A confusion matrix is a table that presents the true positive, true negative, false positive, and false negative counts based on the test data. The confusion matrix is very important for measuring the performance of any classifier. In a multiclass problem, the confusion matrix is of dimension n × n (n > 2), so it contains n rows, n columns, and n × n entries altogether. The final confusion matrix conveys the average values of the n per-class confusion matrices and is of dimension 2 × 2 after applying the method described in [14, 15]. Using this confusion matrix, with TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, accuracy, precision, recall, and F1-score are determined as follows:

\[ \text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \times 100\% \tag{1} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \times 100\% \tag{2} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \times 100\% \tag{3} \]

\[ F_1\text{-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \times 100\% \tag{4} \]
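The one-vs-rest reduction of a multiclass confusion matrix and Eqs. (1)–(4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `per_class_metrics`, the convention that rows hold actual classes and columns hold predicted classes, and the 3 × 3 example matrix are all hypothetical.

```python
def per_class_metrics(cm):
    """Return one-vs-rest metrics (in %) for every class of a square confusion matrix.

    Assumed convention: cm[i][j] is the number of samples of actual class i
    that were predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # actual k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, actually another class
        tn = total - tp - fn - fp
        accuracy = 100.0 * (tp + tn) / total
        precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
        recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
        # precision and recall are already percentages, so the harmonic mean
        # is a percentage too and needs no further scaling.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"accuracy": accuracy, "precision": precision,
                        "recall": recall, "f1": f1})
    return results


# Hypothetical 3-class example (not data from this chapter).
example = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
metrics = per_class_metrics(example)
for k, m in enumerate(metrics):
    print(k, {name: round(value, 1) for name, value in m.items()})
```

Averaging the per-class values over all classes gives a single summary figure per metric, which is one common way to arrive at model-level numbers from class-level ones.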
4 Data Collection and Preprocessing
Observing medicinal plants is a very important part of our work because it helps select the features that distinguish a local medicinal plant. In this work, we deal with eleven species of medicinal plants frequently found in Bangladesh, namely, Malabar nut (Justicia adhatoda), guava (Psidium guajava), jackfruit (Artocarpus heterophyllus), black plum (Syzygium cumini), mango (Mangifera indica), neem (Azadirachta indica), cathedral bells (Bryophyllum pinnatum), mint (Mentha spicata), moringa (Moringa oleifera), gotu kola (Centella asiatica), and holy basil (Ocimum tenuiflorum). We have collected the data from the Bangladesh Academy for Rural Development (BARD) and various neighboring places in Comilla, a large and fertile district of Bangladesh. We took the pictures with a smartphone camera. We captured about one thousand two hundred and fifty (1250) pictures of medicinal plant leaves, some illustrative examples of which are shown in Fig. 6. We have used data augmentation techniques to train the four CNN models. We have applied rotation, width shift, height shift, zoom, shear, and horizontal flip.
Fig. 6 a Malabar nut leaf, b Guava leaf, c Jackfruit leaf, d Black plum leaf, e Mango leaf, f Neem leaf, g Cathedral bells leaf, h Mint leaf, i Moringa leaf, j Gotu kola leaf, k Holy basil leaf
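Three of the augmentation operations named above (horizontal flip, width shift, and height shift) are simple enough to sketch without an imaging library. The chapter does not state which tooling or parameter ranges were used, so the function names, the zero padding, and the tiny 3 × 3 example below are assumptions made purely for illustration; rotation, zoom, and shear need interpolation and are left out of the sketch.

```python
def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def width_shift(image, shift, fill=0):
    """Shift pixels `shift` columns to the right (negative = left), padding with `fill`."""
    width = len(image[0])
    shifted = []
    for row in image:
        new_row = [fill] * width
        for x, value in enumerate(row):
            if 0 <= x + shift < width:
                new_row[x + shift] = value
        shifted.append(new_row)
    return shifted


def height_shift(image, shift, fill=0):
    """Shift pixels `shift` rows down (negative = up), padding with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y, row in enumerate(image):
        if 0 <= y + shift < height:
            shifted[y + shift] = list(row)
    return shifted


# Tiny 3 x 3 grayscale "image" used only for illustration.
leaf = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
flipped = horizontal_flip(leaf)
right_shifted = width_shift(leaf, 1)
down_shifted = height_shift(leaf, 1)
```

Each function returns a new image and leaves the input untouched, so several augmented variants can be generated from one captured picture.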
After augmenting the data set, we have obtained twenty thousand and twelve (20,012) images in total for our CNN models. The class-wise distribution of the data is given in Table 1. We have stored the data sets on the local disks of a PC.

Table 1 Class-wise distribution of data

Sl #  Plant                   Training data  Test data  Total data
01    Malabar nut leaves      1535           398        1933
02    Guava leaves            1444           360        1804
03    Jackfruit leaves        1445           360        1805
04    Black plum leaves       1445           360        1805
05    Mango leaves            1508           378        1886
06    Neem leaves             1510           378        1888
07    Cathedral bells leaves  1468           378        1846
08    Mint leaves             1424           360        1784
09    Moringa leaves          1360           340        1700
10    Gotu kola leaves        1423           356        1779
11    Holy basil leaves       1422           360        1782
      Total                   15,984         4,028      20,012
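Table 1 shows a split of roughly 80/20 within every class. One way to obtain such a split is a stratified holdout, sketched below. The chapter does not describe how the split was drawn, so the function name `stratified_holdout`, the fixed random seed, and the toy labels are assumptions for illustration only.

```python
import random


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test so each class keeps roughly the same ratio."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)                      # random order within the class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Hypothetical labels: 50 "neem" and 30 "mint" images.
labels = ["neem"] * 50 + ["mint"] * 30
train_idx, test_idx = stratified_holdout(labels)
print(len(train_idx), len(test_idx))
```

Splitting class by class keeps rare classes represented in the test set, which a single global shuffle does not guarantee.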
5 Experimental Evaluation
An investigative experiment has been performed in which automated recognition of local medicinal plants follows the steps in Fig. 1. Capturing an image of a local medicinal plant leaf is the first step. The captured image is then resized to 224 × 224 pixels. About one thousand two hundred and fifty (1,250) color images of eleven (11) classes of local medicinal plant leaves have been gathered for this experiment. Image augmentation has then been carried out using rotation, width shift, height shift, zoom, shear, and horizontal flip operations, examples of which are shown in Fig. 7. Thus, the size of the augmented data set becomes twenty thousand and twelve (20,012). The entire data set has been divided into a training set and a test set according to the holdout method [16]: 80% as the training set (15,984 images) and the remaining 20% as the test set (4,028 images). The frequency of each class of medicinal plant leaf is provided in Table 1. We have then applied four CNN models, namely, MobileNet, ResNet-50, Xception, and InceptionV3. We have tuned the parameters of each of these CNN models and arrived at the best configuration of each. After parameter tuning, we have obtained the training and validation accuracy curves shown in Fig. 8. To rigorously evaluate the performance of these four CNN models, a multiclass confusion matrix has been formed for each model. The four confusion matrices are displayed in Tables 2, 3, 4 and 5. Accuracy, precision, recall, and F1-score have been calculated for each model, as shown in Tables 6, 7, 8 and 9. Finally, Table 10 compares the performance of all four CNN models implemented.
We can see from Table 10 that the accuracy of MobileNet is 100% (to the nearest percent), the best among the four CNN models implemented. Not only the accuracy but also the precision, recall, and F1-score of the MobileNet model are 100%, which is better than the other three models. So, we can conclude that MobileNet has performed the best in terms of all performance metrics used in the context of automated medicinal plant recognition.
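The reported figures can be cross-checked against the tables with a few lines of arithmetic. The sketch below assumes that the test counts come from Table 1 and that the correctly classified counts are the diagonal entries of the MobileNet confusion matrix in Table 2, as read from the chapter; the variable names are illustrative.

```python
# Test images per class (Table 1) and correctly classified images per class
# (diagonal of the MobileNet confusion matrix, Table 2), as read from the chapter.
classes = ["Malabar nut", "Guava", "Jackfruit", "Black plum", "Mango", "Neem",
           "Cathedral bells", "Mint", "Moringa", "Gotu kola", "Holy basil"]
test_images = [398, 360, 360, 360, 378, 378, 378, 360, 340, 356, 360]
correct = [398, 360, 360, 360, 374, 378, 378, 360, 338, 356, 357]

total = sum(test_images)
hits = sum(correct)
overall_accuracy = 100.0 * hits / total

# Per-class recall: correctly classified images divided by test images of that class.
recall = {name: 100.0 * c / n for name, c, n in zip(classes, correct, test_images)}

print(f"{hits} of {total} test images correct: {overall_accuracy:.2f}%")
for name, value in recall.items():
    print(f"{name}: {value:.1f}%")
```

Under these readings, 4019 of the 4028 test images are classified correctly (about 99.78%), which is consistent with the 100% in Table 10 once values are rounded to whole percent, and the per-class recalls round to the values listed in Table 6.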
Fig. 7 a Plant, b captured image, c 224 × 224-pixel resized image, d augmented image (rotation), e augmented image (zoom), f augmented image (flip), shown for each of the eleven leaves (Malabar nut, guava, jackfruit, black plum, mango, neem, cathedral bells, mint, moringa, gotu kola, and holy basil)

Fig. 8 Training and validation accuracy of the CNN models used. a MobileNet model, b Xception model, c InceptionV3 model, d ResNet-50 model

6 Comparative Analysis of Results
To assess the merits of our medicinal plant recognition work, we need to compare it with some recent and relevant research works. It should be kept in mind that the assumptions made by other researchers in collecting samples and in reporting the results of processing those samples have a strong bearing on any such comparative performance evaluation. We have strived to compare our work with the others based on parameters such as sample size, size of feature set, algorithm, and accuracy. Table 11 gives a comparative overview of the other works and ours. Amuthalingeswaran et al. [2] used deep CNN models for recognizing medicinal plants. To train the model, they used around 8,000 images belonging to four classes, and they reached an accuracy of 85% on their test images. Jayalath et al. [3] used a CNN. A database was built from scanned images of leaves and flowers of rare medicinal plants used in Sri Lankan Ayurvedic medicine. Both the front and back sides of the leaves and flowers were captured. The leaves were classified based on a unique combination of features. Recognition rates of up to 98% were obtained when testing with more than 10 plants. Sivaranjani et al. [4] used logistic regression for the recognition of medicinal plants. The ExG − ExR index identifies a binary plant region of interest. The plant species were classified from the color and texture features of each extracted leaf using a logistic regression classifier, with an accuracy of 93.3%. Arun and Christopher [5] used color space and texture features for the recognition of medicinal plant leaves. Classification on a data set of 250 leaf images belonging to five plant species shows a recognition rate of 98.7%. Khairul et al. [6] worked on Bangladeshi plant leaf recognition using the YOLO neural network, specifically the YOLOv2 CNN. The accuracy was 96%. Duong-Trung et al. [8] worked on a combination of transfer learning and deep learning techniques for medicinal plant classification. They used the CNN model MobileNet and reached an accuracy of 98.7%. Against the overall picture presented here, our achieved accuracy of 100% is both good and promising.
Table 2 Confusion matrix of MobileNet model (rows: actual class; columns 01–11: predicted class, numbered as the rows)

                         01   02   03   04   05   06   07   08   09   10   11
01 Malabar nut          398    0    0    0    0    0    0    0    0    0    0
02 Guava                  0  360    0    0    0    0    0    0    0    0    0
03 Jackfruit              0    0  360    0    0    0    0    0    0    0    0
04 Black plum             0    0    0  360    0    0    0    0    0    0    0
05 Mango                  0    0    0    4  374    0    0    0    0    0    0
06 Neem                   0    0    0    0    0  378    0    0    0    0    0
07 Cathedral bells        0    0    0    0    0    0  378    0    0    0    0
08 Mint                   0    0    0    0    0    0    0  360    0    0    0
09 Moringa                0    0    0    0    0    0    0    2  338    0    0
10 Gotu kola              0    0    0    0    0    0    0    0    0  356    0
11 Holy basil             0    0    1    0    0    0    0    2    0    0  357
Table 3 Confusion matrix of Xception model (rows: actual class; columns 01–11: predicted class, numbered as the rows)

                         01   02   03   04   05   06   07   08   09   10   11
01 Malabar nut          394    0    0    3    1    0    0    0    0    0    0
02 Guava                  0  359    0    0    0    0    1    0    0    0    0
03 Jackfruit              0    0  360    0    0    0    0    0    0    0    0
04 Black plum             0    0    0  358    0    0    2    0    0    0    0
05 Mango                  0    0    0    2  376    0    0    0    0    0    0
06 Neem                   0    0    0    1    1  374    0    2    0    0    0
07 Cathedral bells        0    0    0    0    0    0  378    0    0    0    0
08 Mint                   0    0    0    0    0    0    0  358    0    0    2
09 Moringa                0    0    0    0    1    0    0    0  339    0    0
10 Gotu kola              0    0    0    0    0    0    0    0    0  356    0
11 Holy basil             0    0    0    0    2    0    3   11    0    1  343
Table 4 Confusion matrix of InceptionV3 model (rows: actual class; columns 01–11: predicted class, numbered as the rows)

                         01   02   03   04   05   06   07   08   09   10   11
01 Malabar nut          395    0    0    3    0    0    0    0    0    0    0
02 Guava                  0  359    0    1    0    0    0    0    0    0    0
03 Jackfruit              0    0  359    1    0    0    0    0    0    0    0
04 Black plum             0    0    0  360    0    0    0    0    0    0    0
05 Mango                  0    0    0    7  371    0    0    0    0    0    0
06 Neem                   0    0    0    0    0  378    0    0    0    0    0
07 Cathedral bells        0    0    0    0    0    0  378    0    0    0    0
08 Mint                   0    0    0    0    0    0    0  359    0    1    0
09 Moringa                0    0    0    8    1    0    0    0  331    0    0
10 Gotu kola              0    0    0    0    0    0    0    0    0  356    0
11 Holy basil             0    1    0    0    0    0    2    7    0    1  349
Table 5 Confusion matrix of ResNet-50 model (rows: actual class; columns 01–11: predicted class, numbered as the rows)

                         01   02   03   04   05   06   07   08   09   10   11
01 Malabar nut           42   19   33   96    1   43   78    0   29   33   24
02 Guava                 35   17   28   87    0   54   55    3   26   26   29
03 Jackfruit             33   14   35   98    2   45   76    0   15   22   20
04 Black plum            42   16   30   90    0   36   72    0   21   32   21
05 Mango                 36   16   25   96    3   50   73    1   23   32   23
06 Neem                  36   22   25   96    2   31   85    2   20   38   21
07 Cathedral bells       30   21   33   91    1   40   68    3   25   37   29
08 Mint                  35   12   31   80    1   38   77    3   33   24   26
09 Moringa               30   13   22   88    0   49   63    1   19   34   21
10 Gotu kola             42   12   25   80    1   44   63    0   28   30   31
11 Holy basil            39   21   24   98    1   29   66    2   22   31   27
Table 6 Classification performances of MobileNet model

Plant             Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
Malabar nut       100           100            100         100
Guava             100           100            100         100
Jackfruit         100           100            100         100
Black plum        100           99             100         99
Mango             100           100            99          99
Neem              100           100            100         100
Cathedral bells   100           100            100         100
Mint              100           99             100         99
Moringa           100           100            99          100
Gotu kola         100           100            100         100
Holy basil        100           100            99          100
Table 7 Classification performances of Xception model

Plant             Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
Malabar nut       99            100            99          99
Guava             99            100            100         100
Jackfruit         99            100            100         100
Black plum        99            98             99          99
Mango             99            99             99          99
Neem              99            100            99          99
Cathedral bells   99            98             100         99
Mint              99            96             99          98
Moringa           99            100            100         100
Gotu kola         99            100            100         100
Holy basil        99            99             95          97
Table 8 Classification performances of InceptionV3 model

Plant             Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
Malabar nut       99            100            99          100
Guava             99            100            100         100
Jackfruit         99            100            100         100
Black plum        99            95             100         97
Mango             99            100            98          99
Neem              99            100            100         100
Cathedral bells   99            99             100         100
Mint              99            98             100         99
Moringa           99            100            97          99
Gotu kola         99            99             100         100
Holy basil        99            100            97          98

Table 9 Classification performances of ResNet-50 model

Plant             Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
Malabar nut       9             10             11          11
Guava             9             9              5           6
Jackfruit         9             11             10          10
Black plum        9             9              25          13
Mango             9             25             1           2
Neem              9             7              8           7
Cathedral bells   9             9              18          12
Mint              9             20             1           2
Moringa           9             7              6           6
Gotu kola         9             9              8           9
Holy basil        9             10             7           9

Table 10 Comparison of the four CNN models implemented

CNN model     Accuracy (%)  Precision (%)  Recall (%)  F1-score (%)
MobileNet     100           100            100         100
Xception      99            99             99          99
InceptionV3   99            99             99          99
ResNet-50     9             12             9           8

Table 11 Results of the comparison of our work and other works

Method/Work done              Object(s) dealt with    Size of data set  Technique      Class  Algorithm            Accuracy (%)
This work                     Medicinal plant leaves  20,012            Deep learning  11     MobileNet            100
Amuthalingeswaran et al. [2]  Medicinal plants        8000              Deep learning  4      Deep NN              85
Jayalath et al. [3]           Leaves and flowers      5000              Deep learning  10     CNN                  98
Sivaranjani et al. [4]        Medicinal plant leaves  100               Classical ML²  5      Logistic regression  93.3
Arun and Christopher [5]      Medicinal plant leaves  250               Classical ML   5      k-NN                 98.7
Khairul et al. [6]            Plant leaves            NM¹               Deep learning  NM¹    YOLO                 96
Duong-Trung et al. [8]        Plant leaves            2296              Deep learning  10     CNN                  98.7
¹ NM: not mentioned. ² ML: machine learning.

7 Conclusion and Future Work
The work performed in this chapter can be divided into three major parts: data collection, methodology, and experimental results. We have visited places around Comilla in person and collected leaf images of 11 classes of medicinal plants (mango, jackfruit, black plum, cathedral bells, neem, holy basil, Malabar nut, mint, moringa, guava, gotu kola). We had to preprocess the data before applying our technique to the data set. Following preprocessing, we augmented the data set. We have used four deep CNN models: MobileNet, Xception, InceptionV3, and ResNet-50. We have successfully recognized 11 medicinal plants using these deep CNN models. Among the four CNN models, MobileNet gives the best result (100% accuracy). There remains scope for research on implementing new algorithms, adding more parameters, and adding more features under varying lighting and positioning conditions, which would result in a more robust model. Moreover, a very large and diverse data set can be built by gathering images of a larger variety of classes of medicinal plant leaves.
References
1. Introduction and Importance of Medicinal Plants and Herbs. Available online: https://www.nhp.gov.in/introduction-and-importance-of-medicinal-plants-and-herbs_mtl#:~:text=Medicinal%20plants%20are%20considered%20as,non%2D%20pharmacopoeial%20or%20synthetic%20drugs.&text=Moreover%2C%20some%20plants%20are%20considered,recommended%20for%20their%20therapeutic%20values. Last accessed on 1 May 2021
2. Amuthalingeswaran C, Sivakumar M, Renuga P, Alexpandi S, Elamathi J, Hari SS (2019) Identification of medicinal plant’s and their usage by using deep learning. In: 3rd International Conference on Trends in Electronics and Informatics (ICOEI), India, April 2019
3. Jayalath ADADS, Amarawanshaline TGAGD, Nawinna DP, Nadeeshan PVD, Jayasuriya HP (2019) Identification of medicinal plants by visual characteristics of leaves and flowers. In: 2019 IEEE 14th International Conference on Industrial and Information Systems (ICIIS), Sri Lanka, Dec 2019
4. Sivaranjani C, Kalinathan L, Amutha R, Kathavarayan RS, Kumar KJJ (2019) Real-time identification of medicinal plants using machine learning techniques. In: 2nd International conference on computational intelligence in data science (ICCIDS), India, Feb 2019
5. Arun CH, Durairaj DC (2017) Identifying medicinal plant leaves using textures and optimal colour spaces channel. J Ilmu Komputer dan Informasi 10(1):19–28
6. Islam MK, Habiba SU, Ahsan SMM (2019) Bangladeshi plant leaf classification and recognition using YOLO neural network. In: 2nd International conference on innovation in engineering and technology (ICIET), Bangladesh, Dec 2019
7. Begue A, Kowlessur V, Mahomoodally F, Singh U, Pudaruth S (2017) Automatic recognition of medicinal plants using machine learning techniques. Int J Adv Comput Sci Appl (IJACSA) 8(4)
8. Duong-Trung N, Quach L-D, Nguyen M-H, Nguyen C-N (2019) A combination of transfer learning and deep learning for medicinal plant classification. In: 4th International conference on intelligent information technology (ICIIT’19), pp 83–90, Vietnam, Feb 2019
9. Understanding ResNet50 Architecture. Available online: https://iq.opengenus.org/resnet50-architecture/#:~:text=ResNet50%20is%20a%20variant%20of,explored%20ResNet50%20architecture%20in%20depth. Last accessed on 1 May 2021
10. Image Classification with MobileNet. Available online: https://medium.com/analytics-vidhya/image-classification-with-MobileNet-cc6fbb2cd470. Last accessed on 1 May 2021
11. Step by step VGG16 implementation in Keras for beginners. Available online: https://towardsdatascience.com/step-by-step-VGG16-implementation-in-keras-for-beginners-a833c686ae6c#:~:text=VGG16%20is%20a%20convolution%20neural,vision%20model%20architecture%20till%20date.&text=It%20follows%20this%20arrangement%20of,consistently%20throughout%20the%20whole%20architecture. Last accessed on 1 May 2021
12. Inception-v3. Available online: https://paperswithcode.com/method/inception-v3. Last accessed on 1 May 2021
13. Xception: Deep learning with depthwise separable convolutions. Available online: https://openaccess.thecvf.com/content_cvpr_2017/papers/Chollet_Xception_Deep_Learning_CVPR_2017_paper.pdf. Last accessed on 1 May 2021
14. Habib MT, Majumder A, Jakaria AZM, Akter M, Uddin MS, Ahmed F (2020) Machine vision based papaya disease recognition. J King Saud Univ Comput Inf Sci 32(3):300–309
15. Habib MT, Mia MJ, Uddin MS et al (2020) An in-depth exploration of automated jackfruit disease recognition. J King Saud Univ Comput Inf Sci
16. Tan P-N, Steinbach M, Kumar V (2006) Introduction to data mining. Addison-Wesley
Chapter 10
ESMO-based Plant Leaf Disease Identification: A Machine Learning Approach
H. K. Jayaramu, Dharavath Ramesh, and Sonal Jain
H. K. Jayaramu · D. Ramesh (B) · S. Jain: Indian Institute of Technology (Indian School of Mines), Dhanbad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_10

1 Introduction

About 70% of India's population depends on agriculture, which makes it an essential source of income for a large part of the country. However, moist weather, waterlogged soil, insufficient rainfall, insufficient sunlight, environmental pollution, pathogens, or sudden changes in weather cause plant diseases, which significantly reduce crop production and quality. According to an FAO report (United Nations, 2009), by 2050 the global population is projected to grow by more than 30% and urbanization to reach 70%, which would increase food demand by more than 70%. The problem of increasing food demand can be addressed by delivering advanced scientific and engineering solutions to farmers. In particular, a suitable plant disease detection method may help increase food production through timely identification and prevention of disease. Generally, a plant disease can be detected from the texture and color of the plant's leaves. Some plant diseases, such as rice blast, bacterial blight, and sheath blight, are detected through image processing techniques [3, 13, 24]. However, identifying plant diseases is a tedious task because of the large variation in the color, texture, and size of plant leaves. Disease detection techniques typically involve two steps: the first is feature extraction from the input plant leaf images, and the second is to classify the extracted features as diseased or healthy using classifiers [13]. Classification techniques used in the literature include naive Bayes (NB) [28], support vector machines (SVM) [9], decision trees (DT) [2], random forest (RF) [14], k-nearest neighbor (kNN) [7], and neural networks [27]. Feature extraction is a dimensionality reduction technique. Feature extraction techniques for plant disease detection can be classified as color information-based, shape-based, and texture-based. Features such as interblock and intrablock correlations (CHEN) are used by Chen and Shi [5]. Spatial domain features extracted using the subtractive pixel adjacency matrix (SPAM) technique are used by Pevny et al. [21] in the field of JPEG steganalysis. A method for disease detection in soybean crops from plant leaf images is presented by Pires et al. [23], which uses local descriptors in combination with a bag of visual words (BOVW) to classify a leaf as healthy or diseased [22]. A detailed survey of plant disease identification, quantification, and classification is given by Barbedo [4]. Similarly, a methodology to detect citrus canker is presented by Zhang and Meng [31] using color and texture features such as local LBPH descriptors. They also developed a variant of AdaBoost named SceBoost to select the most significant features. A method for paddy disease detection is presented in [18], in which five features (boundary color, broken paddy leaf color, lesion percentage, spot color, and lesion type) are used to classify the disease. A review of disease detection for various plant leaf images is given in [25]. Features obtained with feature extraction methods may include redundant features or omit relevant ones, which increases computation cost and requires a human expert for more accurate results. Hence, a methodology for selecting important features needs to be incorporated to increase classification accuracy. Three kinds of feature selection methods are described in the literature: filter-based, wrapper-based [15], and embedded methods [8]. In the filter method, features are selected regardless of the model; hence, it may also select redundant features. In the wrapper-based method, an optimal or sub-optimal feature subset is generated based on the classification performance of the model. Still, this method may be subject to overfitting when the data is insufficient or noisy.
The embedded method combines the advantages of the previous two methods by adding regularization parameters. In this work, a nature-inspired algorithm is used for optimal feature selection, which is a type of wrapper-based method. Nature-inspired algorithms [12, 26] such as particle swarm optimization (PSO) [6], genetic algorithm (GA) [30], spider monkey optimization (SMO) [3], and artificial bee colony (ABC) [19] are popular metaheuristics for feature selection. In this chapter, a variant of SMO named exponential spider monkey optimization (ESMO) is used for feature selection. Other variants of SMO are also available in the literature, such as fitness-based position update in SMO [17], hybrid SMO and GA [1], SMO for constrained optimization [10], modified SMO [11], modified position update in SMO [16], and improved SMO [29]. The existing techniques discussed above achieve a maximum accuracy of 92%. To increase accuracy, the exponential SMO is considered along with the SPAM method, which extracts the leaf image features more accurately. In addition, the proposed methodology is compared with other existing techniques to showcase its performance. The contributions of the proposed algorithmic approach are as follows:
• An evolutionary optimization algorithm named ESMO is applied to plant leaf disease identification.
• The SPAM feature extraction method extracts 686 features from the image dataset.
• The proposed ESMO outperforms SMO and PSO on most of the benchmark functions.

The rest of the chapter is structured as follows. Section 2 gives a brief introduction to SMO and its algorithm, and then explains the proposed methodology, which consists of feature extraction, feature selection, and classification. The results of the proposed work are reported in Sect. 3. Finally, Sect. 4 concludes the chapter.
2 Methodology

2.1 Spider Monkey Optimization

The foraging behavior of spider monkeys follows a fission–fusion social structure (FFSS). In this structure, a large set of spider monkeys divides into small subsets to search for food, and the subsets can unite again depending on environmental and social conditions. The characteristics of FFSS in SMO are as follows.

1. A set of 40–50 spider monkeys live together. The set is led by a global leader and can be split into subsets to reduce foraging competition between the monkeys.
2. The female leader leads the entire set, which can be further divided into subsets if there is not enough food for the whole set. Each subset of 3–8 members forages independently.
3. The global leader plans each day's route for the food search.
4. The subset members communicate with each other through distinctive visual or vocal signals to stay away from adversaries or to share food.

The six prominent steps of the SMO algorithm are explained as follows.

Initialization: Initially, a population of $Z$ spider monkeys is initialized randomly. Each spider monkey $SM_p$ is a $D$-dimensional vector, where $D$ is the number of decision variables of the optimization problem. Each $SM_p$ is defined by Eq. (1).

$$SM_{pq} = SM_{\min q} + UR(0, 1) \times (SM_{\max q} - SM_{\min q}) \tag{1}$$

where $SM_{pq}$ denotes the $p$th spider monkey in the $q$th dimension, $UR(0, 1)$ is a random number generated uniformly in the range [0, 1], and $SM_{\max q}$ and $SM_{\min q}$ are the upper and lower bounds of $SM_p$ in the $q$th dimension, respectively.

Local Leader Step (LLS): Each SM modifies its current position using the experience of the local leader and the local group members. If the fitness value of the new location is greater than that of the previous location, the SM position is updated to the new one. The new location is calculated using Eq. (2).
$$SMnew_{pq} = SM_{pq} + UR(0, 1) \times (L_{kq} - SM_{pq}) + UR(-1, 1) \times (SM_{rq} - SM_{pq}) \tag{2}$$

where $SMnew_{pq}$ is the new location of the $p$th SM in the $q$th dimension and $SM_{pq}$ is its previous position in the same dimension. Similarly, $L_{kq}$ is the position of the local leader of the $k$th subset in the $q$th dimension. $SM_{rq}$ is the $q$th dimension of the $r$th SM, chosen randomly from the $k$th subset with $r \neq p$. $UR(0, 1)$ is a number chosen randomly in the range [0, 1]. The procedure for updating positions in the Local Leader Step is shown in Algorithm 1. The perturbation in the spider monkey position is controlled by a perturbation parameter $pert$, whose range is defined as [0.1, 0.8].

Algorithm 1 Local Leader Update procedure
    for all members SM_p ∈ kth subset do
        for all q ∈ {1, ..., D} do
            if UR(0, 1) ≥ pert then
                SMnew_pq = SM_pq + UR(0, 1) × (L_kq − SM_pq) + UR(−1, 1) × (SM_rq − SM_pq)
            else
                SMnew_pq = SM_pq
            end if
        end for
    end for
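The initialization of Eq. (1) and the Local Leader Step of Eq. (2) and Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the use of NumPy, and the choice to draw one random partner `r` per member (instead of one per dimension) are assumptions, and `fitness` is any user-supplied function where larger values are better.

```python
import numpy as np


def initialize_population(n_monkeys, lower, upper, rng):
    """Eq. (1): draw every coordinate uniformly between its bounds."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    u = rng.uniform(0.0, 1.0, size=(n_monkeys, lower.size))
    return lower + u * (upper - lower)


def local_leader_step(group, local_leader, pert, fitness, rng):
    """One pass of Algorithm 1 over a subset, followed by greedy selection."""
    n, dim = group.shape  # the subset needs at least two members
    new_group = group.copy()
    for p in range(n):
        # Assumption of this sketch: one random partner r != p per member.
        r = rng.choice([i for i in range(n) if i != p])
        candidate = group[p].copy()
        for q in range(dim):
            if rng.uniform(0.0, 1.0) >= pert:
                candidate[q] = (
                    group[p, q]
                    + rng.uniform(0.0, 1.0) * (local_leader[q] - group[p, q])
                    + rng.uniform(-1.0, 1.0) * (group[r, q] - group[p, q])
                )
        # Keep the new location only if its fitness is greater.
        if fitness(candidate) > fitness(group[p]):
            new_group[p] = candidate
    return new_group
```

Because of the greedy selection at the end of each member's update, no member's fitness can decrease during this step.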
2.1.1
Global Leader Step (GLS):
In this phase, the positions of all set members are updated based on the experience of the global leader and of the local subset members. The members of the set are updated using Eq. (3).

$$SMnew_{pq} = SM_{pq} + UR(0, 1) \times (G_q - SM_{pq}) + UR(-1, 1) \times (SM_{rq} - SM_{pq}) \tag{3}$$

where $G_q$ represents the global leader's location in the randomly chosen $q$th dimension, i.e., $q \in \{1, 2, ..., D\}$. In this phase, $prob_p$ is the probability with which the location of an SM is updated. The calculation of $prob_p$ for the $p$th SM is shown in Eq. (4); it depends on the fitness value $fit_p$.

$$prob_p = \frac{fit_p}{\sum_{j=1}^{Z} fit_j} \tag{4}$$
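Equations (3) and (4), together with the update loop of the Global Leader Step, can be sketched as below. The names are illustrative, and the sketch assumes strictly positive fitness values so that the probabilities of Eq. (4) are well defined; it also assumes the probabilities are computed once at the start of the phase.

```python
import numpy as np


def selection_probabilities(fitness_values):
    """Eq. (4): each member's share of the total fitness."""
    fit = np.asarray(fitness_values, dtype=float)
    if np.any(fit < 0) or fit.sum() <= 0:
        raise ValueError("Eq. (4) needs non-negative fitness with a positive sum")
    return fit / fit.sum()


def global_leader_step(population, global_leader, fitness, rng):
    """Update members chosen with probability prob_p in one random dimension."""
    n, dim = population.shape  # the set needs at least two members
    new_pop = population.copy()
    prob = selection_probabilities([fitness(x) for x in population])
    total = 0
    while total < n:
        for p in range(n):
            if rng.uniform(0.0, 1.0) < prob[p]:
                total += 1
                q = rng.integers(dim)  # random dimension
                r = rng.choice([i for i in range(n) if i != p])  # r != p
                candidate = new_pop[p].copy()
                candidate[q] = (
                    new_pop[p, q]
                    + rng.uniform(0.0, 1.0) * (global_leader[q] - new_pop[p, q])
                    + rng.uniform(-1.0, 1.0) * (new_pop[r, q] - new_pop[p, q])
                )
                # Greedy selection: accept only an improvement.
                if fitness(candidate) > fitness(new_pop[p]):
                    new_pop[p] = candidate
    return new_pop
```

Members with a larger share of the total fitness are selected more often, so better solutions get more chances to move toward the global leader.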
Further, the fitness value of the new position is compared with that of the previous position, and the position is updated if the new fitness value is better. Algorithm 2 depicts the position update procedure of the Global Leader Step.

Algorithm 2 Global Leader Update procedure
    total = 0
    while total < set size do
        for all members SM_p ∈ set do
            if UR(0, 1) < prob_p then
                total = total + 1
                Randomly pick q ∈ {1, ..., D}
                Randomly pick SM_r ∈ set with r ≠ p
                SMnew_pq = SM_pq + UR(0, 1) × (G_q − SM_pq) + UR(−1, 1) × (SM_rq − SM_pq)
            end if
        end for
    end while

Global Leader Learning Step (GLLS): In this phase, the location of the global leader is updated to the position of the SM with the best fitness value in the set. If the location of the global leader is unchanged after updating, then GLC (global-limit-count) is increased by 1.

Local Leader Learning Step (LLLS): The SM location with the best fitness value in the subset becomes the new location of the local leader. If the location of a local leader is unchanged, then LLC (local-limit-count) is increased by 1.

Local Leader Decision Step (LLDS): If the location of the local leader remains unchanged up to a certain threshold value named local-leader-limit (LLL), then the locations of all members of that subset are either initialized randomly or updated by merging information from the local leader and the global leader using Eq. (5). Algorithm 3 depicts the position update procedure of the Local Leader Decision Step.

$$SMnew_{pq} = SM_{pq} + UR(0, 1) \times (G_q - SM_{pq}) + UR(0, 1) \times (SM_{pq} - L_{kq}) \tag{5}$$
Algorithm 3 Local Leader Decision procedure
    if LLC > LLL then
        LLC = 0
        SMnew_pq = SM_pq
        for all q ∈ {1, ..., D} do
            if UR(0, 1) ≥ pert then
                SMnew_pq = SM_minq + UR(0, 1) × (SM_maxq − SM_minq)
            else
                SMnew_pq = SM_pq + UR(0, 1) × (G_q − SM_pq) + UR(0, 1) × (SM_pq − L_kq)
            end if
        end for
    end if
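The Local Leader Decision Step can be sketched in the same style. The function name and its signature are assumptions made for illustration; the sketch applies Algorithm 3 to every member of one subset and returns the updated subset together with the reset counter.

```python
import numpy as np


def local_leader_decision(group, local_leader, global_leader,
                          lower, upper, pert, llc, lll, rng):
    """Re-position a stagnant subset as in Algorithm 3 and Eq. (5)."""
    if llc <= lll:
        return group, llc  # the local leader is not stagnant yet
    new_group = group.copy()
    n, dim = group.shape
    for p in range(n):
        for q in range(dim):
            if rng.uniform(0.0, 1.0) >= pert:
                # Random re-initialization within the bounds, as in Eq. (1).
                new_group[p, q] = lower[q] + rng.uniform(0.0, 1.0) * (upper[q] - lower[q])
            else:
                # Eq. (5): move toward the global leader, away from the local one.
                new_group[p, q] = (
                    group[p, q]
                    + rng.uniform(0.0, 1.0) * (global_leader[q] - group[p, q])
                    + rng.uniform(0.0, 1.0) * (group[p, q] - local_leader[q])
                )
    return new_group, 0  # LLC is reset
```

Note the sign of the last term in Eq. (5): the member is pushed away from the stagnant local leader while being attracted to the global leader.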
Global Leader Decision Step (GLDS): In this phase, the large set of SMs is divided into small subsets if the location of the global leader remains unchanged up to a predetermined threshold, GlobalLeaderLimit (GLL), where the number of subsets should not exceed the maximum number of subsets allowed. If the location of the global leader remains unchanged after the maximum number of subsets has been formed, then all subsets are merged into a single set. The procedure for the Global Leader Decision Step is shown in Algorithm 4. The complete procedure of spider monkey optimization is given in Algorithm 5.

Algorithm 4 Global Leader Decision procedure
    if GLC > GLL then
        GLC = 0
        if number of subsets < maximum subset limit then
            Slice the population into subsets
        else
            Merge all the small-scale subsets into a single set
        end if
        Update the local leader positions
    end if
Algorithm 5 Spider Monkey Optimization
    Input: population, LLL, GLL, pert
    Compute the fitness
    Select the global leader and the local leaders using greedy selection
    while the termination criterion is not met do
        Find the new location for the whole set using the local-leader and global-leader experiences (Algorithm 1)
        For all group members, apply greedy selection based on the fitness
        Compute prob_p for each member of the group using Eq. (4)
        Produce the new location for all members of the set selected by prob_p, using the experiences of the member itself, the global leader, and the set members (Algorithm 2)
        Update the locations of the local leaders of all subsets and of the global leader of the set
        If a local leader's location remains unchanged up to the predetermined threshold (Local-Leader-Limit), redirect all members of that subset for foraging as specified by Algorithm 3
        If the global leader's location remains unchanged up to the predetermined threshold (Global-Leader-Limit), divide the set further into small-scale subsets as specified by Algorithm 4
    end while
2.2 Proposed ESMO: a Plant Disease Identification Approach

This section presents a novel methodology for detecting plant leaf disease. Initially, the plant leaf image data is taken as input, and feature extraction is performed using a subtractive pixel adjacency matrix (SPAM). Subsequently, the extracted features are fed into an exponential spider monkey optimization (ESMO) algorithm to select an optimal set of features. The output of this step is used to classify the plant leaf images with classifiers such as SVM and k-NN. The block diagram of the proposed mechanism is presented in Fig. 1, and each step is described in detail below.

Fig. 1 Block diagram of the proposed mechanism (Plant Leaf Images → Feature Extraction (SPAM) → Feature Selection (ESMO) → Optimal Set of features → Classifier (SVM, k-NN) → Prediction and Accuracy)
2.3 Exponential SMO

An important parameter of SMO is the perturbation rate, which affects the convergence behavior of the algorithm and is defined in SMO as a linearly increasing function. However, the performance of SMO may degrade when the application is non-linear. Hence, the modified SMO, named ESMO, uses an improved perturbation rate to increase the global search performance. In ESMO, the perturbation rate is defined by the exponentially increasing function shown in Eq. (6).

$$r_{new} = (r_{init})^{m_{it}/t} \tag{6}$$

where $m_{it}$ and $t$ represent the maximum and current iteration count, respectively, and $r_{init}$ is the initial perturbation rate, randomly initialized in the range [0, 1].
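As a quick numerical check of Eq. (6), the following sketch evaluates the perturbation rate at a few iterations. The function name and the 1-based iteration counter are assumptions of this illustration. Since the initial rate lies in [0, 1] and the exponent shrinks from the maximum iteration count down to 1, the rate starts near zero and rises to the initial rate at the last iteration.

```python
def exponential_perturbation_rate(r_init, t, max_iter):
    """Eq. (6): r_new = r_init ** (max_iter / t), for t = 1, ..., max_iter."""
    if not 0.0 <= r_init <= 1.0:
        raise ValueError("r_init must lie in [0, 1]")
    if not 1 <= t <= max_iter:
        raise ValueError("t must lie in [1, max_iter]")
    return r_init ** (max_iter / t)


# With r_init = 0.5 and 1000 iterations the rate stays tiny for a long
# time and then climbs: 0.0625 at t = 250, 0.25 at t = 500, 0.5 at t = 1000.
for t in (1, 250, 500, 1000):
    print(t, exponential_perturbation_rate(0.5, t, 1000))
```

A small rate early on means that, in Algorithm 1, the condition UR(0, 1) ≥ pert holds for almost every dimension, so positions are perturbed aggressively at the start of the search.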
2.4 Feature Extraction

The classification accuracy of the model depends on the selected features and the classification algorithm. Hence, the aim is to extract diverse, relevant features from the image dataset that can differentiate healthy plant leaf images from diseased ones. SPAM is a feature extraction method introduced by Pevny et al. [21]. In this method, the features are extracted from the spatial domain of an image based on Markov chain features. The model computes transition probabilities in eight directions, and the differences and the transition probabilities are always computed along the same direction. Table 1 shows the dimensions of the first- and second-order SPAM models, in which the pixel differences are modeled by a first- and a second-order Markov process, respectively, and are restricted to the range $[-D, D]$. The number of features $l$ is given by $l = (2D + 1)^2$ for the first-order model and $l = (2D + 1)^3$ for the second-order model. Here, $D = 4$ and $D = 3$ are chosen for the first- and second-order SPAM features, respectively, which gives $2l = 162$ features for the first-order model and $2l = 686$ features for the second-order model. The SPAM features themselves are the sample Markov transition probability matrices of the differences. The SPAM method extracts a total of 686 features from the image dataset.

Table 1 Dimensions of the model

| Order | D | Dimension (2l) |
|---|---|---|
| First | 4 | 162 |
| Second | 3 | 686 |
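The dimension formulas can be checked numerically, and the idea of a transition probability matrix can be illustrated for a single direction. The sketch below is a simplified illustration, not the full SPAM extractor of [21]: it handles only the left-to-right direction, clips differences to the allowed range (an assumption of this sketch), and uses illustrative function names. It also shows that 162 first-order features correspond to a range parameter of 4, since 2 × (2·4 + 1)² = 162.

```python
import numpy as np


def spam_dimension(d, order):
    """Total feature count 2l, with l = (2d + 1) ** (order + 1)."""
    return 2 * (2 * d + 1) ** (order + 1)


def first_order_transitions(image, d=4):
    """Left-to-right first-order transition probabilities of pixel differences."""
    img = np.asarray(image, dtype=np.int64)
    diff = img[:, :-1] - img[:, 1:]      # horizontal pixel differences
    diff = np.clip(diff, -d, d)          # assumption: clip to [-d, d]
    cur = diff[:, :-1].ravel() + d       # conditioning difference
    nxt = diff[:, 1:].ravel() + d        # following difference
    counts = np.zeros((2 * d + 1, 2 * d + 1))
    np.add.at(counts, (cur, nxt), 1)     # co-occurrence counts
    row_sums = counts.sum(axis=1, keepdims=True)
    # Row v holds P(next difference = u | current difference = v).
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


print(spam_dimension(4, order=1))  # 162
print(spam_dimension(3, order=2))  # 686
```

Each direction yields a (2d + 1) × (2d + 1) matrix; in [21] the matrices of the horizontal and vertical directions are averaged into one group and those of the diagonal directions into another, which is where the factor 2 comes from.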
2.5 Feature Selection

Features extracted using SPAM may contain unnecessary and extraneous features that decrease classification performance. Hence, the extracted SPAM features are fed into an optimization algorithm to select the optimal features that give the best performance. The exponential spider monkey optimization (ESMO) is used to select an optimal set of features. At first, the number of spider monkeys $Z$ in the population is chosen, and the location of each spider monkey SM is initialized randomly. The dimension of each SM is the number of features $n$ extracted by SPAM. Each spider monkey $p$ in a population of $Z$ is denoted as

$$X_p = (x_1, x_2, \ldots, x_n); \quad p = 1, 2, \ldots, Z$$
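A wrapper-style fitness for such a position vector can be sketched as follows. The chapter does not state how a continuous position is turned into a feature subset, so the 0.5 threshold, the leave-one-out 1-nearest-neighbor scorer, and all function names here are assumptions chosen to keep the example self-contained.

```python
import numpy as np


def position_to_mask(position, threshold=0.5):
    """Select feature i when x_i exceeds the threshold (assumed rule)."""
    position = np.asarray(position, dtype=float)
    mask = position > threshold
    if not mask.any():
        mask[np.argmax(position)] = True  # always keep at least one feature
    return mask


def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a sample may not vote for itself
    return float(np.mean(y[dist.argmin(axis=1)] == y))


def subset_fitness(position, X, y):
    """Wrapper fitness: accuracy obtained with the selected features only."""
    mask = position_to_mask(position)
    return loo_1nn_accuracy(np.asarray(X, dtype=float)[:, mask], y)
```

In a toy dataset where the first feature separates the classes and the second is noise, a position that selects only the first feature scores higher than one that selects both, which is the signal the optimizer exploits.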
2.6 Performance Comparison

The performance of the proposed ESMO method is analyzed on the eight standard benchmark functions given in Table 2. The ESMO performance is also compared against the metaheuristic approaches SMO and PSO. The default parameter settings are used for all the approaches. The results are computed over 1000 iterations with a population size of 1000. The mean fitness values of all the approaches are compared in Table 3. The ESMO method returns the minimum mean fitness value for most of the eight benchmark functions, whereas PSO returns the best result for benchmark function F1. The best fitness values of all benchmark functions against the number of iterations are shown in Fig. 6.

Table 2 Benchmark functions [3]

| S. No. | Equation | Dimensions | Range | Optimal value |
|---|---|---|---|---|
| 1 | $f_1(X) = \sum_{i=1}^{d} x_i^2$ | 30 | [−100, 100] | 0 |
| 2 | $f_2(X) = \sum_{i=1}^{d} \lvert x_i \rvert + \prod_{i=1}^{d} \lvert x_i \rvert$ | 30 | [−10, 10] | 0 |
| 3 | $f_3(X) = \max_i \lvert x_i \rvert,\ 1 \le i \le d$ | 30 | [−100, 100] | 0 |
| 4 | $f_4(X) = \sum_{i=1}^{d} ([x_i + 0.5])^2$ | 30 | [−100, 100] | 0 |
| 5 | $f_5(X) = \sum_{i=1}^{d} i x_i^4 + \mathrm{random}[0, 1]$ | 30 | [−1.28, 1.28] | 0 |
| 6 | $f_6(X) = -20 \exp\left(-0.2 \sqrt{\frac{1}{d} \sum_{i=1}^{d} x_i^2}\right) - \exp\left(\frac{1}{d} \sum_{i=1}^{d} \cos(2\pi x_i)\right) + 20 + e$ | 30 | [−32, 32] | 0 |
| 7 | $f_7(X) = 4x_1^2 - 2.1x_1^4 + \frac{1}{3} x_1^6 + x_1 x_2 - 4x_2^2 + 4x_2^4$ | 2 | [−5, 5] | −1.0316 |
| 8 | $f_8(X) = \left(1 + (x_1 + x_2 + 1)^2 (19 - 14x_1 + 3x_1^2 - 14x_2 + 6x_1 x_2 + 3x_2^2)\right) \cdot \left(30 + (2x_1 - 3x_2)^2 (18 - 32x_1 + 12x_1^2 + 48x_2 - 36x_1 x_2 + 27x_2^2)\right)$ | 2 | [−2, 2] | 3 |

Table 3 Mean fitness value comparison

| Function | ESMO | SMO | PSO |
|---|---|---|---|
| $f_1(X)$ | −7.246156e+190 | −1.260096e+188 | 2.231762e−15 |
| $f_2(X)$ | −5.865074e+280 | −2.407429e+279 | −1.996000e+01 |
| $f_3(X)$ | −1.434760e+189 | 3.471750e+187 | −9.979999e+00 |
| $f_4(X)$ | −1.078704e+190 | −9.667407e+187 | 8.909608e−09 |
| $f_5(X)$ | −5.818072e+188 | −1.698736e+190 | 9.186722e−09 |
| $f_6(X)$ | −2.056615e+189 | −1.172066e+188 | 1.392610e−14 |
| $f_7(X)$ | −4.429892e+188 | −2.154990e+188 | −3.760540e−01 |
| $f_8(X)$ | −5.039292e+307 | 2.358577e+39 | −2.943116e+05 |
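Four of the benchmark functions are easy to verify at their known optima. The sketch below assumes that f6 is the standard Ackley function (with the square root inside the first exponential), f7 the six-hump camel function, and f8 the Goldstein-Price function, which is how the expressions in Table 2 read; the function names are illustrative.

```python
import numpy as np


def sphere(x):
    """f1: sum of squares, minimum 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))


def ackley(x):
    """f6: Ackley function, minimum 0 at the origin."""
    x = np.asarray(x, dtype=float)
    d = x.size
    return float(
        -20.0 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / d))
        - np.exp(np.sum(np.cos(2.0 * np.pi * x)) / d)
        + 20.0 + np.e
    )


def six_hump_camel(x):
    """f7: minimum of about -1.0316 at (0.0898, -0.7126) and (-0.0898, 0.7126)."""
    x1, x2 = x
    return 4 * x1**2 - 2.1 * x1**4 + x1**6 / 3 + x1 * x2 - 4 * x2**2 + 4 * x2**4


def goldstein_price(x):
    """f8: minimum 3 at (0, -1)."""
    x1, x2 = x
    a = 1 + (x1 + x2 + 1) ** 2 * (19 - 14 * x1 + 3 * x1**2 - 14 * x2 + 6 * x1 * x2 + 3 * x2**2)
    b = 30 + (2 * x1 - 3 * x2) ** 2 * (18 - 32 * x1 + 12 * x1**2 + 48 * x2 - 36 * x1 * x2 + 27 * x2**2)
    return a * b


print(sphere(np.zeros(30)))             # 0.0
print(goldstein_price((0.0, -1.0)))     # 3.0
```

Evaluating candidate implementations at these reference points is a cheap sanity check before running an optimizer on them.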
2.7 Classifier

The features selected by ESMO in the feature selection phase are fed as input to the classification algorithm. Two classification algorithms, SVM and kNN, are used to classify the input images as healthy or diseased. The classification accuracy of these two methods is compared in Table 4 in the results section.
3 Results

To simulate the proposed methodology, a dataset of 100 plant leaf images obtained from the plant image dataset [20] is used. The dataset includes 50 healthy and 50 unhealthy leaf images. Figures 2 and 3 show the diseased leaf images of rice and cotton plants, and Figs. 4 and 5 show healthy leaf images of rice and cotton plants. Initially, a total of 686 features are extracted by the SPAM method for every image, which are then fed into the feature selection phase. The feature selection method using ESMO selects 82 features (Fig. 6). ESMO is also compared against the original SMO and PSO algorithms. It can be observed from Table 4 that ESMO selects the fewest features, which is a reduction of more than 85% in the extracted features. Moreover, the classification accuracy of SVM and k-NN is compared, which shows that SVM gives the better classification accuracy of 93.67%. The confusion matrices obtained using the SVM and k-NN classifiers are depicted in Fig. 7a, b, respectively. According to the confusion matrices, healthy leaves are classified more accurately than diseased leaves, which is due to weak disease symptoms on some of the leaves. Moreover, some of the healthy leaves are classified as diseased, which is due to poor lighting conditions when the image was captured. Figure 8 shows some of the leaf samples classified using the proposed method.
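The reduction claimed above follows directly from the feature counts reported in Table 4, as this short calculation shows (the variable and function names are illustrative).

```python
N_EXTRACTED = 686  # SPAM features per image
N_SELECTED = {"PSO": 97, "SMO": 84, "ESMO": 82}  # counts from Table 4


def reduction_percent(n_selected, n_total=N_EXTRACTED):
    """Share of the extracted features removed by feature selection."""
    return 100.0 * (n_total - n_selected) / n_total


for method, n in N_SELECTED.items():
    print(f"{method}: {n} kept, {reduction_percent(n):.2f}% removed")
# PSO: 97 kept, 85.86% removed
# SMO: 84 kept, 87.76% removed
# ESMO: 82 kept, 88.05% removed
```

All three optimizers remove more than 85% of the extracted features, and ESMO removes the most.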
Fig. 2 Unhealthy rice leaf
Fig. 3 Unhealthy cotton leaf images
Fig. 4 Healthy rice leaf images
Fig. 5 Healthy cotton leaf images

Table 4 Features selection comparison

| Feature selection method | No. of features selected | Classification method | Accuracy (%) |
|---|---|---|---|
| None | 686 | SVM | 80.02 |
| None | 686 | k-NN | 82.35 |
| PSO | 97 | SVM | 90.80 |
| PSO | 97 | k-NN | 83.52 |
| SMO | 84 | SVM | 90.89 |
| SMO | 84 | k-NN | 82.75 |
| ESMO | 82 | SVM | 93.67 |
| ESMO | 82 | k-NN | 84.20 |

4 Conclusions and Future Work

In this work, a mechanism to detect unhealthy or diseased plants using plant leaf image data is presented. The novel idea of exponential spider monkey optimization (ESMO) is proposed for selecting optimal features to increase classification accuracy. ESMO outperforms SMO and PSO in mean fitness value on most of the benchmark functions. The method is implemented for healthy and unhealthy plant leaf images of rice and cotton plants. The performance of ESMO is compared with PSO and SMO, which shows that using ESMO for feature selection together with SVM gives the maximum classification accuracy of 93.67%. The future plan is to recognize different plant diseases using a multi-SVM classifier and to analyze the results with different classification techniques such as SVM, NB, and RF classifiers.
Fig. 6 Benchmark functions: best fitness value (in log) against iteration for ESMO, PSO, and SMO. Panels: (a) Benchmark Function F1, (b) Benchmark Function F2, (c) Benchmark Function F3, (d) Benchmark Function F4, (e) Benchmark Function F5, (f) Benchmark Function F6, (g) Benchmark Function F7, (h) Benchmark Function F8.
Fig. 7 Confusion matrix: (a) SVM classifier, (b) k-NN classifier

Fig. 8 Leaf classification using proposed method: (a) True: Diseased; Predicted: Healthy, (b) True: Diseased; Predicted: Diseased, (c) True: Healthy; Predicted: Healthy, (d) True: Healthy; Predicted: Diseased
Acknowledgements Funding: This work is supported by the Science and Engineering Research Board (SERB-DST), Govt. of India, under Grant no. EEQ/2018/000108. Conflicts of interest: The authors declare that they have no conflict of interest.
References

1. Agrawal A, Farswan P, Agrawal V, Tiwari D, Bansal JC (2017) On the hybridization of spider monkey optimization and genetic algorithms. In: Proceedings of sixth international conference on soft computing for problem solving. Springer, pp 185–196
2. Akhtar A, Khanum A, Khan SA, Shaukat A (2013) Automated plant disease analysis (APDA): performance comparison of machine learning techniques. In: 2013 11th International conference on frontiers of information technology. IEEE, pp 60–65
3. Bansal JC, Singh PK, Deep K, Pant M, Nagar AK (2012) Proceedings of seventh international conference on bio-inspired computing: theories and applications (BIC-TA 2012), vol 2. Springer
4. Barbedo JGA (2013) Digital image processing techniques for detecting, quantifying and classifying plant diseases. SpringerPlus 2(1):660
5. Chen C, Shi YQ (2008) JPEG image steganalysis utilizing both intrablock and interblock correlations. In: 2008 IEEE International symposium on circuits and systems. IEEE, pp 3029–3032
6. Chhikara RR, Sharma P, Singh L (2016) A hybrid feature selection approach based on improved PSO and filter approaches for image steganalysis. Int J Mach Learn Cybern 7(6):1195–1206
7. Deepa S, Umarani R (2017) Steganalysis on images using SVM with selected hybrid features of GINI index feature selection algorithm. Int J Adv Res Comput Sci 8(5)
8. Deng H, Runger G (2012) Feature selection via regularized trees. In: The 2012 International joint conference on neural networks (IJCNN). IEEE, pp 1–8
9. Guettari N, Capelle-Laizé AS, Carré P (2016) Blind image steganalysis based on evidential k-nearest neighbors. In: 2016 IEEE International conference on image processing (ICIP). IEEE, pp 2742–2746
10. Gupta K, Deep K, Bansal JC (2017) Spider monkey optimization algorithm for constrained optimization problems. Soft Comput 21(23):6933–6962
11. Hazrati G, Sharma H, Sharma N, Bansal JC (2016) Modified spider monkey optimization. In: 2016 International workshop on computational intelligence (IWCI). IEEE, pp 209–214
12. Hussain K, Salleh MNM, Cheng S, Shi Y (2019) Metaheuristic research: a comprehensive survey. Artif Intell Rev 52(4):2191–2233
13. Kaur S, Pandey S, Goel S (2019) Plants disease identification and classification through leaf images: a survey. Arch Comput Methods Eng 26(2):507–530
14. Kodovsky J, Fridrich J, Holub V (2011) Ensemble classifiers for steganalysis of digital media. IEEE Trans Inf Forensics Secur 7(2):432–444
15. Kohavi R, John GH et al (1997) Wrappers for feature subset selection. Artif Intell 97(1–2):273–324
16. Kumar S, Kumari R (2014) Modified position update in spider monkey optimization algorithm. Int J Emerg Technol Comput Appl Sci (IJETCAS). Citeseer
17. Kumar S, Kumari R, Sharma VK (2015) Fitness based position update in spider monkey optimization algorithm. Procedia Comput Sci 62:442–449
18. Kurniawati NN, Abdullah SNHS, Abdullah S, Abdullah S (2009) Texture analysis for diagnosing paddy disease. In: 2009 International conference on electrical engineering and informatics, vol 1. IEEE, pp 23–27
19. Mohammadi FG, Abadeh MS (2014) Image steganalysis using a bee colony based feature selection algorithm. Eng Appl Artif Intell 31:35–43
20. Mohanty SP, Hughes DP, Salathé M (2016) Using deep learning for image-based plant disease detection. Front Plant Sci 7:1419
21. Pevny T, Bas P, Fridrich J (2010) Steganalysis by subtractive pixel adjacency matrix. IEEE Trans Inf Forensics Secur 5(2):215–224
22. Pflanz M, Nordmeyer H, Schirrmann M (2018) Weed mapping with UAS imagery and a bag of visual words based image classifier. Remote Sens 10(10):1530
23. Pires RDL, Gonçalves DN, Oruê JPM, Kanashiro WES, Rodrigues JF Jr, Machado BB, Gonçalves WN (2016) Local descriptors for soybean disease recognition. Comput Electron Agricult 125:48–55
24. Priya R, Ramesh D, Khosla E (2018) Biodegradation of pesticides using density-based clustering on cotton crop affected by Xanthomonas malvacearum. Environ Dev Sustain 1–17
25. Raghavendra B et al (2019) Diseases detection of various plant leaf using image processing techniques: a review. In: 2019 5th International conference on advanced computing & communication systems (ICACCS). IEEE, pp 313–316
26. Saraswat M, Arya K, Sharma H (2013) Leukocyte segmentation in tissue images using differential evolution algorithm. Swarm Evolut Comput 11:46–54
27. Sheikhan M, Pezhmanpour M, Moin MS (2012) Improved contourlet-based steganalysis using binary particle swarm optimization and radial basis neural networks. Neural Comput Appl 21(7):1717–1728
28. Sujatha R, Isakki P (2016) A study on crop yield forecasting using classification techniques. In: 2016 International conference on computing technologies and intelligent data engineering (ICCTIDE'16). IEEE, pp 1–4
29. Swami V, Kumar S, Jain S (2018) An improved spider monkey optimization algorithm. In: Soft computing: theories and applications. Springer, pp 73–81
30. Whitley D (1994) A genetic algorithm tutorial. Stat Comput 4(2):65–85
31. Zhang M, Meng Q (2010) Citrus canker detection based on leaf images analysis. In: The 2nd international conference on information science and engineering. IEEE, pp 3584–3587
Chapter 11
Deep Learning-Based Cauliflower Disease Classification Md. Abdul Malek, Sanjida Sultana Reya, Nusrat Zahan, Md. Zahid Hasan, and Mohammad Shorif Uddin
Md. Abdul Malek · S. S. Reya · N. Zahan · Md. Zahid Hasan (B): Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
M. S. Uddin: Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_11

1 Introduction

Cauliflower is a vegetable of the Brassicaceae (Cruciferae) family, which is also known as the cabbage family [1]. A cup of boiled cauliflower is an excellent source of vitamin C (91.5% of the daily value, DV), folate (13.6% of the DV), and dietary fiber (13.4% of the DV). The same serving is also a good source of vitamin B5, vitamin B6, manganese, and omega-3 fatty acids [2]. China, India, Bangladesh, and the United States of America (the USA) are the leading countries [3] for cauliflower production. Although cauliflower production is increasing rapidly because of its huge demand, it can be hampered by dangerous diseases such as bacterial soft rot, black rot, buttoning, and white rust. Researchers around the world are trying to identify these diseases correctly and to develop disease diagnosis techniques that prevent production loss. This study introduces a disease identification model based on deep learning. Deep learning covers a wide area of computer vision and automatically extracts features from images to identify and recognize the disease. Figure 1 shows the harvesting of cauliflower in a village field of Rajshahi in Bangladesh.

Fig. 1 Cauliflower harvesting in the district of Rajshahi, Bangladesh

Computer vision and object identification have developed enormously over the past several years. More recently, the PASCAL VOC Challenge [4], the Large Scale Visual Recognition Challenge (ILSVRC) [5], and the ImageNet dataset [6] have been widely used as benchmarks for evaluating representation-related problems in computer vision, including object classification. For classifying images into 1000 possible classes, a large deep convolutional neural network achieved a top-5 error of 16.4% in 2012. Over the next three years, numerous improvements in deep convolutional neural networks reduced the error rate to 3.57% [7]. Deep learning is a subset of machine learning that imitates the functioning of the human brain and eventually generates patterns for decision-making [8]. Deep neural networks are being applied successfully in many different domains as end-to-end learning. A neural network generates a mapping between input and output. The nodes of a neural network are mathematical functions that take numerical inputs from their incoming edges and produce numerical outputs on their outgoing edges. Simply put, deep learning uses nodes arranged in a series of stacked layers to map the input layer to the output layer. However, it is challenging to ensure that the network structure, the node functions, and the edge weights map the input to the output correctly. Training a deep neural network improves this mapping, and the challenging training process has been made easier by conceptual and engineering breakthroughs [9]. Nevertheless, training a large neural network is time-consuming. Transfer learning is an effective way to ease this problem. It has two advantages: a reduction in training time and an increase in testing accuracy.
With this view, the current chapter uses transfer learning to extract features from images with variants of the VGG, ResNet50, and InceptionV3 architectures. The feature maps are then reshaped into a one-dimensional vector and passed to a softmax activation function, which enables the model to recognize the cauliflower disease. The remaining sections are organized as follows. Section 2 describes the related literature. Section 3 presents the tools and techniques used along with the datasets. Section 4 discusses the experimental findings along with a comparative analysis. Finally, Sect. 5 concludes the chapter.
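The final stage described above (flattening the feature maps and applying softmax) can be illustrated with a few lines of NumPy. This is only a schematic sketch: the class list reuses the four diseases named in the introduction, and the dense layer with its `weights` and `bias` is a hypothetical stand-in for whatever layers the chapter's models actually place between the feature maps and the softmax.

```python
import numpy as np

# Assumed label set: the four diseases named in the introduction.
CLASSES = ["bacterial soft rot", "black rot", "buttoning", "white rust"]


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()


def classify(feature_maps, weights, bias):
    """Flatten the feature maps, apply a dense layer, then softmax."""
    flat = np.asarray(feature_maps, dtype=float).reshape(-1)
    probs = softmax(weights @ flat + bias)
    return CLASSES[int(np.argmax(probs))], probs
```

The softmax output is a probability distribution over the classes, and the predicted disease is the class with the highest probability.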
2 Related Works
In agricultural research, multiple approaches are used for detecting and classifying plant diseases [10]. Although much work has been done on computer vision-based disease recognition, very little has addressed the detection of cauliflower diseases. This section discusses some existing computer vision methods for plant disease detection. Dhakal and Shakya [11] used artificial neural networks to detect plant disease. Images of infected leaves were collected, labeled, and enhanced, after which features were extracted from the segmented images and the patterns were classified. Trained for 20 epochs, the neural network achieved an accuracy of 98.59% in identifying plant disease. Too et al. [12] used a dataset of 38 classes containing images of both healthy and infected leaves of 14 plants. They assessed various deep learning architectures and found that DenseNet kept improving with an increasing number of epochs, without overfitting or performance deterioration. With fewer parameters and reasonable computing time, DenseNet achieved 99.75% accuracy; Keras and Theano were used for training. Afifi et al. [13] developed and tested CNN methods such as ResNet18, ResNet34, and ResNet50, along with a triplet network, to classify plant diseases from minimal data. Islam et al. [14] introduced an approach that integrates image processing and machine learning for identifying potato leaf diseases. The dataset came from the publicly available plant image database known as "Plant Village"; over 300 images were used for disease classification, and an accuracy of 95% was obtained. Athanikar and Badar [15] used neural network techniques for the classification of potato leaf diseases.
Their experimental results showed that a backpropagation neural network (BPNN) performed better at detecting the spots on infected leaves and classifying the specific disease, with an accuracy of 92%. Wang et al. [16] sought a method for recognizing plant diseases and found that a neural network-based method could effectively identify and diagnose them. Similarly, Samanta et al. [17] applied enhancement and segmentation procedures to potato leaf images to classify leaf diseases. For fruit grading along with disease detection, Jhuria et al. [18] applied image processing techniques to two different datasets and used artificial neural networks for disease detection. Color, texture, and morphology were the three feature vectors considered in this study, and the morphological features gave better results than the other two. Fuentes et al. [19] implemented a platform that uses deep learning models for tomato plant disease diagnosis and localization. The authors achieved a higher average precision using their own automated tomato leaf image database. Gutierrez et al. [20] investigated automated pest detection using machine learning techniques, focusing on harmful pests in greenhouse tomato and pepper crops. Their results showed that deep learning architectures performed classification and detection better than other machine learning techniques. Ramcharan et al. [21] obtained satisfactory results in cassava leaf disease diagnosis with a deep learning approach based on the Single Shot MultiBox Detector (SSD). The reviewed articles show that a major portion of disease recognition and classification work in agriculture relies on computer vision-based machine learning approaches. However, although cauliflower is a popular edible crop in many countries and is affected by several diseases, it has not been properly addressed in the literature.
3 Materials and Methods
The overall system, shown in Fig. 2, consists of three stages: data preparation, data preprocessing, and classification.
Fig. 2 Representation of cauliflower diseases classification using deep learning methods
3.1 Dataset
Solving an image analysis problem requires a sufficient collection of images. The images in this dataset were taken with a 12-megapixel smartphone camera in various cauliflower fields in the district of Rajshahi, Bangladesh. Images of four different diseases (bacterial soft rot, black rot, buttoning, and white rust) were collected over three weeks. Because these images were not sufficient for training the CNN model, some additional images were collected from the Internet. The acquired images were then augmented using rotation, scaling, shearing, and transformation. In total, 2500 images were prepared, of which 75% were used for training and 25% for validation of the model; 264 new images (66 from each disease class) were used for testing. Some sample images are shown in Table 1. Images taken in the cauliflower fields were full of noise, so preprocessing of the data was essential. We applied a non-local means denoising algorithm to reduce the noise in the images (see Fig. 3b, c).

Table 1 Sample images of four different diseases of cauliflower

| Disease name | Cause of the disease |
|---|---|
| Bacterial soft rot | Caused by many bacteria, the most common is Pectobacterium carotovorum |
| Black rot | Caused by a bacterium named Xanthomonas campestris |
| Buttoning | Caused due to shortage of nitrogen |
| White rust | Caused by Albugo candida |

Fig. 3 Salt and pepper noise removal using a denoising algorithm: (a) original image, (b) salt and pepper noise addition, (c) noise removal using the non-local means denoising algorithm
Cauliflower disease images are affected by salt and pepper noise, whose salt noise value is represented as 255 and whose pepper noise value is represented as 0. The noise level can be varied in a range of 10–70%, and the noise is represented in the image $I(m,n)$ as:

$$
I(m,n) =
\begin{cases}
X_j & \text{if } I \in [0, T_j] \\
X_k & \text{if } I \in [T_k, 255] \\
C(i,j) & \text{if } t \in [T_p, T_s]
\end{cases}
\tag{1}
$$
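As a rough illustration of the noise model above, the following sketch corrupts a grayscale image with salt and pepper noise at a chosen level. The function name `add_salt_and_pepper`, the equal split between salt and pepper pixels, and the fixed random seed are assumptions made for this example; they are not taken from the chapter.

```python
import random


def add_salt_and_pepper(image, level, seed=0):
    """Return a copy of a grayscale image (list of rows) with salt and pepper noise.

    level is the fraction of pixels to corrupt (e.g. 0.1 to 0.7 for 10-70%).
    Assumption: corrupted pixels become salt (255) or pepper (0) with equal chance.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    noisy = [row[:] for row in image]
    for i, row in enumerate(noisy):
        for j in range(len(row)):
            if rng.random() < level:
                noisy[i][j] = 255 if rng.random() < 0.5 else 0
    return noisy


# Hypothetical 4 x 4 image with a uniform gray level of 120.
image = [[120] * 4 for _ in range(4)]
noisy = add_salt_and_pepper(image, level=0.3)
corrupted = sum(v != 120 for row in noisy for v in row)
print(f"{corrupted} of 16 pixels corrupted")
```

A level of 0 leaves the image unchanged, and a level of 1 replaces every pixel with either 0 or 255.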
The noise level is measured by the peak signal-to-noise ratio (PSNR) as follows:

$$
\mathrm{PSNR} = 10 \log_{10} \left( \frac{255^2}{\frac{1}{MN} \sum_{i,j} \left( r_{i,j} - x_{i,j} \right)^2} \right)
\tag{2}
$$
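The PSNR formula can be checked on a small example. The sketch below assumes 8-bit grayscale images stored as lists of rows, with `restored` and `original` playing the roles of $r$ and $x$; the function name `psnr` and the sample pixel values are hypothetical.

```python
import math


def psnr(restored, original, peak=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    rows, cols = len(original), len(original[0])
    squared_error = sum(
        (restored[i][j] - original[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    mse = squared_error / (rows * cols)  # the (1 / MN) * sum term of Eq. (2)
    if mse == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10 * math.log10(peak ** 2 / mse)


original = [[10, 20], [30, 40]]
restored = [[10, 20], [30, 50]]  # one pixel differs by 10, so the MSE is 25
print(round(psnr(restored, original), 2))  # prints 34.15
```

Here the mean squared error is 100 / 4 = 25, so the ratio is 65025 / 25 = 2601 and the PSNR is 10 log10(2601), about 34.15 dB. A higher PSNR means the restored image is closer to the original.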
Here, $r_{i,j}$ and $x_{i,j}$ represent the pixel values of the restored image and the original image, respectively. The median filter is used in this experiment to eliminate noise from the image, following Algorithm 1.

Algorithm 1: Median filtering algorithm to remove salt and pepper noise
Step-1: Allocate outpixvalue[m][n]
Step-2: Allocate filter[] with size filter_width × filter_height
Step-3: Define the edge of the x-axis, ex := filter_width / 2 (rounded down)
Step-4: Define the edge of the y-axis, ey := filter_height / 2 (rounded down)
Step-5: for i = ex to m − ex − 1 do
          for j = ey to n − ey − 1 do
            set x to 0
            for p = 0 to filter_width − 1 do
              for q = 0 to filter_height − 1 do
                filter[x] := inpixvalue[i + p − ex][j + q − ey]
                set x to x + 1
            sort the values in filter[]
Step-6:     set outpixvalue[i][j] to filter[filter_width * filter_height / 2]
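The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the image is assumed to be a list of rows of integer gray levels, border pixels that the window cannot cover are copied unchanged, and the function name `median_filter` is hypothetical.

```python
def median_filter(image, width=3, height=3):
    """Apply a width x height median filter to a grayscale image (list of rows).

    Assumption: pixels closer to the border than half the window are left as-is.
    """
    rows, cols = len(image), len(image[0])
    ex, ey = width // 2, height // 2  # half-window sizes (Steps 3 and 4)
    output = [row[:] for row in image]
    for i in range(ey, rows - ey):
        for j in range(ex, cols - ex):
            # Gather the neighbourhood centred on (i, j) (Step 5).
            window = [
                image[i + di - ey][j + dj - ex]
                for di in range(height)
                for dj in range(width)
            ]
            window.sort()
            output[i][j] = window[len(window) // 2]  # median value (Step 6)
    return output


# Hypothetical 4 x 4 patch with one salt pixel (255) and one pepper pixel (0).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
clean = median_filter(noisy)
print(clean[1][1], clean[2][2])  # prints 10 10
```

Because each 3 × 3 window in this patch contains at most two outliers among nine values, the median is the background level 10, which is why isolated salt and pepper pixels disappear.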
After that, the contrast limited adaptive histogram equalization (CLAHE) technique is used to enhance the contrast of each image. The histograms of the raw images show that the raw images are noisy and that the pixel intensities are not stable, whereas the histograms are far more stable after applying the CLAHE method (see Fig. 4). Later, all the images were segmented into several regions using the Gaussian mixture model (GMM). A GMM is a weighted sum of $M$ component densities [22]:

$$
\rho(f \mid \lambda) = \sum_{i=1}^{M} \omega_i \, g(f \mid o_i, \Sigma_i)
\tag{3}
$$
Fig. 4 Histogram analysis before and after applying the CLAHE method to images of four different cauliflower diseases. For each disease, (a) bacterial soft rot, (b) black rot, (c) buttoning, and (d) white rust, the panels show the raw image, the histogram of the raw image, the contrast-enhanced image, and the histogram of the contrast-enhanced image
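Here, $f$ is a D-dimensional data vector with continuous values, $\omega_i$, $i = 1, \ldots, M$, are the mixture weights, and $g(f \mid o_i, \Sigma_i)$, $i = 1, \ldots, M$, are the component Gaussian densities, each with mean vector $o_i$ and covariance matrix $\Sigma_i$:

$$
g(f \mid o_i, \Sigma_i) = \frac{1}{(2\pi)^{D/2} \, \lvert \Sigma_i \rvert^{1/2}} \exp\left\{ -\frac{1}{2} (f - o_i)^{\top} \Sigma_i^{-1} (f - o_i) \right\}
\tag{4}
$$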
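To make the mixture density concrete, the sketch below evaluates the component density and the weighted sum for the special case of a diagonal covariance matrix, which avoids a matrix inversion. The diagonal restriction, the function names, and the rule of assigning a pixel to the component with the largest weighted density are assumptions for illustration; the chapter does not state how pixels are assigned to regions.

```python
import math


def gaussian_density(f, mean, variances):
    """Gaussian density of Eq. (4), assuming a diagonal covariance matrix."""
    d = len(f)
    det = math.prod(variances)  # determinant of a diagonal matrix
    quad = sum((x - m) ** 2 / v for x, m, v in zip(f, mean, variances))
    return math.exp(-0.5 * quad) / ((2 * math.pi) ** (d / 2) * math.sqrt(det))


def mixture_density(f, weights, means, variances):
    """Weighted sum of the component densities, as in Eq. (3)."""
    return sum(
        w * gaussian_density(f, m, v)
        for w, m, v in zip(weights, means, variances)
    )


def assign_component(f, weights, means, variances):
    """Index of the component with the largest weighted density (assumed rule)."""
    scores = [
        w * gaussian_density(f, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    return scores.index(max(scores))


# Hypothetical two-component model over gray levels: a dark and a bright region.
weights = [0.5, 0.5]
means = [[50.0], [200.0]]
variances = [[100.0], [100.0]]
print(assign_component([60.0], weights, means, variances))   # prints 0
print(assign_component([190.0], weights, means, variances))  # prints 1
```

In practice the weights, means, and covariances are estimated from the image data, typically with the expectation-maximization algorithm, rather than set by hand as in this example.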
Figure 5 illustrates the steps followed to prepare the images for the deep learning network. First, the images are acquired and saved to local storage. The images are then resized to 64 × 64 pixels using the reshape function in Python. Next, the CLAHE method is applied to enhance the contrast of every image in the dataset, as shown in Fig. 5c. Finally, GMM, a well-known segmentation technique, is applied to segment each image in the dataset.
Fig. 5 Image preprocessing steps: (a) input image, (b) resized image, (c) contrast-enhanced image, (d) segmented image
3.2 Convolutional Neural Network (CNN)
A convolutional neural network (CNN) is a kind of artificial neural network typically used for image classification in deep learning. Because of their shared weights and translation-invariance properties, CNNs are also known as shift invariant or space invariant artificial neural networks (SIANN). CNNs are regularized versions of multilayer perceptrons. Deep convolutional networks were inspired by animal brains, since the connectivity pattern among their neurons resembles the organization of the animal visual cortex [23]. A CNN contains an input layer, hidden layers, and an output layer. The hidden layers include layers that perform convolutions, whose activation function is normally ReLU, followed by pooling layers and fully connected layers.
3.3 State-of-the-Art CNN Architectures
VGG16: VGG16 is a convolutional neural network architecture proposed by Simonyan and Zisserman [24] in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition"; it obtained a top-5 test accuracy of 92.7% on ImageNet. The most distinctive attribute of VGG16 is that, instead of having a large number of hyperparameters, it consistently uses convolution layers with 3 × 3 filters, stride 1, and same padding, along with max-pooling layers with 2 × 2 filters and stride 2. This arrangement of convolutional and max-pooling layers is followed throughout the architecture, and at the end, 2 fully connected layers are followed by a softmax layer for output. The architecture contains around 138 million parameters and takes 224 × 224 pixel images as input. The architecture of VGG16 is shown in Fig. 6.
VGG19: Like VGG16, VGG19 is a convolutional neural network; it is shown in Fig. 7. The number 19 reflects the number of weight layers: 16 convolution layers and 3 fully connected layers. The network also uses 5 max-pooling layers and a final softmax layer.
ResNet50: ResNet, commonly referred to as the residual network, was proposed by He et al. [25]. It provides a residual learning framework that eases the training of networks substantially deeper than those used previously. Such residual networks are easier to optimize and can gain accuracy from considerably increased depth. Residual networks with a depth of up to 152 layers were evaluated on the ImageNet dataset. The architecture of ResNet50, shown in Fig. 8, uses 3 × 3 filters and takes 224 × 224 pixel input images.
Inception V3: Inception V3 was proposed by Szegedy et al. [26], who examined ways of scaling up networks that use the added computation efficiently through suitable factorization and aggressive regularization. The main goal of factorization is to decrease the number of connections and parameters. The architecture uses the softmax function for normalization in the last layer, consists of 42 layers in total, and takes 299 × 299 pixel images as input. The model of Inception V3 is shown in Fig. 9.

Fig. 6 Architecture of VGG16
Fig. 7 Architecture of VGG19
Fig. 8 Architecture of Resnet50
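The softmax normalization used in the last layer can be illustrated with a few lines of Python. This is only a sketch: the raw scores below are hypothetical values for the four disease classes, not outputs of any of the trained networks.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["bacterial soft rot", "black rot", "buttoning", "white rust"]
scores = [1.0, 0.5, 3.0, 0.2]  # hypothetical last-layer outputs for one image
probabilities = softmax(scores)
predicted = classes[probabilities.index(max(probabilities))]
print(predicted)  # prints buttoning
```

The predicted class is the one with the highest probability, which is always the class with the highest raw score because softmax preserves the ordering of its inputs.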
3.4 Computational Environment
The experiments were conducted on a 64-bit 33 MHz Windows 10 computer with 16 GB RAM and a Tesla K80 GPU, in the Python programming environment with the Keras and TensorFlow APIs.
Fig. 9 Architecture of InceptionV3
3.5 Training
Because training a CNN model from scratch takes a very long time, transfer learning is used to shorten the training time. The experiment was performed with several CNN architectures, namely VGG16, VGG19, InceptionV3, and ResNet50, all pre-trained on the ImageNet dataset. The input image size was set according to the input size of the respective architecture. The Adam optimizer was used for every architecture, with a learning rate of 0.001. Figure 10 shows the result of applying the filters in the intermediate convolutional layers of the deep learning architecture. The feature maps of InceptionV3 revealed that the pixel information of all the images was correctly retained by those small filters. The model performed well in classifying diseases with different attributes. The performance of a deep learning architecture is usually evaluated using accuracy, precision, recall, F1-score, false-positive rate (FPR), and false-negative rate (FNR), all derived from the confusion matrix. Accuracy, precision, recall, and F1-score are defined in Eqs. (5) to (8), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.

$$
\text{Accuracy (\%)} = \frac{TP + TN}{TP + FN + FP + TN} \times 100
\tag{5}
$$

$$
\text{Precision (\%)} = \frac{TP}{TP + FP} \times 100
\tag{6}
$$

$$
\text{Recall (\%)} = \frac{TP}{TP + FN} \times 100
\tag{7}
$$

$$
\text{F1 Score (\%)} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \times 100
\tag{8}
$$

Fig. 10 Result of applying the filters in the intermediate convolutional layers of InceptionV3 architecture
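The metrics above can be computed directly from the four counts. In the sketch below, the counts are hypothetical, precision and recall enter the F1-score as fractions before the final multiplication by 100, and FPR and FNR use their standard definitions, FP / (FP + TN) and FN / (FN + TP), which the chapter names but does not write out.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return the metrics of Eqs. (5)-(8), plus FPR and FNR, as percentages."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn) * 100,
        "precision": precision * 100,
        "recall": recall * 100,
        "f1": 2 * precision * recall / (precision + recall) * 100,
        "fpr": fp / (fp + tn) * 100,  # standard definition, assumed here
        "fnr": fn / (fn + tp) * 100,  # standard definition, assumed here
    }


# Hypothetical counts for one class treated as "positive" against all others.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these counts the accuracy is 170 / 200 = 85%, the precision is 90%, and the recall is 90 / 110, about 81.82%. For a multi-class problem such as the four cauliflower diseases, the counts are obtained per class and the resulting metrics are averaged.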
4 Result Analysis
To find the most effective model among the considered CNN architectures, we evaluated all of them on the same set of test images. Figure 11 shows the performance of the four investigated CNN models: VGG16, VGG19, ResNet50, and InceptionV3. Except for ResNet50, all architectures gave decent results in cauliflower disease classification, but the best result was obtained from InceptionV3, whose accuracy is 93.90%. The complete performance metrics for all CNN models are shown in Table 2. InceptionV3 consistently demonstrated the best performance, indicating that the model is very effective in detecting the diseases. We include the receiver operating characteristic (ROC) curves to examine the classification performance on the different classes using InceptionV3. Figure 12 shows that the area under the ROC curve (AUC) of each class is close to 1, which indicates the effectiveness of the InceptionV3 classifier. Table 3 shows the confusion matrix obtained for InceptionV3. We found a recent work on cauliflower disease identification by Taki et al. [27] that uses a random forest classifier, and we compared our findings with theirs. As very little work has been done on cauliflower disease identification, we also compared our results with some related disease classification approaches [28–30]. Table 4 presents the comparative analysis, in which our investigated method achieves the highest accuracy.

Fig. 11 Test accuracy of different CNN architectures (VGG16: 0.875, VGG19: 0.858, ResNet50: 0.587, InceptionV3: 0.939)

Table 2 Performance metrics of CNN architectures

| Architecture | Precision (avg) | Recall (avg) | F1-score (avg) | Accuracy (avg) |
|---|---|---|---|---|
| VGG16 | 0.89 | 0.88 | 0.88 | 0.875 |
| VGG19 | 0.87 | 0.86 | 0.86 | 0.858 |
| ResNet50 | 0.60 | 0.59 | 0.59 | 0.587 |
| InceptionV3 | 0.94 | 0.94 | 0.94 | 0.939 |
Fig. 12 AUC of the ROC curve of each class was obtained from the InceptionV3
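Table 3 Confusion matrix of InceptionV3 (rows: actual value, columns: predicted value)

| Actual \ Predicted | Bacterial soft rot | Black rot | Buttoning | White rust |
|---|---|---|---|---|
| Bacterial soft rot | 58 | 0 | 8 | 0 |
| Black rot | 0 | 65 | 0 | 1 |
| Buttoning | 0 | 7 | 59 | 0 |
| White rust | 0 | 0 | 0 | 66 |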
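The entries of Table 3 can be turned into overall accuracy and per-class precision and recall with a short script. This is a sketch for checking the arithmetic, with rows taken as actual classes and columns as predicted classes; the function name `summarize_confusion_matrix` is hypothetical, and the code assumes every class is predicted at least once.

```python
def summarize_confusion_matrix(matrix):
    """Return overall accuracy and a (precision, recall) pair for each class."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    per_class = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                        # rest of the actual row
        fp = sum(matrix[r][k] for r in range(n)) - tp   # rest of the predicted column
        per_class.append((tp / (tp + fp), tp / (tp + fn)))
    return correct / total, per_class


labels = ["Bacterial soft rot", "Black rot", "Buttoning", "White rust"]
table3 = [
    [58, 0, 8, 0],
    [0, 65, 0, 1],
    [0, 7, 59, 0],
    [0, 0, 0, 66],
]
accuracy, per_class = summarize_confusion_matrix(table3)
print(f"accuracy: {accuracy:.4f}")
for label, (precision, recall) in zip(labels, per_class):
    print(f"{label}: precision={precision:.4f}, recall={recall:.4f}")
```

The diagonal sums to 248 out of 264 test images, so the accuracy evaluates to 248 / 264, about 0.939, in line with the InceptionV3 row of Table 2. The off-diagonal entries show that the only confusions of any size are between bacterial soft rot and buttoning (8 images) and between buttoning and black rot (7 images).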
Table 4 Comparative analysis of different disease identification techniques

| References | Plant disease | No. of images in the dataset | No. of classes | Classifier | Accuracy (%) |
|---|---|---|---|---|---|
| Proposed model | Cauliflower | 2500 | 4 | CNN | 93.93 |
| Taki et al. [27] | Cauliflower | 500 | 5 | Random forest | 81.68 |
| Ghyar and Birajdar [28] | Rice leaf | 120 | 3 | SVM, ANN | 92.5, 87.5 |
| Habib et al. [29] | Jackfruit | 480 | 5 | Random forest | 90 |
| Goncharov et al. [30] | Grape leaf | 200 | 3 | Siamese | 90 |
5 Conclusion
In this work, different convolutional neural network models, combined with transfer learning, have been used to classify four common diseases of cauliflower: bacterial soft rot, black rot, buttoning, and white rust. InceptionV3 shows the highest performance, with an accuracy of 93.90%. In addition, we have compared our method with a recently reported method. We hope this technique will give farmers a feasible, effective, and time-saving way of identifying diseases and enhancing crop yields.
References
1. Biodatabase.de. [Online]. Available: http://www.biodatabase.de/Cauliflower. Accessed: 10 Apr 2021
2. Cauliflower. Whfoods.org. [Online]. Available: http://whfoods.org/genpage.php?tname=foodspice&dbid=13. Accessed: 10 Apr 2021
3. Hasan MR, Mutatisse AA, Nakamoto E, Bai H (2014) Profitability of cauliflower and bean production in Bangladesh: a case study in three districts. Bangladesh J Extens Educ ISSN 1011:3916
4. Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
5. Russakovsky O et al (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
6. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition
7. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
8. Hargrave M (2021) Deep learning. Investopedia.com, 06 Apr 2021. Available: https://www.investopedia.com/terms/d/deep-learning.asp#:~:text=Deeplearning is an AI, is both unstructured and unlabeled. Accessed: 10 Apr 2021
9. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
10. Jayamala KP, Kumar R (2011) Advances in image processing for detection of plant diseases. J Adv Bioinf Appl Res 2(2):135–141
11. Dhakal A, Shakya S (2018) Image-based plant disease detection with deep learning. Int J Comput Trends Technol 61(2)
12. Too EC, Yujian L, Njuki S, Yingchun L (2019) A comparative study of fine-tuning deep learning models for plant disease identification. Comput Electron Agric 161:272–279
13. Afifi A, Alhumam A, Abdelwahab A (2020) Convolutional neural network for automatic identification of plant diseases with limited data. Plants 10(1):28
14. Islam M, Dinh A, Wahid K, Bhowmik P (2017) Detection of potato diseases using image segmentation and multiclass support vector machine. In: 2017 IEEE 30th Canadian conference on electrical and computer engineering (CCECE)
15. Athanikar G, Badar MP (2016) Potato leaf diseases detection and classification system. Int J Comput Sci Mobile Comput 5(2):76–88
16. Wang H, Li G, Ma Z, Li X (2012) Application of neural networks to image recognition of plant diseases. In: 2012 International conference on systems and informatics (ICSAI2012)
17. Samanta D, Chaudhury PP, Ghosh A (2012) Scab diseases detection of potato using image processing 3:109–113
18. Jhuria M, Kumar A, Borse R (2013) Image processing for smart farming: Detection of disease and fruit grading. In: 2013 IEEE second international conference on image information processing (ICIIP-2013)
19. Fuentes A, Yoon S, Kim S, Park D (2017) A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors (Basel) 17(9):2022
20. Gutierrez A, Ansuategi A, Susperregi L, Tubío C, Rankić I, Lenža L (2019) A benchmarking of learning strategies for pest detection and identification on tomato plants for autonomous scouting robots using internal databases. J Sens 2019:1–15
21. Ramcharan A et al (2019) A mobile-based deep learning model for cassava disease diagnosis. Front Plant Sci 10:272
22. Reynolds D (2021) Gaussian mixture models. Bit.ly. [Online]. Available: https://bit.ly/3tKCIBq. Accessed: 10 Apr 2021
23. Wikipedia contributors, Convolutional neural network. Wikipedia, The Free Encyclopedia, 01 Apr 2021. [Online]. Available: https://en.wikipedia.org/w/index.php?title=Convolutional_neural_network&oldid=1015376106. Accessed: 10 Apr 2021
24. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv [cs.CV]
25. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR)
26. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2015) Rethinking the inception architecture for computer vision. arXiv [cs.CV]
27. Taki SS, Maria SK (2020) Computer vision approach for cauliflower disease recognition
28. Ghyar BS, Birajdar GK (2017) Computer vision based approach to detect rice leaf diseases using texture and color descriptors. In: 2017 International conference on inventive computing and informatics (ICICI)
29. Habib MT, Mia MJ, Uddin MS, Ahmed F (2020) An in-depth exploration of automated jackfruit disease recognition. J King Saud Univ Comput Inf Sci
30. Goncharov P, Ososkov G, Nechaevskiy A, Uzhinskiy A, Nestsiarenia I (2019) Disease detection on the plant leaves by deep learning. In: Advances in neural computation, machine learning, and cognitive research II. Springer International Publishing, Cham, pp 151–159
Chapter 12
An Intelligent System for Crop Disease Identification and Dispersion Forecasting in Sri Lanka Janaka L. Wijekoon, Dasuni Nawinna, Erandika Gamage, Yashodha Samarawickrama, Ryan Miriyagalla, Dharatha Rathnaweera, and Lashan Liyanage
1 Introduction
In Sri Lanka, the agriculture sector contributes approximately 6.9% to the national GDP, and approximately 25% of Sri Lankans are employed in the agricultural industry [1]. Based on rainfall distribution, Sri Lanka has traditionally been classified into three climatic zones: the wet zone, the dry zone, and the intermediate zone. The country is divided mainly into 24 agro-ecological regions, which are further divided into 46 sub-regions [2]. Recent studies indicate that potato covers the largest extent of land in the upcountry intermediate zone and that the other dominant vegetables are beans, carrot, and tomato [3, 4].

J. L. Wijekoon (B) · D. Nawinna · E. Gamage · Y. Samarawickrama · R. Miriyagalla · D. Rathnaweera · L. Liyanage
Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe 10115, Sri Lanka

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_12
A number of challenges stand between farmers and profitability, e.g., the increasing cost of production and climate change. Moreover, the literature on plant disease and plant pathology reveals that potato, tomato, and bean plants are frequently affected by fungal diseases. Variation in factors such as seasonal changes, environmental conditions, and the spread of pathogens makes these plants more vulnerable to infection [5], leading to financial losses for farmers. To manage these conditions effectively, advanced ICT technologies such as computer vision and machine learning (CV-ML) can be used to identify the prominent diseases that affect potatoes, tomatoes, and beans. Studies [4, 6–8] show that such technologies provide promising results. To this end, this chapter presents the design, development, and validation of an intelligent system, comprising both a mobile and a Web platform, that facilitates the early detection of various diseases and predicts their dispersion patterns. A mobile application is developed to upload an image of a leaf, from which the system identifies the crop type, whether the leaf is healthy or diseased, the type of the disease, the stage of the disease, and the propagation of the disease. The identification and dispersion pattern of the diseases are visualized on a GIS-based dense map that highlights areas potentially vulnerable to infection. The study was carried out using three cash crops in Sri Lanka: tomato, potato, and beans. The chapter is organized as follows. First, the background literature on CV-ML applications in plant disease assessment is summarized. Second, the methodology section presents the use of CV-ML techniques to identify the disease accurately by analysing images of potato leaves and to predict the disease dispersion pattern. Third, the results and discussion section presents the findings of the study and interprets the results of the ML models. Finally, the chapter concludes with a summary of the study.
2 Background Study
In [3], the Department of Agriculture shows that potato (Solanum tuberosum) is the most preferred crop in the upcountry due to its high net return. Potato plants prefer temperate, tropical, and subtropical conditions for growth. The optimum potato harvest is obtained by maintaining a mean daily temperature within the range of 18–20 °C [9]. At present, potato is cultivated in two major seasons: 'Yala' (February–July) and 'Maha' (August–December). Tomato (Solanum lycopersicum) is also a crop with a high commercial value and is currently cultivated in many parts of the wet, dry, and intermediate zones of Sri Lanka [10]. It is well known as a warm-season crop and is highly affected by climatic changes [11]. Sources claim that tomato is cultivated in both the 'Yala' and 'Maha' seasons and grows best at an altitude of 1300 m above sea level where the annual rainfall is not less than 2000 mm [12]. Figure 1 depicts the growth of tomato production in Sri Lanka over past years. Similarly, bean (Phaseolus vulgaris) plants also provide a promising yield for bean farmers [13]. Beans are mainly categorized into two types: bush beans and pole beans. They are also categorized by their pods, for example green beans and runner beans. For cultivation, a sunny spot with well-drained soil containing compost or organic manure is generally preferred. Normally, these plants require 6–8 h of direct sunlight throughout the growth cycle to grow into healthy plants.

Fig. 1 Tomato production growth in Sri Lanka within past years [14]

A plant disease is a condition that can inhibit the vital functions of the infected plant [5]. Such diseases can be caused by various factors, such as seasonal changes, the presence of the pathogen, environmental conditions, and the existence of large numbers of the same species or variety with a uniform genetic background grown close together. Moreover, a plant is said to be diseased when it is continuously disturbed by a certain agent, either infectious or non-infectious [5]. An infectious agent can be a fungus, bacterium, mycoplasma, or virus. The most important factor is the ability of these agents to reproduce or replicate themselves within the host plant and then spread to nearby less resistant plants; a plant can also host more than one pathogen at the same time [5]. Several dominant diseases that infect potato, tomato, and beans are listed in Table 1.

Table 1 Summary of the diseases that infected potato, tomato, and beans

| Disease | Summary | Affected plant |
|---|---|---|
| Late blight (Phytophthora infestans) | Caused by a microorganism that is an Oomycete or water mould fungus [15] | Potato, tomato |
| Early blight (Alternaria solani) | Early blight is caused by a fungus called Alternaria solani, and the disease is primarily recognized by its observed symptoms [16]. Once a spore lands on a leaf, under ideal conditions, it starts germinating and penetrates the epidermal cells through the leaf surface [17] | Potato, tomato |
| Target spot (Corynespora cassiicola) | Target spot, a fungal disease caused by Corynespora cassiicola, shows symptoms at the early stage that are similar to bacterial spot and early blight [18–20] | Tomato |
| Septoria leaf spot (Septoria lycopersici) | It is caused by a fungus, Septoria lycopersici, and is mostly found in the debris of fallen tomato plants. Symptoms initially appear on older leaves of the plant [21] | Tomato |
| Rust (Uromyces appendiculatus) | Rust is a fungal disease, caused by Uromyces appendiculatus, that is found in most varieties of bean plants. The initial symptoms show the leaves changing their colour to a mix of yellow, brown, and red [22] | Beans |

According to Patil and Kumar [4], agriculture practitioners have been following a manual procedure to diagnose plant diseases. It is proposed that this process be replaced with computer-oriented techniques such as image processing and machine learning to achieve speed and accuracy. The image of the infected area is passed through trained machine learning models to identify the disease. To increase the accuracy of this process, different image pre-processing techniques such as image clipping, noise reduction, and thresholding were proposed. For future developments, the authors suggested that hybrid algorithms such as genetic algorithms and neural networks may increase the recognition rate of the final classification process [4]. Highlighting the importance of accuracy in leaf disease identification, Sujatha et al. [6] discussed how MATLAB can be used for image processing by considering diseases that infect cotton leaves, which has been considered here as well. For identifying diseases, the K-means clustering algorithm and the support vector machine (SVM) were used for segmentation and detection [6]. According to Jaware et al. [7], the K-means clustering technique is mostly known for its ability to deal with low-level image segmentation. This highly efficient algorithm scales to larger data sets, but its biggest drawback is that it can be influenced by noise [7]. Further, Varshney and Dalal [8] discussed how digital image processing can be used for plant disease identification and highlighted how a plant leaf image can be segmented based on edge detection. In this process, the input image is taken as an RGB model, and the feature values are then calculated and classified using different image analysis techniques [8]. As discussed by Ferentinos [23], different CNN models can be used for image classification in leaf disease diagnosis. To improve the accuracy and validity of the results, these CNN models were trained with a data set of images taken both inside laboratories and in real open fields. The most successful CNN model recorded so far is VGG, which achieved an accuracy of 99.53%. It is also discussed that CNNs are highly suitable for the automated detection and diagnosis of plant diseases through the analysis of simple leaf images.
Furthermore, the authors of [23] stress that a neural model must be able to run with low computational power so that it can be integrated into mobile applications. As future work, they suggest including images that cover different geographies, cultivars, and image-capturing techniques to improve the accuracy of the final result [23]. The study in [3] used AlexNet and VGG16 as deep learning architectures to classify images of tomato plants. It found that parameters such as mini-batch size and bias learning rate strongly affect classification accuracy. An accuracy of 97.29% was achieved with 13,262 images using the VGG16 architecture. In addition to the studies discussed above, Table 2 summarizes the available solutions for plant disease diagnosis, comparing the technologies used and the types of plants considered.
3 Research Methodology

The methodology of the research can be summarized as follows: problem identification, solution design, background study of plant pathology and neural models, data gathering for the image pool, training the neural model, integration with the mobile application, and finally, module testing and verification. As stated previously, a feasibility study was conducted with several farmers from Bandarawela, Sri Lanka, to identify the need for an intelligent system that detects diseases at an early stage. The same study was used to identify the prominent types of fungal diseases in Sri Lanka that commonly infect potatoes, tomatoes, and beans. Based on the farmers’ feedback, the final product of this study was designed as a smartphone application. Moreover, as part of the solution, a GIS-based visualization feature was integrated to display possible areas of disease propagation to multiple stakeholders such as industrial users, farmers, and related government organizations.
3.1 Solution Design

Figure 2 illustrates the overall system diagram. The smartphone was identified as the most effective means of interacting with the system. The system functions were designed around two scenarios. In the first, users have Internet access and can use real-time disease detection on cloud infrastructure. In the second, users have no Internet access, and disease identification is done on the mobile device, which is less accurate and slower. Consequently, the mobile application allows a user to select the user type and then either capture an image with the phone camera or select an image from the gallery to upload. Then, the uploaded image will be subjected to diagnosis
Table 2 Comparison between available solutions

Research/product | Author | Types of plants discussed | Technologies/techniques used | Conceptual/implemented
Advances in image processing for detection of plant diseases [4] | Patil | NA | Image clipping | Implemented
Leaf disease detection using image processing [6] | Sujatha | NA | MATLAB, SVM | Implemented
Crop disease detection using image segmentation [7] | Jaware | NA | K-means clustering | Conceptual
Plant disease prediction using image processing techniques [8] | Varshney | NA | Edge detection | Conceptual
Deep learning models for plant disease detection and diagnosis [23] | Ferentinos | NA | VGG 16 | Implemented
Tomato crop disease classification using pre-trained deep learning algorithm [3] | Rangarajan | Tomato | AlexNet, VGG 16 | Implemented
Plantix [24] | https://plantix.net/en/ | Bean, maize, chilli, tomato | Image processing | Implemented
PlantNet [25] | https://play.google.com/store/apps/details?id=org.plantnet&hl=en | Grasses, conifers, ferns and cacti | Image processing | Implemented
Agrobase [26] | https://play.google.com/store/apps/details?id=lt.farmis.apps.farmiscatalog&hl=en | Rice, wheat, barley, soybeans | Image processing | Implemented

Fig. 2 High-level system diagram
using the AI engine implemented in the cloud or the mobile application. Based on the diagnosis results, treatments will be recommended. The recommendations are given based on the information provided by the government agricultural authorities.
3.2 Selection of Neural Network Models

The following neural models were considered for implementing the system. As discussed in the background study, it was decided to use a CNN to identify the plant type and disease type and a mask region-based convolutional neural network (Mask R-CNN) to identify the disease progression level. Although these choices were supported by theory, they were also tested in practice, as one of the objectives of this study, to understand their behaviour in the Sri Lankan context. To test the accuracy and behaviour of CNNs, a Google Inception V3 model was obtained from the Internet. It was trained using different test data sets and was then retrained with a flower data set until the actual data set for the study was prepared. After observing the results and the confidence level of the neural model, Inception V3 was confirmed for disease diagnosis, that is, for identifying the plant type and the disease type. A more complex deep learning model was needed to classify a plant image by disease progression level. A pre-trained ResNet101 model was used, with a balloon data set for retraining. The aim was to check whether the neural model could separate the balloon from the background. Based on the results, it was decided to continue with ResNet101 as the neural model for identifying the disease progression level.
3.3 Data Gathering for the Image Pools

As stated previously, identifying the disease and its propagation level requires two neural models, each of which classifies an image based on the image data pool it was trained on. The images for each disease type for tomato, potato, and beans were collected using three methods:

1. using online sources to download image data,
2. manually capturing images during field visits,
3. planting tomato and potato crops for observations.

A considerable number of leaf images were collected for the data set, including healthy and diseased leaf images, as depicted in Figs. 3 and 4.
Fig. 3 Data set collected for ‘Bean Healthy’
Fig. 4 Data set collected for ‘Bean Rust’
3.4 Implement the Neural and Transfer Learning Models

When training a CNN, each image passes through a consecutive series of convolutional layers with kernels. After pooling, the Softmax function is applied during classification to assign the image a probabilistic value between 0
Fig. 5 Snapshot of the functionality of a CNN [27]
and 1. Figure 5 represents the complete flow of a CNN in processing an input image and assigning it to one of the classes defined when the neural model was trained [27]. Convolution is where feature extraction first happens. The convolutional layer preserves the relationship between pixels by learning image features from small squares of input data, using a mathematical operation that combines the image matrix with a filter (kernel) [27]. The number of pixels by which the filter shifts over the input matrix is known as the ‘stride’. With a stride of 1, the filter moves 1 pixel at a time; with a stride of 2, it moves 2 pixels at a time [27]. Transfer learning means taking a machine learning model that was previously trained on one data set and training it further on a new data set for a different but similar problem. In this study, an Inception V3 model was cloned from an online repository to the local computer. The system was implemented using Python and the Django framework. To set up the local environment, the necessary dependencies, such as ‘qtconsole 4.4.3’, ‘scikit-image 0.15.0’, ‘scikit-learn 0.21.1’, and ‘scipy 1.3.0’, were installed. Once the configuration was complete, the image data set was fed into the neural model for training. During training, it was observed that the model started extracting details from the new data set. The neural model then successfully identified and classified the diseased and healthy leaves. Moreover, it also performed a cross-entropy validation on the data set during the training process.
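The convolution, stride, Softmax, and cross-entropy steps described above can be illustrated with a minimal pure-Python sketch. It is not the Inception V3 implementation used in the study; the toy image, the filter values, the class scores, and all function names are assumptions chosen only to make the mechanics visible.

```python
import math

def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (img_h - k_h) // stride + 1
    out_w = (img_w - k_w) // stride + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the small square of pixels under the filter.
            acc = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Loss for one sample: negative log-probability of the correct class."""
    return -math.log(probs[true_index])

# Toy 5x5 "image" and a 3x3 vertical-edge filter (illustrative values only).
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0, -1.0] for _ in range(3)]

map_stride_1 = conv2d(image, kernel, stride=1)  # 3x3 feature map
map_stride_2 = conv2d(image, kernel, stride=2)  # 2x2 feature map

probs = softmax([2.0, 1.0, 0.1])  # made-up scores for three classes
loss = cross_entropy(probs, true_index=0)
```

With a 5 × 5 input and a 3 × 3 filter, a stride of 1 gives a 3 × 3 feature map and a stride of 2 gives a 2 × 2 one, which shows how a larger stride shrinks the output.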
3.5 Identification of Disease Progression

Table 3 summarizes the diseases and their symptoms which were identified during the study. As shown in Table 3, all the diseases considered in this study, such as anthracnose, blights, and mildews, produce a considerable change in the leaf surface. Therefore, it is important to use a technique that can distinguish these spots from the leaf surface. Once a leaf had been classified as diseased, further classification was performed to diagnose the type of disease the plant has been
Table 3 Analysis of symptoms

Disease name | Name of pathogen | Symptoms
Late blight | Phytophthora infestans | Purplish-brown lesions on the surface of tubers; blackish water-soaked lesions occur on the leaves
Early blight | Alternaria solani | Circular lesions up to 1/2 in diameter are produced; damping-off, collar rot, stem cankers, leaf blight, and fruit rot
Anthracnose | Colletotrichum lindemuthianum | On leaves as small and irregular yellow, brown, dark brown, or black spots [15]
Rust | Pucciniales | White, slightly raised spots on the undersides of leaves and the stems [13]; leaf pustules may turn yellow–green and eventually black; severe infestations will deform and yellow leaves and cause leaf drop [13]
Powdery mildew | Sphaerotheca fuliginea | White powdery spots on the leaves and stems [13]
Downy mildew | Hyaloperonospora brassicae | Purplish dark grey spores on leaf surface [13]
infected with, using a region-based convolutional neural network (R-CNN) to identify the stage of the disease, as illustrated in Fig. 6. Semantic segmentation helps to distinguish the disease spots from the leaf surface, as illustrated in Fig. 7. It recognizes what is in the image at pixel level and models the relationship between pixels, i.e. nearby pixels are more likely to share a label, as are pixels of similar colour. This is a high-level technique used for ‘Complete scene understanding’. This study therefore used the concept of ‘Complete scene understanding’ as a key element in diagnosing the propagation of leaf diseases. Finally, depending on the stage of the disease, the mobile application recommends a few early precautions or remedies; it is assumed that the relevant authorities will verify the answers predicted by the implemented AI. The disease level for ‘Late blight’ was classified using the data provided by the Department of Agriculture, Sri Lanka, which is given in Fig. 8. To improve the accuracy of the system, it was suggested that users upload more than one image and that the methodology be optimized to consider all uploaded images, producing the classification from the highest progression level found among them. For this, a new method had to be built that considers not only the image data but also background data entered by the user based on visual observation. A new architecture was introduced, as illustrated in Fig. 9, by modifying the existing architecture, which supported only a single image. In this architecture, images are sent to the server through Amazon Web Services (AWS). The mobile app has been connected to a Simple Storage Service Bucket (S3
Fig. 6 Image classification stages used for diseases propagation
Fig. 7 Results after applying semantic segmentation on a plant leaf [28]
buckets) in AWS. The images selected by the user are sent to this bucket using two image caching libraries, i.e. HTTP Client Mime and Glide, which fetch and send the images serially to the server. As illustrated in Fig. 10, the uploaded images are stored as a stack that follows the last in, first out (LIFO) principle. The method first determined the highest progression
Fig. 8 Grid to classify the late blight progression
Fig. 9 The method followed to implement multiple image support
level found in the multiple images uploaded and then refined the result by considering the background data. The final output is one of four stages, namely 1, 2, 3, and 4, where Stage 1 is the earliest stage of the disease and Stage 4 is the most critical. The in-between stages are determined by the method from different combinations of the two inputs, as illustrated in Fig. 11.
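The multiple-image logic described above can be sketched as follows. This is only an illustration under stated assumptions: the per-image stages stand in for the output of the progression model, and the small lookup table is a hypothetical stand-in for the reference table of Fig. 11, whose actual combinations are not reproduced in the text. All names and values are invented for the example.

```python
def highest_stage(image_stack, classify):
    """Pop images off a LIFO stack and return the highest stage found."""
    highest = 0
    while image_stack:
        image = image_stack.pop()  # last in, first out
        highest = max(highest, classify(image))
    return highest

def final_stage(image_stage, observation, reference_table):
    """Look up the final stage; fall back to the image-based stage."""
    return reference_table.get((image_stage, observation), image_stage)

# Hypothetical per-image results standing in for the progression model.
fake_predictions = {"leaf_a.jpg": 1, "leaf_b.jpg": 3, "leaf_c.jpg": 2}

# Hypothetical excerpt of a reference table (not the real Fig. 11 contents).
reference_table = {(3, "spreading fast"): 4, (1, "no visible change"): 1}

uploads = ["leaf_a.jpg", "leaf_b.jpg", "leaf_c.jpg"]  # pushed in upload order
stage_from_images = highest_stage(list(uploads), fake_predictions.get)
stage = final_stage(stage_from_images, "spreading fast", reference_table)
```

Taking the maximum over the stack means the order in which images are popped does not affect the result, so the LIFO storage only determines processing order.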
Fig. 10 Illustration of detecting the highest propagated disease using multiple images
Fig. 11 Reference table for detecting the disease progression level using multiple images
3.6 Visualization of Disease Propagation

It was identified that the disease spores of different types of crops take different periods to disperse, depending on various environmental factors. A model has been developed to visualize the dispersion of tomato late blight according to the environmental factors in Bandarawela, Sri Lanka. Once a disease has been identified, a GIS-based dense map is used to visualize the dispersion of the disease. The forecasting was done by taking past wind data as the main parameter while assuming rain and cloudiness to be constant. As the outcome of this component, a mobile application that can be used by industrial professionals has been presented
where the user can get the predicted dispersion patterns based on the location and real-time environmental conditions. Two experiments were conducted in a controlled environment to observe and monitor the propagation patterns of pathogens, the time taken in a controlled instance, and the aggressiveness of the propagation levels. The first experiment simulated pathogen dispersion in a controlled environment with near-constant environmental factors. The second experiment was conducted while varying environmental factors such as humidity, soil moisture, and wind. Microscopic images from the different simulated scenarios were further analysed to produce an experimental data set. These two experiments yielded the historic data set used for the predictions in the next step. The correlation among the factors affecting dispersion (humidity, soil moisture, and wind) was tested using a linear regression model. As the next step, the Gaussian plume model was used to predict the atmospheric dispersion of particles [7]. The Gaussian plume model is an atmospheric dispersion model widely used for estimating and predicting airborne exposure to a pollutant; it assumes that atmospheric turbulence is both stationary and homogeneous. An extensive validation and review of the model is given by Miller and Hively [8]. In the Gaussian plume model, the concentration of pollution from a source is treated as spreading outward from the centreline of the plume following a normal statistical distribution, with the plume spreading in both the horizontal and vertical directions. Using the data sets gathered from the experiments described above, a neural model was trained. The model was then used to implement the mobile application.
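For readers unfamiliar with the Gaussian plume model, the sketch below evaluates its standard steady-state form for a continuous point source with ground reflection. It is a textbook formulation, not the authors’ trained model. The function names are invented, and the power-law coefficients in `illustrative_sigmas` are uncalibrated assumptions, not values fitted to the Bandarawela experiments.

```python
import math

def plume_concentration(q, u, y, z, h, sigma_y, sigma_z):
    """Concentration at crosswind offset y and height z.

    q: emission rate, u: wind speed, h: effective release height,
    sigma_y, sigma_z: horizontal and vertical spread at the chosen
    downwind distance (standard deviations of the normal distribution).
    """
    lateral = math.exp(-y**2 / (2 * sigma_y**2))
    vertical = (math.exp(-(z - h)**2 / (2 * sigma_z**2))
                + math.exp(-(z + h)**2 / (2 * sigma_z**2)))  # ground reflection
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical

def illustrative_sigmas(x):
    """Hypothetical power-law growth of plume spread with downwind distance x (m)."""
    return 0.08 * x**0.9, 0.06 * x**0.85

sigma_y, sigma_z = illustrative_sigmas(100.0)
centreline = plume_concentration(1.0, 2.0, 0.0, 0.0, 1.0, sigma_y, sigma_z)
off_axis = plume_concentration(1.0, 2.0, 10.0, 0.0, 1.0, sigma_y, sigma_z)
stronger_wind = plume_concentration(1.0, 4.0, 0.0, 0.0, 1.0, sigma_y, sigma_z)
```

In this form the concentration is highest on the plume centreline, is symmetric in the crosswind direction, and is inversely proportional to wind speed, which is consistent with wind being the main forecasting parameter in this section.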
The mobile application consisted of the following functions:

• Identification of the location of the user (location of the diseased crop) using GPS geotagging
• Forecasting the dispersion of disease spores based on the current environmental factors (humidity, soil moisture, and wind)
• Visualizing the dispersion using dense maps.

Figure 12 illustrates the concentration map of disease dispersion produced by the developed mobile application, and Fig. 13 shows the GIS-based dense map. In addition, farmers registered with the application are notified by SMS about the threat they are about to face.
4 Results and Discussion

A personal computer (Intel i5 processor, Intel HD GPU, 8 GB RAM) was used for initial model training and testing on localhost, while an AWS server (instance type t3.medium, with 2 vCPUs and 8 GB RAM) was used for testing in the production environment. Then, testing was conducted for various test cases, using potato, beans,
Fig. 12 Concentration map of disease distribution based on the wind direction
and tomato with various diseases, and the results are given in Table 4. The test results show that the implemented model has promising results. More than 10,000 images were used to train the system. For each test scenario, the authors selected a particular leaf and asked an expert in the agricultural domain, from Kahagolla Agricultural Research Institute, Sri Lanka, to identify the disease and its propagation level. The same image was then fed to the implemented mobile application, and the application’s results were cross-checked against the expert’s; the comparison is given in Table 4.
5 Conclusion

The plant disease identification and dispersion forecasting system presented here has two main functionalities: detecting plant diseases using multiple images and visualizing disease dispersion. The system was implemented as a mobile application that lets users simply upload an image and receive a diagnosis comprising the name of the disease, information about the disease, the level of progression, and recommended treatment options. The mobile application
Fig. 13 Overlay map of disease distribution based on the wind direction

Table 4 Testing conducted for various test cases using potato, beans and tomato

Test case | Crop name | Disease | Accuracy (%)
Identification of healthy potato leaves | Potato | Healthy | 95
Identification of healthy bean leaves | Bean | Healthy | 97
Identification of tomato leaves infected with late blight | Tomato | Potato late blight | 42
Identification of potato leaves infected with late blight | Potato | Potato late blight | 54
Identification of potato leaves infected with early blight | Potato | Potato early blight | 87
Identification of bean leaves infected with rust | Bean | Bean rust | 92
Identification of tomato leaves infected with Septoria leaf spot | Tomato | Tomato Septoria leaf spot | 52
Identification of tomato leaves infected with target spot | Tomato | Tomato target spot | 59
also enables the user to receive a risk level assessment of the probability of the disease dispersing from the user’s geographical location to other off-site destinations, including a graphical representation of the forecast visualized through a GIS-based map. The system has been implemented and tested for the detection of fungal diseases found in potato, tomato, and bean plants with an accuracy ranging from 90 to 94%. One major direction for future work is to use thermal imaging technologies to identify diseases before they produce visible symptoms on the leaf surface. Given that agriculture is one of the main contributors to the national economy and involves many stakeholders, ranging from farmers to national-level policymakers, this study makes a significant contribution to Sri Lanka.

Acknowledgements This study was supported by the Sri Lanka Institute of Information Technology (SLIIT), Department of Agriculture (DoA-SL) Sri Lanka and Kahagolla Agrarian Research Institute, Kahagolla, Diyathalawa, Sri Lanka.
References

1. (2019) Agriculture home. In: Statistics.gov.lk. http://www.statistics.gov.lk/agriculture/. Accessed 30 Mar 2019
2. (2019) Sri Lanka biodiversity clearing house mechanism, agro ecological regions. In: lk/chmcbd.net. http://lk.chm-cbd.net/?page_id=176. Accessed 15 Sept 2019
3. (2019) Potato production and marketing in Sri Lanka. In: UKEssays.com. https://www.ukessays.com/essays/economics/potato-production-and-marketing-in-sri-lanka-economics-essay.php. Accessed 9 Aug 2019
4. Patil JK, Kumar R (2011) Advances in image processing for detection of plant diseases. J Adv Bioinform Appl Res 2(2):135–141. ISSN 0976-2604
5. (2019) Plant disease | Importance, types, transmission, & control. In: Encyclopedia Britannica. https://www.britannica.com/science/plant-disease#ref63296. Accessed 14 Jan 2019
6. Sujatha R, Sravan Kumar Y, Akhil GU (2017) Leaf disease detection using image processing. J Chem Pharm Sci
7. Jaware TH, Badgujar RD, Patil PG (2012) Crop disease detection using image segmentation. World J Sci Technol 2(4):190–194. ISSN 2231-2587
8. Varshney S, Dalal T (2016) Plant disease prediction using image processing techniques—a review. Int J Comput Sci Mob Comput 5(5):394–398
9. (2019) The potato: cultivation—international year of the potato 2008. In: Fao.org. http://www.fao.org/potato-2008/en/potato/cultivation.html. Accessed 13 Aug 2019
10. Felix ED, Mahendran T (2011) Physico-chemical properties of mature green tomatoes (Lycopersicon esculentum) coated with pectin during storage and ripening. Trop Agric Res Ext 12(2):110–112. https://doi.org/10.4038/tare.v12i2.2800
11. (2019) Agronomic principles in tomato production | Yara United States. In: Yara United States. https://www.yara.us/crop-nutrition/tomato/agronomic-principles-in-tomato-production/. Accessed 10 Aug 2019
12. (2019) A toast for tomatoes! In: Daily News. http://www.dailynews.lk/2017/01/30/features/106060/toast-tomatoes. Accessed 15 Aug 2019
13. (2019) Growing beans from sowing to harvest. In: GrowVeg. https://www.growveg.com/guides/growing-beans-from-sowing-to-harvest/. Accessed 15 Sept 2019
14. (2019) Statistics, country and market data & more | Tilasto. In: Factfish.com. http://www.factfish.com. Accessed 15 Sept 2019
15. (2019) Phytophthora infestans—Wikipedia. In: En.wikipedia.org. https://en.wikipedia.org/wiki/Phytophthora_infestans. Accessed 15 Sept 2019
16. (2019) Early blight of potato and tomato. https://www.apsnet.org/edcenter/disandpath/fungalasco/pdlessons/Pages/PotatoTomato.aspx. Accessed 15 Sept 2019
17. (2019) Early blight in potato—publications. In: Ag.ndsu.edu. https://www.ag.ndsu.edu/publications/crops/early-blight-in-potato. Accessed 15 Sept 2019
18. (2019) http://203.64.245.61/web_crops/tomato/target.pdf. Accessed 15 Sept 2019
19. Jackson G (2019) Tomato target spot (163). In: Pestnet.org. http://www.pestnet.org/fact_sheets/tomato_target_spot_163.htm. Accessed 15 Sept 2019
20. Tomatoes T (2019) Identifying target spot of tomato: information on target spot tomato treatment. In: Gardening Know How. https://www.gardeningknowhow.com/edible/vegetables/tomato/target-spot-on-tomatoes.htm. Accessed 15 Sept 2019
21. (2019) Save your tomatoes from septoria leaf spot. The Spruce. https://www.thespruce.com/identifying-and-controlling-septoria-leaf-spot-of-tomato-1402974. Accessed 15 Sept 2019
22. (2019) How to get rid of rust on beans. In: Home Guides | SF Gate. https://homeguides.sfgate.com/rid-rust-beans-28584.html. Accessed 15 Sept 2019
23. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318. https://doi.org/10.1016/j.compag.2018.01.009. ISSN 0168-1699
24. (2019) Plantix | Best agriculture app. In: plantix.net. https://plantix.net/en/. Accessed 15 Sept 2019
25. (2019) PlantNet plant identification—apps on Google Play. In: play.google.com. https://play.google.com/store/apps/details?id=org.plantnet&hl=en. Accessed 15 Sept 2019
26. (2019) Agrobase—weed, disease, insect—apps on Google Play. In: play.google.com. https://play.google.com/store/apps/details?id=lt.farmis.apps.farmiscatalog&hl=en. Accessed 15 Sept 2019
27. (2019) Understanding of convolutional neural network (CNN)—deep learning. https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148. Accessed 22 Apr 2019
28. (2019) Potato early blight fact sheet. In: Vegetablemdonline.ppath.cornell.edu. http://vegetablemdonline.ppath.cornell.edu/factsheets/Potato_EarlyBlt.htm. Accessed 15 Sept 2019
Chapter 13
Apple Leaves Diseases Detection Using Deep Convolutional Neural Networks and Transfer Learning

Shivam Kejriwal, Devika Patadia, and Vinaya Sawant
1 Introduction

There are about 7500 different varieties of apples. While most are consumed fresh locally, some are extremely important economically as commercial products. Hence, the quality of apples is essential in the fruit industry. Unfortunately, the quality of apples can be seriously compromised by various diseases that affect apple trees. Some of the most prevalent apple tree diseases are caused by fungi. Foliar diseases often appear as lesions first on the undersides of the leaves but can also develop on the upper leaf surfaces and the fruit. As fungi spread from infected plants to healthy ones via airborne spores, prevention is crucial. It involves providing proper spacing, improving soil drainage, and removing infected plant parts as soon as they are found. Identifying the most widespread apple tree illnesses is the first step toward mitigating the problems. Detecting defects in fruits at an early stage can help reduce further infection spreading to other parts of the fruit and economic losses in agricultural industries. The present disease diagnostic technique in apple orchards is based on manual scouting, also known as field scouting, the primary practice of traveling across a crop field and frequently stopping for observations. Shipping potentially infected samples with visible symptoms to faraway laboratories for chemical testing, as well as exchanging images with specialists for diagnosis, is part of this manual recognition procedure used by expert pathologists. Observation, sample collection, transportation, analysis, and identification can take so long during a busy growing season that solutions are quite often sent to planters too late to implement critical control measures. Also, a limited number of extension professionals are available to conduct in-person and onsite identification. It is a labor-intensive and time-consuming method that frequently leads to inaccurate diagnosis and inappropriate pesticide application.

S. Kejriwal · D. Patadia · V. Sawant (B)
Information Technology, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_13
S. Kejriwal et al.
Machine learning algorithms [1] have gained traction for solving identification and classification problems with enhanced efficiency and accuracy. Despite the introduction of many machine learning techniques to improve the overall efficiency of disease analysis in plants and crops, factors such as crop image brightness and disease variations still affect disease detection accuracy. These shortcomings can be overcome with the help of deep learning, an advanced class of machine learning algorithms that tries to mimic the human brain using numerous layers of interconnected nodes. It has recently become increasingly popular in the field of agriculture. Deep learning techniques may be applied directly to raw data in multiple forms because they automatically learn data features and tasks. Traditional machine learning, on the other hand, requires an additional preprocessing step in the form of feature extraction. Therefore, support vector machines [2], random forests [3], Bayesian networks [4], and other traditional machine learning techniques cannot be applied directly to raw data. In the case of deep learning, time is saved because no features need to be extracted in advance. Advancements in computer vision and deep learning have led to the emergence of convolutional neural networks as the models of choice for image classification problems. The convolutional neural network [5] is a deep learning model that can automatically extract features from a training dataset, bypassing the need for complex preprocessing and ensuring high performance. A simple CNN is built from two kinds of layers: convolution and pooling. Convolution involves a filter that moves along the dimensions of an image and is used for image feature extraction. Pooling consists of a window that slides over all the regions and summarizes the features within each window.
Deep CNNs [6] are deep multi-layered neural networks with a large number of parameters that outperform basic CNNs in handling large datasets, feature extraction, and performance. This chapter proposes a novel ensemble of three pre-trained deep convolutional models, namely, ResNet101V2 [7], Xception [8], and InceptionResNetV2 [9], for disease detection in apple trees. Our process for disease detection from apple tree leaves is divided into two major steps—data augmentation and classification using transfer learning. We used data augmentation to diversify the training set by applying randomized transformations. We make use of several data augmentation techniques such as flipping, rotation, zoom, translation, along with variations in image brightness, contrast, saturation, and hue. It aids in the prevention of overfitting and boosts the model’s overall performance. We used transfer learning [10] to transfer knowledge of pre-trained models and make predictions in our research. We used a simple average technique to ensemble the models to reduce variance and generalization error. Our proposed model achieved an F1-score of 0.9625 on the testing dataset. Computer vision combined with deep CNNs proves to be a viable solution for timely, efficient, and accurate detection of foliar diseases in apples. The sections ahead are laid out as follows: In Sect. 2, we discuss related works, previously proposed systems and methodologies, and their drawbacks. Then, in Sect. 3, we go over our proposed methodology in further detail. It includes details about the dataset, preprocessing and augmentation techniques, models, transfer learning, and ensembling. Section 4 presents the experimental results, evaluation
13 Apple Leaves Diseases Detection Using Deep Convolutional …
metrics, and a brief comparison and analysis of the results. Finally, Sect. 5 concludes the chapter.
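The simple-average ensembling and the F1-score mentioned in this introduction can be sketched in a few lines of plain Python. This is a minimal illustration, not the chapter’s implementation: the probability vectors are made-up stand-ins for the outputs of the three pre-trained models, and macro-averaging of the F1-score is an assumption, since the averaging scheme is not stated here.

```python
def average_ensemble(model_probs):
    """Element-wise mean of the class-probability vectors of several models."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[k] for p in model_probs) / n_models for k in range(n_classes)]

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up softmax outputs of three models for one leaf image (three classes).
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
ensemble = average_ensemble(model_probs)
predicted_class = max(range(len(ensemble)), key=ensemble.__getitem__)

# Made-up labels for six test images, used only to exercise the metric.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
score = macro_f1(y_true, y_pred)
```

Here the second model on its own would pick class 1, but the averaged probabilities favour class 0, which illustrates how averaging can smooth out the variance of individual models.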
2 Literature Review

Several traditional machine learning algorithms have been previously used for plant disease detection from images of leaves. These techniques include support vector machines [2], K-means, K-nearest neighbors [11], and random forest classification [3]. Kirti and Rajpal [2] analyzed various SVM kernels and developed their model for Black Rot disease detection in grape plants, achieving an accuracy of 94.1%. Sandika et al. [3] compared probabilistic neural network, Bayesian PNN, SVM, and random forest regression, and different texture features like local textures, LBP, and GLCM for the classification of three grape diseases, namely, Anthracnose, Powdery Mildew, and Downy Mildew. The authors achieved the best accuracy of 86% using random forest and GLCM. Islam et al. [12] used an image segmentation approach and SVM to detect potato diseases over 300 images with an accuracy of 95%. The authors analyzed the images’ color and luminosity components in the L*a*b* color space, isolated regions with disease symptoms from the surroundings, and used GLCM for statistical feature extraction. Machine learning techniques rely extensively on image preprocessing and feature extraction to extract the essential features and improve their accuracy. These approaches, however, are more suitable for images with uniform backgrounds, angles, and light conditions that do not reflect real-world scenarios in the fields. Several deep learning algorithms are used for disease classification in plants, CNN being the most prevalent. For example, Militante et al. [13] trained and tested a simple CNN with seven different classes to detect sugarcane diseases and obtained an accuracy of 95%. Jenifa et al. [6] proposed a deep CNN-based method for cotton leaf disease classification with an accuracy of 96%. Shrivastava et al.
[14] explored the performance of various pre-trained DCNN models for classifying rice plant leaf diseases into seven classes and obtained the highest accuracy of 93.11% with VGG16. Surya and Gautama [15] combined a CNN with MobileNetV2 to classify diseases in cassava leaves and obtained a validation accuracy of 74.96%. Trang et al. [16] presented a custom residual neural network (ResNet) with convolutional, normalization, and ReLU layers for identifying three common mango diseases and achieved an accuracy of 88.46%. Zhang et al. [17] proposed GoogLeNet- and Cifar10-based models to identify eight different maize leaf diseases. The GoogLeNet model achieved an average accuracy of 98.9%. The Cifar10 model was modified by adding the ReLU function and a dropout operation between the two dense layers, and it achieved an improved accuracy of 98.8%. Sardoğan et al. [18] suggested a Faster R-CNN model with the Inception v2 architecture for real-time detection of apple black spot disease, with an accuracy of 84.5%. Dai et al. [19] implemented SVM with HOG for feature extraction and compared it to transfer learning with four pre-trained models, namely, VGG16, DenseNet, ResNet,
S. Kejriwal et al.
and GoogLeNet. While SVM underperformed with an accuracy of a little over 50%, the accuracy of the transfer learning models exceeded 90%, with GoogLeNet achieving the best accuracy of 94%. Jiang et al. [20] presented INAR-SSD, a novel model that includes a pre-network and a structure for feature extraction and fusion, for the real-time classification of images in the apple leaf diseases dataset (ALDD). In addition, the authors introduced GoogLeNet Inception along with Rainbow Concatenation to detect Alternaria leaf spot, Gray spot, Mosaic, Brown spot, and Rust diseases with an mAP of 78.8% and a high detection rate of 23.13 fps.
The review above shows that several deep learning and transfer learning models have achieved high accuracy in detecting plant leaf diseases. However, the performance of many of the previously presented models was inadequate, particularly in challenging real-world scenarios. Furthermore, existing models do not fully exploit the benefits of image augmentation techniques. Based on these observations, we present our approach for detecting and classifying apple leaf diseases.
3 Proposed Methodology
This section describes the step-by-step process for developing the proposed model. Figure 1 depicts the flow of the primary processes involved. First, the dataset is pre-processed and partitioned. The training set is then augmented and used together with the validation set to train the three pre-trained models. The models are then ensembled and evaluated on the test dataset.
3.1 Dataset
The apple leaves dataset used in this study came from the Kaggle competition Plant Pathology 2021-FGVC8 [21, 22]. The original data contain roughly 23,000 high-resolution RGB images of apple leaves with foliar diseases, including a large expert-annotated disease dataset. This dataset reflects real-world scenarios: the leaf images were captured directly in apple orchards at various stages of development and at various times of day, using various camera focus settings and angles. The dataset used in this research contains 18,632 images of apple leaves that are either healthy or belong to one or more of the five disease classes: scab, rust, complex, powdery mildew, and frog eye leaf spot. The frequency distribution of these classes is shown in Fig. 2.
Healthy. Figure 3 shows a healthy leaf. Healthy apple tree leaves are green and spotless, with no signs of disease. Healthy is the second most frequent class in the dataset, accounting for about 24.8% of the total images.
Fig. 1 Flow of the proposed approach
Scab. Leaves affected by scab develop dark-colored lesions on both the underside and the top. Scab is the most common label in our dataset. About 25.9% of the total images in the dataset are affected only by scab. It is also found on leaves together with other diseases. Figure 4 shows an apple tree leaf with scab disease.
Rust. Leaves with rust develop dense orange-yellow marks, as shown in Fig. 5. About 10% of the total images in the dataset are infected only with rust.
Frog eye leaf spot. Leaf spots (called "frog eye") appear brown with purple margins on the infected leaves, as illustrated in Fig. 6. About 17.07% of the images in the dataset are affected only by frog eye leaf spot.
Powdery mildew. As shown in Fig. 7, powdery mildew appears as small patches of white or gray powdery masses that coat the leaves of many apple trees.
Fig. 2 Class-wise frequency distribution of the dataset
Fig. 3 A healthy leaf
Fig. 4 A leaf infected with apple scab
Fig. 5 A leaf from the rust category
Fig. 6 A leaf with frog eye leaf spot
Fig. 7 A leaf infected with powdery mildew
Fig. 8 A leaf from the complex class
Approximately 6.35% of the leaves in the dataset are infected only with powdery mildew.
Complex. Unhealthy leaves with too many diseases to be distinguished visually belong to the complex class; they may or may not also carry a subset of the identified diseases. Figure 8 shows a leaf from the complex class.
3.2 Data Preprocessing
All images are resized to 512 × 512 pixels to ensure uniform input for the neural networks and to reduce model complexity and training time. Next, the dataset of 18,632 images is shuffled randomly and divided into training, validation, and test sets in the ratio 75:12.5:12.5, giving 13,974 training, 2329 validation, and 2329 test images. The training data are then augmented using various data augmentation techniques, which doubles the size of the training set to 27,948 images.
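The split sizes quoted above can be reproduced with a few lines of NumPy. The sketch below is illustrative only: the function name `split_indices` and the fixed seed are assumptions, and the chapter does not state how its shuffle was implemented.

```python
import numpy as np

def split_indices(n_images, seed=0):
    """Shuffle image indices and split them 75 : 12.5 : 12.5.

    The seed is an illustrative assumption, not a value from the chapter.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_images)      # random shuffle of 0..n_images-1
    n_train = int(n_images * 0.75)         # 75% of the images for training
    n_val = (n_images - n_train) // 2      # half of the remainder for validation
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]         # whatever is left is the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(18_632)
print(len(train_idx), len(val_idx), len(test_idx))   # 13974 2329 2329
print(2 * len(train_idx))                            # 27948 images after augmentation
```

Generating one augmented copy per training image doubles the 13,974 training images to the 27,948 reported in the text.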
3.3 Data Augmentation
To avoid overfitting, data augmentation is performed on the training images to boost the model's generalization ability and robustness. Keras [23] and TensorFlow [24] are used to flip, rotate, zoom, and translate each image and to vary its brightness, contrast, saturation, and hue, generating an augmented version of that image. A few augmented images are shown in Fig. 9. The operations used are described below.
Fig. 9 Output of data augmentation techniques on apple leaf images
Flipping. Flipping reverses the order of the pixels within each row (horizontal flip) or within each column (vertical flip) of an image. Let $A_{ijk}$ be an image, where $i \in \{a : a \in \mathbb{N},\ a \le m\}$ indexes the rows, $j \in \{a : a \in \mathbb{N},\ a \le n\}$ indexes the columns, and $k$ indexes the three color channels (red, green, and blue).
Horizontal flip:
$$A_{ijk} \rightarrow A_{i(n+1-j)k} \tag{1}$$
Vertical flip:
$$A_{ijk} \rightarrow A_{(m+1-i)jk} \tag{2}$$
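Equations (1) and (2) correspond to reversing one axis of the image array. The following NumPy sketch illustrates this on a tiny array; the function names are hypothetical, and it is not the Keras/TensorFlow pipeline used in the chapter. Note that NumPy indices start at 0, so the 1-based index n + 1 − j becomes n − 1 − j.

```python
import numpy as np

def horizontal_flip(image):
    """Eq. (1): reverse the column order of a (rows, cols, channels) array."""
    return image[:, ::-1, :]

def vertical_flip(image):
    """Eq. (2): reverse the row order of a (rows, cols, channels) array."""
    return image[::-1, :, :]

# A tiny 2 x 3 "image" with 3 channels; the values are arbitrary.
m, n = 2, 3
A = np.arange(m * n * 3).reshape(m, n, 3)
H = horizontal_flip(A)
V = vertical_flip(A)

# Check the index mapping of Eqs. (1) and (2) in 0-based form.
for i in range(m):
    for j in range(n):
        assert np.array_equal(H[i, j], A[i, n - 1 - j])
        assert np.array_equal(V[i, j], A[m - 1 - i, j])
```

Applying either flip twice returns the original image, which is a quick sanity check for any flip implementation.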
Rotation, zoom, and translation. The image is randomly magnified and rotated clockwise or counterclockwise by a small angle. It is then translated, that is, shifted horizontally or vertically. Finally, the empty space left by these operations is filled with the image's edge pixels.
Brightness and contrast adjustment. Brightness and contrast adjustments are performed on the images. While brightness refers to an image's overall lightness or darkness, contrast refers to the difference in brightness between various objects or parts of the picture. These adjustments are expressed mathematically as
$$x' = x + b \tag{3}$$
$$x' = (x - m)\,c + m \tag{4}$$
where $x$ is the original pixel value, $x'$ is the adjusted pixel value, $b$ is the brightness offset, $c$ is the contrast factor, and $m$ is the mean of all pixels in the same channel as $x$, since the contrast is computed independently for each channel of the image.
Hue and saturation adjustment. Color augmentation is completed by adjusting the hue and saturation. Hue refers to the dominant shade in an image. Saturation refers to the intensity of the colors in an image.
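The brightness and contrast formulas in Eqs. (3) and (4) can be sketched directly in NumPy. The function names and the example values of b and c below are assumptions chosen for illustration; a practical pipeline would also clip the result to the valid pixel range, which is omitted here.

```python
import numpy as np

def adjust_brightness(image, b):
    """Eq. (3): add the same offset b to every pixel."""
    return image + b

def adjust_contrast(image, c):
    """Eq. (4): scale each pixel's distance from its channel mean by c."""
    m = image.mean(axis=(0, 1), keepdims=True)   # one mean per channel
    return (image - m) * c + m

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))            # hypothetical float image in [0, 1)

brighter = adjust_brightness(image, b=0.1)
more_contrast = adjust_contrast(image, c=1.5)

# Contrast adjustment leaves the per-channel mean unchanged.
print(np.allclose(more_contrast.mean(axis=(0, 1)), image.mean(axis=(0, 1))))
```

Because Eq. (4) scales deviations around the channel mean, a factor c > 1 spreads pixel values apart and c < 1 pulls them together, while c = 1 returns the image unchanged.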
3.4 Models and Methods
In this work, we perform multilabel classification of diseases in apple leaves. Three models are employed, each combining a distinct network pre-trained on ImageNet with a set of common layers. The models are fine-tuned to enhance performance, and finally all three are ensembled. These methods and our implementation of them are described below in further detail.
Multilabel classification. Unlike typical classification tasks, where class labels are mutually exclusive, multilabel classification associates each instance with multiple labels that are not mutually exclusive. There are two or more classes, and an instance may belong to none of them, to several, or to all of them. Apple leaves in our research can be healthy or infected with one or more of the five apple leaf diseases. A leaf, for example, could be labeled with scab, frog eye leaf spot, and complex all at once.
Transfer learning and fine-tuning. In simple terms, transfer learning retains the knowledge a model has learned on one problem and applies it to a new but related problem. Transfer learning [10] is a crucial deep learning method for addressing the fundamental problem of insufficient training data, which in many cases is unavoidable. The entire data collection and refinement procedure is time-consuming and demands patience.
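As a concrete illustration of the multilabel formulation described earlier in this section, each leaf can be represented by a multi-hot target vector with one entry per class. The class order and the helper name `encode` below are assumptions made for this sketch; the chapter does not specify its label encoding.

```python
import numpy as np

# Class order is an assumption for illustration only.
CLASSES = ["healthy", "scab", "rust", "complex",
           "powdery_mildew", "frog_eye_leaf_spot"]

def encode(labels):
    """Turn a list of class names into a multi-hot vector of length 6."""
    vec = np.zeros(len(CLASSES), dtype=np.float32)
    for name in labels:
        vec[CLASSES.index(name)] = 1.0   # several entries may be 1 at once
    return vec

# The example from the text: scab, frog eye leaf spot, and complex together.
example = encode(["scab", "frog_eye_leaf_spot", "complex"])
print(example)   # [0. 1. 0. 1. 0. 1.]
```

In contrast to one-hot encoding for mutually exclusive classes, the entries of a multi-hot vector need not sum to 1.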
Transfer learning's primary goal is to improve performance on the target problem by transferring information from different but related source problems. Fine-tuning, in general, refers to slightly modifying a process to obtain the desired performance or result. In deep learning, it is a technique used to implement transfer learning. In our research, we have designed an ensemble of three pre-trained deep learning models. First, the head of each base model is replaced with a custom model structure. Then, the base model's layers are frozen, the head is initialized with random weights, and the model is trained. Finally, all the layers are unfrozen, and the entire model is fine-tuned on the new dataset.
Proposed model structure and common layers. As illustrated in Fig. 10, each of the deep learning models in the final ensemble follows the same generic flow. First, an input layer instantiates a Keras tensor. The images are then fed into the pre-trained base model, which is represented by the functional layer. These base models were previously trained on the ImageNet dataset, and we reuse their weights via transfer learning. Next, the outputs are fed to a global average pooling layer, which reduces each feature map to a single value by averaging, yielding a 1D vector. This vector is fed to a fully connected dense layer with 256 units and ReLU activation [25]. The dense layer is followed by a dropout layer with a dropout rate of 0.2 during training to help prevent overfitting. The dropout output is then passed to a batch normalization layer, which standardizes the values it receives by applying a transformation that keeps the mean and standard deviation of the output close to 0 and 1, respectively. Batch normalization accelerates the training of the neural network, provides regularization, and helps reduce generalization error.
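Two of the common layers described above, global average pooling and batch normalization, can be sketched in plain NumPy to show what they compute. This is a simplified illustration rather than the Keras layers used in the chapter: the feature-map shape is hypothetical, and the learned scale and shift parameters and the running statistics of a real batch normalization layer are omitted.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each feature map: (batch, height, width, channels) -> (batch, channels)."""
    return feature_maps.mean(axis=(1, 2))

def batch_normalize(x, eps=1e-5):
    """Standardize each feature over the batch (training-time behavior only)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Hypothetical base-model output: batch of 128, 16 x 16 maps, 64 channels.
features = rng.normal(loc=3.0, scale=2.0, size=(128, 16, 16, 64))

pooled = global_average_pool(features)     # shape (128, 64)
normalized = batch_normalize(pooled)       # per-feature mean ~0, std ~1

print(pooled.shape)
print(np.allclose(normalized.mean(axis=0), 0.0, atol=1e-6))
```

The pooled vector has one entry per channel regardless of the spatial size of the feature maps, which is why the same head can sit on top of different base models.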
The batch normalization layer is followed by two more similar sets of dense, dropout, and batch normalization layers. Finally, the output is passed to a fully connected dense layer with six units, one per classification category, which applies the sigmoid activation [25]. The sigmoid converts the raw output values into the probability of the image belonging to each of the six classes. The final output consists of the classes whose probabilities exceed a certain threshold. Our models use the Adam optimizer [26], an adaptive optimizer with low memory requirements that updates the network weights iteratively. It is generally effective and fast compared with its counterparts. We froze the pre-trained base model and trained the models for 50 epochs before fine-tuning the entire model for 25 more epochs, with a batch size of 128.
ResNet101V2. ResNet [7] is short for residual network, a popular network for computer vision problems. When the number of layers in a deep learning model increases, a problem known as the vanishing/exploding gradient occurs: the gradient becomes either vanishingly small or extremely large. Hence, as the number of layers increases, so do the training and test error rates. ResNets address this with a technique called skip connections. A skip connection bypasses a few layers and feeds the input of a block directly to its output. The original ResNet uses a 34-layer plain network design inspired by VGG-19, to which the skip connections
Fig. 10 Basic structure of our three fine-tuned models
are added. The advantage of this kind of skip connection is that regularization can effectively skip any layer that hurts the performance of the architecture. As a result, a very deep neural network can be trained without the difficulties caused by vanishing/exploding gradients. Figure 11 represents the ResNet101V2 model that we have used in this work. It is 101 layers deep and has been trained on over a million photos from the ImageNet database. Consequently, the model has acquired feature representations for a broad range of images. For example, it can sort photos into 1000 different object categories, including keyboards, pencils, and various animals. The key distinction between ResNetV2 and ResNetV1 is that ResNetV2 uses batch normalization before each weight layer.
Xception. Xception [8] is short for "Extreme Inception." The Xception architecture is based on Inception; however, instead of Inception modules, depthwise separable convolutions are used. These are spatial convolutions performed separately over every channel of an input, followed by a 1 × 1 convolution that projects the output of the depthwise convolution onto a new channel space. They map the cross-channel and spatial correlations of each output channel separately.
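The parameter savings of depthwise separable convolutions can be illustrated with a short count. The sketch below compares a standard convolution with a depthwise convolution followed by a 1 × 1 pointwise convolution; the layer sizes are hypothetical examples, not taken from Xception itself, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)
separable = depthwise_separable_params(3, 128, 256)
print(standard, separable)   # 294912 33920
```

For this example layer the separable form needs roughly one ninth of the weights, which is the kind of parameter efficiency the text attributes to Xception.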
Fig. 11 Condensed architecture of ResNet101-v2
Fig. 12 Condensed architecture of Xception
As shown in Fig. 12, the Xception architecture consists of 36 convolutional layers organized into 14 modules with linear residual connections around them. Xception outperforms Inception V3 [27] through a more efficient use of model parameters.
InceptionResNetV2. InceptionResNetV2 [9] combines two ideas: residual network (ResNet) connections [7] and a newer version of the Inception [27] architecture. Residual connections mitigate the degradation problem in deep structures while still providing accurate feature information. The residual modules make it possible to increase the number of Inception blocks, thus increasing the network depth. The Inception module comprises several convolutional layers, pooling layers, and feature maps, which are all combined into a single vector at its output. We chose InceptionResNetV2 for our research because its complex architecture extracts essential features from images. Szegedy et al. [9] describe the network in greater depth. Figure 13 depicts the structure of InceptionResNetV2. The network's first layers consist of three standard convolutional layers followed by a max-pooling layer, followed by two convolutional layers and another max-pooling layer. Inception convolution is the next stage in the network. It convolves the input with filters of several sizes in parallel, then merges the results and feeds them to the rest of the network. The following sections of the network repeat Inception and residual blocks several times, and the network additionally employs dropout layers, which randomly drop units during training, to prevent overfitting. Finally, there is a dense layer followed by softmax, which distributes the scores probabilistically over the final 1000 neurons.
Ensembling. Deep neural networks are sensitive to minor details in the training data and to random initialization during training. This behavior is especially noticeable with small datasets.
This high variance in deep networks can be reduced by ensemble learning. Ensembling [28] is used by several machine learning algorithms. Random forest [3], for example, combines a variable number of decision trees. An ensemble can be thought of as a learning strategy that combines several models to solve a problem. Ensembles generally perform better than individual models by reducing variance and improving generalization, which results in an
Fig. 13 Condensed architecture of InceptionResNet-v2
Fig. 14 Ensemble fundamental block
improved F1-score. Our final model is an ensemble of three separate models based on pre-trained ResNet101V2, Xception, and InceptionResNetV2, combined into a single model that averages their predictions, as shown in Fig. 14:
$$\mathrm{Pred}_{\mathrm{Ensemble}} = \frac{\mathrm{Pred}_{\mathrm{InceptionResNetV2}} + \mathrm{Pred}_{\mathrm{Xception}} + \mathrm{Pred}_{\mathrm{ResNet101V2}}}{3} \tag{5}$$
As a result, each model contributes equally to the final prediction.
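The averaging in Eq. (5) can be sketched in a few lines of NumPy. The probabilities below are made up for illustration, the function name `ensemble_predict` is hypothetical, and the threshold of 0.5 is an assumption: the chapter only states that classes above "a certain threshold" are reported.

```python
import numpy as np

def ensemble_predict(pred_list, threshold=0.5):
    """Average per-class probabilities from several models (Eq. 5), then threshold.

    The default threshold of 0.5 is an illustrative assumption.
    """
    mean_probs = np.mean(np.stack(pred_list, axis=0), axis=0)
    labels = (mean_probs >= threshold).astype(int)
    return mean_probs, labels

# Made-up sigmoid outputs for one image and six classes from each model.
pred_inception_resnet = np.array([[0.1, 0.9, 0.2, 0.6, 0.0, 0.3]])
pred_xception = np.array([[0.2, 0.8, 0.1, 0.4, 0.1, 0.7]])
pred_resnet = np.array([[0.0, 0.7, 0.3, 0.7, 0.2, 0.8]])

probs, labels = ensemble_predict(
    [pred_inception_resnet, pred_xception, pred_resnet]
)
print(labels)   # [[0 1 0 1 0 1]]
```

In this example, Xception alone would miss the fourth class (0.4 is below the threshold), but the averaged probability of about 0.57 recovers it, which illustrates how averaging can smooth out the errors of a single model.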
4 Experimental Results
This section first describes the experimental setup, followed by a discussion of the evaluation metrics and a comparison and analysis of our results.
4.1 Experimental Setup
All experiments were executed on a Kaggle Kernel with an Intel(R) Xeon(R) CPU accelerated by Tensor Processing Units [29] (TPU v3-8). The TPU v3-8 provides 8 TPU cores and 128 GB of high-speed RAM. We used Python 3 and TensorFlow 2.4.1 to develop and train our models.
4.2 Evaluation Metrics
Because we are dealing with imbalanced classes and multilabel classification, three metrics, namely precision, recall, and F1-score, are used to evaluate the performance of our proposed model and the three individual pre-trained models.
$$\mathrm{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \tag{6}$$
$$\mathrm{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \tag{7}$$
Table 1 Precision, recall, and F1-score of our three proposed fine-tuned models on the training set

Model               Precision   Recall   F1-score
ResNet101V2         0.9775      0.9335   0.9459
Xception            0.9665      0.9136   0.9307
InceptionResNetV2   0.9500      0.8921   0.9117

Table 2 Precision, recall, and F1-score of our three fine-tuned models on the validation set

Model               Precision   Recall   F1-score
ResNet101V2         0.9760      0.9294   0.9440
Xception            0.9696      0.9213   0.9361
InceptionResNetV2   0.9813      0.9250   0.9473
$$\text{F1-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{8}$$
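The three metrics in Eqs. (6)–(8) can be computed from multi-hot label arrays as sketched below. The counts here are pooled over all labels and images (micro-averaging), which is an assumption made for this illustration; the chapter does not state which averaging scheme it used, and the function name and example labels are hypothetical.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Compute Eqs. (6)-(8) with counts pooled over all labels (micro-averaging)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)      # predicted and actually present
    fp = np.sum(~y_true & y_pred)     # predicted but absent
    fn = np.sum(y_true & ~y_pred)     # present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return float(precision), float(recall), float(f1)

# Two made-up images with six labels each.
y_true = [[0, 1, 0, 1, 0, 0],
          [1, 0, 0, 0, 0, 0]]
y_pred = [[0, 1, 0, 0, 0, 1],   # one label missed, one extra label predicted
          [1, 0, 0, 0, 0, 0]]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

With two true positives, one false positive, and one false negative, all three metrics equal 2/3 in this example.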
Finally, for the pre-trained models, charts of F1-score versus epochs are given, and the appropriate conclusions are derived.
4.3 Performance Evaluation
This section first assesses the performance of the fine-tuned pre-trained models on the training and validation datasets. Table 1 presents the precision, recall, and F1-score of these models on the training set. ResNet101V2 outperforms Xception and InceptionResNetV2 in all three metrics, with a training F1-score of 0.9459. Table 2 presents the precision, recall, and F1-score of the models on the validation set. InceptionResNetV2 outperforms Xception and ResNet101V2 with a validation F1-score of 0.9473. Figure 15 shows the plots of F1-score versus epochs for the three models. The training curve is relatively smooth and varies much less than the validation curve. The score also seems to plateau after about 30–35 epochs. The green vertical line marks the start of fine-tuning. It can be seen that the scores improve marginally by about 25–30% during fine-tuning of the model.
Finally, we compare the results of our ensemble model with those of the individual models. Table 3 presents the precision, recall, and F1-score on the test set. Our ensemble model obtained the highest recall (0.9541) and F1-score (0.9625) of all the models, while InceptionResNetV2 obtained the highest precision (0.9834). Among the three individual models, InceptionResNetV2 performed best, with a test F1-score of 0.9476. While our model performs remarkably well for all classes, it has some difficulties with the complex class. At times, the model predicts extra diseases in addition to the complex class, while at other times, it misses particular diseases, as shown in
Fig. 15 F1-score versus epochs during training and fine-tuning of the three models
Table 3 Precision, recall, and F1-score of our three fine-tuned models and their ensemble on the test set

Model                Precision   Recall   F1-score
ResNet101V2          0.9761      0.9297   0.9385
Xception             0.9708      0.9177   0.9347
InceptionResNetV2    0.9834      0.9233   0.9476
Our Ensemble Model   0.9743      0.9541   0.9625
Fig. 16 Two examples where the model incorrectly classifies the image
Fig. 16. This occurs because leaves in the complex class have a variety of diseases that seem extremely similar to the other diseases in the dataset, making it difficult to distinguish between them. This inference can be confirmed by analyzing Fig. 17, which represents the confusion matrices for each class in the dataset. It describes the performance of our proposed model and the errors made by it on the test dataset.
5 Conclusion
This chapter proposed an ensemble of pre-trained deep convolutional neural networks for the multilabel classification of apple leaves into healthy and five types of foliar diseases: scab, rust, complex, powdery mildew, and frog eye leaf spot. Although the diseases can be diagnosed manually, doing so is often labor-intensive and time-consuming, and detecting the disease correctly can be challenging. Traditional machine learning methods are appealing, but they struggle with non-uniform backgrounds and cannot extract information as precisely as deep convolutional neural networks.
Fig. 17 Class-wise confusion matrices for the ensemble model
Hence, in this research, we first implemented three deep learning models, ResNet101V2, Xception, and InceptionResNetV2, pre-trained on ImageNet, and assessed their performance on a dataset of apple leaf images. The data were augmented to prevent overfitting, thereby increasing accuracy. Image augmentation methods such as flipping, rotation, zoom, translation, and variations in brightness, contrast, saturation, and hue were used to train the models on a relatively large training
set. Then, we assessed the performance of the three models separately for various criteria like precision, recall, and F1-score. Finally, we ensembled the predictions of all three models via averaging. The primary reason for using an ensemble model is to decrease the generalization error of the prediction, reduce variance and leverage the diversity and independence of the base models. On the test dataset with as many as 2329 images of apple leaves, our final model yields an F1-score of 0.9625 while successfully classifying not just the healthy leaves but also leaves with single and multiple diseases. Farmers may employ this method to automate disease classification in apple trees to save time and minimize the dependence on specialists.
References
1. Khirade SD, Patil AB (2015) Plant disease detection using image processing. In: 2015 international conference on computing communication control and automation, pp 768–771. https://doi.org/10.1109/ICCUBEA.2015.153
2. Kirti, Rajpal N (2020) Black rot disease detection in grape plant (Vitis vinifera) using colour based segmentation & machine learning. In: 2020 2nd international conference on advances in computing, communication control and networking (ICACCCN), pp 976–979. https://doi.org/10.1109/ICACCCN51052.2020.9362812
3. Sandika B, Avil S, Sanat S, Srinivasu P (2016) Random forest based classification of diseases in grapes from images captured in uncontrolled environments. In: 2016 IEEE 13th international conference on signal processing (ICSP), pp 1775–1780. https://doi.org/10.1109/ICSP.2016.7878133
4. Mondal D, Chakraborty A, Kole DK, Majumder DD (2015) Detection and classification technique of yellow vein mosaic virus disease in okra leaf images using leaf vein extraction and Naive Bayesian classifier. In: 2015 international conference on soft computing techniques and implementations (ICSCTI), pp 166–171. https://doi.org/10.1109/ICSCTI.2015.7489626
5. Madhulatha G, Ramadevi O (2020) Recognition of plant diseases using convolutional neural network. In: 2020 fourth international conference on I-SMAC (IoT in social, mobile, analytics and cloud) (I-SMAC), pp 738–743. https://doi.org/10.1109/I-SMAC49090.2020.9243422
6. Jenifa A, Ramalakshmi R, Ramachandran V (2019) Cotton leaf disease classification using deep convolution neural network for sustainable cotton production. In: 2019 IEEE international conference on clean energy and energy efficient electronics circuit for sustainable development (INCCES), pp 1–3. https://doi.org/10.1109/INCCES47820.2019.9167715
7. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778. https://doi.org/10.1109/CVPR.2016.90
8. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 1800–1807. https://doi.org/10.1109/CVPR.2017.195
9. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-V4, Inception-ResNet and the impact of residual connections on learning. arXiv preprint arXiv:1602.07261
10. Tan C, Sun F, Kong T, Zhang W, Yang C, Liu C (2018) A survey on deep transfer learning. In: Proceedings of the 27th international conference on artificial neural networks (ICANN). Lecture notes in computer science, vol 11141, pp 270–279
11. Parikh A, Raval MS, Parmar C, Chaudhary S (2016) Disease detection and severity estimation in cotton plant from unconstrained images. In: 2016 IEEE international conference on data science and advanced analytics (DSAA), pp 594–601. https://doi.org/10.1109/DSAA.2016.81
12. Islam M, Dinh A, Wahid K, Bhowmik P (2017) Detection of potato diseases using image segmentation and multiclass support vector machine. In: 2017 IEEE 30th Canadian conference on electrical and computer engineering (CCECE), pp 1–4. https://doi.org/10.1109/CCECE.2017.7946594
13. Militante SV, Gerardo BD, Medina RP (2019) Sugarcane disease recognition using deep learning. In: 2019 IEEE Eurasia conference on IOT, communication and engineering (ECICE), pp 575–578. https://doi.org/10.1109/ECICE47484.2019.8942690
14. Shrivastava VK, Pradhan MK, Thakur MP (2021) Application of pre-trained deep convolutional neural networks for rice plant disease classification. In: 2021 international conference on artificial intelligence and smart systems (ICAIS), pp 1023–1030. https://doi.org/10.1109/ICAIS50930.2021.9395813
15. Surya R, Gautama E (2020) Cassava leaf disease detection using convolutional neural networks. In: 2020 6th international conference on science in information technology (ICSITech), pp 97–102. https://doi.org/10.1109/ICSITech49800.2020.9392051
16. Trang K, TonThat L, Gia Minh Thao N, Tran Ta Thi N (2019) Mango diseases identification by a deep residual network with contrast enhancement and transfer learning. In: 2019 IEEE conference on sustainable utilization and development in engineering and technologies (CSUDET), pp 138–142. https://doi.org/10.1109/CSUDET47057.2019.9214620
17. Zhang X, Qiao Y, Meng F, Fan C, Zhang M (2018) Identification of maize leaf diseases using improved deep convolutional neural networks. IEEE Access 6:30370–30377. https://doi.org/10.1109/ACCESS.2018.2844405
18. Sardoğan M, Özen Y, Tuncer A (2020) Detection of apple leaf diseases using faster R-CNN. Düzce Üniv Bilim Teknol Dergisi 8(1):1110–1117. https://doi.org/10.29130/dubited.648387
19. Dai B, Qui T, Ye K. Foliar disease classification. [Online]. Available: http://noiselab.ucsd.edu/ECE228/projects/Report/15Report.pdf. Accessed 7 Sept 2021
20. Jiang P, Chen Y, Liu B, He D, Liang C (2019) Real-time detection of apple leaf diseases using deep learning approach based on improved convolutional neural networks. IEEE Access 7:59069–59080. https://doi.org/10.1109/ACCESS.2019.2914929
21. Plant pathology 2021 FGVC8. kaggle.com. https://www.kaggle.com/c/plant-pathology-2021-fgvc8. Accessed 7 Sept 2021
22. Thapa R, Zhang K, Snavely N, Belongie S, Khan A (2020) The plant pathology challenge 2020 data set to classify foliar disease of apples. Appl Plant Sci 8(9). Art. no. e11390
23. Keras homepage. https://keras.io/. Accessed 7 Sept 2021
24. Tensorflow homepage. https://www.tensorflow.org/. Accessed 7 Sept 2021
25. Szandała T (2021) Review and comparison of commonly used activation functions for deep neural networks. In: Bhoi A, Mallick P, Liu CM, Balas V (eds) Bio-inspired neurocomputing. Studies in computational intelligence, vol 903. Springer, Singapore. https://doi.org/10.1007/978-981-15-5495-7_11
26. Kingma D, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
27. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 2818–2826. https://doi.org/10.1109/CVPR.2016.308
28. Cerliani M. Neural networks ensemble. https://towardsdatascience.com/neural-networks-ensemble-33f33bea7df3. Accessed 7 Sept 2021
29. Kaggle TPU documentation. kaggle.com. https://www.kaggle.com/docs/tpu. Accessed 7 Sept 2021
Chapter 14
A Deep Learning Paradigm for Detection and Segmentation of Plant Leaves Diseases
R. Kavitha Lakshmi and Nickolas Savarimuthu
1 Introduction
A country's economic growth is reliant on agricultural production. One of the fundamental and vital necessities for any country is the protection and sustainability of its crops. Malnutrition has long been a concern in developing nations like India, and it is closely linked to food security. Plant diseases are a significant source of crop loss. They can severely degrade the quality of the crop and, as a result, reduce the agricultural yield. Detecting plant diseases and preventing crop loss is difficult because farmers often lack knowledge about plant disease detection and the necessary countermeasures. Furthermore, the manual methods that agronomists currently use to identify plant diseases are costly, time-consuming, and prone to mistakes. Hence, the automated identification of plant diseases is an essential part of precision agriculture. The enormous success of deep learning-based methods has sparked several rapid and practical techniques for plant disease detection and drastically reduced agricultural production losses [1]. Recent image processing techniques have produced several solutions for disease detection that assist farmers and agronomists [2]. Many studies have been proposed using machine learning algorithms [3]. In recent years, there have been tremendous advances in using deep learning in agriculture to tackle problems such as insect detection, fruit disease detection, crop estimation, and plant disease detection [1]. This study aims to develop effective real-time automated plant disease detection and segmentation frameworks using vision-based models. In addition, this work focuses on achieving better accuracy with fewer computational resources.
R. Kavitha Lakshmi (B) · N. Savarimuthu Department of Computer Applications, NIT Trichy, Trichy, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_14
The significant contributions of this work are summarized as follows:
1. We propose novel deep learning frameworks for effective plant disease detection and segmentation.
2. We explore recent deep learning models for detection and segmentation for practical plant disease diagnosis.
3. We generate a custom annotated dataset for training the two proposed models, which give better mean average precision (mAP) than other state-of-the-art models.

The remainder of the chapter is structured as follows. Related work is surveyed in Sect. 2. The detailed methodology of the proposed frameworks is described in Sect. 3. The dataset details are elaborated in Sect. 4. In Sect. 5, results and a comparative analysis of this work with other benchmark state-of-the-art models are presented. Finally, Sect. 6 concludes this chapter and gives directions for future study.
2 Related Work

Deep learning is a form of machine learning that employs the concepts of artificial neural networks. In particular, convolutional neural networks (CNNs) are well-known classifiers used to classify and detect plant diseases [4]. This section examines relevant research publications to identify the methods and strategies used in existing studies and the research gaps.

Dyrmann et al. [5] trained a CNN with 10,413 images of 22 different weed and crop species acquired under different lighting conditions and achieved 86.2% classification accuracy. The network failed to classify the Veronica, Field Pansy, and broad-leaved grass species because adequate training samples were not available.

Liu et al. [6] diagnosed four common types of apple leaf diseases, namely mosaic, rust, brown spot, and Alternaria leaf spot, using a deep neural network architecture based on AlexNet. A total of 1053 images showing symptoms of these four diseases were pre-processed and prepared for training. During image acquisition on the farm, weather factors such as sunlight variation, random clouds, and sand dust may affect the brightness and balance of the images. Hence, brightness, contrast, and sharpness values were adjusted as part of preprocessing. With this adjusted dataset, the network identified the four common apple leaf diseases with an accuracy of 97.62%.

Ferentinos [7] diagnosed 26 plant diseases using an open database of 87,848 leaf images from 14 different fruit, vegetable, and grain plants. Five basic CNN architectures, namely AlexNet, AlexNetOWTBn, GoogLeNet, Overfeat, and VGG, were trained, and a 99.53% success rate was achieved in identifying the corresponding plant–disease combination. However, misclassifications were observed for banana, tomato, and corn plants. The database includes images with partial shading, leaves that occupy a small, non-centric part of the frame, and unrelated objects such as fingers, entire hands, shoes, and parts of shirts.
Ma et al. [8] used only cucumber leaf images for disease recognition. Symptom images were segmented and used to train a deep convolutional neural network that identifies the infected portions of the leaf, achieving 93.4% accuracy. Because the model was trained on images of a single species (cucumber), it does not apply to other species. Although deep learning methods produce positive results, an in-depth analysis still reveals some hidden issues.

Zhang et al. [12] fed the three color components of a vegetable leaf image into a three-channel convolutional neural network for plant disease recognition. The network was trained on the PlantVillage tomato leaf database, which contains 15,817 color leaf images of eight different diseases photographed against a simple background and in different orientations. The network was then tested on 500 diseased cucumber leaf images captured with different sizes, orientations, locations, illumination, and backgrounds, and a recognition rate of 85–91.6% was achieved.

Singh et al. [10] focused on diagnosing mango tree fruits and leaves affected by a fungal disease called anthracnose. For this purpose, a ternary classification model based on a multilayer convolutional neural network was proposed. The ternary model runs on three tasks. The first is to separate mango leaf images from other images. The second is to identify whether the leaf is diseased or not. Of around 2200 images, 1070 were photographed in real time, and the remainder were taken from the PlantVillage dataset. These were categorized into two classes, namely mango leaf images with and without the disease. After training, an accuracy of 97.13% was achieved, on par with other state-of-the-art approaches.

Coulibaly et al. [11] used a transfer learning approach with CNNs for plant disease detection and diagnosis.
In the proposed VGG16 architecture, apart from image acquisition and disease classification, two additional modules, namely the pre-trained VGG16 model and feature extraction, are incorporated to avoid overfitting and to fine-tune the network to the desired accuracy.

In a classical CNN, pooling layers are used to maintain invariance and avoid overfitting; at the same time, they reduce the spatial resolution of the input images, so spatial information about the extracted features is lost. Zhang et al. [9] integrated two concepts into an existing CNN, namely dilated convolution and global pooling. Dilated convolution aims to recover spatial resolution by accumulating multi-scale contextual information while preserving the exact resolution. A further limitation of standard CNNs is that the many parameters in their fully connected layers reduce training speed and lead to overfitting. The global pooling layer replaces a fully connected layer by imposing a correspondence between extracted feature maps and categories; because no parameter is left in this layer to be optimized, overfitting is reduced.

Priyadharshini et al. [13] developed an automated plant disease system for the maize crop using the freely available PlantVillage database. Maeda-Gutiérrez et al. [14] compared five CNNs, namely AlexNet, GoogLeNet, InceptionV3, ResNet-18, and ResNet-50, for tomato crop disease identification, and concluded that GoogLeNet achieves better accuracy than the other models.
The above literature shows that most researchers used existing deep learning methodologies to classify plant leaf diseases. Classification alone is not sufficient to find the area of the infected part or the extent to which a fungus or virus spreads over the plant or from one plant to another. Detection and segmentation are suitable tools for precisely locating infected lesions on plant leaves. They help growers make timely decisions and estimate crop diseases within a stipulated time. This research attempts to address this limitation by applying detection and segmentation to diseased plant leaves at different stages of growth.
3 Methodology

3.1 Proposed Object Detection Architecture 1

EfficientDet is a state-of-the-art one-stage object detection algorithm developed by Google Research [15]. The overall architecture of the proposed object detection framework based on the EfficientDet deep learning model is depicted in Fig. 1. It is a hybrid network that relies on three main components: (1) the backbone network, (2) the BiFPN network, and (3) the class/box prediction network.

Fig. 1 Schematic illustration of proposed framework using EfficientDet

Backbone Network. Convolutional neural networks are commonly employed to extract fine-grained features from images. In this work, we use a recent backbone from the EfficientNet family [16] as the feature extractor. These networks improve accuracy with fewer parameters and fewer FLOPS (floating-point operations) than recent benchmark models such as ResNeXt, Inceptionv3, and DenseNet121. The fundamental building block of this network is the Mobile Inverted Bottleneck Conv block (MBConv), to which squeeze-and-excitation optimization is applied. MBConv is the inverted residual block used in MobileNetV2. It forms a shortcut link between the start and end of a convolutional block, as shown in Fig. 2. Here, DWConv stands for depthwise convolution, k3×3/k5×5 is the kernel size, BN stands for batch normalization, H×W×F defines the tensor shape (height, width, depth), and ×1/2/3/4 represents the multiplier for the number of repeated layers. This arrangement helps to minimize the total number of necessary operations as well as the size of the model.

Fig. 2 MBConv block

The EfficientNet architecture is designed to scale up all dimensions, namely depth (number of layers), width (number of filters), and resolution (image height and width), using a compound scaling technique. Depth scaling allows the model to learn more complex features. Width scaling helps to extract fine-grained patterns. A higher input image resolution provides more visual information and helps the model extract in-depth features for smaller objects. The compound scaling approach uses a compound coefficient $\phi$ to scale network width, depth, and resolution uniformly in a principled manner, as shown in Eq. 1. The user-defined global scaling factor $\phi$ (an integer) controls the amount of resources available, while $\alpha$, $\beta$, and $\gamma$ specify how to assign these resources to the depth, width, and resolution of the network, respectively.

The FLOPS of a convolutional operation are proportional to $d$, $w^2$, and $r^2$: doubling the depth doubles the FLOPS, while doubling the width or resolution increases the FLOPS almost fourfold. Scaling the network using Eq. 1 therefore increases the total FLOPS by $(\alpha \cdot \beta^2 \cdot \gamma^2)^{\phi}$. Hence, to keep the total FLOPS increase at approximately $2^{\phi}$, the constraint $\alpha \cdot \beta^2 \cdot \gamma^2 \approx 2$ is applied. The parameters $\alpha$, $\beta$, and $\gamma$ are constants determined by a grid search. By changing $\phi$, we can go from EfficientNet-B0 ($\phi = 0$) to EfficientNet-B7 ($\phi = 7$) within the EfficientNet family. The baseline model, EfficientNet-B0, is described in Table 1.

Table 1 Architecture of EfficientNet-B0 baseline network

| Stage | Operator | Resolution | #Channels | #Layers |
|---|---|---|---|---|
| 1 | Conv3×3 | 224 × 224 | 32 | 1 |
| 2 | MBConv1, k3×3 | 112 × 112 | 16 | 1 |
| 3 | MBConv6, k3×3 | 112 × 112 | 24 | 2 |
| 4 | MBConv6, k5×5 | 56 × 56 | 40 | 2 |
| 5 | MBConv6, k3×3 | 28 × 28 | 80 | 3 |
| 6 | MBConv6, k5×5 | 14 × 14 | 112 | 3 |
| 7 | MBConv6, k5×5 | 14 × 14 | 192 | 4 |
| 8 | MBConv6, k3×3 | 7 × 7 | 320 | 1 |
| 9 | Conv1×1 / Pooling / FC | 7 × 7 | 1280 | 1 |

$$
\begin{aligned}
\text{depth: } & d = \alpha^{\phi}, \qquad \text{width: } w = \beta^{\phi}, \qquad \text{resolution: } r = \gamma^{\phi} \\
\text{s.t. } & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \qquad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
\end{aligned}
\tag{1}
$$

$$
W_{\mathrm{bifpn}} = 64 \cdot \left(1.35^{\phi}\right), \qquad D_{\mathrm{bifpn}} = 3 + \phi
\tag{2}
$$

$$
D_{\mathrm{box}} = D_{\mathrm{class}} = 3 + \lfloor \phi / 3 \rfloor
\tag{3}
$$

$$
R_{\mathrm{input}} = 512 + \phi \cdot 128
\tag{4}
$$
BiFPN Network. The bi-directional feature pyramid network (BiFPN) extends the concept of FPN with more powerful multi-scale feature fusion. Here, BiFPN acts as the feature network: it receives the level 3–7 features [P3, P4, P5, P6, P7] from the backbone network (EfficientNet) and repeatedly applies top-down and bottom-up bidirectional feature fusion, as depicted in Fig. 1. Earlier object detection models relied on very large backbone networks or high-resolution input images to achieve better accuracy. EfficientDet introduces a compound scaling method in which a single compound coefficient $\phi$ jointly scales up all dimensions of the backbone (EfficientNet), the feature network (BiFPN), the class/box prediction network, and the input resolution. Equation 2 gives the width and depth of the BiFPN network.

Table 2 Scaling configurations of EfficientDet (D0–D7x) models

| Model | Input size ($R_{\mathrm{input}}$) | Backbone network | BiFPN #Channels ($W_{\mathrm{bifpn}}$) | BiFPN #Layers ($D_{\mathrm{bifpn}}$) | Class/box #Layers ($D_{\mathrm{class}}$) |
|---|---|---|---|---|---|
| D0 ($\phi = 0$) | 512 | B0 | 64 | 3 | 3 |
| D1 ($\phi = 1$) | 640 | B1 | 88 | 4 | 3 |
| D2 ($\phi = 2$) | 768 | B2 | 112 | 5 | 3 |
| D3 ($\phi = 3$) | 896 | B3 | 160 | 6 | 4 |
| D4 ($\phi = 4$) | 1024 | B4 | 224 | 7 | 4 |
| D5 ($\phi = 5$) | 1280 | B5 | 288 | 7 | 4 |
| D6 ($\phi = 6$) | 1280 | B6 | 384 | 8 | 5 |
| D7 ($\phi = 7$) | 1536 | B6 | 384 | 8 | 5 |
| D7x | 1536 | B7 | 384 | 8 | 5 |

Class/Box Prediction Network. The width (#channels) remains the same as that of the BiFPN (i.e., $W_{\mathrm{pred}} = W_{\mathrm{bifpn}}$), but the depth (#layers) is increased linearly as given in Eq. 3. Because feature levels 3–7 are used in the BiFPN network, the input resolution must be divisible by $2^7 = 128$, so the resolution is increased linearly according to Eq. 4. Using Eqs. 2, 3, and 4 and changing the value of $\phi$, we can go from EfficientDet-D0 ($\phi = 0$) to EfficientDet-D7 ($\phi = 7$), as shown in Table 2, where the compound coefficient $\phi$ controls all the scaling dimensions of the BiFPN, the class/box prediction network, and the input size. Finally, all the fused features are fed into the class/box prediction network to classify and detect the diseased leaf images.
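As a rough illustration of how Eqs. 2–4 produce the entries of Table 2, the sketch below evaluates them for a given $\phi$. The function name and dictionary keys are hypothetical. The width is returned unrounded: the channel counts listed in Table 2 (for example 112 for D2) differ from the raw value of Eq. 2, and the D5 and D7 rows also depart from Eqs. 2 and 4, so the equations are best read as a heuristic rather than an exact rule.

```python
import math


def efficientdet_config(phi):
    """Evaluate Eqs. 2-4 for a compound coefficient phi (width left unrounded)."""
    return {
        "w_bifpn": 64 * (1.35 ** phi),      # Eq. 2: BiFPN channels (raw value)
        "d_bifpn": 3 + phi,                 # Eq. 2: BiFPN layers
        "d_head": 3 + math.floor(phi / 3),  # Eq. 3: class/box layers
        "r_input": 512 + phi * 128,         # Eq. 4: input resolution
    }


if __name__ == "__main__":
    # phi = 2 corresponds to EfficientDet-D2, the variant used in this chapter.
    for key, value in efficientdet_config(2).items():
        print(key, value)
```

For $\phi = 2$ this gives 5 BiFPN layers, 3 class/box layers, and a 768-pixel input, in agreement with the D2 row of Table 2.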
3.2 Proposed Instance Segmentation Architecture 2

In this section, the proposed plant disease instance segmentation framework based on Mask_RCNN is introduced, as shown in Fig. 3. Mask_RCNN [17], by Facebook AI Research, is a two-stage object detection convolutional neural network that extends Faster_RCNN with an additional mask branch. The overall architecture of the proposed framework is divided into two stages.

Fig. 3 Schematic illustration of proposed framework using Mask_RCNN

The first stage consists of two parts: the backbone network and the region proposal network (RPN). In this work, we chose ResNet-152 with a feature pyramid network as the backbone to extract in-depth features at multiple scales. The layered architecture of ResNet-152 is shown in Fig. 4. ResNet includes skip connections (also known as shortcut connections) to mitigate the degradation of deep neural networks. This structure helps to improve the classification results and reduce the model complexity. The network in Fig. 4 has five convolutional blocks, and each block includes the ReLU activation function and batch normalization (BN).

Fig. 4 Layered architecture of ResNet-152

The feature pyramid network (FPN) consists of two pathways, namely the bottom-up and top-down pathways. The bottom-up pathway is usually a ResNet backbone. The ResNet backbone comprises five convolution blocks, and the outputs of these blocks are the feature maps [C1, C2, C3, C4, C5]. FPN merges feature maps of the same spatial size from the respective ResNet output stages. To build the top-down pathway, it first applies a 1 × 1 convolution to reduce the number of feature map channels to 256. These regenerated layers help the detector achieve better classification results. The portion that combines the FPN and ResNet is shown in Fig. 5. Finally, four output feature maps (P2, P3, P4, P5) are generated through a 3 × 3 convolution layer.

Fig. 5 A schematic overview that combines the residual network (ResNet) and feature pyramid network (FPN)

The results are fed into the RPN, which contains two simultaneous branches that produce the objectness score and the bounding box coordinates. The RPN generates regions of interest (ROIs) from the image using anchors of different scales and aspect ratios. The top N anchors with the highest RPN objectness scores are retained. In the second stage, the feature maps of the ROIs, which have different sizes and aspect ratios, are passed to the ROIAlign layer. This layer generates fixed-size feature maps, which are finally passed to a fully convolutional network (FCN) to produce the mask pixel by pixel.
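The anchor mechanism used by the RPN can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Mask_RCNN implementation: the scale and aspect-ratio values, the function names, and the scores are hypothetical, and a real RPN additionally applies box regression and non-maximum suppression.

```python
import math


def make_anchors(cx, cy, scales, ratios):
    """Return (x1, y1, x2, y2) anchors centred on (cx, cy).

    Each anchor has area scale**2 and aspect ratio height / width = ratio.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def top_n_by_score(anchors, scores, n):
    """Keep the n anchors with the highest objectness scores."""
    ranked = sorted(zip(scores, anchors), key=lambda pair: pair[0], reverse=True)
    return [anchor for _, anchor in ranked[:n]]


if __name__ == "__main__":
    # Hypothetical settings: three scales and three aspect ratios per location.
    anchors = make_anchors(100, 100, scales=(32, 64, 128), ratios=(0.5, 1, 2))
    print(len(anchors), "anchors at one feature-map location")
```

Generating anchors at every feature-map location and at every pyramid level is what lets the RPN propose lesions of very different sizes.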
4 Image Dataset Acquisition

Two datasets, Dataset-1 and Dataset-2, were selected to build the custom annotated datasets. Dataset-1 is the PlantVillage dataset [18], which contains 54,305 images in 38 categories of healthy and diseased leaves. From it, we took three crops, namely apple, grape, and tomato. Dataset-2 is "A Database of Leaf Images" [19], which contains 4,503 images of healthy and diseased leaves in 14 categories. From it, we used four crop species, namely jamun, mango, pomegranate, and Pongamia pinnata. Both datasets are used for the detection and segmentation tasks on diseased plant leaves.

We made annotations separately for the object detection and segmentation tasks. For the detection task, we used the "LabelImg" annotation tool, and for the segmentation task, we used the VGG (Visual Geometry Group) annotation tool. With "LabelImg," the annotations of all images are saved in XML format and used for the detection task during training. With "VGG," the annotation information of all images is saved in .json format and used as ground truth for the segmentation task during training.
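To make the annotation format concrete, the sketch below reads bounding boxes from an XML annotation in the Pascal VOC layout that LabelImg writes by default. The file name, the class label `apple_scab`, and the coordinates are hypothetical values chosen for illustration; they are not taken from the chapter's dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout produced by LabelImg.
SAMPLE_XML = """
<annotation>
  <filename>leaf_0001.jpg</filename>
  <size><width>256</width><height>256</height><depth>3</depth></size>
  <object>
    <name>apple_scab</name>
    <bndbox><xmin>40</xmin><ymin>52</ymin><xmax>120</xmax><ymax>140</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, [(label, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(int(bndbox.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), coords))
    return root.findtext("filename"), boxes


if __name__ == "__main__":
    print(parse_voc(SAMPLE_XML))
```

Each `object` element yields one labelled box, so an image with several lesions simply contains several such elements.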
5 Experimental Setup and Results Analysis

5.1 Performance Measures

The performance of the proposed detection and instance segmentation models is evaluated using four criteria: precision, recall, F1-score, and mean average precision (mAP). The most widely used metric is mAP, which is derived from average precision (AP). The AP is calculated from precision and recall and is typically computed for each object class separately. Finally, mAP is obtained by averaging the AP over all object classes. All these metrics for object detection and segmentation models depend on a threshold value given by the user. The intersection over union (IoU) metric is used to define the threshold values for AP and mAP. The metrics are expressed in mathematical notation as follows:
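$$
\mathrm{IoU} = \frac{BB_{\mathrm{Predicted}} \cap BB_{\mathrm{GroundTruth}}}{BB_{\mathrm{Predicted}} \cup BB_{\mathrm{GroundTruth}}}
\tag{5}
$$

where $BB_{\mathrm{GroundTruth}}$ is the ground-truth bounding box given as input to the model, and $BB_{\mathrm{Predicted}}$ is the bounding box predicted by the model.

$$
\mathrm{mAP} = \frac{\sum_{i=1}^{K} \mathrm{AP}_i}{K}
\tag{6}
$$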
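Equations 5 and 6 can be sketched directly in code. The snippet below is a minimal illustration, assuming axis-aligned boxes given as (x1, y1, x2, y2) tuples and AP values that have already been computed per class; the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes, as in Eq. 5."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_average_precision(ap_per_class):
    """Average the per-class AP values over the K classes, as in Eq. 6."""
    return sum(ap_per_class) / len(ap_per_class)


if __name__ == "__main__":
    predicted, ground_truth = (0, 0, 10, 10), (5, 5, 15, 15)
    score = iou(predicted, ground_truth)
    # A detection counts as correct only if its IoU reaches the chosen threshold.
    print(f"IoU = {score:.3f}, true positive at 0.7: {score >= 0.7}")
```

In this example the two boxes overlap on a 5 × 5 region, so the IoU is 25/175 and the prediction would be rejected at an IoU threshold of 0.7.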
5.2 Experimental Setup and Results Discussions

In this work, the proposed plant disease detection and instance segmentation models are implemented on a machine with an NVIDIA Quadro K2200 graphics card (4 GB graphics memory) and 12 GB RAM. We examine two models, based on object detection and instance segmentation methods, respectively.

In the proposed plant disease detection model, we use the EfficientDet-D2 model because of memory constraints. All the images are resized to 256 px × 256 px and are split 80–20. For training, we leverage weights pretrained on the ImageNet dataset. The model is trained for 100 epochs with 10,000 iterations per epoch. We set the mini-batch size to two and reduce the learning rate by a factor of 0.001 when learning plateaus. The Adam optimizer with a momentum of 0.6 and a weight decay of 3e–4 is used. During training, we use regularization techniques such as batch normalization (BN) and dropout to prevent overfitting. We use the Swish activation function instead of ReLU to minimize the loss. The focal loss function is also employed during training.

The proposed plant disease segmentation model is based on Mask_RCNN. All the images are resized to 512 px × 512 px and are split 80–20. Here, we leverage weights pretrained on the COCO image dataset. The model is trained for 50 epochs with a batch size of 16 and a learning rate of 0.0001. During training, we use the Adam optimizer and the Leaky-ReLU activation function.

We train the two proposed models individually on the same dataset. Moreover, we empirically observe the effectiveness of the two proposed models by adjusting these hyperparameters. Finally, we compare the proposed plant disease detection and segmentation models with various state-of-the-art models, i.e., SSD, YOLOv3, Faster_RCNN, and Faster_RCNN + FPN. To test the performance of the models, the precision, recall, F1-score, and mAP metrics are considered.
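For readers unfamiliar with the two training components named above, the following sketch shows the Swish activation and a binary form of the focal loss for a single prediction. The default values `alpha=0.25` and `gamma=2.0` are commonly used settings and are an assumption here; the chapter does not report the values used in the experiments.

```python
import math


def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted probability p and label y in {0, 1}.

    ASSUMPTION: alpha and gamma are illustrative defaults, not the chapter's.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # The (1 - p_t) ** gamma factor down-weights easy, well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print("swish(1.0) =", round(swish(1.0), 4))
    print("easy example loss:", focal_loss(0.9, 1))
    print("hard example loss:", focal_loss(0.6, 1))
```

Because easy examples contribute very little to the loss, training concentrates on hard cases such as small lesions, which is one reason focal loss is widely used in one-stage detectors.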
Table 3 gives the precision, recall, and F1-score results for both proposed models. In addition, for any object detection or segmentation algorithm, the mean average precision metric is used to assess the model's overall performance. As shown in Table 4, the proposed plant disease detection and instance segmentation models give better mAP results; the detection model does so with fewer parameters and FLOPS.

Table 3 Performance comparison in terms of precision, recall, and F1-score with existing state-of-the-art models

| Method | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|
| SSD [20] | 69.03 | 72.04 | 70.50 |
| YOLOv3 [23] | 79.32 | 80.41 | 79.86 |
| Faster_RCNN [21] | 74.13 | 75.06 | 74.59 |
| Faster_RCNN + FPN [22] | 78.93 | 79.54 | 79.23 |
| Proposed model (detection) | 83.62 | 84.93 | 84.26 |
| Proposed model (instance segmentation) | 85.96 | 84.23 | 85.12 |

Table 4 Performance comparison in terms of total parameters, mAP, and FLOPS with existing models

| Method | #Parameters (millions) | mAP@IoU = 0.7 (%) | #FLOPS (billions) |
|---|---|---|---|
| SSD | 34.84 | 62.40 | 110 |
| YOLOv3 | 60.78 | 70.83 | 71 |
| Faster_RCNN | 40.16 | 64.18 | 87 |
| Faster_RCNN + FPN | 60.13 | 66.86 | 98 |
| Proposed model (detection) | 8.1 | 75.16 | 11 |
| Proposed model (instance segmentation) | 44.4 | 76.94 | 103 |

Figures 6 and 7 show a few detection samples from the proposed models. In Fig. 6, although YOLOv3 gives comparatively good detection results, it requires more parameters and FLOPS than the proposed object detection model. In Fig. 7, Faster_RCNN + FPN gives comparatively good results, but this model does not segment the diseased portions pixel by pixel. Figures 6 and 7 show that the proposed plant disease detection and segmentation frameworks effectively identify the infected portions of the diseased leaves with a specific class label. Furthermore, the empirical findings demonstrate that the proposed frameworks are capable of accurately detecting plant diseases.
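The F1-scores in Table 3 are the harmonic mean of precision and recall, which can be verified with a short sketch. The function names and the detection counts in the usage example are hypothetical; only the SSD precision and recall values are taken from Table 3.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    # SSD row of Table 3: precision 69.03 %, recall 72.04 %.
    print(f"F1 for SSD: {f1_score(69.03, 72.04):.2f} %")
    # Hypothetical counts: 8 correct detections, 2 false alarms, 2 misses.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

For the SSD row this reproduces the reported F1-score of 70.50%.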
Fig. 6 Comparison of proposed detection framework results with existing models: a Original image, b Ground_Truth, c SSD, d YOLOv3, e Proposed model 1

Fig. 7 Comparison of proposed framework results with existing models: a Original image, b Ground_Truth (proposed model), c Ground_Truth (existing models), d Faster_RCNN, e Faster_RCNN + FPN, f Proposed model 2

6 Conclusion and Future Work

In this work, domain-specific plant disease detection and instance segmentation models are designed. We generate a custom annotated dataset from two publicly available plant disease image databases. The proposed models are designed by adopting the recent EfficientDet and Mask_RCNN models. The empirical results show that the proposed plant disease detection and segmentation frameworks achieve satisfactory mAP values of 75.16% and 76.94%, respectively, and effectively identify small infected lesions. In addition, the proposed detection model identifies the small infected lesions with fewer parameters and FLOPS than the recent state-of-the-art models SSD, YOLOv3, Faster_RCNN, and Faster_RCNN + FPN. Furthermore, the proposed segmentation model extracts the exact diseased portions of the leaves pixel by pixel. Overall, the proposed plant disease detection and segmentation models can be applied for real-time detection in precision farming. In the future, we intend to expand the dataset by adding more crops to obtain a more accurate detection framework.
References

1. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318
2. Gavhale KR, Gawande U (2014) An overview of the research on plant leaves disease detection using image processing techniques. IOSR J Comput Eng (IOSR-JCE) 16(1):10–16
3. Liakos KG et al (2018) Machine learning in agriculture: a review. Sensors 18(8):2674
4. Lee SH et al (2017) How deep learning extracts and learns leaf features for plant classification. Pattern Recogn 71:1–13
5. Dyrmann M, Karstoft H, Midtiby HS (2016) Plant species classification using deep convolutional neural network. Biosyst Eng 151:72–80
6. Liu B et al (2018) Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry 10(1):11
7. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318
8. Ma J et al (2018) A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network. Comput Electron Agric 154:18–24
9. Zhang S et al (2019) Cucumber leaf disease identification with global pooling dilated convolutional neural network. Comput Electron Agric 162:422–430
10. Singh UP et al (2019) Multilayer convolution neural network for the classification of mango leaves infected by anthracnose disease. IEEE Access 7:43721–43729
11. Coulibaly S et al (2019) Deep neural networks with transfer learning in millet crop images. Comput Ind 108:115–120
12. Zhang S, Huang W, Zhang C (2019) Three-channel convolutional neural networks for vegetable leaf disease recognition. Cognit Syst Res 53:31–41
13. Priyadharshini RA et al (2019) Maize leaf disease classification using deep convolutional neural networks. Neural Comput Appl 31(12):8887–8895
14. Maeda-Gutiérrez V et al (2020) Comparison of convolutional neural network architectures for classification of tomato plant diseases. Appl Sci 10(4):1245
15. Tan M, Pang R, Le QV (2020) EfficientDet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
16. Tan M, Le Q (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning. PMLR
17. He K et al (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision
18. Mohanty SP, Hughes DP, Salathé M (2016) Using deep learning for image-based plant disease detection. Front Plant Sci 7:1419
19. Chouhan SS et al (2019) A data repository of leaf images: practice towards plant conservation with plant pathology. In: 2019 4th international conference on information systems and computer networks (ISCON). IEEE
20. Liu W et al (2016) SSD: single shot multibox detector. In: European conference on computer vision. Springer, Cham
21. Ren S et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28:91–99
22. Lin T-Y et al (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition
23. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767
Chapter 15
Early Stage Prediction of Plant Leaf Diseases Using Deep Learning Models
N. Rajathi and P. Parameswari
1 Introduction

Agriculture in every country is determined by the price and quantity of cultivated items, particularly plants. Current technological advancements can provide enough food to feed more than 7 billion people. Even so, food security continues to be jeopardized by a variety of factors such as environmental change, a decline in pollinators, and plant diseases [1]. The rapid detection of plant infections is one of the most crucial activities in agriculture. So far, most testing has been done manually, which can make it difficult to identify a disease and its category. Because identifying plant diseases is time-consuming, many specialists have turned to image processing techniques to make the task easier. Plant infections that originate in living organisms are classified as biotic; the fundamental sources of the various biotic diseases are fungi, bacteria, and viruses [2]. In all seasons, the agriculture sector has increased its efforts to protect plants from various types of diseases. The semantic gap in the recognition of infections grows gradually as pathologists discover new infections. Automation based on digital image processing can replace manual infection surveillance in plants. It also helps to avoid applying large amounts of insecticides to plants. Furthermore, computerization can foresee human life [3].

N. Rajathi (B) Information Technology Department, Kumaraguru College of Technology, Coimbatore 641049, India e-mail: [email protected]
P. Parameswari MCA Department, Kumaraguru College of Technology, Coimbatore 641049, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. S. Uddin and J. C. Bansal (eds.), Computer Vision and Machine Learning in Agriculture, Volume 2, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-16-9991-7_15
Fig. 1 General architecture of the proposed classification model
This study focuses on developing deep learning models to detect and classify various types of plant leaf diseases. The proposed system is built using deep learning techniques and uses an image dataset of various plant leaves obtained from the PlantVillage repository. The framework of the proposed model is depicted in Fig. 1. The major contribution of this chapter is the early detection of plant diseases by applying XGBoost with a CNN-SVM classifier to achieve better classification accuracy. Early detection will be immensely helpful in controlling diseases and reducing global warming. This chapter is structured as follows: Sect. 2 reviews the literature on plant diseases, and Sect. 3 gives an overview of the different types of plant diseases. In Sect. 4, a preliminary overview of SVM and XGBoost is given. Section 5 explains the proposed methodology, and the results and discussion are presented in Sect. 6. Finally, Sect. 7 gives concluding remarks.
2 Literature Survey

Numerous researchers have conducted extensive research and evaluations on plant leaf data in order to distinguish and understand infections. Such automated systems enable early identification, which helps to prevent plant damage. Specific infections and methods from the literature are reviewed below.

Fujita et al. [4] created a model for detecting seven forms of cucumber viral infections. The authors used a dataset of 7250 images that included both virally diseased and healthy leaves, and divided it into two parts: healthy and infected plant leaf images. Using data augmentation strategies and training two distinct CNN models, one for each set of data, the classifiers achieved an average accuracy of 82.3% under four-fold cross-validation.

Fuentes et al. [5] proposed a deep learning-based detector for tomato plant diseases and pests. They developed the model using their own dataset, which contained 5000 images captured by field cameras in a variety of locations and conditions. With ResNet-50 and R-FCN (region-based fully convolutional networks), the proposed framework achieved its highest precision of 86%.

Zhang et al. [6] used modified versions of the "GoogLeNet" and "Cifar10" models to distinguish eight diseases in maize crops. The dataset consisted of 500 different plant leaf images gathered from various sources, for example the PlantVillage dataset and Google sites, and was enlarged by applying various augmentation methods, for example rotation, scaling, and color change, to a size of 3060 images. With the data divided 80% and 20% into training and testing sets, the two proposed frameworks achieved detection rates of 98.9% and 98.85%, respectively.

Ma et al. [7] described a deep convolutional neural network (DCNN) framework for classifying four types of cucumber infections.
The image dataset, which was drawn from the Plant Village dataset, websites, and field images, was enlarged using various augmentation strategies and then segmented to include only the symptom regions. The results showed that, using the trained AlexNet model, the DCNN achieved its highest accuracy of 94%. Sardogan et al. [8] used a convolutional neural network (CNN) as a feature extractor, building the CNN on the RGB components of the tomato leaf images. The feature vector output by the convolution part is then fed into a learning vector quantization (LVQ) classifier. The image dataset was a subset of the Plant Village dataset comprising 500 images in five classes: four diseases and healthy leaves. The proposed framework achieved an average accuracy of 86%. Gandhi et al. [9] proposed augmenting the dataset with generative adversarial networks (GANs) to overcome the limited number of images, with classification performed by a CNN model. The Plant Village dataset was used with two distinct CNN models: a trained Inception v3 model, which achieved an accuracy of 88.6%, and a trained MobileNet model, which achieved an accuracy of 92%. The trained system was embedded in a mobile application. Table 1 summarizes further literature on plant leaf disease prediction.
Table 1 Summary of literature
Authors | Crop type | Dataset source | No. of images | Model | Accuracy (%)
Wang et al. [10] | Apple | Plant village | 2199 | Deep VGG16 model trained with transfer learning | 90.4
De Luna et al. [11] | Tomato | ImageNet | 4923 | Transfer learning disease recognition model | 95.75
Militante et al. [12] | Apple, corn, grapes, potato, sugarcane, and tomato | Plant village | 35,000 | Convolutional neural network | 96.5
Patidar et al. [13] | Rice | UCI Machine learning repository | 1200 | RNN | 95.83
Bhatia et al. [14] | Various plants | Own | 19,944 | ResNet 50 | 92.52
3 Types of Plant Diseases

Plant diseases are caused both by living organisms (biotic, or infectious, diseases) and by environmental factors (abiotic diseases). Because abiotic diseases are not infectious and not contagious, they are less dangerous and generally avoidable [15]. Biotic diseases, on the other hand, are the most dangerous and cause the most significant damage to yield. Figure 2 depicts the various types of plant diseases. Fungi, bacteria, and viruses are the primary causes of biotic diseases, which are accordingly classified into three main groups [16].
Fig. 2 Types of plant diseases
Fungal Diseases
1. Aggregate sheath
2. Black horse riding
3. Blast (leaf, neck, nodal and collar)
4. Brown spot
5. Crown sheath rot
6. Downy mildew
7. Eyespot
8. False smut
9. Kernel smut
10. Leaf smut
11. Leaf scald
12. Narrow brown leaf spot
13. Pecky rice (Kernel spotting)
14. Root rots
15. Seedling blight
16. Sheath blight
17. Sheath rot
18. Sheath spot
19. Stackburn
20. Stem rot
21. Water-mold
Fig. 3 Various fungal diseases
Fungal Diseases: These are caused by fungi and fungus-like organisms and are responsible for around 85 percent of plant diseases. Fungal spores are very small and light, which means they can travel through the air to infect other plants or trees. Fig. 3 lists the various fungal diseases, and Fig. 4 shows a leaf infected by a fungal disease.

Bacterial Diseases: Bacterial diseases are caused by approximately 200 different types of bacteria and can be spread by insects, splashing water, and other diseased plants or tools. The various bacterial diseases are listed in Fig. 5.

Viral Diseases: Viral diseases are the rarest type of plant disease and are caused by viruses. However, once a plant has been infected by a virus, there are no chemical treatments available to eliminate it, and all suspect plants should be removed to stop the infection. Insects are the most common carriers because the virus must physically enter the plant. The various viral diseases are listed in Fig. 6.

Fig. 4 Plant leaf infected by fungal disease
Bacterial Diseases
1. Bacterial blight
2. Bacterial leaf streak
3. Foot rot
4. Grain rot
5. Pecky rice
6. Sheath brown rot
Fig. 5 Various bacterial diseases

Viral Diseases
1. Rice tungro
2. Rice grassy stunt
3. Rice ragged stunt
4. Rice yellow mottling
Fig. 6 Various virus diseases
4 Preliminary Overview

4.1 Convolutional Neural Network

CNN is one of the representative network models used in deep learning and is built on artificial neural networks (ANNs). Figure 7 depicts the architecture of a conventional CNN. The back-propagation method is used in various classification tasks to iteratively adjust the weights. For feature extraction, filters are applied in the convolution layer to extract features automatically. CNN is a preferred deep learning algorithm for extracting disease characteristics from plant leaf images because of its effectiveness in identification. By applying convolution to the input data, a CNN can learn features with a high level of abstraction. A CNN's layers include convolution layers, pooling layers, and normalization layers. In the convolution layer, a feature map is generated from the raw input. The pooling layer reduces the size of the feature map generated by the convolution layer, making it more manageable for analysis. The output of the normalization layer consists of the extracted features as a single set to be fed to the classifier [17].

Fig. 7 Architecture of conventional CNN algorithm (input images, input layer, three pairs of convolution and pooling layers, feature selection maps, fully connected layer, output layer)
4.2 Support Vector Machines

Compared with other types of ML classifiers, the SVM classifier often produces strong results. It is used in this study because of its ability to handle small datasets and to perform well in high-dimensional spaces. The primary motivation for using SVM is to apply a supervised learning algorithm that finds the optimal hyperplane separating the feature space. During training, the SVM constructs hyperplanes in a high-dimensional space to separate the training data into different classes. If the data cannot be separated linearly, a kernel function is used to map the data into another vector space. SVM copes well with a large number of training features, which also leads to precise results. Consider a training dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_i \in \mathbb{R}^n$ and the class memberships are $y_i \in \{\pm 1\}$; $y_i$ denotes the label corresponding to each sample in the labeled dataset. For linear classification, the decision function defined by the separating hyperplane is

$$y_i = \operatorname{sgn}\big((w \cdot x_i) + b\big)$$

• A generic hyperplane is defined by the condition $(w \cdot x_i) + b = 0$.
• When delimited by margins, the set of hyperplanes can be written as $y_i\big((w \cdot x_i) + b\big) \ge 1$.
• To find the optimal hyperplane that separates the data, we minimize $\frac{1}{2}\lVert w \rVert^2$.
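The hyperplane conditions above can be checked numerically. The following is a minimal sketch in plain Python; the toy samples, the weight vector `w`, and the bias `b` are made-up values chosen for illustration and do not come from the chapter's experiments.

```python
def decision_value(w, x, b):
    """Return (w . x) + b for a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, x, b):
    """Return sgn((w . x) + b) as +1 or -1 (a value of exactly 0 maps to +1)."""
    return 1 if decision_value(w, x, b) >= 0 else -1


def satisfies_margin(w, b, samples):
    """Check the margin constraint y_i * ((w . x_i) + b) >= 1 for every sample."""
    return all(y * decision_value(w, x, b) >= 1 for x, y in samples)


def half_squared_norm(w):
    """Quantity minimized by the optimal hyperplane: 0.5 * ||w||^2."""
    return 0.5 * sum(wi * wi for wi in w)


# Hypothetical, linearly separable toy data: (feature vector, label in {+1, -1}).
samples = [((2.0, 2.0), 1), ((3.0, 1.0), 1), ((0.0, 0.0), -1), ((1.0, -1.0), -1)]
w, b = (0.5, 0.5), -1.0  # assumed hyperplane parameters

print(satisfies_margin(w, b, samples))
print(predict(w, (4.0, 4.0), b), predict(w, (-1.0, 0.0), b))
print(half_squared_norm(w))
```

A practical SVM solver searches for the `w` and `b` that minimize `half_squared_norm` subject to `satisfies_margin`; the sketch only evaluates those two quantities for a fixed candidate.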
Multiclass SVM: Although SVM was originally designed for binary classification, it is now commonly used for multiclass classification problems. The basic approach consists of dividing the multiclass problem into several binary problems and aggregating the outputs of all the binary sub-classifiers to give the final class prediction. There are two main techniques for multiclass SVM. The first is called "one against one": one classifier is built for each pair of classes, and the binary classifiers are combined into a multiclass classifier by choosing the class with the most votes. As a result, N(N − 1)/2 binary SVM classifiers are required, each of which is trained on examples from the two competing classes. The second strategy is known as "one against all": it considers all classes of data in a single optimization problem. For each classifier, the considered class is fitted against all the other classes, so N classes require N SVM classifiers. Training takes a long time when using the latter strategy [18].
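To make the classifier counts concrete, the sketch below enumerates the class pairs of the one-against-one scheme for the 15 classes used later in this chapter and shows majority voting over binary outputs. The vote list is a hypothetical example, not output from a trained model.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_pairs(classes):
    """Every unordered pair of classes gets its own binary classifier."""
    return list(combinations(classes, 2))


def majority_vote(votes):
    """Return the class label that received the most binary-classifier votes."""
    return Counter(votes).most_common(1)[0][0]


classes = list(range(15))            # class labels 0..14
pairs = one_vs_one_pairs(classes)

n_one_vs_one = len(pairs)            # N(N - 1)/2 binary classifiers
n_one_vs_all = len(classes)          # N binary classifiers

print(n_one_vs_one, n_one_vs_all)
print(majority_vote([3, 7, 3, 1]))   # hypothetical votes from four pairwise classifiers
```

For N = 15 the one-against-one scheme needs 105 classifiers, each trained on only two classes, whereas one-against-all needs 15 classifiers that each see the whole training set.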
4.3 Extreme Gradient Boosting (XGBoost)

This method is based on boosting, which combines the predictions of a large number of weak learners in order to create a strong learner through additive training. XGBoost prevents overfitting while also optimizing computational resources. This is achieved by simplifying the objective functions, which combine predictive and regularization terms, while maintaining optimal computational speed. In addition, at the training stage, parallel computations are automatically carried out for the functions in XGBoost. The first learner of XGBoost is fitted to the entire input space, and a subsequent model is then fitted to the residuals to address the shortcomings of a weak learner. This fitting process is repeated until the stopping criterion is met. The sum of the predictions of all learners yields the final prediction of the model. The general prediction function at step t is

$$f_i^{(t)} = \sum_{k=1}^{t} f_k(x_i) = f_i^{(t-1)} + f_t(x_i)$$

In the above equation, $f_t(x_i)$ is the learner at step $t$, $f_i^{(t)}$ and $f_i^{(t-1)}$ are the predictions at steps $t$ and $t-1$, and $x_i$ is the input variable. To prevent overfitting without compromising the computational speed of the system, XGBoost uses the following expression to assess the goodness of the model:

$$Obj^{(t)} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{t} \Omega(f_k)$$

In this equation, $l$ is the loss function, $n$ is the total number of observations used, and $\Omega$ is the regularization term, defined as

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert \omega \rVert^2$$

where $\omega$ is the vector of scores in the leaves, $T$ is the number of leaves, $\lambda$ is the regularization parameter, and $\gamma$ is the minimum loss reduction required to further split a leaf node [19].
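The additive prediction and the regularized objective can be traced by hand on a tiny example. The sketch below is an illustration only: the two stump learners, their leaf scores, the data points, and the choice of squared error as the loss $l$ are all assumptions made for the example, and it is not the XGBoost library's implementation.

```python
def regularization(leaf_scores, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lam * ||omega||^2, with T the number of leaves."""
    n_leaves = len(leaf_scores)
    return gamma * n_leaves + 0.5 * lam * sum(s * s for s in leaf_scores)


def additive_prediction(learners, x):
    """Prediction at step t: the sum of the outputs of learners f_1 .. f_t."""
    return sum(f(x) for f in learners)


def objective(xs, ys, learners, leaf_scores_per_learner, gamma, lam):
    """Training loss (squared error, an assumed choice) plus the summed penalties."""
    loss = sum((y - additive_prediction(learners, x)) ** 2 for x, y in zip(xs, ys))
    penalty = sum(regularization(s, gamma, lam) for s in leaf_scores_per_learner)
    return loss + penalty


# Two hypothetical one-split trees (stumps), each with two leaves.
def stump_1(x):
    return 1.0 if x > 0 else -1.0


def stump_2(x):
    return 0.5 if x > 1 else 0.0


learners = [stump_1, stump_2]
leaf_scores = [[1.0, -1.0], [0.5, 0.0]]
xs, ys = [2.0, -1.0], [1.5, -1.0]

# f^(t)(x) = f^(t-1)(x) + f_t(x)
step_1 = additive_prediction([stump_1], 2.0)
step_2 = additive_prediction(learners, 2.0)
print(step_1, step_2)
print(objective(xs, ys, learners, leaf_scores, gamma=1.0, lam=2.0))
```

Because the two stumps fit the toy targets exactly, the loss term is zero and the objective consists only of the penalty, which grows with the number of leaves and with the magnitude of the leaf scores.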
4.4 Proposed Method

One disadvantage of ANNs in general is the large number of hyperparameters that must be set correctly to achieve the best recognition performance. These parameters are not learned during training but are chosen manually, which complicates building the ANN [20]. The CNN model is used in this work to overcome these drawbacks. Figure 8 depicts the CNN configuration for disease detection in various plant leaves. The pipeline is divided into several stages: preprocessing the input images (noise removal, image enhancement, cropping, and resizing) to ensure the success of the subsequent steps, extracting the features with the CNN, and finally applying the multiclass SVM method to classify the various types of diseases [21]. The CNN is made up of four layers [22]:
• Convolutional layer
• Pooling layer
• Fully connected layers
• Output layer
Fig. 8 Proposed CNN with SVM-based classification model

Convolutional Layer: A convolution layer is a central component of a CNN that performs feature extraction by combining linear and nonlinear operations, i.e., the convolution operation and the activation function. Convolution is a linear operation used for feature extraction in which a small array of numbers, known as a kernel, is applied across the input, which is an array of numbers called a tensor. These layers store the outputs of the preceding kernels and consist of weights and biases to be learned. The purpose of the optimization function is to produce kernels that represent the data without error. A series of mathematical operations is performed in this layer to extract the feature map of the input digital images. This procedure is repeated with multiple kernels to form an arbitrary number of feature maps that represent different characteristics of the input tensors; different kernels can thus be regarded as different feature extractors. The size and number of kernels are the two key hyperparameters that define the convolution operation.

Nonlinear activation function: The outputs of a linear operation, such as convolution, are then passed through a nonlinear activation function. Although smooth nonlinear functions such as the sigmoid or hyperbolic tangent (tanh) were used previously because they are mathematical representations of biological neuron behavior, the most widely used nonlinear activation function today is the rectified linear unit (ReLU), which computes

$$f(x) = \max(0, x)$$

Pooling Layer: This layer performs downsampling: it reduces the feature map size, the number of parameters, the training time, and the computation cost, and it helps control overfitting. Max pooling extracts patches from the input feature maps, outputs the maximum value in each patch, and discards all other values. In practice, max pooling with a filter of size 2 × 2 and a stride of 2 is used, so the in-plane dimension of the feature maps is reduced by a factor of two in this work.
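As an illustration of the activation and pooling steps described above, the sketch below applies ReLU followed by 2 × 2 max pooling with a stride of 2 to a small feature map. The 4 × 4 values are invented for the example, and plain Python lists stand in for the tensors of a real framework.

```python
def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2; expects even height and width."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical 4 x 4 output of a convolution kernel.
feature_map = [
    [1.0, -2.0, 3.0, 0.0],
    [-1.0, 5.0, -3.0, 2.0],
    [0.0, -4.0, -1.0, -2.0],
    [2.0, 1.0, -5.0, -6.0],
]

activated = [[relu(v) for v in row] for row in feature_map]
pooled = max_pool_2x2(activated)

print(pooled)  # the 4 x 4 map shrinks to 2 x 2
```

Each in-plane dimension is halved, while the number of feature maps (channels) is left unchanged by pooling.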
Fully Connected Layer: The output feature maps of the final convolution or pooling layer are normally flattened, i.e., transformed into a one-dimensional (1D) array of numbers, and connected to one or more fully connected layers, also known as dense layers, in which every input is connected to every output by a learnable weight. Once the features extracted by the convolution layers and downsampled by the pooling layers are formed, they are mapped by a subset of fully connected layers to the final outputs of the network, for example, the probabilities for each class in classification tasks. The last fully connected layer typically has the same number of output nodes as the number of classes. Each fully connected layer is followed by a nonlinear function, e.g., ReLU. It is used to compute the class probabilities, and the output of this layer is the input of the classifier. Here, a multiclass SVM is applied to classify the different plant leaf diseases.

After a few iterations, the weights on the misclassified observations increase, while the correctly classified observations receive smaller weights. The weights on the observations indicate the class to which each observation belongs, which lowers the misclassification of the observations and greatly improves the performance of the classifiers. Aimed predominantly at reducing variance, boosting is a technique that consists of fitting several weak learners sequentially through additive training: each model in the sequence is fitted giving more importance to the observations in the dataset that were poorly handled by the previous models in the sequence.

The 15-dimensional output of the multiclass SVM corresponds to the 15 classes (14 diseased and 1 healthy) of the detection problem, and the results are discussed in the following section. The input image is a 64 × 64 × 3 color plant leaf image from the Plant Village dataset [https://www.kaggle.com/emmarex/plantdisease]. The detailed parameter settings are given in Table 2. Each input image is randomly cropped into a 30 × 30 × 32 patch as the input of the first convolution layer. The system includes two convolution layers, each of which is associated with two independent convolution kernels computed from the input. The ReLU activation function and max-pooling layers are also used. The next layer is a fully connected layer with 15 units. Finally, there is a multiclass SVM output layer with 15 possible values, corresponding to the class labels 0 to 14.

Table 2 Architecture of the proposed CNN-based SVM model
Layers | Output shape | Max-pooling size
Input layer | 60 × 60 × 32 | 2 × 2
Convolution 1 | 30 × 30 × 32 | 2 × 2
Max-pooling | 30 × 30 × 32 | 2 × 2
Convolution 2 | 26 × 26 × 64 | 2 × 2
Max-pooling | 13 × 13 × 64 | 2 × 2
Fully connected layer | 15 | n/a
Output layer | SVM multiclass | n/a
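The spatial sizes reported for the second convolution and pooling stage (30 → 26 → 13) follow from the usual output-size formulas. The sketch below reproduces that arithmetic; the 5 × 5 kernel with no padding and a stride of 1 is an assumption inferred from the 30 → 26 change, since the chapter does not state the kernel size.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2, stride=2):
    """Spatial output size of a pooling layer along one dimension."""
    return (size - window) // stride + 1


ASSUMED_KERNEL = 5  # hypothetical: not given in the chapter

after_conv2 = conv_output_size(30, kernel=ASSUMED_KERNEL)
after_pool2 = pool_output_size(after_conv2)

print(after_conv2, after_pool2)

# Flattened feature count fed to the fully connected layer, using 64 feature maps.
print(after_pool2 * after_pool2 * 64)
```

Under these assumptions the flattened vector passed to the 15-unit fully connected layer has 13 × 13 × 64 = 10,816 entries.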
4.5 Multiple Feature Extraction

Feature extraction is the next phase of the approach for disease analysis. In this phase, a set of 10 significant features is extracted from the preprocessed images at various scales. The features extracted for SVM-based classification and identification are listed below. Mean and standard deviation are concerned with properties of individual pixels [23].
• Mean_r: The average intensity value of the red pixels
• Mean_g: The average intensity value of the green pixels
• Mean_b: The average intensity value of the blue pixels
• Stddev_r: The deviation between the red pixels in the input image
• Stddev_g: The deviation between the green pixels in the input image
• Stddev_b: The deviation between the blue pixels in the input image
• Entropy measures the randomness; $I(x, y)$ is the probability matrix.

$$\text{Entropy} = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} I_i(x, y)\,\big(-\ln I_i(x, y)\big)$$

• Contrast is the intensity contrast of a given pixel relative to the entire image.

$$\text{Contrast} = \sum_{i,j=1}^{n} P_d(i, j)\,(i - j)^2$$

• Energy is the sum of the squared elements of the matrix, i.e., the square root of the angular second moment.

$$\text{Energy} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} M^2(i, j)$$

• Correlation texture measures the linear dependency of gray levels on those of adjacent pixels.

$$\text{Correlation} = \sum_{i,j=0}^{N-1} P_d(i, j)\,\frac{(i - \mu_i)(j - \mu_j)}{\sqrt{\sigma_i^2\,\sigma_j^2}}$$

• Homogeneity, or inverse difference moment (IDM), measures how closely the elements of the matrix are distributed around its diagonal.

$$\text{IDM} = \sum_{i,j=0}^{n} \frac{P_d(i, j)}{1 + |i - j|}$$
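Several of the features listed above can be computed in a few lines. The sketch below is a simplified illustration in plain Python: it computes per-channel mean and standard deviation, builds a normalized gray-level co-occurrence matrix for horizontally adjacent pixels, and derives contrast, energy (as the sum of squared entries), and IDM from it. The pixel values, the two-level toy image, and the choice of a horizontal offset are assumptions made for the example, not the chapter's settings.

```python
import math


def channel_stats(pixels, channel):
    """Mean and population standard deviation of one channel of (r, g, b) pixels."""
    values = [p[channel] for p in pixels]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def cooccurrence_matrix(image, levels):
    """Normalized co-occurrence matrix P_d for horizontally adjacent gray levels."""
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
    total = sum(sum(r) for r in counts)
    return [[c / total for c in r] for r in counts]


def texture_features(p):
    """Contrast, energy, and inverse difference moment of a co-occurrence matrix."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "idm": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


# Hypothetical data: two RGB pixels and a 2 x 3 image with gray levels {0, 1}.
mean_r, stddev_r = channel_stats([(10, 0, 0), (20, 0, 0)], channel=0)
p_d = cooccurrence_matrix([[0, 0, 1], [0, 1, 1]], levels=2)
features = texture_features(p_d)

print(mean_r, stddev_r)
print(p_d)
print(features)
```

Libraries such as scikit-image provide tested implementations of these texture measures; the hand-written version is meant only to make the sums in the formulas explicit.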
5 Result and Discussion

To experiment with and assess the proposed methodology, 20,636 plant leaf images are taken as input. The Spyder 3.3.0 IDE is used to implement the framework. Spyder offers features such as analysis, debugging, advanced editing, data exploration, data visualization, and interactive execution.
5.1 Dataset

In this research work, the input images of the different plant leaves are obtained from the Plant Village dataset [https://www.kaggle.com/emmarex/plantdisease] in the Kaggle repository. The dataset contains a total of 20,636 images, including both normal and diseased leaves, in .jpg and .png format. Fig. 9 shows some sample input images from the database. It contains images of three plants, namely pepper, potato, and tomato. The available dataset was built by manually separating the leaf images into normal and diseased classes; 14 disease classes are present in the acquired dataset. Initially, each image is resized to a resolution of 64 × 64 × 3. The features extracted at different scales by the CNN are used as the training and testing data. Of the 20,636 images collected, 80% were used to generate the training set, and the remaining 20% were used for testing.
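The 80/20 partition of the 20,636 images can be sketched as a shuffled index split. The function name `split_indices`, the fixed seed, and the use of rounding for the training size are assumptions made for illustration; the chapter does not describe how the split was drawn.

```python
import random


def split_indices(n_items, train_fraction=0.8, seed=0):
    """Shuffle item indices and split them into training and testing parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(20636)

print(len(train_idx), len(test_idx))
```

With rounding, the split yields 16,509 training images and 4127 testing images, and no image appears in both parts.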
Fig. 9 Sample image set
5.2 Performance Evaluation

In this research, the evaluation is performed for the various DL algorithms using the following metrics [24]:

$$\text{Accuracy} = \frac{DR}{TNI} \times 100$$

$$\text{Precision} = \frac{TP}{TP + FP} \times 100$$

$$\text{Recall} = \frac{TP}{TP + FN} \times 100$$

Here, TP, TN, FP, and FN represent the true positive, true negative, false positive, and false negative values, DR represents the number of detected results, and TNI represents the total number of iterations. The performance of the proposed algorithms has been assessed in various cases, which are shown in Table 3. In the evaluation scenario, plant leaf images are classified, and the comparative analysis is given in Table 3. Figure 10 shows that the accuracy of the proposed algorithm reaches its peak value of 97% when the XGBoost algorithm optimizes the hidden neurons.

Table 3 Performance evaluation
Algorithms | Accuracy (%) | Precision | Recall
CNN | 89.76 | 90 | 90
CNN-SVM | 94.68 | 94 | 95
XGBoost with CNN-SVM | 97.14 | 97 | 99

Fig. 10 Comparative analysis of accuracy, precision and recall
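The evaluation metrics of this section can be written as small helper functions. The sketch below uses the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), each scaled to a percentage. The TP, FP, and FN counts are made-up values for illustration; the accuracy call uses the counts of 20,017 detected results out of 20,636 images quoted in the chapter's discussion.

```python
def accuracy(detected_results, total):
    """Accuracy (%) = detected results / total number, times 100."""
    return 100.0 * detected_results / total


def precision(tp, fp):
    """Precision (%) = TP / (TP + FP), times 100."""
    return 100.0 * tp / (tp + fp)


def recall(tp, fn):
    """Recall (%) = TP / (TP + FN), times 100."""
    return 100.0 * tp / (tp + fn)


# Hypothetical confusion counts for a single class.
tp, fp, fn = 90, 10, 30

print(precision(tp, fp))
print(recall(tp, fn))
print(round(accuracy(20017, 20636), 2))
```

For a 15-class problem these per-class values are usually averaged over the classes; the chapter does not say which averaging was used, so the sketch leaves that step out.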
Precision and recall were calculated for all the methods, and the proposed method outperforms the others. This is presented graphically in Fig. 10. The results show that the proposed algorithm gives the best results of the methods compared. The DL algorithms provided by the Python toolbox are used to assess the performance of the proposed methodology and to count the number of normal and disease-affected leaves in the testing dataset. Of the 20,636 images, 20% (4127) were taken as testing data and 80% (16,509) as training data. With this split, the proposed algorithm reached an accuracy of 97%, i.e., 20,017 images were correctly classified into the 15 categories, while the remaining 3% (619 images) were misclassified.
6 Conclusion

In this work, early stage prediction of plant leaf diseases using deep learning models was proposed to improve classification accuracy. Three classifiers, namely CNN, CNN-SVM, and XGBoost with CNN-SVM, were proposed for early prediction. The performance of the proposed system was evaluated on a dataset comprising 15 categories of plant leaves taken from the Plant Village repository. The empirical results show that the proposed model achieved an accuracy of 97.14%, higher than the other methods. The results of the study confirm the effectiveness of the proposed model in the plant leaf disease detection task, which is a promising step toward providing a cost-effective assessment tool. Future work will develop optimal feature selection methods to reduce the complexity of the learning system.
References
1. Agriculture Economics and Importance of Agriculture in National Economy. http://agriinfo.in/?Page=topic&superid=9&topicid=185
2. Shruthi U, Nagaveni V, Raghvendra BK (2019) A review on machine learning classification techniques for plant disease detection. In: 5th international conference on advanced computing & communication systems (ICACCS). IEEE, Coimbatore
3. Dhingra G, Kumar V, Joshi HD (2017) Study of digital image processing techniques for leaf disease detection and classification. Springer-Science, 29 Nov 2017
4. Fujita E, Kawasaki Y, Uga H, Kagiwada S, Iyatomi H (2017) Basic investigation on a robust and practical plant diagnostic system. In: Proceedings of the 2016 15th IEEE international conference on machine learning and applications, ICMLA 2016, pp 989–992
5. Fuentes A, Yoon S, Kim SC, Park DS (2017) A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors (Switzerland) 17(9):2022
6. Zhang X, Qiao Y, Meng F, Fan C, Zhang M (2018) Identification of maize leaf diseases using improved deep convolutional neural networks. IEEE Access 6:30370–30377
7. Ma J, Du K, Zheng F, Zhang L, Gong Z, Sun Z (2018) A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network. Comput Electron Agric 154(September):18–24
8. Sardogan M, Tuncer A, Ozen Y (2018) Plant leaf disease detection and classification based on CNN with LVQ algorithm. In: 3rd international conference on computer science and engineering (UBMK), September 2018
9. Gandhi R, Nimbalkar S, Yelamanchili N, Ponkshe S (2018) Plant disease detection using CNNs and GANs as an augmentative approach. In: Proceedings of the 2018 IEEE international conference on innovative research and development, ICIRD 2018. IEEE Press. https://doi.org/10.1109/ICIRD.2018.8376321
10. Wang G, Sun Y, Wang J (2017) Automatic image-based plant disease severity estimation using deep learning. In: Hindawi, Computational intelligence and neuroscience, vol. 2017, Article ID 2917536, p. 8
11. De Luna RG, Dadios EP, Bandala AA (2019) Automated image capturing system for deep learning-based tomato plant leaf disease detection and recognition. Proc/TENCON 2018:1414–1419
12. Militante SV, Gerardo BD, Dionisio NV (2013) Plant leaf detection and disease recognition using deep learning. In: 2019 IEEE Eurasia conference on IOT, communication and engineering (ECICE)
13. Patidar S, Pandey A, Shirish BA, Sriram A (2020) Rice plant disease detection and classification using deep residual learning, June 2020. https://doi.org/10.1007/978-981-15-6315-7_23
14. Bhatia GS, Ahuja P, Chaudhari D, Paratkar S, Patil A (2020) Plant disease detection using deep learning. In: Smys S et al. (eds) Springer Nature Switzerland AG 2020, ICCNCT 2019, LNDECT 44, pp 408–415
15. Bhatia GS, Ahuja P, Chaudhari D, Paratkar S, Patil A (2019) FarmGuide one-stop solution for farmers. Asian J Converg Technol 5(1)
16. Gobalakrishnan N, Pradeep K, Raman CJ, Ali LJ, Gopinath MP (2020) A systematic review on image processing and machine learning techniques for detecting plant diseases. In: International conference on communication and signal processing, 28, 30 July, 2020. IEEE, India
17. Santhosh Kumar S, Raghavendra BK (2019) Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. In: 2019, 5th international conference on advanced computing & communication systems (ICACCS). IEEE
18. Aruraj A, Alex A, Subathra MSP, Sairamya NJ, George ST, Ewards SV (2020) Detection and classification of diseases of banana plant using local binary pattern and support vector machine. In: 2nd international conference on signal processing and communication (ICSPC). IEEE, Feb 2020
19. Jidong L, Ran Z (2018) Dynamic weighting multi factor stock selection strategy based on XGboost machine learning algorithm. In: 2018 IEEE international conference of safety produce informatization (IICSPI). IEEE
20. Patil P, Yaligar N, Meena SM (2017) Comparison of performance of classifiers - SVM, RF and ANN in potato blight disease detection using leaf images. In: 2017, IEEE international conference on computational intelligence and computing research (ICCIC)
21. Sun X, Park J, Kang K, Hur J (2017) Novel hybrid CNN-SVM model for recognition of functional magnetic resonance images. In: 2017 IEEE international conference on systems, man, and cybernetics (SMC), Dec 2017
22. Weimer D, Scholz-Reiter B, Shpitalni M (2016) Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. CIRP Ann Manuf Technol 65:417–420
23. Pandey A, Dubey S (2017) Evaluations of brinjal germplasm for resistance to fusarium wilt disease. Int J Sci Res Publ 7(7)
24. Goyal N, Kumar KN (2019) Plant species identification using leaf image retrieval: a study. In: International conference on computing, power and communication technologies (GUCON), Mar 2019. IEEE