Computer Vision and Machine Learning in Agriculture, Volume 3 (Algorithms for Intelligent Systems) 9819937531, 9789819937530

This book is an extension of the previous two volumes on "Computer Vision and Machine Learning in Agriculture".


English Pages 225 [215] Year 2023




Table of contents :
Preface
Contents
About the Editors
1 Computer Vision and Machine Learning in Agriculture: An Introduction
1 Introduction
2 Application Areas of CV-ML in Agriculture
2.1 Quality Analysis of Seed
2.2 Analysis of Soil
2.3 Precision Irrigation
2.4 Weed Management
2.5 Crop Monitoring
2.6 Livestock Monitoring
2.7 Food Safety
2.8 Yield Estimation
2.9 Supply Chain Management
2.10 Climate Change Adaptation
3 CV-ML in Agriculture (Vol 1 and Vol 2)
4 CV-ML in Agriculture Vol 3
5 Conclusion
References
2 Deep Learning Modeling for Gourd Species Recognition Using VGG-16
1 Introduction
2 Description of Gourd Species
2.1 Sponge Gourd
2.2 Snake Gourd
2.3 Ridge Gourd
3 Literature Review
4 Methodology
4.1 Dataset Preparation
4.2 Preprocessing
4.3 VGG-16 Deployment
4.4 Dense Architecture
5 Experimental Evaluation
5.1 Accuracy
5.2 Precision
5.3 Recall
5.4 F1-Score
6 Analysis of Results
7 System Architecture
8 Conclusion
References
3 Sugarcane Diseases Identification and Detection via Machine Learning
1 Introduction
2 Literature Review
3 Data Summary and Preprocessing
4 Methodology
5 Result
6 Conclusion
References
4 Freshness Identification of Fruits Through the Development of a Dataset
1 Introduction
2 Literature Review
3 Methodology
3.1 Recognition Procedure
3.2 System Architecture
3.3 Data Collection Procedure
3.4 Pre-trained Deep Neural Networks
4 Experimental Results
4.1 Dataset
4.2 Performance Analysis
5 Conclusion
References
5 Rice Leaf Disease Classification Using Deep Learning with Fusion Concept
1 Introduction
2 Literature Survey
3 Dataset Used
4 Typical Deep Learning Architecture
4.1 Fusion Methods
4.2 Data Pre-processing
4.3 Feature Extraction
4.4 Machine Learning Classifiers
5 Experiment and Results
6 Conclusion and Future Scope
References
6 Advances in Deep Learning-Based Technologies in Rice Crop Management
1 Introduction
2 DL-Based Rice Crop Management System
2.1 Rice CNN Models
2.2 Rice Transformers
3 Applications of DL in Rice Crop Management
3.1 Rice Development Stage Analysis
3.2 Rice Disease and Disorder Diagnosis
3.3 Rice Variety Identification
4 Challenges and Future Scopes
5 Conclusion
References
7 AI-Based Agriculture Recommendation System for Farmers
1 Introduction
2 Literature Survey
3 System Design and Architecture
4 Recommendation System for Farmers
4.1 Crop Recommendation System
4.2 Leaf Disease Detection System
4.3 Fertilizer Recommendation System
4.4 Chatbot for Farmers
4.5 Web Application for Farmers
5 Results
6 Conclusion and Future Work
References
8 A New Methodology to Detect Plant Disease Using Reprojected Multispectral Images from RGB Colour Space
1 Introduction
2 Literature Review
3 Data Set
4 Model Architecture
5 Result
6 Conclusion
References
9 Analysis of the Performance of YOLO Models for Tomato Plant Diseases Identification
1 Introduction
2 Literature Review
3 Methodology
3.1 Experimental Environment
3.2 Data Set
3.3 Training, Test and Validation
3.4 Performance Matrix
4 Result
5 Conclusion
References
10 Strawberries Maturity Level Detection Using Convolutional Neural Network (CNN) and Ensemble Method
1 Introduction
2 Materials and Methods
2.1 Acquisition of Image Dataset and Pre-processing
2.2 CNN-Based Detection and Classification
2.3 Comparison Criteria
2.4 Evaluation Matrices
3 Results and Discussion
3.1 Comparison of the Individual CNN Models and Ensemble Models
4 Conclusion
References
11 RGB to Multispectral Remap: A Cost-Effective Novel Approach to Recognize and Segment Plant Disease
1 Introduction
2 Literature Review
3 Dataset
4 Model
5 Result
6 Conclusion
References
12 An Intelligent Vision-Guided Framework of the Unmanned Aerial System for Precision Agriculture
1 Introduction
2 Material and Method
2.1 Sensors
2.2 Target Recognition System
2.3 Main System
2.4 Autopilot Drive System
3 Experimentation and Results
3.1 Experimental Scenario
3.2 Simulation Experiment
4 Discussion
5 Conclusion
References
13 Leveraging Computer Vision for Precision Viticulture
1 Introduction
2 Typical Viticulture Logbook
2.1 Basic Practices
3 Towards Computer Vision-Based Automated Logbook
3.1 Computer Vision-Based Practices
4 Computer Vision-Based Subtasks
4.1 Detection of Vine Parts
4.2 Structural Elements Detection
4.3 Supplementary Detection Subtasks
5 Discussion
6 Conclusions
References
Author Index


Algorithms for Intelligent Systems Series Editors: Jagdish Chand Bansal · Kusum Deep · Atulya K. Nagar

Jagdish Chand Bansal · Mohammad Shorif Uddin, Editors

Computer Vision and Machine Learning in Agriculture, Volume 3

Algorithms for Intelligent Systems Series Editors Jagdish Chand Bansal, Department of Mathematics, South Asian University, New Delhi, Delhi, India Kusum Deep, Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India Atulya K. Nagar, School of Mathematics, Computer Science and Engineering, Liverpool Hope University, Liverpool, UK

This book series publishes research on the analysis and development of algorithms for intelligent systems with their applications to various real-world problems. It covers research related to autonomous agents, multi-agent systems, behavioral modeling, reinforcement learning, game theory, mechanism design, machine learning, metaheuristic search, optimization, planning and scheduling, artificial neural networks, evolutionary computation, swarm intelligence, and other algorithms for intelligent systems. The book series includes recent advancements, modifications, and applications of artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, fuzzy systems, autonomous and multi-agent systems, machine learning, and other intelligent systems related areas. The material will be beneficial for graduate students, post-graduate students, as well as researchers who want a broader view of advances in algorithms for intelligent systems. The contents will also be useful to researchers from other fields who have no knowledge of the power of intelligent systems, e.g., researchers in the field of bioinformatics, biochemists, mechanical and chemical engineers, economists, musicians, and medical practitioners. The series publishes monographs, edited volumes, advanced textbooks, and selected proceedings. Indexed by zbMATH. All books published in the series are submitted for consideration in Web of Science.

Jagdish Chand Bansal · Mohammad Shorif Uddin Editors

Computer Vision and Machine Learning in Agriculture, Volume 3

Editors Jagdish Chand Bansal Department of Applied Mathematics South Asian University New Delhi, India

Mohammad Shorif Uddin Department of Computer Science and Engineering Jahangirnagar University Dhaka, Bangladesh

ISSN 2524-7565 ISSN 2524-7573 (electronic) Algorithms for Intelligent Systems ISBN 978-981-99-3753-0 ISBN 978-981-99-3754-7 (eBook) https://doi.org/10.1007/978-981-99-3754-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

In recent years, computer vision and machine learning (CV-ML) have been continuously evolving, with new applications, techniques, and models. CV-ML has reached the agriculture sector, where it assists in assessing plant diseases and monitoring crops to prevent yield loss and to safeguard farmers from financial damage. Moreover, the integration of CV-ML in the development of intelligent robots and drones has enabled farmers to perform numerous tasks, such as planting, weeding, harvesting, and plant health monitoring, more efficiently. These books will be helpful for agricultural academics, practitioners, and farmers who want to stay connected with the most recent developments and innovations. The current book is a continuation of our previous books (Computer Vision and Machine Learning in Agriculture, ISBN: 978-981-33-6424-0, https://doi.org/10.1007/978-981-33-6424-0, and Computer Vision and Machine Learning in Agriculture, Volume 2, ISBN: 978-981-16-9990-0, https://doi.org/10.1007/978-981-16-9991-7) and contains 13 chapters.

Chapter 1 highlights the diverse applications of computer vision and machine learning in agriculture, along with a sequential description of the previous volumes. Chapter 2 presents the most recent advancements in machine vision system techniques in the agricultural industry to distinguish the three most easily confused vegetables: sponge gourds, snake gourds, and ridge gourds. Chapter 3 describes the development of a strong and efficient method for detecting sugarcane leaf diseases using YOLO v7 and YOLO v8. The technique was evaluated using a dataset of sugarcane leaf images and confirmed that the YOLO v8 algorithm had an accuracy of over 96% in detecting sugarcane diseases. Chapter 4 presents an algorithm for the freshness identification of various types of fruits by offering an extensive dataset. It investigated five deep learning models (VGG 16, VGG 19, Inception V3, ResNet50, and MobileNet V2) using the developed dataset and found that VGG 16 gives the highest accuracy of 96.04%. Chapter 5 focuses on developing an efficient rice leaf disease detection system capable of identifying various types of rice leaf diseases. The system utilizes a large dataset of rice plant images obtained from the Kaggle API and is trained on deep learning models. Chapter 6 presents an overview of the advancements in deep learning-based technologies for managing rice crops, focusing on the latest convolutional neural network and transformer models. Chapter 7 focuses on the implementation of a chatbot for plant and fertilizer recommendations using machine learning to improve crop productivity. Chapter 8 presents a deep learning-based YOLO v3 tiny model for detecting plant diseases using multispectral images. The study shows that the proposed model increases detection accuracy by 4.35% compared to RGB color-based images using the same deep learning-based detection model. Chapter 9 compares the performance of five YOLO models for detecting tomato leaf diseases from various perspectives. Chapter 10 investigates the performance of various CNN and ensemble models for detecting and classifying strawberries based on their maturity level. Chapter 11 presents a minimal deep learning model that uses remapped multispectral images to perform leaf disease segmentation without relying on data augmentation techniques. Chapter 12 introduces a novel framework for detecting multiple targets, such as crops, weeds, mud, and other objects, using unmanned aerial vehicles (UAVs). Chapter 13 provides a comprehensive review of the use of computer vision in precision viticulture.

We hope the topics covered in the current volume, along with the two previous volumes, will serve as comprehensive literature for both beginners and experienced readers, including researchers, academicians, and students who wish to explore the applications of computer vision and machine learning systems in the agricultural sector for boosting production. We sincerely appreciate the time, effort, and contribution of the authors and esteemed reviewers in maintaining the quality of the papers. Special thanks to the supporting team of Springer for helping in publishing this book.

New Delhi, India
Dhaka, Bangladesh

Jagdish Chand Bansal
Mohammad Shorif Uddin

Contents

1 Computer Vision and Machine Learning in Agriculture: An Introduction . . . 1
Jagdish Chand Bansal and Mohammad Shorif Uddin

2 Deep Learning Modeling for Gourd Species Recognition Using VGG-16 . . . 19
Md. Mehedi Hasan, Khairul Alam, Sunzida Siddique, Tofayel Ahamed Topu, Md. Tarek Habib, and Mohammad Shorif Uddin

3 Sugarcane Diseases Identification and Detection via Machine Learning . . . 37
Md Mostafizur Rahman Komol, Md Sabid Hasan, and Shahnewaz Ali

4 Freshness Identification of Fruits Through the Development of a Dataset . . . 53
Nusrat Sultana, Musfika Jahan, and Mohammad Shorif Uddin

5 Rice Leaf Disease Classification Using Deep Learning with Fusion Concept . . . 69
N. Rajathi, K. Yogajeeva, V. Vanitha, and P. Parameswari

6 Advances in Deep Learning-Based Technologies in Rice Crop Management . . . 79
Mayuri Sharma and Chandan Jyoti Kumar

7 AI-Based Agriculture Recommendation System for Farmers . . . 91
V. Vanitha, N. Rajathi, and K. Prakash Kumar

8 A New Methodology to Detect Plant Disease Using Reprojected Multispectral Images from RGB Colour Space . . . 105
Shakil Ahmed and Shahnewaz Ali

9 Analysis of the Performance of YOLO Models for Tomato Plant Diseases Identification . . . 115
Shakil Ahmed

10 Strawberries Maturity Level Detection Using Convolutional Neural Network (CNN) and Ensemble Method . . . 131
Zeynep Dilan Daşkın, Muhammad Umer Khan, Bulent Irfanoglu, and Muhammad Shahab Alam

11 RGB to Multispectral Remap: A Cost-Effective Novel Approach to Recognize and Segment Plant Disease . . . 147
Shahnewaz Ali and Shakil Ahmed

12 An Intelligent Vision-Guided Framework of the Unmanned Aerial System for Precision Agriculture . . . 159
Shahbaz Khan, Muhammad Tufail, Muhammad Tahir Khan, Zubair Ahmad Khan, Javaid Iqbal, and Razaullah Khan

13 Leveraging Computer Vision for Precision Viticulture . . . 177
Eleni Vrochidou and George A. Papakostas

Author Index . . . 215

About the Editors

Dr. Jagdish Chand Bansal is Associate Professor (Senior Grade) at South Asian University, New Delhi, and Visiting Faculty at Mathematics and Computer Science, Liverpool Hope University, UK. Dr. Bansal obtained his Ph.D. in Mathematics from IIT Roorkee. Before joining SAU, New Delhi, he worked as an Assistant Professor at ABV-Indian Institute of Information Technology and Management Gwalior and BITS Pilani. His primary area of interest is swarm intelligence and nature-inspired optimization techniques. Recently, he proposed a fission-fusion social structure-based optimization algorithm, spider monkey optimization (SMO), which is being applied to various problems in the engineering domain. He has published more than 70 research papers in various international journals/conferences. He is the Section Editor (editor-in-chief) of the journal MethodsX published by Elsevier. He is the Series Editor of the book series Algorithms for Intelligent Systems (AIS), Studies in Autonomic, Data-driven and Industrial Computing (SADIC), and Innovations in Sustainable Technologies and Computing (ISTC) published by Springer. He is also an Associate Editor of Engineering Applications of Artificial Intelligence (EAAI) and ARRAY published by Elsevier. He is the General Secretary of the Soft Computing Research Society (SCRS). He has also received Gold Medals at the UG and PG levels.

Prof. Mohammad Shorif Uddin completed his Doctor of Engineering (Ph.D.) at Kyoto Institute of Technology, Japan, in 2002, his Master of Technology Education at Shiga University, Japan, in 1999, his Bachelor of Electrical and Electronic Engineering at Bangladesh University of Engineering and Technology (BUET) in 1991, and also a Master of Business Administration (MBA) from Jahangirnagar University in 2013. He began his teaching career as a Lecturer in 1991 at Chittagong University of Engineering and Technology (CUET). In 1992, he joined the Computer Science and Engineering Department of Jahangirnagar University, where he is at present a Professor. He served as Chairman of the Computer Science and Engineering Department of Jahangirnagar University from June 2014 to June 2017 and as Teacher-in-Charge of the ICT Cell of Jahangirnagar University from February 2015 to April 2023. He worked as an Adviser at ULAB from September 2009 to October 2020 and at Hamdard University Bangladesh from November 2020 to November 2021. He undertook postdoctoral research at the Bioinformatics Institute, Singapore; Toyota Technological Institute, Japan; Kyoto Institute of Technology, Japan; Chiba University, Japan; Bonn University, Germany; and the Institute of Automation, Chinese Academy of Sciences, China. His research is motivated by applications in the fields of artificial intelligence, machine learning, computer vision, and image security. He holds two patents for his scientific inventions and has published around 200 research papers in international journals and conference proceedings. In addition, he has edited a good number of books and written many chapters. He has delivered a remarkable number of keynotes and invited talks and also acted as General Chair, TPC Chair, or Co-chair of many international conferences. He received the Best Paper Award at the International Conference on Informatics, Electronics and Vision (ICIEV2013), Dhaka, Bangladesh, and the Best Presenter Award at the International Conference on Computer Vision and Graphics (ICCVG 2004), Warsaw, Poland. He was the Coach of the Jahangirnagar University ACM ICPC World Finals teams in 2015 and 2017 and has supervised a good number of doctoral and master's theses. He is a Fellow of IEB and BCS, a Senior Member of IEEE, and an Associate Editor of IEEE Access.

Chapter 1

Computer Vision and Machine Learning in Agriculture: An Introduction

Jagdish Chand Bansal and Mohammad Shorif Uddin

1 Introduction

The agriculture sector is the backbone of most countries' economies. According to the UNDP's 2021 report on "leveraging digital technology for sustainable agriculture," world food production needs to increase by 98% to sustain a growing population of 9.7 billion by 2050 [1]. To achieve this target, the available resources such as land, labor, capital, and technology must be utilized effectively [2]. In its current condition, precision agriculture attempts to maximize output while conserving resources by optimizing farm management decision support systems. To boost productivity, efficiency, and revenues, data-driven farming is necessary to address the growing challenge of food security. Technological intervention is necessary to address issues such as the demand for food, labor, water, and climate change [3, 4]. Modern agriculture primarily relies on science, innovation, and ICT infrastructures. The conventional systems used for managing agricultural data are not only tedious but also susceptible to errors. Therefore, it is imperative to leverage advancements in remote sensing, imaging systems, digital applications, sensors, and intelligent data analysis using decision support systems to make farming smarter [5]. The use of cutting-edge technologies such as IoT, machine learning (ML), computer vision (CV), cloud computing, and blockchain can significantly improve food production and address emerging challenges in the agriculture sector. AI has the potential to transform agriculture in many ways. By using machine learning and computer vision, farmers and other agriculture stakeholders can make better decisions and increase productivity. Machine learning can be used to analyze data from various sources, such as weather forecasts and crop yields, to predict future trends and generate accurate predictions about what crops to plant and when to plant them [6, 7]. Also, machine learning algorithms empower drones and robots to carry out and automate precise agricultural tasks, including plowing fields, monitoring crops, and sorting and packing fruits and vegetables [8]. Computer vision can be used to detect pests and diseases early, allowing farmers to take action before the problem becomes widespread. This technology can also be used to monitor crop growth, soil moisture levels, and other factors affecting crop yields [9]. Additionally, the automation of computer vision makes it possible to gather crucial data on farm animals, fields, and gardens while enabling the monitoring, forecasting, and evaluation of specific objects based on their visual characteristics. Furthermore, AI can be used to automate tasks such as planting, harvesting, and irrigation, freeing up farmers' time and reducing labor costs.

J. C. Bansal (B), Department of Applied Mathematics, South Asian University, New Delhi, India, e-mail: [email protected]
M. S. Uddin, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_1

2 Application Areas of CV-ML in Agriculture

Computer vision and machine learning (CV-ML) are constantly evolving and finding increasing applications in agriculture. The use of CV-ML in agriculture enables farmers to acquire vast amounts of information that were not accessible a few years ago and to make more informed decisions. Both UAV and satellite imagery are useful for analyzing large areas of land and enhancing agricultural methods. Let us look at how modern technology improves farming's productivity and efficiency while reducing its labor intensiveness.

2.1 Quality Analysis of Seed

The seed industry relies heavily on seed quality analysis to ensure that only high-quality seeds are used for planting [10]. However, traditional seed quality analysis methods are time-consuming and vulnerable to human error. To overcome these limitations, CV-ML techniques have been developed for automated seed quality analysis [11, 12]. In research conducted by Zhu et al., the combination of spectroscopy and machine learning, specifically convolutional neural network (CNN) models, was shown to be successful in distinguishing seed varieties [13]. The study demonstrated that these machine learning techniques achieved an accuracy rate of over 80% when classifying cotton seeds based on features extracted by CNN and residual network models. Khatri et al. used machine learning to classify wheat seeds based on seven physical features [14]. The dataset had 210 wheat kernels from three wheat varieties, and 70 instances were randomly selected for the experiment. The study demonstrated that an ensemble approach utilizing hard voting achieved the highest accuracy of 95%. Qiu et al. utilized hyperspectral imaging and three machine learning techniques, namely CNN, SVM, and k-NN, to identify four varieties of rice seeds [15]. The study used two different spectral ranges and altered the number of training samples. Heo et al. employed CNNs to separate weed seeds from high-quality seeds [16]. Veeramani et al. utilized CNNs to differentiate between haploid and polyploid maize seeds [17], while Uzal et al. used CNNs to estimate the number of seeds per pod in soybean [18].
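To make the ensemble idea above concrete, the following sketch shows a hard-voting classifier of the kind Khatri et al. report, combining logistic regression, SVM, and random forest votes on a seven-feature wheat kernel table. The file and column names are hypothetical placeholders, and the snippet illustrates the technique rather than reproducing the cited implementation.

```python
# Illustrative hard-voting ensemble for wheat seed classification.
# "wheat_seeds.csv" and the "variety" column are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

df = pd.read_csv("wheat_seeds.csv")          # seven geometric kernel features + label
X, y = df.drop(columns="variety"), df["variety"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Hard voting: each base model casts one vote per sample and the
# majority label wins, which often beats any single model.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("svm", SVC()),
                ("rf", RandomForestClassifier(n_estimators=100))],
    voting="hard")
ensemble.fit(X_train, y_train)
print(f"Test accuracy: {ensemble.score(X_test, y_test):.3f}")
```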

2.2 Analysis of Soil

The primary focus of soil management in agriculture is to preserve and enhance dynamic soil characteristics to increase crop productivity [19]. The conventional methods to analyze soil texture require collecting soil samples and sending them to a laboratory for drying, crushing, and sieving to facilitate examination. Nonetheless, with the recent progress in image processing and image acquisition systems like cameras, there has been a growing inclination toward computer vision-based image analysis methods in the field of soil science. This approach involves capturing soil images (either static or dynamic) with cameras and using simple computer programs to classify and categorize them. Haralick et al. [20] tried to categorize images obtained from aerial or satellite sources using entropy and angular moment-based textural characterization. Subsequently, the gray-level co-occurrence matrix (GLCM) and its equivalents have found extensive application in several remote sensing domains [21, 22]. Similarly, Barman et al. introduced a soil texture classification mechanism that uses a multi-class support vector machine [23]. Chandan and Thakur reviewed various machine learning methods employed for soil classification [24]. They specifically focused on the ten primary types of soil present in India: red soil, alluvial soil, forest soil, saline soil, black/regur soil, desert/arid soil, sub-mountain soil, marshy/peaty soil, laterite soil, and snowfields. The ML techniques used for soil classification include support vector machines (SVM), k-nearest neighbors (k-NN), decision trees (DT), and artificial neural networks (ANN). These methods utilize detectable soil characteristics such as quality, moisture content, structure, nutrients, pH, and texture for classification purposes. Honawad et al. proposed a soil classification method that extracts textural features for retrieval, applying Gabor filtering, color quantization, and low masking techniques to the original soil images [25]. In another study, Shukla et al. demonstrated the use of the random forest (RF) algorithm to classify 11 distinct categories of soil in Indian districts. In their study, they employed a set of soil-forming factors known as "scorpan" as covariates to fine-tune the RF model and assess its effectiveness [26].
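As a minimal illustration of the texture-based classification discussed above, the sketch below derives Haralick-style gray-level co-occurrence (GLCM) statistics from soil images and feeds them to a multi-class SVM, in the spirit of the GLCM and SVM studies cited. The image paths and class labels are placeholders, not data from those studies.

```python
# GLCM-texture + SVM pipeline sketch for soil image classification.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import graycomatrix, graycoprops
from skimage.io import imread
from sklearn.svm import SVC

def glcm_features(path):
    gray = (rgb2gray(imread(path)) * 255).astype(np.uint8)
    glcm = graycomatrix(gray, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    # Contrast, homogeneity, energy, correlation: classic Haralick-style stats.
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

paths = ["soil_001.jpg", "soil_002.jpg"]       # placeholder image files
labels = ["alluvial", "laterite"]              # placeholder soil classes
X = np.array([glcm_features(p) for p in paths])
clf = SVC(kernel="rbf").fit(X, labels)         # multi-class SVM (one-vs-one internally)
```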

2.3 Precision Irrigation

Efficient irrigation is essential for sustainable agriculture [27]. In order to optimize irrigation and reduce water waste while increasing crop yields, computer vision and machine learning can assess weather data and soil moisture data. Various techniques for irrigation system frameworks have been suggested to achieve water savings, such as thermal imaging, the crop water stress index (CWSI), and direct measurements of soil water. Thermal imaging is a popular method based on the plant's canopy temperature distribution, where irrigation is planned according to continuous monitoring of the plant's water status [28]. Additionally, a CWSI-based technique has been introduced for efficient water usage in crop irrigation scheduling. It is worth noting that CWSI was initially characterized more than four decades ago, making it a well-established approach [29]. Allen et al. proposed an approach based on evapotranspiration (ET) as a vital factor in determining crop irrigation needs, which is influenced by climate factors such as temperature, wind velocity, relative humidity, and solar radiation, and by crop characteristics like growth stage, soil properties, variety, and pest and disease control [30]. Using ET-based techniques can lead to water savings of up to 42% compared to time-based irrigation scheduling [31]. Davis et al. conducted a study in Florida and found that ET-based irrigation scheduling controllers are more cost-effective and require less labor; compared to scheduled practices, an ET-based irrigation technique uses much less water [32]. Goldstein et al. presented an irrigation recommendation technique using an ML algorithm that incorporates the expertise of agronomists [33]. The researchers found that the gradient boosted regression trees (GBRT) model was the most accurate, achieving a 93% prediction accuracy rate for irrigation plans and recommendations. Meanwhile, Roopaei et al. created an intelligent irrigation monitoring technique that uses a thermal imaging camera attached to a drone [34].
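A commonly used empirical form of the CWSI mentioned above normalizes canopy temperature between wet (fully transpiring) and dry (non-transpiring) reference baselines. The sketch below applies this form to a synthetic thermal frame; the reference temperatures and the irrigation threshold are illustrative assumptions rather than values from the cited work.

```python
# Simplified crop water stress index: CWSI = (Tc - Twet) / (Tdry - Twet).
# Values near 1 suggest water stress, so a scheduler might trigger
# irrigation above a chosen threshold.
import numpy as np

def cwsi(canopy_temp, t_wet, t_dry):
    """canopy_temp: 2D array of canopy temperatures (deg C)."""
    index = (canopy_temp - t_wet) / (t_dry - t_wet)
    return np.clip(index, 0.0, 1.0)

thermal = np.random.uniform(24, 34, size=(480, 640))  # stand-in for a thermal frame
stress_map = cwsi(thermal, t_wet=24.0, t_dry=36.0)    # illustrative baselines
if stress_map.mean() > 0.6:                           # illustrative threshold
    print("Schedule irrigation")
```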

2.4 Weed Management

One of the primary factors that has a detrimental effect on agricultural production is the presence of weeds. As efforts are made to enhance agricultural productivity, there has been a surge in the use of chemicals to control weed growth. However, increasing productivity also requires making the best use of available resources, which can only be done by spraying weeds precisely. The precise detection and localization of weeds are crucial for precise spraying, and this is where computer vision techniques come in handy [35]. In recent years, researchers have conducted numerous studies on the feasibility of computer vision technology for categorizing plant species at the field level for agronomic purposes, including distinguishing between crops, weeds, and off-types [36]. Sabzi et al. introduced a computer vision system based on neural networks that can differentiate between potato plants and three distinct weed types, enabling targeted spraying within the field [37]. The system achieved a high accuracy rate of 98.38% while maintaining a rapid average execution time of under 0.8 s. Zhai et al. suggested a precision farming system (PFS) using multi-agent systems (MAS) [38]. This approach can effectively allocate scarce resources and plan tasks to spray pesticides only in areas where weeds are present, reducing the risk of contamination of crops, animals, water resources, and humans. Chang et al. utilized a combination of CV and multitasking to construct a small intelligent agricultural machine that can automatically weed and irrigate cultivated land with variable irrigation [39]. Genetic algorithms have been suggested by researchers for weed detection problems. For instance, Nguyen et al. employed genetic programming to differentiate between rice and non-rice based on 20×20-pixel windows [40]. Their findings indicate that this approach considerably outperformed a simpler approach based on color thresholding, resulting in a 90% accuracy rate. Similarly, Watchareeruetai et al. utilized genetic algorithms to set the parameters of a fuzzy logic classifier based on color and texture for detecting winter lawn weeds [41]. This hybrid approach showed advantages in terms of improved weed detection. A weed detection system using multispectral images was proposed by Li et al., which employed four distinct wavelengths and achieved a recognition accuracy of 90.7% [42]. In another study, Zheng et al. addressed the issue of weed detection in maize crops by utilizing color indices [43]. They applied principal component analysis (PCA) to select optimal color features and achieved 90-93.8% accuracy while minimizing the impact of illumination.
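To illustrate the color-index approach used in several of the studies above, the following sketch segments vegetation from soil with the excess green index (ExG = 2g - r - b) and an Otsu threshold. It is a generic example with a placeholder image path, not a reimplementation of any cited method.

```python
# Color-index-based vegetation masking with the excess green index.
import cv2
import numpy as np

img = cv2.imread("field.jpg").astype(np.float32)      # placeholder image
b, g, r = cv2.split(img / 255.0)
total = r + g + b + 1e-6
rn, gn, bn = r / total, g / total, b / total          # chromatic coordinates
exg = 2 * gn - rn - bn                                # excess green index

# Otsu thresholding separates vegetation from soil background.
exg_u8 = cv2.normalize(exg, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, mask = cv2.threshold(exg_u8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("vegetation_mask.png", mask)
```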

2.5 Crop Monitoring

Computer vision and machine learning can analyze images of crops to detect signs of stress or disease, enabling farmers to intervene early and prevent crop losses [44]. For example, ML algorithms can analyze images of grapevines to detect signs of grapevine leafroll disease, which can significantly reduce grape yield and quality [45]. Numerous studies have investigated identifying crop stresses using DL-based computer vision, encompassing water stress and nutrient deficiencies [46, 47]. Farago et al. utilized a noninvasive approach to measure the physiological and morphological parameters of plants grown in vitro [48]. The plant images were analyzed using MATLAB, and essential parameters, such as plant size, chlorophyll content, and convex ratio, were calculated. Fernandez et al. examined the impact of botanical indicators and color spaces on various ML algorithms for monitoring changes in crop types and leaf color [49]. The identification of grape berries and the detection of grape bunches were carried out by Rodrigo et al. using a visible-spectrum camera [50]. Their method utilizes the segmentation of combined pixel regions along with shape and texture information for accurate recognition. Rangarajan et al. introduced a model to detect diseases in plants early and increase crop productivity [51]. Their experiment used the PlantVillage dataset with 13,262 tomato images divided into seven classes: one healthy class and six classes of diseased tomatoes. They utilized the transfer learning models VGG-16 and AlexNet and achieved a maximum accuracy of 97.5% using the AlexNet model, with a shorter execution time than VGG-16. Liu et al. proposed a multispectral computer vision system to detect invertebrate pests on green leaves in the natural environment [52]. The system demonstrated acceptable accuracy in detecting twelve common invertebrate crop pests. It can make real-time action decisions for robots, making it an essential tool for integrated pest management (IPM).
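A hedged sketch of the transfer-learning recipe behind results such as Rangarajan et al.'s follows: a VGG-16 backbone pretrained on ImageNet is frozen and only its classification head is retrained for the seven tomato classes. The dataset directory is a placeholder, and the single training loop is illustrative rather than a tuned replication.

```python
# VGG-16 transfer learning sketch for leaf disease classification.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tf = transforms.Compose([transforms.Resize((224, 224)),
                         transforms.ToTensor()])
data = datasets.ImageFolder("plantvillage_tomato/", transform=tf)  # placeholder dir
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = models.vgg16(weights="IMAGENET1K_V1")
for p in model.features.parameters():      # freeze the ImageNet feature extractor
    p.requires_grad = False
model.classifier[6] = nn.Linear(4096, 7)   # 1 healthy + 6 disease classes

opt = torch.optim.Adam(model.classifier[6].parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
for images, labels in loader:              # one illustrative training epoch
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```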


2.6 Livestock Monitoring

Monitoring the health and behavior of livestock is essential for disease prevention and animal welfare [53]. Farmers can take early action by analyzing video feeds for signs of illness or distress in animals using CV-ML [54]. The well-being and productivity of dairy cows are intricately tied to several factors, including their daily activity patterns, food intake, and rumination. These are considered critical indicators because they provide valuable insights into the overall health and productivity of the cows [55, 56]. In studying these indicators, computer vision approaches have been increasingly replacing conventional techniques of direct observation and time-lapse video recording. Dutta et al. introduced a technique for categorizing cattle behavior using ML techniques and collar-based sensors, such as magnetometers and three-axis accelerometers [57]. The research analyzed events such as oestrus and changes in cattle diet to ensure proper nutrition. Pegorini et al. utilized machine learning-based techniques to automatically identify and classify the chewing habits of calves, analyzing their behavioral patterns and health [58]. Ebrahimie et al. developed a machine learning technique for predicting sub-clinical mastitis (SCM) in dairy herds using milking parameters [59]. They evaluated four classification models and found that random forest (RF) with the Gini index criterion had the highest accuracy of 90% in predicting SCM independent of somatic cell count (SCC). In a related study, Ebrahimie et al. investigated the use of the attribute weighting model (AWM) to identify lactose concentration and electrical conductivity in milk as indicators of SCM in dairy cattle [60]. Machado et al. utilized the random forest technique to investigate the factors that contribute to the spread of bovine viral diarrhea virus (BVDV) disease in cattle in southern Brazil [61]. The study revealed that the occurrence of the disease was significantly affected by several factors, including the number of cattle in neighboring farms, insemination, and routine rectal palpation. Morales et al. used SVM to detect and address egg production issues early in poultry farms [62]. The proposed method estimated warnings a day in advance with a high estimation accuracy of 0.9854. Hansen et al. utilized CNN-based deep learning techniques to identify pig faces from digital images acquired in commercial farm environments with unpredictable conditions such as lighting and dirt [63]. The proposed approach achieved an accuracy of 96.7% in recognizing the faces.
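The sensor-based behavior classification cited above can be sketched roughly as follows: summarize each fixed-length window of three-axis accelerometer readings with simple statistics and train a classifier on the resulting features. The signal windows and behavior labels here are synthetic placeholders, not data from the cited studies.

```python
# Toy collar-accelerometer behavior classification sketch.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(window):
    """window: (n_samples, 3) array of three-axis accelerometer readings."""
    return np.hstack([window.mean(axis=0), window.std(axis=0),
                      np.abs(np.diff(window, axis=0)).mean(axis=0)])

rng = np.random.default_rng(0)
windows = rng.normal(size=(200, 128, 3))   # 200 synthetic 128-sample windows
labels = rng.choice(["grazing", "ruminating", "resting"], size=200)
X = np.array([window_features(w) for w in windows])
clf = RandomForestClassifier(n_estimators=200).fit(X, labels)
```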

2.7 Food Safety

Technologies that can swiftly and precisely identify pathogens in food are becoming more and more necessary due to the rise in foodborne illnesses. CV-ML can automate the detection of contaminants, such as bacteria or chemicals, in food products, reducing the risk of foodborne illnesses [64]. For example, machine learning algorithms can analyze images of fresh produce to detect signs of contamination, such as bruises or discoloration [65]. Leiva et al. proposed a machine vision system (MVS) for automatically identifying defects in blueberries [66]. A pattern recognition algorithm was used to differentiate the calyx and stem, and diseased blueberries were identified, along with their orientation. The authors evaluated the performance of four models: linear discriminant analysis (LDA), Mahalanobis distance (a data covariance distance), k-NN with k = 5, and SVM, to determine the optimal classifier. Nandi et al. employed multi-attribute decision making (MADM) to grade mangoes and used support vector regression (SVR) to predict the optimal time to ship the harvested mangoes to market [67]. The authors then used a fuzzy incremental learning algorithm that evaluated the mangoes' grade based on SVR and MADM. Amatya et al. created an MVS-based application for automated cherry harvesting [68]. The application employs a Bayesian classifier to predict which branches are partially covered by foliage, achieving 89.6% accuracy in branch pixel classification. Zhang et al. suggested a system for detecting defects in apples with automatic lightness correction and a weighted relevance vector machine (RVM) [69]. The system achieved an impressive 95.63% accuracy in detecting defects in apples. Pan et al. employed hyperspectral imaging to detect cold injury in peaches and applied artificial neural networks to predict quality parameters [70]. Shafiee et al. employed an MVS to capture and transform the color of honey images and used an ANN to predict important quality indices such as antioxidant activity, ash content (AC), and total phenolic content (TPC) of honey [71]. Zareiforoush et al. developed a system to classify milled rice grains using CV and metaheuristic techniques [72]. The method extracted size, shape, color, and texture features to construct an initial feature vector and selected the most significant features using a "greedy hill climbing" and backtracking algorithm. The final feature vector was used to train different classifiers, including SVM, DT, ANN, and a Bayesian network, and the results revealed that the ANN was the best classifier among them. Wan et al. created an MVS that captured images of tomatoes and segmented the regions of interest (ROIs) from the whole image [73]. The maturity level of Roma and pear tomatoes was classified using a back-propagation neural network (BPNN).
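A compact way to run the kind of classifier comparison Leiva et al. describe is cross-validation over the candidate models. The sketch below compares LDA, k-NN with k = 5, and SVM on a placeholder feature matrix of berry descriptors; the file names are hypothetical.

```python
# Cross-validated comparison of three of the classifiers evaluated above.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X = np.load("blueberry_features.npy")   # hypothetical color/texture descriptors
y = np.load("blueberry_labels.npy")     # defective vs. sound
for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("k-NN (k=5)", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC())]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```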

2.8 Yield Estimation

Accurate and timely crop yield estimation is crucial for farmers and other stakeholders to make informed decisions about post-harvest planning, crop management, and policy-making [74, 75]. Recent research suggests that DL-based computer vision applied to aerial images provides a better approach to yield estimation than traditional methods. For instance, Yang et al. conducted a study that utilized convolutional neural networks to estimate the yield of rice grains using low-altitude remote sensing data [76]. You et al. used a blend of recurrent neural networks and CNNs on remotely sensed images to predict soybean production [77]. Singh et al. observed that hailstorms in February and March 2015 caused an 8.4% decline in India's wheat production [78]. Accurate weather predictions can benefit financially weak farmers who lack proper storage facilities. ML models act as feed-forward control, anticipating factors affecting crop yield and allowing for corrective action before anomalies hit production. Panda et al. investigated the efficiency of four spectral vegetation indices, namely the normalized difference vegetation index (NDVI), green vegetation index (GVI), soil adjusted vegetation index (SAVI), and perpendicular vegetation index (PVI), in predicting corn crop yield [79]. The researchers used back-propagation neural network (BPNN) modeling to test the efficacy of the four indices in predicting crop yield. The results indicated that corn yield prediction was most accurate when using the means and standard deviations of PVI grid images. Kulkarni et al. applied deep learning (DL) models for rice crop yield prediction by utilizing soil properties, nutrient measurements, and historical rainfall data recorded over 31 years [80]. Recurrent neural network (RNN) models were used to process the input data for yield prediction. To improve the effectiveness of the prediction, the authors experimented with various activation functions in the neural network, such as sigmoid, ReLU, and linear. Gumuscu et al. investigated the efficacy of three supervised ML algorithms, namely k-NN, support vector machine, and DT, for predicting early, normal, and late planting dates for wheat crops in Turkey [81]. To train the machine learning algorithms, they utilized climate data collected over the previous 300 days and used genetic algorithms (GA) for feature selection. The analysis demonstrated that the k-NN classification technique outperformed the other techniques for predicting wheat crop planting dates. Nevavuori et al. explored the use of a convolutional neural network for wheat and barley yield prediction in the agricultural fields of Pori, Finland [82]. They utilized NDVI and RGB data obtained from cameras attached to UAVs to train a six-layer convolutional neural network. The results indicated that the RGB dataset was the best predictor of crop yield in the convolutional neural network model, with a mean absolute error (MAE) of 484.3 kg ha-1 and a mean absolute percentage error (MAPE) of 8.8%.
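The vegetation index and error metrics named above are simple to compute; the sketch below implements NDVI = (NIR - Red)/(NIR + Red) together with MAE and MAPE, using synthetic band arrays and illustrative yield numbers rather than data from the cited studies.

```python
# NDVI from red and near-infrared bands, plus MAE/MAPE scoring.
import numpy as np

def ndvi(nir, red):
    return (nir - red) / (nir + red + 1e-6)   # epsilon avoids division by zero

def mae(y_true, y_pred):
    return np.abs(y_true - y_pred).mean()

def mape(y_true, y_pred):
    return 100.0 * np.abs((y_true - y_pred) / y_true).mean()

nir = np.random.rand(256, 256)                # stand-in NIR band
red = np.random.rand(256, 256)                # stand-in red band
print("mean NDVI:", ndvi(nir, red).mean())

y_true = np.array([5200.0, 4800.0, 6100.0])   # kg/ha, illustrative yields
y_pred = np.array([5000.0, 5100.0, 5900.0])
print(f"MAE: {mae(y_true, y_pred):.1f} kg/ha, MAPE: {mape(y_true, y_pred):.1f}%")
```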

2.9 Supply Chain Management

Supply chain management in agriculture coordinates the procedures needed to produce and deliver agricultural products to consumers. It involves several steps: farming, harvesting, shipping, processing, packing, and distribution. In order to improve supply chain management, artificial intelligence can be used to forecast product demand and optimize logistics and transportation routes [83]. Mele et al. introduced a decision support system based on a mixed integer linear programming model to minimize the total supply chain cost and environmental impact in a sugarcane supply chain case [84]. Naik et al. conducted a study on the challenges faced in establishing sustainable agri-retail supply chains, with a particular focus on involving small farmers in the business [85]. Banasik et al. conducted a case study on mushroom agricultural supply chains [86]. They used a multi-objective linear programming method to assess the trade-offs between economic and environmental factors. Ribeiro et al. stated that ML techniques such as DL and ANN are utilized in food retailing to forecast buyer perception, demand, and buying behavior [87]. Vlontzos et al. explored ML techniques such as ANN, DT, genetic algorithms, the nearest neighbor method, k-means-type algorithms, and rule induction in food retail [88]. The study primarily focuses on attracting customers' attention using these ML techniques. Singh et al. utilized SVM and hierarchical sampling with multiscale bootstrap resampling in a big data analytics-based text mining technique to improve retail supply chain planning [89]. They focused on sentiment analysis of customer feedback from social media platforms to inform decision-makers in supply chain management and develop a buyer-centric retail supply chain. Maleki et al. demonstrated customer-centric practices and the integration of customer values in food retail using Bayesian networks and ANNs [90]. The study showed that Bayesian networks can be helpful in predicting consumers' buying behavior toward various food items and performing quality checks of retail food items. Lilavanichakul et al. utilized ANNs and logistic regression techniques to recognize the factors affecting consumer purchasing behavior for imported ready-to-eat foods [91]. The study found that nonlinear demand forecasting approaches were more effective and accurate in predicting buyer demand for various food items.
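As a toy illustration of the mixed integer linear programming approach cited above (Mele et al.), the sketch below chooses which farms to use and how much produce to ship so as to minimize total cost. All costs, capacities, and demands are made up, and availability of the PuLP package is assumed.

```python
# Toy MILP: binary farm-use decisions plus continuous shipping flows.
import pulp

farms, markets = ["F1", "F2"], ["M1", "M2"]
supply = {"F1": 80, "F2": 60}                       # tonnes available
demand = {"M1": 50, "M2": 70}                       # tonnes required
ship_cost = {("F1", "M1"): 4, ("F1", "M2"): 6,
             ("F2", "M1"): 5, ("F2", "M2"): 3}      # cost per tonne

prob = pulp.LpProblem("agri_supply_chain", pulp.LpMinimize)
x = pulp.LpVariable.dicts("ship", list(ship_cost), lowBound=0)
use = pulp.LpVariable.dicts("use_farm", farms, cat="Binary")  # integer decisions

# Objective: shipping cost plus a fixed cost for each farm used.
prob += pulp.lpSum(ship_cost[k] * x[k] for k in ship_cost) + \
        pulp.lpSum(20 * use[f] for f in farms)
for f in farms:                                     # ship only from opened farms
    prob += pulp.lpSum(x[(f, m)] for m in markets) <= supply[f] * use[f]
for m in markets:                                   # satisfy every market
    prob += pulp.lpSum(x[(f, m)] for f in farms) >= demand[m]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("total cost:", pulp.value(prob.objective))
```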

2.10 Climate Change Adaptation

Global issues like climate change must be addressed urgently. As a result of significant technical advances, ML and DL approaches have become increasingly common in various sectors, including climate change [92]. With the help of ML and DL algorithms, it is possible to forecast weather patterns and inform farmers about how climate change may affect their crops, for example through drought or pest outbreaks [93]. Rainfall prediction is crucial for flood risk assessment, water resource management, and agriculture. However, due to the chaotic nature of rainfall, statistical methods struggle to predict it accurately. Cramer et al. assessed the effectiveness of seven machine learning algorithms for rainfall prediction [94]. Results demonstrate that the radial basis function neural network (RBFNN) outperforms other state-of-the-art algorithms. Diez et al. proposed different ML algorithms to predict rainfall on Tenerife, a Spanish island, based on atmospheric synoptic patterns [95]. They found that neural networks outperformed other ML algorithms for rainfall prediction. Kamatchi et al. utilized a neural network for weather prediction and suggested a hybrid recommender technique to increase the system's success rate [96]. Vulova et al. addressed the issue of water scarcity and urban heat island effects caused by climate change, emphasizing the importance of evapotranspiration (ET) in urban greening projects [97]. They developed an urban ET model using CNNs and RF algorithms, flux footprint modeling, and GIS data in Berlin, Germany. While RF performed better in accuracy, CNN also showed satisfactory results, indicating potential for further investigation with various model designs. Manandhar et al. highlighted the importance of using systematic methodologies to assess the effects of climate change adaptation policies [98]. They employed RF and Gaussian processes to observe the long-term effects of flood control policies in Bangladesh. The authors emphasized the need for a comprehensive analysis to evaluate immediate and long-term effects, as well as the anticipated and unanticipated outcomes of interventions, and presented information to validate future climate change policies.
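Two of the model families named above, random forests and Gaussian processes, can be compared on a rainfall regression task in a few lines; the sketch below does so on synthetic synoptic-pattern features, purely as an illustration of the workflow rather than a reproduction of the cited experiments.

```python
# Comparing two regressors for rainfall prediction on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))          # e.g., pressure, humidity, wind features
y = np.abs(X @ rng.normal(size=6) + rng.normal(scale=0.3, size=300))  # fake rainfall

for name, model in [("Random forest", RandomForestRegressor(n_estimators=200)),
                    ("Gaussian process (RBF kernel)",
                     GaussianProcessRegressor(kernel=RBF(length_scale=1.0)))]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: R^2 = {r2.mean():.3f}")
```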

3 CV-ML in Agriculture (Vol 1 and Vol 2)

Machine learning and computer vision are two rapidly evolving technologies that have the potential to revolutionize agriculture and its related fields. The first volume on computer vision and machine learning, published in 2021, comprises 11 chapters discussing numerous computer vision and machine learning topics in agriculture. The first chapter discusses computer vision and machine learning technology in agriculture, along with current challenges and potential future uses. The second chapter comprehensively overviews various agricultural robotic applications, including grafting, picking, weeding, spraying, harvesting, and navigation. It also highlights the challenges of commercializing real-world drone and robot applications for enhancing agricultural yield. The third chapter covers how a deep convolutional neural network built on computer vision can be used to identify rotten fruits and vegetables. The fourth chapter concentrates on applying a region-based Faster R-CNN to locate and distinguish between beneficial and harmful pests in paddy fields. The fifth chapter focuses on classifying mango insects using an ensemble of three fine-tuned deep learning techniques: MobileNet, Xception, and VGG-19. The sixth chapter examines the use of a CNN for the early diagnosis of nine different tomato leaf diseases. The seventh chapter introduces a multi-plant diagnosis approach that was created and assessed utilizing six well-known image recognition baseline strategies: DenseNet, Inception, MobileNet, ResNet, VGG, and Xception. The eighth chapter demonstrates the effectiveness of deep CNN techniques in the early detection of potato disease, using a dataset of 7870 images of different diseases. The study evaluates the performance of various CNN models based on several metrics and concludes that ResNet is the most efficient technique for this application. The ninth chapter compares four image segmentation techniques for automated local fruit disease recognition. The tenth chapter reviews the latest developments in fruit and vegetable disease recognition using machine vision. Finally, the eleventh chapter introduces a new method for identifying diseased plants based on a bag-of-features technique using gray relational analysis.

The second volume of Computer Vision and Machine Learning includes 15 chapters covering a range of topics related to agriculture and deep learning. Chapter 1 discusses the application of harvesting robots for various fruits and vegetables and provides an overview of their design mechanisms and commercialization challenges. Chapter 2 proposes four distinct architectures for weed detection in agriculture using the Apache Spark and Spark Streaming platforms. Chapter 3 focuses on developing a deep learning framework for multi-class crops and orchards using an optimized Faster R-CNN approach. Chapter 4 introduces a new methodology for enhancing the annotation flow of large agricultural datasets using content-based image retrieval driven by deep learning. Chapter 5 proposes a contemporary farm monitoring technique using IoT technology. Chapter 6 applies multilinear regression, support vector machines, and fuzzy-based models to examine the effects of precipitation and evapotranspiration on wheat yield. Chapter 7 is dedicated to identifying an effective deep learning strategy for the autonomous recognition of coconut maturity. Chapter 8 provides a comprehensive review of the application of spectroscopic and imaging techniques for food quality analysis. Chapter 9 creates leaf datasets for eleven medicinal plants and explores various deep learning models to identify the most effective one. Chapter 10 proposes a methodology for disease detection in cotton and rice leaves using machine learning based on optimized features generated through exponential spider monkey optimization. Chapter 11 examines various CNN and transfer learning techniques employed to classify four prevalent diseases in cauliflower. Chapter 12 discusses the development of an intelligent system using a mobile and web platform to identify crop diseases and predict their spread. Chapter 13 introduces deep learning and transfer learning strategies for detecting and identifying five common apple leaf diseases. Chapter 14 creates a framework for automated detection and segmentation of plant leaf diseases utilizing EfficientDet and Mask R-CNN deep learning models. Finally, Chapter 15 explores the use of deep learning models to predict plant leaf diseases in their early stages, proposing three classifiers (XGBoost, CNN, and CNN-SVM), with CNN-SVM recommended for early prediction.

4 CV-ML in Agriculture Vol 3

The domain of computer vision and machine learning is continuously evolving, with new techniques, algorithms, and models being developed regularly. Using computer vision and machine learning (CV-ML) in agriculture is crucial for assessing plant diseases and monitoring crop conditions to prevent yield loss, compromised quality, and significant financial damage for farmers. Scientific and technological progress in detecting defects, grading quality, and recognizing diseases in diverse agricultural plants, fruits, leaves, and crops has been substantial. Moreover, the integration of CV-ML in the development of intelligent robots and drones has enabled farmers to perform numerous tasks, such as planting, weeding, harvesting, and plant health monitoring, more efficiently. These books will be helpful for agricultural academics and practitioners who want to stay connected with the most recent developments in these fields. This book is a continuation of our previous books (Computer Vision and Machine Learning in Agriculture, ISBN 978-981-33-6424-0, https://doi.org/10.1007/978-981-33-6424-0, and Computer Vision and Machine Learning in Agriculture, Volume 2, ISBN 978-981-16-9990-0, https://doi.org/10.1007/978-981-16-9991-7). This third volume, which comprises 12 chapters (other than the introduction), covers recent studies of CV-ML in agriculture, including detecting plant, leaf, and fruit diseases, monitoring crop health, utilizing AI in agriculture, precision farming, evaluating product quality and defects, and more.


Chapter 2, “Deep Learning Modeling for Gourd Species Recognition Using VGG16” proposes the most recent advancements in machine vision system techniques in the agricultural industry to distinguish the three most confusing vegetables: sponge gourds, snake gourds, and ridge gourds, using VGG-16 CNN model and has achieved an impressive accuracy rate of 99.5%. Chapter 3 “Sugarcane Diseases Identification and Detection via Machine Learning” discusses the development of a strong and efficient method for detecting sugarcane leaf diseases using YOLOv7 and v8. The proposed technique was evaluated using a dataset of sugarcane leaf images. The results demonstrated that the YOLOv8 algorithm had an accuracy of over 96% in detecting two kinds of sugarcane diseases. Chapter 4 “Recognition of Fresh and Rotten Fruits through the Development of a Dataset” presents a comprehensive fruit dataset to enhance accuracy in fruit recognition. The study investigated five deep learning models: VGG-16, VGG-19, Inception V3, ResNet50, and MobileNet V2. VGG-16 emerged as the most effective model, achieving an accuracy of 96.04% on the fruit dataset. Chapter 5 “Rice Leaf Disease Classification using Deep Learning with Fusion Concept” focuses on developing an efficient rice leaf disease detection (RLDD) system capable of identifying various types of rice leaf diseases. The system utilizes more than 3355 datasets of rice plant images obtained from the Kaggle API and is trained on deep learning models. The proposed model outperforms other models with an accuracy of 98.85%. Chapter 6 “Advances in Deep Learning-based Technologies in Rice Crop Management” presented an overview of the advancements in deep learning (DL)-based technologies for managing rice crops, focusing on the latest convolutional neural network (CNN) and transformer models utilized in agriculture. Chapter 7 “AI-based agriculture recommendation system for farmers” focuses on improving crop productivity by developing and integrating modules, including leaf disease detection, plant recommendations, and fertilizer recommendations. The study employs ensemble learning and grid search decision trees for leaf disease detection and fertilizer suggestions. An IoT module is developed for data collection and prediction, and a ChatBot is implemented to provide farmers with recommendations in their native language. The study achieves high accuracy, with 98% for leaf disease detection and 92% for plant recommendations. Chapter 8 “A new methodology to detect plant disease using reprojected multispectral images from RGB color space” presents a deep learning-based YOLOv3 tiny model for detecting plant diseases using multispectral images. The study shows that the proposed model increases detection accuracy by 4.35% compared to RGB color-based images using the same deep learning-based detection model. Chapter 9 “Analysis of the Performance of YOLO Models for Tomato Plant Diseases Identification” compares the performance of five YOLO models for detecting tomato leaf diseases: YOLO-3, YOLO-4, YOLO-4 tiny, YOLO-5, and YOLO-5 tiny. The YOLO-5 model achieved higher detection accuracy, precision, recall, and F-1 scores. YOLO-5 tiny had a faster detection time but compromised detection accuracy. Overall, YOLO-5 is the most effective model for accurate results. 
Chapter 10, “Strawberries Maturity Level Detection Using Convolutional Neural Network (CNN) and Ensemble Method,” introduces various CNN and ensemble models utilized for detecting and classifying strawberries based on their maturity level. The models considered in this study include AlexNet, GoogleNet, SqueezeNet, DenseNet, and VGG-16, as well as two ensemble networks: SqueezeNet-GoogleNet and VGG-16-GoogleNet. Among these models, SqueezeNet is considered the most effective.

Chapter 11, “RGB to Multispectral Remap: A Cost-Effective Novel Approach to Recognize and Segment Plant Disease,” presents a minimal deep learning model that uses remapped multispectral images to perform leaf disease segmentation without relying on data augmentation techniques. The model achieved an 81.3% dice score, demonstrating its effectiveness in segmenting plant diseases.

Chapter 12, “An Intelligent Vision-Guided Framework of the Unmanned Aerial System for Precision Agriculture,” introduces a novel framework for detecting multiple targets, such as crops, weeds, mud, and other objects, using unmanned aerial vehicles (UAVs). The proposed framework includes target recognition, navigation, and movement of the UAVs toward the identified targets. All targets were detected during both simulation and real-world experiments, which confirms the resilience of the developed framework.

Chapter 13, “Leveraging Computer Vision for Precision Viticulture,” provides a comprehensive review of the use of computer vision in viticulture. The study concentrates primarily on the conventional vineyard management calendar and provides frameworks for work activities in the vineyard, organized by the months of the year, based on the annual grapevine growth cycle.

5 Conclusion

Computer vision and machine learning have the potential to transform agriculture by providing farmers with powerful tools to optimize crop yield, reduce waste, and promote sustainability. The three-volume book series on computer vision and machine learning in agriculture is a comprehensive and informative guide covering a wide range of topics related to applying these technologies in agriculture. The first volume introduces the reader to the basic concepts and various topics, including disease detection, classification, and image segmentation techniques. The second volume focuses on specific applications, including robotic harvesting, weed detection, plant recognition, leaf disease identification, and IoT-based agriculture. The third volume covers recent studies and emerging trends in precision farming, product quality evaluation, crop management, and AI-based agriculture recommendation systems. By examining various deep learning models, transfer learning approaches, and image processing methods, the series highlights the potential of CV-ML to enhance agricultural practices and boost productivity. However, issues still need to be resolved, such as protecting the privacy and security of data, dealing with limited access to digital devices in rural areas, and creating accessible and inexpensive technology for small-scale farmers. This book could be a beneficial resource for scholars and decision-makers who want to understand more about the possibilities of CV-ML in agriculture and to advance equitable and sustainable farming practices.



Chapter 2

Deep Learning Modeling for Gourd Species Recognition Using VGG-16

Md. Mehedi Hasan, Khairul Alam, Sunzida Siddique, Tofayel Ahamed Topu, Md. Tarek Habib and Mohammad Shorif Uddin

M. M. Hasan (*) · K. Alam · S. Siddique · T. A. Topu
Daffodil International University, Dhaka, Bangladesh
e-mail: [email protected]
K. Alam, e-mail: [email protected]
S. Siddique, e-mail: [email protected]
T. A. Topu, e-mail: [email protected]
M. T. Habib, Independent University, Dhaka, Bangladesh, e-mail: [email protected]
M. S. Uddin, Jahangirnagar University, Dhaka, Bangladesh, e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Bansal and M. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_2

1 Introduction

Farmers in Bangladesh produce about a hundred types of vegetables. The appearance of many of these vegetables is so similar that urban residents often find it difficult to identify them correctly. The gourd vegetables of the Cucurbitaceae family, including sponge gourd, ridge gourd, and snake gourd, are among the most perplexing, as can be observed in Table 1. Because these three vegetables have very similar external appearances, people, especially young urban residents, often misrecognize them.

We conducted an online survey among 140 urban and rural students in the age group of 20–27 years to determine the proportion of the younger generation that correctly recognizes these three vegetables. We observed that 28% of the total participants, all urban students, were able to identify the three vegetables correctly, while 20% of the participants, also urban, failed to do so. In comparison, 39% of the total participants are rural and succeeded in identifying the three vegetables, while 13% of the participants, also rural, failed to do so. This is delineated as a pie chart in Fig. 1.

Since the number of mobile phone users is increasing day by day and people can now easily capture pictures with smartphone cameras, computer vision and machine learning can offer a solution through a mobile or web application. The use of images has become an important part of agricultural science: using images, it is possible to digitally recognize different types of vegetables and fruits in the agriculture industry [1]. Nonetheless, the classification of fruits and vegetables and their species remains challenging for artificial intelligence, because they can have similar shapes or textures and only external features can be distinguished from images [2]. Unsupervised learning and rich feature extraction are two characteristics of deep neural networks that help to increase a network's performance, and the convolutional neural network (CNN) has achieved significant success in large-scale imaging [3]. Recently, the use of VGG-16 for transfer learning has gained attention. The multi-level features of the VGG-16 CNN model offer good performance, as shown in the research performed in [4, 5], when model size, speed, and adaptability are taken into account. This is the key justification for utilizing VGG-16 in our work. Using the collected picture dataset, the VGG-16 CNN model has been trained for the intended mobile or web application. The performance metrics used to assess the CNN model were accuracy, precision, recall, and F1-score. Based on the results of these metrics, the desired mobile or online application for gourd species recognition can be built. We anticipate that the CNN-based recognition system will have a significant impact on the agriculture industry in the future and a wide range of applications.

The chapter includes the following sections: a brief description of the gourd species and a literature review are given in Sects. 2 and 3, respectively. The proposed methodology is specified in Sect. 4. The experimental evaluation is discussed in Sect. 5, and the analysis of results is given in Sect. 6. The system architecture is described in Sect. 7, and we conclude with future work in Sect. 8.

Table 1  Samples of the three different gourd species dealt with

English name    Bangladeshi name       Sample image
Sponge gourd    ধুন্দল (Dhundul)          (image)
Snake gourd     চিচিঙ্গা (Chichina)        (image)
Ridge gourd     ঝিঙ্গা (Jhina)            (image)

Fig. 1  The results of the online survey performed on urban and rural students

2 Description of Gourd Species

Gourd describes the fruits of plants in the Cucurbitaceae family, including melon, sponge gourd, snake gourd, and ridge gourd. Among the large number of gourd species, our interest centers on sponge gourd, snake gourd, and ridge gourd only, which are briefly described here and shown in Table 1.


2.1 Sponge Gourd

Sponge gourd (Bengali: ধুন্দল (dhundul)) is a summer vegetable, as shown in Table 1. Its scientific name is Luffa cylindrica; it is native to South Africa, has a length of approximately 22–26 cm, and is very light green in color. It is a good source of vitamins A and C, iron, and calcium. The body of the sponge gourd is long, narrow at the head, and thick toward the bottom, where a flower-like part is seen [6]. Sponge gourd is a nutritious vegetable that helps provide relief from stomach problems on hot days.

2.2 Snake Gourd

Snake gourd (Bengali: চিচিঙ্গা (Chichina)) is a summer vegetable, as shown in Table 1. Its scientific name is Trichosanthes cucumerina; its length varies from 30 to 100 cm, and its color is deep green with white spots. It contains a fair amount of vitamins A and C and calcium. Snake gourds look smooth, greenish-white, long, slender, and cylindrical, and are often twisted [6]. Snake gourd is beneficial for hair fall problems and for some conditions such as constipation, diabetes, and jaundice.

2.3 Ridge Gourd

Ridge gourd (Bengali: ঝিঙ্গা (Jhina)) is a minor cucurbitaceous vegetable, as shown in Table 1. The base of this vegetable is thicker than the upper part. It is a dark green vegetable that has a white spongy pulp with embedded white seeds, in contrast to its dark green shell [7]. Ridge gourds are abundant in a wide range of vital nutrients, including water, dietary fiber, vitamins A and C, iron, magnesium, and vitamin B6.

3 Literature Review

Computer vision techniques have become increasingly popular in agriculture in recent years. Numerous related works are reviewed here as a foundation for our analysis and comprehension.

Dubey et al. [1] classified fruits and vegetables and identified fruit diseases using computer vision techniques. For fruit disease detection, segmentation was performed with the K-means clustering method, and disease classification errors of 1 and 3% were reported. They compared performance on two related problems: the first is the classification of fruits and vegetables, and the second is the classification of fruit diseases that are quite similar in color and texture. The dataset contains 2615 images in 15 distinct categories: Fuji apple, Granny Smith apple, plum, Asterix potato, Agata potato, cashew, orange, honeydew melon, Tahiti lime, onion, nectarine, kiwi, watermelon, Spanish pear, and diamond peach. However, the work discloses only one disease, which limits it; the authors plan to try different species and varieties of vegetables and fruits in the future.

El-Salam [2] presented a computer vision technique for sorting and grading fruits and vegetables. In this study, a 94% success rate was attained on 600 samples in distinguishing defective apples from excellent ones. The system inspects 3000 apples each minute with three cameras, each of which covers 24 apples in its range of vision. Neural networks and fuzzy inference methods were effectively applied for image feature classification.

Zeng [3] proposed a system for fruit and vegetable recognition, creating a convolutional neural network model that classifies a representation of extracted image features. The researchers used the VGG model for training on 26 categories, and the proposed approach achieves 95.6% accuracy on a dataset of 12,173 images. The author plans to extend the work to the recognition of regular daily vegetables and fruits.

Sakai et al. [8] proposed a very simple but powerful CNN model for object category recognition that both extracts and learns object features. Image data were applied to eight different vegetables, including tomato, carrot, banana, cabbage, spinach, eggplant, and shiitake mushroom. A total of 160 images (20 photographs of each vegetable) were used for training, while 5 images of each vegetable (40 in total) were used for testing. Between 1 and 10 million learning iterations were used; performance increased up to 3 million iterations and decreased at 10 million. The analysis shows a learning rate of 99.14% and a recognition rate of 97.58%. The authors plan to focus on object tracking using DNNs in the future.

Muhtaseb et al. [9] addressed an application to differentiate fruits and vegetables using image histograms, based on colors and sizes to find appropriate matches. They worked with lemon, apple, cucumber, eggplant, and pear, and the application shows 75% accuracy in recognizing fruits. The underlying method used the Chi-square method for matching images. The authors plan to expand the range of recognized fruits in the future.

Dubey et al. [10] worked on the problem of identifying fruits and vegetables with a computer vision system that takes an image as input and gives species and variety as output. They also investigated texture features, and the paper shows 99% accuracy based on the ISADH texture feature. They used three steps: the K-means clustering technique for image segmentation, feature extraction, and classification of fruits and vegetables using a multi-class support vector machine. They used 15 categories of fruit and vegetable images for their models and plan to focus on multiple features and their shapes in the future.

Xiaojun et al. [11] worked with deep neural networks and machine vision systems. The detection of vegetables and the drawing of bounding boxes depend on the CenterNet model, which identifies only the vegetables and weed species, while image processing was used to produce color-index-based segmentation. Their CenterNet approach achieved a precision of 95.6%, a recall of 95.0%, and an F1-score of 0.953. Image data of bok choy (Chinese white cabbage) were used; 1150 images formed the training dataset, and input images were resized to 512 × 512 pixels. They introduced another approach to remove weeds from the backdrop, in which genetic algorithms based on Bayesian classification error are used to assess the color index. They plan to use in-situ videos to identify weeds in the future.

Ercole et al. [12] analyzed a detection model for Escherichia coli (E. coli) organisms in vegetable food using an antibody-based biosensor. The bacteria were chosen as a typical sign of fecal contamination. It is possible to create a recognition antigen–antibody system capable of detecting E. coli bacteria in food and environmental water; a monoclonal antibody is being developed, and the immobilization method is being improved. The vegetable samples used were lettuce, sliced carrots, and rucola; they were rinsed with pH 6.8 peptone water and blended in a stomacher or a sonicator to separate bacterial cells and recover them in the liquid medium. In comparison with traditional approaches, the PAB system looks more sensitive: a concentration of 10 cells/ml was discovered in around 1.5 h, indicating a detection time 10–20 times faster than the traditional CFU technique.

Kaur et al. [13] introduced a model that uses an artificial neural network to detect the quality of vegetables, characterized by color, shape, and size. The ANN technique was used for quality checking on a sample dataset of four vegetable images. The researchers plan to focus on quality detection using ANNs in the future.

Kagaya et al. [14] analyzed a model for food detection and recognition, using a convolutional neural network for classification. Their model delivers an enhanced accuracy of 93.8%. The study used 1200 food images and 1000 non-food images in the training phase and 200 images in the testing stage; the result was computed over 15 trials. They used four credibility-based techniques for food image detection and recognition.

Pouladzadeh et al. [15] synthesize an investigation of food detection, classification, and analysis related to eating habits and dietary evaluation, using calorie measurements of food portions. The work involves 3000 different food images taken by various cameras and under different lighting conditions; the data collection includes two kinds of portions, single and mixed. The papers show the use of graph-cut segmentation and deep learning techniques to detect food.


Probst et al. [16] describe an investigation of a food record method and prototype for the Australian diet. For this evaluation, a smartphone with image processing and pattern recognition software is used. Scale-invariant feature transformation, local binary patterns (LBP), and color are used in the study to describe the food images, and a bag-of-words model is used for image recognition. The input images are over 2000 pixels. They used an organized dataset, with training sets used during codebook creation and test sets isolated from the dataset. The researchers plan to focus on food detection and recognition in the future.

Rimi et al. [17] developed a machine learning model for recognizing nine different legume species. To achieve this goal, they used classical machine learning models such as k-NN, SVMs, and the decision-tree-based CART algorithm, and for deep learning they mainly used transfer learning models such as VGG-16, Inception-v3, and ResNet-50. All algorithms were applied to 4320 training and 1080 test images resized to 224 × 224. In the performance analysis, they showed that the highest accuracy for a classical machine learning model was 76.0%, while for deep learning an impressive and optimistic accuracy of 98.0% was achieved by Inception-v3 utilizing 23,817,352 trainable parameters.

Nuruzzaman et al. [18] introduced a machine-vision-based identification model that can identify four species of potato in Bangladesh. For machine learning modeling, they used random forest, linear discriminant analysis, logistic regression, SVMs, CART, naïve Bayes, and the k-NN algorithm. For image preprocessing, they used three different analysis techniques: Hu moments, Haralick texture, and color histograms. The highest accuracy of 98% was achieved by the logistic regression algorithm utilizing 200 potato images.

Summing up the preceding discussion of relevant works, there is a variety of work on fruits and vegetables, such as disease recognition, object (i.e., fruit and vegetable) recognition, species recognition, and quality grading. However, no work has been done on visually similar fruits and vegetables or their species recognition, nor has a detailed investigation been carried out on it as a subproblem of fruit and vegetable species recognition.

4 Methodology

To complete our work, we maintain the workflow sequence shown in Fig. 2, which begins with the collected data.

Fig. 2  The flow diagram delineating the methodology followed in this work

4.1 Dataset Preparation

Sponge gourd, snake gourd, and ridge gourd are all members of the same gourd family, Cucurbitaceae. People mistakenly identify these three vegetables because of their similar color, shape, and size and the confusion between their names. We have determined these three vegetables as the class values of our research. For data collection, we collected images from farmers' farms using a smartphone camera, gathering a total of 4000 real images of the three class values [19]. Figure 3 shows a sample of the vegetable images in the dataset.

4.2 Preprocessing

In the preprocessing stage, we used an iterative (recycling) process. Each of our collected images is passed through a Gaussian filter for noise reduction, as given in (1) below:

G(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (1)


Fig. 3  Sample of the image dataset used

Equation (1) represents the 2-dimensional Gaussian filter, whose standard deviation is represented by σ [20]. After denoising, the image is passed to the noise-level-checking stage. This is a decision-making stage based on sigma estimation: a high sigma value represents low noise. The system calculates how much noise remains; if the sigma value is low, the image is passed through the Gaussian filter again, and otherwise it proceeds to the data augmentation stage. In the augmentation stage, rotation range = 45, width shift range = 0.3, height shift range = 0.3, shear range = 0.3, zoom range = 0.1, horizontal flip = True, and fill mode = constant are used to increase the size of the dataset; a sketch of this pipeline is given below. After augmentation, the total number of training images was 6000. We labeled each image by sending it to one of 3 different directories based on its class, so the directory name serves as the label. Figure 4 shows the class-wise distribution of the dataset; we tried to keep the same number of images in each class to obtain a balanced dataset. Figure 5 presents a sample of the dataset.
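The chapter does not give the implementation itself, so the following is a minimal Python sketch of the preprocessing loop and augmentation described above. The augmentation parameters are the ones stated in the text; the 5 × 5 kernel size, the sigma threshold of 2.0, the maximum number of filtering passes, and the directory layout are illustrative assumptions.

import cv2
from skimage.restoration import estimate_sigma
from tensorflow.keras.preprocessing.image import ImageDataGenerator

SIGMA_THRESHOLD = 2.0  # assumed cutoff; per the text, a high sigma estimate means low noise


def denoise(image, max_passes=3):
    # Repeatedly apply the 2-D Gaussian filter of Eq. (1) until the estimated
    # sigma is high enough (i.e., the residual noise is judged low).
    for _ in range(max_passes):
        image = cv2.GaussianBlur(image, (5, 5), 0)  # kernel size assumed
        if estimate_sigma(image, average_sigmas=True, channel_axis=-1) >= SIGMA_THRESHOLD:
            break
    return image


# Augmentation with the parameters stated in the text.
augmenter = ImageDataGenerator(
    rotation_range=45,
    width_shift_range=0.3,
    height_shift_range=0.3,
    shear_range=0.3,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode="constant",
)

# Labels come from the directory names; "dataset/train" and its three
# class subdirectories are assumed names.
train_flow = augmenter.flow_from_directory(
    "dataset/train",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)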

Fig. 4  Class-wise distribution of the dataset used

Fig. 5  Sample of the dataset used

4.3 VGG-16 Deployment

VGG-16, where VGG stands for Visual Geometry Group, is a deep convolutional neural network architecture proposed by Simonyan et al. [21]. Figure 6 represents the generic VGG-16 architecture. The input of the first convolutional layer is a fixed-size 224 × 224 RGB image. The image is passed through a total of 13 convolutional and 5 max-pooling layers. All convolutional layers use a 3 × 3 filter size, a stride of 1, and the same padding. All max-pooling layers use a 2 × 2 window with a stride of 2, so each time an image passes through a max-pooling layer its size is halved. After the last max-pooling layer, the image is passed through 3 dense layers. The last dense layer uses the softmax activation function, and the other convolutional and dense layers use the rectified linear unit (ReLU) activation function. In our problem, we used an image size of 224 × 224, and for the last dense layer we used 3 nodes and softmax activation, as we worked with 3 classes. The fact that VGG-16 significantly outperformed the winning entries of the ILSVRC-2012 and ILSVRC-2013 competitions is another justification for using VGG-16 directly in our work.

2  Deep Learning Modeling for Gourd Species Recognition Using VGG-16

56 * 56*256

28 * 28*512

112 * 112*128 224 * 224*64

14 * 14*5121 * 1*4096

29

1 * 1*3

Conv + Relu

Max pooling

Dense + Relu

Dense + Softmax

Fig. 6  The architecture of the VGG-16 model used Table 2  Custom dense architecture Layer name Dense Dropout Dense Dropout Dense Dropout Dense Dropout Dense Dense

Numbers of nodes 512

Activation function ReLU

Rate 0.2

256

ReLU

192

ReLU

128

ReLU

0.2 0.1 0.1 64 3

ReLU Softmax

4.4 Dense Architecture

Table 2 presents our designed dense architecture; a sketch of the full model follows below. We used dropout layers to reduce overfitting. Each dense layer uses ReLU as its activation, except the last dense layer, which uses softmax with 3 nodes, as we worked with three different classes.
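As a concrete illustration, the following is a minimal Keras sketch of the model described in Sects. 4.3 and 4.4. The chapter does not state how the VGG-16 base was initialized or bridged to the custom head, so the ImageNet weights, the frozen base, the Flatten layer, and the Adam optimizer are assumptions; the dense head follows Table 2 exactly.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# VGG-16 convolutional base; ImageNet weights and freezing are assumed.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

# Custom dense head as listed in Table 2.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(192, activation="relu"),
    layers.Dropout(0.1),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.1),
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),  # three gourd classes
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])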

5 Experimental Evaluation

We have divided the dataset into two parts, a training set and a test set, with a ratio of 7:3, as per the holdout method [22]; a sketch of this holdout evaluation follows. In our system, the validation accuracy is 99.5% and the overall accuracy is 99.3%. The classification report of the VGG-16 CNN model is given in Table 3.
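A sketch of the holdout evaluation, assuming the preprocessed images and one-hot labels are already loaded as NumPy arrays X and y (hypothetical names), with the 20 epochs and batch size of 32 reported in Sect. 6:

from sklearn.model_selection import train_test_split

# Holdout split at the stated 7:3 ratio; the random seed is arbitrary.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y.argmax(axis=1)
)

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20, batch_size=32)
test_loss, test_acc = model.evaluate(X_test, y_test)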

Table 3  Validation score of the VGG-16 model

Label name     Precision   Recall   F1-Score   Support
Ridge gourd    0.98        0.99     0.98       589
Snake gourd    0.97        0.99     0.98       697
Sponge gourd   0.98        0.96     0.97       676
Accuracy                            98%
Macro AVG      0.98        0.98     0.98
Weighted AVG   0.98        0.98     0.98

In Table 3, we find that the precision for ridge gourd and sponge gourd is the highest, at 0.98; snake gourd and ridge gourd give the highest recall, with scores of 0.99; and ridge gourd and snake gourd give the best F1-scores, at about 0.98. To evaluate the trained model, we employed a variety of metrics, as follows:

5.1 Accuracy

Accuracy measures the model's efficacy as the percentage of correctly classified items out of the total number of predictions. It is frequently used as a performance measure [9]. A classification method suited to the dataset can produce better performance [23].

5.2 Precision

Precision is measured as the percentage of positive predictions that are actually positive. A higher level of precision suggests a well-designed model [11].

5.3 Recall

Recall is used to assess the completeness of the classification. It specifies how many of the actual positive tuples the model has predicted [11].

5.4 F1-Score

The F1-score is the harmonic mean of precision and recall. As a result, this score considers both false positives and false negatives [11].
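The per-class scores in Table 3 follow the layout of scikit-learn's classification report, which can be produced as in the following sketch; y_true and y_pred are placeholders for the test labels and the model's predicted labels (e.g., model.predict(X_test).argmax(axis=1)).

from sklearn.metrics import classification_report

labels = ["Ridge gourd", "Snake gourd", "Sponge gourd"]  # class order assumed
print(classification_report(y_true, y_pred, target_names=labels, digits=2))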


6 Analysis of Results

We have evaluated the CNN approach by employing the VGG-16 model, since it gives good accuracy. A total of 7500 images are used for this model, all of them real-life data; among them, 6000 are used for training and 1500 for testing. The model uses 20 epochs to ensure that it runs smoothly, and the batch size has been set to 32. Some bar graphs and evaluation graphs are shown below.

Fig. 7  Training and validation accuracy curve of VGG-16

Fig. 8  Training loss and validation loss curve of VGG-16

In Fig. 7, the green curve denotes the training accuracy and the red curve the validation accuracy of VGG-16. In most epochs, the two accuracies are relatively close; since the model is well trained, there is little overfitting. In short, the training and validation accuracies are similar, with the validation accuracy looking optimistic.

In Fig. 8, the loss scenarios are explained. According to the graph, as the red validation loss has decreased, the green training loss has begun to drop as well. This indicates that our model has learned well, requiring few adjustments and showing no under-fitting. By utilizing more epochs, we can obtain a lower loss and higher accuracy.

Fig. 9  Comparison of the test dataset

According to Fig. 9, we have determined the difference between the real and predicted values for sponge gourd, snake gourd, and ridge gourd. Actual data are shown in blue, whereas predicted data are shown in orange. For evaluation, 590 real values were taken for the ridge gourd, out of which 586 were accurately predicted. Along with that, 697 real snake gourd images were taken, of which 690 were predicted accurately. Finally, the sponge gourd has 676 real values, of which 664 came out accurate.

We use image data from three vegetables, namely ridge gourd, sponge gourd, and snake gourd. Using these data, we trained the machine with the VGG-16 model and performed detection through computer vision with OpenCV, using the VGG-16 model for testing; a sketch of this inference step is given below. The model gives us about 98% accuracy. Table 4 presents the model demonstration with OpenCV, comparing the real classes with the computer vision predictions. The result column shows whether our model's predicted output is correct or wrong; in most cases, our model detects very accurately.
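The OpenCV testing step can be sketched as follows; the class-name order, the 1/255 rescaling, and the sample image path are assumptions, and model is the trained network from the earlier sketch.

import cv2
import numpy as np

CLASS_NAMES = ["Ridge gourd", "Snake gourd", "Sponge gourd"]  # order assumed


def predict_frame(model, bgr_image):
    # Classify a single OpenCV (BGR) image with the trained VGG-16 model.
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
    rgb = cv2.resize(rgb, (224, 224)).astype("float32") / 255.0
    probs = model.predict(rgb[np.newaxis, ...], verbose=0)[0]
    return CLASS_NAMES[int(np.argmax(probs))]


image = cv2.imread("sample_gourd.jpg")  # hypothetical test image path
print(predict_frame(model, image))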

Table 4  Model demonstration by OpenCV

Actual class    OpenCV output   Predicted class   Result
Snake gourd     (image)         Snake gourd       ✓ Correct
Snake gourd     (image)         Sponge gourd      ✗ Wrong
Ridge gourd     (image)         Ridge gourd       ✓ Correct
Ridge gourd     (image)         Snake gourd       ✗ Wrong
Sponge gourd    (image)         Sponge gourd      ✓ Correct
Sponge gourd    (image)         Snake gourd       ✗ Wrong

7 System Architecture

With the advent of smartphones has come a revolutionary change in automation methods; a variety of automated software models are now being created using computer vision and machine learning [24]. Mobile and web applications are important at this stage, and we propose the system architecture of our model as a web application, as shown in Fig. 10. Different types of end devices may be used, such as smartphones, laptops, computers, and tablets. A device sends a request with the sample image to the web server over the network using a REST API; the system then identifies the vegetable from the image, and the next step is to send the response with the predicted name back through the REST API server, as shown in Fig. 10. Finally, users are able to see the name of the vegetable as a response through the network; a sketch of such an endpoint follows.
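A minimal sketch of the prediction endpoint, assuming a Flask web server, a saved model file named gourd_vgg16.h5, and the class order used earlier (all assumptions; the chapter does not specify the server stack):

import io

import numpy as np
from flask import Flask, jsonify, request
from PIL import Image
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("gourd_vgg16.h5")  # hypothetical saved-model path
CLASS_NAMES = ["Ridge gourd", "Snake gourd", "Sponge gourd"]  # order assumed


@app.route("/predict", methods=["POST"])
def predict():
    # The end device uploads the sample image as multipart form data.
    image = Image.open(io.BytesIO(request.files["image"].read())).convert("RGB")
    array = np.asarray(image.resize((224, 224)), dtype="float32") / 255.0
    probs = model.predict(array[np.newaxis, ...], verbose=0)[0]
    return jsonify({"vegetable": CLASS_NAMES[int(np.argmax(probs))]})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

A device would then POST an image to /predict and receive a JSON response such as {"vegetable": "Snake gourd"}.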

Fig. 10  The architecture of the intended system

8 Conclusion

In this chapter, we apply recent advancements in machine vision techniques in the agricultural industry to distinguish the three most confusing vegetables, and the identification rate was high by utilizing the VGG-16 algorithm. There are some studies on vegetable detection with amazing accuracy, but they work on sets such as apple, cucumber, lemon, and cabbage, whose members people can recognize easily by sight. Hence, there is a lack of variety in vegetable recognition for vegetables whose structure and shape are the same, and the more similar the structure, the greater the confusion. This chapter works with three exceptional vegetables whose structure and shape are the same and cause confusion among people, which is the most important finding of our work. The limitation of our work is that we could not visit more fields because of the rainy season, so we collected our data from a single farm. In the future, we will focus on enlarging our dataset, and we will build an Android app to showcase the system.

References

1. Dubey SR, Jalal AS (2015) Application of image processing in fruit and vegetable analysis: a review. J Intell Syst 24:405–424
2. El-Salam A (2012) Application of computer vision technique on sorting and grading of fruits and vegetables
3. Zeng G (2017) Fruit and vegetables classification system using image saliency and convolutional neural network. In: IEEE 3rd Information technology and mechatronics engineering conference (ITOEC), pp 613–617
4. Aravind KR, Raja P (2020) Disease classification in eggplant using pre-trained VGG16 and MSVM. Sci Rep 10:2322
5. Qassim H, Verma A, Feinzimer D (2018) Compressed residual-VGG16 CNN model for big data places image recognition. In: IEEE 8th Annual computing and communication workshop and conference (CCWC), pp 169–175
6. Monowar SM, Matiar Rahman AKM (2022) Gourd—Banglapedia. https://en.banglapedia.org/index.php/Gourd. Last accessed 2022/09/10
7. Netmeds (2022) 5 Excellent health benefits of adding ridge gourd to your daily diet. www.netmeds.com/health-library/post/5-excellent-health-benefits-of-adding-ridge-gourd-to-your-daily-diet. Last accessed 2022/09/12
8. Sakai Y, Oda T, Ikeda M, Barolli L (2016) A vegetable category recognition system using deep neural network. In: 10th International conference on innovative mobile and internet services in ubiquitous computing (IMIS), pp 189–192
9. Muhtaseb A, Sanaa S, Hashem T (2012) Fruit/vegetable recognition. In: Proceedings of the 2012 student innovation conference, Hebron, Palestine
10. Dubey SR, Jalal AS (2013) Species and variety detection of fruits and vegetables from images. Int J Appl Pattern Recognit 1:108–126
11. Xiaojun J, Jun C, Yong C (2021) Weed identification using deep learning and image processing in vegetable plantation. IEEE Access 9:10940–10950
12. Ercole C, Gallo MD, Mosiello L, Baccella S, Lepidi AA (2003) Escherichia coli detection in vegetable food by a potentiometric biosensor. Sens Actuators B-Chem 91:163–168
13. Kaur M, Reecha S (2015) ANN based technique for vegetable quality detection. IOSR J Electron Commun Eng (IOSR-JECE) 10(5):62–70
14. Kagaya H, Aizawa K, Ogawa M (2014) Food detection and recognition using convolutional neural network. In: Proceedings of the 22nd ACM international conference on multimedia
15. Pouladzadeh P, Abdulsalam Y, Shervin S (2015) FooDD: food detection dataset for calorie measurement using food images. In: International conference on image analysis and processing. Springer, Cham
16. Probst Y, Nguyen DT, Tran MK, Li W (2015) Dietary assessment on a mobile phone using image processing and pattern recognition techniques: algorithm design and system prototyping. Nutrients 7(8):6128–6138
17. Rimi IF, Habib MT, Supriya S (2022) Traditional machine learning and deep learning modeling for legume species recognition. SN Comput Sci 3:430
18. Nuruzzaman M, Hossain MS, Rahman MM, Shoumik AS, Khan MA, Habib MT (2021) Machine vision based potato species recognition. In: 5th International conference on intelligent computing and control systems (ICICCS), pp 1–8
19. Research Data, Google Drive, 1 Oct 2022 [Online]. Available: https://drive.google.com/drive/u/3/folders/1R2TPiPlnNeTTS8e6mDeT8yI6Cd0dKbM0. Accessed: 20 Nov 2022
20. Mafi M, Martin H, Cabrerizo M, Andrian J, Barreto AB, Adjouadi M (2019) A comprehensive survey on impulse and Gaussian denoising filters for digital images. Signal Process 157:236–260
21. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition
22. Tan PN, Steinbach M, Kumar V (2006) Introduction to data mining. Addison-Wesley
23. Habib MT, Majumder A, Jakaria AZM, Akter M, Uddin MS, Ahmed F (2020) Machine vision based papaya disease recognition. J King Saud Univ Comput Inf Sci 32(3):300–309. ISSN 1319-1578
24. Júnior JF, Carvalho E, Ferreira BV, Souza CD, Suhara Y, Pentland A, Pessin G (2017) Driver behavior profiling: an investigation with different smartphone sensors and machine learning. PLoS One 12(4):e0174959

Chapter 3

Sugarcane Diseases Identification and Detection via Machine Learning

Md Mostafizur Rahman Komol, Md Sabid Hasan, and Shahnewaz Ali

1 Introduction Due to its role as the primary source of sugar, biofuel, and other products like molasses and bagasse, sugarcane is a crucial crop (the fibrous residue left after juice extraction). More than 120 nations cultivate it. Brazil, India, China, Thailand, and Pakistan are the main producers of sugarcane. Together, these nations produce more than 70% of the world’s sugarcane. The global sugarcane market has been growing steadily in recent years, driven by increasing demand for sugar and biofuels. The total global sugar production was around 183 million metric tonnes in 2020, and it is projected to grow at a CAGR of around 2.72% during the forecast period (2022–2029) [1]. The countries which have higher sugar consumption are China, India, the EU, and the USA. Sugarcane is also an important source of biofuels, particularly in Brazil, where sugarcane ethanol is widely used as a transportation fuel. During the projection period, it is estimated that the worldwide market for biofuels will expand at a CAGR of around 8%. (2021–2026). Approximately USD 2,229,890 worth of raw cane sugar was imported into Indonesia in 2021, according to ITC Trade Map, Trade data for Md Mostafizur Rahman Komol (B) School of Electrical Engineering & Robotics, Queensland University of Technology (QUT), Brisbane, Australia e-mail: [email protected] Data61 Robotics and Autonomous System Group, The Commonwealth Scientific and Industrial Research Organization (CSIRO), Pullenvale, Australia Md Sabid Hasan Department of Electrical and Electronic Engineering, Bangladesh Army University of Engineering and Technology, Natore, Bangladesh e-mail: [email protected] S. Ali Brisbane, Australia © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_3

37

38

Md Mostafizur Rahman Komol et al.

international company growth. China came in second with an import value of USD 1,942,598. Thailand came in second with an export value of USD 630,065, and India was the second-largest exporter of raw cane sugar with a value of USD 1,340,793. However, a number of diseases often hamper this enormous market and sugarcane production, which may adversely affect crop yields and quality. These diseases may dramatically affect sugarcane crop yields, which lowers the market’s supply of the commodity. Diseases like Red rot can affect the quality of sugarcane juice, resulting in lower-quality sugar production and higher costs for growers. Growers may have to spend more money on pesticides and other measures to control the spread of diseases, which can increase their costs and make it harder for them to compete in the global market. These diseases are difficult to detect and diagnose, especially in large fields, which can lead to significant losses for growers. The use of robots and machine learning to identify and treat sugarcane illnesses has gained popularity in recent years. Machine learning allows computers to learn from data and make predictions [2]. Robotics technology, on the other hand, enables the development of machines that can carry out tasks autonomously [3–8]. It can also be used to navigate through a large area using 3D semantic maps and take necessary actions autonomously [6–8]. By combining these semantic maps and disease detection technologies, it is possible to develop an autonomous system that can detect and diagnose sugarcane diseases in an efficient and accurate manner without human effort. In this research, we have implemented the YOLO algorithm to detect two kinds of diseases of sugarcane, and they are Bacterial Blight and Red rot [9–11]. Both diseases have similarities in their appearing symptoms, such as leaf discolouration and wilting, making it difficult to distinguish between them. Bacterial Blight is caused by the bacteria Xanthomonas albilineans, and it is characterised by yellow or brown streaks on the leaves of the sugarcane plant. The bacteria can also infect the stalk of the plant, causing it to become weak and brittle. Bacterial Blight can reduce the yield of sugarcane crops and cause significant economic losses. Red rot, on the other hand, is caused by the fungus Ceratocystis Paradoxa and is characterised by red or pink discolouration of the sugarcane stalk. The fungus can also cause the sugarcane to become soft and pulpy, making it difficult to harvest. Red rot can also reduce the yield of sugarcane crops and cause significant economic losses. Both diseases can damage the sugarcane crop, reducing yield and quality. YOLO models can be used to automatically identify infected plants in a field. YOLO v7 and v8 have been used in our research. These are the latest versions of the algorithm and are considered to be among the best in terms of performance. Robotic systems equipped with YOLO v7 or v8 can be used to perform precision agriculture, which involves using technology to accurately target specific areas of a field that need attention. By using YOLO v7 and v8 for disease detection, farmers can target their efforts to the specific areas of the field that are affected, reducing the spread of the disease and minimising crop damage. By detecting early, farmers can take action to prevent the spread of the disease, which can help to minimise crop damage and ultimately reduce economic losses. Additionally, such robotic systems can help


Additionally, such robotic systems can help to increase efficiency and reduce labour costs, which can offset some of the economic losses caused by the disease. Deep learning enables novel techniques in agricultural robotics that solve many cultivation and growing problems; among them, computer vision plays an indispensable role in maximising harvests. However, variations in appearance are often observed in the agricultural field. These variations can lead to large intra-class dissimilarities caused by imaging conditions, weather parameters, crop location and proximity, underexposed or covered areas due to occlusions, camera pose, and so on. Hence, despite the ability to detect diseases, there is a perennial demand for larger annotated datasets, which require extensive human effort. To bridge this gap to some extent, in this research we annotated sugarcane images and constructed a dataset containing both bounding boxes and class labels. We then applied deep learning techniques to identify each disease and its location along with the label. We aim to publish our dataset.

2 Literature Review

In recent years, the application of machine learning to detecting and diagnosing sugarcane diseases has gained popularity. The implementation of machine learning algorithms leads to effective and precise disease detection, reducing the negative impact on crop production and quality. This literature review evaluates the latest advancements in, and the practicality of, using machine learning to develop an independent system for sugarcane disease detection.

An autonomous system requires capturing sharp, clean, and richly textured images in a real-world environment to carry out a task with reasonable success. Prior to applying vision tasks, particularly disease detection, identification, recognition, or mobile robot locomotion, this fundamental data acquisition step requires attention. In realising these pivotal steps, the support vector machine (SVM), a supervised machine learning and statistical approach, has been used by researchers to classify image data in order to adjust illumination [11–13]. It has been found that there is a lack of an automatic method that enhances images by controlling lighting environments and restoring noisy data, as proposed in other vision-based systems [12–15]. In field robotics for the sugarcane industry, such a method is of significant importance, since field acquisition can be degraded by the motion of mobile robots and by weather parameters; for instance, excessive light and motion can cause blurred images, and fog or haze can cause low-light and noisy observations. Our research found that there is room for further development of such systems. However, the proposed dataset currently contains clean images with rich visual information.

SVM is also used by researchers to classify diseases. Using segmented spot images, Evy Kamilah Ratnasari et al. addressed the detection of sugarcane leaf diseases in 2014 [16]. Their SVM classifier employs the Grey Level Co-occurrence Matrix (GLCM) for texture features and the L*a*b* colour space for colour features.


With 80% accuracy and an average error severity estimation of 5.73, the proposed model could identify the different types of spot diseases. Bai et al. introduced an improved fuzzy C-means (FCM) technique with an average segmentation error of 0.12% [17]. Lin et al. proposed a semantic segmentation model using convolutional neural networks (CNNs) for pixel-level segmentation of powdery mildew on cucumber leaves and achieved an average pixel accuracy of 96.08% and an intersection over union of 72.11% on 20 test samples [18]. Wang et al. conducted segmentation tests on diseased leaf images under different conditions and presented a cascaded convolutional neural network-based segmentation approach with a segmentation accuracy of 87.04%, a recall of 78.31%, and a comprehensive evaluation index value of 88.22% [19]. In 2019, Sammy V. Militante and his team used a dataset of 14,725 images of healthy and infected sugarcane leaves and applied CNNs for detection, achieving a maximum of 95% accuracy with three models [20]. The next year, Sakshi Srivastava et al. developed a novel deep learning framework, which achieved 90.2% accuracy with more disease classes [21]. Pathogen detection techniques have been used by Shamsul A. Bhuiyan et al. for detecting sugarcane diseases [22]. Kianat et al. [23] established a hybrid framework based on feature fusion and selection strategies for complicated environments. The framework has three main components: image contrast enhancement; feature extraction, fusion, and selection; and classification. The outcomes of the experiment demonstrated that feature selection can raise a system's recognition accuracy. In order to classify the severity of cucumber leaf disease against complicated backdrops, Wang et al. [24] suggested a two-stage model that combined DeepLabV3+ and U-Net (DUNet); the segmentation accuracy of the leaves was 93.27%. A deep learning-based technique for classifying sugarcane leaf diseases was suggested by Militante et al. [25] after gathering 13,842 photos of sugarcane; an accuracy of 95% was obtained when classifying cane leaves into diseased and uninfected groups. In order to detect apple leaf diseases, Yan et al. [26] suggested an improved model based on VGG16; the results revealed that the proposed model's categorisation of apple leaves was 99.01% accurate overall. The characteristics of chili pests and diseases extracted using the conventional method and those extracted using deep learning were compared by Loti et al. [27]. In contrast to the state of the art, Brahimi et al. [28] employed a huge dataset of 14,828 photos of tomato leaves affected by nine diseases and reached an accuracy of 99.18%. The hybrid use of a YOLOv4 deep learning model with image processing was suggested by Adem et al. [29] for the automated detection of leaf spot disease on sugar beets and severity classification; using 1040 photos for training and testing, the suggested hybrid technique was shown to have a classification accuracy of 96.47%. Swapnil Dadabhau Daphal et al. used CNNs in 2022, training on a dataset of 1470 images to classify healthy and infected leaves; this research also provides a comparison of the stochastic gradient descent, Adadelta, and Adam optimisers [30].


Shima Ramesh and her team propose a novel approach for detecting plant diseases using machine learning techniques [31]. The authors use image processing algorithms to extract features from leaf images and then classify the plants into healthy or diseased categories using support vector machines (SVMs) and convolutional neural networks (CNNs). Y. A. Nanehkaran presents a novel technique for identifying plant leaf diseases, consisting of two stages: image segmentation and image classification [32]. Initially, a hybrid segmentation technique based on the LAB and hue, saturation, and intensity colour spaces is devised and applied to separate disease symptoms from photos of diseased plants. The segmented pictures are then fed into a convolutional neural network for image categorisation. This strategy produced validation accuracy around 15.51% greater than that of the traditional method. The detection findings also revealed that, despite the challenging background circumstances, the average detection accuracy was 75.59%, and the majority of diseases were successfully identified. In 2021, Maryam Ouhami studied the many data-gathering modalities, including IoT, unmanned aerial vehicle imaging, ground imaging, and satellite imaging, as well as the conventional and deep learning techniques related to each [33]. The research also emphasises the significance of data fusion approaches and their capacity to enhance the forecasting of plant health status. Overall, the study offers insightful information on the issues and developments in this field's research. In 2022, T. Tamilvizhi studied Quantum-Behaved Particle Swarm Optimisation for investigating sugarcane leaf diseases [34]; the SqueezeNet model is used as the feature extractor, and a deep stacked autoencoder (DSAE) is used as the classification model. Machine learning techniques were also used by Amarasingam Narmilan et al. in 2022, who applied UAV multispectral images to detect only the white leaf disease of sugarcane [35]; the study achieved 94% accuracy. Due to dead leaves, the borders of all crops were shown in red in the segmentation findings; as a result, during training, the accuracy, recall, and F1-score for early and severe symptoms were all reduced, which is a drawback of this research. Recent progress in using multispectral images reconstructed from RGB colour opens the door to investigating disease detection using low-cost devices [36–38]. Using ResNet-50, Nattapak Lawanwong and his team created a classification algorithm for common sugarcane diseases in Thailand [39]. Sugarcane disease detection still requires more effort to cover all possible detection conditions. Moreover, detecting sugarcane diseases with real-time performance and high accuracy at the same time has not been done yet. The contributions of our research are as follows:
• Dataset generation: identified disease types and annotated high-quality images obtained from publicly available datasets. Dataset link: https://github.com/sabidarrow/Sugarcane-Disease-Dataset.
• Training object detection models: YOLOv7 and YOLOv8.
• Model performance comparison: evaluating the best detection model for accurate detection of sugarcane disease in real time.


3 Data Summary and Preprocessing

A sugarcane disease dataset was taken from Kaggle [40]. The dataset contains 300 high-quality images of sugarcane leaves. However, the dataset was not annotated by sugarcane disease class, so it could not be used directly for disease detection. We preprocess the dataset through annotation, augmentation, resizing, and splitting to prepare the data for training the YOLOv7 and YOLOv8 object detection models. A flowchart of the data preprocessing steps is shown in Fig. 1. The unlabelled data of 300 images have been annotated into three classes: Bacterial Blight (B.Blight), Red rot, and healthy leaves; the annotation follows the pathogen-specific symptom criteria described in Sect. 1. Some images of the annotated dataset are shown in Fig. 2. The images have been adjusted to an input size of 416 × 416. The dataset was then tripled using data augmentation, giving a total of 900 images. Data augmentation includes rotation as well as saturation and brightness control; the saturation and brightness levels were modified within a range of −44% to +44%. These three image adjustments were applied to each training example, resulting in three different outputs; applying them improves the model's ability to handle variations in camera settings and lighting conditions (a sketch of this pipeline follows Fig. 1). We applied the image adjustments while maintaining a consistent image count per class. The data was then split into three parts (training, testing, and validation) using 75% for training, 20% for testing, and 5% for validation.

Fig. 1 Data preprocessing framework (flowchart: raw sugarcane dataset → labelling the sugarcane data according to the disease → data resize (416 × 416) and augmentations (rotation, saturation, brightness) → splitting the sugarcane data into train, validation, and test datasets)
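To make the augmentation step concrete, the following is a minimal sketch of such a pipeline. The Albumentations library, the file names, and the example bounding box are our assumptions for illustration; the chapter does not state which tool was used.

```python
# Hypothetical augmentation sketch (Albumentations assumed, not the authors' exact tool).
import albumentations as A
import cv2

augment = A.Compose(
    [
        A.Resize(416, 416),                              # fixed input size used in this study
        A.Rotate(limit=180, p=1.0),                      # random rotation
        A.ColorJitter(brightness=0.44, contrast=0.0,
                      saturation=0.44, hue=0.0, p=1.0),  # +/-44% brightness and saturation
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

image = cv2.cvtColor(cv2.imread("leaf.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical file
bboxes = [(0.5, 0.5, 0.3, 0.4)]                  # normalised (cx, cy, w, h) example box
labels = ["red_rot"]
for i in range(3):                               # three augmented copies per image
    out = augment(image=image, bboxes=bboxes, class_labels=labels)
    cv2.imwrite(f"leaf_aug{i}.jpg", cv2.cvtColor(out["image"], cv2.COLOR_RGB2BGR))
```

Because the bounding boxes are transformed together with the pixels, the disease annotations stay valid after rotation, which is what allows the augmented copies to be used directly for detector training.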


Fig. 2 Labelled sugarcane data

4 Methodology

For our detection process, versions 7 and 8 of the You Only Look Once (YOLO) algorithm have been used [41, 42]. The YOLO algorithm works by dividing an image into a grid of cells and using a convolutional neural network (CNN) to predict the bounding boxes and class probabilities for objects within each cell. The CNN is trained on a dataset of labelled images, where each image is annotated with the location and class of the objects within it. The methodology flowchart of our study is shown in Fig. 3.

Fig. 3 Training methodology (flowchart: pre-processed data → training YOLOv7 and YOLOv8 → test models and evaluate performance → identify the best-performing model)


The YOLO algorithm starts by resizing the input image to a fixed resolution, such as 416 × 416 pixels, and dividing it into a grid of 13 × 13 cells. Each cell predicts N bounding boxes, with N being a user-defined number. Each bounding box is represented by five parameters: the centre coordinates, width, height, and an objectness confidence score; the class probabilities (i.e. the label of the object) are predicted separately. The CNN is trained to predict the bounding box parameters and class probabilities for each cell. The network comprises several layers, including convolutional, max-pooling, and fully connected layers: convolutional layers extract features from the image, while fully connected layers predict the bounding box parameters and class probabilities.

The algorithm then uses non-max suppression (NMS) to remove overlapping boxes and produce the final detection results. The YOLO model outputs multiple bounding boxes for each object in an image, and these boxes often overlap or differ slightly in their predictions [43]. NMS compares the predicted boxes and retains the one with the highest confidence score; it then removes any boxes that have a high overlap with the kept box (usually defined as an Intersection over Union (IoU) higher than a certain threshold). This reduces false positives and improves model accuracy. NMS is a crucial part of the YOLO pipeline, as it eliminates multiple detections and false positives, leading to a more accurate and efficient object detection system while avoiding duplicate detections of the same object.
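As an illustration of the NMS step just described, here is a minimal, framework-free sketch in plain NumPy. The box format and threshold are illustrative; this is not YOLO's internal implementation.

```python
# Greedy NMS sketch: keep the highest-scoring box, drop boxes that overlap it, repeat.
import numpy as np

def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Return indices of the boxes kept after suppression."""
    order = np.argsort(scores)[::-1]          # indices sorted by confidence, high to low
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(best)                     # retain the most confident remaining box
        order = np.array([i for i in order[1:]
                          if iou(boxes[best], boxes[i]) < iou_thresh])
    return keep
```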
YOLOv8 uses a deeper ResNet-style architecture as the backbone network, while YOLOv7 uses a shallower CSPDarknet-53 architecture; residual connections improve the flow of information through the network. A feature pyramid network (FPN) termed the Path Aggregation Network (PANet) is used as the model neck. It aggregates features from multiple layers of the backbone network and generates a feature pyramid to capture objects at different scales. The PANet also has an upsampling layer that increases the resolution of the feature maps to match the input image size, which allows the model to predict the location of small objects more accurately. In YOLOv8, the neck network uses the SPADE module, which adaptively normalises the features extracted by the backbone network. The head network takes the features from the neck network and predicts the object bounding boxes and class probabilities. The head network in YOLOv8 is composed of several convolutional layers and uses anchor boxes of different aspect ratios to improve the detection of objects of different shapes. YOLOv8 uses a loss function called Complete Intersection over Union (CIoU) loss, which improves the accuracy of the model by penalising errors in the predicted bounding boxes. Besides this, it uses a data augmentation technique called Mosaic data augmentation, which increases the diversity of the training data by randomly pasting four sub-images together to form a new image. YOLOv8 incorporates several optimisations to improve the speed and accuracy of the model; for example, it uses a dynamic anchor assignment algorithm that assigns anchor boxes to objects based on their size and aspect ratio, which helps to reduce false positives and improve detection accuracy. YOLOv8 also uses a Mish activation function, which has been shown to outperform activation functions like ReLU and LeakyReLU (Fig. 4).

For training, we specified parameters such as the batch size, number of epochs, and image size. All images are resized to 416 × 416 pixels, and the batch size was set to 16; the number of epochs was varied across models and versions. The model summary shows 225 layers, 11,136,761 parameters, 11,136,745 gradients, and 28.7 GFLOPs.


Fig. 4 Detected sugarcane diseases

GFLOPs (giga floating-point operations) measures the number of floating-point operations required for one forward pass of the model, and hence the computational power needed to run it. A higher GFLOPs count may indicate a more complex model, but also a model that requires more computational resources. After training, the models were saved and used to test on the held-out dataset and to detect sugarcane diseases in real time, distinguishing diseased leaves from healthy sugarcane leaves.
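For reference, the whole train/evaluate/detect loop for YOLOv8 can be driven through the Ultralytics Python package. The sketch below is a plausible reconstruction under the parameters stated above; the dataset configuration file, image name, and epoch count are hypothetical, not the authors' exact script.

```python
# Hedged sketch of the training setup described above (Ultralytics package).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # a pre-trained YOLOv8 checkpoint
model.train(
    data="sugarcane.yaml",            # hypothetical config: B.Blight, Red rot, Healthy
    imgsz=416,                        # image size used in this study
    batch=16,                         # batch size used in this study
    epochs=100,                       # epochs were varied per model; 100 is illustrative
)
metrics = model.val()                 # evaluate on the validation split
results = model.predict("field.jpg")  # detect diseases in a new field image
```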

5 Result

After training, we tested the models on the remaining 25% of the data. The test results of the models are shown in Table 1, and a comparison of the results is shown in Fig. 5.

Table 1 Model performance table

Model     Test accuracy    Test accuracy by class
                           B.Blight    Healthy    Red rot
YOLOv8    96.67            96.67       100        96.67
YOLOv7    77.56            77.56       90.73      86.83


Fig. 5 Comparison of classification accuracy (bar chart of test accuracy (%) for the B.Blight, Healthy, and Red rot classes under YOLOv8 and YOLOv7)
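The comparison in Fig. 5 can be reproduced directly from the Table 1 values; the short matplotlib sketch below is ours, not the authors' plotting code.

```python
# Sketch reproducing Fig. 5 from the Table 1 values (matplotlib assumed).
import matplotlib.pyplot as plt
import numpy as np

classes = ["B.Blight", "Healthy", "Red rot"]
yolov8 = [96.67, 100, 96.67]
yolov7 = [77.56, 90.73, 86.83]
x = np.arange(len(classes))
plt.bar(x - 0.2, yolov8, width=0.4, label="YOLOv8")
plt.bar(x + 0.2, yolov7, width=0.4, label="YOLOv7")
plt.xticks(x, classes)
plt.ylabel("Accuracy (%)")
plt.legend()
plt.title("Comparison of classification accuracy")
plt.show()
```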

Model performance in Table 1 shows that YOLOv8 achieved the maximum accuracy, 96.67%. For YOLOv8, the accuracies for Bacterial Blight, Healthy, and Red rot were 96.67%, 100%, and 96.67%, respectively. YOLOv7 achieved an average accuracy of 77.56%; with YOLOv7, the Healthy class was detected more accurately than the other two classes, at 90.73%, followed by Red rot at 86.83%, while Bacterial Blight accuracy was 77.56%. To evaluate the models further, we also analysed their precision, recall, and F1-score for detecting sugarcane diseases. Precision, recall, and F1-score are important performance measurements for evaluating a model on minority classes; although we considered a nearly balanced dataset, this analysis was still performed. Table 2 shows the performance metrics, including F1-score, precision, and recall. A classification model's effectiveness is measured by its precision, recall, and F1-score [44]. Precision measures how many of the predicted positive cases are actually positive [45]. With a precision of 100% for the "Bacterial blight" class, all occurrences of "Bacterial blight" predicted by the model in the test set were correct, without any false positives. Recall measures the proportion of real positive cases that the model correctly detected [46]. A recall of 90% for the "Bacterial blight" class indicates that 90% of the cases of "Bacterial blight" in the test set were properly recognised by the model, while 10% were missed. The F1-score is the harmonic mean of precision and recall and balances the two [47]. The "Bacterial blight" F1-score of 95% indicates that the model balances precision and recall well for this class. Figure 6 compares the precision, recall, and F1-scores of YOLOv8 and YOLOv7. For the "Healthy" class, the model correctly identified all instances of "Healthy" in the test set without any false positives or false negatives. For the "Red rot" class, the model correctly identified all instances in the test set with


Table 2 Performance metrics

Model     Class      Precision    Recall    F1-score
YOLOv8    B.Blight   100          90        95
          Healthy    100          100       100
          Red rot    91           100       95
YOLOv7    B.Blight   65           78        71
          Healthy    90           80        85
          Red rot    83           75        79

Fig. 6 Comparison of performance metrics (grouped bar chart of precision, recall, and F1-score for the B.Blight, Healthy, and Red rot classes under YOLOv8 and YOLOv7)

no false negatives, but it had some false positives: a precision of 91% and a recall of 100%. The F1-score of 95% for the "Red rot" class means that the model strikes a balance between precision and recall for this class. For YOLOv7, the average accuracy is 77.56%. For Bacterial Blight, the precision is 65, the recall 78, and the F1-score 71. For the Healthy class, all scores improved: precision, recall, and F1-score were 90, 80, and 85, respectively. Red rot achieved 83, 75, and 79 for precision, recall, and F1-score, respectively. Figure 7 displays the confusion matrices of the two models for sugarcane disease prediction. A confusion matrix is a table used to describe how well a classification model performs; it summarises the test results and is often used to compare models. In the binary case it is a 2 × 2 table holding four elements: true positives, true negatives, false positives, and false negatives. True positives (TP) count the cases the model correctly predicted as the positive class. True negatives (TN) count the cases the model correctly predicted as the negative class. False positives (FP) count the cases the model wrongly predicted as positive. False negatives (FN) count the cases the model wrongly predicted as negative. A classification model's performance is assessed from the confusion matrix by computing a number of metrics, including accuracy, precision, recall, and F1-score. These metrics provide information on the model's effectiveness and suggest ways to enhance it.

Fig. 7 Confusion matrices of the two models (values in %; rows: predicted class, columns: actual class; "Blanked" denotes background)

YOLOv8      B.Blight   Healthy   Red rot   Blanked
B.Blight    90         0         0         17
Healthy     0          100       0         50
Red rot     10         0         100       33
Blanked     0          0         0         0

YOLOv7      B.Blight   Healthy   Red rot   Blanked
B.Blight    56         13        17        55
Healthy     0          53        0         27
Red rot     6          0         50        18
Blanked     38         34        33        0

With YOLOv8, 90% of the Bacterial Blight leaves were detected successfully. A further 17% were detected from the Blanked column, meaning the model additionally found true Bacterial Blight leaves in the background. The Healthy and Red rot classes have the best detection results: both reach 100% with YOLOv8. Additionally, the model accurately detected 50% of healthy leaves and 33% of Red rot leaves from the background. YOLOv7 did not perform as well as YOLOv8. Its Bacterial Blight class has the best detection rate, at 56%; 6% of Bacterial Blight leaves were detected as Red rot, and 38% remained blank. An additional 55% of Bacterial Blight leaves were detected from the background. Only 53% of healthy leaves were detected accurately; 13% of healthy leaves were predicted as Bacterial Blight, and 34% remained blank without any detected class. The picture is similar for Red rot: 50% of Red rot leaves were detected exactly, 17% were marked as Bacterial Blight, and 33% remained blank.

6 Conclusion

This research aimed to develop a robust and efficient method for the detection of sugarcane leaf diseases using YOLO v7 and v8. The proposed approach was evaluated using a dataset of sugarcane leaf images, and the results showed that the YOLO v8 algorithm achieved an accuracy of above 96% in detecting Bacterial Blight and Red rot, two of the most common and destructive diseases in sugarcane crops.


The high accuracy of the proposed method demonstrates its potential for practical application in the field of agricultural robotics. The ability of YOLO v8 to detect and localise infected areas in a field can help farmers and agricultural researchers quickly and accurately locate the parts of the field affected by disease, which can ultimately reduce the economic losses it causes. Furthermore, using robotic systems with YOLO v7 and v8 may assist in boosting productivity and saving labour expenses, making precision agriculture more affordable for farmers. Overall, this study underscores the need to create effective and precise disease detection techniques for crop protection and shows the potential of YOLO v7 and v8 in the area of agricultural robotics.

Author Contribution Md Mostafizur Rahman Komol contributed to conceptualisation, planning, writing, and methodology. Md Sabid Hasan contributed to dataset annotation and training of the models; he also contributed to preparing the manuscript and presenting the results. Dr Shahnewaz Ali supervised this research.

References 1. Industrial sugar market size, share & covid-19 impact analysis, by source (cane sugar and beet sugar), type (white sugar, brown sugar, and liquid sugar), end use (beverages, confectionary, bakery products, dairy products and other food applications). Fortune Business Insights 2022 2. Komol MMR, Hasan MM, Elhenawy M, Yasmin S, Masoud M et al (2021) Crash severity analysis of vulnerable road users using machine learning. PLoS ONE 16(8):e0255828. https:/ /doi.org/10.1371/journal.pone.0255828 3. Komol MMR, Podder A, Ali N, Ansary S (2018) RFID and finger print based dual security system: a robust secured control to access through door lock operation. Am J Embed Syst Appl 6:15–22. https://doi.org/10.11648/j.ajesa.20180601.13 4. Komol MMR, Podder A (2017) Design and construction of product separating conveyor based on color. In: 2017 3rd International conference on electrical information and communication technology (EICT), Khulna, Bangladesh, pp 1–5. https://doi.org/10.1109/EICT.2017.8275163 5. Khan M, Komol MMR, Podder A, Mishu SA (2019) A developed length based product separating conveyor for industrial automation. In: International conference on electrical, communication, electronics, instrumentation and computing (ICECEIC) 6. Ali S, Dayoub F, Pandey AK (2023) Learning from learned network: an introspective model for arthroscopic scene segmentation. In: Proceedings of international conference on information and communication technology for development: ICICTD 2022. Springer Nature, Singapore, pp 393–406 7. Ali S, Pandey AK (2022) ArthroNet: monocular depth estimation technique toward 3D segmented maps for knee arthroscopic. Intell Med 8. Jonmohamadi Y, Ali S, Liu F, Roberts J, Crawford R, Carneiro G, Pandey AK (2021) 3D Semantic mapping from arthroscopy using out-of-distribution pose and depth and indistribution segmentation training. In: International conference on medical image computing and computer-assisted intervention (MICCAI). Springer, Cham, pp 383–393 9. Rott P, Davis MJ, Baudin P (1994) Serological variability in Xanthomonas albilineans, causal agent of leaf scald disease of sugarcane. Plant Pathol 43(2):344–349 10. Viswanathan R, Samiyappan R (2000) Red rot disease in sugarcane: challenges and prospects. Madras Agric J 87(10/12):549–559


11. Hossain MI, Ahmad K, Siddiqui Y, Saad N, Rahman Z, Haruna AO, Bejo SK (2020) Current and prospective strategies on detecting and managing Colletotrichum falcatum causing red rot of sugarcane. Agronomy 10(9):1253 12. Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Networks 10(5):988–999 13. Ali S, Jonmohamadi Y, Takeda Y, Roberts J, Crawford R, Pandey AK (2020) Supervised scene illumination control in stereo arthroscopes for robot assisted minimally invasive surgery. IEEE Sen J 21(10):11577–11587 14. Sarker S, Chowdhury S, Laha S, Dey D (2012) Use of non-local means filter to denoise image corrupted by salt and pepper noise. Signal Image Process Int J (SIPIJ) 3(2):223–235 15. Ali S, Jonmohamadi Y, Fontanarosa D, Crawford R, Pandey AK (2023) One step surgical scene restoration for robot assisted minimally invasive surgery. Sci Rep 13(1):3127 16. Ratnasari EK, Mentari M, Dewi RK, Hari Ginardi RV (2014) Sugarcane leaf disease detection and severity estimation based on segmented spots image. In: Proceedings of international conference on information, communication technology and system (ICTS) 2014, Surabaya, Indonesia, pp 93–98. https://doi.org/10.1109/ICTS.2014.7010564 17. Bai XB, Li XX, Fu ZT, Lv XJ, Zhang LX (2017) A fuzzy clustering segmentation method based on neighborhood grayscale information for defining cucumber leaf spot disease images. Comput Electron Agric 136:157–165 18. Lin K, Gong L, Huang YX, Liu CL, Pan J (2019) Deep learning-based segmentation and quantification of cucumber powdery mildew using convolutional neural network. Front Plant Sci 10:155 19. Wang Z, Zhang S, Zhao B (2020) Crop diseases leaf segmentation method based on cascade convolutional neural network. Comput Eng Appl 56:242–250 20. Militante SV, Gerardo BD (2019) Detecting sugarcane diseases through adaptive deep learning models of convolutional neural network. In: 2019 IEEE 6th International conference on engineering technologies and applied sciences (ICETAS), Kuala Lumpur, Malaysia, pp 1–5. https:/ /doi.org/10.1109/ICETAS48360.2019.9117332 21. Srivastava S, Kumar P, Mohd N et al (2020) A novel deep learning framework approach for sugarcane disease detection. SN Comput Sci 1:87. https://doi.org/10.1007/s42979-020-0094-9 22. Strachan S, Bhuiyan SA, Thompson N, Nguyen N-T, Ford R, Shiddiky MJA (2022) Latent potential of current plant diagnostics for detection of sugarcane diseases. Curr Res Biotechnol 4:475–492. ISSN 2590-2628. https://doi.org/10.1016/j.crbiot.2022.10.002 23. Kianat J, Khan MA, Sharif M, Akram T, Rehman A, Saba T (2021) A joint framework of feature reduction and robust feature selection for cucumber leaf diseases recognition. Optik 240:166566 24. Wang CS, Du PF, Wu HR, Li JX, Zhao CJ, Zhu HJ (2021) A cucumber leaf disease severity classification method based on the fusion of DeepLabV3+and U-Net. Comput Electron Agric 189:106373 25. Militante SV, Gerardo BD, Medina RP (2022) Sugarcane disease recognition using deep learning. In: Proceedings of the IEEE Eurasia conference on IOT, communication and engineering (IEEE ECICE), National Formosa University, Yunlin, Taiwan, 3–6 Oct 2022, pp 575–578 26. Yan Q, Yang BH, Wang WY, Wang B, Chen P, Zhang J (2020) Apple leaf diseases recognition based on an improved convolutional neural network. Sensors 20:3535 27. Loti NNA, Noor MRM, Chang SW (2021) Integrated analysis of machine learning and deep learning in chili pest and disease identification. J Sci Food Agric 101:3582–3594 28. 
Brahimi M, Boukhalfa K, Moussaoui A (2017) Deep learning for tomato diseases: classification and symptoms visualisation. Appl Artif Intell 31:299–315 29. Adem K, Ozguven MM, Altas Z (2020) A sugar beet leaf disease classification method based on image processing and deep learning. Multimed Tools Appl 18:1–18 30. Daphal SD, Koli SM (2022) Efficient use of convolutional neural networks for classification of sugarcane leaf diseases. In: ICCCE 2021. Lecture notes in electrical engineering, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-16-7985-8_70


31. Ramesh S et al (2018) Plant disease detection using machine learning. In: 2018 International conference on design innovations for 3Cs compute communicate control (ICDI3C), Bangalore, India, pp 41–45. https://doi.org/10.1109/ICDI3C.2018.00017 32. Nanehkaran YA, Zhang D, Chen J et al (2020) Recognition of plant leaf diseases based on computer vision. J Ambient Intell Human Comput. https://doi.org/10.1007/s12652-020-025 05-x 33. Ouhami M, Hafiane A, Es-Saady Y, El Hajji M, Canals R (2021) Computer vision, IoT and data fusion for crop disease detection using machine learning: a survey and ongoing research. Remote Sens 13(13):2486. https://doi.org/10.3390/rs13132486 34. Tamilvizhi T, Surendran R, Anbazhagan K, Rajkumar K (2022) Quantum behaved particle swarm optimization-based deep transfer learning model for sugarcane leaf disease detection and classification. Math Probl Eng 2022:12, Article ID 3452413. https://doi.org/10.1155/2022/ 3452413 35. Narmilan A, Gonzalez F, Salgadoe ASA, Powell K (2022) Detection of white leaf disease in sugarcane using machine learning techniques over UAV multispectral images. Drones 6:230. https://doi.org/10.3390/drones6090230 36. Ali S, Jonmohamadi Y, Takeda Y, Roberts J, Crawford R, Brown C, Pandey AK (2023) Surface reflectance: a metric for untextured surgical scene segmentation. In: Proceedings of international conference on information and communication technology for development: ICICTD 2022. Springer Nature, Singapore, pp 209–222 37. Ali S, Pandey AK (2022) Towards robotic knee arthroscopy: spatial and spectral learning model for surgical scene segmentation. In: Proceedings of international joint conference on advances in computational intelligence: IJCACI 2021. Springer Nature, Singapore, pp 269–281 38. Ali S, Crawford R, Pandey AK (2023) Arthroscopic scene segmentation using multi-spectral reconstructed frames and deep learning. Intell Med 39. Lawanwong N, Pumrin (2022) Development of an algorithm for classifying common sugarcane diseases in Thailand. In: The 14th Regional conference on electrical and electronics engineering, p 152 40. https://github.com/sabidarrow/Sugarcane-Dataset 41. Wang CY, Bochkovskiy A, Liao HYM (2022) YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv preprint arXiv:2207.02696 42. Yolo v8. https://github.com/ultralytics/ultralytics 43. Redmon J, Divvala S, Girshick R, Farhadi A (2015) You only look once: unified, real-time object detection. ArXiv. https://doi.org/10.48550/arXiv.1506.02640 44. Goutte C, Gaussier E (2005) A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In: Losada DE, Fernández-Luna JM (eds) Advances in information retrieval. ECIR 2005. Lecture notes in computer science, vol 3408. Springer, Berlin. https:// doi.org/10.1007/978-3-540-31865-1_25 45. Komol MMR et al (2021) Deep transfer learning based intersection trajectory movement classification for big connected vehicle data. IEEE Access 9:141830–141842. https://doi.org/10. 1109/ACCESS.2021.3119600 46. Hasan MdM, Hasan N, Alsubaie M, Komol MdM (2021) Diagnosis of tobacco addiction using medical signal: an EEG-based time-frequency domain analysis using machine learning. Adv Sci Technol Eng Syst J 6:842–849. https://doi.org/10.25046/aj060193 47. Fahim F, Al Farabi A, Hasan MS, Hasan MM (2022) Diagnosis of diabetes using clinical features: an analysis based on machine learning techniques. 
In: 2022 3rd International informatics and software engineering conference (IISEC), Ankara, Turkey, pp 1–5. https://doi.org/ 10.1109/IISEC56263.2022.9998257

Chapter 4

Freshness Identification of Fruits Through the Development of a Dataset

Nusrat Sultana, Musfika Jahan, and Mohammad Shorif Uddin

1 Introduction

People always want to buy fresh and high-quality fruits. However, fruits are sometimes damaged and may rot over time. Fruit quality is an important economic factor for both buyers and sellers. It is anticipated that one-third of fruits will turn rotten, which results in a considerable financial loss. Additionally, because consumers believe that spoiled fruits are harmful to their health, fruit sales are affected. Reduced concentrations of sugar, vitamins, amino acids, and other nutrients unavoidably raise public concerns about edibility, prompting debate on how to detect fresh and rotten fruits. Grading fruit freshness is important because of its consequences for people's lives and its contribution to the economy. However, it is a manual activity that is time-consuming. Automated fruit quality grading using computerized methods might be a good solution to this problem, and a computer vision-based technique is anticipated to be the most effective one. Fruits' texture, color, and shape are three essential visual traits for assessing fruit quality [1–6]. In paper [6], SVM and KNN are used for apple grading, separating apples into two categories: healthy and defective. Multiscale grading is absent there, and the accuracy is not very high. Another study [2] on tomato quality grading looked at the texture, color, and shape of the fruit as essential characteristics and built a computer vision method using these data. It is viewed as a binary categorization issue, with fruits being classified as either defective or healthy.

N. Sultana (B) · M. Jahan · M. S. Uddin
Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh
e-mail: [email protected]
M. S. Uddin e-mail: [email protected]



Visual object identification has become popular and effective through deep learning. For fruit and vegetable recognition, the study [3] used a convolutional neural network-based You Only Look Once (YOLO) model. It is quick and effective compared to other techniques, with a high frame rate. Another study [4] utilizes a deep neural network for fruit identification with an increasing number of convolutional layers. Compared to the previous method in [2], a shallow neural network made of four convolutional layers followed by two fully connected layers is used for feature extraction. The source images in this experiment, on the other hand, are simple: there is no background noise, and static fruit objects are scattered on a white background. However, these experiments are limited to one type of fruit and assume no background noise. Another issue is that the grading technique is based on a classification model that categorizes a fruit as either healthy or defective, although the fruit's decay process is gradual. Therefore, it is essential to develop an updated classification scheme that classifies more fruit varieties and overcomes these constraints by improving accuracy and reducing computing time. Prior to this, a major dataset containing various fruits is essential for the validation of developed algorithms. This research tries to fill this gap by introducing such an extensive dataset. The fundamental contributions of this research are given below:
• Offering a significant dataset for researchers to develop efficient methods for accurate detection of various fresh and rotten fruits that can contribute to agricultural automation in quality grading and packaging of fruits.
• Validation of the contributed dataset by investigating different deep learning models to find the optimum one.
We have organized the other parts of the paper in the following order: Sect. 2 describes a literature review of several existing fruit classification techniques along with their limitations, Sect. 3 elaborates on the research technique and procedure, Sect. 4 presents the experimental results with the dataset, performance analysis, and comparative results, and Sect. 5 provides a conclusion and suggestions for further research.

2 Literature Review

The literature is explored to see what approaches are available for fruit classification. A lot of research has been done on detecting fresh and rotten fruits. Moallem et al. [6] emphasized golden delicious apples and analyzed them using SVM and KNN. However, this study only considers binary classification (healthy versus defective) and a single fruit category. Bhargava and Bansal [7] offered a full review of numerous approaches for assessing the quality of fruits and vegetables based on color, texture, size, and shape. The review also covers the processing flow, including segmentation, feature extraction, and classification tasks. However, it lacks a progressive description of


degrading fruits, one that specifies the degree of spoiling based on the biological aging stage. In another study on tomato grading quality, Arakeri [2] looked at fruit texture, color, and shape as essential characteristics and built a computer vision method using these statistics. The issue is viewed as a binary categorization problem, with fruits being classified as either defective or healthy. A deep convolutional neural network-based VGG was employed in another study [5] for fruit recognition. The input images used in this experiment are simple: there is no background noise, and the fruits are static and placed on a white background. Using discrete wavelet transform and textural data, Nashat and Hassan [8] developed an autonomous freshness rating system for olive fruit batches. The fungus Penicillium digitatum drives the decay process in mandarins; Gómez-Sanchis et al. [9] described the early diagnosis of this disease by evaluating visual decay aspects, which are captured and processed by a collection of decision trees. The experiment, however, is limited to one type of fruit and assumes no background noise. Another point is that the grading technique is treated as a classification model that labels a fruit as either healthy or defective; since the damage happens gradually, the final prediction layer should estimate a degree of spoilage rather than conduct a binary classification. Ananthanarayana et al. [1] described a method for classifying and detecting fruit freshness using CMOS image sensors. For the object detection job, they used a Single Shot Detector (SSD), and for the classification task, they used MobileNetV2, a pre-trained convolutional neural network (CNN) model. According to experimental results, the proposed method obtained 97% and 62% accuracies for fruit classification and fruit detection, respectively. The accuracy of fruit detection is relatively low in this case, and although the procedure worked well, it only covers three different types of fruits. Singh and Singh [10] developed a method to recognize fresh and rotten apples. To improve the quality of the images in the dataset, they applied a histogram equalization technique named contrast-limited adaptive histogram equalization (CLAHE). For texture feature extraction, they used the gray-level co-occurrence matrix (GLCM), Laws' texture energy, histogram of oriented gradients (HOG), wavelet transformation, and Tamura feature descriptors. Fruit freshness is classified using SVM, logistic regression, LDA, and KNN classifiers. They applied the method to the apple class only. From the research works described above, we can deduce that none has worked on more than three fruit classes. However, industrial applications require more fruit classes. Therefore, in this research, our main focus is on working with more fruit classes with a view to developing a practicable system for quality grading and packaging of fruits. Besides, our dataset will help researchers to investigate different deep learning models to find the optimum one.


Fig. 1 CNN architecture

3 Methodology

3.1 Recognition Procedure

In this research, we have experimented with five benchmark deep learning models: Inception V3, VGG 16, VGG 19, ResNet50, and MobileNet V2. Our goal is to find the best model among these well-known benchmark models for this application. All of these deep CNN models consist of more than 3 convolutional layers; a CNN with 3 or fewer convolutional layers is known as a shallow CNN. A basic CNN architecture is shown in Fig. 1. It mainly consists of the following three layers (a minimal sketch follows the list).
• Convolutional layer: produces a feature map by scanning the image with a filter over several pixels at a time.
• Pooling layer: decreases the amount of data created by the convolutional layer to store it more efficiently.
• Fully connected (dense) layer: flattens the outputs into a single feature vector. Weights are applied to the inputs provided by the feature analysis, and the image class is determined by generating a final probability.
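The sketch below shows such a basic CNN in Keras; the layer sizes and counts are illustrative assumptions, since the chapter only names the layer types.

```python
# Minimal sketch of the basic CNN of Fig. 1 (hypothetical layer sizes).
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # convolutional layer: feature maps
    layers.MaxPooling2D((2, 2)),                   # pooling layer: downsample
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                              # flatten into a single feature vector
    layers.Dense(128, activation="relu"),          # fully connected layer
    layers.Dense(16, activation="softmax"),        # 16 fruit classes -> probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```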

3.2 System Architecture

Figure 2 shows the real-life schematic diagram of our proposed fruit recognition system.


Fig. 2 Schematic diagram of the proposed real-life system

3.3 Data Collection Procedure

All of the photos in our dataset were manually captured, covering eight different varieties of fruits: apple, banana, orange, grape, jujube, guava, strawberry, and pomegranate. Each fruit is separated into two categories, fresh and rotten, giving a total of 3200 original images. We have resized all the images to a fixed size of 512 × 512 pixels. Because of the small size of our dataset, we may encounter overfitting, so we used data augmentation to help avoid it. Augmentation was done through scaling, random rotation, shifting, shearing, cropping, and adjusting brightness [11]. Scaling is done using an interpolation operation. A shifting augmentation moves all pixels of the image either horizontally or vertically without changing its dimensions. A shearing augmentation distorts the image along an axis to rectify the visual perception angles. A rotation augmentation randomly rotates the image by an angle between 0° and 360°. Brightness is adjusted by randomly darkening or brightening the image pixels, or both. The Keras library package in Python is used for the augmentation, with the following parameters: rotation with 90°, scaling with 0.2, shearing with 0.2, width shifting with 0.2, and brightness with [0.3, 1.7] (see the sketch below). After augmentation, we got a total of 12,335 images in our dataset. Figure 3 shows the complete recognition system diagram along with the data augmentation process.
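The stated Keras augmentation could be configured as in the sketch below; the parameter values follow the chapter, while the directory path, batch size, and the use of zoom_range for "scaling" are our assumptions.

```python
# Sketch of the stated Keras augmentation parameters (paths are hypothetical).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=90,            # rotation with 90 degrees
    zoom_range=0.2,               # scaling with 0.2 (closest ImageDataGenerator analogue)
    shear_range=0.2,              # shearing with 0.2
    width_shift_range=0.2,        # width shifting with 0.2
    brightness_range=[0.3, 1.7],  # random darkening/brightening
)
flow = datagen.flow_from_directory(
    "fruits/train", target_size=(512, 512), batch_size=32, class_mode="categorical"
)
```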


Fig. 3 Flow diagram of the proposed recognition system along with data augmentation

3.4 Pre-trained Deep Neural Networks

The pre-trained CNNs utilized in our research are VGG 16, VGG 19, Inception V3, ResNet50, and MobileNet V2. A brief description of these pre-trained deep neural models follows.

(1) Inception V3. Inception V3 is a pre-trained deep learning model with 48 layers. Among other improvements, it has label smoothing, factorized 7 × 7 convolutional layers, and an auxiliary classifier that propagates label information into the core of the network.

(2) VGG 16. VGG 16 is a well-known neural network architecture containing 16 layers (13 convolutional layers and 3 fully connected layers). A fixed-size 224 × 224 image with R, G, and B channels is the input for all network configurations. The only pre-processing performed is to normalize the RGB values: the mean value is subtracted from every pixel.

(3) VGG 19. VGG 19 is an improved version of VGG 16 that contains 19 layers (16 convolutional layers and 3 fully connected layers).

(4) MobileNet V2. MobileNet V2 is a 53-layer convolutional neural network. In contrast to standard residual models that use expanded representations at the input, the MobileNet V2 architecture is built on an inverted residual structure, in which both the input and output of the residual block are narrow bottleneck layers. In the intermediate expansion layer of MobileNet V2, features are filtered using lightweight depth-wise convolutions. To preserve representational strength, non-linearities in the narrow layers are removed.


Table 1 Architectures of 5 well-known deep learning models

                        VGG 16        VGG 19        MobileNet V2   ResNet50        Inception V3
Input size              224 × 224     224 × 224     224 × 224      224 × 224 × 3   299 × 299
Conv. layers            13            16            53             48              48
Filter size             3, 3          3, 3          1, 3           3, 3            1, 3, 5
Stride                  1, 2          1, 2          2              2               1, 2
Parameters              138 million   138 million   3.4 million    23 million      25 million
Fully connected layers  3             3             4              1               3

(5) ResNet50. ResNet50 is a well-known ResNet variant consisting of 48 convolutional layers, 1 max-pooling layer, and 1 average-pooling layer. It requires 3.8 × 10^9 floating-point operations. The architectures of the above CNNs are shown in Table 1.
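As a quick cross-check of Table 1, the five models can be instantiated from tf.keras.applications and their parameter counts inspected. This is a sketch under the assumption that the Keras reference implementations are used; the chapter does not state how the models were obtained.

```python
# Sketch: inspect parameter counts of the five benchmark models (Keras assumed).
import tensorflow as tf

builders = {
    "VGG 16": tf.keras.applications.VGG16,
    "VGG 19": tf.keras.applications.VGG19,
    "MobileNet V2": tf.keras.applications.MobileNetV2,
    "ResNet50": tf.keras.applications.ResNet50,
    "Inception V3": tf.keras.applications.InceptionV3,
}
for name, build in builders.items():
    net = build(weights=None)  # default input sizes match Table 1
    print(f"{name}: {net.count_params():,} parameters")
```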

4 Experimental Results

4.1 Dataset

As mentioned earlier, our dataset in this research contains images of eight fruits with two classes each (fresh and rotten): apple, banana, jujube, orange, grape, guava, pomegranate, and strawberry. Therefore, there is a total of 16 classes. From each class, we have taken 200 original images, so the dataset contains a total of 200 × 16 = 3200 original images. We performed augmentation on each class; after augmentation, the 200 images of each class become 734 images, giving a total of 734 × 16 = 12,335 images (Fig. 4). The dataset is uploaded to the Mendeley repository: https://data.mendeley.com/datasets/bdd69gyhv8/1. We have experimented with five benchmark deep learning-based CNN models (Inception V3, VGG 16, VGG 19, ResNet50, and MobileNet V2) to find the effective one for the classification of fresh and rotten fruits. To pick the best model from the training step, the dataset is divided into training and testing sets at a ratio of 80:20.
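Since VGG 16 turned out to be the best of the five models, the following is a hedged sketch of how it could be fine-tuned on the 16-class dataset; the classification head, learning rate, and epoch count are our assumptions, not the authors' exact setup.

```python
# Plausible VGG 16 transfer-learning sketch for the 16 fruit classes.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                       # keep pre-trained features frozen

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(16, activation="softmax"),  # 8 fruits x {fresh, rotten}
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_flow, validation_data=test_flow, epochs=20)  # 80:20 split flows
```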


Fig. 4 Sample images of the 16 fruit classes

4.2 Performance Analysis

Accuracy indicates the percentage (%) of total data correctly identified by the classifier. Precision is the percentage of the total predicted positive data that were actually positive as determined by the classifier. Recall is the percentage of all positive data that the classifier correctly identified as positive. F1-score is the harmonic mean of precision and recall [12]. Mathematical representations of these metrics are shown in Eqs. (4.1)–(4.4):

Accuracy(%) = (TP + TN) / (TP + TN + FP + FN) × 100    (4.1)

Precision(%) = TP / (TP + FP) × 100    (4.2)

Recall(%) = TP / (TP + FN) × 100    (4.3)

F1-score(%) = 2 × (Precision × Recall) / (Precision + Recall)    (4.4)

where
True Positive (TP): the model predicts a positive sample as positive,
True Negative (TN): the model predicts a negative sample as negative,
False Positive (FP): the model predicts a negative sample as positive,
False Negative (FN): the model predicts a positive sample as negative.

The class-wise TP, TN, FP, and FN are determined by using Eqs. (4.5)–(4.8):

TP_i = c_ii    (4.5)

TN_i = Σ_{k=1, k≠i}^{n} Σ_{j=1, j≠i}^{n} c_jk    (4.6)

FP_i = Σ_{j=1, j≠i}^{n} c_ji    (4.7)

FN_i = Σ_{j=1, j≠i}^{n} c_ij    (4.8)

where i is the class and n is the total number of fruit classes. Here, c_jk is an element of the confusion matrix, with j and k indexing its rows and columns [12]. Table 2 shows the obtained accuracy of the five deep learning models; based on this accuracy table, we have found that VGG 16 gives the highest performance among the five investigated models. Table 3 shows the confusion matrix of VGG 16, Table 4 shows the category-wise precision, recall, and F1-score of VGG 16 on the fruit dataset, and Table 5 compares the performance with methods reported by other researchers. The recognition and classification success rates were visualized using the confusion matrix shown in Table 3. A confusion matrix is a contingency table containing information about a classification system's actual and expected classifications [20].

Table 2 Accuracy comparison for five deep learning models using our dataset

Method         Accuracy (%)
Inception V3   93.63
ResNet50       93.67
MobileNet V2   95.34
VGG 19         94.82
VGG 16         96.04
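Equations (4.1)–(4.8) translate directly into a few lines of NumPy. The sketch below is illustrative; it assumes the confusion matrix C has true classes on the rows and predicted classes on the columns, matching the index convention of Eqs. (4.7) and (4.8).

```python
# Sketch of the class-wise metrics of Eqs. (4.1)-(4.8) from an n x n confusion matrix.
import numpy as np

def per_class_metrics(C):
    n = C.shape[0]
    total = C.sum()
    for i in range(n):
        tp = C[i, i]                      # Eq. (4.5)
        fp = C[:, i].sum() - tp           # Eq. (4.7): column i, off-diagonal
        fn = C[i, :].sum() - tp           # Eq. (4.8): row i, off-diagonal
        tn = total - tp - fp - fn         # Eq. (4.6)
        accuracy = (tp + tn) / total * 100
        precision = tp / (tp + fp) * 100
        recall = tp / (tp + fn) * 100
        f1 = 2 * precision * recall / (precision + recall)  # already in percent
        print(f"class {i}: acc={accuracy:.2f} prec={precision:.2f} "
              f"rec={recall:.2f} f1={f1:.2f}")
```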

Table 3 Confusion matrix of the VGG 16 model (16 × 16 normalized matrix over the eight fresh and eight rotten fruit classes, with true classes on one axis and predicted classes on the other)


Table 4 Category-wise precision, recall, and F1-score of the testing fruit dataset

Category             Precision   Recall   F1-score
Fresh apple          0.85        0.97     0.91
Fresh banana         0.93        0.95     0.94
Fresh grape          0.93        1.00     0.96
Fresh guava          0.98        0.96     0.97
Fresh jujube         0.95        0.93     0.94
Fresh orange         0.97        0.97     0.97
Fresh pomegranate    0.93        0.93     0.93
Fresh strawberry     1.00        0.97     0.99
Rotten apple         0.95        0.90     0.92
Rotten banana        0.97        0.93     0.94
Rotten grape         0.97        0.93     0.95
Rotten guava         0.95        0.97     0.96
Rotten jujube        0.97        0.80     0.88
Rotten orange        0.95        1.00     0.98
Rotten pomegranate   0.92        0.90     0.91
Rotten strawberry    0.93        1.00     0.96
Average              0.95        0.94     0.94

Table 5 Performance comparison of fruit class recognition performed by other researchers

Model                                Technique                      Accuracy (%)   Fruit categories
Jahanbakhshi and Kheiralipour [13]   LDA, QDA                       92.59          1 (carrot only)
Bargoti and Underwood [14]           Faster R-CNN                   90.00          3 (mangoes, almonds, apples)
Siddiqi [15]                         Fine-tuned MobileNet           94.82          7 (apple, banana, grape, mango, orange, strawberry, watermelon)
Lu et al. [16]                       CNN                            91.44          9 (blackberry, blueberry, anjou pear, cantaloupe, black grape, bosc pear, golden pineapple, green grape, granny smith apple)
Rojas-Aranda et al. [17]             Fine-tuned MobileNetV2 model   95.00          3 (apple, banana, orange)
Syazwani [18]                        ANN-GDX                        94.40          1 (lemon only)
Ponce et al. [19]                    6 fine-tuned models            95.99          1 (olive only)
Our method                           VGG 16                         96.04          8 (apple, banana, orange, grape, guava, jujube, pomegranate, strawberry)


Figure 5 presents some of the fruit images where our system succeeded in performing the correct classification. Due to the presence of similar characteristics across fruit classes, some of the resulting classification scores overlap. Figure 6 depicts some fruit images where our system failed to perform the correct classification. This is due to the inter-class similarities among the fruit classes, the low resolution of the photos, the visibility of only a small portion of the fruit, etc. (Fig. 7).

Fig. 5 Samples of fruit images that are correctly classified by VGG 16

Fig. 6 Samples of fruit images that are misclassified by VGG 16


Fig. 7 Sample images from 16 fruit categories: a, f, l fresh guava; b fresh pomegranate; c, g rotten strawberry; d, i fresh apple; e, h rotten grape; k rotten jujube; j multiple fruits in an image. The fruit images in a, b, d, e, f, g, h, i, and l are recognized correctly by VGG 16; a, d, f, and h are recognized by MobileNet V2 but not by Inception V3; b, e, and f are recognized by Inception V3 and ResNet50 but not by VGG 19; k and l are recognized by VGG 19 and MobileNet V2; i is recognized by MobileNet V2 and VGG 19 but not by Inception V3. However, none of the techniques could identify c, j, and k, as these are hazy, noisy, contain multiple fruits, or have low resolution


5 Conclusion

Identification of the freshness of various fruits has become critical in the agricultural industry for quality grading and packaging, but doing this job manually is inefficient. As a result, a new categorization model is needed to recognize flaws in fruits and reduce human effort, cost, and processing time in the agriculture industry. This project provided researchers with a large dataset to use in developing effective algorithms for recognizing a wider range of fruits and overcoming restrictions by enhancing accuracy. We applied five different deep learning models: Inception V3, VGG 16, VGG 19, ResNet50, and MobileNet V2. Among them, VGG 16 gives the best result, with an accuracy of 96.04% on our extensive fruit dataset. Deep CNN models using our dataset can detect rotten fruit early in a batch and prevent it from spreading to other fruits, reducing food loss. We increase agricultural efficiency by automating fruit quality forecasting, grading, and packaging. The findings of this research are satisfactory, because it was conducted using real-world data and the results compare well with what was expected. In the future, we will increase the size of our dataset with a greater variety of fruit images containing more features and develop a new, efficient CNN model.

Ethical Approval (Involvement of Animals) This article does not contain any studies with animals performed by any of the authors.

Ethical Approval (Involvement of Human Subjects) There are no studies involving human participants done by any of the authors in this article. The datasets used in the article are open to the public. For the usage of these datasets, proper citation rules should be maintained.

Credit Author Statement Nusrat Sultana: Software, Data collection, Data analysis, Result presentation, Visualization, and Original draft preparation; Musfika Jahan: Software, Data collection, Data analysis, Result presentation, Visualization, and Original draft preparation; Mohammad Shorif Uddin: Conceptualization, Supervision, Reviewing, and Editing.

Declaration of Competing Interest The authors declare that they have no conflict of interest, nor any competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements We are very grateful to the domain expert Mohammad Enayet-e-Rabbi, Deputy Director of Quality Control, Seed Certification Agency, Ministry of Agriculture, Bangladesh, for the valuable help and cooperation to successfully accomplish the task.

Data Availability Fresh and Rotten Fruits Dataset for Machine-Based Evaluation of Fruit Quality (Mendeley Data).



Chapter 5

Rice Leaf Disease Classification Using Deep Learning with Fusion Concept

N. Rajathi, K. Yogajeeva, V. Vanitha, and P. Parameswari

1 Introduction

The agricultural sector continues to play a key role in development, particularly in low-income regions. In terms of overall output and baseline economic activity, its global footprint is huge. Agriculture is the main income source for 86% of rural families, and about 75% of the world's poor live in rural areas, earning most of their income from agriculture and related industries. Farming provides us with food, income, and jobs, and can therefore be an engine for financial growth in the world's heavily agricultural regions, as well as a massive tool for reducing poverty there.

Harvest problems have long been one of the top concerns for farmers, and addressing them could boost production across the agribusiness sector. An autonomous computer system that diagnoses and then discovers problems might be a major aid for the farmer, who would otherwise undertake this diagnostic task by optically examining diseased plant leaves. Building such a platform, available at the farmer's fingertips via a mobile device, is likely to be of immense assistance to farmers who do not currently have access to the necessary commodities and logistics. This concept has been extended to plant disease detection methods for wireless control and monitoring in large-scale agricultural production, with drones for monitoring and sensors to manage the amount of water and fertilizer, as well as the adjustments necessary for a qualitative production result.

Plant disease can cause stunted growth, reducing output. Plants are an essential part of everyone's diet; as a result, ensuring that the plant is disease-free is crucial, and if an infection arises, it is necessary to detect the disease. Many diseases affect the

N. Rajathi (B) · K. Yogajeeva · V. Vanitha
IT Department, Kumaraguru College of Technology, Coimbatore 641049, India
e-mail: [email protected]

P. Parameswari
MCA Department, Kumaraguru College of Technology, Coimbatore 641049, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_5


plant’s leaves at various periods of the year, depending on the temperature and climate where the leaves become infected. Because of the diversity of physical topography, environmental elements such as temperature, humidity, and rainfall vary across India. Insect pests cause agricultural damage, resulting in severe costs for farmers. The leaves are destroyed by diseases, insects, nematodes, and weeds. Deep learning models were successfully applied in different application domains including image recognition, plant disease detection and classification, etc. Several plant diseases and pest detection methods based on deep learning are successfully applied in real agricultural practice. Deep learning with data fusion concept combines multiple data before doing the analysis and provides accurate predictions. Given the variety of sources the data comes from, it makes more sense to combine them to improve disease detection. Data fusion is the simultaneous merging of data and information from several sources in order to improve performance. In late fusion, each source’s data is processed independently to produce high-level inference conclusions and the same is applied in this study. The remainder of this section is organized as follows. The relevant studies are discussed in Sect. 2, the data collection is explained in Sect. 3, and the methodologies are discussed in Sect. 4. The results obtained were discussed in Sect. 5. The conclusion part is discussed in Sect. 6.

2 Literature Survey

Qaid et al. [1] developed models using transfer learning and tested them with two data sets: the Kaggle COVID-19 database and a local data set from Asir Hospital in Abha, Saudi Arabia. The proposed models detected COVID-19 cases with high accuracy. Batur Şahin et al. [2] used the clonal selection algorithm's clustering theory to define features in a deep learning-based vulnerability detection model. The proposed approach used an immune-based feature selection model to improve the detection abilities of static analyses. Three open-source PHP applications were used to analyze a real-world SA dataset. Comparisons were carried out by employing a classification model on all features to assess the classification improvement of the proposed methods. The findings demonstrated that the suggested strategy significantly increased both the classification accuracy and the true positive rate. Two hybrid deep learning architectures were created and tested by Abuhmed et al. [3] for the detection of AD progression. Several deep bidirectional long short-term memory models are combined to create these models. The models were evaluated using ADNI data, showing that the suggested deep learning models are more efficient and useful. Shikalgar et al. [4] categorized two multimodal data sources by using a neural network in the proposed system. This method aims to improve learning by combining CNN and DNN weight factors to handle heterogeneous multimodal input. With more features, the classifier becomes more accurate because the classification error decreases.


Deep hybrid models have been successfully applied in various application domains, including human activity recognition [5], personality trait classification [6], network security [7–11], face mask detection [12], credit card scoring [13], child behavior prediction [14], lung disease detection [15], gas classification [16], stream flow prediction [17], and power consumption forecasting [18]. Using data fusion over several sensors, Maimaitijiang et al. [19] studied multi-sensor phenotyping of soybean, and by fusing data from various sensors, the authors of [20] looked at the relationship between canopy information and grain yield. In [21], the authors analyzed deep learning models against various performance benchmarks for crop yield prediction. The methods evaluated in their study include XGBoost, Convolutional Neural Networks, Deep Neural Networks, CNN-XGBoost, CNN-Recurrent Neural Networks, and CNN-long short-term memory. They conducted tests on a publicly accessible soybean dataset for the case study, which includes 25,000 samples and 95 parameters such as weather and soil conditions.

3 Dataset Used

The Kaggle API was utilized to acquire the rice leaf image database used in this study. The dataset contains 3355 RGB color rice leaf images in total, with each image containing only one disease. It comprises images from three disease categories, leaf blast, brown spot, and hispa, as well as healthy images (Fig. 1). The training set contains 2345 images across the class labels, whereas the test set has 1010 images; that is, the dataset has been split into 70:30 portions for training and testing. The rice leaf images were used to test the fusion model in a classification task. The total number of images in the infected and healthy classes is given in Fig. 2.

Fig. 1 Samples of rice leaf disease dataset


Fig. 2 Total number of images

4 Typical Deep Learning Architecture

A typical Convolutional Neural Network (CNN)-based deep learning architecture is shown in Fig. 3. In a CNN, many convolutional layers are stacked on top of one another, and each layer can recognize progressively more complex structures. Three kinds of layers, namely fully connected layers, pooling layers, and convolutional layers, make up the CNN; when combined, they produce a CNN architecture. The output of the convolutional layers is flattened by the flatten layer, and the fully connected (FC) layer connects the neurons between two layers. These FC layers are positioned before the output layer and make up the last few layers of a CNN design. When fed with insufficient data, and often even otherwise, a deep learning model built purely from fully connected neural network layers may overfit and require excessive computational resources and power, problems that are less pronounced in traditional machine learning techniques. Deep hybrid learning is the fusion network that results from merging deep learning with machine learning. In deep hybrid learning, we extract features from unstructured data using deep learning techniques

Fig. 3 Deep learning architecture


and then utilize traditional machine learning techniques to create highly accurate classification models from the extracted representations. As a result, deep hybrid learning (DHL) enables us to combine the advantages of DL and ML, addresses their respective shortcomings, and delivers more precise and computationally affordable solutions: the hybrid approach is faster, more efficient, requires less computing power, and outperforms individual deep learning methods.
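To make the layer stack just described concrete, here is a minimal Keras sketch of such an architecture; the layer counts, filter sizes, and four-class output (three diseases plus healthy) are illustrative assumptions rather than the authors' exact configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# A minimal CNN: stacked convolution + pooling blocks, a flatten
# layer, and fully connected layers before the softmax output.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(256, 256, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                       # bridges conv maps to FC layers
    layers.Dense(128, activation="relu"),   # fully connected layer
    layers.Dropout(0.5),                    # mitigates the overfitting noted above
    layers.Dense(4, activation="softmax"),  # 4 classes: 3 diseases + healthy
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```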

4.1 Fusion Methods

There are two kinds of fusion methods: early fusion and late fusion.

Early Fusion Early fusion is applicable to both raw and pre-processed sensor data. Prior to fusion, features should be extracted from the data; otherwise, the procedure becomes difficult, particularly if the data sources have varying sampling rates across the modalities. When one source of data is discrete while the others are continuous, synchronizing them can be difficult. Consequently, early data fusion faces a considerable barrier in combining data sources into a single feature vector.

Late Fusion In late fusion, each sense modality is processed individually until the very end, after which the outputs are combined using probabilistic methods. The term "late fusion" might refer to the combination of outcomes from various retrieval techniques or from distinct similarity rankings. This study employs the late fusion approach: deep learning models are utilized for feature extraction, and the final classification is done by machine learning classifiers, as shown in Fig. 4. The proposed methodology is depicted in Fig. 5.

Fig. 4 Fusion of DL and ML


Fig. 5 Working of the proposed methodology

4.2 Data Pre-processing

Three different types of rice leaf diseases are utilized in our experiment. We started by resizing each image to 256 × 256 and converting the RGB colored leaf images into the HSV color space, and then moved on to segmented leaf images from the same dataset. The background was smoothed in the segmented images so that they could yield more useful information that was easier to examine. Finally, we evaluated the performance of the developed techniques using grayscale images from the same dataset. A training set and a testing set were created from all the leaf images; to review and analyze, we divided the leaf images in a 70:30 ratio (70% training images and 30% testing images). CNNs are the most often employed deep learning algorithms; for the study presented in this paper, we applied a memory-efficient deep neural network model built from scratch on a dataset of 3355 infected and healthy rice images, yielding an excellent rice leaf disease detection accuracy.
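A minimal sketch of this pre-processing pipeline, assuming OpenCV and scikit-learn; the `images` and `labels` arrays are hypothetical containers for the loaded dataset:

```python
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def preprocess(path):
    """Resize to 256 x 256 and derive HSV and grayscale variants,
    mirroring the steps described above."""
    img = cv2.imread(path)                        # OpenCV loads images as BGR
    img = cv2.resize(img, (256, 256))
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)    # HSV color space variant
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # grayscale variant
    return hsv, gray

# `images` and `labels` are assumed to hold the loaded dataset.
X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.3,                # 70:30 train/test split
    stratify=labels, random_state=42)
```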

4.3 Feature Extraction

The transfer learning technique is used for feature extraction. Transfer learning reuses a previously trained network on a new task; it is often used in deep learning because it can train a network with a small amount of data. In this study, a fully connected layer with the softmax function and four classes is used. The layers in each model are frozen, and a stack of one activation layer, one batch-normalization layer, and one dropout layer is added. All models were evaluated using various dropout levels, learning rates, and batch sizes. The deep neural network input size is 256 × 256. As deep hybrid learning supports transfer learning, DenseNet 121 is used as the pre-trained model for feature extraction.
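The following sketch shows how a frozen DenseNet 121 backbone with the added activation/batch-normalization/dropout stack might look in Keras; the dropout rate and pooling choice are assumptions, not the chapter's reported settings:

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121

# Frozen DenseNet121 backbone used purely as a feature extractor,
# topped with the activation / batch-normalization / dropout stack.
base = DenseNet121(include_top=False, weights="imagenet",
                   input_shape=(256, 256, 3), pooling="avg")
base.trainable = False  # freeze the pre-trained layers

extractor = models.Sequential([
    base,
    layers.Activation("relu"),
    layers.BatchNormalization(),
    layers.Dropout(0.3),  # illustrative dropout level
])

# The extracted deep features are later handed to the ML classifiers
# (late fusion), e.g.: features = extractor.predict(X_train)
```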


4.4 Machine Learning Classifiers In this work, three machine learning classifiers are used for final classification.

4.4.1 AdaBoost

AdaBoost is a machine learning-based ensemble method, most commonly built on decision trees as base learners. By transforming several weak learners into strong learners, such algorithms increase prediction power.

4.4.2 XGBoost

Extreme Gradient Boosting (XGBoost) is a supervised learning technique for large-scale regression and classification. It provides reliable results by using successively created shallow decision trees and a highly scalable training strategy that minimizes overfitting.

4.4.3 Random Forest

Random forest is a supervised machine learning algorithm that is frequently used in classification problems. It creates decision trees from several samples, then utilizes the majority vote for classification and the average for regression.

5 Experiment and Results

To assess the classification performance of the proposed methodology, 3355 rice leaf images were used. Two non-fused models, a sequential CNN and DenseNet-121, were implemented, and their classification performances were studied. The classification performances of the non-fused models were compared with the classification accuracies of the three fused models, as presented in Fig. 6. From the results, it is inferred that the fused models give better accuracy than the non-fused models. Among the fused models, the random forest-based fusion outperforms the other models.


Fig. 6 Classification performance of the models used

6 Conclusion and Future Scope

To detect plant diseases, we used both deep learning and machine learning techniques on a data set of rice leaves. The deep learning model was used to extract features, with fine-tuning used to carry out transfer learning, and the classifiers then learn from the extracted features. According to the models' behavior, extracting features and classifying them with the best ML classifier is far more efficient than pure transfer learning: it produces more accurate results and takes less time to execute. The suggested model obtained 98.85% accuracy. In the future, we intend to incorporate more rice plant images as well as other diseased plant leaves in the analysis. Future research will combine multiple evolutionary optimization strategies with deep learning models and use the resulting models to analyze the performance on real-time data.

References

1. Qaid TS, Mazaar H, Al-Shamri MYH, Alqahtani MS, Raweh AA, Alakwaa W (2021) Hybrid deep-learning and machine-learning models for predicting COVID-19. Comput Intell Neurosci
2. Batur Şahin C, Abualigah L (2021) A novel deep learning-based feature selection model for improving the static analysis of vulnerability detection. Neural Comput Appl 33(20):14049–14067
3. Abuhmed T, El-Sappagh S, Alonso JM (2021) Robust hybrid deep learning models for Alzheimer's progression detection. Knowl-Based Syst 213:106688
4. Shikalgar A, Sonavane S (2020) Hybrid deep learning approach for classifying Alzheimer disease based on multimodal data. In: Computing in engineering and technology. Springer, Singapore, pp 511–520
5. Gumaei A, Hassan MM, Alelaiwi A, Alsalman H (2019) A hybrid deep learning model for human activity recognition using multimodal body sensing data. IEEE Access 7:99152–99160


6. Ahmad H, Asghar MU, Asghar MZ, Khan A, Mosavi AH (2021) A hybrid deep learning technique for personality trait classification from text. IEEE Access 9:146214–146232
7. Hassan MM, Gumaei A, Alsanad A, Alrubaian M, Fortino G (2020) A hybrid deep learning model for efficient intrusion detection in big data environment. Inf Sci 513:386–396
8. Potluri S, Henry NF, Diedrich C (2017) Evaluation of hybrid deep learning techniques for ensuring security in networked control systems. In: 2017 22nd IEEE international conference on emerging technologies and factory automation (ETFA). IEEE, pp 1–8
9. Ertam F (2019) An efficient hybrid deep learning approach for internet security. Physica A 535:122492
10. Xin Y, Kong L, Liu Z, Chen Y, Li Y, Zhu H, Gao M, Hou H, Wang C (2018) Machine learning and deep learning methods for cybersecurity. IEEE Access 6:35365–35381
11. Wei Y, Jang-Jaccard J, Sabrina F, Singh A, Xu W, Camtepe S (2021) AE-MLP: a hybrid deep learning approach for DDoS detection and classification. IEEE Access 9:146810–146821
12. Loey M, Manogaran G, Taha MHN, Khalifa NEM (2021) A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic. Measurement 167:108288
13. Zhu B, Yang W, Wang H, Yuan Y (2018) A hybrid deep learning model for consumer credit scoring. In: 2018 international conference on artificial intelligence and big data (ICAIBD). IEEE, pp 205–208
14. Kumar TS, Senthil T (2021) Construction of hybrid deep learning model for predicting children behavior based on their emotional reaction. J Inf Technol 3(01):29–43
15. Bharati S, Podder P, Mondal MRH (2020) Hybrid deep learning for detecting lung diseases from X-ray images. Inform Med Unlocked 20:100391
16. Wang SH, Chou TI, Chiu SW, Tang KT (2020) Using a hybrid deep neural network for gas classification. IEEE Sens J 21(5):6401–6407
17. Lin Y, Wang D, Wang G, Qiu J, Long K, Du Y, Xie H, Wei Z, Shangguan W, Dai Y (2021) A hybrid deep learning algorithm and its application to stream flow prediction. J Hydrol 601:126636
18. Yan K, Wang X, Du Y, Jin N, Huang H, Zhou H (2018) Multi-step short-term power consumption forecasting with a hybrid deep learning strategy. Energies 11(11):3089
19. Maimaitijiang M, Ghulam A, Sidike P, Hartling S, Maimaitiyiming M, Peterson K, Shavers E, Fishman J, Peterson J, Kadam S et al (2017) Unmanned aerial system (UAS)-based phenotyping of soybean using multi-sensor data fusion and extreme learning machine. ISPRS J Photogramm Remote Sens 134:43–58
20. Maimaitijiang M, Sagan V, Sidike P, Hartling S, Esposito F, Fritschi FB (2020) Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens Environ 237:111599
21. Oikonomidis A, Catal C, Kassahun A (2022) Hybrid deep learning-based models for crop yield prediction. Appl Artif Intell 1–18

Chapter 6

Advances in Deep Learning-Based Technologies in Rice Crop Management

Mayuri Sharma and Chandan Jyoti Kumar

1 Introduction

Rice is a vital crop that significantly contributes to the global economy, as about half of the global population relies on rice for sustenance. In order to meet the increasing food demand of the world's population, farmers must maintain the annual rice yield and the frequency of harvests. However, disorders, biotic stresses, diseases, reduced growth, stagnant yields, etc., act as major hindrances to rice cultivation. Managing rice crop cultivation helps preserve global food and economic security, but manual rice crop management requires considerable manpower and time. Thus, technologies like machine learning (ML), deep learning (DL), computer vision (CV), and the Internet of Things (IoT) have been introduced to overcome these obstacles and improve the accuracy of automated systems for managing the rice crop. The use of ML/DL with CV-IoT tools has stimulated interest in creating agriculture-based computer systems. Traditional ML models, including Support Vector Machine (SVM), Self-Organizing Map (SOM), Decision Tree (DT), Neural Networks (NN), clustering, and thresholding techniques, are utilized for Region of Interest (RoI) segmentation, classification, and recognition. The use of DL and ensemble learning models increases the models' diagnostic accuracy. With the availability of massive amounts of data, GPUs are used to train these computational models [1–13].

M. Sharma
Department of Computer Science and Engineering, Assam Royal Global University, Guwahati 781035, India

C. J. Kumar (B)
Department of Computer Science and Information Technology, Cotton University, Guwahati 781001, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_6


The remainder of this chapter is structured as follows. Section 2 presents the basic flow of DL rice crop management systems as well as the DL models used for these systems. Section 3 provides the DL applications in rice crop management. Section 4 discusses the present challenges of DL-based rice crop management systems. Finally, Sect. 5 provides a conclusion.

2 DL-Based Rice Crop Management System

The general framework of the rice crop management system is shown in Fig. 1. Data annotation can be implemented with open-source labeling toolkits like MakeSense, LabelImg, and VGG Image Annotator. First, the pre-processed images are sent as input to the Convolutional Neural Network (CNN)-based models for feature extraction and classification. Integration of DL models (UNet, DenseNet, VGG19, and MobileNetV2) with attention/inception modules as well as metaheuristic optimization techniques (rider optimization, spider monkey optimization, water wave optimization, and superpixel optimization) is another way of predicting physical traits in rice panicles, grain, and leaves. The extracted features can be sent to detector block 2 for localization. Region-based CNN (RCNN) models like Faster RCNN, Mask RCNN, and Fast RCNN have successfully analyzed the traits in rice seedlings and rice grain/panicle, and have estimated the probability of diseases and their severity in rice crops. Multimodal rice data is sent to transformer block 3 [14, 15].

Fig. 1 Schematic diagram of rice crop management system


Fig. 2 Different DL models used in developing automated rice crop management system

2.1 Rice CNN Models

Deep learning architectures based on CNNs are frequently used for rice image classification. A CNN is a particular form of neural network designed to effectively recognize and classify input images by extracting key features. The release of the ImageNet dataset established a baseline for creating advanced pre-trained DL models, and transfer learning has been incorporated into these pre-trained models to produce more effective results. Transfer learning (TL) techniques try to transfer the knowledge obtained from an earlier problem to a new task that requires less training. The pre-trained DL models applied in rice crop management are fine-tuned by changing the hyperparameters along with the CNN model's last layers to train it for the new task [14–17]. Figure 2 shows various CNN detection/classification models used in rice crop management. In managing rice crops, pre-trained DL models such as DenseNet169, ResNet50, ResNet152, DenseNet121, VGG19, and VGG16 have been employed, with the DenseNets and VGGs doing well in most situations. With nearly 100% accuracy, an ensemble of TL models is a new trend, particularly in diagnosing rice disease/deficiency disorders.

CNN architectures, with convolution layers, pooling layers, and activation layers, have filters of different sizes. Small patches (filters/features) of the input images are selected to make rough feature matches on different images, and each filter is applied across the whole image to match [18–20]. The corresponding pixel values of the filter and the image portion under the filter are multiplied, and the filter moves with a certain stride until it parses the complete image, creating a convolution feature map. All negative values obtained from the convolution layer are removed by the ReLU layer. In the pooling layer, a window of a certain size is moved across the entire matrix after it passes through the ReLU layer, and the maximum or average value of the ReLU layer's output is taken to shrink the image size. Figure 3 shows a sample of rice features obtained by combining the 3rd


Fig. 3 Illustrations of learning rice features by DL mechanism

Fig. 4 DL segmentation models used in developing automated rice crop management systems

and 5th blocks of VGG19. In detection models, a Feature Pyramid Network (FPN) has been used to improve the detection of small targets by using multiscale feature maps. The full feature maps are then sent to the region proposal network (RPN), which generates multiple anchor boxes in each feature map for searching possible RoIs. Each RoI then yields a fixed-size feature map. The classes of the proposals are calculated through softmax/sigmoid classification, and the final precise locations of the detected boxes are obtained using Bounding Box (BBox) regression. Cascading detectors are known to improve the accuracy of the prediction box [20–23]. Image segmentation approaches (Fig. 4) are applied in rice crop management by dividing the images into several segments like lesion, seed, root, grain, and panicle. For each object in the image, segmentation produces a pixel-oriented mask, and these segments can also be used for classification or detection tasks [7, 17, 24–26].
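The filter-sliding arithmetic described above can be spelled out directly. The toy NumPy sketch below computes a single convolution feature map, applies ReLU, and max-pools the result; it is an illustration of the mechanics, not an efficient implementation:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the filter over the image, multiplying and summing the
    corresponding pixels to build the convolution feature map."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0)  # removes all negative values

def max_pool(x, size=2):
    """Take the maximum over non-overlapping windows to shrink the map."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

feature_map = max_pool(relu(conv2d(np.random.rand(8, 8), np.ones((3, 3)))))
```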

2.2 Rice Transformers

Transformers based on shunted/self-attention, cross-attention, and data fusion have shown great potential in controlling rice diseases. The transformer can handle


Fig. 5 Schematic diagram of transformer/SegFormer used in rice crop management

sequential data and consists of encoding and decoding parts. The rice images are fed into the transformer in the form of patches. The rice transformer has handled multimodal data while controlling rice diseases, where an MLP and a CNN are the feature extractors for continuous agriculture sensor data and rice image data, respectively. Extracted features are transmitted linearly to self-attention encoders. Encoders in transformers produce multiscale features and have a self-attention module, whereas the decoding part has a cross-attention module. Output from the last encoder is sent to all cross-attention decoders, where the features are concatenated and then sent to the final classification layer. SegFormer is created by combining lightweight multilayer decoders with a transformer. Figure 5 shows the schematic architecture of the transformer and SegFormer for rice crop management. The coordinate attention mechanism links the encoder with the decoder. SegFormer reduces the input channels and predicts the kernel for each target position using a shunted encoder and a decoder with feature upsampling. The weighted sum of each local pixel of the original feature map and each predicted kernel of the new feature map is computed, and this upsampled feature map is forwarded to an MLP for feature concatenation and mask prediction [17, 18].
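As a rough illustration of the self-attention step at the heart of these encoders, the NumPy sketch below applies scaled dot-product attention to a sequence of patch embeddings; the random projection matrices stand in for learned weights:

```python
import numpy as np

def self_attention(patches, d_k):
    """Scaled dot-product self-attention over a sequence of image
    patch embeddings; random projections stand in for learned W_q/W_k/W_v."""
    n, d = patches.shape
    W_q, W_k, W_v = (np.random.randn(d, d_k) for _ in range(3))
    Q, K, V = patches @ W_q, patches @ W_k, patches @ W_v
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise patch similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over patches
    return weights @ V                              # attention-weighted values

# 16 patches, each embedded into 64 dimensions
out = self_attention(np.random.randn(16, 64), d_k=32)
```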

3 Applications of DL in Rice Crop Management DL has been used in agriculture to produce crops more quickly and efficiently while also lowering labor costs. Maximizing rice productivity requires measuring the physiological and physical traits of the rice plant, a process known as rice phenotyping.


Table 1 Use of DL for rice crop management in recent years

DL method used | Application | Dataset used | Best performance
TL/ensemble [3, 5] | Rice disease diagnosis | Kaggle and Mendeley data | A = 100%
TL/ensemble [4] | Rice nutrient deficiency diagnosis | Kaggle and Mendeley data | A = 100%
Transformer [18] | Rice disease control | On-field | A = 97.38%
Average weighted ensemble learning [27] | Rice nutrient deficiency diagnosis | Kaggle data | A = 98.33%
Multilayer perceptron + CNN [28] | Rice disease diagnosis | Multimodal on-field data | A = 95.31%
SVM + CNN [29] | Paddy disease detection/classification | On-field | A = 91.45%
Transfer learning (TL) [30] | Rice disease diagnosis | Public data (Kaggle) | Higher performance than ML
Custom CNN [31] | Rice leaf disease detection/classification | Public data (Mendeley) | A = 97.47%
TL [32] | Rice disease/deficiency detection | On-field | A = 98.03%
RSegformer [33] | Rice leaf disease severity prediction | On-field | MIoU = 85.38%
Multiscale feature fuse-attention [34] | Rice blast disease extraction | Public data | MIoU = 96.15%
CNN [35] | Grain counting in rice panicle | On-field/laboratory data | E = 5%
YOLO, FRCNN [36] | Rice panicle detection/counting | On-field data | A = 92.92%

Image-based analysis using DL is one of the techniques that can replace the manual crop observation and analysis required for rice phenotype, disease, and variety analysis [4–6]. This section provides an overview of the DL applications in rice crop management. Table 1 shows the most recent works in this field, evaluated in terms of accuracy (A), mean error (E), mean intersection over union (MIoU), etc.

3.1 Rice Development Stage Analysis

The problem of counting rice grains/panicles can be addressed by deep semantic segmentation and detection [19]. The images of rice panicles are captured using smartphones, digital cameras, and phenotyping instruments. Detection methods like Faster RCNN and YOLO, or their combination with CNN models and field robots, have been implemented for counting and detecting grain, tillers, and panicles [21, 24, 25, 37].


It is also possible to estimate the flowering day after detecting and counting the flowering panicles of rice. DL methods are known to show high accuracy compared to conventional ML methods [37]. An open-source DL framework (Caffe) implementation and superpixel image patches have also proven to be a robust method for extracting and detecting rice panicles and tillers. Rice cultivars, different types of mobile phones, and the ambient circumstances during image collection have no impact on the efficiency of CNN-based web applications for automatic counting of tillers [22, 25]. Based on the UNet model's prediction (segmented roots), the root distribution parameters during the late development stage of rice were found to be unaffected by the acquisition date [26].

3.2 Rice Disease and Disorder Diagnosis

Crop diseases and disorders constitute a significant threat in agricultural production systems, deteriorating yield quality and quantity at the production, storage, and transportation levels. Disease and deficiency disorder analysis of rice is time-consuming in the laboratory, so DL techniques are being utilized for efficient outcomes. The feature modules of different CNN architectures are combined to perform TL on the target rice disease/disorder datasets [38, 39]. The DL detection model Faster RCNN is effective in real-time rice leaf disease identification [40]. An ensemble of various fine-tuned CNN pre-trained models is found to enhance the diagnosis system [27, 41, 42]. Multiple studies have applied metaheuristic optimization techniques to optimize the hyperparameters of various DL models [43]. These DL-based algorithms have increased diagnostic accuracy by a large extent compared to conventional ML algorithms [44–53].

3.3 Rice Variety Identification

The pre-trained model VGG16 has served as a non-destructive method to classify different Turkish rice varieties such as Arborio, Basmati, Ipsala, Jasmine, and Karacadag, outperforming an ANN [54]. DL/TL and CV technology have been used to create a real-time prototype model for monitoring broken, wrinkled, and black-spot rice defects [55], as well as to achieve better and more objective rice quality assessment, including medium-grain Japonica rice, round-grain Glutinous rice, and long-grain Indica rice [56].

4 Challenges and Future Scopes

A significant obstacle preventing ML/DL researchers from working in the field of rice crop management is the scarcity of publicly available datasets for the rice crop. This research barrier might be removed by making datasets freely accessible


to the academic community. In rice plants, occlusion between grains, panicles, and leaves is caused by growth density, which might make DL detection models less effective; therefore, developing and testing the models at various planting densities is crucial. Rice datasets may come from the same rice type or from distinct rice varieties, and root, grain, panicle, and leaf shapes and heights typically change across growth phases, which may also affect the detection accuracy of DL models. The development of models for each rice variety and growth stage needs to receive more attention. Estimating the severity of a rice disease or nutrient deficiency is another important but less explored aspect of managing the rice crop [24, 28, 37, 41].

5 Conclusion

This chapter provided an overview of developments in DL-based technologies for managing rice crops by presenting the most recent CNN and transformer models used in agriculture. It will undoubtedly inspire researchers to concentrate on developing agricultural technologies for tracking the health/growth of rice crops, detecting/controlling diseases and pests, and harvesting rice with high productivity and low cost. In order to tackle the issues facing agriculture today, it may be inferred that hybrid DL models will be widely used in the future for automating all agriculture-based systems.

Acknowledgements None.

References

1. Wang D, Huang J, Nie L, Wang F, Ling X, Cui K, Li Y, Peng S (2017) Integrated crop management practices for maximizing grain yield of double-season rice crop. Sci Rep 7(1):1–11
2. Sandhu N, Yadav S, Singh VK, Kumar A (2021) Effective crop management and modern breeding strategies to ensure higher crop productivity under direct seeded rice cultivation system: a review. Agronomy 11(7):1264
3. Sharma M, Kumar CJ, Deka A (2021) Early diagnosis of rice plant disease using machine learning techniques. Arch Phytopathol Plant Prot 55(3):259–283
4. Sharma M, Nath K, Sharma RK, Kumar CJ, Chaudhary A (2021) Ensemble averaging of transfer learning models for identification of nutritional deficiency in rice plant. Electronics 55(3):148–164
5. Sharma M, Kumar CJ (2022) Improving rice disease diagnosis using ensemble transfer learning techniques. Int J Artif Intell Tools 31(8):148–164
6. Uddin MS, Bansal JC (2021) Computer vision and machine learning in agriculture, 2nd edn. Springer, Singapore
7. Jeong Y, Lee J, Park M, Lee H, Baek J, Kim K, Lee C (2020) Deep learning-based rice seed segmentation for high-throughput phenotyping. In: SMA 2020: the 9th international conference on smart media and applications. ACM, New York, pp 64–65
8. Sharma M, Kumar CJ, Deka A (2021) Land cover classification: a comparative analysis of clustering techniques using sentinel-2 data. Int J Sustain Agric Manag Inf 7(4):321–342


9. Kumar CJ, Kalita SK (2016) Preparation of a dataset and issues related with recognition of optical character in Assamese script. Indian J Sci Technol 9(40):40
10. Kumar CJ, Das PR (2022) The diagnosis of ASD using multiple machine learning techniques. Int J Develop Disabil 68(6):973–983
11. Bhadra S, Kumar CJ (2022) An insight into diagnosis of depression using machine learning techniques: a systematic review. Curr Med Res Opin 38(5):749–771
12. Kumar CJ, Das PR, Hazarika A (2022) Autism spectrum disorder diagnosis and machine learning: a review. Int J Med Eng Inf 14(6):512–527
13. Bhadra S, Kumar CJ (2023) Enhancing the efficacy of depression detection system using optimal feature selection from EHR. Comput Meth Biomech Biomed Eng
14. Chen J, Zhang D, Zeb A, Nanehkaran YA (2021) Rice diseases detection and classification using attention based neural network and Bayesian optimization. Exp Syst Appl 178(1):114770
15. Wang Y, Wang H, Peng Z (2021) Identification of rice plant diseases using lightweight attention networks. Exp Syst Appl 169(3):114514
16. Daniya T, Vigneshwari S (2021) Deep neural network for disease detection in rice plant using the texture and deep features. Comput J 65(7):1812–1825
17. Li Z, Chen P, Shuai L, Wang M, Zhang L, Wang Y, Mu J (2022) A copy paste and semantic segmentation-based approach for the classification and assessment of significant rice diseases. Plants 11(22):3174
18. Patil R, Kumar S (2022) Rice transformer: a novel integrated management system for controlling rice diseases. IEEE Access 10(8):87698–87714
19. Gong L, Fan S (2022) A CNN-based method for counting grains within a panicle. Machines 10(1):30
20. Singh D, Ichiura S, Nguyen TT, Sasaki Y, Katahira M (2021) Rice tiller number estimation by field robot and deep learning (Part 1). J JSAM 83(5):1–17
21. Deng R, Tao M, Huang X, Bangura K, Jiang Q, Jiang Y, Qi L (2021) Automated counting grains on the rice panicle based on deep learning method. Sensors 21(1):281
22. Deng R, Jiang Y, Tao M, Huang X, Bangura K, Liu C, Lin J, Qi L (2020) Deep learning-based automatic detection of productive tillers in rice. Comput Electron Agric 177(10):105703
23. Singh D, Mori T, Ichiura S, Nguyen TT, Sasaki Y, Katahira M (2022) Estimation of tiller number in rice using a field robot and deep learning-investigating effects of dataset composition on tiller estimation accuracy. Eng Agric Environ Food 15(2):47–60
24. Wu W, Liu T, Zhou P, Yang T, Li C, Zhong X, Sun C, Liu S, Guo W (2021) Image analysis-based recognition and quantification of grain number per panicle in rice. Plant Meth 15(122):1–14
25. Xiong X, Duan L, Liu L, Tu H, Yang P, Wu D, Chen G, Xiong L, Yang W, Liu Q (2017) Panicle-SEG: a robust image segmentation method for rice panicles in the field based on deep learning and superpixel optimization. Plant Meth 13(104):1–15
26. Teramoto S, Uga Y (2020) A deep learning-based phenotypic analysis of rice root distribution from field images. Plant Phenomics 2020(10):1–10
27. Talukder SH, Sarkar AK (2023) Nutrients deficiency diagnosis of rice crop by weighted average ensemble learning. Smart Agric Technol 4(8):100155
28. Patil RR, Kumar S (2022) Rice-fusion: a multimodality data fusion framework for rice disease diagnosis. IEEE Access 10(1):5207–5222
29. Haridasan A, Thomas J, Raj ED (2023) Deep learning system for paddy plant disease detection and classification. Environ Monitor Assess 195(1):120
30. Mohapatra D, Das N (2023) A precise model for accurate rice disease diagnosis: a transfer learning approach. Proc Indian National Sci Acad 1–10
31. Mohapatra S, Marandi C, Sahoo A, Mohanty S, Tudu K (2022) Rice leaf disease detection and classification using a deep neural network computing. In: Communication and learning: first international conference, CoCoLe 2022. Springer, Cham, pp 231–243
32. Nayak A, Chakraborty S, Swain DK (2023) Application of smartphone-image processing and transfer learning for rice disease and nutrient deficiency detection. Smart Agric Technol 4(1):100195


33. Li Z, Chen P, Shuai L, Wang M, Zhang L, Wang Y, Mu J (2022) A copy paste and semantic segmentation-based approach for the classification and assessment of significant rice diseases. Plants 11(22):3174
34. Feng C, Jiang M, Huang Q, Zeng L, Zhang C, Fan Y (2022) A lightweight real-time rice blast disease segmentation method based on DFFANet. Agriculture 12(10):1543
35. Gong L, Fan S (2022) A CNN-based method for counting grains within a panicle. Machines 10(1):30
36. Wang X, Yang W, Lv Q, Huang C, Liang X, Chen G, Xiong L, Duan L (2022) Field rice panicle detection and counting based on deep learning. Front Plant Sci 2921
37. Xu C, Jiang H, Yuen P, Ahmad KZ, Chen Y (2020) MHW-PD: a robust rice panicles counting algorithm based on deep learning and multi-scale hybrid window. Comput Electron Agric 173(6):105375
38. Chen J, Chen J, Zhang D, Suzauddola M, Nanehkaran YA, Sun Y (2021) Identification of plant disease images via a squeeze-and-excitation MobileNet model and twice transfer learning. IET Image Proc 15(5):1115–1127
39. Rahman CR, Arko PS, Ali ME, Khan MAI, Apon SH, Nowrin F, Wasif A (2020) Identification and recognition of rice diseases and pests using convolutional neural networks. Biosyst Eng 194(6):112–120
40. Bari BS, Islam MN, Rashid M, Hasan MJ, Razman MAM, Musa RM, Ab Nasir AF, Abdul Majeed APP (2021) A real-time approach of diagnosing rice leaf disease using deep learning-based faster R-CNN framework. PeerJ Comput Sci 7(e432)
41. Wang X, Yang W, Lv Q, Huang C, Liang X, Chen G, Xiong L, Duan L (2022) Field rice panicle detection and counting based on deep learning. Plant Sci 13(8):966495
42. Sobhana M, Sindhuja VR, Tejaswi V, Durgesh P (2022) Deep ensemble mobile application for recommendation of fertilizer based on nutrient deficiency in rice plants using transfer learning models. Int J Interact Mobile Technol 16(16)
43. Darwish A, Ezzat D, Hassanien AE (2020) An optimized model based on convolutional neural networks and orthogonal learning particle swarm optimization algorithm for plant diseases diagnosis. Swarm Evolu Comput 52(2):100616
44. Anandhan K, Singh AS (2021) Detection of paddy crops diseases and early diagnosis using faster regional convolutional neural networks. In: 2021 international conference on advance computing and innovative technologies in engineering (ICACITE). IEEE, Greater Noida, pp 898–902
45. Temniranrat P, Kiratiratanapruk K, Kitvimonrat A, Sinthupinyo W, Patarapuwadol S (2021) A system for automatic rice disease detection from rice paddy images serviced via a Chatbot. Comput Electron Agric 185(6):106156
46. Li D, Wang R, Xie C, Liu L, Zhang J, Li R, Wang F, Zhou M, Liu W (2020) A recognition method for rice plant diseases and pests video detection based on deep convolutional neural network. Sensors 20(3):578
47. Kiratiratanapruk K, Temniranrat P, Kitvimonrat A, Sinthupinyo W, Patarapuwadol S (2020) Using deep learning techniques to detect rice diseases from images of rice fields. In: International conference on industrial, engineering and other applications of applied intelligent systems. Springer, Cham, pp 225–237
48. Rallapalli S, Durai MAS (2021) A contemporary approach for disease identification in rice leaf. Int J Syst Assur Eng Manag 1–11
49. Sharma R, Singh A (2022) Big bang-big crunch-CNN: an optimized approach towards rice crop protection and disease detection. Arch Phytopathol Plant Prot 55(2):143–161
50. Chen J, Zhang D, Nanehkaran YA, Li D (2020) Detection of rice plant diseases based on deep transfer learning. J Sci Food Agric 100(7):3246–3256
51. Sathya K, Rajalakshmi M (2022) RDA-CNN: enhanced super resolution method for rice plant disease classification. Comput Syst Sci Eng 42(1):33–47
52. Motamarri V, Sreenivasan S (2020) Novel convolutional neural network that uses a two stage inception module for bacterial blight and brown spot identification in rice plant. In: 4th international conference on imaging, signal processing and communications (ICISPC). IEEE, Kumamoto, pp 6–12


53. Feng S, Cao Y, Xu T, Yu F, Zhao D, Zhang G (2021) Rice leaf blast classification method based on fused features and one-dimensional deep convolutional neural network. Remote Sens 13(16):3207
54. Koklu M, Cinar I, Taspinar YS (2020) Computer-assisted real-time rice variety learning using deep learning network. Rice Sci 187(8):489–498
55. Jeyaraj PR, Asokan SP, Nadar ERS (2020) Computer-assisted real-time rice variety learning using deep learning network. Comput Electron Agric 29(5):106285
56. Lin P, Chen Y, He J, Fu X (2017) Determination of the varieties of rice kernels based on machine vision and deep learning technology. In: 2017 10th international symposium on computational intelligence and design (ISCID). IEEE, Hangzhou, pp 169–172

Chapter 7

AI-Based Agriculture Recommendation System for Farmers

V. Vanitha, N. Rajathi, and K. Prakash Kumar

1 Introduction

India is a predominantly agricultural country with a population of 1.3 billion, and about 70% of its people are employed in the agriculture sector, relying on it as their primary source of income. The agriculture sector therefore plays a very important role in India's economy. Countless crops are available for farmers to select from for their soil, and the yield a farmer obtains is determined to a large extent by the seed they select. In addition, to maximize crop yield, farmers use various pesticides and fertilizers in high amounts, which increases the toxicity of the soil; recommending the best pesticide and fertilizer at the right dosage would control this toxicity. According to Srivastava et al. [1], the Internet of Things has a significant impact on increasing crop output, and according to Kumar et al. [2], AI systems are capable of forecasting agricultural diseases, suggesting fertilizers, and offering farming advice. An intelligent automated cropping system is therefore proposed, and it aids farmers in increasing crop yields. The main objectives of this work are:

1. To implement an IoT-based real-time data collection system to help farmers with leaf disease detection, crop recommendation, and fertilizer recommendation.
2. To develop a robust system that can diagnose plant diseases in remote places using machine learning and image processing techniques.
3. To suggest the best crop for the farm: soil analysis is performed and a machine learning technique is used to suggest the best crop.

V. Vanitha (B) · N. Rajathi · K. Prakash Kumar
Kumaraguru College of Technology, Coimbatore, Tamil Nadu, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_7


4. To suggest the best fertilizer for the crop: soil analysis is performed and a machine learning technique is used to suggest the best fertilizer.
5. To make the suggestions easy to access: a chatbot is used to deliver the crop and fertilizer suggestions.

This paper is organized as follows: Sect. 2 introduces the research background and the significance of this work. Section 3 deals with the overall system design. Section 4 describes the various modules in the proposed system. Section 5 discusses the results, and Sect. 6 concludes the paper.

2 Literature Survey

The study by Ahmed et al. [3] aims to address the issue of excessive use of chemical dressings in weed control in agricultural fields. The study proposes the use of an automated machine vision system that can distinguish between plants and weeds in digital images, which can lead to more efficient and cost-effective weed management. The system utilizes a support vector machine (SVM) algorithm for the classification of plants and weeds based on fourteen features that characterize them in photographs. The results of the study show that SVM achieves high accuracy in classifying plants and weeds, with a sensitivity of over 97% on a set of 224 test images. This is crucial in ensuring that crops are not mislabeled as weeds and vice versa, which can have significant impacts on crop productivity and quality.

Rothe et al. [4] focused on the idea that leaf diseases on cotton plants must be addressed early and without delay, as they can be detrimental to yield. The suggested work provides a pattern recognition system for the identification and classification of three cotton leaf conditions: bacterial blight, Myrothecium, and Alternaria. The pictures for this work were taken from the fields of the Central Institute of Cotton Research in Nagpur, as well as cotton fields in the Buldana and Wardha districts. For image segmentation, the active contour model is applied, and Hu's moments are extracted as features for the training of an adaptive neuro-fuzzy inference system. The reported classification accuracy is 85%.

Singh et al. [5] and Kulkarni et al. [6] highlighted the increasing availability of high-resolution images and sensor data in the field of stress phenotyping. However, analyzing such vast amounts of data requires the use of machine learning tools to identify patterns and features. Machine learning techniques can be applied at four stages of the decision cycle in stress phenotyping: (i) identification, (ii) classification, (iii) quantification, and (iv) prediction. The authors provide a comprehensive review of machine learning technologies and practices for stress phenotyping, including a taxonomy and guidelines for using machine learning tools in biotic and abiotic stress applications. The goal is to assist the agricultural community in selecting the most appropriate machine learning techniques for their specific applications.


Islam et al. [7] discussed cutting-edge phenotyping and plant disease detection as a potential step toward food security and sustainability. Imaging and computer vision-based phenotyping, in particular, allows for the examination of quantitative plant physiology. On the negative side, manual interpretation necessitates a high level of labor, expertise in plant conditions, and excessive processing time. As illustrated in their work, an approach that combines image processing and machine learning allows diseases to be diagnosed from images of the farmland. This automated technology classifies diseases (or their absence) on potato crops using a comprehensive plant image database known as Plant Village. Image segmentation and support vector machine classification demonstrate disease classification over 300 images with an accuracy of 95%. As a result, the proposed strategy provides a path toward automated plant disease diagnosis on a large scale.

Kumar Srivastava et al. [8] highlighted the importance of identifying and monitoring plant diseases in agriculture to prevent yield losses and maintain sustainability. They noted that manually detecting issues in the field is risky and time consuming and requires significant effort. Therefore, image processing techniques are being used to develop tools for plant disease analysis. The process involves steps such as image acquisition, preprocessing, segmentation, feature extraction, and classification. The authors focused on using images of leaves to develop patterns for plant disease analysis and discussed various segmentation and feature extraction algorithms used in this area.

Bashir et al. [9] and Oppenheim et al. [10] discussed in detail detecting diseases in plants, particularly in potato plants. Kaviya et al. [11] discussed the role of chatbots in assisting farmers in various activities; these bots help farmers by answering queries.

3 System Design and Architecture

The overall process of the AI-based agriculture recommendation system is shown in Fig. 1. For crop recommendation, real-time data from IoT devices is used; this data is stored in Firebase and then sent to a machine learning model for processing. For leaf disease detection, real-time data from the camera of a Raspberry Pi is used, stored in Firebase, and processed further. By giving NPK data to the fertilizer recommender model, the best fertilizer is identified for the crop. A chatbot helps farmers obtain crop recommendations and fertilizer suggestions. IoT devices provide the real-time data for the agriculture recommendation system: the leaf images and parameters like temperature, humidity, moisture, and pH values are collected from the ESP and RPi boards and stored in Firebase.
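A minimal sketch of how a sensor reading might be pushed to Firebase from Python using the firebase-admin SDK; the key file, database URL, and reading values are hypothetical placeholders, not the project's actual configuration:

```python
import firebase_admin
from firebase_admin import credentials, db

# Initialize the Firebase app; the key file and database URL below
# are placeholders for the project's own credentials.
cred = credentials.Certificate("serviceAccountKey.json")
firebase_admin.initialize_app(cred, {
    "databaseURL": "https://example-farm.firebaseio.com/"
})

# Push one real-time sensor reading, mirroring the parameters above.
reading = {"temperature": 31.2, "humidity": 68.0,
           "moisture": 41.5, "ph": 6.4}
db.reference("sensors").push(reading)
```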


Fig. 1 Overall flow of the agriculture recommendation system

4 Recommendation System for Farmers

4.1 Crop Recommendation System

Plant Village is a web-based platform focused on crop diseases and health. Experts have collected and validated a dataset of 54,309 three-channel (RGB) images, each with a resolution of 256 × 256. These images were captured under standard laboratory conditions with only one leaf per image. The dataset comprises 14 crop species, including apple, orange, bell pepper, blueberry, cherry, corn, tomato, grape, strawberry, peach, potato, raspberry, soybean, and squash, as reported by Hlaing et al. [12]. The dataset includes images of 17 fungal diseases, four bacterial diseases, two mold (oomycete) diseases, two viral diseases, and one disorder caused by a mite. For 12 crop species, the dataset also contains images of healthy leaves that exhibit no visible symptoms of disease.

Data augmentation is a powerful technique used to increase the size of the dataset by generating new images from the existing ones. This helps in creating variations of images that the model has never seen before, which in turn improves the accuracy of the model. The commonly used augmentation techniques are flipping, rotation, zooming, and changing brightness and contrast. These techniques can be applied randomly during the training process to generate a larger and more diverse dataset.

Normalization is another important preprocessing step. It is used to rescale the pixel values of the images to a common range, typically between 0 and 1, to avoid

7 AI-Based Agriculture Recommendation System for Farmers

95

issues with different pixel intensities between images. This helps the model to converge faster during training and makes it less sensitive to lighting and contrast changes in the input images. In addition to these steps, it is also important to split the dataset into training, validation, and testing sets. The training set is used to train the model, the validation set is used to tune the hyperparameters and prevent overfitting, and the testing set is used to evaluate the final performance of the model. This ensures that the model is able to generalize well to new, unseen data. Ensemble learning (voting classifier) was used for recommending the crops. Algorithms used for ensemble learning are k-nearest neighbors, decision tree, random forest, and Gaussian Naive Bayes. Default parameters were used for algorithms. No hyperparameter tuning was done because of the high accuracy achieved by using the default parameters. Pickle was used to save the label encoder and the model for deployment using web application. The saved model was used for prediction on new data. The saved label encoder was used to inverse transform the label predicted by the model.
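As a concrete illustration of this pipeline, the scikit-learn sketch below trains a soft-voting ensemble over the four algorithms with default parameters and persists the model and label encoder with Pickle. The feature columns and crop names are stand-ins, not the authors' dataset; the soft-voting setting follows the results reported in Sect. 5.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 7))       # stand-in N, P, K, temperature, humidity, moisture, pH
y = rng.choice(["rice", "maize", "cotton"], size=100)  # stand-in crop labels

# Encode crop labels, then combine the four default-parameter classifiers.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)
model = VotingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("dt", DecisionTreeClassifier()),
                ("rf", RandomForestClassifier()),
                ("gnb", GaussianNB())],
    voting="soft",                   # soft voting, as used for crop recommendation
)
model.fit(X, y_encoded)

# Persist both artefacts for the web application ...
with open("crop_model.pkl", "wb") as f:
    pickle.dump((model, encoder), f)

# ... and recover the crop name from a prediction on new data.
X_new = rng.uniform(size=(1, 7))
print(encoder.inverse_transform(model.predict(X_new)))
```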

4.2 Leaf Disease Detection System

Database development is the process of gathering comprehensive information on various species and diseases, which is then made available to users through successful inference. The bulk of the data is sourced from web-based platforms such as Wikipedia, Tree Geek, Tomato Disease Help, and Spruce. These sites provide highly relevant information about different plant types, the scientific names of both the plants and the diseases that afflict them, and information about the causes of such diseases, their symptoms, and preventive and remedial measures. To gain a better understanding of the data necessary for accurate plant disease descriptions and prevention, prior projects and applications related to plant disease detection and identification are referenced. This project relies on the Plant Village dataset, which includes information on a variety of plants and fruits that are widely available in southern India, such as apples, tomatoes, rice, corn, sunflowers, oranges, peppers, and potatoes.

The classification model is made up of two phases, as depicted in Fig. 2. The first step involves feeding well-labeled training images into the ensemble model using machine learning algorithms, with feature vectors used to train the model. Leaf disease detection is performed by ensembling three high-performing models, VGG-16, Inception V3, and MobileNet, using the concept of voting to get more accurate results. For real-time data collection, this work utilizes the RPi camera to capture leaf images and predict the disease.

Fig. 2 Image classification system

The fundamental idea of the Inception architecture is to identify the optimal approach for estimating and including a desirable sparse structure within a convolutional neural network using readily available dense components. There is substantial empirical evidence to support the notion that residual networks are easier to fine-tune and can handle greater depth. Although several designs were tested to create a lightweight CNN model for on-device inference, the architecture that performed best was one that incorporated two core concepts: the Inception module from GoogleNet and ResNet. The Inception architecture is depicted in Fig. 3. The stem component comprises a set of three 3 × 3 convolutional layers stacked together, allowing it to achieve a receptive field equivalent to that of a 7 × 7 convolutional layer. The max pooling layer is utilized to filter out insignificant entities. The Inception module starts with a stack of two 3 × 3 convolutional layers, which then decreases in size as it progresses deeper into Modules II and III. The stem layers contain the most basic feature set, such as lines and edges, while the features learned become more complex as the network delves deeper into the Inception module. Global average pooling is used to significantly reduce the input size that is fed into the fully connected layer, resulting in only 32 features being output. The final inference details, along with their respective confidence levels, are obtained through a dense, fully connected layer with the SoftMax function.

Fig. 3 Inception model architecture

According to Agarwal et al. [13], transfer learning refers to the practice of using a model that has been trained for one task as a starting point for training another model. In machine learning, transfer learning is a crucial technique used to address the problem of insufficient training data. It does so by relaxing the requirement that the training and test data must be comparable, thereby enabling the transfer of information from the source domain to the target domain. In many cases, the problem of insufficient data cannot be avoided, and the entire process of collecting and refining data is arduous and requires significant time and patience. The learning process of a model using transfer learning is depicted in Fig. 4.

Fig. 4 Transfer learning process

The ensembling of the three deep learning models, VGG-16, Inception V3, and MobileNet, is shown in Fig. 5. Each model is built upon a pre-trained base model. To create the ensemble model, all three models (Inception V3, VGG-16, and MobileNet) are re-instantiated, and their best saved weights are loaded. The ensemble model definition is relatively simple, utilizing the same input layer that is shared by all of the previous models. The ensemble then calculates the average of the outputs of the three models using the Average() merge layer in the top layer, a technique known as ensembling.

Fig. 5 Ensembling three models
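A minimal Keras sketch of this construction is given below. ImageNet weights stand in for the fine-tuned weights that the actual system would reload, and the 224 × 224 input and 38-class head are placeholder settings, not the chapter's exact configuration.

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16, InceptionV3, MobileNet

# One shared input feeds the three pre-trained backbones; each branch ends
# in its own softmax head, and the Average() merge layer combines them.
inputs = layers.Input(shape=(224, 224, 3))
branches = []
for backbone in (VGG16, InceptionV3, MobileNet):
    base = backbone(weights="imagenet", include_top=False, pooling="avg")
    branches.append(layers.Dense(38, activation="softmax")(base(inputs)))

ensemble = models.Model(inputs=inputs, outputs=layers.Average()(branches))
ensemble.compile(optimizer="adam", loss="categorical_crossentropy",
                 metrics=["accuracy"])
print(ensemble.predict(np.zeros((1, 224, 224, 3))).shape)  # (1, 38)
```

Averaging the class probabilities in this way gives the soft-voting behaviour the chapter describes while keeping each branch independently trainable.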

4.3 Fertilizer Recommendation System

For fertilizer recommendation, a decision tree with grid search and cross-validation is used to suggest the best fertilizer for the soil depending on the soil data and the type of crop. To predict the correct fertilizer, input information such as N, P, K, temperature, humidity, moisture, soil type, and the crop to be grown is considered. Data preprocessing was performed to remove null values. There were some rows where the rate of nitrogen was zero; those rows were treated as missing values and dropped. Potassium and phosphorus had a high correlation, so the column containing the ratio of phosphorus was dropped from the features to avoid multi-collinearity. Classes of the target column were encoded using a label encoder. Features were scaled using a robust scaler, as there are possible outliers in the data. The training and testing ratio was selected as 80% and 20%.

After the data preprocessing, training is carried out. On the training dataset, a decision tree with grid search and cross-validation is utilized. For fertilizer prediction, numerous factors such as temperature, humidity, soil pH, and the expected crop to be cultivated were considered. These are the system input parameters that can be entered manually by the users or obtained through sensors. After performing the decision tree with grid search for fertilizer recommendation, the model achieved an accuracy of 96%.
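The preprocessing and tuning steps above map onto the short scikit-learn sketch below. The data, fertilizer names, and the parameter grid are illustrative assumptions; the authors' actual grid is not reported.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import LabelEncoder, RobustScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))       # stand-in N, K, temperature, humidity, ... features
y = rng.choice(["Urea", "DAP", "14-35-14"], size=200)  # stand-in fertilizer names

# Robust scaling tolerates the outliers mentioned above; 80/20 split.
X_scaled = RobustScaler().fit_transform(X)
y_encoded = LabelEncoder().fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_encoded, test_size=0.2, random_state=42)

# Decision tree tuned by grid search with 5-fold cross-validation.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": [3, 5, 8, None],
                "criterion": ["gini", "entropy"]},
    cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```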

4.4 Chatbot for Farmers

When farmers enter queries regarding crop and fertilizer suggestions, the chatbot is intended to give the best results for those queries; for example, if a farmer needs the best fertilizer for a crop, the chatbot will ask for details such as the NPK values and the crop name, and will then show the best fertilizer suggestion for the crop. When a user inputs data into the chatbot using natural language, the model cannot process it directly. Therefore, the text needs to be tokenized and converted into numerical vectors using techniques such as skip-gram, word2vec, and continuous bag of words (CBOW). Once the text is converted into vectors, the next step is intent classification, which involves understanding the user's intention or purpose from the natural language input. This can be treated as a machine learning classification problem using SVM or RNN-LSTM. Entity extraction involves extracting key information from the user input; entity recognition can be accomplished using RNN-LSTM, bidirectional LSTM, CNN, and CRF. For instance, when a user inputs nitrogen, potassium, and phosphorus values into the chatbot, the chatbot built with the Rasa framework will provide the desired output. Figure 6 illustrates the results of the crop recommendation chatbot. This chatbot also includes fertilizer suggestion: when the user enters NPK values and a city and clicks recommend on the chatbot, the user will get a suggested fertilizer for their crop, as shown in Fig. 7.
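As a toy illustration of the intent classification step only: in the deployed system this is handled inside the Rasa framework, and the utterances, intent names, and the TF-IDF vectoriser standing in for the word-embedding step below are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up training utterances with their intents.
utterances = [
    "suggest a crop for my field",
    "what should I plant this season",
    "which fertilizer suits rice",
    "best fertilizer for N 20 P 30 K 40",
]
intents = [
    "crop_recommendation",
    "crop_recommendation",
    "fertilizer_suggestion",
    "fertilizer_suggestion",
]

# Vectorise the text, then classify the intent with a linear SVM.
intent_classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
intent_classifier.fit(utterances, intents)
print(intent_classifier.predict(["recommend a crop for sandy soil"]))
```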

4.5 Web Application for Farmers

The entire AI-based agriculture system is implemented as a web application using Django and Python. Using the web application, the user can enter temperature, humidity, moisture, and pH values manually, or, when the user clicks "data from IoT device", the model will display the best crop in the output. Users are also allowed to change the interface language.

Fig. 6 Crop recommendation through chatbot

Fig. 7 Fertilizer recommendation through chatbot

For leaf disease detection, users can use either a real-time image from the RPi camera or a local image on their system to predict leaf diseases. When the user clicks "get", the image is fetched from Firebase; when the user clicks "choose file", an image from the local device is fetched. Finally, when the user clicks "predict", the final output containing the disease is displayed on a new screen. For fertilizer recommendation, when the user enters NPK and crop details in the model and clicks "predict", the user will get a fertilizer suggestion, as shown in Fig. 8.

Fig. 8 Web app page for fertilizer recommendation
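A hypothetical Django view behind the "predict" action might look as follows; the file name, request field names, and URL wiring are assumptions for illustration, not the authors' code.

```python
import pickle

from django.http import JsonResponse

# Load the pickled crop model and label encoder saved earlier.
with open("crop_model.pkl", "rb") as f:
    MODEL, ENCODER = pickle.load(f)

def recommend_crop(request):
    # Read the sensor values submitted from the web form or IoT fetch.
    keys = ("n", "p", "k", "temperature", "humidity", "moisture", "ph")
    features = [[float(request.GET[key]) for key in keys]]
    label = MODEL.predict(features)[0]
    return JsonResponse({"crop": ENCODER.inverse_transform([label])[0]})
```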

5 Results

Leaf disease detection is implemented using ensemble learning of Inception V3, VGG-16, and MobileNet with a hard voting technique and achieved an accuracy of 98.7%. For crop recommendation, this project implements ensemble learning of random forest, KNN, decision tree, and Gaussian Naive Bayes with soft voting and achieved an accuracy of 99.7%. For fertilizer recommendation, a decision tree with grid search and cross-validation is used and achieved an accuracy of 96%. The summary of the results achieved is given in Table 1.

Table 1 Summary of results

Module name                 Technique                        Accuracy (%)
Leaf disease detection      Ensemble learning                98.7
Crop recommendation         Ensemble learning                97.7
Fertilizer recommendation   Decision tree with grid search   96


This agriculture recommendation system is very useful for farmers to detect diseases quickly. Additionally, the spread of a disease can be stopped before it becomes widespread. Farmers can easily make use of the chatbot and web-based application to get advice related to crop identification, disease detection, and fertilizer identification.

6 Conclusion and Future Work

By creating and combining modules for leaf disease detection, crop recommendation, and fertilizer recommendation, this work addresses the issues faced by farmers in India and seeks to boost crop yield. For crop recommendation and leaf disease detection, ensemble learning was applied. Grid search with decision trees was used to estimate the optimal fertilizer for crops and to provide fertilizer recommendations. IoT modules were integrated to enable real-time data collection. Recent techniques in the field of computer vision and neural networks were utilized and achieved good accuracy. Possible extensions of the current work include (i) prediction of disease using images of plant parts other than the leaf, such as the branch, root, and flower, and (ii) a complete analysis of all diseases of all plants.

References 1. Srivastava A, Das DK (2021) A comprehensive review on the application of Internet of Thing (IoT) in smart agriculture. Wirel Pers Commun, pp 1–31 2. Kumar BS, Santhi SG, Kumar KK (2021) SAMS: smart agriculture management system using emerging technologies IoT, AI-a study. In: IOP conference series: materials science and engineering, vol 1074, no 1, p 012017, Oct 2021. IOP Publishing 3. Ahmed F, Al-Mamun HA, Bari AH, Hossain E, Kwan P (2012) Classification of crops and weeds from digital images: a support vector machine approach. Crop Prot 40:98–104 4. Rothe PR, Kshirsagar RV (2015) Cotton leaf disease identification using pattern recognition techniques. In: 2015 international conference on pervasive computing (ICPC), Jan 2015, pp 1–6. IEEE 5. Singh A, Ganapathysubramanian B, Singh AK, Sarkar S (2016) Machine learning for high-throughput stress phenotyping in plants. Trends Plant Sci 21(2):110–124 6. Kulkarni AH, Patil A (2012) Applying image processing technique to detect plant diseases. Int J Mod Eng Res 2(5):3661–3664 7. Islam M, Dinh A, Wahid K, Bhowmik P (2017) Detection of potato diseases using image segmentation and multiclass support vector machine. In: 2017 IEEE 30th Canadian conference on electrical and computer engineering (CCECE), pp 1–4, Apr 2017. IEEE 8. Kumar Srivastava P, Shiney J, Shan P (2021) Plant disease prediction using image processing and soft computing algorithms: a review. In: 2021 international conference on computational intelligence and knowledge economy (ICCIKE), pp 143–148, Mar 2021. IEEE 9. Bashir S, Sharma N (2012) Remote area plant disease detection using image processing. IOSR J Electron Commun Eng 2(6):31–34 10. Oppenheim D, Shani G (2017) Potato disease classification using convolution neural networks. Adv Anim Biosci 8(2):244–249


11. Kaviya P, Bhavyashree M, Krishnan MD, Sugacini M (2021) Artificial intelligence based farmer assistant chatbot. Int J Res Eng Sci Manage 4(4):26–29 12. Hlaing CS, Zaw SMM (2017) Plant diseases recognition for smart farming using model-based statistical features. In: 2017 IEEE 6th global conference on consumer electronics (GCCE), Oct 2017, pp. 1–4. IEEE 13. Agarwal N, Sondhi A, Chopra K, Singh G (2021) Transfer learning: survey and classification. In: Smart innovations in communication and computational sciences, pp 145–155

Chapter 8

A New Methodology to Detect Plant Disease Using Reprojected Multispectral Images from RGB Colour Space

Shakil Ahmed and Shahnewaz Ali

1 Introduction

Plant disease detection is a pivotal step to prevent crop damage; hence, it attracts high research interest. The accuracy of the detection model and its deployment can have a significant impact on annual yields [1]. Particularly for small-scale growers, and for countries where agricultural industries play an important role in total GDP, plant health monitoring is fundamentally important to prevent possible damage. In some circumstances, plant disease may cause drastic damage to growers and subsequently affect the whole nation. According to the Food and Agriculture Organization of the United Nations (FAO), around 20–40% of global crop production is lost to pests and diseases, which leads to losses of about USD 220 billion per year [2]. In many countries, farmers still use the traditional method to identify plant diseases, which is naked-eye observation by pathogen professionals. The expert needs to monitor the growth of the plant frequently. This method is very expensive and not appropriate for large-scale farm areas. In addition, it is not highly accurate, is prone to error, and is time-consuming, failing to detect pests at an early stage. Technology can solve this problem by identifying diseases at an early stage and providing continuous monitoring over large areas. With recent progress in technology, particularly in computer vision and machine learning, a door opens towards an autonomous plant disease detection and recognition process [3, 4]. Recently, 3D vision, semantic and segmented maps, and localisation have achieved significant progress, which enables the possibility of deploying an advanced robotic system in a growing field and navigating autonomously [5–8]. Hence,

S. Ahmed (B) Department of Mechanical and Electrical Engineering, Massey University, Auckland, New Zealand e-mail: [email protected]

S. Ali Independent researcher, Brisbane, Australia e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_8


an automatic method to detect and recognise plant disease is in great demand, one that can be integrated with an autonomous field robot to monitor plant health and growth on a regular basis. To fill this demand, in our other work, we have addressed the plant disease recognition task using reconstructed multispectral images. In this article, we present an advanced methodology to detect and identify plant disease. A summary of our contributions is as follows.

1. A methodology to identify plant disease from its early and challenging symptoms.
2. A concept-validation experiment to evidence the benefits of a re-mapped multispectral imaging system based on a conventional imaging device.
3. A cost-effective novel technique that improves the detection accuracy of a deep learning-based technique for agricultural field robotics.

The rest of the paper is organised as follows: Sect. 2 discusses related articles on RGB image processing-based plant disease detection. Section 3 describes the data set used to validate the results. Section 4 explains the architecture of the YOLOv3 tiny model, Sect. 5 briefly describes the performance of the YOLO models on both the spectral images and the RGB images, and Sect. 6 concludes the paper.

2 Literature Review

Early research in computer vision-based object detection includes template matching, texture analysis, and visual feature matching [8]. Template and texture matching use pixel-level information to match objects in an image. However, the pixels, texture, and shape of an object may vary, which requires a larger data set for comparison to achieve baseline accuracy on real-life data. Such larger data sets, with their variations and individual statistical matching, need notably high computing resources. Feature-based object matching using edges, corners, or high-level features like the scale-invariant feature transform (SIFT) or Speeded-Up Robust Features (SURF) mitigates these shortcomings, but a lack of features (e.g. featureless objects) imposes further limitations on such methods, as shown in various vision tasks in other domains such as correspondence pixel search, which is relevant to object detection [9–11]. Moreover, object visual appearance, texture and size, imaging conditions, and camera motion raise an overly complex research problem for detecting objects in a robust manner. In agriculture, the plant disease detection process falls under these limitations, as diseases in the field are extremely challenging in their appearance.

Compared to conventional vision techniques, deep learning techniques have achieved significant success in solving various complex vision-based tasks, including on untextured or repeated-texture image data [6–13]. This evidences that learned features from a learnable feature extraction mechanism outperform the hand-crafted features mentioned above. Moreover, a fully convolutional neural network (FCN) makes it possible to capture data at various sizes and appearances in a single step.


Sun et al. [14] proposed a multiple-regression-model-based image recognition system for plant disease identification. The authors used different methods for image segmentation and recognition to improve accuracy; among these, the combination of RGB colour and a regional growth technique enhanced the system's accuracy. Bharate et al. [15] also presented different RGB image processing methods to identify plant diseases. The authors used different types of plants with the most common machine learning algorithms, such as the support vector machine (SVM) classifier and the artificial neural network (ANN) classifier. Khan et al. [16] proposed a computer vision system to identify crop diseases by segmenting the RGB image in the CIELAB colour space. The authors proposed a novel cascade system, and the results showed evidence of excellent colour segmentation of the diseased area. The authors of [17] used RGB image processing with different machine learning algorithms, most commonly k-means clustering, and applied a KNN classifier for validation.

Apart from colour RGB images, multispectral and hyperspectral images have also been used to detect disease at an early stage [18–20]. Implemented with complex optics on top of the imaging system, these images consist of band-specific monochrome images that can range from VIS to NIR or SWIR bands. The major importance of these types of imaging systems is that they allow band-specific radiant spectral energy that relates both surface (skin) and inner biological information, as the optical response is unique to the material. However, strong light, penetration depth, radiometric calibrations, the effect of indirect reflectance, shadow, and weather conditions may affect the resultant images, making it difficult to adapt such a system in various agricultural setups [21–24]. Moreover, these devices are expensive, which may not be a feasible solution for many farms. Nevertheless, such feature-rich images combined with deep learning techniques have achieved significant progress in detecting plant diseases [18–25].

RGB-to-multispectral reconstruction from the camera response function has been explicitly studied [26]. To enrich visual information in the VIS range using a conventional RGB camera, some recent research has been reported that uses numerical methods, data-driven methods, and deep learning techniques [27–29]. Recent research evidences the potential of such images in surgical robotics, which opens a novel possibility to integrate such systems into agricultural robotics in a resource-constrained approach [23–28, 30]. The benefits of such a method are twofold. Advanced network architectures that include complex attention mechanisms, dense connections, multiscale feature extraction, and embedded subnetworks, as proposed in [31–33], strongly indicate a possibility to detect disease accurately but require high computational resources and power. On the other side, lightweight vision tasks deployed on embedded systems are able to enrich human experiences [34–36]; such systems, for instance ARM embedded processors, are able to run vision-based methods and lightweight deep network models. Recent edge devices such as the Jetson Nano have created a pathway to deploy such disease detection models in the field with low-cost devices. Having multispectral images reconstructed from RGB colour images enables the model to see more channel-dependent spectral features. However, little evidence or study of this has been found in the literature. In this work, our study fills the gaps,


and it evidences the benefit of such a system. Moreover, an intelligent dynamic light control system that adjusts lighting conditions to the scene context, such as the one proposed in [37], will enhance applicability in both indoor and outdoor situations, as light enriches spectral information when environmental parameters are considered. In this study, we apply the methodology proposed by Otsu et al. [27], which has been validated and applied in other scientific works to reconstruct multispectral images from tristimulus values [23–26]. For comparison, we have adapted the YOLOv3 tiny model [38].

3 Data Set

The data set is taken from publicly available open data platforms [39]. There are a total of 25,851 images and 11 categories in the data set. However, for this study, 2000 images and eight categories have been used; each category consists of 250 images. The categories used for this experiment are (1) Late_blight, (2) healthy, (3) Early_blight, (4) Septoria_leaf_spot, (5) Tomato_Yellow_Leaf_Curl_Virus, (6) Bacterial_spot, (7) Tomato_mosaic_virus, and (8) Powdery Mildew. The initial data were augmented by image flipping, PCA colour augmentation, gamma correction, rotation, and scaling. 70% of the images (175) from each category were used for training, and 30% (75 images) were used for testing. We acknowledge that the accuracy results might be influenced by the size of the data set. The RGB images are transformed into multispectral images: three channels of RGB data were converted into 36-channel multispectral data.
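To make the reprojection step concrete, the sketch below expands each RGB pixel into an n-band spectrum through a fixed 3-to-n linear basis. In the study this mapping follows Otsu et al. [27]; the random basis here is a placeholder, not the published coefficients.

```python
import numpy as np

def rgb_to_multispectral(rgb, basis):
    """rgb: (H, W, 3) floats in [0, 1]; basis: (3, n_bands)."""
    h, w, _ = rgb.shape
    spectra = rgb.reshape(-1, 3) @ basis      # per-pixel linear combination
    return spectra.reshape(h, w, basis.shape[1])

n_bands = 36                                  # 36 channels, as in this data set
rng = np.random.default_rng(0)
basis = rng.uniform(size=(3, n_bands))        # placeholder basis
frame = rng.uniform(size=(416, 416, 3))       # stand-in RGB frame
cube = rgb_to_multispectral(frame, basis)
print(cube.shape)                             # (416, 416, 36)
```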

4 Model Architecture

YOLOv3 models detect and annotate objects by providing bounding boxes and class labels. Bounding boxes are represented by coordinates denoting possible object instances, known as regions of interest; to represent a region of interest for each object's individual presence in an image, the centre coordinates and the box height and width are used. Another important feature of the bounding box is the confidence score, and for each predicted bounding box the YOLO model associates a class label. A further attribute used with detection models is the anchor box, which defines the prior box sizes and aspect ratios (matched via IoU). Anchor boxes are calculated from the data set, and they are an important parameter that must be defined prior to training; a common recipe for computing them is sketched after Fig. 1.

In our study, we implemented the YOLOv3 tiny model, as shown in Fig. 1. The YOLOv3 tiny version uses 3-channel, 416 × 416 images; that was the setting for RGB, and for multispectral we used 32 channels. For each layer, 2D convolution is performed. First, six layers of 2D convolution and 2D max pooling transform the 416 × 416 input down to 13 × 13. The output is then fed to four successive 2D convolution layers, whose output is used as the first-stage detection output for the multiscale loss function and is then fed to the next layer, where convolution and upsampling are performed. The output of this layer is used as the second-stage output and fed to the next layer, where two successive convolutions are performed; the output of this layer is the third-stage output.

Fig. 1 YOLOv3 tiny architectural model
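Since the anchors must be fixed before training, one common recipe (not spelled out in this chapter, so the details below are assumptions) is to run k-means over the annotated box sizes; YOLOv3 tiny conventionally uses six anchors, and Euclidean distance is used here as a simplification of the usual IoU-based distance.

```python
import numpy as np

def kmeans_anchors(box_wh, k=6, iters=50, seed=0):
    """box_wh: (N, 2) array of normalised (width, height) pairs."""
    rng = np.random.default_rng(seed)
    anchors = box_wh[rng.choice(len(box_wh), size=k, replace=False)]
    for _ in range(iters):
        # assign each box to its nearest anchor, then recentre the anchors
        dists = np.linalg.norm(box_wh[:, None, :] - anchors[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for j in range(k):
            if np.any(nearest == j):
                anchors[j] = box_wh[nearest == j].mean(axis=0)
    return anchors

boxes = np.random.default_rng(1).uniform(0.05, 0.9, size=(500, 2))
print(kmeans_anchors(boxes))                  # six (w, h) anchor pairs
```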

5 Result

The proposed methodology is to reproject a multispectral image from an RGB image. Figure 2 shows an RGB image, which is a combination of radiant energy in the red, green, and blue spectra. These spectra are obtained through lens technology, also known as a colour filter. Different ranges of the spectrum represent different colours: red is 620–750 nm, green is 495–570 nm, and blue is 450–495 nm. From this RGB colour image, we reprojected the multispectral image shown in Fig. 3.

Fig. 2 Sample of an RGB image from the data set

Fig. 3 Multispectral images. One can notice that in light bands, more informative spectral information becomes visible

In this study, we compared the performance of the two methods of plant disease identification. The performance metrics used to validate the method are as follows:

$$\text{Accuracy} = \frac{tp + tn}{tp + fp + fn + tn} = \frac{tp + tn}{\text{Total}}$$

$$\text{Precision} = \frac{tp}{tp + fp}$$

$$\text{Recall} = \frac{tp}{tp + fn}$$

$$\text{F1-Score} = \frac{2 \cdot tp}{2 \cdot tp + fn + fp}$$

where tp = true positive, tn = true negative, fp = false positive, and fn = false negative. Figure 4 shows the performance comparison of YOLOv3 tiny on RGB colour space images versus multispectral images for plant disease identification. The accuracy of the multispectral image method is 76.86%; RGB images perform about 4% worse than the multispectral method. The same trend holds for the precision, recall, and F1 scores: the multispectral approach performs better than RGB-based crop infection identification.

Fig. 4 Performance score for RGB image and multispectral image using YOLOv3 tiny
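These four scores reduce to a small helper; the counts passed in below are illustrative only, not measured values from the study.

```python
# Mirror the four formulas above; counts would come from the detector's
# matched and unmatched predictions on the test split.
def detection_scores(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = (2 * tp) / (2 * tp + fn + fp)
    return accuracy, precision, recall, f1

print(detection_scores(tp=77, tn=8, fp=12, fn=14))  # illustrative counts
```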

6 Conclusion

In this study, we have investigated the plant disease detection research problem using multispectral images reconstructed from the RGB colour space. Our work evidences that this methodology enriches image features in the visible light spectrum, which subsequently improves detection accuracy in vegetation. Moreover, the research has been conducted with a limited data set that integrates several vision challenges. On this challenging data set, the detection model achieved an accuracy gain of approximately 4.5% when multispectral images were used as opposed to RGB images. We used a deep learning technique, the YOLOv3 tiny model, in order to fit the model on resource-constrained devices. Hence, we conclude that, following the proposed methodology, a lightweight deep learning detection model is able to detect plant disease with significant accuracy and can be deployed in the field.

References 1. Too EC, Yujian L, Njuki S, Yingchun L (2019) A comparative study of fine-tuning deep learning models for plant disease identification. Comput Electron Agric 161:272–279 2. New standards to curb the global spread of plant pests and diseases. FAO http://www.fao.org/ news/story/en/item/1187738/icode/ (2021) 3. Chen T, Zhang J, Chen Y, Wan S, Zhang L (2019) Detection of peanut leaf spots disease using canopy hyperspectral reflectance. Comput Electron Agric 156:677–683 4. Saleem MH, Potgieter J, Mahmood Arif K (2019) Plant disease detection and classification by deep learning. Plants 8:468 5. Ali S, Dayoub F, Pandey AK (2023) Learning from learned network: an introspective model for arthroscopic scene segmentation. In: Ahmad M, Uddin MS, Jang YM (eds) Proceedings of international conference on information and communication technology for development. Studies in autonomic, data-driven and industrial computing. Springer 6. Ali S, Pandey AK (2022) ArthroNet: monocular depth estimation technique toward 3D segmented maps for knee arthroscopic. Intell Med 7. Jonmohamadi Y, Ali S, Liu F, Roberts J, Crawford R, Carneiro G, Pandey AK (2021) 3D semantic mapping from arthroscopy using out-of-distribution pose and depth and in-distribution segmentation training. In: International conference on medical image computing and computerassisted intervention (MICCAI). Springer, Cham, pp 383–393 8. Megalingam RK, Teja CR, Sreekanth S, Raj A (2018) ROS based autonomous indoor navigation simulation using SLAM algorithm. Int J Pure Appl Math 118(7):199–205 9. Shahnewaz A, Pandey AK (2020) Color and depth sensing sensor technologies for robotics and machine vision. In: Machine vision and navigation. Springer, Cham, pp 59–86 10. Lowe G (2004) Sift-the scale invariant feature transform. Int J 2(91–110):2 11. Bay H, Ess A, Tuytelaars T, Van Gool L (2008) Speeded-up robust features (SURF). Comput Vis Image Understanding 110(3):346–359 12. Manavalan R (2021) Efficient detection of sugarcane diseases through intelligent approaches: a review. Asian J Res Rev Agric 27–37 13. Ali S, Crawford, Maire, Pandey, Ajay K (2021) Towards robotic knee arthroscopy: multi-scale network for tissue-tool segmentation. arXiv preprint arXiv:2110.02657 14. Sun G, Jia X, Geng T (2018) Plant diseases recognition based on image processing technology. In proceedings of Hindawi. J Electric Comput Eng 15. Bharate AA, Shirdhonkar M (2017) A review on plant disease detection using image processing. In: International conference on intelligent sustainable systems (ICISS 2017), pp 103–109 16. Khan ZU, Akram T, Naqvi SR, Haider SA, Kamran M, Muhammad N (2018) Automatic detection of plant diseases; utilizing an unsupervised cascaded design. In: 15th international Bhurban conference on applied sciences and technology (IBCAST 2018), pp 339–346 17. Patel V, Srivastava N, Khare M (2022) Plant disease detection using image processing and machine learning. In: Singh PK, Wierzcho´n ST, Chhabra JK, Tanwar S (eds) Futuristic trends in networks and computing technologies. Lecture notes in electrical engineering, vol 936. Springer, Singapore. https://doi.org/10.1007/978-981-19-5037-7_39 18. Moshou D, Bravo C, Oberti R, West J, Bodria L, McCartney A, Ramon H (2005) Plant disease detection based on data fusion of hyper-spectral and multi-spectral fluorescence imaging using Kohonen maps. Real-Time Imaging 11(2):75–83 19. 
Pourazar H, Samadzadegan F, Dadrass Javan F (2019) Aerial multispectral imagery for plant disease detection: radiometric calibration necessity assessment. Euro J Remote Sens 52(sup3):17–31


20. Karpyshev P, Ilin V, Kalinov I, Petrovsky A, Tsetserukou D (2021) Autonomous mobile robot for apple plant disease detection based on CNN and multi-spectral vision system. In: 2021 IEEE/SICE international symposium on system integration (SII). IEEE, pp 157–162 21. Shahnewaz A, Jonmohamadi Y, Takeda Y, Roberts J, Crawford R, Brown C, Pandey AK (2021) Arthroscopic multi-spectral scene segmentation using deep learning. arXiv preprint arXiv:2103.02465 22. Kuzmina I, Diebele I, Jakovels D, Spigulis J, Valeine L, Kapostinsh J, Berzina A (2011) Towards noncontact skin melanoma selection by multispectral imaging analysis. J Biomed Opt 16(6):060502 23. Ali S, Pandey AK (2022) Towards robotic knee arthroscopy: spatial and spectral learning model for surgical scene segmentation. In: Proceedings of international joint conference on advances in computational intelligence. Springer, Singapore, pp 269–281 24. Pourazar H, Samadzadegan F, Dadrass Javan F (2019) Aerial multispectral imagery for plant disease detection: radiometric calibration necessity assessment. Euro J Remote Sens 52(sup3):17–31 25. Nagasubramanian K, Jones S, Singh AK, Singh A, Ganapathysubramanian B, Sarkar S (2018) Explaining hyperspectral imaging based plant disease identification: 3D CNN and saliency maps. arXiv preprint arXiv:1804.08831 26. Stigell P, Miyata K, Hauta-Kasari M (2007) Wiener estimation method in estimating of spectral reflectance from RGB images. Pattern Recogn Image Anal 17(2):233–242 27. Otsu H, Yamamoto M, Hachisuka T (2018) Reproducing spectral reflectances from tristimulus colours. Comput Graphics Forum 37(6):370–381 28. Ali S, et al (2023) Surface reflectance: a metric for untextured surgical scene segmentation. In: Ahmad M, Uddin MS, Jang YM (eds) Proceedings of international conference on information and communication technology for development. Studies in autonomic, data-driven and industrial computing. Springer, Singapore 29. Han XH, Shi B, Zheng Y (2018) Residual HSRCNN: residual hyper-spectral reconstruction CNN from an RGB image. In: 2018 24th international conference on pattern recognition (ICPR). IEEE, pp 2664–2669 30. Ali S, Crawford R, Pandey AK (2023) Arthroscopic scene segmentation using multi-spectral reconstructed frames and deep learning. Intell Med 31. Mei S, Geng Y, Hou J, Du Q (2022) Learning hyperspectral images from RGB images via a coarse-to-fine CNN. Sci China Inf Sci 65:1–14 32. Wang CY, Bochkovskiy A, Liao HYM (2022) YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv preprint arXiv:2207.02696 33. Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Ke Z, Li Q, Cheng M, Nie W, Li Y (2022) YOLOv6: a single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976 34. Ali S (2016) Lip contour extraction using elliptical model. In: 2016 international workshop on computational intelligence (IWCI). IEEE, pp 30–34 35. Ali S (2016) Embedded home surveillance system. In: 2016 19th international conference on computer and information technology (ICCIT). IEEE, pp 42–47 36. Mittal S (2019) A survey on optimized implementation of deep learning models on the Nvidia Jetson platform. J Syst Archit 97:428–442 37. Ali S, Jonmohamadi Y, Takeda Y, Roberts J, Crawford R, Pandey AK (2020) Supervised scene illumination control in stereo arthroscopes for robot assisted minimally invasive surgery. IEEE Sens J 21(10):11577–11587 38. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788 39. Mei-Ling H, Ya-Han (2020) Dataset of tomato leaves. Mendeley Data V1. https://doi.org/10.17632/ngdgg79rzb.1

Chapter 9

Analysis of the Performance of YOLO Models for Tomato Plant Diseases Identification

Shakil Ahmed

1 Introduction

Agricultural production is one of the critical factors in many countries' economic development. According to a World Bank report, about 4% of global gross domestic product (GDP) comes from agriculture; in the least developed countries, this share can exceed 25% [1]. According to USDA statistics, US agricultural farms contributed about $136 billion to the US economy in 2019 [2]. The contribution could increase further with efficient use of technology in agriculture. Plant production can be damaged by environmental factors (climate conditions) and pathogen infections. According to the Food and Agriculture Organization of the United Nations (FAO), around $220 billion is lost annually due to plant diseases [2]. The plant leaf is one of the crucial parts of a plant for identifying the plant's health. If a plant is infected, leaves can start to fall off at any stage and eventually the plant can die; as a result, production decreases and economic losses follow. It is therefore necessary to identify the cause and type of a plant's infection as accurately and as early as possible. Most farms use two traditional methods to identify plant infections: expert diagnosis and pathogen analysis. The first method relies on a plant disease expert with many years of hands-on experience in identifying infections, who works in the production field and has real-time diagnostic experience with extensive knowledge of different plant diseases. In contrast, finding a domain expert for plant disease monitoring is challenging, and it may not be cost-effective for farmers. In addition, this method depends heavily on real-time experience and manual monitoring and has low accuracy regarding the infections [3]. It is also less efficient, as different experts might have different opinions. The second method is identifying the pathogens

S. Ahmed (B) Department of Mechanical and Electrical Engineering, Massey University, Auckland, New Zealand e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7_9


Fig. 1 Comparison of object detection algorithms [13]

that cause the diseases using a microscope. This method achieves high accuracy compared with the first method; however, it is time-consuming and unsuitable for wide-area inspections [4]. As a result, it is necessary to implement an intelligent automated system for plant disease identification that regularly monitors plant growth across a wide range of agricultural fields. Artificial intelligence (AI)-based automatic plant monitoring systems identify a plant's diseases with minimal human intervention. With the advancement and recent success of deep learning technology in performing high-level computer vision tasks, for instance image pixel-level classification [5, 6], researchers are interested in applying these techniques to identify plant diseases and insects [7]. In addition, deep learning-based applications significantly improve disease identification accuracy and processing time and reduce the workload [8]. Convolutional neural networks (CNNs) have achieved more than traditional methods: the model runs an end-to-end pipeline to automate the learning process with fewer model parameters, which makes CNN models much easier for non-experts to use for computer vision tasks in plant disease identification [9]. CNNs have shown outstanding performance in automatic visual feature extraction and are widely used in different agricultural image processing applications for plant disease identification, insect detection, etc. [10, 11]. You Only Look Once (YOLO), introduced in 2015, employs CNNs in a state-of-the-art technique for detecting and recognising various objects in a picture in real time [12]. YOLO has several advantages compared with other object detection algorithms: it is much faster, as it does not use the traditional complex pipelines. Figure 1 shows the comparison of different object detection algorithms. Its accuracy is very high compared with other state-of-the-art real-time object detectors. In addition, YOLO has an open-source community that is improving it quickly. Although many YOLO versions are available, there is still a research gap regarding the different YOLO models' performance in tomato leaf disease identification. In this paper, the performance of different YOLO versions (YOLO-3, YOLO-4, YOLO-5,


YOLO-4 tiny, and YOLO-5 tiny) is analysed for plant disease identification using image classification. The contributions of this study are as follows:

1. This study analyses the architecture of the different YOLO models and their performance for multi-class tomato leaf disease detection.
2. This study investigates the YOLO models in terms of portability and efficiency for disease detection.
3. This study proposes the YOLO model best suited for disease detection under various factors, and compares the different models' detection processing times and accuracy for tomato leaf disease detection.

The rest of the paper is organised as follows: Sect. 2 discusses related articles on object detection models used for plant leaf disease detection. Section 3 explains the architecture of the YOLO models, the experimental environment, and the data set used in this study, while Sect. 4 describes the performance of the YOLO models and Sect. 5 concludes the paper.

2 Literature Review

Early identification of plant diseases is one of the significant steps for disease prevention and treatment. This section discusses techniques that can be used for the identification of plant diseases. The support vector machine (SVM) is a well-known machine learning classifier that has been used in many domains [14, 15], including plant disease detection [16, 17]. The authors of [17] proposed a histogram technique to segment the leaf's healthy and unhealthy regions: images are scaled to 300 × 300, brightness is improved, the contrast level is adjusted, and background noise is reduced. After that, k-means clustering is used for segmentation, and the statistical GLCM model is applied to extract the important features, after which an SVM classifier identifies the different leaf diseases. Another SVM model was proposed in [18] to identify sugar plant diseases, with hyperspectral images used as model input; this technique achieved an average accuracy of 78%. Another author proposed an SVM model with a quadratic kernel function to identify curl disease of the tomato plant from yellow leaves [19]; the model's classification accuracy for curl disease reached 92%. For huanglongbing detection, the authors used SVM and artificial neural networks (ANN) [20], achieving 92.2% accuracy with the ANN and 92.8% with the SVM. Another ML-based model, k-means-clustering-based plant disease identification and classification, was proposed in [21]. Image classification and detection running on embedded devices have several benefits for mobile robotics in terms of portability. Conventional ML approaches, for instance SVM and Adaboost-based classification and detection on handcrafted features like Haar, corners, or histograms of oriented gradients,


perform in real time on devices such as the Raspberry Pi [22–24]. Many ML algorithms, for instance SVM, have been proposed for automated plant disease detection, which is an efficient approach for the agriculture industry. However, traditional ML models have limitations due to the complex nature of the image processing and feature extraction methods: they score lower in processing time for real-time infection detection, and they are unsuitable for the non-uniform, complex backgrounds of real situations. In contrast, deep learning models are useful in overcoming the above-mentioned drawbacks. DL models dominate the computer vision field and are used in different applications [25–27] across domains. In solving highly complex computer vision problems, deep learning technology has proven its efficacy: extracting meaningful features automatically in a learnable fashion outperforms handcrafted features. For instance, in the stereo vision context, deep learning outperforms conventional computer vision techniques that rely on handcrafted features and ML [28, 29]. Likewise, the leaf is a highly complex attribute of a plant, with low texture or repeated textures and subtle changes in colours and veins when early-stage disease is considered, and such methods combined with stereo and RGB images have significant importance in disease detection [30]. Hence, agriculture is one of the major fields where DL models are used for automated plant disease classification, image segmentation, and plant detection. CNN-based models have been used successfully and achieve a high success rate in object detection. CNNs can also reduce the complexity of image processing and extract essential features automatically from the input image. Due to the Region Proposal Network (RPN) technique in such models, the processing time drops significantly for plant disease detection. Thus, many researchers have recently used different types of CNN models to identify plant diseases. Mask R-CNN [31], Faster R-CNN [32], and SSD [33] have been applied and successfully identify crop diseases. YOLO is one of the popular CNN-based models: YOLO-3 [34], YOLO-4 [35], and YOLO-5 [36] are designed to identify plant infections from images more precisely. Hammad et al. [37] use a meta-architecture based on a DL model to process images to identify plant diseases. Other models, such as AlexNet and GoogleNet, have been used to identify 26 diseases across 14 different crops by processing around 54,000 images of both healthy and unhealthy plant leaves [38]. In addition, a DL-based EfficientNet has been proposed to classify tomato diseases from around 18,000 leaves [39]. Research in this domain also includes multispectral and hyperspectral data to detect disease; however, this requires expensive dedicated hardware [40, 41]. Although reconstructed multispectral images presented in other research have the potential for use in field robotics, very scarce literature is found in the context of plant disease detection [42]. The success of the models reviewed above depends on many factors, such as complex backgrounds and variability of light in real situations. In addition, the accuracy and processing time of the system must be considered, which many existing models do not address.
Thus, there is a significant gap in existing models for plant disease identification in terms of accuracy and processing time. The YOLO algorithm is able to fill this gap in the trade-off between accuracy and


time. The state-of-the-art YOLO algorithm has a higher processing speed and also achieves higher accuracy and precision in identifying infections in real agricultural field scenarios. In this paper, we analyse the performance of different YOLO algorithms (YOLO-3, YOLO-4, YOLO-5, YOLO-4 tiny, and YOLO-5 tiny) that are popular and widely used for identifying plant infections more accurately compared with other DL models. We also apply different performance metrics to identify the models' efficiency and accuracy.

3 Methodology

In the era of object detection and classification, You Only Look Once (YOLO) combines high-level feature extraction techniques with a detection mechanism that provides bounding boxes and class annotations. Beyond its earlier versions, the current releases are YOLO version 3 (YOLO-3), YOLO version 4 (YOLO-4), and YOLO version 5 (YOLO-5) [34–36]. To integrate such an intelligent system into a mobile robotic platform, lighter versions of these models have been implemented with reduced network architectures, known as YOLO-3 tiny, YOLO-4 tiny, and YOLO-5 tiny. To benchmark such lighter models in agricultural field robotics, the state-of-the-art detection techniques YOLO-3, YOLO-4, YOLO-5, YOLO-4 tiny, and YOLO-5 tiny are investigated to solve multi-class detection problems.

Multiscale feature extraction is modelled in a convolutional network through convolution operations and successive stride or max pool operations [26, 42]. The residual learning technique is used to alleviate the vanishing gradient problem during training, and information propagation is strengthened through the use of multi-connected layers [34]. Similar to deeper architectures like ResNet, DenseNet, and Inception, and techniques like residual learning, the YOLO-3 implementation introduces Darknet-53, a new deeper network comprising 53 layers to extract features from images, shown in Fig. 2 [43–45]. Three detection heads are used at different scales to build feature pyramids that propagate high-level semantic information to accurately predict class probabilities and box coordinates. YOLO-4 introduced a new feature extractor network called Cross Stage Partial Network Darknet-53, in short CSPDarknet53. A dense connection mechanism is used to minimise information loss, and wider network modules such as spatial pyramid pooling are used in the middle layers of the model. YOLO-5 fundamentally introduced a focus layer that replaced the initial layers of Darknet-53; the architecture of YOLO-5 is shown in Fig. 3. Similar to YOLO-4, it uses a path aggregation network that includes a feature pyramid network to maintain enhanced information flow among the convolution layers.


Fig. 2 Backbone structure of the 53-layer Darknet

3.1 Experimental Environment

Ubuntu 22.04, Python, and CUDA were used for the experiments, which were run on one desktop PC; the hardware specification of that machine is given in Table 1. PyTorch and the Darknet-53 framework were used to validate the models.


Fig. 3 Architecture of the YOLO-5 version

Table 1 Hardware environment details for the analysis

Hardware            Model                   Number
Main board          HP EliteBook 850 G8     1
CPU                 Intel Core i7-1185G7    1
Memory              32 GB DDR4 3200         2
GPU                 NVIDIA Quadro K2200     2
Solid state drive   512 GB                  1

3.2 Data Set

The data set used for this study was collected from publicly available open-source data [46]. It comprises 25,851 images, divided into two sets: 18,096 images were used for training and 7755 for testing. There are 11 categories in the data set; one category is healthy, and the other ten are diseases. The detailed classes of the data set are shown in Fig. 4.


Fig. 4 Class distribution of different diseases from the data set

Table 2 Hyperparameter settings for the analysis

Models        Epoch   Framework    Saturation-Hue   BFLOPS
YOLO-3        3000    Darknet-53   1.5-0.05         130
YOLO-4        3000    Darknet-53   1.5-0.05         120
YOLO-4 tiny   3000    Darknet-53   1.5-0.05         12.5
YOLO-5        3000    PyTorch      1.5-0.05         210
YOLO-5 tiny   3000    PyTorch      1.5-0.05         16.4
3.3 Training, Test and Validation

The YOLO models (YOLO-3, YOLO-4, YOLO-4 tiny, YOLO-5, and YOLO-5 tiny) were trained on the described data set. Data augmentation techniques were used for all models during the training process to cope with unobserved data: a coefficient of 1.5 is used for saturation and exposure, and a random hue coefficient of 0.05 is used. Because of the different computational demands of each model, a different number of billions of floating-point operations per second (BFLOPS) applies to each model. The training parameters used for the study are given in Table 2.
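For illustration, sampling these colour-jitter coefficients might look as follows; whether Darknet inverts the saturation/exposure scale with 50% probability exactly as below is an assumption, not confirmed by the chapter.

```python
import random

def sample_colour_jitter(s=1.5, h=0.05):
    # Saturation/exposure scale drawn from [1, s], randomly inverted so the
    # effective range is [1/s, s]; hue shifted uniformly in [-h, h].
    scale = random.uniform(1.0, s)
    if random.random() < 0.5:
        scale = 1.0 / scale
    hue_shift = random.uniform(-h, h)
    return scale, hue_shift

print(sample_colour_jitter())
```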

3.4 Performance Matrix

Several techniques are used to measure the performance of the models. One of them is the confusion matrix, which is obtained from the test data; by analysing the confusion matrix and its parameters, we can measure a model's performance. Other metrics used for the evaluation of a model are accuracy, F1 score, precision, and recall. In this study, we have used all of these metrics to evaluate the classification models.

Accuracy is the fraction of predictions that are correct, i.e. the number of correct predictions of a classification model divided by the total number of predictions made:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP = true positives, TN = true negatives, FP = false positives, and FN = false negatives. The second metric is precision, which describes how many of the detected items are truly relevant:

$$\text{Precision} = \frac{TP}{TP + FP}$$

The third metric is recall, which indicates how many of the truly relevant elements were detected:

$$\text{Recall} = \frac{TP}{TP + FN}$$

In addition, the mean average precision (mAP) is considered as a performance metric in this study. The mAP is calculated by averaging the detection precision over classes and incorporates the trade-off between precision and recall at a given IoU value; mAP50 indicates that a predicted box must overlap the ground-truth box by at least 50%:

$$\text{mAP} = \frac{1}{c}\sum_{i=1}^{c} \text{Precision}(i)$$

where c is the total number of disease classes; for this study, c = 11.
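The per-class averaging in this definition reduces to a one-line helper; the input values below are placeholders, not measured results.

```python
# Mean of per-class precision at IoU 0.5, as defined above.
def mean_average_precision(per_class_precision):
    c = len(per_class_precision)          # c = 11 in this study
    return sum(per_class_precision) / c

print(mean_average_precision([0.83, 0.94, 0.91]))  # toy three-class example
```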

4 Result

YOLO models are among the most effective models for image object detection in several applications. In this paper, different YOLO models were implemented and analysed. Figure 5 shows a testing sample from a YOLO model: it shows the name of the disease and the confidence percentage for that identified disease.

Fig. 5 Testing result sample using a YOLO model


Fig. 6 Performance comparison of the different YOLO family

Table 3 Quantitative comparison during the training process

Models        Training time (hours)   Best mAP@50 (%)
YOLO-3        26.3                    82.76
YOLO-4        31.2                    89.55
YOLO-4 tiny   8.6                     86.84
YOLO-5        32.1                    92.62
YOLO-5 tiny   8.3                     88.31

From Fig. 6, we can see that YOLO-5 performs better than the other YOLO family members considered in this study. In terms of accuracy, YOLO-5 performs much better than YOLO-3, and YOLO-4 performs better than YOLO-4 tiny. In the case of precision and recall, YOLO-5 tiny performs better than YOLO-4 tiny. Figure 6 shows the performance comparison of the different YOLO models, with mAP@50% calculated at several epoch intervals. Table 3 shows the training time and the top mAP for each model. YOLO-3, YOLO-4, and YOLO-5 took longer to train, as expected: for 3000 epochs they took 26.3, 31.2, and 32.1 h, respectively. However, YOLO-4 tiny and YOLO-5 tiny took less time to complete the same number of epochs due to their architectural design, with YOLO-5 tiny taking less time than YOLO-4 tiny. In terms of mAP performance, YOLO-5 has the highest accuracy (92.62%) and YOLO-3 the lowest, while YOLO-4 tiny, YOLO-5 tiny, and YOLO-4 perform almost similarly. This study also observed the detection speed of each model, shown in Fig. 7: the YOLO-4 tiny and YOLO-5 tiny models have a higher detection speed of about 200 FPS, while the other models run at about 50 FPS. YOLO-5, the model with the highest accuracy, has a slower detection speed.

Fig. 7 Detection speed comparison of the different YOLO family models (mAP@50% versus detection speed in FPS)

However, it is faster than YOLO-3 and YOLO-4, whose detection speeds are around 50 FPS. Therefore, a tiny model could be a suitable option where detection requires faster processing and low power consumption.

5 Conclusion

With the advancement of object detection technology, several promising object detection applications are used in the agriculture industry. Plant identification, disease detection, plant phenotyping and pest identification problems are addressed very efficiently with the help of different object detection algorithms. In disease identification, conventional methods rely on human inspection or pathogen-based laboratory tests; both are limited in availability, efficiency and timeliness. Researchers have proposed several automatic object detection algorithms to overcome these limitations. The YOLO algorithm is very popular and efficient for plant disease detection. YOLO is open source; as a result, many versions have been developed. In this study, we analysed the performance of the YOLO-3, YOLO-4, YOLO-4 tiny, YOLO-5 and YOLO-5 tiny models for tomato disease identification. We used open data sources, applied the YOLO models, and analysed each model's performance in terms of accuracy, precision, recall, F-1 score, mAP and detection time. In our experimental environment, YOLO-5 performs better than the other models, while YOLO-5 tiny and YOLO-4 tiny have shorter detection times than the others. For portable, low-power devices, YOLO-5 tiny and YOLO-4 tiny are suitable for plant disease detection; however, for the most accurate results, YOLO-5 is very effective.


Appendix

Class-wise precision performance of the different YOLO models

Class | YOLO-3 | YOLO-4 | YOLO-4 tiny | YOLO-5 | YOLO-5 tiny
Late_blight | 79.53 | 82.19 | 78.32 | 84.62 | 77.23
healthy | 89.09 | 92.23 | 89.47 | 94.73 | 90.31
Early_blight | 82.85 | 83.41 | 81.22 | 91.03 | 86.21
Septorialeafspot | 77.52 | 81.78 | 78.51 | 88.64 | 82.79
TomatoYellowLeafCurlVirus | 91.65 | 96.49 | 97.73 | 97.98 | 96.67
Bacterial_spot | 67.31 | 74.45 | 70.08 | 78.68 | 69.23
Target_Spot | 65.12 | 72.87 | 69.91 | 75.21 | 70.78
Tomatomosaicvirus | 78.54 | 80.26 | 79.38 | 81.06 | 80.65
Leaf_Mold | 83.68 | 86.18 | 84.64 | 90.27 | 87.68
SpidermitesTwospottedspider_mite | 87.45 | 88.71 | 87.67 | 92.77 | 86.08
Powdery Mildew | 88.61 | 89.92 | 87.16 | 89.24 | 85.64

Class-wise recall performance of the different YOLO models

Class | YOLO-3 | YOLO-4 | YOLO-4 tiny | YOLO-5 | YOLO-5 tiny
Late_blight | 78.21 | 83.45 | 79.65 | 83.12 | 77.54
healthy | 89.78 | 93.81 | 90.21 | 93.45 | 90.14
Early_blight | 83.12 | 82.14 | 80.78 | 91.79 | 85.29
Septorialeafspot | 78.48 | 80.54 | 80.05 | 89.35 | 81.45
TomatoYellowLeafCurlVirus | 90.27 | 95.35 | 95.31 | 96.84 | 94.14
Bacterial_spot | 70.23 | 76.47 | 72.84 | 84.14 | 73.41
Target_Spot | 67.55 | 76.70 | 70.54 | 79.65 | 70.56
Tomatomosaicvirus | 78.81 | 81.34 | 80.34 | 82.11 | 80.23
Leaf_Mold | 84.81 | 87.56 | 85.44 | 91.39 | 85.84
SpidermitesTwospottedspider_mite | 87.12 | 88.34 | 86.47 | 91.32 | 86.71
Powdery Mildew | 88.46 | 89.31 | 88.47 | 88.79 | 86.11


Chapter 10

Strawberries Maturity Level Detection Using Convolutional Neural Network (CNN) and Ensemble Method

Zeynep Dilan Daşkın, Muhammad Umer Khan, Bulent Irfanoglu, and Muhammad Shahab Alam

1 Introduction

The world population, steadily increasing at an astounding rate of around 1.05% yearly, is expected to reach ten billion by 2050 as projected by the United Nations. This triggers the deterioration of natural resources and an enormous demand for food worldwide. Unfortunately, the growth in food production trails behind the population growth rate; rather, a decline has been recorded. If this problem is not dealt with efficaciously at the earliest, it can lead to a catastrophe. The introduction of modern technology to the agriculture sector has shown some promising results regarding the yield and quality of crops, fruits, and vegetables [1–3]. Moreover, in broader terms, such technologies can be utilized to reduce time, cost, and human-driven mistakes. In recent years, many automated systems and intelligent algorithms have been developed for harvesting, plowing, and crop analysis [4]. Some attempts have also been made to build agricultural robots for harvesting different vegetables and fruits, including apples, tomatoes, cucumbers, and strawberries [5–7]. Although the performance of these systems is improving steadily, there are still ongoing issues with various crops that encourage companies and research laboratories to develop new systems for specific harvesting problems. This study focuses on the key problem of detecting a strawberry's maturity to determine whether it is ready to be harvested or not. The fundamental challenge addressed here is that strawberries are delicate and small; therefore, their accurate identification is very critical for harvesting to prevent any damage. The


complexity of the environment and operating conditions makes the overall process more challenging. To solve the strawberry detection problem, the process starts with capturing images of the strawberries via an RGB camera; after detecting the target, intelligent software classifies the strawberries according to their characteristics, such as size, shape, and color. Before the success of convolutional neural networks (CNNs), image processing techniques, such as color-space conversion, segmentation, and background subtraction, were mainly used to determine these characteristics [8]. The performance of these operations is heavily dependent upon the lighting conditions, and slight variations may result in failure. Additionally, assessing the color, size, and shape of different strawberry growth types simultaneously is also a challenging task. Recently, deep learning methods, such as CNNs, have been widely accepted for their robust target detection using autonomous feature extraction and learning ability [9, 10]. In [11], the authors proposed a strawberry detection system using CNNs, where they worked on obtaining accurate results by validating the choice of hyperparameters, framework, and network structure. The authors in [12] compared the performance of various deep architectures for the quality inspection of strawberries. The work presented in [13] used a deep CNN to classify strawberries into two classes; the authors reported achieving an average precision of 88.03% and 77.21% for the mature and immature classes, respectively. Exploiting the impressive learning capability of CNNs, many researchers have employed them for detecting various strawberry diseases. In [14], the authors designed a deep learning-based semantic segmentation model to detect and measure gray mold disease on strawberry plants. The authors in [15] worked on the detection of strawberry diseases using a CNN; their main focus was to detect leaf blight, gray mold, and powdery mildew using a simple, reliable, and cost-effective method. In a similar work [16], the amount of unnecessary fungicide on strawberry leaves was reduced through disease identification; the authors compared the performance of different classifiers, such as AlexNet, SqueezeNet, GoogleNet, ResNet-50, SqueezeNet-MOD1, and SqueezeNet-MOD2. Aside from strawberries, CNN models have also proved effective with different crop types, such as tomatoes, tobacco, and hazelnuts. The authors in [17] studied semantic and instance segmentation models based on U-Net and Mask R-CNN to detect the effects of Tuta absoluta, a pest known as the tomato leaf miner; the main focus was to develop a model to improve tomato productivity and prevent farmers from annual losses. In [18], the authors developed a dataset, called TobSet, that comprises tobacco crop and weed images; they trained and tested Faster R-CNN and YOLOv5-based vision frameworks on TobSet and reported classification accuracies of 98% and 94%, respectively. The work reported in [19] focused on classifying seventeen widely grown hazelnut varieties using a CNN to obtain high-quality nuts from the orchard to the consumer. The performance of CNN models is also associated with the complexity of the environment; their performance may saturate at a certain level, beyond which further improvement cannot be expected. More recently, a promising solution to this problem has been proposed in the form of the Ensemble method, which combines various classifiers


to make better predictions. Being a relatively new topic, not much investigation related to strawberries is found; rather, forestry and agriculture in general have been the focus. In one of the latest research studies on this topic [20], the authors used a Voting Ensemble technique to enhance the performance of two different networks for forecasting strawberry yields and prices. In [21], the authors used an Ensemble of pre-trained residual network structures to detect fires. The authors in [22] investigated the performance of two novel CNN-DNN models for predicting county-level corn yields across the US Corn Belt (12 states). The contributions of this study are summarized as follows:

• Five state-of-the-art CNN models (AlexNet, GoogleNet, SqueezeNet, DenseNet, and VGG-16) are employed to determine the maturity levels of strawberries on the developed image dataset.
• Two Ensemble CNN-based models utilizing GoogleNet, SqueezeNet, and VGG-16 are proposed to further improve the accuracies obtained by each individual model.
• This study builds and labels a strawberry image dataset (StrawberrySet) captured during different phases of the strawberry maturity levels. The dataset is captured under different conditions, such as lighting, day timings, and weather conditions, to enable accurate training and evaluation. StrawberrySet is an open-source dataset publicly available at https://github.com/dlndskn/Strawberry-Dataset.git (accessed on February 22, 2023).

The rest of the chapter is organized as follows: Sect. 2 presents the image dataset, followed by the CNN and Ensemble models. The classification results for the considered models are compared and analyzed in Sect. 3. Lastly, Sect. 4 concludes this work.

2 Materials and Methods

2.1 Acquisition of Image Dataset and Pre-processing

Most of the available strawberry datasets focus on the determination of diseases; therefore, for maturity level detection, we developed an extensive in-house image dataset of strawberries from actual plants. The main objective of building this dataset is to provide real-field data for training and evaluating the performance of state-of-the-art algorithms for strawberry maturity detection. The dataset comprises 500 original images that portray different phases of the ripening process of the strawberries. The capturing device was a 12MP camera with a 1/3-inch sensor and a wide-angle (28 mm equivalent) f/1.8 lens. No artificial shading or lighting sources were used while capturing the images. To maintain diversity in the dataset, the images were captured under several factors of variation: different growth stages, different day timings, and varying lighting and weather conditions (i.e., on normal, bright sunny, and cloudy days). Some sample images from the dataset


Fig. 1 Different growth stages in the life cycle of strawberry: (a) immature, (b) semi-mature, (c) mature

Fig. 2 Sample images before and after the augmentation process: (a) actual, (b) augmented

are presented in Fig. 1. After data acquisition, the main step involved in maturity detection is the annotation of images for ground truth data; all images in the dataset were manually labeled with the LabelImg tool. In deep learning, data augmentation is considered an essential step to enhance image datasets, as issues such as poor generalization and over-fitting can be dealt with in this manner [23]. Data augmentation also helps to obtain more robust results, since it provides different angles and perspectives. Using the Keras ImageDataGenerator, the images were rotated within a range of ±10°, sheared within a range of ±0.15, shifted in width and height within a range of ±0.1, flipped horizontally and vertically, and finally zoomed within a range of ±0.1. At the end of the augmentation process, the 500 original images were extended to 900 images showing the different stages of the ripening process of strawberries. A sample augmented image generated from an original image is illustrated in Fig. 2. All images were captured at a resolution of 1600 × 1200 pixels but later resized to satisfy the requirements of the different models. The strawberries were classified into multiple classes according to their maturity levels: mature, semi-mature, and immature. All the considered models are first trained on the strawberry dataset before being evaluated.
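The augmentation settings just described can be reproduced with the Keras ImageDataGenerator roughly as follows; the directory layout, target size, and batch size are assumptions for illustration, not details given in the chapter.

```python
# Minimal sketch of the augmentation described above, using the Keras
# ImageDataGenerator. Parameter values follow the text; the directory
# path, target size, and batch size are illustrative assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=10,       # rotate within +/- 10 degrees
    shear_range=0.15,        # shear within +/- 0.15
    width_shift_range=0.1,   # shift width within +/- 0.1
    height_shift_range=0.1,  # shift height within +/- 0.1
    horizontal_flip=True,
    vertical_flip=True,
    zoom_range=0.1,          # zoom within +/- 0.1
)

# Hypothetical usage: stream augmented batches from a folder of labelled images.
train_batches = augmenter.flow_from_directory(
    "StrawberrySet/train",     # assumed directory layout
    target_size=(224, 224),    # models in this study expect 224 x 224 x 3 inputs
    batch_size=32,
    class_mode="categorical",  # three classes: mature, semi-mature, immature
)
```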


Fig. 3 Deep learning-based vision framework

2.2 CNN-Based Detection and Classification

Due to the inherent robustness of deep learning algorithms in obtaining accurate results under difficult operating conditions, they are mostly preferred over conventional image processing and computer vision approaches. This study aims to develop a deep learning-based vision framework that is able to perform multi-class classification on the image dataset of strawberries. To achieve this, two different approaches are utilized, based on five distinguished CNN models (AlexNet [24], GoogLeNet [25], SqueezeNet [26], DenseNet [27], and VGG-16 [28]) and two Ensemble methods. The Ensemble methods are utilized with the intention of improving the performance of the classifiers, especially the ones that performed worse. The classifiers used in this study differ from each other in the size of their convolutional layers and in how these are structured within the network. These layers are the main parts of the network that extract the features of an image according to its characteristics; after some operations, the classification is applied. The flowchart of the proposed CNN-based framework for strawberry classification is shown in Fig. 3. Basic details about the involved CNN models and Ensemble methods are provided below.

AlexNet AlexNet was developed for the ImageNet contest in 2012 to classify 1.2 million high-resolution images. The AlexNet architecture consists of five convolutional layers, three fully connected layers, and one softmax layer, as depicted in Fig. 4. ReLU is employed as the activation function in all convolutional layers, and pooling layers are introduced to perform max pooling. The input image size for this convolutional network is defined as 224 × 224 × 3 pixels. The dropout value of this model is 0.4, which is added to prevent over-fitting.

GoogleNet GoogleNet, from Google, was declared a winner of the ILSVRC 2014 competition. To avoid over-fitting, the GoogleNet architecture introduced the idea of having filters with multiple sizes operating at the same level. This caused the network to grow wider rather than deeper; moreover, the accuracy improved while the computational load remained the same. The architecture consists of twenty-two convolutional layers and five max pooling layers and is a variant of the original Inception Network (Fig. 5).


Fig. 4 Architecture diagram of AlexNet

In comparison with AlexNet, the number of parameters is reduced from sixty million to four million. The required input image size for GoogleNet is defined as 224 × 224 × 3 pixels.

SqueezeNet SqueezeNet begins and ends with a convolution layer, and in between there are eight fire modules. The most important aspect of this architecture is the attainment of higher accuracy at low computational requirements with the help of the fire modules. Each module comprises two sub-modules: a 1 × 1 squeeze convolution layer and an expand layer that has a mix of 1 × 1 and 3 × 3 convolution filters. The input image size for SqueezeNet is required to be 224 × 224 × 3. The overall SqueezeNet architecture is illustrated in Fig. 6.

DenseNet Normally, a convolution layer depends upon the previous convolution layer for the input to generate an output; the current layer carries the information to the next, and this process repeats itself according to the structure of the architecture. In the DenseNet structure, by contrast, each layer is connected to every other layer; therefore, the system uses collective knowledge in order to obtain the results. The entire structure comprises dense blocks that pass information to and from the other layers (Fig. 7). The input image size for DenseNet is defined as 224 × 224 × 3 pixels.

VGG-16 VGG-16 first appeared in 2014 during the ImageNet Challenge. VGG-16 is regarded as a variant of AlexNet that improves the overall accuracy by introducing further network depth.

Fig. 5 Architecture diagram of GoogleNet

Fig. 6 Architecture diagram of SqueezeNet

Fig. 7 Architecture diagram of DenseNet

In this study, VGG-16 is used with a structure of thirteen convolutional layers and five max pooling layers combined in five blocks (Fig. 8). The convolutional layers are followed by three fully connected layers with a flattening layer in between. Finally, an output layer is defined with a softmax activation function for the three different classes in the strawberry dataset. The input image size for the VGG-16 architecture is required to be 224 × 224 × 3 pixels.

Ensemble Method The Ensemble method combines multiple classifiers to increase the overall accuracy of multi-class prediction. The main objectives are to reduce bias and variance. Stacking, Blending, Bagging, and Boosting dominate the field of Ensemble Learning. In general, a combined Ensemble model outputs better results than an individual model because the variance of the data is reduced. The weighted average Ensemble method, one of the sub-branches of the boosting method, is utilized in this study. Through this approach, the effect of each individual classifier on the final prediction can be manipulated by means of a weighting factor. This work uses uniform weights to ensemble GoogleNet with SqueezeNet and with VGG-16; the approach is easily extendable to n models. The architecture diagram of the Ensemble method is illustrated in Fig. 9.
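As a rough sketch of this weighted averaging, the helper below combines the softmax outputs of already-trained Keras models; the function name and the uniform default weights are illustrative choices under the chapter's description, not the authors' implementation.

```python
# Minimal sketch of a weighted-average ensemble over trained Keras models
# that output softmax probabilities for the three maturity classes.
import numpy as np

def ensemble_predict(models, images, weights=None):
    """Weighted average of per-model class probabilities."""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)  # uniform weights, as in the chapter
    weighted = [w * m.predict(images) for m, w in zip(models, weights)]
    avg_probs = np.sum(weighted, axis=0)   # shape: (n_images, n_classes)
    return np.argmax(avg_probs, axis=1)    # predicted maturity class per image

# Hypothetical usage with two trained models:
# labels = ensemble_predict([googlenet_model, squeezenet_model], test_images)
```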

Fig. 8 Architecture diagram of VGG-16

Fig. 9 Architecture of the Ensemble method


2.3 Comparison Criteria

To evaluate the performance of the CNN models and the Ensembles, various performance metrics are considered: accuracy, loss, precision, recall, F1-score, and specificity. A confusion matrix is used as the main tool to determine accurate predictions and to better analyze the behavior of a specific model. The matrix consists of four main elements: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). True positives are correctly predicted event values, and true negatives are correctly predicted no-event values; false positives are incorrectly predicted event values, and false negatives are incorrectly predicted no-event values. These elements are then used to compute the evaluation criteria.

2.4 Evaluation Metrics

To evaluate the performance of the involved CNN and Ensemble models, six indicators are assessed: accuracy, loss, precision, recall, F1-score, and specificity. These indicators were selected for their wide acceptability and recognition within the research community.

Accuracy and loss To measure the performance of each model, accuracy and loss ratings are compared first. Accuracy measures how close the model's predictions are to the true data, whereas loss is the sum of errors over the validation and training sets. Accuracy is a measure we wish to maximize, whereas loss should be reduced as much as possible. The best and worst values for accuracy are 1 and 0, respectively; for loss, the best value is 0, and no peak is defined for the worst value.

Precision, recall, F1-score, and specificity Precision, recall, F1-score, and specificity are the additional parameters considered to measure the performance of each model. Precision observes how accurate the model is by using the predicted positives (false positives and true positives) and checking whether they are predicted correctly. It is calculated as follows:

Precision = TP / (TP + FP)    (1)

{worst value: 0; best value: 1}.

Recall assesses how well the developed system determined the actual positives and labeled them as true positives. It is determined as

Recall = TP / (TP + FN)    (2)

{worst value: 0; best value: 1}.

F1-score is calculated when there is a need to balance precision and recall; it is their weighted average and usually gives better results where the distribution of the classes is not equal. The score is calculated as

F1-score = (2 × Recall × Precision) / (Recall + Precision)    (3)

{worst value: 0; best value: 1}.

Specificity measures how precisely the system identified the actual negatives and labeled them as true negatives. It is calculated by

Specificity = TN / (FP + TN)    (4)

{worst value: 0; best value: 1}.
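The four measures in Eqs. (1)–(4) can be computed per class in a one-vs-rest fashion, as in the minimal sketch below; the example label arrays are placeholders, not data from this study.

```python
# Minimal sketch of Eqs. (1)-(4) for one class treated as "positive"
# (one-vs-rest); the label arrays below are illustrative placeholders.
import numpy as np

def one_vs_rest_metrics(y_true, y_pred, positive_class):
    """Precision, recall, F1-score, and specificity per Eqs. (1)-(4)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive_class) & (y_true == positive_class))
    fp = np.sum((y_pred == positive_class) & (y_true != positive_class))
    fn = np.sum((y_pred != positive_class) & (y_true == positive_class))
    tn = np.sum((y_pred != positive_class) & (y_true != positive_class))
    precision = tp / (tp + fp)                           # Eq. (1)
    recall = tp / (tp + fn)                              # Eq. (2)
    f1 = 2 * recall * precision / (recall + precision)   # Eq. (3)
    specificity = tn / (fp + tn)                         # Eq. (4)
    return precision, recall, f1, specificity

# Example with the classes encoded as 0 = immature, 1 = semi-mature, 2 = mature:
print(one_vs_rest_metrics([0, 1, 2, 1, 2], [0, 1, 1, 1, 2], positive_class=1))
```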

3 Results and Discussion

The computations were made using Jupyter Notebook and the Keras and TensorFlow deep learning frameworks on a computer pre-installed with Windows 10 and equipped with an i7-8750H CPU, a GTX1060 GPU, 16 GB of RAM, and 256 GB of SSD storage. To allow a fair comparison among all models, the epoch count and batch size were kept constant throughout at 30 and 250, respectively. After specifying the initial parameters, training, validation, and testing batches were generated. The whole image dataset was distributed as 70% (613 images) for training, 20% (163 images) for validation, and 10% (123 images) for testing. Each batch consists of samples from the three different classes (mature, semi-mature, and immature) to be classified. For the learning rate, an annealer was used to obtain the best results for all models. The learning rate is a parameter that controls how strongly the model adapts at each update; the annealer adjusts it during training, which increases training speed and productivity. With manual tuning, utmost care must be taken: a smaller value may result in insufficient changes, while a larger value may produce rapid changes that disturb the overall learning process of the system.
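One common way to realize such an annealer in Keras is the ReduceLROnPlateau callback, sketched below; the factor, patience, and floor values are assumptions rather than the settings used in this study.

```python
# Minimal sketch of a learning-rate annealer in Keras. ReduceLROnPlateau
# lowers the learning rate when validation loss stops improving; the
# factor, patience, and floor values below are illustrative assumptions.
from tensorflow.keras.callbacks import ReduceLROnPlateau

annealer = ReduceLROnPlateau(
    monitor="val_loss",  # watch the validation loss
    factor=0.5,          # halve the learning rate on a plateau
    patience=3,          # wait 3 epochs without improvement
    min_lr=1e-6,         # never anneal below this floor
)

# Hypothetical usage with a compiled Keras model and the batch generators:
# model.fit(train_batches, validation_data=val_batches,
#           epochs=30, callbacks=[annealer])
```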

3.1 Comparison of the Individual CNN Models and Ensemble Models

The training and validation results obtained over 30 epochs are given in Table 1 for each method used in this study. Most of the CNN models performed very well on the training dataset, with the highest accuracy attained being 97% and the


lowest 74%. Among the considered models, GoogleNet and VGG-16 are the worst performers, with accuracies of 74% and 85%, respectively; their loss values reflect the same, at 2.06 and 1.90, respectively. On the other hand, AlexNet, DenseNet, and SqueezeNet came out as the best performers, with 95%, 95%, and 97% accuracies and 0.30, 0.16, and 0.10 loss values on average. The first Ensemble network (SqueezeNet–GoogleNet) attained 82% accuracy and 0.27 loss, while the second Ensemble network (VGG-16–GoogleNet) obtained 75% accuracy and 3.56 loss. Therefore, in terms of loss and accuracy, SqueezeNet had the best performance, followed by DenseNet and AlexNet, whereas GoogleNet had the worst. In accordance with the results in Table 1, it was expected that AlexNet, SqueezeNet, and DenseNet would also exhibit better performance in terms of precision, recall, F1-score, and specificity. As can be seen from Table 2, SqueezeNet and DenseNet have outperformed the others, obtaining scores of either 0.99 or 1.00 for the given measures. To further improve the performance of the worst-performing classifiers, two different Ensemble approaches are developed. In this study, GoogleNet and VGG-16 are the worst performers, whereas SqueezeNet is the best performer.

Table 1 Accuracy and loss evaluated for 30 epochs

Classifiers | Accuracy (training) | Accuracy (validation) | Loss (training) | Loss (validation)
AlexNet | 0.95 | 0.74 | 0.30 | 1.25
GoogleNet | 0.74 | 0.56 | 2.06 | 3.28
SqueezeNet | 0.97 | 0.79 | 0.10 | 0.62
DenseNet | 0.95 | 0.63 | 0.16 | 7.29
VGG-16 | 0.85 | 0.66 | 1.90 | 1.16
Ensemble (SqueezeNet–GoogleNet) | 0.82 | 0.83 | 0.27 | 0.22
Ensemble (VGG-16–GoogleNet) | 0.75 | 0.72 | 3.56 | 4.01

Table 2 Precision, recall, F1-score, and specificity for 30 epochs

Performance measures | AlexNet | GoogleNet | SqueezeNet | DenseNet | VGG-16 | Ensemble (SqueezeNet–GoogleNet) | Ensemble (VGG-16–GoogleNet)
Precision | 0.92 | 0.50 | 1.00 | 1.00 | 0.83 | 0.85 | 0.72
Recall | 0.83 | 0.50 | 0.98 | 1.00 | 0.90 | 0.81 | 0.69
F1-score | 0.83 | 0.67 | 1.00 | 0.99 | 0.91 | 0.89 | 0.79
Specificity | 0.83 | 0.67 | 0.98 | 1.00 | 1.00 | 0.88 | 0.81

The first obvious choice is to patch the worst performer (GoogleNet) onto the best performer (SqueezeNet), which provides an overall accuracy of 82%, much better than standalone GoogleNet. In the second approach, both worst performers (VGG-16 and GoogleNet) are combined, resulting in an improved accuracy of 75%, again an improvement over standalone GoogleNet. This validates the idea that the Ensemble method is able to improve the performance of an individual classifier when it is patched with better classifiers. The first Ensemble model outperformed the second; it can therefore be observed that increasing the diversity of the combination results in better performance. Considering the overall performance of SqueezeNet and DenseNet in terms of accuracy, it can be deduced that adding depth to a CNN architecture can improve the overall accuracy; however, this does not guarantee an overall improvement. Additionally, although the Ensemble methods did not outperform the best individual model (SqueezeNet), they strengthen the idea of improving the worst performers by patching them with the best ones. Considering this, it can be concluded that although one model can do an excellent job on a classification problem, a set of models can enhance the performance even further in most cases; the main reason is that Ensemble models are more generalized with less bias. The results proved that CNNs are an effective solution for detecting and classifying strawberries according to their maturity levels, with promising results in the realm of precision agriculture. Using this method for agricultural purposes, the quality of the products can be improved while increasing accuracy and decreasing working costs. Though we achieved satisfactory results, they can be further improved by increasing the dataset size and applying additional pre-processing techniques.

Computational time Deep learning is often regarded as a process that requires resources as well as time, depending upon the complexity of the datasets; therefore, time is a critical element when selecting CNN models. In this study, we observed the computational time required by each model to train and test the 900 images of the dataset. DenseNet, one of the best performers, required the longest computational time: for 30 epochs, it took almost 100 mins to complete the process, making it the most computationally intensive model in our list. DenseNet is followed by VGG-16, AlexNet, and the Ensemble of VGG-16–GoogleNet, which required 45 mins, 25 mins, and 21 mins, respectively. SqueezeNet had the best performance, with an accuracy of 97% achieved in 20 mins; the Ensemble of SqueezeNet–GoogleNet also required almost 20 mins to complete the prediction. Surprisingly, GoogleNet proved to be the least computationally intensive, requiring only 18 mins, but surely at the cost of accuracy and other performance measures.

4 Conclusion

This study presents CNN and Ensemble models to detect and classify strawberries according to their maturity level. The considered models are AlexNet, GoogleNet, SqueezeNet, DenseNet, and VGG-16, plus two Ensemble networks (SqueezeNet–GoogleNet and VGG-16–GoogleNet). To measure the performance of these models,


accuracy, loss, precision, recall, F1-score, specificity, and computational time are used as metrics. SqueezeNet had the best performance, with an accuracy of 97% obtained in 20 mins. DenseNet is a close competitor of SqueezeNet, achieving an accuracy of 95%; however, it proved to be the most computationally expensive, requiring 1 h and 40 mins for training. AlexNet achieved an accuracy of 95% in only 25 mins and hence became the second-best option overall. GoogleNet and VGG-16 under-performed, with accuracy rates of 74% and 85%, but GoogleNet came out as the least computationally intensive model. To test whether the results of the poorly performing classifiers can be improved by the Ensemble method, we first combined VGG-16 with GoogleNet and obtained 75% accuracy in 21 mins; the other Ensemble method combined SqueezeNet with GoogleNet to obtain an accuracy of 82% with a computational time of 20 mins. This study also provides insight into selecting the most appropriate CNN model based on the available resources and the performance requirements. In future work, we plan to investigate the effect of non-uniform weights on the predicted output of the Ensemble methods. Another possible area of research is the introduction of Transfer Learning, a high-level performance-boosting method, to pre-train the given models and improve their accuracy. Lastly, we intend to deploy the proposed model on embedded hardware to provide input to an autonomous harvesting robot.

References

1. He Z, Li M, Cai Z, Zhao R, Hong T, Yang Z, Zhang Z (2021) Optimal irrigation and fertilizer amounts based on multi-level fuzzy comprehensive evaluation of yield, growth and fruit quality on cherry tomato. Agric Water Manag 243:106360
2. Fuglie K, Gautam M, Goyal A, Maloney W (2019) Harvesting prosperity: technology and productivity growth in agriculture. World Bank Publications
3. De Clercq M, Vats A, Biel A (2018) Agriculture 4.0: the future of farming technology. In: Proceedings of the world government summit, Dubai, UAE, pp 11–13
4. Van Klompenburg T, Kassahun A, Catal C (2020) Crop yield prediction using machine learning: a systematic literature review. Comput Electron Agric 177:105709
5. Baeten J, Donné K, Boedrij S, Beckers W, Claesen E (2008) Autonomous fruit picking machine: a robotic apple harvester. Field Serv Robot 531–539
6. Knuth D. Agricultural robots. https://www.agrobot.com/
7. Wang Z, Xun Y, Wang Y, Yang Q (2022) Review of smart robots for fruit and vegetable picking in agriculture. Int J Agric Biol Eng 15:33–54
8. Ouyang C, Li D, Wang J, Wang S, Han Y (2012) The research of the strawberry disease identification based on image processing and pattern recognition. In: International conference on computer and computing technologies in agriculture, pp 69–77
9. Zhou X, Lee W, Ampatzidis Y, Chen Y, Peres N, Fraisse C (2021) Strawberry maturity classification from UAV and near-ground imaging using deep learning. Smart Agric Technol 1:100001
10. Perez-Borrero I, Marin-Santos D, Vasallo-Vazquez M, Gegundez-Arias M (2021) A new deep-learning strawberry instance segmentation methodology based on a fully convolutional neural network. Neural Comput Appl 33:15059–15071
11. Lamb N, Chuah M (2018) A strawberry detection system using convolutional neural network. In: IEEE international conference on big data (Big Data), pp 2515–2520


12. Sustika R, Subekti A, Pardede H, Suryawati E, Mahendra O, Yuwana S (2018) Evaluation of deep convolutional neural network architectures for strawberry quality inspection. Int J Eng Technol 7:75–80
13. Habaragamuwa H, Ogawa Y, Suzuki T, Shiigi T, Ono M, Kondo N (2018) Detecting greenhouse strawberries (mature and immature), using deep convolutional neural network. Eng Agric Environ Food 11:127–138
14. Bhujel A, Khan F, Basak J, Jaihuni M, Sihalath T, Moon B, Park J, Kim H (2022) Detection of gray mold disease and its severity on strawberry using deep learning networks. J Plant Dis Prot 1–14
15. Xiao J, Chung P, Wu H, Phan Q, Yeh J, Hou M (2020) Detection of strawberry diseases using a convolutional neural network. Plants 10:31
16. Shin J, Chang Y, Heung B, Nguyen-Quang T, Price G, Al-Mallahi A (2021) A deep learning approach for RGB image-based powdery mildew disease detection on strawberry leaves. Comput Electron Agric 183:106042
17. Loyani L, Bradshaw K, Machuve D (2021) Segmentation of Tuta Absoluta's damage on tomato plants: a computer vision approach. Appl Artif Intell 35:1107–1127
18. Alam M, Alam M, Tufail M, Khan M, Güneş A, Salah B, Nasir F, Saleem W, Khan M (2022) TobSet: a new tobacco crop and weeds image dataset and its utilization for vision-based spraying by agricultural robots. Appl Sci 12:1308
19. Taner A, Öztekin Y, Duran H (2021) Performance analysis of deep learning CNN models for variety classification in hazelnut. Sustainability 13:6527
20. Chaudhary M, Gastli M, Nassar L, Karray F (2021) Deep learning approaches for forecasting strawberry yields and prices using satellite images and station-based soil parameters. arXiv preprint arXiv:2102.09024
21. Dogan S, Barua P, Kutlu H, Baygin M, Fujita H, Tuncer T, Acharya U (2022) Automated accurate fire detection system using ensemble pretrained residual network. Expert Syst Appl 203:117407
22. Shahhosseini M, Hu G, Khaki S, Archontoulis S (2021) Corn yield prediction with ensemble CNN-DNN. Front Plant Sci 12:709008
23. Lawrence S, Giles C (2000) Overfitting and neural networks: conjugate gradient and backpropagation. In: Proceedings of the IEEE-INNS-ENNS international joint conference on neural networks (IJCNN 2000), vol 1, pp 114–119
24. Krizhevsky A, Sutskever I, Hinton G (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25
25. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
26. Iandola F, Han S, Moskewicz M, Ashraf K, Dally W, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv preprint arXiv:1602.07360

• Head has a complex, self-occluding shape that is not easy to model in a feature-based framework

Leaves Detection. Leaves detection is bound to disease detection. A search on Scopus with the terms “grape leaves detection/segmentation” returns only results for vine disease detection; it is therefore hard to locate papers reporting performance solely on the leaves segmentation task. In Sect. 4.1, computer vision methods for disease detection are also reviewed, including leaves segmentation and diseased part identification in their pre-processing steps. Table 10 lists research works focused on leaves segmentation with reported results. Kalampokas et al. [38] tested 11 CNN model architectures for grape cluster and leaf segmentation in vineyard RGB images; the results indicated that CNNs can be efficiently used for real-time segmentation of grapes and leaves (Fig. 9). In [39], an adaptive snake algorithm-based model (ASA) was proposed for segmentation and region identification; this is an extension of the research team's earlier work, first presented in [40]. Pereira et al. [41] introduced a segmentation technique based on region growing using a colour model and thresholding techniques.

Leaves Stem Detection. Leaf stem (or peduncle) detection is not reported in the literature. Such work does not seem to have attracted research interest; however, it may constitute an essential first step towards computer vision-based precision leaf thinning in vineyards.

Table 10 Computer vision-based vine leaves detection

Refs. | Model | Performance | Dataset | Encountered challenges
[38] | MobileNetV2_PSPNet | 83.45% IOU | Grapes and leaves dataset | Varying illumination; different capturing angles
[39] | ASA | 96.05% precision and 92.02% recall | PlantVillage dataset [42] and PlantLevel dataset | –
[41] | Image Segmentation Assessment Tool ISAT 1.0 [43] | 94.8% average accuracy | In-house dataset | Shadowing; holes in the leaves


Fig. 9 Indicative grapes and leaves segmentation results from Kalampokas et al. [38]; from left to right: RGB input image and segmented image

The latter is expected to be challenging, since leaf stems are occluded or totally obstructed by the leaves themselves. One approach could be to detect the veins of the leaf and estimate the position of the stem in the direction of the main vein. Grape leaf fingerprinting was conducted by Michels et al. [44]; the research team managed to extract the vein network on the leaves and thus determine the peduncle point as the common start point of the network by using a template matching strategy. The main challenges were the handling of overlapping parts, the manual labelling, and discolourations on the leaves.

Grape Cluster Detection. Detection and localization of grape clusters (bunches) in vineyards is the main subtask for most viticultural practices. Recognition of grape clusters is essential for automating spraying, disease detection, cluster thinning, ripeness estimation, grape quality assessment, and harvest, which is the most time-consuming and labour-intensive task of all. Therefore, many approaches have been proposed in the literature. Table 11 summarizes computer vision-based grape cluster detection methods of the last decade, and Fig. 10 illustrates indicative results for grape cluster detection. The main challenges encountered across all works concern image quality and quantity; a lack of large public datasets is observed. Separating neighbouring clusters is also a challenge, along with occlusions from leaves and shoots, varying illumination conditions, and shadowing.

Grape Stem Detection. In order to automate harvest and green harvest, the exact location of grape stems needs to be detected for precise grape picking. Grapes are soft and delicate and cannot be removed by grasping and pulling without being damaged; the most effective alternative is cutting them from the stem. Therefore, much research has been conducted on grape stem detection, summarized in Table 12. The main difficulties are due to environmental noise and occlusions, and to the thin structure of the stems, which makes both the annotation and detection tasks challenging. Indicative sample results from grape stem detection are illustrated in Fig. 11.

Table 11 Computer vision-based grape cluster detection

Refs. | Model | Performance | Dataset | Encountered challenges
[38] | ResNet50_FRRN | 87.89% IOU | Grapes and leaves dataset | Varying illumination; different capturing angles
[45] | CNN | 87.5% accuracy | In-house dataset | Poor quality of input images; overlapping bunches
[46] | RBF-SVM | 88.61% AP and 80.34% average recall (AR) | Israel dataset [47], Iceland dataset [48], Portugal dataset [49], Chile dataset | Difficulty identifying white grapes based on colour features
[50] | YOLO-Grape and Picking points | 93.27% mAP | Grape-PP dataset (part of the Embrapa Wine Grape Instance Segmentation Dataset (WGISD) [51]) | Complex backgrounds
[52] | Mask R-CNN | 91% F1-score for instance grape segmentation | WGISD [51] | Challenging due to human annotations; occlusions and noise
[53] | SVM | 88% accuracy and 91.6% recall | In-house dataset | Only tested on red grapes; not efficient when an area contains multiple bunches
[54] | UNet | 76% IOU | Israel dataset [47] and an in-house dataset | Limited input size; colour-based features make it difficult to discriminate white grapes from leaves
[55] | k-NN | 94% accuracy in red grapes and 83% in white grapes | In-house dataset | High similarity of white grape bunches with the background, including leaves
[56] | Random Forest (RF) | 97.5% accuracy and 90.7% F1-score | In-house dataset | Long computation time; only for daytime grape images
[57] | ANN | 99.40% accuracy | Israel dataset [47] | Limited to images taken from grape gardens; not evaluated for diverse grape colours; segmentation carried out under limited light conditions; grapes were of a specific matureness
[58] | YOLOv5s | 99.40% precision, recall, mAP, and F1-score | Grape-Internet dataset | Light changes, overlaps, shadows, and occlusions
[59] | SSD MobileNet-V1 | 66.96% mAP in 6.29 ms | Grape bunch and vine trunk dataset for deep learning object detection [60] | Small bunches have colour and texture similar to the surrounding foliage; limited growing-stage images; not evaluated for night images with artificial illumination
[61] | YOLOv5x | 82.6% accuracy | In-house dataset | Tested only on white varieties; difficult to detect many bunches (two) in an image
[62] | Mask R-CNN | 60.1% AP on object detection, 59.5% on instance segmentation | In-house dataset | Huge variation of light and field background conditions; variation of camera-lens distance during data collection
[63] | Pixel-wise classification | 90.1% accuracy | BIVcolour dataset and RondoDS dataset | –
[64] | Swin Transformer (SwinGD) | 91.5% mAP in red grapes, 79.8% in green grapes | In-house dataset | Slow inference time
[65] | Swin-transformer-YOLOv5 | 97% mAP and 89% F1-score | Wine grape dataset [66] | Affected by berry matureness; affected by light directions
[67] | AdaBoost classifier | 93.74% average detection rate in 0.59 s | In-house dataset | Multiple overlapping and adjoining clusters could not be detected
[68] | CNN ResNet50 | 99% accuracy for both red and white grapes | In-house dataset | Transfer learning approach not tailored to the problem
[69] | Chan–Vese level set model and morphological processing method | 91.7% accuracy for white grape detection in 0.46 s | In-house dataset | Night images with artificial illumination to avoid natural light errors

202

E. Vrochidou and G. A. Papakostas

Fig. 10 Indicative sample images of grape cluster detection results from Wang et al. [64]

Table 12 Computer vision-based grape stem detection

Refs. | Model | Performance | Dataset | Encountered challenges
[50] | YOLO-Grape and Picking points | 31.63 pixel standard deviation | Grape-PP dataset (part of the Embrapa Wine Grape Instance Segmentation Dataset (WGISD) [51]) | Grape stem not always visible
[69] | Hough straight line detection method | 80%–92.5% accuracy on picking point calculation | In-house dataset | Night images with artificial illumination to avoid natural light errors
[70] | Unet_MobilenetV2 | 98.90% IOU | Stem dataset | Shades and occlusions of stems
[71] | Geometric calculations | 87% average detection | In-house dataset | Occlusions of stems by objects or overlapping clusters
[72] | Geometric calculations | 88.33% average recognition | In-house dataset | Changing illumination; grape clusters obscured by leaves; unripe berries mixed with ripe clusters reduce the robustness of the segmentation algorithm; short stems not detected
[73] | Hough line fitting | 80% accuracy | In-house dataset | Detection in artificial lighting
[74] | Image operations | 82%–92% detection rate | In-house dataset | Complex backgrounds can cause incisions on grape stems; sunny-day exposure makes image details fade
[75] | Mask R-CNN | 91.03% detection accuracy | WGISD [51] | Only for a simple screen construction


Fig. 11 Indicative sample images of grape stem detection results from Kalampokas et al. [70]; from left to right: original image, ground truth, and final stem segmentation

Graft Union Detection. The graft union is the part of the trunk just above the surface of the soil where the scion and rootstock are united by grafting and start growing together [76]. Detection of graft unions is useful for automated trunk width measurements to track vine development and, in the case of viticultural practice automation, to define the near-root area for precision fertilizing. While graft union detection has been implemented for other crops, e.g. apples [77], it has never been investigated for vines.

Suckers Detection. Suckers are weak shoots (offshoots) located on vine trunks. Suckers need to be removed in shoot thinning so as to make room for healthy fruitful shoots to fully develop. Sucker detection has never been implemented by computer vision for precision viticulture. The detection of suckers can be included in the detection of shoots in general, with the difference that it is performed in the area of the trunk instead of the cordon's area.

4.2 Structural Elements Detection

Structural elements of vineyards include trellis wires and posts (or poles). There are usually three metal wires: the lowest is the cordon wire and the highest is the catch wire. Line posts (wooden, concrete, or metal) are placed vertically between the trellis wires to support the canopy. The posts and trellis wires have simple structures; therefore, they can be detected with standard computer vision methods. In [17], posts and wires in a vineyard were detected as part of a pruning methodology. Wires were detected by matching straight line models to neighbouring wire pixels and then merging collinear parts (Fig. 12). Posts were detected by applying large filters to detect vertical edges, followed by the Hough transform. For the classification, an SVM was used. The authors concluded that wire segmentation could be challenging due to variations in lighting, shadowing, and demosaicing artefacts created around the thin wire structures. Results are included in Table 13.
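To make this classical pipeline concrete, the following is a minimal sketch using OpenCV: a probabilistic Hough transform for the thin, near-horizontal wires and vertical-edge filtering plus the standard Hough transform for the posts. All thresholds and kernel sizes here are illustrative assumptions, not values from [17].

```python
# Minimal sketch of wire and post detection; parameters are illustrative.
import cv2
import numpy as np

def detect_wires_and_posts(bgr):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)

    # Wires: edge map followed by probabilistic Hough; keep near-horizontal
    # segments, which could then be merged into full wires and verified
    # with a classifier such as the SVM used in [17].
    edges = cv2.Canny(gray, 50, 150)
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                               threshold=40, minLineLength=60, maxLineGap=15)
    wires = []
    if segments is not None:
        for x1, y1, x2, y2 in segments[:, 0]:
            angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
            if angle < 10 or angle > 170:   # near-horizontal only
                wires.append((x1, y1, x2, y2))

    # Posts: emphasise vertical edges with a large Sobel kernel in x,
    # then vote for near-vertical lines with the standard Hough transform.
    vert = cv2.convertScaleAbs(cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=7))
    _, vmask = cv2.threshold(vert, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    lines = cv2.HoughLines(vmask, 1, np.pi / 180, 200)
    posts = []
    if lines is not None:
        for rho, theta in lines[:, 0]:
            if abs(theta) < np.radians(10):  # theta near 0 means vertical
                posts.append((rho, theta))
    return wires, posts
```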


Fig. 12 Vine and structural elements detection from Botterill et al. [17]; canes are marked in yellow, head region is marked in red, and trellis wires are marked in orange

Table 13 Computer vision-based vine structural elements detection

| Ref. | Model | Performance | Dataset | Encountered challenges |
|---|---|---|---|---|
| [17] | Linear SVM for wires classification | 95.5% classification accuracy | Vineyard-scale datasets | Wires segmentation |

4.3 Supplementary Detection Subtasks

In vine crop management, monitoring of the vine's health allows for early detection of deficiencies, pests, diseases, etc., which could drive targeted viticultural practices to control the problem. Chemical and molecular methods are implemented in laboratories, or problems may be confirmed visually through on-site inspection of the entire vineyard, which is time-consuming and error-prone. Therefore, computer vision-based monitoring is an efficient, non-destructive, quick, and economic alternative that has been adopted in viticultural settings over the past few years [2, 5, 78].

Nutrients Detection. Nutrient detection is essential to the quality of the final harvested grapes. Nutrients depend on soil, weather, and irrigation and are strongly connected with fertilizing practices. Macro- and micro-nutrients can be detected with near-infrared (NIR) spectroscopy and chemometrics [79, 80]. The diagnosis of nutrient deficiencies can be automated by computer vision algorithms. Nitrogen and potassium are the two most important nutrients for a vine, and their deficiency can be visible in vine leaves. In [81], an image processing methodology for potassium level classification in grape leaves was proposed. The authors employed a k-NN for segmentation and concluded that working with colour data could lead to efficient nutrient deficiency diagnosis from leaf images. In [82], aerial multispectral images were used to estimate the grapevine leaf nitrogen concentration using five machine learning classifiers, with SVM reaching an 80.85% F1-score. In [83], real-time detection of potassium deficiency in red grape leaves was performed using an SVM and a CNN model, resulting in 80% detection accuracy with ResNet50.
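As an illustration of the colour-based k-NN idea in [81], the sketch below classifies each leaf pixel as healthy or deficient from its RGB value and aggregates the result into a deficiency ratio. The training colours and the two-class setup are illustrative assumptions, not data from the cited work.

```python
# Minimal sketch: pixel-wise k-NN colour classification of a leaf image.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# (R, G, B) training samples: 0 = healthy green tissue, 1 = yellowing
# (chlorotic) tissue typical of potassium deficiency at leaf margins.
# These colours are placeholders, not values from [81].
X_train = np.array([[ 40, 120,  35], [ 55, 140,  50], [ 30, 100,  25],
                    [200, 190,  60], [210, 200,  80], [180, 170,  50]])
y_train = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

def deficiency_ratio(leaf_rgb):
    """Fraction of leaf pixels labelled as deficient by the k-NN model."""
    pixels = leaf_rgb.reshape(-1, 3)
    labels = knn.predict(pixels)
    return labels.mean()   # labels are 0/1, so the mean is the ratio
```

A simple decision rule could then map this ratio to a coarse deficiency level, which is roughly the role of the classification stage in colour-based approaches of this kind.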


Ripeness Estimation. Ripeness estimation is crucial for optimizing harvest time and harvest quality. Wine quality is often limited by the degree of grape uniformity at harvest. Machine vision has been introduced for grape ripeness estimation as an efficient alternative to chemical analyses and on-site sampling towards a homogeneous maturity level at harvest time. Vrochidou et al. [29] reviewed machine vision techniques for ripeness estimation in viticultural applications, including colour imaging, hyperspectral imaging, and NIR spectroscopy, revealing challenges, limitations, and future potentials.

Quality Assessment. A Scopus search on grape quality assessment over the last decade returned over 70 documents. This reflects the fact that quality assessment includes the detection of defective clusters (rotten, immature, diseased, non-uniform) [84], as well as the detection of size, weight, and firmness [85]; it is therefore also strongly related to harvest, green harvest, disease detection, and yield estimation. Most research in this area focuses on image processing of individual grapes in the laboratory; in-field applications, however, need to deal with variable lighting conditions and factors such as occlusion of the grapes within the cluster and by the dense foliage. Moreover, although significant progress has been made, a practical in-field method for quick and accurate measurement of grape size and weight has not yet been reported. Such a method would allow winegrowers to assess the ripeness level and quality of their production in the entire vineyard several times during the growing season.

Water Stress Detection. Water stress detection can determine optimal irrigation planning. Water stress can be detected in vineyards by using appropriate sensory data from field sensors that monitor crop status (e.g. trunk growth, berry growth), meteorological variables (e.g. temperature, humidity, wind), and other field measurements (e.g. daily water input) [86]. Computer vision offers a more simplified alternative, using aerial thermal, multispectral, and hyperspectral images [82, 87–90] to detect water stress in vineyards. Remote sensing can also address the uniform irrigation practice of most vineyards, which does not consider the needs of individual plants.
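Thermal approaches such as [87] commonly build on the Crop Water Stress Index (CWSI); one common empirical formulation (one of several variants in the literature) is:

```latex
\[
\mathrm{CWSI} = \frac{T_c - T_{\text{wet}}}{T_{\text{dry}} - T_{\text{wet}}}
\]
```

where $T_c$ is the canopy temperature measured from the thermal image, and $T_{\text{wet}}$ and $T_{\text{dry}}$ are reference temperatures of a fully transpiring and a non-transpiring canopy, respectively; the index approaches 1 as water stress becomes severe.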


Disease Detection. There are numerous grapevine diseases, caused by viruses, fungi, and other micro-organisms, that may affect the crop's growth and production. The three most common grape diseases are downy mildew, powdery mildew, and grey mould. Diseases may be detected on the leaves or on the grape clusters. Early disease detection and prevention are necessary to avoid yield and economic losses. Therefore, a variety of methodology schemes have been developed for grape disease detection. Extensive research has been conducted on remote sensing disease detection in vineyards; a Scopus search over the last decade returned 80 documents in this field. Computer vision-based disease detection could be incorporated into the automation of harvest and green harvest, so as to remove the diseased clusters, as well as into spraying, by first determining the infected zones for targeted pesticide application [91–94] (a minimal classification sketch is given at the end of this subsection). This research field still has room for improvement. In-field detection of diseases is challenging, as grapes may exhibit different signs and symptoms of the same disease depending on the grape cultivar and the development stage of the disease. Additionally, the crop may be affected by more than one disease. Current research still does not address the detection of multiple concurrent diseases or the identification of a disease at various stages of its development from in-field images.

Pest Detection. Pests are insects that can deteriorate the yield. Pest detection can aim towards early prediction and the spraying of the appropriate kind of pesticides. Pests are handled like diseases; therefore, their detection is usually included in disease detection [95, 96].

Weed Detection. Computer vision-based weed detection is usually performed on aerial images. Detection from the ground is challenging because the crop has the same characteristics, i.e. size and colour, as the weeds. Cynodon dactylon is the most common weed competing with the cultivation, and much research focuses on computer vision-based Cynodon dactylon detection from both aerial [97, 98] and ground images [23].
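As a concrete reference point for the CNN-based disease classifiers discussed above, the following is a minimal transfer-learning sketch in PyTorch. The four class names, the choice of ResNet50, and all hyperparameters are illustrative assumptions, not the setup of any cited work.

```python
# Minimal sketch: fine-tuning a pretrained CNN for grape leaf disease classes.
import torch
import torch.nn as nn
from torchvision import models

# Hypothetical class list covering the three most common diseases above.
CLASSES = ["healthy", "downy_mildew", "powdery_mildew", "grey_mould"]

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():          # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))  # new class head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One optimisation step on a batch of leaf images (N, 3, 224, 224)."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```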

5 Discussion

The conducted review revealed that there is still room for improvement in computer vision for viticulture. As observed from the tables of reported performances of computer vision algorithms, different evaluation metrics are reported by different researchers. The selection of a unified metric, or the normalization of metrics, could allow for fair comparisons between different methodologies (the standard definitions are recalled below). The comparison, however, also depends on the datasets used. From the tables again, it is clear that public viticultural datasets are very limited. Most of the data used in research are in-house data and may be available upon request or not available at all. Large datasets including multiple images along with their annotations could save time and effort for researchers and could facilitate fair comparison of algorithm performances. Public datasets, moreover, need to include a large number of images of grapevine elements under varying illumination and shadowing, in different seasons, and at different growing stages, since environmental noise, occlusions from leaves, branches, and foliage, and varying lighting conditions are the main reported obstacles for computer vision detection algorithms. Image quality also has a great impact on the performance of computer vision detection algorithms. High image quality, however, needs to be balanced against the low cost of the capturing systems; low-cost and robust vision systems are desirable.
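For reference, the metrics most frequently reported in the tables above are standardly defined as:

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2\,\text{Precision}\cdot\text{Recall}}{\text{Precision} + \text{Recall}}, \qquad
\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}
\]
```

where TP, FP, and FN count true positive, false positive, and false negative detections, and A and B denote the predicted and ground-truth regions; mAP averages the area under the precision-recall curve over classes. Reporting at least one metric from this shared set alongside any study-specific measure would make results directly comparable.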


Hyperspectral images may reveal more information in cases such as disease detection; however, the current cost of hyperspectral cameras is much higher than that of a conventional digital colour camera. Hopefully, recent advances in multispectral cameras for the fast-developing aerial remote sensing market indicate that lower-cost multi-waveband cameras will be available soon. As costs decrease and data collection becomes easier, winegrowers will increasingly adopt computer vision technologies in their vineyards. Regarding the current status of precision viticultural automation, it should be noted that only six out of the 16 defined annual practices have been fully automated in the last decade: pruning, shoot thinning, weeding, leaf thinning, cluster thinning, and harvest. Regarding computer vision-based subtasks, most of the work has been done on grape cluster detection, ripeness estimation, and disease detection. Moreover, there are subtasks that have never been investigated before, such as graft union and sucker detection. The research gaps identified in this work, for all viticultural practices, are summarized in Table 2 to guide potential future research in this field.

6 Conclusions

This paper comprises a comprehensive review of the current status of computer vision in viticulture. The latest developments in the automation of viticultural practices were presented, following the annual viticultural practices logbook. The research revealed that currently only six out of the 16 defined annual practices have been fully automated in the last decade: pruning, shoot thinning, weeding, leaf thinning, cluster thinning, and harvest. Subsequently, all practices were broken down into computer vision-related subtasks, regarding the detection of the basic parts of the vine (shoots, canes, etc.), the detection of the structural elements of a vineyard, and supplementary detection subtasks that complement the basic practices. The defined subtasks form a set of preliminary computer vision subtasks that need to be implemented, by priority, towards the overall automation of the basic viticultural practices. The research also revealed that most of the work has been done on grape cluster detection, ripeness estimation, and disease detection, while there are subtasks that have never been investigated before, such as graft union and sucker detection. The research gaps stemming from this work could take us one step closer to the development of practical automation systems for the smart vineyards of the future.


References

1. Creasy GL (2017) Viticulture: grapevines and their management. In: Encyclopedia of applied plant sciences, pp 281–288. Elsevier. https://doi.org/10.1016/B978-0-12-394807-6.00240-9
2. Seng KP, Ang L-M, Schmidtke LM, Rogiers SY (2018) Computer vision and machine learning for viticulture technology. IEEE Access 6:67494–67510. https://doi.org/10.1109/ACCESS.2018.2875862
3. Pádua L, Marques P, Hruška J, Adão T, Peres E, Morais R, Sousa J (2018) Multi-temporal vineyard monitoring through UAV-based RGB imagery. Remote Sens 10:1907. https://doi.org/10.3390/rs10121907
4. Roure F, Moreno G, Soler M, Faconti D, Serrano D, Astolfi P, Bardaro G, Gabrielli A, Bascetta L, Matteucci M (2018) GRAPE: ground robot for vineyard monitoring and protection. In: Advances in intelligent systems and computing, pp 249–260. https://doi.org/10.1007/978-3-319-70833-1_21
5. Mohimont L, Alin F, Rondeau M, Gaveau N, Steffenel LA (2022) Computer vision and deep learning for precision viticulture. Agronomy 12:2463. https://doi.org/10.3390/agronomy12102463
6. Gutiérrez-Gamboa G, Zheng W, Martínez de Toda F (2021) Current viticultural techniques to mitigate the effects of global warming on grape and wine quality: a comprehensive review. Food Res Int 139:109946. https://doi.org/10.1016/j.foodres.2020.109946
7. Allegro G, Martelli R, Valentini G, Pastore C, Mazzoleni R, Pezzi F, Filippetti I (2022) Effects of mechanical winter pruning on vine performances and management costs in a Trebbiano Romagnolo vineyard: a five-year study. Horticulturae 9:21. https://doi.org/10.3390/horticulturae9010021
8. The vineyard magazine: mechanical pre-pruning. https://www.vineyardmagazine.co.uk/machinery/mechanical-pre-pruning/
9. Main GL, Morris JR (2008) Impact of pruning methods on yield components and juice and wine composition of Cynthiana grapes. Am J Enol Vitic 59:179–187. https://doi.org/10.5344/ajev.2008.59.2.179
10. Jackson RS (2014) Vineyard practice. In: Wine science, pp 143–306. Elsevier. https://doi.org/10.1016/B978-0-12-381468-5.00004-X
11. Andújar D, Dorado J, Fernández-Quintanilla C, Ribeiro A (2016) An approach to the use of depth cameras for weed volume estimation. Sensors 16:972. https://doi.org/10.3390/s16070972
12. Ogawa Y, Kondo N, Monta M, Shibusawa S (2006) Spraying robot for grape production. In: Field and service robotics, pp 539–548. Springer-Verlag, Berlin/Heidelberg. https://doi.org/10.1007/10991459_52
13. Ivanišević D, Kalajdžić M, Drenjančević M, Puškaš V, Korać N (2020) The impact of cluster thinning and leaf removal timing on the grape quality and concentration of monomeric anthocyanins in Cabernet-Sauvignon and Probus (Vitis vinifera L.) wines. OENO One 54:63–74. https://doi.org/10.20870/oeno-one.2020.54.1.2505
14. Sivilotti P, Falchi R, Herrera JC, Škvarč B, Butinar L, Sternad Lemut M, Bubola M, Sabbatini P, Lisjak K, Vanzo A (2017) Combined effects of early season leaf removal and climatic conditions on aroma precursors in Sauvignon Blanc grapes. J Agric Food Chem 65:8426–8434. https://doi.org/10.1021/acs.jafc.7b03508
15. Korkutal İ, Bahar E, Zinni A (2021) Determination the effects of leaf removal and topping at different times on the grape berry. J Inst Sci Technol, pp 1–9. https://doi.org/10.21597/jist.785219
16. Huffman WE (2014) Agricultural labor: demand for labor. In: Encyclopedia of agriculture and food systems, pp 105–122. Elsevier. https://doi.org/10.1016/B978-0-444-52512-3.00100-5
17. Botterill T, Paulin S, Green R, Williams S, Lin J, Saxton V, Mills S, Chen X, Corbett-Davies S (2017) A robot system for pruning grape vines. J Field Robot 34:1100–1122. https://doi.org/10.1002/rob.21680


18. Fernandes M, Scaldaferri A, Fiameni G, Teng T, Gatti M, Poni S, Semini C, Caldwell D, Chen F (2021) Grapevine winter pruning automation: on potential pruning points detection through 2D plant modeling using grapevine segmentation. In: 2021 IEEE 11th annual international conference on CYBER technology in automation, control, and intelligent systems (CYBER), pp 13–18. IEEE. https://doi.org/10.1109/CYBER53097.2021.9588303
19. Yang Q, Yuan Y, Chen Y, Xun Y (2022) Method for detecting 2D grapevine winter pruning location based on thinning algorithm and lightweight convolutional neural network. Int J Agric Biol Eng 15:177–183. https://doi.org/10.25165/j.ijabe.20221503.6750
20. Majeed Y, Karkee M, Zhang Q, Fu L, Whiting MD (2021) Development and performance evaluation of a machine vision system and an integrated prototype for automated green shoot thinning in vineyards. J Field Robot 38:898–916. https://doi.org/10.1002/rob.22013
21. Majeed Y, Karkee M, Zhang Q, Fu L, Whiting MD (2020) Determining grapevine cordon shape for automated green shoot thinning using semantic segmentation-based deep learning networks. Comput Electron Agric 171:105308. https://doi.org/10.1016/j.compag.2020.105308
22. Majeed Y, Karkee M, Zhang Q, Fu L, Whiting MD (2019) A study on the detection of visible parts of cordons using deep learning networks for automated green shoot thinning in vineyards. IFAC-PapersOnLine 52:82–86. https://doi.org/10.1016/j.ifacol.2019.12.501
23. Kateris D, Kalaitzidis D, Moysiadis V, Tagarakis AC, Bochtis D (2021) Weed mapping in vineyards using RGB-D perception. In: The 13th EFITA international conference, p 30. MDPI, Basel, Switzerland. https://doi.org/10.3390/engproc2021009030
24. Vrochidou E, Tziridis K, Nikolaou A, Kalampokas T, Papakostas GA, Pachidis TP, Mamalis S, Koundouras S, Kaburlasos VG (2021) An autonomous grape-harvester robot: integrated system architecture. Electronics 10:1056. https://doi.org/10.3390/electronics10091056
25. Asefpour Vakilian K, Massah J (2017) A farmer-assistant robot for nitrogen fertilizing management of greenhouse crops. Comput Electron Agric 139:153–163. https://doi.org/10.1016/j.compag.2017.05.012
26. Vrochidou E, Oustadakis D, Kefalas A, Papakostas GA (2022) Computer vision in self-steering tractors. Machines 10:129. https://doi.org/10.3390/machines10020129
27. Reiser D, Sehsah E-S, Bumann O, Morhard J, Griepentrog H (2019) Development of an autonomous electric robot implement for intra-row weeding in vineyards. Agriculture 9:18. https://doi.org/10.3390/agriculture9010018
28. Pulko B, Frangež M, Valdhuber J (2022) The impact of shoot topping intensity on grape ripening and yield of 'Chardonnay'. Agricultura 19:29–35. https://doi.org/10.18690/agricultura.19.2.29-35.2022
29. Vrochidou E, Bazinas C, Manios M, Papakostas GA, Pachidis TP, Kaburlasos VG (2021) Machine vision for ripeness estimation in viticulture automation. Horticulturae 7:282. https://doi.org/10.3390/horticulturae7090282
30. Guadagna P, Frioni T, Chen F, Delmonte AI, Teng T, Fernandes M, Scaldaferri A, Semini C, Poni S, Gatti M (2021) Fine-tuning and testing of a deep learning algorithm for pruning regions detection in spur-pruned grapevines. In: Precision agriculture '21, pp 147–153. Wageningen Academic Publishers, The Netherlands. https://doi.org/10.3920/978-90-8686-916-9_16
31. Aguiar AS, Santos FND, De Sousa AJM, Oliveira PM, Santos LC (2020) Visual trunk detection using transfer learning and a deep learning-based coprocessor. IEEE Access 8:77308–77320. https://doi.org/10.1109/ACCESS.2020.2989052
32. Santos L, Aguiar A, Santos F (2021) VineSet: vine trunk image/annotation dataset. https://zenodo.org/record/5362354#.Y8ElAnZByMo
33. Badeka E, Kalampokas T, Vrochidou E, Tziridis K, Papakostas GA, Pachidis TP, Kaburlasos VG (2021) Vision-based vineyard trunk detection and its integration into a grapes harvesting robot. Int J Mech Eng Robot Res, pp 374–385. https://doi.org/10.18178/ijmerr.10.7.374-385
34. Aguiar AS, Monteiro NN, dos Santos FN, Solteiro Pires EJ, Silva D, Sousa AJ, Boaventura-Cunha J (2021) Bringing semantics to the vineyard: an approach on deep learning-based vine trunk detection. Agriculture 11:131. https://doi.org/10.3390/agriculture11020131
35. Alibabaei K, Assunção E, Gaspar PD, Soares VNGJ, Caldeira JMLP (2022) Real-time detection of vine trunk for robot localization using deep learning models developed for edge TPU devices. Future Internet 14:199. https://doi.org/10.3390/fi14070199


36. Liu S, Tang J, Cossell S, Whitty M (2015) Detection of shoots in vineyards by unsupervised learning with over the row computer vision system. In: Australasian conference on robotics and automation, ACRA, p 128492
37. Liu S, Cossell S, Tang J, Dunn G, Whitty M (2017) A computer vision system for early stage grape yield estimation based on shoot detection. Comput Electron Agric 137:88–101. https://doi.org/10.1016/j.compag.2017.03.013
38. Kalampokas T, Tziridis K, Nikolaou A, Vrochidou E, Papakostas GA, Pachidis T, Kaburlasos VG (2020) Semantic segmentation of vineyard images using convolutional neural networks. In: 21st international conference on engineering applications of neural networks (EANN 2020), pp 292–303. https://doi.org/10.1007/978-3-030-48791-1_22
39. Shantkumari M, Uma SV (2021) Grape leaf segmentation for disease identification through adaptive Snake algorithm model. Multimed Tools Appl 80:8861–8879. https://doi.org/10.1007/s11042-020-09853-y
40. Shantkumari M, Uma SV (2019) Adaptive machine learning approach for grape leaf segmentation. In: 2019 international conference on smart systems and inventive technology (ICSSIT), pp 482–487. IEEE. https://doi.org/10.1109/ICSSIT46314.2019.8987971
41. Pereira CS, Morais R, Reis MJCS (2018) Pixel-based leaf segmentation from natural vineyard images using color model and threshold techniques. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp 96–106. https://doi.org/10.1007/978-3-319-93000-8_12
42. Tairu Oluwafemi E, PlantVillage dataset. https://www.kaggle.com/datasets/emmarex/plantdisease
43. Mazhurin A, Kharma N: An image segmentation assessment tool ISAT 1.0. In: Proceedings of the international conference on computer vision theory and applications, pp 436–443. SciTePress—Science and Technology Publications. https://doi.org/10.5220/0004216404360443
44. Michels DL, Giesselbach SA, Werner T, Steinhage V (2013) On feature extraction for fingerprinting grapevine leaves. In: Proceedings of the 2013 international conference on image processing, computer vision, and pattern recognition, IPCV 2013, pp 1–6
45. Marani R, Milella A, Petitti A, Reina G (2019) Deep learning-based image segmentation for grape bunch detection. In: Precision agriculture '19, pp 791–797. Wageningen Academic Publishers, The Netherlands. https://doi.org/10.3920/978-90-8686-888-9_98
46. Pérez-Zavala R, Torres-Torriti M, Cheein FA, Troni G (2018) A pattern recognition strategy for visual grape bunch detection in vineyards. Comput Electron Agric 151:136–149. https://doi.org/10.1016/j.compag.2018.05.019
47. Berenstein R, Shahar OB, Shapiro A, Edan Y (2010) Grape clusters and foliage detection algorithms for autonomous selective vineyard sprayer. Intell Serv Robot 3:233–243. https://doi.org/10.1007/s11370-010-0078-z
48. Škrabanek P, Runarsson TP (2015) Detection of grapes in natural environment using support vector machine classifier. In: Mendel, pp 143–150
49. Reis MJCS, Morais R, Peres E, Pereira C, Contente O, Soares S, Valente A, Baptista J, Ferreira PJSG, Bulas Cruz J (2012) Automatic detection of bunches of grapes in natural environment from color images. J Appl Log 10:285–290. https://doi.org/10.1016/j.jal.2012.07.004
50. Zhao R, Zhu Y, Li Y (2022) An end-to-end lightweight model for grape and picking point simultaneous detection. Biosyst Eng 223:174–188. https://doi.org/10.1016/j.biosystemseng.2022.08.013
51. Santos T (2019) Embrapa wine grape instance segmentation dataset—Embrapa WGISD. https://zenodo.org/record/3361736#.Y9VVeHZByMo
52. Santos TT, de Souza LL, dos Santos AA, Avila S (2020) Grape detection, segmentation, and tracking using deep neural networks and three-dimensional association. Comput Electron Agric 170:105247. https://doi.org/10.1016/j.compag.2020.105247
53. Liu S, Whitty M (2015) Automatic grape bunch detection in vineyards with an SVM classifier. J Appl Log 13:643–653. https://doi.org/10.1016/j.jal.2015.06.001


54. Mohimont L, Roesler M, Rondeau M, Gaveau N, Alin F, Steffenel LA (2021) Comparison of machine learning and deep learning methods for grape cluster segmentation. In: Communications in computer and information science, pp 84–102. https://doi.org/10.1007/978-3-030-88259-4_7
55. Badeka E, Kalabokas T, Tziridis K, Nicolaou A, Vrochidou E, Mavridou E, Papakostas GA, Pachidis T (2019) Grapes visual segmentation for harvesting robots using local texture descriptors. In: 12th international conference on computer vision systems (ICVS 2019), pp 98–109, Thessaloniki. https://doi.org/10.1007/978-3-030-34995-0_9
56. Chauhan A, Singh M (2022) Computer vision and machine learning based grape fruit cluster detection and yield estimation robot. J Sci Ind Res 81:866–872. https://doi.org/10.56042/jsir.v81i08.57971
57. Behroozi-Khazaei N, Maleki MR (2017) A robust algorithm based on color features for grape cluster segmentation. Comput Electron Agric 142:41–49. https://doi.org/10.1016/j.compag.2017.08.025
58. Zhang C, Ding H, Shi Q, Wang Y (2022) Grape cluster real-time detection in complex natural scenes based on YOLOv5s deep learning network. Agriculture 12:1242. https://doi.org/10.3390/agriculture12081242
59. Aguiar AS, Magalhães SA, dos Santos FN, Castro L, Pinho T, Valente J, Martins R, Boaventura-Cunha J (2021) Grape bunch detection at different growth stages using deep learning quantized models. Agronomy 11:1890. https://doi.org/10.3390/agronomy11091890
60. Aguiar AS (2021) Grape bunch and vine trunk dataset for deep learning object detection. https://zenodo.org/record/5114142#.Y9U-XXZByMo
61. Sozzi M, Cantalamessa S, Cogato A, Kayad A, Marinello F (2022) Automatic bunch detection in white grape varieties using YOLOv3, YOLOv4, and YOLOv5 deep learning algorithms. Agronomy 12:319. https://doi.org/10.3390/agronomy12020319
62. Shen L, Su J, Huang R, Quan W, Song Y, Fang Y, Su B (2022) Fusing attention mechanism with mask R-CNN for instance segmentation of grape cluster in the field. Front Plant Sci 13. https://doi.org/10.3389/fpls.2022.934450
63. Gonzalez-Marquez MR, Brizuela CA, Martinez-Rosas ME, Cervantes H (2020) Grape bunch detection using a pixel-wise classification in image processing. In: 2020 IEEE international autumn meeting on power, electronics and computing (ROPEC), pp 1–6. IEEE. https://doi.org/10.1109/ROPEC50909.2020.9258707
64. Wang J, Zhang Z, Luo L, Zhu W, Chen J, Wang W (2021) SwinGD: a robust grape bunch detection model based on Swin Transformer in complex vineyard environment. Horticulturae 7:492. https://doi.org/10.3390/horticulturae7110492
65. Lu S, Liu X, He Z, Zhang X, Liu W, Karkee M (2022) Swin-transformer-YOLOv5 for real-time wine grape bunch detection. Remote Sens 14:5853. https://doi.org/10.3390/rs14225853
66. Liu X, Wine-grape-dataset. https://github.com/LiuXiaoYu2030/Wine-Grape-Dataset
67. Luo L, Tang Y, Zou X, Wang C, Zhang P, Feng W (2016) Robust grape cluster detection in a vineyard by combining the AdaBoost framework and multiple color components. Sensors 16:2098. https://doi.org/10.3390/s16122098
68. Cecotti H, Rivera A, Farhadloo M, Pedroza MA (2020) Grape detection with convolutional neural networks. Expert Syst Appl 159:113588. https://doi.org/10.1016/j.eswa.2020.113588
69. Xiong J, Liu Z, Lin R, Bu R, He Z, Yang Z, Liang C (2018) Green grape detection and picking-point calculation in a night-time natural environment using a charge-coupled device (CCD) vision sensor with artificial illumination. Sensors 18:969. https://doi.org/10.3390/s18030969
70. Kalampokas T, Vrochidou E, Papakostas GA, Pachidis T, Kaburlasos VG (2021) Grape stem detection using regression convolutional neural networks. Comput Electron Agric 186:106220. https://doi.org/10.1016/j.compag.2021.106220
71. Luo L, Tang Y, Zou X, Ye M, Feng W, Li G (2016) Vision-based extraction of spatial information in grape clusters for harvesting robots. Biosyst Eng 151:90–104. https://doi.org/10.1016/j.biosystemseng.2016.08.026
72. Luo L, Tang Y, Lu Q, Chen X, Zhang P, Zou X (2018) A vision methodology for harvesting robot to detect cutting points on peduncles of double overlapping grape clusters in a vineyard. Comput Ind 99:130–139. https://doi.org/10.1016/j.compind.2018.03.017


73. Xiong J, He Z, Tang L, Lin R, Liu Z (2017) Visual localization of disturbed grape picking point in non-structural environment. Nongye Jixie Xuebao/Trans Chin Soc Agric Mach, issue 4. https://doi.org/10.6041/j.issn.1000-1298.2017.04.003
74. Jin Y, Yu C, Yin J, Yang SX (2022) Detection method for table grape ears and stems based on a far-close-range combined vision system and hand-eye-coordinated picking test. Comput Electron Agric 202:107364. https://doi.org/10.1016/j.compag.2022.107364
75. Wu Z, Xu D, Xia F, Suyin Z (2022) A keypoint-based method for grape stems identification. SSRN Electron J. https://doi.org/10.2139/ssrn.4199859
76. Rasool A, Mansoor S, Bhat KM, Hassan GI, Baba TR, Alyemeni MN, Alsahli AA, El-Serehy HA, Paray BA, Ahmad P (2020) Mechanisms underlying graft union formation and rootstock scion interaction in horticultural plants. Front Plant Sci 11. https://doi.org/10.3389/fpls.2020.590847
77. Sun X, Fang W, Gao C, Fu L, Majeed Y, Liu X, Gao F, Yang R, Li R (2022) Remote estimation of grafted apple tree trunk diameter in modern orchard with RGB and point cloud based on SOLOv2. Comput Electron Agric 199:107209. https://doi.org/10.1016/j.compag.2022.107209
78. Whalley J, Shanmuganathan S (2013) Applications of image processing in viticulture: a review. In: Piantadosi J, Anderssen RS, Boland J (eds) MODSIM2013, 20th international congress on modelling and simulation. Modelling and Simulation Society of Australia and New Zealand (MSSANZ), Inc. https://doi.org/10.36334/modsim.2013.B1.whalley
79. Cuq S, Lemetter V, Kleiber D, Levasseur-Garcia C (2020) Assessing macro- (P, K, Ca, Mg) and micronutrient (Mn, Fe, Cu, Zn, B) concentration in vine leaves and grape berries of Vitis vinifera by using near-infrared spectroscopy and chemometrics. Comput Electron Agric 179:105841. https://doi.org/10.1016/j.compag.2020.105841
80. Anderson G, van Aardt J, Bajorski P, Vanden Heuvel J (2016) Detection of wine grape nutrient levels using visible and near infrared 1nm spectral resolution remote sensing. Presented May 17. https://doi.org/10.1117/12.2227720
81. Rangel BMS, Fernandez MAA, Murillo JC, Pedraza Ortega JC, Arreguin JMR (2016) KNN-based image segmentation for grapevine potassium deficiency diagnosis. In: 2016 international conference on electronics, communications and computers (CONIELECOMP), pp 48–53. IEEE. https://doi.org/10.1109/CONIELECOMP.2016.7438551
82. Moghimi A, Pourreza A, Zuniga-Ramirez G, Williams LE, Fidelibus MW (2020) A novel machine learning approach to estimate grapevine leaf nitrogen concentration using aerial multispectral imagery. Remote Sens 12:3515. https://doi.org/10.3390/rs12213515
83. Ukaegbu U, Tartibu L, Laseinde T, Okwu M, Olayode I (2020) A deep learning algorithm for detection of potassium deficiency in a red grapevine and spraying actuation using a Raspberry Pi 3. In: 2020 international conference on artificial intelligence, big data, computing and data communication systems (icABCD), pp 1–6. IEEE. https://doi.org/10.1109/icABCD49160.2020.9183810
84. Kalampokas T, Vrochidou E, Papakostas GA (2022) Machine vision for grape cluster quality assessment. In: 2022 international conference on applied artificial intelligence and computing (ICAAIC), pp 916–921. IEEE. https://doi.org/10.1109/ICAAIC53929.2022.9792817
85. Palacios D, Tardaguila J (2019) A non-invasive method based on computer vision for grapevine cluster compactness assessment using a mobile sensing platform under field conditions. Sensors 19:3799. https://doi.org/10.3390/s19173799
86. Ohana-Levi N, Zachs I, Hagag N, Shemesh L, Netzer Y (2022) Grapevine stem water potential estimation based on sensor fusion. Comput Electron Agric 198:107016. https://doi.org/10.1016/j.compag.2022.107016
87. Bellvert J, Zarco-Tejada PJ, Girona J, Fereres E (2014) Mapping crop water stress index in a 'Pinot-noir' vineyard: comparing ground measurements with thermal remote sensing imagery from an unmanned aerial vehicle. Precis Agric 15:361–376. https://doi.org/10.1007/s11119-013-9334-5
88. Matese A, Baraldi R, Berton A, Cesaraccio C, Di Gennaro S, Duce P, Facini O, Mameli M, Piga A, Zaldei A (2018) Estimation of water stress in grapevines using proximal and remote sensing methods. Remote Sens 10:114. https://doi.org/10.3390/rs10010114


89. Poblete T, Ortega-Farías S, Ryu D (2018) Automatic coregistration algorithm to remove canopy shaded pixels in UAV-borne thermal images to improve the estimation of crop water stress index of a drip-irrigated Cabernet Sauvignon vineyard. Sensors 18:397. https://doi.org/10.3390/s18020397
90. Zovko M, Žibrat U, Knapič M, Kovačić MB, Romić D (2019) Hyperspectral remote sensing of grapevine drought stress. Precis Agric 20:335–347. https://doi.org/10.1007/s11119-019-09640-2
91. Dwivedi R, Dey S, Chakraborty C, Tiwari S (2021) Grape disease detection network based on multi-task learning and attention features. IEEE Sens J 21:17573–17580. https://doi.org/10.1109/JSEN.2021.3064060
92. Shruthi U, Nagaveni V, Raghavendra BK (2019) A review on machine learning classification techniques for plant disease detection. In: 2019 5th international conference on advanced computing & communication systems (ICACCS), pp 281–284. IEEE. https://doi.org/10.1109/ICACCS.2019.8728415
93. Hasan RI, Yusuf SM, Alzubaidi L (2020) Review of the state of the art of deep learning for plant diseases: a broad analysis and discussion. Plants 9:1302. https://doi.org/10.3390/plants9101302
94. Rajpal N (2020) Black rot disease detection in grape plant (Vitis vinifera) using colour based segmentation & machine learning. In: 2020 2nd international conference on advances in computing, communication control and networking (ICACCCN), pp 976–979. IEEE. https://doi.org/10.1109/ICACCCN51052.2020.9362812
95. Zhu J, Wu A, Wang X, Zhang H (2020) Identification of grape diseases using image analysis and BP neural networks. Multimed Tools Appl 79:14539–14551. https://doi.org/10.1007/s11042-018-7092-0
96. Gutiérrez S, Hernández I, Ceballos S, Barrio I, Díez-Navajas AM, Tardaguila J (2021) Deep learning for the differentiation of downy mildew and spider mite in grapevine under field conditions. Comput Electron Agric 182:105991. https://doi.org/10.1016/j.compag.2021.105991
97. de Castro AI, Peña JM, Torres-Sánchez J, Jiménez-Brenes F, López-Granados F (2017) Mapping Cynodon dactylon in vineyards using UAV images for site-specific weed control. Adv Anim Biosci 8:267–271. https://doi.org/10.1017/S2040470017000826
98. Jiménez-Brenes FM, López-Granados F, Torres-Sánchez J, Peña JM, Ramírez P, Castillejo-González IL, de Castro AI (2019) Automatic UAV-based detection of Cynodon dactylon for site-specific vineyard management. PLoS ONE. https://doi.org/10.1371/journal.pone.0218132

Author Index

A
Ahmed, Shakil, 105, 115, 147
Ali, Shahnewaz, 37, 105, 147

B
Bansal, Jagdish Chand, 1

D
Dilan Daşkın, Zeynep, 131

I
Iqbal, Javaid, 159
Irfanoglu, Bulent, 131

J
Jahan, Musfika, 53

K
Khan, Muhammad Tahir, 159
Khan, Razaullah, 159
Khan, Shahbaz, 159
Khan, Zubair Ahmad, 159
Kumar, Chandan Jyoti, 79

M
Mostafizur Rahman Komol, Md, 37

P
Papakostas, George A., 177
Parameswari, P., 69
Prakash Kumar, K., 91

R
Rajathi, N., 69, 91

S
Sabid Hasan, Md, 37
Shahab Alam, Muhammad, 131
Sharma, Mayuri, 79
Sultana, Nusrat, 53

T
Tufail, Muhammad, 159

U
Uddin, Mohammad Shorif, 1, 19, 53
Umer Khan, Muhammad, 131

V
Vanitha, V., 69, 91
Vrochidou, Eleni, 177

Y
Yogajeeva, K., 69

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. C. Bansal and M. S. Uddin (eds.), Computer Vision and Machine Learning in Agriculture, Volume 3, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-3754-7
