The Impact of Thrust Technologies on Image Processing

English Pages [382] Year 2023

Author / Uploaded
Digvijay Pandey
Rohit Anand
Nidhi Sindhwani
Binay Kumar Pandey
Reecha Sharma
Pankaj Dadheech

Table of contents :
Contents
Preface
Acknowledgments
Chapter 1
The Investigation of Image Processing in Forensic Science
Abstract
1. Introduction
1.1. Computer Tools for Image Processing in Forensic Science
1.2. Softwares Used in Forensic Image Processing
1.2.1. Disk and Data Capture Tools
1.2.2. Autopsy/The Sleuth Kit
1.2.3. X-Ways Forensics
1.2.4. AccessData FTK
1.2.5. EnCase
1.2.6. Mandiant Redline
1.2.7. Paraben Suite
1.2.8. Bulk Extractor
1.2.9. Registry Analysis
1.2.10. Registry Recon
1.2.11. Memory Forensics
1.2.12. Volatility
1.2.13. WindowsSCOPE
1.2.14. Wireshark
1.2.15. NetworkMiner
1.2.16. Xplico
2. Image Reconstruction
3. Mobile Device Forensics
3.1. Oxygen Forensic Detective
3.2. Cellebrite UFED
3.3. XRY
3.4. Linux Distros
3.5. CAINE
3.6. SANS SIFT
3.7. HELIX3
4. Processing of Forensic Digital Image
5. Different Types of Digital Image Evidence
6. Use of Digital Image Forensics Techniques
7. Lifecycle of Digital Image
8. Factors Affecting the Digital Image Lifecycle
9. Core Functionalities of Image Forensics Software
10. Forensic Image Analysis
10.1. Photo Image Comparison
10.2. Image Content Analysis (ICE)
10.3. Image Authentication
10.4. Image Enhancement and Restoration
11. Photogrammetry
12. Pros and Cons of Digital Image Forensics
12.1. Pros
12.2. Cons
13. Image Processing for Digital Forensics
Conclusion and Future Outlook
References
Chapter 2
Integrating IoT Based Security with Image Processing
Abstract
1. Introduction
1.1. Internet of Things
1.1.1. Features of IoT
1.1.2. A Comparison between Traditional Internet and IoT
1.1.3. Pros & Cons of IoT
1.1.4. Applications of IoT
1.1.5. Security Flaws of IoT
1.2. Image Processing
1.2.1. Characteristics of Digital Image Processing (DIP)
1.2.2. Applications of DIP
1.2.3. Architecture/Working
1.3. CBIR
1.3.1. Applications of CBIR
1.4. Camera Surveillance and Machine Vision
1.5. Machine Vision
2. Literature Review
3. Problem Statement
4. Proposed Model
4.1. Role of Canny Edge Detection in Reducing Image Size
5. Results and Discussion
5.1. Comparison of Size during Image Processing in IoT Environment
5.2. Comparison of Time Taken during Image Processing in IoT Environment
Conclusion and Future Outlook
References
Chapter 3
Pattern Analysis for Feature Extraction in Multi-Resolution Images
Abstract
1. Introduction
1.1. Pattern Class
1.2. Analysis
1.3. Pattern Analysis
1.3.1. Pattern Recognition
1.4. Problem Definition
1.5. Pattern Analysis Algorithm
1.6. Feature Extraction
2. Literature Review
3. Implementation
3.1. Sobel Edge Detector
3.2. Prewitt Edge Detector
3.3. Laplacian Edge Detector
3.4. Canny Edge Detector
Conclusion
References
Chapter 4
The Design of Microstrip Patch Antenna for 2.4 GHz IoT Based RFID and Image Identification for Smart Vehicle Registry
Abstract
1. Introduction
2. Smart Vehicle Registry
3. Background
4. Design Parameters
4.1. Design Equation of Inset Feed Microstrip Patch Antenna
4.2. Design of Microstrip
5. Modeling and Analysis
6. Results and Discussion
7. Other Applications
Conclusion and Future Outlook
References
Chapter 5
Kidney Stone Detection from Ultrasound Images Using Masking Techniques
Abstract
1. Introduction
2. Ultrasound Images
3. Contrast Enhancement
4. Proposed Optimum Wavelet Based Masking
4.1. Proposed OWBM Algorithm
5. Cuckoo Search Algorithm
5.1. The Traditional Cuckoo Search Algorithm
5.2. Need of Adaptive Rebuilding of Worst Nests (ARWN)
5.3. Enhanced Cuckoo Search Algorithm
5.4. Image Segmentation
5.5. Thresholding
6. Results and Discussion
Conclusion and Future Outlook
References
Chapter 6
Biometric Technology and Trends
Abstract
1. Introduction
2. History
3. Conventional Biometrics and Modern Age Biometrics Distinguished
4. Biometrics- Trends and Prospects
5. Types of Biometric
6. Trustworthiness and Challenges of Biometrics
7. Challenges and Countermeasures
7.1. Biometric Template Protection
7.2. Error Correction Methods
7.3. Other Non-Cryptographic Approaches
8. Biometric Technology in Different Spheres of Life
8.1. Commercial Applications
8.2. Law Enforcement and Public Security (Criminal/Suspect Identification)
8.3. Military (Enemy/Ally Identification)
8.4. Border, Travel, and Migration Control (Traveler/Migrant/Passenger Identification)
8.5. Civil Identification (Citizen/Resident/Voter Identification)
8.6. Healthcare and Subsidies (Patient/Beneficiary/Healthcare Professional Identification)
8.7. Physical and Logical Access (Owner/User/Employee/Contractor/Partner Identification)
8.8. Technological Utilization
8.8.1. Mobile Phones/Tablets
8.8.2. Laptops/PCs
8.8.3. Automobiles
Conclusion and Future Outlook
References
Chapter 7
Comparison of Digital Image Watermarking Methods: An Overview
Abstract
1. Introduction
2. Watermarking
2.1. Classification of Digital Watermarking Techniques
2.1.1. Robust and Fragile Watermarking
2.1.2. Public and Private Watermarking
2.1.3. Asymmetric and Symmetric Watermarking
2.1.4. Steganographic and Non-Steganographic Watermarking
2.1.5. Visible and Invisible Watermarking
2.2. Requirements
2.3. Techniques
2.4. Applications
2.4.1. Copyright Protection
2.4.2. Copyright Authentication
2.4.3. Fingerprinting and Digital Signatures
2.4.4. Copy Protection and Device Control
2.4.5. Broadcast Monitoring
3. Performance Metrics
3.1. Signal to Noise Ratio (SNR)
3.2. Peak Signal to Noise Ratio (PSNR)
3.3. Weighted Peak Signal to Noise Ratio (WPSNR)
3.4. Effectiveness
3.5. Efficiency
4. Comparison of Watermarking Techniques
Conclusion and Future Outlook
References
Chapter 8
Novel Deep Transfer Learning Models on Medical Images: DINET
Abstract
1. Introduction
1.1. Medical Image Classification: Transfer Learning
2. Literature Review
3. Methodology
3.1. Datasets
3.2. Deep CNN Phase
3.3. Implementation Details
4. Experiment Results
4.1. Implementation Details
4.2. Experiments
4.3. Evaluation Procedures and Techniques
4.4. Results
5. Discussion
Conclusion
Limitations
Future Outlook
References
Chapter 9
A Review of the Application of Deep Learning in Image Processing
Abstract
1. Introduction
1.1. Basic Network Structure: Multi-Layer Perceptron (MLP)
1.2. Convolutional Neural Network (CNN)
2. Network Structure Improvements
2.1. Improvement of Convolutional Neural Network
2.1.1. AlexNet Model
2.1.2. ZFNet Model
2.1.3. Deep Residual Network (ResNet)
2.2. Improvement of Recurrent Neural Network
2.2.1. Long Short-Term Memory Network (LSTM)
2.2.2. Hierarchical RNN
2.2.3. Bi-Directional RNN
2.2.4. Multi-Dimensional RNN
3. Applications of Deep Learning in Image Processing
3.1. Speech Processing
3.2. Computer Vision
3.3. Natural Language Processing
4. Existing Problems and Future Directions of Deep Learning
4.1. Training Problem
4.1.1. The Gradient Disappearance Problem
4.1.2. Use Large-Scale Labelled Training Datasets
4.1.3. Distributed Training Problem
4.2. Landing Problem
4.2.1. Too Many Hyper-Parameters
4.2.2. Reliability is Insufficient
4.2.3. Poor Interpretability
4.2.4. Model Size Is Too Large
4.3. Functional Problem
4.3.1. Lack of Ability to Solve Logical Problems
4.3.2. Small Data Challenges
4.3.3. Unable to Handle Multiple Tasks Simultaneously
4.3.4. Ultimate Algorithm
4.4. Domain Issues
4.4.1. Image Understanding Issues
4.4.2. Natural Language Processing Issues
Conclusion
References
Chapter 10
The Survey and Challenges of Crop Disease Analysis Using Various Deep Learning Techniques
Abstract
1. Introduction
2. Literature Survey
2.1. Leaf Diseases
2.1.1. Grape
2.1.2. Citrus
2.1.3. Apple
2.1.4. Other
3. Four Phase Technique
3.1. Data Collection
3.2. Data Augmentation
3.3. Data Detection and Classification
3.4. Optimization
4. Issues Related to Plant Disease Identification
Conclusion and Future Outlook
References
Chapter 11
Image Processing and Computer Vision: Relevance and Applications in the Modern World
Abstract
1. Introduction
1.1. Digital Image Processing
1.2. Image Acquisition
1.3. Digital Histogram Plots
1.4. Characteristics
2. Image Storage and Manipulation
2.1. Image Segmentation
2.2. Feature Extraction
2.3. Multi-Scale Signal Analysis
2.4. Pattern Recognition
2.5. Projection
3. Image Processing Techniques
3.1. Anisotropic Diffusion
3.2. Hidden Markov Models
3.3. Image Editing
3.4. Image Restoration
3.5. Independent Component Analysis
3.6. Linear Filtering
3.7. Neural Networks
3.8. Point Feature Matching
3.9. Principal Components Analysis
4. Newer Applications of Image Processing
4.1. Active Learning Participation
4.2. Workplace Surveillance
4.3. Building Recognition
4.4. Image Change Detection
4.5. Human Race Detection
4.6. Rusting of Steel
4.7. Object Deformation
4.8. Tourism Management
Conclusion and Future Outlook
References
Chapter 12
Optimization Practices Based on the Environment in Image Processing
Abstract
1. Introduction
2. Environment-Based Optimization Techniques
2.1. Evolutionary Algorithms
2.1.1. Genetic Algorithm (GA)
2.2. Swarm Intelligence Algorithms in Image Processing
2.2.1. Bat Algorithm in Image Processing
2.2.2. Ant Colony Optimization (ACO) in Image Processing
2.2.3. Artificial Bee Colony (ABC) Optimization in Image Processing
2.2.4. Cuckoo Optimization in Image Processing
2.2.5. Firefly Optimization Algorithm in Image Processing
2.2.6. Elephant Herding Optimization (EHO) Algorithm in Image Processing
2.2.7. Grey Wolf Optimization Algorithm in Image Processing
Conclusion
References
Chapter 13
Simulating the Integration of Compression and Deep Learning Approaches in IoT Environments for Security Systems
Abstract
1. Introduction
1.1. Deep Learning
1.2. IoT
1.3. Compression
1.3.1. Lossless Compression
1.3.2. Lossy Compression
1.4. Role of Compression in IoT
1.5. Role of Security in IoT
1.6. Different Types of Cyber Crimes
1.7. Status of Cyber Crimes
1.8. Advantages of Cyber Security
1.9. Disadvantages of Cyber Security
1.10. Different Cyber Attacks
1.10.1. Ransomware
1.10.2. Brute Force Attack
1.10.3. Man-in-the-Middle Attack
1.10.4. Man-in-the-Middle Attack Prevention
1.10.5. SQL Injection
1.10.6. Solution for SQL Injection Attacks
1.11. Role of Encryption in Cyber Security
1.12. Role of Firewall in Cyber Security
1.13. Intrusion Detection System (IDS)
1.14. Role of Machine Learning in Cyber Security
2. Literature Review
3. Problem Statement
4. Proposed Work
4.1. Features of Proposed Model
5. Results and Discussion
5.1. Confusion Matrix of Unfiltered Dataset
5.2. Confusion Matrix of Filtered Dataset
5.3. Comparison Analysis
5.3.1. Accuracy
5.3.2. Precision
5.3.3. Recall Value
5.3.4. F1-Score
Conclusion and Future Outlook
References
Chapter 14
A Review of Various Text Extraction Algorithms for Images
Abstract
1. Introduction
2. Literature Review
2.1. Comprehensive Study of the Existing Work
Conclusion and Future Outlook
References
Chapter 15
Machine Learning in the Detection of Diseases
Abstract
1. Introduction
2. Types of Machine Learning Techniques
3. Machine Learning Algorithms
3.1. K Nearest Neighbor Algorithm (KNN)
3.2. K-Means Clustering Algorithm
3.3. Support Vector Machine
3.4. Naive Bayes Algorithm
3.5. Decision Tree Algorithm
3.6. Logistic Regression (LR)
4. Diagnosis of Diseases by Using Different Machine Learning Algorithms
4.1. Heart Disease
4.1.1. Analysis
4.2. Diabetes Disease
4.2.1. Analysis
4.3. Liver Disease
4.3.1. Analysis
4.4. Dengue
4.4.1. Analysis
4.5. Hepatitis Disease
5. Discussion and Analysis of Machine Learning Techniques
6. Benefits of Machine Learning in Diagnosis of Diseases
6.1. Recognizes and Examines Ailments
6.2. Drug Improvement and Gathering
6.3. Clinical Imaging Diagnostics
6.4. Altered Medicine
6.5. Prosperity Record with Knowledge
6.6. Research and Clinical Fundamentals
6.7. Data Grouping
6.8. Drug Things
7. Challenges of Machine Learning in Detection and Diagnosis of Diseases
7.1. Data Irregularity
7.2. Absence of Qualified Pioneers
7.3. Supplier Scorn
7.4. Data Security
Conclusion and Outlook
References
Chapter 16
Applications for Text Extraction of Complex Degraded Images
Abstract
1. Introduction
1.1. Noise
1.1.1. Gaussian Noise
2. Impulse Noise
2.1. Pre-Processing Methods
3. Blurring
3.1. Gaussian Filter
3.2. Median Filter
4. Thresholding
4.1. Global Thresholding
5. Morphological Operations
5.1. Dilation
5.2. Erosion
5.3. Opening and Closing
6. Methodology
7. Experimental Analysis
8. Results and Discussion
Conclusion and Future Outlook
References
Index
About the Editors

Copyright of this book belongs to Nova Science

Technology in a Globalizing World

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.

Technology in a Globalizing World

Human Capital in the Global Digital Economy
Ashot A. Khachaturyan, PhD (Editor)
2023. ISBN: 979-8-88697-684-7 (Softcover)
2023. ISBN: 979-8-88697-727-1 (eBook)

Sustainable Production: Definitions, Aspects, and Elements
Kamran Kheiralipour (Editor)
2022. ISBN: 979-8-88697-057-9 (Softcover)
2022. ISBN: 979-8-88697-208-5 (eBook)

Facial Recognition Technology: Usage by Federal Law Enforcement
Mari F. Burke (Editor)
2022. ISBN: 979-8-88697-124-8 (Hardcover)
2022. ISBN: 979-8-88697-175-0 (eBook)

Nonlinear Systems: Chaos, Advanced Control and Application Perspectives
Piyush Pratap Singh, PhD (Editor)
2022. ISBN: 978-1-68507-660-3 (Softcover)
2022. ISBN: 979-8-88697-001-2 (eBook)

Multidisciplinary Science and Advanced Technologies
Fernando Gomes, PhD, Kaushik Pal, PhD and Thinakaran Narayanan (Editors)
2021. ISBN: 978-1-53618-959-9 (Hardcover)
2021. ISBN: 978-1-53619-198-1 (eBook)

More information about this series can be found at https://novapublishers.com/product-category/series/technology-in-a-globalizingworld/

Digvijay Pandey, PhD, Rohit Anand, PhD, Nidhi Sindhwani, PhD, Binay Kumar Pandey, Reecha Sharma, PhD and Pankaj Dadheech, PhD

The Impact of Thrust Technologies on Image Processing

Copyright © 2023 by Nova Science Publishers, Inc. DOI: https://doi.org/10.52305/ATJL4552

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher. We have partnered with Copyright Clearance Center to make it easy for you to obtain permissions to reuse content from this publication. Please visit copyright.com and search by Title, ISBN, or ISSN. For further questions about using the service on copyright.com, please contact:

Copyright Clearance Center
Phone: +1-(978) 750-8400
Fax: +1-(978) 750-4470
E-mail: [email protected]

NOTICE TO THE READER The Publisher has taken reasonable care in the preparation of this book but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works. Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the Publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication. This publication is designed to provide accurate and authoritative information with regards to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Library of Congress Cataloging-in-Publication Data ISBN: H%RRN

Published by Nova Science Publishers, Inc. † New York

Contents

Preface

Acknowledgments

Chapter 1
The Investigation of Image Processing in Forensic Science
Vikas Menon, Bhairav Prasad, Mandheer Kaur, Harleen Kaur, Digvijay Pandey and Sanwta Ram Dogiwal

Chapter 2
Integrating IoT Based Security with Image Processing
Vivek Veeraiah, Jay Kumar Pandey, Santanu Das, Dilip Raju, Makhan Kumbhkar, Huma Khan and Ankur Gupta

Chapter 3
Pattern Analysis for Feature Extraction in Multi-Resolution Images
Ashi Agarwal, Arpit Saxena, Digvijay Pandey, Binay Kumar Pandey, A. Shahul Hameed, A. Shaji George and Sanwta Ram Dogiwal

Chapter 4
The Design of Microstrip Patch Antenna for 2.4 GHz IoT Based RFID and Image Identification for Smart Vehicle Registry
Manvinder Sharma, Harjinder Singh, Digvijay Pandey, Binay Kumar Pandey, A. Shaji George and Pankaj Dadheech

Chapter 5
Kidney Stone Detection from Ultrasound Images Using Masking Techniques
Harshita Chaudhary and Binay Kumar Pandey

Chapter 6
Biometric Technology and Trends
Mandeep Kaur, Ritesh Sinha, Syed Ali Rizvi, Divyansh Jain, Harsh Jindal and Digvijay Pandey

Chapter 7
Comparison of Digital Image Watermarking Methods: An Overview
Vibha Aggarwal, Sandeep Gupta, Navjot Kaur, Virinder Kumar Singla and Shipra

Chapter 8
Novel Deep Transfer Learning Models on Medical Images: DINET
Harmanpreet Kaur, Reecha Sharma and Gurpreet Kaur

Chapter 9
A Review of the Application of Deep Learning in Image Processing
Harmanpreet Kaur, Reecha Sharma and Lakwinder Kaur

Chapter 10
The Survey and Challenges of Crop Disease Analysis Using Various Deep Learning Techniques
P. Venkateshwari

Chapter 11
Image Processing and Computer Vision: Relevance and Applications in the Modern World
Sukhvinder Singh Deora and Mandeep Kaur

Chapter 12
Optimization Practices Based on the Environment in Image Processing
Shipra, Priyanka, Navjot Kaur, Sandeep Gupta and Reecha Sharma

Chapter 13
Simulating the Integration of Compression and Deep Learning Approaches in IoT Environments for Security Systems
Harinder Singh, Rohit Anand, Vivek Veeraiah, Veera Talukdar, Suryansh Bhaskar Talukdar, Sushma Jaiswal and Ankur Gupta

Chapter 14
A Review of Various Text Extraction Algorithms for Images
Binay Kumar Pandey, Digvijay Pandey, Vinay Kumar Nassa, A. Shahul Hameed, A. Shaji George, Pankaj Dadheech and Sabyasachi Pramanik

Chapter 15
Machine Learning in the Detection of Diseases
Aakash Joon, Nidhi Sindhwani and Komal Saxena

Chapter 16
Applications for Text Extraction of Complex Degraded Images
Binay Kumar Pandey, Digvijay Pandey, Vinay Kumar Nassa, A. Shaji George, Sabyasachi Pramanik and Pankaj Dadheech

Index

About the Editors

Preface

This book presents novel and sophisticated techniques and approaches for image analysis, and it lays a foundation for further investigation and scientific study in this domain. It begins with a detailed survey of cutting-edge findings on computational effort, illustrating how the methodologies work and highlighting the purposes for which specific methodologies are employed, their effects on images, and their relevant applications. This gives readers an overview of the discipline and a foundation that can serve as the basis for further study in the field. The content has been made readily comprehensible and is demonstrated with various concrete illustrations. The book covers the basics of images, image filtering and enhancement in the spatial and frequency domains, image reconstruction and restructuring, colour image analysis, wavelets and other image transforms, image compression and watermarking, morphology, edge detection using classical methods, segmentation, and image feature extraction. An underlying image reconstruction and coding methodology is also presented, which will benefit image processing experts, graduate students, and engineers. Image processing is the extraction and integration of data from images for use in a wide range of applications. Its implementations are useful across a variety of disciplines; examples include remote sensing, space applications, industrial applications, medical imaging, and military applications. The book's primary objective is to give readers an understanding of image processing and of the technological advances that can be used to enhance an image.
Imaging techniques come in many varieties, including those used for chemical, photonic, thermoelectric, healthcare, and microbiological imaging. Efficient image retrieval requires scanning methods and statistical analysis. Prospective satellite developments will build on the findings of in-depth image analysis. Additionally, this book proposes a novel method for biomedical imaging, imaging using deep learning and machine learning, and 2D and 3D imaging. Beyond 2D and 3D imaging technology, it also covers cloud computing, IoT, digital watermarking, artificial neural networks, feature extraction, and optimization. In this sense, the book is an excellent resource for scientists seeking to develop and broaden their knowledge of, and interest in, emerging aspects of image analysis. As a whole, it serves as a handbook for researchers and experts who want to understand emerging innovations and computational efficiency, as well as how to use watermarking to enhance the security and privacy of the resulting images. The multidisciplinary methodologies of image processing strengthen the protection of cloud, IoT, and Android platforms. The book describes several research studies, introduces different methodologies, and addresses a few relatively new security issues. The suggested layouts and study deployments are presented with suitable design methods and with graphical, visual, and tabular representations. All chapters are organized to be useful for academics, research scholars, and industry professionals, and chapter sub-sections are arranged so that the material flows smoothly and is easier to understand. High-quality images, diagrams, and tables help the reader gain a better understanding of the concepts for future research. Academicians and scientists will obtain comprehensive knowledge of the most recent innovations, security issues, and optimization methods that can be applied to image analysis problems.
Within the healthcare industry, medical and biological image processing is also extremely useful. Furthermore, scientists can build on the book's discussion of numerous new and innovative methodologies. In summary, this book offers a thorough understanding of the numerous operational processes involved in image processing, using a variety of new and meta-heuristic methodologies. It was co-written by a global community of scientists and academicians who provide a variety of perspectives on, and responses to, some of the most pressing research problems.

Chapter 1 is entitled “The Investigation of Image Processing in Forensic Science”. The authors give a brief overview of how images or photographs are prepared in forensic science to improve them in terms of quality, truthfulness, enhancement, and reconstruction so that details can be extracted for analysis, interpretation, and recognition. The chapter also covers the major steps of forensic image analysis: photographic comparison, content analysis, image authentication, image enhancement, and reconstruction.

Chapter 2 is entitled “Integrating IoT Based Security with Image Processing”. It offers insights into image processing, the Internet of Things, and various ways of using image processing to improve security by maintaining camera surveillance together with content-based image retrieval. A content-based search examines an image's actual content instead of the metadata associated with it.

Chapter 3 is entitled “Pattern Analysis for Feature Extraction in Multi-Resolution Images”. Pattern analysis using feature extraction is discussed and implemented in this work. Features carry the crucial data of every image pattern, so informative characteristics such as edges must be extracted and merged for almost any type of pattern analysis or recognition.

Chapter 4 is entitled “The Design of Microstrip Patch Antenna for 2.4 GHz IoT Based RFID and Image Identification for Smart Vehicle Registry”. The authors explain Radio Frequency Identification (RFID), which is widely used in fields such as manufacturing, distribution, and healthcare. Because RFID is widely deployed in the public sector, these systems need low-cost, low-profile antennas. RFID systems are in use in many health facilities, and RFID is regarded as an upcoming technology for monitoring and data collection. The study demonstrates the construction of an inset-feed microstrip patch antenna for 2.4 GHz, designed and tested specifically for RFID applications.

Chapter 5 is entitled “Kidney Stone Detection from Ultrasound Images Using Masking Techniques”.
In this chapter, the authors introduce masking techniques for detecting stones in the kidney. Masking techniques are well-known approaches to contrast enhancement. The image is first converted to grayscale, and then its contrast is increased. Contrast enhancement is performed with Optimum Wavelet-Based Masking (OWBM) using the Enhanced Cuckoo Search Algorithm (ECSA). Image segmentation and image masking are then used to detect stones in the image. The cuckoo search algorithm is used for global contrast enhancement optimization and to optimise the approximation coefficients.
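
The order of operations in the Chapter 5 summary (grayscale conversion, contrast enhancement, then segmentation into a mask) can be sketched in a few lines. This is only an illustration under stated assumptions: it substitutes a plain min-max contrast stretch and a fixed global threshold for the chapter's OWBM and ECSA steps, and the function names, the toy image, and the threshold of 200 are all hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of luminance values (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def stretch_contrast(gray, out_min=0.0, out_max=255.0):
    """Linearly rescale intensities to [out_min, out_max].

    Assumption: a simple min-max stretch stands in for wavelet-based masking.
    """
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:  # flat image: nothing to stretch
        return [[out_min for _ in row] for row in gray]
    scale = (out_max - out_min) / (hi - lo)
    return [[out_min + (v - lo) * scale for v in row] for row in gray]


def threshold_mask(gray, threshold):
    """Binary mask: 1 where the intensity exceeds the threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in gray]


def detect_bright_regions(rgb_image, threshold=200.0):
    """Grayscale -> contrast stretch -> global threshold (hypothetical pipeline)."""
    gray = to_grayscale(rgb_image)
    enhanced = stretch_contrast(gray)
    return threshold_mask(enhanced, threshold)


# Toy 3x3 image with one brighter pixel in the centre.
toy = [
    [(40, 40, 40), (50, 50, 50), (45, 45, 45)],
    [(42, 42, 42), (120, 120, 120), (48, 48, 48)],
    [(41, 41, 41), (47, 47, 47), (44, 44, 44)],
]
mask = detect_bright_regions(toy)
for row in mask:
    print(row)
```

In this toy case only the centre pixel survives the threshold after stretching. A real ultrasound pipeline would need speckle handling and a tuned or optimised enhancement step, which is the role the summary assigns to OWBM and the cuckoo search.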

Chapter 6 is entitled “Biometric Technology and Trends”. The authors provide an overall view of biometrics and describe some of the key research issues that must be tackled for biometric technology to become a suitable and effective tool for data and information security. The overview analyses applications in which biometric scans are used to solve information security issues, lists the underlying challenges that biometric systems face in real-world applications, and seeks solutions to adaptability and security issues in large-scale biometric technologies.

Chapter 7 is entitled “Comparison of Digital Image Watermarking Methods: An Overview”. The authors discuss digital watermarking as one of the most popular technologies for keeping digital media secure on the Internet and outline image watermarks, watermark types, and the importance of watermarking. The chapter introduces several methods for digital image watermarking based on the spatial and frequency domains.

Chapter 8 is entitled “Novel Deep Transfer Learning Models on Medical Images: DINET”. This chapter discusses a novel deep model known as the “Dense Block-Inception Network” (DINET), which demonstrates the efficacy of transfer learning systems on medical images and addresses the scarcity of domain expertise in the medical field, a major challenge in medical image analysis. It also shows that transfer learning with non-medical data can be used to train and fine-tune deep neural networks.

Chapter 9 is entitled “A Review of the Application of Deep Learning in Image Processing”. The main goal of this chapter is to evaluate the most recent research advancements and future scope in deep learning and to discuss three basic deep learning models: Multilayer Perceptrons, Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN).
By delving deeper into deep learning applications in a variety of artificial intelligence realms, such as speech processing, computer vision, and natural language processing, Chapter 10 is entitled “Survey and Challenges of Crop Disease Analysis using Various Deep Learning Techniques”. This chapter discusses crop infections caused by insects, as well as the various methods used to detect crop infections. Eventually, the existing methodologies, constraints, and future perspectives for new and advanced crop infection detection are articulated. Chapter 11 is entitled “Image Processing and Computer Vision: Relevance and Applications in the Modern World”. This chapter describes a dramatic change in the utilisation, storage, interpretation, and sharing of various images following the evolution of various digital devices. These

本书版权归Nova Science所有

Preface

xiii

digital devices have largely replaced the earlier analogue devices used to create images. Digital images can be utilised in current applications such as households, industry, information exchange, sales and marketing, defense, and the medical industry, among others. This chapter has provided a thorough examination of these technologies, their applications in various fields, and their prospects for the future. Chapter 12 is entitled “Optimization Practices Based on Environment in Image Processing” The primary goal of this chapter seems to be to discuss optimization algorithms, which are high-performance techniques that are used to rectify incredibly hard optimization issues like image enhancement, restoration, image segmentation, image edge detection, image generation, image denoising, image pattern recognition, image thresholding, and so on by lowering image noise and blur. Chapter 13 is entitled “Simulating the Integration of Compression and Deep Learning Approach in IoT Environment for Security System” This chapter describes how a compression mechanism has been integrated into a deep learning approach to improve the safety of the IoT environment. Furthermore, deep learning uses usage filtration datasets to enhance attack classification accuracy. Current research can recognize and categories numerous attacks such as brute force, man in the middle, and SQL injection. The detection of this type of attack limits the transmission of unauthenticated data. On the basis of a deep learning-based trained model, simulation work can classify attacks into multiple categories. Chapter 14 is entitled “Review on Various Text Extraction Algorithms for Images” The authors discussed text extraction, which is the process of extracting text from a picture into plain text. Various types of noise, including Gaussian noise, salt-and-pepper noise, shot noise, speckle noise, quantization noise, and periodic noise, can easily harm a textual image. 
Chapter 15 is entitled “Machine Learning in Detection of Diseases”. The authors demonstrate how machine learning plays a significant role in a variety of applications, including image recognition, data mining, natural language processing, and disease prediction. In many domains, machine learning provides predictable outcomes. This chapter provides an overview of various artificial intelligence techniques used to investigate diseases such as coronary artery disease, diabetes, liver disease, dengue fever, and hepatitis. Many of the algorithms produce strong results because they characterize the relevant features accurately.

Chapter 16 is entitled “Application of the Text Extraction Method for Complex Degraded Images”. The authors describe the rising interest in preserving old books and records and converting them to digital form. The rapid advancement of information technology and the quick spread of the Internet have also contributed to the enormous volume of image and video data. The text contained in images and video helps in analyzing them and is also used for indexing, archiving, and retrieval. Different kinds of noise, such as Gaussian noise, salt-and-pepper noise, and speckle noise, can readily damage an image. Several image filtering algorithms, including the Gaussian filter, mean filter, and median filter, are employed to remove this noise from images. The chapter also examines the impact of several pre-processing methods, such as thresholding, morphology, and blurring procedures, on text extraction methods.

Acknowledgments

The editors are grateful to Almighty God for allowing them to complete this book. Completing a book is a difficult task that can take many hours, months, or even years, and we can attest that we collaborated closely with publishers, editors, and authors during that time. We are grateful to our beloved Managing Editor at Nova Science Publishers, USA, for having faith in us and giving us the opportunity to edit this book. Your guidance has been extremely valuable to us, from the submission of the proposal to the completion of the project. The Nova staff deserve our appreciation. We thank all of the authors who made a significant contribution to this book project. We were unable to accept many good chapters owing to scope and quality, but we are confident that the work included in this book will be useful to young researchers and industry entrants in education, social and economic fields, digital transformation, IoT, artificial intelligence, and machine learning. We hope the book will help them refine their own processes and give them a boost in resolving their current problems. We would appreciate hearing your opinions on this book. Although extra care has been taken in choosing the chapters, and the authors' work has been closely monitored and revised through stringent peer review, feedback from authors and readers would be immensely beneficial in making sure that their concerns are addressed in our forthcoming volumes. We suggest that you purchase this book for your institution's library and research lab, and that you make use of the cutting-edge technical details given throughout the book.

Chapter 1

The Investigation of Image Processing in Forensic Science

Vikas Menon1, Bhairav Prasad1,†, Mandheer Kaur1,‡, Harleen Kaur1,¶, Digvijay Pandey2, and Sanwta Ram Dogiwal3,•, PhD

1Chandigarh College of Technology, Landran, Mohali, Punjab, India
2Department of Technical Education, Institute of Engineering and Technology (IET), Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, India
3Department of Information Technology, Swami Keshvanand Institute of Technology, Management and Gramothan (SKIT), Jaipur, Rajasthan, India

Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing. Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al. ISBN: 979-8-88697-832-2. © 2023 Nova Science Publishers, Inc.

Abstract

Forensic image processing requires training and expertise in order to analyze and interpret the various components of an image, or the resultant image, in a valid and authentic form. Image processing in forensic science is a multistep process that requires painstaking analysis. The major steps of image analysis in forensic science include photographic comparison, content analysis, image authentication, image enhancement and restoration, and photogrammetry. It has wide applicability, ranging from capturing crime scene images to refining and enhancing their quality. These days, with the rapid spread of electronic devices, almost every location and person has an easy-to-use device capable of recording, storing, and sharing large amounts of digital images or data. At the same time, the wide availability of image editing software such as Canvas, MATLAB and CorelDRAW makes it extremely simple to alter the content of images or to create new ones by blending several images. Finally, some advanced software can produce photorealistic computer graphics that viewers find indistinguishable from photographic images, or can generate hybrid visual content.

Keywords: authentication, image enhancement, text extraction and restoration

1. Introduction

Digital image processing is a computer-assisted approach for refining and enhancing images using software and algorithms [1-8]. This methodology has numerous advantages over older methods such as analogue image processing. Three factors influence image processing: computer configuration, discrete mathematics theory, and the demand for applications across agriculture, the military, industry, medical science, and forensic science [9-13]. Although the exploitation of digital data is a relatively new technique in law enforcement investigations, law enforcement relies heavily on digital evidence for critical information about both victims and suspects. Given the potential quantity of such evidence, cases in which it is lacking are harder to generate leads for and to solve [14]. Forensic image processing, often known as forensic image analysis, is concerned with determining the validity and content of photographs. This enables law enforcement to use relevant data in a wide range of criminal cases. Forensic science is frequently used in the criminal justice system [15]. Forensic scientists gather and analyze evidence from crime scenes and other sites in order to generate objective results that can aid in the investigation and punishment of offenders, or clear an innocent person. Forensic digital image processing is a technique that helps forensic investigators collect, optimize, and evaluate evidence from crime scenes and improve its clarity [16].

One of the most valuable features of digital image technology is the ability to blend several photographs of the same subject into a single final image. The final image shows the evidence and can be used for fingerprint or shoe-impression examination [17]. Figure 1 gives an overview of image processing in forensic science.

Figure 1. An overview of image processing in forensic sciences.

1.1. Computer Tools for Image Processing in Forensic Science

For a rising number of crimes, computers are an important source of forensic evidence. Cybercrime has been increasing rapidly in recent years, and even traditional criminals now employ computers as part of their operations. The capacity to collect forensic data accurately from these machines could be critical in apprehending and convicting these perpetrators [18]. Computer forensics tools are designed to ensure that the information extracted from computers is accurate and reliable. Figure 2 shows the components of an image processing system. Because computer-based evidence is so varied, a number of different types of computer forensics tools exist, including:

• Disk and data capture tools
• File viewers
• File analysis tools
• Registry analysis tools
• Internet analysis tools
• Email analysis tools
• Mobile devices analysis tools
• Network forensics tools
• Database forensics tools.

Figure 2. Components of image processing system.

Within each category, a number of different tools exist. The following sections outline some of the most widely used computer forensics tools [19].

1.2. Software Used in Forensic Image Processing

Computer forensics software (Figure 3) is designed to verify that data collected from computers is accurate and trustworthy [20].

Figure 3. Forensic tools and software used with computer for image processing.

Because computer-based evidence is so diverse, a number of distinct types of computer forensics tools exist, including the following.

1.2.1. Disk and Data Capture Tools

Forensic disk and data capture programs are designed to analyze a system and retrieve potential forensic artifacts, including files, emails, and other data. Many forensics tools focus on this aspect of the computer forensics procedure.

1.2.2. Autopsy/The Sleuth Kit

The Sleuth Kit and Autopsy are two of the best-known and most popular forensics tools available. These programs can analyze disk images, perform in-depth file system analysis, and carry out a range of other tasks. As a result, they combine features from several of the forensics tool categories listed above, making them a suitable place to start when conducting a computer forensics investigation.

1.2.3. X-Ways Forensics

X-Ways Forensics is a Windows-based commercial digital forensics platform. The company also offers X-Ways Investigator, a stripped-down version of the platform. One of the platform’s main selling points is that it is designed to be resource-efficient and can be run from a USB stick.

1.2.4. AccessData FTK

The AccessData Forensic Toolkit (FTK) is a commercial digital forensics platform that prides itself on its speed of analysis. It advertises itself as the only forensics tool that fully utilizes multi-core processors. FTK also indexes forensic artifacts up front, which speeds up later processing.

1.2.5. EnCase

EnCase is a commercial forensics platform. It supports the collection of evidence from more than twenty-five types of devices, including computers, mobile devices, and GPS units. A wide range of reports can be generated from the collected data using predefined templates.

1.2.6. Mandiant Redline

Mandiant Redline is a well-known memory and file analysis program. To create a report, it collects information about running processes on a host, drivers from memory, and other data such as metadata, registry data, tasks, services, network information, and internet history.

1.2.7. Paraben Suite

The Paraben Corporation provides a variety of forensics products with a variety of licensing options. Paraben has capabilities in:

• Desktop forensics
• Email forensics
• Smartphone analysis
• Cloud analysis
• IoT forensics
• Triage and visualization.

1.2.8. Bulk Extractor

Bulk Extractor is a useful and well-known digital forensics tool. It extracts relevant information from disk images, files, or directories of files. It ignores the file system structure during this procedure, which makes it faster than similar utilities. It is used mostly by intelligence and law enforcement agencies to combat cybercrime.
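
The idea of scanning raw bytes without interpreting the file system can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Bulk Extractor's actual implementation: the `EMAIL_RE` pattern, the `scan_raw_bytes` helper, and the fake image buffer are hypothetical names and data chosen for this example.

```python
import re

# Hypothetical, deliberately simple pattern; real tools use many specialised scanners.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def scan_raw_bytes(data: bytes) -> list:
    """Return (byte offset, email address) pairs found anywhere in a raw buffer."""
    return [(m.start(), m.group().decode("ascii")) for m in EMAIL_RE.finditer(data)]


# A fake "disk image": binary noise with two addresses embedded in it.
image = (
    b"\x00\x01\xffjunk\x00"
    + b"alice@example.com"
    + b"\x00" * 8
    + b"bob@mail.example.org"
    + b"\xfe\xfd"
)

hits = scan_raw_bytes(image)
for offset, address in hits:
    print(offset, address)
```

Because the scan never parses directories or file tables, it can also find strings in deleted or unallocated regions, at the cost of losing the file context that a file system aware tool would report.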

1.2.9. Registry Analysis

The Windows registry is a database of configuration information for the operating system and its programs. As a result, it may contain a wealth of information useful in a forensic investigation.

1.2.10. Registry Recon

Registry Recon is a popular commercial registry analysis tool. It takes the registry data from the evidence and reconstructs the registry representation. It can rebuild registries from both current and previous Windows installations.

1.2.11. Memory Forensics

File system analysis overlooks the system’s volatile memory (i.e., RAM). Some forensics tools focus on capturing the information stored there.

1.2.12. Volatility

Volatility is a memory forensics framework used for malware analysis and incident response. It can extract data from running processes, network sockets, network connections, DLLs, and registry hives, as well as from Windows crash dump and hibernation files. It is released under the GPL license and is free to use.

1.2.13. WindowsSCOPE

WindowsSCOPE is a memory forensics and reverse engineering program that can be used to examine volatile memory. It is primarily used for malware reverse engineering. It allows the examiner to inspect the Windows kernel, drivers, and DLLs, as well as virtual and physical memory.

1.2.14. Wireshark

Wireshark is one of the most widely used network traffic analysis tools. It can capture live traffic or ingest a previously saved capture file. Its many protocol dissectors and user-friendly interface make it simple to examine the contents of a traffic capture and look for forensic evidence.

1.2.15. NetworkMiner

NetworkMiner is a network traffic analysis tool that is available in both free and paid versions. Although the free edition of NetworkMiner lacks many of the commercial features, it can be a useful tool for forensic investigations. It organizes data differently than Wireshark and automatically extracts certain types of files from a traffic capture.

1.2.16. Xplico

Xplico is a network forensic analysis program that is free to use. It is used to extract relevant data from applications based on Internet and network protocols. HTTP, IMAP, POP, SMTP, SIP, TCP, and UDP are among the protocols it supports. The tool’s output data is saved in a SQLite or MySQL database. Both IPv4 and IPv6 are supported.

2. Image Reconstruction

Image reconstruction uses image processing systems that have been extensively trained on existing image data to create restored versions of old and damaged photos (Figure 4) by filling in the missing or corrupted parts.

Figure 4. Reconstructing damaged images using image processing.

3. Mobile Device Forensics

In modern criminal investigations, mobile devices are confiscated at every type of crime scene, and the data on those devices frequently becomes important evidence in the case. Over the years, several mobile forensic techniques have been developed and reviewed in order to retrieve potential evidentiary data from smartphones. However, as mobile devices have become more commonplace in daily life, security and privacy concerns have grown, prompting smartphone manufacturers to adopt a variety of security protection methods, such as encryption, to prevent unwanted access to data stored on their devices [21].

3.1. Oxygen Forensic Detective

Oxygen Forensic Detective focuses primarily on mobile devices, although it can extract data from a variety of platforms, including mobile, IoT, cloud services, drones, media cards, backups, and desktop platforms. Using physical methods, it can bypass device security (such as a screen lock) and obtain authentication data for a variety of mobile apps. Oxygen is a commercial product distributed as a USB dongle [22].

3.2. Cellebrite UFED

Cellebrite sells a variety of commercial digital forensics products, and its Cellebrite UFED claims to be the industry standard for digital data access. The main UFED offering concentrates on mobile devices, although the broader UFED product line covers drones, SIM and SD cards, GPS, cloud, and other sources. According to Cellebrite, the UFED platform uses proprietary methods to maximize data extraction from mobile devices [23].

3.3. XRY

XRY is a suite of commercial tools for forensics on mobile devices. XRY Logical is a set of tools designed to interact with the operating system of a mobile device and extract data. XRY Physical, on the other hand, bypasses the operating system in order to analyze locked devices using physical recovery procedures. It consists of a hardware device for connecting phones to computers and software for extracting data [24]. XRY has been approved by a variety of government agencies as suitable for their needs, and it is now in use all over the world [25].

3.4. Linux Distros

Many of the tools mentioned here are open-source and free. Several Linux distributions have been produced that combine these free programs to provide forensic investigators with an all-in-one toolkit [26].

3.5. CAINE

The Linux distribution CAINE (Computer Aided Investigative Environment) was built for digital forensics. It provides a user-friendly framework for integrating existing software tools as software modules. It is free and open-source [27].

3.6. SANS SIFT

SIFT is a Linux virtual machine that bundles free digital forensics tools. The SANS Institute created this platform, and it is covered in several of their courses [26].

3.7. HELIX3

HELIX3 is a digital forensics suite for incident response built on a live CD. It includes numerous open-source digital forensics tools, such as hex editors, data carving software, and password cracking tools. The project was taken over by a commercial vendor after its release. Physical memory, network connections, user accounts, running processes and services, scheduled jobs, the Windows Registry, chat logs, screen captures, SAM files, apps, drivers, environment variables, and internet history are all sources of data for this tool. The data is then analyzed and reviewed to produce compiled results in the form of reports.

4. Processing of Forensic Digital Image

Forensic digital images are processed on a computer or a local machine. It is a highly advanced investigation technique that requires multiple software programs and specialized training. Using the various methodologies, the investigating scientist can mine everything from camera attributes to specific pixels.

5. Different Types of Digital Image Evidence

A single image can yield a large amount of digital evidence, which can be divided into two primary categories that complement one another. Image authenticity evidence includes pixel data (e.g., color information), metadata (e.g., descriptive, structural, administrative, reference, statistical), and Exif data (e.g., digital camera model, shutter speed, focal length). Image content evidence includes landmarks (e.g., apartment blocks, churches, and schools), visual languages (e.g., businesses, road signs, and road markings), topography (e.g., hills, mountains, and waterfalls), and street furniture (e.g., bollards, benches, bins) [28].

6. Use of Digital Image Forensics Techniques

There are two main uses of digital image forensics techniques [29]:

1. When a suspect denies appearing in a photograph. Deconvolution can be used to reverse image blurring if identities are obscured in some way. Geolocation, metadata, and Exif data can also be used to prove or refute the presence of a defendant at a crime scene.
2. When a suspect alleges that a damning photograph was forged. Image authentication is critical in the age of deepfakes. Color space and color level anomalies can be used to assess the authenticity of a digital photo. Landmarks can also be used to confirm or refute the location of the suspect.

7. Lifecycle of Digital Image

The lifecycle of a digital image is essentially its history, encompassing the stages it has passed through since it was made. A photograph, for example, could be captured with a digital camera, then transferred to a graphics application and modified. The final product is not the original image; it has gone through various phases in its lifecycle. The goal of an investigator is to find the source image. The closer they can come to the original image, the better, because at that stage in the lifecycle the image is more likely to hold relevant information and clues that could aid an inquiry [30].

8. Factors Affecting the Digital Image Lifecycle

It is more difficult than it used to be to track the lifecycle of a digital image online. An image can spread further and more quickly, and there are more platforms where images can easily be manipulated. A JPEG is usually produced when an image is taken. However, it may then be scaled, shared, given extra tags, modified, and so on. The more these photographs are shared and modified, the longer their lifecycle becomes. These changes not only degrade image quality but also add unnecessary data to the image, making it more difficult for investigators to locate the information they require. The de facto standard file format for photographs is becoming a more important subject for digital imaging forensics [30]. Until recently JPEG was the most used and expected image format, but that is starting to change. Some devices or platforms default to other formats because of differences in networks, performance, and quality. Apple, for example, uses HEIC, a newer format, instead of saving JPEGs. Google, on the other hand, prefers WebP for its reduced image sizes. Both formats affect image quality, but it is unclear whether this is better or worse for the image forensics process. The Camera Forensics platform assists law enforcement agencies, nongovernmental organizations, investigators, and other groups in addressing this issue, and according to its developers its capabilities are continually being expanded through innovation and collaboration to help users meet the challenge [31].
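
Since an image may arrive as JPEG, HEIC, or WebP regardless of its file extension, a first triage step is often to identify the container from its leading bytes. The sketch below assumes the well-known signatures of these formats (the JPEG start-of-image marker, the RIFF/WEBP header, and the `ftyp` box with a HEIF-family brand); the `sniff_image_format` function and the sample headers are hypothetical and only illustrate the idea.

```python
def sniff_image_format(header: bytes) -> str:
    """Guess the container format from the first 12 bytes of a file."""
    # JPEG files start with the SOI marker FF D8 followed by another marker byte FF.
    if header[:3] == b"\xff\xd8\xff":
        return "JPEG"
    # WebP is a RIFF container whose form type (bytes 8-11) is 'WEBP'.
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WebP"
    # HEIF-family files begin with a box size, then 'ftyp', then a major brand.
    # Only a few common brands are checked here; this list is not exhaustive.
    if header[4:8] == b"ftyp" and header[8:12] in (b"heic", b"heix", b"mif1"):
        return "HEIC/HEIF"
    return "unknown"


# Synthetic headers built by hand for illustration (not real image files).
samples = {
    "photo_a": b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01",
    "photo_b": b"RIFF\x24\x00\x00\x00WEBPVP8 ",
    "photo_c": b"\x00\x00\x00\x18ftypheic\x00\x00\x00\x00",
    "photo_d": b"plain text, not an image",
}

for name, head in samples.items():
    print(name, sniff_image_format(head))
```

In practice the header would be read with `open(path, "rb").read(12)`. A mismatch between the detected container and the file extension is only a hint that the file has been converted or renamed at some point in its lifecycle, not proof of manipulation.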

9. Core Functionalities of Image Forensics Software

Image forensics software is used to search for data in photographs. The tools provided by Camera Forensics, for example, assist police in building a case in a criminal investigation. Image forensics software has three basic features that assist with this [32]:

• Highlighting key intelligence
• Displaying areas with an identifier
• Identifying areas with modifications.

10. Forensic Image Analysis

Forensic image analysis includes gathering and combining images, and measuring and extracting information from their various components for comparison and analysis. There are four categories of image analysis.

10.1. Photo Image Comparison

Photo image comparison is concerned with the similarities between a query image and a known image. CCTV recorders, smartphones, webcams, camcorders, and social networking websites are only some of the sources of imagery (video and still photographs). Image-based evidence can open up a world of possibilities for investigation and skilled forensic interpretation [33]. Expert interpretation of image-based evidence usually focuses on identifying or excluding a person portrayed in the images. Common methods for identification include [34]:

• Comparing facial images
• Comparing clothing
• Gait analysis
• Vehicle identification/comparison
• Comparing objects.

For each of these procedures, observations from imagery of a disputed subject are compared with reference imagery of a known subject to see whether there are any obvious differences or similarities. The expert then gives a subjective judgment on whether the findings support the disputed and known subjects being the same subject or separate subjects. When reviewing findings, the expert should consider more than one proposition to demonstrate impartiality. If an expert considers only one explanation for the findings, the results may be skewed in one direction [34].

10.2. Image Content Analysis (ICA)

The process of drawing inferences about a picture is known as image content analysis (ICA). The subjects or objects within an image; the conditions under which, or the method by which, the image was recorded or made; the physical features of the scene (e.g., lighting or composition); and the image’s provenance are all possible targets for content analysis [35]. Examples include blood spatter analysis, patterned injury analysis, correlation of autopsy results, determination of the presence of computer-generated imagery in an alleged “snuff” film, vehicle license plate number identification, and determination of the type of camera used to record a specific image, among other things [34].

10.3. Image Authentication

With the growing use of digital images in forensic science, it is no surprise that digital photos acquired with CCD (charge-coupled device) cameras or other digital imaging devices have become direct or indirect exhibits in court to show the relationship between suspects and offenders. To maintain the integrity of the standard operating procedures (SOPs) and assure their accuracy and dependability, forensic examiners must document or record all processing stages (such as parameters, mathematical operations, and processed images) [36]. However, the ease with which digital photos can be manipulated makes us wary of using them as evidence in court. We need a reliable approach to secure and reinforce the chain of custody in order to improve this situation. Image authentication is the process of ensuring that the information content of the examined material is a true representation of the original data according to a set of criteria. These criteria usually concern the interpretability of the data rather than simple format changes that do not affect its meaning or content [37]. Examples of image authentication include:

• Determining whether an image contains feature-based modifications, such as the addition or removal of elements in the image (e.g., adding bruises to a face);
• Determining the degradation of a transmitted image;
• Determining whether an image or a video is an original recording or an edited version;
• Evaluating the degree of information loss in an image saved using lossy compression [22, 34].

When an image carries a watermark and/or a digital signature, that embedded information can reveal tampering. Authenticating an image on the basis of an image manipulation model alone is difficult, because a forgery may involve not only simple cut-and-paste but also post-processing of the complete image to make it look more realistic. Recent research trends in forensic image analysis include pixel-based, format-based, statistical-based, geometric-based, and physics-based methods, as well as supervised and unsupervised learning techniques [38].
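
The role of a signature in supporting the chain of custody can be illustrated with Python's standard `hashlib` and `hmac` modules. This is a minimal sketch under stated assumptions: the byte strings and the key are hypothetical, and HMAC (a keyed message authentication code) is used here as a simple stand-in for a full public-key digital signature scheme.

```python
import hashlib
import hmac


def fingerprint(image_bytes: bytes) -> str:
    """SHA-256 digest of the raw file bytes, recorded when evidence is acquired."""
    return hashlib.sha256(image_bytes).hexdigest()


def sign(image_bytes: bytes, key: bytes) -> str:
    """Keyed tag (HMAC-SHA256); a stand-in for a real digital signature."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()


def verify(image_bytes: bytes, key: bytes, tag: str) -> bool:
    """True only if the bytes still match the tag recorded earlier."""
    return hmac.compare_digest(sign(image_bytes, key), tag)


# Hypothetical evidence bytes and key, for illustration only.
original = b"\xff\xd8\xff\xe0 example image payload"
key = b"lab-secret-key"
tag = sign(original, key)

# Simulate an edit made after acquisition.
tampered = original.replace(b"example", b"altered")

print(verify(original, key, tag))  # unchanged file
print(verify(tampered, key, tag))  # edited file
```

A check of this kind detects any change to the stored bytes, including harmless re-encoding, so it complements rather than replaces the content-level authentication criteria described above.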

10.4. Image Enhancement and Restoration

Low resolution (especially in video frames), poor contrast due to under- or over-exposure, corruption by noise, motion blur or poor focus, and misalignment of rows due to line jitter are all common difficulties with surveillance images [36]. Enhancement and clarification typically refer to removing image blur, lowering image noise, or adjusting brightness and contrast to bring out features that might otherwise be difficult to see. The information gleaned from image analysis can then be used to reconstruct the incident and evaluate witness and crime scene statements [39]. The number of unresolved identifications of persons in CCTV (closed-circuit television) footage is increasing because of inadequate enhancement. Enhancement and restoration techniques include:

• Spatial domain methods: point operations
• Histogram modeling
• Spatial operations: directional smoothing, median filtering, image sharpening, masking, edge crispening, interpolation
• Transform operations: linear filtering, root filtering
• Homomorphic filtering
• Degradation model: diagonalization
• Constrained and unconstrained restoration
• Inverse filtering: Wiener filter
• Generalized inverse, SVD and iterative methods.
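
Of the spatial operations listed, median filtering is the simplest to show in full. The following sketch applies a 3x3 median filter to a small grey-level image stored as a list of lists; the `median_filter_3x3` function and the toy image are hypothetical, and border pixels are simply copied, which is one of several possible border policies.

```python
from statistics import median


def median_filter_3x3(img):
    """Apply a 3x3 median filter to a 2-D list of grey levels.

    Border pixels are copied unchanged (an assumption made for brevity).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# A flat grey patch corrupted by one "salt" pixel (255) and one "pepper" pixel (0).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]

cleaned = median_filter_3x3(noisy)
for row in cleaned:
    print(row)
```

Because the median ignores extreme values in the window, isolated salt-and-pepper pixels are removed without the smearing that a mean filter would introduce.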


Vikas Menon, Bhairav Prasad, Mandheer Kaur et al.

11. Photogrammetry

Photogrammetry is the process of recording, measuring, and analyzing photographic images, patterns of electromagnetic radiant energy, and other clues obtained from physical objects or the environment. This application, sometimes also called "mensuration," is mostly used to extract dimensional information from an image, e.g., the height of a subject in a surveillance image, and for crime scene reconstruction, visibility analysis, and spectral analysis. Basic photogrammetry requires an image that contains a scale or ruler in the same plane as the subject, taken with the camera perpendicular to the subject. The subject can be anything from a multi-story building to a tiny drop of high-velocity blood spatter. Advanced forms of photogrammetry require specialized software and the expertise to cope with oblique views and three-dimensional scenes. For a photogrammetric measurement, the image must contain at least one object of known size, and the operator must know the focal length of the lens [34].
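The basic case described above, a reference object of known size lying in the same plane as the subject with the camera perpendicular to that plane, reduces to a simple proportion. The sketch below shows that arithmetic in Python; the function name and the example numbers are hypothetical, and the calculation deliberately ignores lens distortion and oblique views, which need the more advanced methods mentioned in the text.

```python
def measure_with_reference(ref_length_cm: float, ref_length_px: float,
                           target_length_px: float) -> float:
    """Estimate a real-world length from pixel measurements.

    Assumes the reference object and the target lie in the same plane and
    that the camera is perpendicular to that plane.
    """
    if ref_length_px <= 0:
        raise ValueError("reference length in pixels must be positive")
    return target_length_px * ref_length_cm / ref_length_px


# Hypothetical example: a 30 cm ruler spans 150 pixels and the subject spans 880 pixels.
subject_height_cm = measure_with_reference(30.0, 150.0, 880.0)
```

With these assumed numbers the scale is 0.2 cm per pixel, so the subject is estimated to be 176 cm tall.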

12. Pros and Cons of Digital Image Forensics

Every coin has two faces. Although the pros of digital image forensics outweigh the cons, both are important when setting expectations for image processing in criminology.

12.1. Pros

a. The more data that are available, the greater the chance of digitally linking a suspect to the crime.
b. It is flexible enough to be used in both closed and open source investigations.
c. It relies on validated approaches and algorithms, and it is highly accurate and reliable.

12.2. Cons

a. It is a time-consuming process and requires trained personnel or experts from the field.
b. Digital forensic image processing of even a minute clue may take many months to complete.

13. Image Processing for Digital Forensics

Digital fraud, aided by constantly evolving software tools, is rapidly outpacing the detection of suspicious behavior. In some situations, digital photographs or videos can be unmistakable proof of a crime or confirmation of a malevolent deed. Multimedia forensic agencies aim to create effective techniques that support the investigation of clues and provide a foundation for decision making about a crime using the information in a digital image. Multimedia forensics [40] seeks techniques that work in the absence of a watermark or signature in a picture. Both non-intrusive (blind) and intrusive (non-blind) digital image forgery detection technologies are currently available. Figure 5 shows these two digital image forensics approaches. Intrusive (non-blind) techniques require digital data, such as a watermark or a digital signature, to be embedded in the picture when it is produced. A non-intrusive (blind) technique, by contrast, does not require any data to be added. A picture is deemed forged when it has been tampered with using transformations such as scaling, rotation, or resizing. Passive forensic methods need no information beyond the image itself, which gives them an advantage over active methods such as watermarking and signature approaches. As a result, several studies are being conducted to create blind authentication methods. A forgery leaves a trail in the photo that can be used to track down the modified element.

The basic components of convolutional neural networks (CNNs) are convolution layers, pooling layers, and activation functions, which are layered together to form the CNN architecture. Recent advances in CNN construction and training can be classified as structural reformulation, parameter optimization, regularization, and loss function [41]. Among them, structural reformulation is the most essential in terms of performance improvement, and it is classified into seven categories: spatial extraction, complexity, multi-path, width, channel boosting, convolutional feature exploitation, and attention-based CNNs. The most commonly used CNN architectures include ResNet [42], AlexNet [43], DenseNet [44], SENet [45], and GoogLeNet [46]. Motivated by the strong performance of deep learning-based systems in image source forensics, many researchers have applied data-driven techniques to this problem in recent years.

The fundamental framework of the proposed deep learning methodology for image forensics, which takes an image as input, is shown in Figure 6. In the patch selection stage, PRNU (photo-response non-uniformity) noise is removed from the image, the image is smoothed, and high luminance is given to non-saturated areas. Further preprocessing normalizes the photos and subtracts the pixel-wise mean value, which centers the data and helps the network learn faster. The data are then fed to a CNN. Next, the output passes to the fusion and ensemble stages, whose goal is to improve performance by combining several models and features. Different classifier adaptations are applied to improve classification performance. The final result is obtained in the voting step.

Figure 5. Digital image forensics approaches.

Figure 6. Basic framework for using deep learning.
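Two steps of the pipeline described above are simple enough to sketch directly: centering the data by subtracting the pixel-wise mean, and combining several classifier outputs by voting. The Python fragment below is only a minimal illustration under stated assumptions: patches are represented as flattened lists of 8-bit values, the function names are hypothetical, and plain majority voting stands in for the voting step, since the chapter does not specify which voting rule is used.

```python
from collections import Counter


def center_patches(patches: list[list[int]]) -> list[list[float]]:
    """Scale 8-bit patches to [0, 1] and subtract the pixel-wise mean over all patches."""
    scaled = [[p / 255.0 for p in patch] for patch in patches]
    n = len(scaled)
    mean = [sum(patch[i] for patch in scaled) / n for i in range(len(scaled[0]))]
    return [[value - m for value, m in zip(patch, mean)] for patch in scaled]


def majority_vote(labels: list[str]) -> str:
    """Return the label predicted by most classifiers (assumed voting rule)."""
    return Counter(labels).most_common(1)[0][0]


# Three tiny flattened patches (hypothetical data).
patches = [[0, 255, 128, 64], [255, 0, 128, 64], [128, 128, 128, 64]]
centered = center_patches(patches)

# Hypothetical predictions from three models in an ensemble.
decision = majority_vote(["camera_A", "camera_B", "camera_A"])
```

After centering, every pixel position has zero mean across the patches, which is the property the text credits with helping the network learn faster.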


Conclusion and Future Outlook

Forensic image processing identifies the features, quality, and common components of an image and extracts the intended information from them for comparison and analysis. As machines and computer tools advance, image analysis will provide novel and non-obvious features for understanding and analyzing images. Still, automating image processing and interpretation in order to find vital information and establish authenticity remains a challenging task for scientists and the research community [47-51]. In this chapter, we gave a brief account of how images or photos are processed in forensic science to improve their quality, establish their authenticity, and enhance and restore them, so that information can be extracted through analysis, interpretation, and recognition. More technological progress is still required in this field to analyze images efficiently and accurately. Image processing in forensic science is therefore very challenging, and the modern world requires continuous upgrading and tighter security to stop data infringement.

References

[1] Gupta, M., & Anand, R. (2011). Color image compression using set of selected bit planes. International Journal of Electronics & Communication Technology, 2(3), 243-248.
[2] Saini, P., & Anand, M. R. (2014). Identification of Defects in Plastic Gears Using Image Processing and Computer Vision: A Review. International Journal of Engineering Research, 3(2), 94-99.
[3] Vyas, G., Anand, R., & Holȇ, K. E. Implementation of Advanced Image Compression using Wavelet Transform and SPHIT Algorithm. International Journal of Electronic and Electrical Engineering. ISSN, 0974-2174.
[4] Pandey, B. K., Mane, D., Nassa, V. K. K., Pandey, D., Dutta, S., Ventayen, R. J. M., & Rastogi, R. (2021). Secure text extraction from complex degraded images by applying steganography and deep learning. In Multidisciplinary Approach to Modern Digital Steganography (pp. 146-163). IGI Global.
[5] Pandey, B. K., Pandey, D., & Agarwal, A. (2022). Encrypted Information Transmission by Enhanced Steganography and Image Transformation. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-14.
[6] Pramanik, S., Ghosh, R., Pandey, D., & Ghonge, M. M. (2021). Data Hiding in Color Image Using Steganography and Cryptography to Support Message Privacy. In Limitations and Future Applications of Quantum Cryptography (pp. 202-231). IGI Global.

[7] Ratnaparkhi, S. T., Singh, P., Tandasi, A., & Sindhwani, N. (2021, September). Comparative analysis of classifiers for criminal identification system using face recognition. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-6). IEEE.
[8] Saini, M. K., Nagal, R., Tripathi, S., Sindhwani, N., & Rudra, A. (2008). PC Interfaced Wireless Robotic Moving Arm. In AICTE Sponsored National Seminar on Emerging Trends in Software Engineering (Vol. 50).
[9] Sennan, S., Pandey, D., Alotaibi, Y., & Alghamdi, S. A Novel Convolutional Neural Networks Based Spinach Classification and Recognition System.
[10] Jain, S., Sindhwani, N., Anand, R., & Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham.
[11] Sennan, S., Pandey, D., Alotaibi, Y., & Alghamdi, S. A Novel Convolutional Neural Networks Based Spinach Classification and Recognition System.
[12] Anand, R., Singh, B., & Sindhwani, N. (2009). Speech perception & analysis of fluent digits' strings using level-by-level time alignment. International Journal of Information Technology and Knowledge Management, 2(1), 65-68.
[13] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[14] S. Degerine & A. Zaidi, "Separation of an instantaneous mixture of Gaussian autoregressive sources by exact maximum likelihood approach," in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: https://doi.org/10.1109/TSP.2004.827195.
[15] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[16] Peterson, J., Sommers, I., Baskin, D., & Johnson, D. (2010). The role and impact of forensic evidence in the criminal justice process. National Institute of Justice, 1-151.
[17] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[18] Saran, V., Kumar, S., Ahmed, S., & Gupta, A. K. (2013). Similarities of Slant in Handwriting of Close Genotypic Family Members. This work is licensed under a Creative Commons Attribution Non-Commercial, 4.
[19] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[20] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.

[21] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[22] Zhang, H. Jason, E. F., & Goldman, S. A. (2008). "Image segmentation evaluation: A survey of unsupervised methods," Computer Vision and Image Understanding. 110: 260-280.
[23] Khalili, J. (2021). "Cellebrite: The mysterious phone-cracking company that insists it has nothing to hide." TechRadar. Archived from the original on 2021-07-31. Retrieved 2021-09-07.
[24] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[25] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[26] Dieguez Castro, J. (2016). Introducing Linux Distros. Apress. pp. 49, 345. ISBN 978-1-4842-1393-3.
[27] James, J. I. & Gladyshev, P. (2013). "A survey of digital forensic investigator decision processes and measurement of decisions based on enhanced preview." Digital Investigation. 10 (2): 148–157.
[28] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[29] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[30] Sharma, M., & Singh, H. (2022). Contactless Methods for Respiration Monitoring and Design of SIW-LWA for Real-Time Respiratory Rate Monitoring. IETE Journal of Research, 1-11.
[31] Sharma, M., & Jha, S. (2010, February). Uses of software in digital image analysis: a forensic report. In Second International Conference on Digital Image Processing (Vol. 7546, p. 75462B). International Society for Optics and Photonics.
[32] Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[33] Seckiner, D., Mallett, X., Roux, C., Meuwly, D., & Maynard, P. (2018). Forensic image analysis–CCTV distortion and artefacts. Forensic science international, 285, 77-85.
[34] Zaidi, A. "Positive definite combination of symmetric matrices," in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: https://doi.org/10.1109/TSP.2005.855077.
[35] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).

[36] Wischow, M., Gallego, G., Ernst, I., & Börner, A. (2021). Camera Condition Monitoring and Readjustment by means of Noise and Blur. arXiv preprint arXiv:2112.05456.
[37] Hu, J., Shen, L., & Sun, G. (2017). Squeeze-and-excitation networks. arXiv preprint arXiv:1709.01507.
S. Degerine and A. Zaidi, "Sources colorees" a chapter of a collective work entitled "Séparation de sources 1 concepts de base et analyse en composantes indépendantes," Traité IC2, série Signal et image ["Separation of sources 1 basic concepts and analysis into independent components", Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.
[38] Scientific Working Groups on Digital Evidence & Imagery Technology. [2007, January 1]. Best Practices for forensic image analysis, Version 1.6, Retrieved October 10, 2010, from the International Association for Identification website: http://www.theiai.org/guidelines/swgit/guidelines/section_12_v1-6.pdf.
[39] Urschler, M., Bornik, A., Scheurer, E., Yen, K., Bischof, H., & Schmalstieg, D. (2012). Forensic-case analysis: from 3D imaging to interactive visualization. IEEE computer graphics and applications, 32(4), 79-87.
[40] Farid, H. (2009). Image forgery detection. IEEE Signal processing magazine, 26(2), 16-25.
[41] Khan, A., Sohail, A., Zahoora, U., & Qureshi, A. S. (2020). A survey of the recent architectures of deep convolutional neural networks. Artificial intelligence review, 53(8), 5455-5516.
[42] Gupta, A. K., Sharma, M., Sharma, A., & Menon, V. (2020). A study on SARS-CoV-2 (COVID-19) and machine learning based approach to detect COVID-19 through X-ray images. International Journal of Image and Graphics, 2140010.
[43] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25.
[44] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700-4708).
[45] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[46] Gupta, A. K., Sharma, M., Singh, S., & Palta, P. (2020). A Modified Blind Deconvolution Algorithm for Deblurring of Colored Images. In Advances in Computational Intelligence Techniques (pp. 157-167). Springer, Singapore.
[47] Pramanik, S., Ghosh, R., Ghonge, M. M., Narayan, V., Sinha, M., Pandey, D., & Samanta, D. (2021). A Novel Approach Using Steganography and Cryptography in Business Intelligence. In Integration Challenges for Analytics, Business Intelligence, and Data Mining (pp. 192-217). IGI Global.
[48] Kohli, L., Saurabh, M., Bhatia, I., Sindhwani, N., & Vijh, M. (2021). Design and development of modular and multifunctional UAV with amphibious landing, processing and surround sense module. Unmanned Aerial Vehicles for Internet of Things (IoT) Concepts, Techniques, and Applications, 207-230.

[49] Degerine, S. & A. Zaidi, "Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints," in SIAM Journal on Optimization, 2007, Vol. 17, No. 4: pp. 997-1014, doi: https://doi.org/10.1137/050622821.
[50] Zaïdi, A. "Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices," in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: https://doi.org/10.1109/LSP.2019.2909651.
[51] Zaïdi, A. "Mathematical Methods for IoT-based Annotating Object Datasets with Bounding Boxes," in Mathematical Problems in Engineering, vol. 2022, 2022.


Chapter 2

Integrating IoT Based Security with Image Processing

Vivek Veeraiah1, Jay Kumar Pandey2, Santanu Das3, Dilip Raju4, Makhan Kumbhkar5, Huma Khan6 and Ankur Gupta7,*

1 Department of R & D Computer Science, Adichunchanagiri University, Mandya, Karnataka, India
2 Shri Ramswaroop Memorial University, Barabanki, Uttar Pradesh, India
3 Department of Biotechnology, Seshadripuram First Grade College, Yelahanka New Town, Bangalore, India
4 Department of Electronics and Communication Engineering, Dayananda Sagar Academy of Technology & Management, Udayapura, Bengaluru, India
5 Department of Computer Science, Christian Eminent College, Indore, Madhya Pradesh, India
6 Department of Computer Science and Engineering, Rungta College of Engineering and Technology, Bhilai, Chhattisgarh, India
7 Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India

* Corresponding Author's Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al. ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.


Abstract

Signal processing has several aspects, including signal analysis, storage, filtering, coding, and decoding, and it covers audio and visual signals as well as transmissions. Among these, image processing works with signals whose input and output are both images. Digital image processing is the processing of digital images by software, and it touches on computer graphics, signals, photography, camera mechanisms, and pixels. Image enhancement, signal processing, audio signal processing, and more can be carried out on such a platform, which supports a variety of image formats and uses algorithms to alter images. The Internet of Things (IoT) is no longer mysterious: it is a technology that has quietly gained traction in the past few years and is shaping our future. IoT was developed to make life easier, to eliminate the possibility of human error, and to cut labor costs. Hence, the decision was made to make gadgets intelligent and to pay attention to issues of increasing efficiency. The security issues of IoT technology can be minimized by using image processing to keep a check on various activities. The proposed work provides insight into image processing, the Internet of Things, and various ways of using image processing to enhance security through camera surveillance and CBIR. In CBIR, a content-based search analyses the actual content of an image rather than the metadata connected with it.

Keywords: Internet of Things, digital image processing, security enhancement, camera surveillance, machine vision, content based image recognition

1. Introduction

Kevin Ashton first used the phrase "Internet of Things" (IoT) in a presentation about integrating radio-frequency identification (RFID) into supply chain management at Procter & Gamble [1, 2]. Advances in the IoT necessitate new technology that is capable of connecting all smart things in a network without any human involvement. An IoT gadget can monitor or communicate data. For image processing, images are simply two-dimensional signals [3, 4]. An image is described by a mathematical function F(x,y) of two spatial coordinates, x and y, and F(x,y) gives the pixel value at any point. The intensity of a picture at a given point is defined as the amplitude of F at that point's (x,y) spatial coordinates. A digital image is one in which x, y, and the amplitude of F all take finite, discrete values. In other words, an image can be defined as a two-dimensional array arranged in rows and columns [5].
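The definition above can be made concrete with a very small example. The Python sketch below stores a grayscale digital image as a two-dimensional array of rows and columns and reads the amplitude F(x, y) at given coordinates. The pixel values and the convention that x indexes columns and y indexes rows are assumptions chosen for illustration.

```python
# A 3x4 grayscale digital image: each row is one value of y, each column one value of x,
# and every entry is a discrete 8-bit amplitude (0-255). Values are illustrative only.
image = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
]


def F(x: int, y: int) -> int:
    """Return the intensity (amplitude) at spatial coordinates (x, y)."""
    return image[y][x]


height, width = len(image), len(image[0])
top_right = F(3, 0)
bottom_left = F(0, 2)
```

Because x, y, and the stored amplitudes are all finite and discrete, this array satisfies the definition of a digital image given in the text.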

1.1. Internet of Things

In recent years, the Internet of Things (IoT) has emerged as a new focus of study [6-9] in many academic and industrial fields, particularly in the healthcare sector. The IoT revolution incorporates technological as well as social aspects into modern healthcare systems and opens up economic and social prospects. It is transforming healthcare systems from the traditional way of doing things into systems that are better tailored to the needs of each individual patient, who can then be treated and monitored more easily. The IoT is becoming an increasingly important technology in healthcare systems, providing better quality services at reduced cost and an enhanced user experience [10-12]. There are a variety of applications for this technology.

The novel coronavirus, which causes a severe respiratory illness, has produced a pandemic that is currently threatening the world's health [13]. It is the largest global public health crisis since the 1918 influenza epidemic. The illness presents with flu-like symptoms such as fever, cough, and weariness, which must be identified early in the course of diagnosis. The COVID-19 incubation period ranges from one to fourteen days. Surprisingly, a patient who shows no symptoms can still transmit the COVID-19 virus [14-16] to other people, which is why quarantining these individuals is necessary. In addition, recovery time varies with the patient's age and other factors. Although this disease is more easily transmitted than comparable illnesses in the coronavirus family, extensive efforts and research to prevent its transmission are continuing. The IoT has been shown to be a reliable and secure solution in this situation. The collection of patient data by IoT devices helps expedite the detection procedure. This can be accomplished by obtaining body temperature readings from subjects, analyzing samples from suspected cases, and so on [17].
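The temperature-based screening mentioned above can be sketched as a simple filter over readings reported by IoT devices. The Python fragment below is a minimal, hypothetical illustration: the `Reading` record, the function name, and the 38.0 °C threshold are assumptions made for this example and are not taken from the chapter or from any clinical guideline.

```python
from dataclasses import dataclass

FEVER_THRESHOLD_C = 38.0  # assumed screening threshold, for illustration only


@dataclass
class Reading:
    subject_id: str
    temperature_c: float


def flag_suspected_cases(readings: list[Reading],
                         threshold_c: float = FEVER_THRESHOLD_C) -> list[str]:
    """Return the sorted IDs of subjects with at least one reading at or above the threshold."""
    flagged = {r.subject_id for r in readings if r.temperature_c >= threshold_c}
    return sorted(flagged)


# Hypothetical readings collected from wearable or contactless sensors.
readings = [
    Reading("p1", 36.7),
    Reading("p2", 38.4),
    Reading("p3", 37.2),
    Reading("p2", 38.1),
]
suspected = flag_suspected_cases(readings)
```

In a real deployment the flagged subjects would only be candidates for follow-up testing, in line with the "suspected cases" step described in the text.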

1.1.1. Features of IoT

AI, connectivity, sensors, active engagement, and small device use are the most significant aspects of IoT. The following is a quick rundown of these features:

a. AI: IoT makes virtually anything "smart" through the power of data collection, artificial intelligence algorithms [18], and networks [19]. This can be as simple as installing sensors in your fridge and cabinets that detect when you are running low on milk and pulses and then place an order with your local grocery store.
b. Connectivity: Thanks to new enabling technologies for networking, such as Wi-Fi and Bluetooth, IoT networks are no longer confined to a few big suppliers. Networks can exist on a much smaller and more cost-effective scale while still being useful. IoT creates these micro-networks between its system components.
c. Sensors: IoT would be nothing without sensors. These defining instruments turn IoT into an active system capable of real-world integration [20].
d. Active engagement: Much of today's interaction with connected technology is passive rather than active. IoT introduces a new way to engage actively with content, products, and services.
e. Small devices: As expected, devices have shrunk in size, cost, and power. The precision, scalability, and flexibility of IoT are made possible by the use of custom-built tiny devices [21].

With slight guidance from humans, this concept becomes a possibility, as shown in Figure 1.

[Figure: three intersecting regions labeled Things, Internet, and Human]

Figure 1. Tri-sectional intersection among things, internet and human.


IoT is the result of interaction among three elements: things (objects), the internet, and human beings. However, the transition from the internet to IoT did not happen overnight; it took place gradually. In the future, artificial intelligence is also very likely to be incorporated to enhance the performance of IoT-based devices and applications [22].
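The fridge example given among the features of IoT (sensors detect low stock and an order is placed) follows a sense, decide, act pattern that can be sketched in a few lines. The Python fragment below is only an illustrative sketch: the item names, stock levels, thresholds, and function name are hypothetical, and placing the actual order with a grocery store is left out.

```python
def items_to_reorder(stock_levels: dict[str, float],
                     thresholds: dict[str, float]) -> list[str]:
    """Return the items whose sensed stock level has dropped below its reorder threshold."""
    return sorted(
        item
        for item, level in stock_levels.items()
        if level < thresholds.get(item, 0.0)
    )


# Hypothetical sensor readings (litres of milk, kilograms of pulses).
sensed_stock = {"milk": 0.2, "pulses": 1.5}
reorder_below = {"milk": 0.5, "pulses": 1.0}

order = items_to_reorder(sensed_stock, reorder_below)
```

Items without a configured threshold are never reordered in this sketch, which keeps the rule conservative when a sensor reports an unknown product.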

1.1.2. A Comparison between Traditional Internet and IoT

Table 1 sheds light on the key changes that IoT has brought to the traditional internet.

Table 1. Comparison between Traditional Internet and IoT

Topic                           Traditional Internet            IoT
Content creation                By human                        By machines
Mechanism of combining content  By explicitly defined links     By explicitly defined operations
Value                           Answers the query               Action and timely info
Connection type                 Point-to-point and multipoint   Multipoint
Digital data                    Readily provided                Not generated until augmented or manipulated
Data formats                    Homogeneous                     Heterogeneous
Composition                     PC, server and smart phones     RFID and WSN nodes
Content consumption             By request                      By pushing info and triggering actions
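The last row of Table 1 contrasts content that is consumed on request with information that is pushed and triggers actions. The Python sketch below illustrates that difference with two tiny in-memory classes. The class names, the topic string, and the 30 °C limit are hypothetical and serve only to make the contrast concrete; real systems would use network protocols rather than in-process calls.

```python
from typing import Callable


class RequestServer:
    """Traditional internet style: content is returned only when it is requested."""

    def __init__(self) -> None:
        self._pages: dict[str, str] = {}

    def store(self, path: str, content: str) -> None:
        self._pages[path] = content

    def get(self, path: str) -> str:
        return self._pages[path]


class PushBroker:
    """IoT style: new readings are pushed to subscribers and may trigger actions."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[float], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[float], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, value: float) -> None:
        for handler in self._subscribers.get(topic, []):
            handler(value)


# Consumption by request: nothing happens until the consumer asks.
server = RequestServer()
server.store("/status", "all systems normal")
page = server.get("/status")

# Consumption by push: a sensor reading arrives and triggers an action.
actions: list[str] = []
broker = PushBroker()
broker.subscribe(
    "room/temperature",
    lambda celsius: actions.append("start cooling") if celsius > 30.0 else None,
)
broker.publish("room/temperature", 24.5)  # below the assumed limit: no action
broker.publish("room/temperature", 33.0)  # above the assumed limit: action triggered
```

In the pull case the consumer decides when to look; in the push case the arrival of data itself drives the behavior, which is the shift the table describes.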

1.1.3. Pros & Cons of IoT

The Internet of Things (IoT) offers numerous advantages. A few of them are listed below [23]:

• Improvements in customer interaction: Current analytics suffer from blind spots and errors that they cannot identify and fix, and, as noted above, customer engagement remains passive. IoT completely transforms the way audiences are engaged [24].
• Device usability improvements: The same tools and data that give customers a better shopping experience also aid the development of more effective technological solutions. IoT provides unprecedented access to a wealth of operational and locational data.
• Reduced waste: IoT helps reduce the amount of waste generated. Current analytics provide only surface-level information, whereas IoT provides actionable data that can be used to better allocate scarce resources.
• Enhanced data collection: Modern data collection suffers from its limitations and from being designed for passive use. IoT takes data collection out of its traditional confines and puts it exactly where we want it, right at our fingertips, so that we can better understand the world around us. It provides a complete picture of the world [25].

There are also some drawbacks to the Internet of Things (IoT), such as [23]:

• Security: IoT creates a network of interconnected devices that communicate with each other all the time. Despite any security measures, the system gives the user very little control, which leaves users vulnerable to a variety of cyber-attacks [24].
• Privacy: The sophistication of IoT makes it possible to collect a large amount of personal data in minute detail without the user having to do anything.
• Complexity: Some people find IoT systems hard to design, deploy, and maintain because they use multiple technologies and a wide range of new enabling technologies.
• Compliance: Many already consider standard software compliance a battle, and the complexity of IoT makes compliance appear almost impossible to achieve [25].

1.1.4. Applications of IoT A wide range of sectors and markets can benefit from the Internet of Things (IoT). Those who wish to save money on their utility bills to major corporations who want to simplify their operations are all included in the user base. 1. Engineering, manufacturing, and infrastructure are all examples of these fields. For example, IoT can be used to improve production, marketing, service delivery, or even public safety. As a result of the Internet of Things, it is possible to monitor a wide range of processes in more detail than ever before. A deep level of control provided by the Internet of Things (IoT) enables rapid and more effective action on opportunities, such as clear client requirements, nonconforming products and malfunctioning equipment. Joan, for example, is the

本书版权归Nova Science所有

Integrating IoT Based Security with Image Processing

31

owner of a company that produces protective shields for industrial machinery. When regulations change, production robots are automatically reprogrammed with the new requirements for shield composition and function, and engineers are notified once the changes are approved [26].

2. Public Administration and Public Safety: Applied to government and public safety, the Internet of Things (IoT) can improve law enforcement, defense, city planning, and economic management. New technology has remedied many of the existing shortcomings in these areas. IoT can help city planners and governments better understand the impact of their designs on the local economy. Joan, for instance, lives in a small town. Crime in her neighborhood has recently increased, and she is concerned about returning home at night. System flags have alerted local law enforcement to the new “hot” area, and officers have increased their presence there. Law enforcement has also followed up on leads discovered by area monitoring equipment in an effort to stay one step ahead of criminals [22].

3. Places of Work and Residence: IoT provides a personalized experience in our daily lives, from our homes to our workplaces to the businesses we deal with. As a result, we are happier, more productive, and better protected from harm. IoT can, for example, help us customize our workspace to suit our working needs. Take Joan’s job in advertising. She walks into her office, and the system identifies her. It sets the temperature and brightness of the room to her personal preferences. It activates her devices and launches her applications so that she can resume where she left off. Before she arrived, her office door detected and recognized several visits from a coworker. Joan’s computer system automatically opens this visitor’s messages [27].

4. Physiology and Pharmacology: IoT moves us toward the future of medicine we envision, one built on a sophisticated, tightly integrated network of medical equipment. It has the potential to transform healthcare in a variety of ways. Integrating all the parts of medical research and medical organizations makes possible greater precision, more attention to detail, faster reactions to events, and continual improvement. Joan, for example, works in an emergency room as a nurse. A man has been

Copyright of this book belongs to Nova Science.


Vivek Veeraiah, Jay Kumar Pandey, Santanu Das et al.

injured in an incident and is in need of assistance. The system recognizes the patient and retrieves his medical history. Paramedic equipment on the scene automatically captures vital information and transmits it to those in the hospital who need it. To provide guidance, the system examines both the new data and the existing records. During the trip to the hospital, the system updates automatically to reflect the patient’s current condition. The system then prompts Joan to approve its proposed actions for the distribution of medicine and the manufacture of medical equipment [28].

5. Promotion and Dissemination of Information: IoT works in a way comparable to current technologies, analytics, and big data, but goes deeper. Existing data can be used to create metrics and patterns over time, although it generally lacks depth and accuracy. IoT improves on this by supplying more information in greater detail, which yields more accurate metrics and patterns. It enables companies to better understand and meet the preferences and needs of their customers. Businesses and consumers alike benefit from the increased efficiency and better strategy that come from delivering only relevant content and solutions.

6. Enhanced Marketing: Advertising today suffers from an overabundance of messages and poor audience targeting. Adverts still fail, even with the help of today’s advanced analytics. IoT promises targeted advertising rather than one-size-fits-all approaches. It allows consumers to engage with advertising rather than just passively receiving it. This makes advertising more useful to consumers who are looking for solutions in the market or who are unsure whether those solutions exist.

7. Monitoring the Environment: Environmental protection, extreme weather monitoring, water safety, endangered species protection, and commercial farming are just a few of the many uses for IoT in environmental monitoring. In these settings, sensors monitor and record every relevant aspect of the environment [29-31]. IoT improves current air and water safety monitoring technology because it requires less human labor, allows frequent sampling, broadens the scope of sampling and monitoring, and allows more advanced testing on-site. This helps to prevent large-scale pollution and similar disasters. In commercial farming, weather is difficult to monitor accurately with current techniques, which lack precision, require human effort, and offer little automation. IoT makes it possible to reduce the amount of human involvement in system operation, farming analysis, and monitoring. Systems monitor crops, soil, the environment, and more, and they use large amounts of data to improve routine processes. These systems can also help prevent health hazards (such as E. coli) [32].

8. Industry-Wide Smart Product Enhancements: In manufacturing, as in content distribution, IoT enables deeper real-time insights. This saves a great deal of time and money by removing the need for extensive market research before, during, and long after a product is released. Because it provides more accurate and comprehensive data, IoT also lowers the risk of introducing new or updated products to the market. Its data is more reliable and trustworthy than information gathered from a variety of unreliable sources.

9. Optimized Resource Use and Waste Reduction at a Lower Cost: IoT can save money and time in many ways by replacing traditional labor and tools in a manufacturing facility and across the supply chain; for example, IoT instruments and sensors perform maintenance checks or tests that previously required human labor. IoT also improves operational analytics to maximize the use of resources and labor and to remove various kinds of waste, such as wasted energy and materials. It analyzes the entire process, not just the part of it inside a specific facility, so that any improvement has a greater impact. In effect, it cuts waste across the entire network and spreads the savings across it [33].

10. Proper Use and Protection of the Product: Even the most advanced system cannot prevent every malfunction, non-conforming product, or other hazard from reaching the market. In some cases, these incidents result from issues that have nothing to do with the manufacturing process itself. Using IoT in manufacturing can help avoid recalls and the distribution of harmful products. Its control, visibility, and integration make it easier to deal with any problems that arise.

11. Building/Housing: Using IoT in buildings and other structures enables us to automate a wide range of tasks and needs in both residential and commercial settings. Costs are reduced, safety is improved, productivity is increased, and quality of life is improved, as is also seen in manufacturing and energy applications [33].

a. Climate and Temperature: Managing the climate and conditions inside a building is one of the most difficult engineering tasks because many factors are involved, including the outside climate and the building materials. Managing energy costs is the primary focus, although conditioning also affects the structure’s durability and state. The more precise and complete building data provided by IoT makes it possible to improve both structural design and the management of existing structures. This analysis provides information such as how well a material acts as insulation in a specific design and environment [13].

b. Safety and Health: Even carefully constructed buildings can have health and safety problems, such as vulnerability to extreme weather and weak foundations. Current solutions cannot detect minor problems before they become major problems or emergencies. IoT, for example, can evaluate changes in system status that affect fire safety rather than simply detecting smoke. This provides a more dependable and complete solution, as IoT can monitor issues in a fine-grained fashion to control threats and help prevent them [13].

c. Quality of Life and Productivity: In addition to safety and energy concerns, most people want precise lighting and temperature settings in their homes or workplaces. IoT amplifies these comforts by making customization easier and faster. Productivity benefits as well: IoT systems customize spaces to create an ideal environment for a specific person, as in a smart office or kitchen prepared for one individual [13].
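Item b contrasts evaluating changes in system status with simply detecting smoke. The following is a minimal sketch of that contrast, assuming a hypothetical series of room-temperature readings and made-up thresholds (the function names, sample data, and limits are illustrative and do not come from the chapter). A fixed-limit alarm fires only once the limit is crossed, whereas a trend check flags a sustained rise earlier.

```python
def threshold_alarm(readings, limit=60.0):
    """Return the index of the first reading above a fixed limit, or None."""
    for i, value in enumerate(readings):
        if value > limit:
            return i
    return None


def trend_alarm(readings, window=3, max_rise=2.0):
    """Return the first index where the average rise per step over the
    previous `window` steps exceeds `max_rise`, or None."""
    for i in range(window, len(readings)):
        rise_per_step = (readings[i] - readings[i - window]) / window
        if rise_per_step > max_rise:
            return i
    return None


# Hypothetical temperature samples (degrees Celsius), one per minute.
temps = [21.0, 21.2, 21.1, 24.0, 28.5, 34.0, 41.0, 50.0, 62.0]
print("fixed threshold fires at sample", threshold_alarm(temps))
print("trend check fires at sample", trend_alarm(temps))
```

On this sample data the trend check fires several samples before the fixed limit is reached, which is the kind of early warning the text describes; real deployments would need thresholds tuned to the sensor and the building.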

1.1.5. Security Flaws of IoT

The vulnerabilities of this technology, combined with its ability to perform specific functions on its own, expose it to liability arising from its use. Device malfunction, attack, and data theft are the three key concerns. They can cause a wide range of problems [27].


Fault in the Equipment: IoT’s deeper level of automation gives systems control over functions that can endanger life and property. Even a minor error in an IoT furnace control system can result in frozen pipes and flood damage in an unattended home, which is why the reliability of these systems is so critical. Organizations must therefore develop countermeasures.

Cyber Attack: Everything connected to an IoT network is vulnerable to attack, including the network itself. Monitoring, access privileges, and custom protections are the best ways to prevent this. Some of the simplest defenses have proven to be the most effective [28]:

• Built-in security: People and organizations should look for devices whose hardware and firmware are protected.
• Encryption: Both the manufacturer’s and the user’s systems must implement it.

Organizations and individuals must also take all risks into account when designing or selecting their systems.

Theft of Personal Information: Data is both the greatest asset and the greatest liability of IoT, and it is an irresistible target for many. The same measures that are effective in preventing attacks can also be used to deal with this threat.

1.2. Image Processing

A digital image is composed of a finite number of elements, each of which has a specific value at a specific location. These elements are called picture elements, image elements, or pixels [32]. “Pixel” is the term most commonly used for the constituent parts of a digital image. Signal processing covers the analysis, storage, and filtering of signals (for example, digital signal processing) [34-38]. It deals with audio signals, visual signals, and other types of signals. Within signal processing, image processing is the field in which both the input and the output are images [39]. Digital image processing is the use of software to process images; it touches on computer graphics, signals, photography, camera mechanisms, and pixels [40]. It supports various activities, such as image enhancement and the processing of analogue and digital signals, image signals, and speech signals. Images are available in a variety of formats. In digital image processing, algorithms are used to modify images. For digital image editing, Adobe Photoshop is the most extensively used software [41]. Table 2 presents the major functions of digital image processing.

Table 2. Key functions of digital image processing

Major function        Example operations
Image enhancement     Brightness and contrast adjustment, image averaging, convolution, frequency-domain filtering, edge enhancement
Image restoration     Photometric correction, inverse filtering
Image analysis        Segmentation, feature extraction, object classification
Image compression     Lossless and lossy compression
Image synthesis       Tomographic imaging, 3-D reconstruction
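The first row of Table 2 lists brightness and contrast adjustment as enhancement operations. A minimal sketch of both, assuming an 8-bit greyscale image stored as a list of rows (the function names and the sample image are illustrative, not taken from the chapter):

```python
def adjust(image, contrast=1.0, brightness=0):
    """Apply out = contrast * in + brightness to every pixel, clamped to 0..255."""
    return [
        [max(0, min(255, round(contrast * p + brightness))) for p in row]
        for row in image
    ]


def stretch_contrast(image):
    """Linearly rescale so the darkest pixel maps to 0 and the brightest to 255."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # flat image: nothing to stretch
        return [row[:] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


img = [[50, 100], [150, 200]]         # tiny 2x2 greyscale example
print(adjust(img, contrast=1.2, brightness=10))
print(stretch_contrast(img))
```

Both functions are point operations: each output pixel depends only on the corresponding input pixel, which is why they are among the simplest enhancement techniques.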

1.2.1. Characteristics of Digital Image Processing (DIP)

This section highlights the features of digital image processing:

a. Some of the software it uses is free of charge.
b. It produces clear images.
c. It performs image enhancement to recover information from images.
d. It is used in a wide variety of fields.
e. It reduces the complexity of processing digital images.
f. It supports a better quality of life.

Advantages of DIP are:

• Reconstruction of a previously captured image
• Retouching of an image
• Fast image storage and retrieval
• Image distribution that is both fast and of good quality
• Controlled viewing (windowing, zooming)

Drawbacks of DIP are:

• It is time-consuming.
• Depending on the system, it can be extremely expensive.
• It requires qualified personnel.


1.2.2. Applications of DIP

Here are a few examples of how digital image processing is applied in a variety of industries.

• Improving visual quality: The image is manipulated to achieve the desired result. Typical operations include conversion, sharpening, blurring, edge detection, retrieval, and recognition of images.

• Medical field: MRI and CT scanners provide 3D images of the human body, and these 3D datasets are commonly used in medical image processing. They aid in the diagnosis of diseases and in planning medical procedures such as surgery, as well as in research. Medical imaging applications include gamma-ray imaging, PET scanning, X-ray imaging, medical CT scanning, and UV imaging.

• Transmission and encoding: Technology has made it possible to watch a live video stream or live CCTV footage from any part of the world within a couple of seconds. This reflects improvements not only in image transmission but also in image encoding.

• Microscopy: Using techniques such as light microscopy or transmission electron microscopy, a digital image can be created that shows the intensity of the light or other radiation that has passed through the material.

• Image compression: Besides reducing the amount of data needed to describe an image, image compression is concerned with keeping the image complete enough to be processed and transmitted.

• Machine/robot vision: Robot vision is the combination of camera hardware and computer algorithms that allows robots to process visual data from the outside environment. Many robots rely on digital image processing, for example to navigate; without it they are essentially blind. Robots use vision to carry out complex tasks in a dynamic environment, and the captured images are enhanced and interpreted by digital image processing algorithms.

• Color processing: Each pixel in a digital color image carries color information. Three values per pixel are used to measure the intensity and chrominance of light, and the digital image data stores the brightness information for each spectral band. Color image processing also covers how color images are transmitted, stored, and encoded.

• Pattern recognition: Image processing can be used to identify recurring patterns and to give the computer the intelligence needed for human-like recognition. Pattern recognition is useful for extracting important information from images or videos. It is used extensively in biological and biomedical imaging and other computer vision applications. Combined with artificial intelligence, it makes computer-aided diagnosis, handwriting recognition, and image recognition much easier.

• Video processing: A video is a sequence of frames combined into one stream. Video processing, a special case of signal processing, includes motion detection, noise reduction, and color space conversion. Television sets, DVD players, VCRs, video codecs, and other devices use video processing algorithms routinely.
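Motion detection, named in the video processing item above, is often introduced through frame differencing. The sketch below is one simple way to do it, assuming greyscale frames stored as lists of rows; the threshold values and function names are illustrative assumptions, not the chapter's method.

```python
def motion_mask(prev_frame, next_frame, threshold=25):
    """Return 1 where the absolute grey-level change exceeds `threshold`, else 0."""
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prev_frame, next_frame)
    ]


def motion_detected(prev_frame, next_frame, threshold=25, min_pixels=2):
    """Report motion when at least `min_pixels` pixels changed noticeably."""
    mask = motion_mask(prev_frame, next_frame, threshold)
    return sum(map(sum, mask)) >= min_pixels


# Two hypothetical 3x3 frames: a bright object appears in the middle row.
frame0 = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame1 = [[10, 10, 10], [10, 200, 200], [10, 10, 12]]
print(motion_mask(frame0, frame1))
print(motion_detected(frame0, frame1))
```

The `min_pixels` guard is a crude form of noise reduction: isolated single-pixel changes, which are common with sensor noise, do not trigger a detection.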

1.2.3. Architecture/Working

Digital image processing involves the following steps (as shown in Figure 2):

1. Image acquisition: This is the first of DIP’s essential phases. In this stage, a digital image is provided. Pre-processing, such as scaling, is typically done at this point.

2. Image enhancement: This is among the simplest and most appealing areas of DIP. Details that are obscured, or features of interest in an image, are brought out at this stage, for example by adjusting brightness and contrast.

3. Image restoration: This stage also improves the appearance of an image, typically by correcting degradation.

4. Color image processing: This is an important area of study for digital images. It includes color modeling and processing in the digital domain.

5. Wavelets and multi-resolution processing: In this stage, an image is represented at various resolutions.


Figure 2. Working of image processing.

6. Compression: Compressing an image reduces its size. This is essential for images used on the internet.

7. Segmentation: At this stage, an image is divided into its constituent parts. Segmentation is the most difficult task in DIP. Imaging problems that require the identification of individual objects cannot be solved without this lengthy process.

8. Representation and description: These follow the segmentation stage, whose output is raw pixel data for each region. Representation converts this raw data into a form suitable for further processing, whereas description extracts information that distinguishes one class of objects from another.

9. Object recognition: At this stage, each object is given a label based on its descriptors.

10. Knowledge base: The final component of DIP is knowledge. Critical information about the image is recorded here, which narrows the scope of search operations. When the image database contains high-resolution satellite images, the knowledge base becomes extremely complicated.
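Steps 7 to 9 above can be sketched end to end on a toy image. This is a minimal illustration under stated assumptions: segmentation is done by a fixed threshold plus 4-connected labelling, the descriptors are area, bounding-box size, and extent, and the recognition rules are invented for the example. Real systems use far richer descriptors and classifiers.

```python
from collections import deque


def segment(image, threshold=128):
    """Step 7: threshold the image and group bright pixels into 4-connected regions."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if image[y][x] < threshold or labels[y][x]:
                continue
            label = len(regions) + 1
            labels[y][x] = label
            pixels, queue = [], deque([(y, x)])
            while queue:                       # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx] and image[ny][nx] >= threshold):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def describe(pixels):
    """Step 8: turn raw region pixels into a small set of descriptors."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"area": len(pixels), "width": width, "height": height,
            "extent": len(pixels) / (width * height)}


def recognize(d):
    """Step 9: assign a label from the descriptors (hypothetical rules)."""
    if d["extent"] < 0.9:
        return "irregular"
    return "square" if d["width"] == d["height"] else "bar"


img = [
    [0,   0,   0, 0,   0, 0],
    [0, 255, 255, 0,   0, 0],
    [0, 255, 255, 0, 255, 0],
    [0,   0,   0, 0, 255, 0],
    [0,   0,   0, 0, 255, 0],
]
print([recognize(describe(region)) for region in segment(img)])
```

The split between `describe` and `recognize` mirrors the text: the descriptor step reduces raw pixels to a few numbers, and the recognition step only ever looks at those numbers.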

1.3. CBIR

Algorithms are used in image processing to modify images. An image retrieval system is called upon when a user requests a certain image from a database; it allows images to be searched, browsed, and retrieved from a large collection. Content-based image retrieval (CBIR) is a common approach. CBIR, also referred to as query by image content (QBIC) and content-based visual information retrieval (CBVIR), applies computer vision techniques to the problem of finding digital images stored in large databases. A “content-based” search is one that analyzes the actual content of an image rather than the metadata associated with it. In this context, “content” can refer to the colors, shapes, textures, or any other information that can be derived from the image itself. Many problems can be solved using CBIR, which is based on visual analysis of the contents of the query image [42, 43].

Figure 3. Framework of CBIR.

Figure 3 depicts the general CBIR framework, which includes necessary and optional steps. The process begins when the user submits a query image to the CBIR system. Processing is divided into online steps, which take place after the user submits the query image, and offline steps, which take place beforehand. The framework may include an optional preprocessing stage that includes resizing, segmentation, denoising, and rescaling, among other things. This optional stage precedes feature extraction, in which a visual concept is transformed into numerical form. The extracted features can be low-level features or local descriptors. The final step is to compare the features extracted from the query image with those of all other images in the dataset in order to find the most relevant images. Relevance feedback can also be used to improve results by allowing the user to mark which retrieved images are relevant and which are not. Relevance feedback can be applied in a variety of ways to improve CBIR’s performance.
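The framework described above can be sketched with one low-level feature. The example below assumes a normalised RGB color histogram as the feature and the L1 distance as the similarity measure; both are common choices but are assumptions here, as are the function names and the toy database. The index is built offline, and the query is processed online.

```python
def color_histogram(pixels, bins=4):
    """Feature extraction: normalised joint RGB histogram with bins**3 cells."""
    hist = [0.0] * bins**3
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    return [count / len(pixels) for count in hist]


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_index(database):
    """Offline stage: extract and store features for every database image."""
    return {name: color_histogram(pixels) for name, pixels in database.items()}


def retrieve(query_pixels, index, top_k=2):
    """Online stage: extract query features and rank database images by distance."""
    query = color_histogram(query_pixels)
    ranked = sorted(index, key=lambda name: l1_distance(query, index[name]))
    return ranked[:top_k]


# Hypothetical database: each image is a flat list of (R, G, B) pixels.
database = {
    "sunset": [(250, 120, 30)] * 6 + [(200, 40, 20)] * 2,
    "forest": [(20, 150, 40)] * 7 + [(90, 70, 30)],
    "ocean": [(10, 60, 200)] * 8,
}
index = build_index(database)
query = [(240, 110, 40)] * 5 + [(30, 140, 50)] * 3   # mostly orange, some green
print(retrieve(query, index))
```

Relevance feedback could be layered on top of this by re-weighting histogram cells according to the images the user marks as relevant; that step is omitted here to keep the sketch short.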


1.3.1. Applications of CBIR

• In the medical field: Given a 2D MRI query image from the user, this application identifies the 3D volume that matches the brain area of the query slice. The matching slices are then found by searching further within the matched 3D volume. The support vector machine (SVM) approach is used with the CBIR system. CBIR is also used to search for histology images in a medical database, where it can retrieve images based on both iconographic and semantic information. Textual annotations of the images are generated as well, which makes this an improved retrieval technique for histology images.

• For remote sensing: These images are typically taken by satellites or aircraft. For military applications, it is imperative that CBIR be applied to the geographical image database in order to organize the images of a given area. Image retrieval takes very high resolution (VHR) remote sensing images as input, and the anisotropic property is characterized by the local structure tensor (LST). In marine monitoring, large-scale colonization by algae causes serious problems for the marine environment and the people who depend on it. Detecting the algae helps to stop this, which makes it vital for human safety. The detection is achieved by RGB-to-NTSC color space conversion followed by the k-means segmentation algorithm.

• In e-commerce: Online shopping has supplanted traditional brick-and-mortar stores as the preferred method of making purchases around the world. When purchasing online, users have difficulty deciding what to buy. CBIR systems can help users select apparel when buying clothes online. Feature sets including colors, shapes, and textures are used in this type of e-commerce application. Because trademarks are a form of intellectual property, many businesses register trademarks to protect themselves, and protecting the trademark symbol is a necessary consideration. CBIR systems help protect trademarks by identifying images that are similar to the trademark so that they can be removed.

• For video retrieval: Video retrieval is essential for meeting multimedia needs. It can be used to look for videos on the web and browse through the results. Applying the CBIR principle in a video retrieval application is vital. The videos in the database are divided into frames, and the query image is matched against these frames to retrieve the relevant videos. The matching uses color, shape, and texture features.

• In enhancing security: Using a query image, a physician or diagnostic assistance system can search for medical images on a remote server, which retrieves the relevant images and shares them with the physician or diagnostic assistance system. Security concerns arise here because the confidentiality and privacy of the data must be protected. CBIR systems with security should therefore be able to retrieve encrypted images. A mechanism that supports such applications using homomorphically encrypted images is devised in [44]. Researchers are also interested in biometric security systems because of their unique ability to secure systems. Biometric security systems identify a person by his or her distinctive physical or behavioral features. Color, texture, and shape are the three features suggested for image retrieval in biometric security systems. This technology not only speeds up the biometric system but also improves image retrieval accuracy.
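The remote sensing item above mentions RGB-to-NTSC color space conversion followed by k-means segmentation. A minimal sketch of that combination follows, assuming the NTSC space is YIQ (as provided by Python's standard `colorsys` module), clustering on the chrominance components only, and using a deterministic farthest-point initialisation so the result is repeatable. The pixel values and function names are hypothetical, and this is not the cited authors' implementation.

```python
import colorsys


def to_chrominance(pixels):
    """Convert 8-bit RGB pixels to YIQ and keep only the (I, Q) chrominance pair."""
    result = []
    for r, g, b in pixels:
        _, i, q = colorsys.rgb_to_yiq(r / 255, g / 255, b / 255)
        result.append((i, q))
    return result


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, iterations=10):
    """Plain k-means; returns the cluster index of each point and the centroids."""
    # Farthest-point initialisation (an assumption made for repeatability).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid, then recompute the centroids.
        assignment = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Hypothetical pixels: bluish water and greenish algae.
pixels = [(20, 60, 180), (25, 70, 190), (40, 160, 60),
          (35, 150, 50), (22, 65, 185), (45, 170, 70)]
assignment, _ = kmeans(to_chrominance(pixels), k=2)
print(assignment)
```

Dropping the luminance component Y makes the clustering less sensitive to brightness changes such as waves or shadows, which is one plausible reason for converting out of RGB before segmenting.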

1.4. Camera Surveillance and Machine Vision

To ensure the safety of its staff, customers, resources, and property, every organization must use video surveillance systems and be aware of both internal and external dangers. Many organizations and institutions are prioritizing the implementation of advanced monitoring systems because of growing security concerns. An increase in criminal activity and a rise in the number of security lapses that result in the loss of lives, data, and property are driving the electronic security market. A video surveillance system is highly recommended for financial organizations and banks to ensure the safety of their facilities as well as the management of their funds. CBIR combined with camera surveillance helps to set up a comprehensive security system, and integrating IoT makes it possible to establish a highly reliable one.

1. The growth in monitoring equipment and in the number of airports generates more monitoring data (images) than can be processed. CBIR approaches can help improve airport video surveillance systems, raising the security level and speeding up airport security operations.

2. Prevent Robberies and Criminal Investigation: Banks are high-value targets for criminals seeking a large payout, since they are known to hold cash and valuables. Effective video surveillance deters bank robbers, and in the worst case the images and videos it captures serve as critical evidence for law enforcement. Surveillance footage can also help investigators identify and track down suspects in a robbery or other criminal activity.

3. Fraud Check: Fraud checks at banks can be greatly improved by using IP video surveillance systems with advanced video analytics to record transaction data and capture images of suspects. The data can be used to find and trace fraudsters and to secure client accounts; staff are also deterred from manipulating data for fear of being caught.

4. Combat “Phantom ATM Withdrawals”: It is not uncommon for bank customers to report that money has been withdrawn from an ATM without their knowledge. Footage captured by ATM security cameras can help resolve such cases.

5. Co-ordinate Information from Multiple Locations and Integration with Alarm System: Today’s state-of-the-art video surveillance systems make it possible to transmit or view footage from bank branches across the country over the internet. This simplifies and speeds up auditing and control and reduces their cost. New-generation surveillance systems also make it possible to integrate bank security cameras and alarm systems into a single network. As a result, bank security measures are quicker and more effective.

6. Intelligent Functionality: Intelligent security cameras equipped with video analytics and advanced sensors, such as vibration, motion, glass-break, smoke, and heat detectors, can be used to identify and prevent suspicious or anomalous activity in and around a bank.

7. Digital Storage: Because surveillance footage is stored and managed digitally, improved search algorithms can be used to identify individuals with greater ease, speed, and precision, making it easier to track down specific incidents.

8. Continuous Surveillance: Bank security cameras provide continuous surveillance of banking facilities, which ensures safety outside normal business hours. This is extremely useful for customers, who may use ATMs around the clock.

9. Data Recognition and Remote Access: Digital video surveillance systems can use advanced data recognition to search the footage for specific bank transactions and images of people. With IP surveillance, recorded and live video can be watched on virtually any desktop or laptop computer with an internet connection and access to the local network.

1.5. Machine Vision

Humans recognize objects every day as part of interacting with the surrounding environment. They tend to view images as composed of individual objects that can be identified by their shape. Where human experience is lacking, or where the job puts human health at risk, automated vision systems are the natural substitute. Automated object recognition can be an important aid in manufacturing and quality control, operating continuously with constant and consistent performance. To be more effective than a human, these systems have to classify objects reasonably quickly and robustly. A human inspector has the disadvantage of becoming distracted or bored when the task is very monotonous, and may not be fast enough when the pace of production requires it. Image processing, CBIR, and IoT help to turn the concept of machine vision into reality.

Automated assembly systems are widely employed in manufacturing and packaging industries. A robotic system is used to assemble the product; it comprises an image-capture unit, a conveyor, a component recognition unit, a component feeder, and a component selection unit. Industrial automation can improve the overall product manufacturing and inspection process, reduce resource use, and save time. Quality checking and inspection can benefit greatly from machine vision. Machine vision applications now frequently use SURF (Speeded Up Robust Features). SURF can be applied to a wide range of tasks, including image recognition, object detection, object matching, and object categorization. SURF was preceded by SIFT, and BRISK, FAST, HOG, FREAK, and many other algorithms with similar capabilities are available. For a given image, the SURF detector extracts distinctive local features. If an object on a conveyor belt is presented for matching and inspection, this algorithm, being invariant to rotation and scaling, can locate it accurately and efficiently regardless of the object’s position on the belt. The computer vision module, consisting of a monochrome camera and a computer, captures the images. The camera is permanently installed above the conveyor belt, looking down at it. By convention, the camera is set up so that the captured images have the proper width and height, with the image width and length corresponding to those of a casket. The camera’s height above the belt is adjusted so that the full width of the casket is clearly visible in the images. The images are captured in an enclosed environment with constant lighting.

Computer vision brings several fields together. Through the use of deep learning algorithms, artificial intelligence has made a major breakthrough in computer vision technology; this is a vital step toward the development of automated and intelligent systems. Automotive manufacturing requires large robots that can handle heavy loads and are capable of arc welding and other complex, repetitive processes. The 3C electronics industry has embraced flexible and compact robots, which has increased the precision with which products are assembled and processed. An industrial robot system is made up of a mechanical body and a controller. High-performance computers have made it possible to build the robot controller into the robot itself. The main goals are to improve the adaptation of industrial robots to their environment, reduce the complexity of use, shorten the time it takes to learn how to operate them, and widen their range of applications. On-line programming is the more common approach for industrial robots. Off-line programming, by contrast, generates the robot’s motion trajectory and control programs from digital models of the workpiece and has the advantage of high programming efficiency. Third-party programming and simulation software, as well as open-source software, can be used for off-line programming of industrial robots. To be practical, however, two conditions must be met: an accurate three-dimensional digital model must be available, and the absolute positioning accuracy of both the robot and the workpiece must be high. As a result, the widespread adoption of industrial robots, and the growth of the industrial robot sector itself, are hindered by a lack of attention to application technology.

Three-dimensional reconstruction of objects reflects a more thorough perception and processing of sensory information. Three-dimensional reconstruction uses multiple photographs to compute and then restore the object’s three-dimensional structure. In this way, computer vision can imitate human vision in a wide range of applications in industrial automation, human-computer interaction (HCI), and other areas. Automatic detection based on computer vision is now used throughout industry. Because of its high precision, adaptability to new functions, dependability, speed, contactless operation, and favorable cost-to-performance ratio, this technology considerably enhances production capacity.

2. Literature Review

Researchers in IoT, image processing, pattern recognition, and CBIR have all contributed ideas and knowledge toward our objective of integrating IoT-based security with image processing. This section gives a brief overview of the papers that serve as a foundation and guide for our study. The authors in [3] presented a performance and comparison study of CBIR systems. The image retrieval results were based on two methodologies, both of which rely on two main procedures: feature extraction, and search and retrieval. The first approach used a binary tree structure, texture, and color, while the second used Canny edge detection to extract shape features. According to the authors in [4], images and patterns can be used in industrial engineering in a variety of ways. They first established the role of vision in an industrial setting, then described several approaches to image processing, feature extraction, object detection, and industrial robot guidance, and discussed examples of how these techniques have been used in industry. These techniques made it possible to build automated processes for inspecting and identifying parts and for controlling robots. Watermarking of ultrasound medical images with LZW lossless-compressed watermarks was the topic of [13]. The watermark data is compressed without any loss. The two components of their watermarking experiment were the secret key used to watermark an image and a defined region of interest (ROI). A comparison showed that LZW compression performed better than the traditional approaches. According to [5], a computer vision system tries to retrieve an image that matches a user-defined specification or pattern. Image retrieval is a major goal of computer vision, and feature vectors are commonly used to represent content attributes. CBIR takes into account the image’s qualities, such as its shape, color, and texture, as well as its metadata. That study uses a variety of methods to extract features from images for CBIR. According to [6], IoT and cloud computing are among the most essential ICT models for the coming generation. Building and deploying smart applications and solutions for smart cities can benefit greatly from both. Cloud computing refers to the on-demand supply of software and hardware resources over the Internet. IoT is depicted as a new generation of Internet-connected devices that deliver a variety of services for value-added applications. Their paper focused on the convergence of cloud computing and IoT for smart city application deployment, and explored Dubai as a smart city through several application-based scenarios. The authors in [17] proposed a new architecture for increasing the robustness of IoT infrastructure and presented technology options for implementing its components. Their idea was discussed as part of the Sus City initiative. The authors in [18] reported an efficient 3-D computer vision technique for recognizing machine CAD parts. As described in their article, MATLAB was used to detect commercially available CAD components in an image using feature-based object detection approaches. The proposed smart machine vision system for industrial production automation was shown to be accurate and resilient using photos of actual industrial tools. The authors in [22] thoroughly surveyed authentication mechanisms for the Internet of Things (IoT). More than 40 IoT-specific authentication protocols were selected and examined in detail as part of this study.
The study also presented the threats, countermeasures, and formal security verification methodologies relevant to IoT authentication protocols, and offered a taxonomy and comparison of authentication protocols for IoT in terms of network model, specific security goals, and process steps. For the ID and ID/EH users, the authors in [24] aimed to maximize the minimum signal-to-interference-plus-noise ratio (SINR) by jointly designing a transmitter power allocation strategy and an ID/EH receiver power splitting strategy, subject to maximum transmit power and minimum energy harvesting constraints. The non-convex problems were first solved using MRT, semidefinite relaxation (SDR), and ZF approaches. The multiuser interference was then removed using the ZF-DPC technique, resulting in a closed-form optimal solution. Simulations showed that ZF-DPC offered a more realistic minimum SINR than SDR or ZF in the vast majority of instances.


The authors in [25] introduced a hierarchical distributed approach for IoT-based home care systems. The proposed three-tier data management model combined cloud computing, dew computing, and fog computing in order to ensure smooth data flow across the system. They tested the suggested model with a distributed fuzzy logic approach in the context of an early fire detection system. The authors in [26] used simple methods for image classification and reported 100% accuracy. By incorporating industrial pipeline electronic technologies, they devised a strategy that could be applied to a variety of tasks. Their article discussed the method, and the considerations behind its design, in depth, and also examined the method’s edge cases. Pattern recognition, feature extraction, and image processing were the topics of the study presented in [32], which explained how they are interconnected. According to the findings, the research was done in 2018. A novel SDN architecture was proposed to achieve secure smart home environments. Using the typical software-defined networking paradigm, SH Sec focuses on the capabilities needed to provide a flexible, generic platform. In addition to being flexible and adaptable, it also ensures the safety of its users. Automated security can reduce the number of time-consuming manual evaluations and recommendations by system administrators. The network performance and feasibility of the suggested model were tested in a simulated network environment, and a variety of metrics were used to gauge the model’s efficiency and effectiveness. The authors in [28] developed an IoHT-based BCI method to help patients with amyotrophic lateral sclerosis accomplish predetermined activities. Ten volunteers tested the proposed method to confirm that it may be used alongside conventional medical treatment.

3. Problem Statement

There have been several studies in the area of image processing and IoT. The performance of an IoT system suffers mainly from delays in image processing. To improve the IoT system, the image size therefore needs to be reduced, so that image processing takes less time and the IoT system can quickly perform further operations after detecting any activity. The cost of storage also needs to be reduced, because the storage systems used in IoT have limited capacity and cannot hold older graphical datasets. A mechanism that reduces image size would let storage devices retain graphical data for a longer time. Previous research shows that image processing and content-based image retrieval consume a great deal of time and space. Most of the time is spent on decision making, which degrades the performance of the IoT system. To resolve these issues, the Canny edge detection mechanism could be used to reduce time and space consumption.

4. Proposed Model

The size of the images captured by IoT devices is reduced by integrating an edge detection mechanism, which lowers time and space consumption and yields a more efficient and economical IoT-based security system.

Figure 4. Flowchart of proposed work.

Figure 4 shows the flowchart of the proposed work. The image is captured both as a normal image and through the IoT device. Image preprocessing is applied to the IoT image, followed by the edge detection mechanism. The frame comparison operation is then run on both images (normal and IoT images) to determine the size of each frame image. Next, the time taken for frame comparison is measured for both images. Finally, the normal image is compared with the edge-detected image.

Figure 5. Proposed model.

In the proposed work, the image is captured both as a normal image and using IoT. Image preprocessing is applied to the IoT image, and the edge detection mechanism is then applied to the preprocessed image [45-48]. The frame comparison operation is run on both images (normal and IoT image) to obtain the size of each frame image, and the time taken for frame comparison is recorded for both. Finally, the normal image is compared with the edge-detected image. The complete model is shown in Figure 5.
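The steps of the proposed model can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frames are small synthetic grayscale grids, the edge detector is a plain gradient threshold standing in for Canny, storage size is approximated by lossless zlib compression, and all function names (`make_frame`, `edge_map`, `stored_size`, `changed_pixels`, `timed`) are hypothetical.

```python
import random
import time
import zlib

def make_frame(square, seed, size=64):
    """Synthetic grayscale frame: a bright square on a dark, noisy background."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    lo, hi = square
    return [[(200 if lo <= x < hi and lo <= y < hi else 40) + rng.randrange(16)
             for x in range(size)] for y in range(size)]

def edge_map(frame, threshold=50):
    """Binary edge map from forward differences (a simple stand-in for Canny)."""
    h, w = len(frame), len(frame[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = abs(frame[y][x + 1] - frame[y][x])
            gy = abs(frame[y + 1][x] - frame[y][x])
            edges[y][x] = 255 if gx + gy > threshold else 0
    return edges

def stored_size(frame):
    """Approximate storage cost: number of bytes after zlib compression."""
    raw = bytes(v for row in frame for v in row)
    return len(zlib.compress(raw))

def changed_pixels(a, b):
    """Frame comparison: count the pixels that differ between two frames."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

frame_a = make_frame(square=(16, 48), seed=1)
frame_b = make_frame(square=(20, 52), seed=2)  # the object has moved
edges_a, edges_b = edge_map(frame_a), edge_map(frame_b)

print("stored size, normal frame:", stored_size(frame_a))
print("stored size, edge map:    ", stored_size(edges_a))
print("frame comparison (normal):", timed(changed_pixels, frame_a, frame_b))
print("frame comparison (edges): ", timed(changed_pixels, edges_a, edges_b))
```

In this toy setting the edge map stores only the outline of the object, so it compresses to fewer bytes than the noisy grayscale frame, and the frame comparison reacts to the moved outline rather than to pixel noise. The timings depend on the machine and are printed only for illustration.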

4.1. Role of Canny Edge Detection in Reducing Image Size

Edge detection is a technique for obtaining structural information from diverse visual objects while decreasing the amount of data that must be analyzed. Many computer vision systems have made use of it. In his research, Canny observed that the requirements for applying edge detection are similar across a range of vision systems. An edge detection solution that meets these requirements can therefore be employed in a wide range of situations. Edge detection may be guided by the following broad criteria:


1. Edge detection must have a low error rate, so that it catches as many of the image’s edges as possible.
2. The edge point located by the operator should lie accurately at the center of the edge.
3. Each edge in an image should be marked only once, and image noise should not create false edges.

Figure 6 shows an example of Canny Edge Detection.

Figure 6. Canny edge detection.
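In Canny's detector, the first and third criteria are commonly served by double-threshold hysteresis: pixels with a strong gradient are accepted outright, weak pixels are kept only when they connect to a strong one, and isolated weak responses (typically noise) are discarded. The sketch below illustrates only this one step on a tiny hand-made gradient grid. The function name `hysteresis`, the grid values, and the thresholds are assumptions chosen for illustration.

```python
from collections import deque

def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) connected to them."""
    h, w = len(magnitude), len(magnitude[0])
    keep = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the search with every strong pixel.
    for y in range(h):
        for x in range(w):
            if magnitude[y][x] >= high:
                keep[y][x] = True
                queue.append((y, x))
    # Grow outward from strong pixels through 8-connected weak pixels.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w):
                    continue
                if not keep[ny][nx] and magnitude[ny][nx] >= low:
                    keep[ny][nx] = True
                    queue.append((ny, nx))
    return keep

# Hypothetical gradient magnitudes: a real edge along row 1 and an
# isolated weak response (noise) in the bottom-right corner.
grad = [
    [0,  0,  0,  0,  0],
    [90, 40, 45, 95, 0],
    [0,  0,  0,  0,  0],
    [0,  0,  0,  0,  45],
]
edges = hysteresis(grad, low=30, high=80)
for row in edges:
    print("".join("#" if v else "." for v in row))
```

The weak pixels in row 1 survive because they link two strong pixels, while the equally weak corner pixel is dropped because nothing strong is adjacent to it.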

5. Results and Discussion

5.1. Comparison of Size during Image Processing in IoT Environment

Table 3 compares the size of a given number of images when stored as normal images and as edge-detected IoT images. Figure 7 shows the size comparison of normal images and edge-detected images.

Table 3. Comparison of size during image processing in IoT environment

Number of images    Normal image    Edge detected
10                  90              50.80566
20                  180             178.2824
30                  270             222.6232
40                  360             356.0041
50                  450             428.8373
60                  540             531.6968
70                  630             580.5406
80                  720             704.2752
90                  810             807.985
100                 900             897.3057
110                 990             958.2748
120                 1080            1065.176
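The relative saving implied by Table 3 can be worked out directly from the tabulated values. The short sketch below does only that arithmetic. The source does not state the unit of size, so the numbers are treated as unitless, and the helper name `reduction_percent` is hypothetical.

```python
# Values copied from Table 3 (unit of size not stated in the source).
counts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
normal = [90, 180, 270, 360, 450, 540, 630, 720, 810, 900, 990, 1080]
edge = [50.80566, 178.2824, 222.6232, 356.0041, 428.8373, 531.6968,
        580.5406, 704.2752, 807.985, 897.3057, 958.2748, 1065.176]

def reduction_percent(before, after):
    """Relative size reduction, in percent of the original size."""
    return 100.0 * (before - after) / before

for n, a, b in zip(counts, normal, edge):
    print(f"{n:>4} images: {reduction_percent(a, b):6.2f}% smaller")

overall = reduction_percent(sum(normal), sum(edge))
print(f"overall: {overall:.2f}% smaller")
```

The per-row reduction varies widely between rows, so the overall figure is a more stable summary than any single row.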


Figure 7. Size comparison of normal image and edge detection Image.

5.2. Comparison of Time Taken during Image Processing in IoT Environment

Table 4 compares the time taken to process a given number of images as normal images and as edge-detected IoT images. Figure 8 shows the time comparison of normal images and edge-detected images.

Table 4. Comparison of time taken during image processing in IoT environment

Number of images    Normal image    Edge detected
10                  21.12           12.57729
20                  42.24           41.22144
30                  63.36           54.56675
40                  84.48           81.26004
50                  105.6           100.9639
60                  126.72          116.8578
70                  147.84          146.263
80                  168.96          162.0492
90                  190.08          188.3069
100                 211.2           210.9024
110                 232.32          224.1648
120                 253.44          251.7438
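As a quick check on Table 4, the per-image processing time and the total time saved can be derived from the tabulated values. The sketch below is arithmetic on the published numbers only. The time unit is not stated in the source, and the variable names are hypothetical.

```python
# Values copied from Table 4 (time unit not stated in the source).
counts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
normal_time = [21.12, 42.24, 63.36, 84.48, 105.6, 126.72,
               147.84, 168.96, 190.08, 211.2, 232.32, 253.44]
edge_time = [12.57729, 41.22144, 54.56675, 81.26004, 100.9639, 116.8578,
             146.263, 162.0492, 188.3069, 210.9024, 224.1648, 251.7438]

# Time per image: the normal-image column grows linearly with the count.
per_image_normal = [t / n for t, n in zip(normal_time, counts)]
per_image_edge = [t / n for t, n in zip(edge_time, counts)]

# Time saved in each row and across the whole table.
saved = [a - b for a, b in zip(normal_time, edge_time)]
total_saved = sum(saved)

for n, pn, pe in zip(counts, per_image_normal, per_image_edge):
    print(f"{n:>4} images: normal {pn:.3f} per image, edge detected {pe:.3f} per image")
print(f"total time saved: {total_saved:.5f}")
```

The normal-image column corresponds to a constant cost per image, so any saving in the edge-detected column shows up as a per-image time below that constant.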


Figure 8. Time comparison of normal images and edge detection images.

Conclusion and Future Outlook

Researchers are focusing on edge detection in image processing and on ant colony optimization (ACO). Image processing uses edge detection to emphasize the borders within a picture, whereas ant colony optimization searches for the most efficient solution to a given problem. As part of the planned research, an ACO-based edge detection method is being considered. This approach builds on research in image processing, edge detection, and ant colony optimization. The problem statement elaborates on the points raised in prior studies. With ant colony optimization, edge detection is becoming increasingly important. ACO has been used, in simulation, to solve the binary knapsack, quadratic assignment, and traveling salesman problems. A Canny-based approach is used to build the edge detection mechanism during image processing. The recommended methodology generates a more complete edge profile than the standard methods, but it also obscures some of the real edges that would otherwise have been highlighted. To incorporate the advantages of all of these techniques while maintaining the visibility of actual edge information, future research will concentrate on uncovering the underlying reasons and enhancing the ant colony optimization process.


References

[1] Srivastava, A., Gupta, A., & Anand, R. (2021). Optimized smart system for transportation using RFID technology. Mathematics in Engineering, Science & Aerospace (MESA), RR12(4).
[2] Anand, R., Mann, A., & Sharma, K. (2020). Deep Metric Learning-based Face Recognition Pipeline with Anti-Spoofing on Raspberry-Pi Single-Board Computer. Test Engineering & Management, 82, 4302-4308.
[3] Shambharkar, Saroj A. & Tirpude, Shubhangi C. “A Comparative Study on Retrieved Images by Content Based Image Retrieval System based on Binary Tree, Color, Texture and Canny Edge Detection Approach,” Int. J. Adv. Comput. Sci. Appl., vol. 2, no. 1, pp. 47–51, 2012, doi: 10.14569/specialissue.2012.020106.
[4] Adaway, B. “Industrial applications of image processing,” Comput. Graph. 83, (Online Publ. Pinner), vol. LXIV, no. 1, pp. 555–568, 1983, doi: 10.2478/aucts2014-0004.
[5] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[6] Kaur, M. J. & P. Maheshwari, “Building smart cities applications using IoT and cloud-based architectures,” 2016 Int. Conf. Ind. Informatics Comput. Syst. CIICS 2016, no. March, 2016, doi: 10.1109/ICCSII.2016.7462433.
[7] Anand, R., Sinha, A., Bhardwaj, A., & Sreeraj, A. (2018). Flawed security of social network of things. In Handbook of research on network forensics and analysis techniques (pp. 65-86). IGI Global.
[8] Pandey, D., Pandey, B. K., & Wairya, S. (2021). Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Computing, 25(2), 1563-1580.
[9] Jain, N., Chaudhary, A., Sindhwani, N., & Rana, A. (2021, September). Applications of Wearable devices in IoT. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)(ICRITO) (pp. 1-4). IEEE.
[10] Chawla, P., Juneja, A., Juneja, S., & Anand, R. (2020). Artificial intelligent systems in smart medical healthcare: Current trends. Int. J. Adv. Sci. Technol, 29(10), 1476-1484.
[11] Kohli, L., Saurabh, M., Bhatia, I., Sindhwani, N., & Vijh, M. (2021). Design and development of modular and multifunctional UAV with amphibious landing, processing and surround sense module. Unmanned Aerial Vehicles for Internet of Things (IoT) Concepts, Techniques, and Applications, 207-230.
[12] Khongsai, L., Anal, T. S., AS, R., Shah, M., & Pandey, D. (2021). Combating the spread of COVID-19 through community participation. Global Social Welfare, 8(2), 127-132.
[13] Badshah, G., S. C. Liew, J. M. Zain, & M. Ali, “Watermark Compression in Medical Image Watermarking Using Lempel-Ziv-Welch (LZW) Lossless Compression Technique,” J. Digit. Imaging, vol. 29, no. 2, pp. 216–225, 2016, doi: 10.1007/s10278-015-9822-4.

[14] Anand, R., Sindhwani, N., & Saini, A. (2021). Emerging Technologies for COVID-19. Enabling Healthcare 4.0 for Pandemics: A Roadmap Using AI, Machine Learning, IoT and Cognitive Technologies, 163-188.
[15] Jain, S., Sindhwani, N., Anand, R., & Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham.
[16] Pandey, D., Pandey, B. K., Noibi, T. O., Babu, S., Patra, P. M., Kassaw, C., & Canete, J. J. O. (2020). Covid-19: Unlock 1.0 risk, test, transmission, incubation and infectious periods and reproduction of novel Covid-19 pandemic. Asian Journal of Advances in Medical Science, 23-28.
[17] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[18] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[19] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[20] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[21] Rana, A., Dhiman, Y., & Anand, R. (2022, January). Cough Detection System using TinyML. In 2022 International Conference on Computing, Communication and Power Technology (IC3P) (pp. 119-122). IEEE.
[22] Yang, T., G. H. Zhang, L. Liu, & Y. Q. Zhang, “A survey on authentication protocols for internet of things,” J. Cryptologic Res., vol. 7, no. 1, pp. 87–101, 2020, doi: 10.13868/j.cnki.jcr.000352.
[23] A. Khanna & S. Kaur, Internet of Things (IoT), Applications and Challenges: A Comprehensive Review, vol. 114, no. 2. Springer US, 2020.
[24] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[25] Stojkoska, B. R., K. Trivodaliev, & D. Davcev, “Internet of things framework for home care systems,” Wirel. Commun. Mob. Comput., vol. 2017, 2017, doi: 10.1155/2017/8323646.
[26] Voleti, V., P. Mohan, J. Iqbal, & S. Gupta, “Simple real-time pattern recognition for industrial automation,” ACM Int. Conf. Proceeding Ser., no. December 2017, pp. 107–111, 2017, doi: 10.1145/3178264.3178272.
[27] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[28] de Oliveira Júnior, W. G., J. M. de Oliveira, R. Munoz, & V. H. C. de Albuquerque, “A proposal for Internet of Smart Home Things based on BCI system to aid patients with amyotrophic lateral sclerosis,” Neural Comput. Appl., vol. 32, no. 15, pp. 11007–11017, 2020, doi: 10.1007/s00521-018-3820-7.

[29] Kashif, M., Javed, M. K., & Pandey, D. (2020). A surge in cyber-crime during COVID-19. Indonesian Journal of Social and Environmental Issues (IJSEI), 1(2), 48-52.
[30] Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[31] Zargar, S. A., Islam, T., Rehman, I. U., & Pandey, D. (2021). Use of Cluster Analysis To Monitor Novel Corona Virus (Covid-19) Infections In India. Asian Journal of Advances in Medical Science, 1-7.
[32] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[33] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[34] Gupta, M. & Anand, R. (2011). Image Compression using Set of Selected Bit Planes on Basis of Intensity Variations. Dronacharya Research Journal, 3(1), 35-40.
[35] Gupta, M. & Anand, R. (2011). Image Compression using Set of Selected Bit Planes using Adaptive Quantization Coding. In International Conference on Advanced Computing, Communication and Networks (pp. 457-461).
[36] Madhumathy, P., & Pandey, D. (2022). Deep learning based photo acoustic imaging for non-invasive imaging. Multimedia Tools and Applications, 81(5), 7501-7518.
[37] Sindhwani, N., Rana, A., & Chaudhary, A. (2021, September). Breast Cancer Detection using Machine Learning Algorithms. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)(ICRITO) (pp. 1-5). IEEE.
[38] Jain, S., Kumar, M., Sindhwani, N., & Singh, P. (2021, September). SARS-Cov-2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)(ICRITO) (pp. 1-5). IEEE.
[39] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[40] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
[41] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[42] Kaur, J., Sabharwal, S., Dogra, A., Goyal, B., & Anand, R. (2021, September). Single Image Dehazing with Dark Channel Prior. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)(ICRITO) (pp. 1-5). IEEE.
[43] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[44] Fadhil, S. A. “Internet of Things security threats and key technologies,” J. Discret. Math. Sci. Cryptogr., vol. 24, no. 7, pp. 1951–1957, 2021, doi: 10.1080/09720529.2021.1957189.
[45] Pandey, D., & Pandey, B. K. (2022). An Efficient Deep Neural Network with Adaptive Galactic Swarm Optimization for Complex Image Text Extraction. In Process Mining Techniques for Pattern Recognition (pp. 121-137). CRC Press.
[46] Zaidi, A. and O. Alharbi, “Statistical Analysis of Linear Multi-Step Numerical Treatment”, in J of Statistics Applications & Probability, Vol. 12, no. 1, 2023.
[47] Degerine, S. and A. Zaidi, “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.
[48] Zaidi, A. “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.

Chapter 3

Pattern Analysis for Feature Extraction in Multi-Resolution Images

Ashi Agarwal1, Arpit Saxena2, Digvijay Pandey3,*, Binay Kumar Pandey4, A. Shahul Hameed5, A. Shaji George6 and Sanwta Ram Dogiwal7

1Department of Computer Science, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
2Rajasthan Technical University (RTU), Kota, India
3Department of Technical Education, Institute of Engineering and Technology (IET), Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, India
4Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology Pantnagar, India
5Department of Telecommunication, Consolidated Techniques Co. Ltd, (CTC) Riyadh, Kingdom of Saudi Arabia
6Department of Information and Communication Technology, Crown University, Int’l. Chartered Inc. (CUICI), Santa Cruz, Argentina
7Department of Information Technology, Swami Keshvanand Institute of Technology, Management and Gramothan (SKIT), Jaipur, Rajasthan, India

* Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al. ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.


Abstract

A pattern is a concept that has proven effective in one situation and is likely to be useful in others. A pattern can take many different forms, each with its own set of specializations. Anything can be considered a pattern; it could, for example, be a collection of objects that work together. Analyzing such patterns is necessary for better recognition. Pattern analysis is an area of artificial intelligence and computer science concerned with using algorithms to find patterns in data. Patterns refer to any underlying correlations, regularities, or structure in a data stream. If a system detects significant patterns in existing data, it can make predictions about new information arriving from a similar source. This study discusses and implements pattern analysis based on feature extraction. Features contain the vital information of any image pattern; hence, informative features such as edges must be extracted and combined for any type of pattern recognition and analysis.

Keywords: feature extractions, pattern, pattern analysis, pattern recognition

1. Introduction

Patterns have been one of the most popular subjects in the object community in recent years [1-3]. They are quickly becoming the hottest trend, generating a lot of interest and the usual buzz. Internal disputes over what belongs in the community are also raging, including disagreements about exactly what a pattern is. Patterns come from data, and the word “data” comes from “datum,” meaning a basic unit for measuring or calculating anything [4-6]. Data is everywhere: whatever we manipulate or calculate with is data. Data is any factual information (such as measurements or statistics) that is used to support argument, debate, or calculation. Gathering data on a regular basis in order to look for patterns, such as upward-trending numbers or connections between two sets of numbers, can sometimes reveal those patterns in a basic tabular presentation of the data, depending on the data and the patterns [7-10]. At other times, a chart, such as a time series, line graph, or scatter plot, can help to visualize the data.

• A pattern, for example, could be an object or an event, as shown in Figure 1.

The University of California was the first to investigate patterns in engineering in a systematic manner, establishing an architectural pattern language that is considered a paradigm for patterns in a variety of other domains. A pattern can also be defined as a “morphological law that describes how to build an artifact in order to address a problem in a certain setting”. Pattern analysis is separated into three categories: classification, regression (or prediction), and clustering, each of which attempts to discover patterns in data in a different way [11, 12].

Figure 1. Patterns.

Figure 2. Patterns sample point maps (panels: scattered or random points, cluster points, uniform points).


The three basic pattern types are illustrated by the three point maps in Figure 2. The measured sample values are not significant (they were all set to 1). Each point map was used as input to a Pattern analysis operation, which produced an output table for each input point map. From each output table, two graphs were created: first the Distance column against the Prob1Pnt column, and then the Distance column against the ProbAllPnt column. Figure 3 shows, for each sample, the distance and the probability of finding at least one neighboring point.

Figure 3. Output of pattern sample point map: distances between points and the likelihood of discovering at least one neighbor, for scattered samples, clustered samples, and uniform points.
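The idea behind plotting distance against the probability of finding at least one neighbor can be illustrated with a small estimate: for a given distance, count the fraction of points that have some other point within that distance. The sketch below is an assumption about how such a curve could be computed, not the exact definition behind the Prob1Pnt column. The function name `prob_one_neighbor` and the sample coordinates are hypothetical.

```python
import math

def prob_one_neighbor(points, distance):
    """Fraction of points that have at least one other point within `distance`."""
    hits = 0
    for i, p in enumerate(points):
        if any(i != j and math.dist(p, q) <= distance
               for j, q in enumerate(points)):
            hits += 1
    return hits / len(points)

# Two tight clusters versus a regular grid with spacing 5.
clustered = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
uniform = [(x, y) for x in (0, 5, 10) for y in (0, 5, 10)]

for d in (1, 3, 5):
    print(d, prob_one_neighbor(clustered, d), prob_one_neighbor(uniform, d))
```

For the clustered sample the estimate reaches 1 at a very short distance, whereas for the uniform grid it stays at 0 until the distance reaches the grid spacing. This difference in the shape of the curve is what separates the pattern types.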

In summary, a pattern can be defined as follows. A pattern can be a real object, such as a book or a chair, or an abstract concept, such as a speaking or writing style. It can also be a trait that a group of objects has in common, such as chairs, rectangles, or blue-colored objects. The term is also used for a subset of comparable objects in a bigger collection (a class or a cluster), for the overall similarity structure in a group of objects, and for a single object that is representative of a group of similar objects.

1.1. Pattern Class

Every pattern has a corresponding class, which represents the pattern as it can be found in the real world. A symbolic or numerical property of a real-world object can be used to determine its class [13]. Such a property is also referred to as an “attribute.” The number of properties assigned to each item may vary, although the same features can usually be measured for all items in a given situation. A feature vector, or a set of characteristics, can thus be used to represent an object. The domain of a dataset feature is the set of values that the feature may take. When new objects are added to a dataset, their feature values can be validated against the defined domain.
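Validating feature values against a domain can be sketched as follows. This is a minimal illustration: the attribute names, the domains in `DOMAINS`, and the helper `validate` are all hypothetical and chosen only to show one symbolic and two numerical attributes.

```python
# Hypothetical attribute domains for a dataset of simple objects.
DOMAINS = {
    "color": {"red", "green", "blue"},  # symbolic attribute: allowed values
    "width_cm": (0.0, 500.0),           # numerical attribute: inclusive range
    "height_cm": (0.0, 500.0),
}

def validate(features):
    """Return the names of attributes whose values fall outside their domain."""
    bad = []
    for name, domain in DOMAINS.items():
        value = features.get(name)
        if isinstance(domain, set):
            ok = value in domain
        else:
            lo, hi = domain
            ok = isinstance(value, (int, float)) and lo <= value <= hi
        if not ok:
            bad.append(name)
    return bad

# Each object is represented by the same set of features (a feature vector).
chair = {"color": "blue", "width_cm": 45.0, "height_cm": 90.0}
broken = {"color": "purple", "width_cm": -3.0, "height_cm": 90.0}

print(validate(chair))
print(validate(broken))
```

An empty result means every feature value lies inside its domain. Otherwise the returned names identify the attributes that should be rejected before the object is added to the dataset.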

1.2. Analysis

Analysis is the breaking down of a notion, proposition, linguistic complex, or fact into its simplest or most fundamental elements. Analysis is rarely employed to address the complete body of knowledge in a field; rather, it addresses a specific problem. Analysis entails focusing on what is already known about the problem we are attempting to address [14]. The whole point of the art is to extract from this investigation the many truths that will bring us to the understanding we seek. Analysis shows the true way by which the thing in question was discovered methodically and, as it were, a priori. In scientific terms, analysis is the process of examining something to determine what it is made up of.

1.3. Pattern Analysis

Pattern analysis refers to the process of finding patterns in data, in all of its forms and applications. Depending on the context, it is also called machine learning, pattern recognition, pattern matching, or data mining; the name is often determined by the application domain, the pattern sought, or the professional background of the algorithm designer. Combining these diverse techniques into a single framework makes many correspondences and parallels apparent and allows the range of pattern types and application areas to be extended relatively seamlessly. Early methods were effective for detecting linear relationships, but nonlinear patterns were handled in a less principled manner. Pattern analysis is a broad discipline that studies systems that use machine learning to find patterns in data [14]. Many kinds of patterns are sought, including classification, regression, and cluster analysis (sometimes known together as statistical pattern recognition), feature extraction, and grammatical inference and parsing (sometimes known as syntactical pattern recognition). Pattern analysis is a sub-branch of pattern recognition.


1.3.1. Pattern Recognition

In computer science, pattern recognition is the act of identifying input data, such as voice, images, or a stream of text, by recognizing and outlining the patterns in the data and their relationships. It includes steps such as measuring the item to find distinctive traits, extracting features that define its attributes, and comparing the item with existing patterns to determine a match or mismatch [15]. A pattern recognition algorithm is shown in Figure 4. Different paradigms are used to tackle the pattern recognition problem. The two main paradigms are:

• Statistical pattern recognition
• Syntactic pattern recognition

Statistical pattern recognition has proven to be the more effective and popular of the two and has received considerable attention in the literature. This is because most practical problems in this field involve noisy and uncertain data that call for statistical treatment. In statistical pattern recognition, patterns are represented as vectors and class labels are drawn from a label set. The abstractions usually deal with probability densities or distributions of points in multidimensional spaces, trees and graphs, rules, and the vectors themselves. Because of the vector space representation, it is natural to talk about subspaces and projections and to express the similarity between points in terms of distance measures. This concept is related to a number of soft computing techniques. Soft computing systems tolerate imprecision, uncertainty, and approximation. Neural networks, fuzzy systems, and evolutionary computation are some of the tools that can be used [16-19].

1.4. Problem Definition

The main purpose of pattern analysis is to assess whether an object belongs to a given group. Assuming that items within one group are more similar to each other than to items in other groups, an unlabeled object can be assigned to a group by identifying its attributes and deciding which group those attributes best match. If all prospective items and the groups to which they might be assigned are known, the identification problem is simple, because the optimum features for discriminating between groups and the mapping from attributes to groups can both be identified with certainty. When knowledge of the identification problem is inadequate or incomplete, the features and the mapping must instead be inferred from example objects whose group membership is known.

The functionality of an automated pattern analysis system can be separated into two main tasks, both of which serve to classify objects based on their features: the description task uses feature extraction techniques to generate attributes for an object, and the classification task uses a classifier to assign a group label to the object based on those attributes. The pattern recognition system combines the description and classification tasks to determine the most appropriate label for each unlabeled object it investigates. Automated pattern recognition systems are effective for a wide range of real-world problems because of the generality of the description and classification architecture and the flexibility provided by the training phase. In real-world pattern recognition systems, the objects under investigation are normally data sets of automatically collected attributes that reflect the physical or behavioral objects to be identified.
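The split into a description task and a classification task can be sketched with a nearest-centroid classifier. This is only one possible instance, chosen here for brevity; the two features (mean and spread), the toy signals, and the group names are assumptions made for illustration.

```python
import math

def describe(signal):
    """Description task: map a raw object (a list of numbers) to a feature vector."""
    mean = sum(signal) / len(signal)
    spread = max(signal) - min(signal)
    return (mean, spread)

def train(examples):
    """Training phase: infer one centroid per group from labelled example objects."""
    sums, counts = {}, {}
    for signal, label in examples:
        features = describe(signal)
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()}

def classify(signal, centroids):
    """Classification task: assign the label of the closest group centroid."""
    features = describe(signal)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical labelled examples with known group membership.
examples = [([1, 2, 1, 2], "flat"), ([2, 2, 3, 2], "flat"),
            ([0, 9, 1, 10], "spiky"), ([1, 8, 0, 9], "spiky")]
centroids = train(examples)

print(classify([2, 1, 2, 2], centroids))
print(classify([0, 10, 0, 8], centroids))
```

Swapping `describe` for a different feature extractor, or `classify` for a different classifier, leaves the overall architecture unchanged, which is the generality the text refers to.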

1.5. Pattern Analysis Algorithm

Input: A finite amount of source data to be analyzed.
Output: Positive pattern data set or no pattern detectable.

Figure 4. Pattern recognition algorithm.

Figure 5. Pattern analysis algorithm. [Flow diagram: input data as pattern → preprocessing (with preprocessing parameters) → preprocessed data → feature extraction/feature selection → features → classification → output: pattern/no pattern.]


A pattern analysis algorithm is shown in Figure 5. The main goal of pattern analysis is to determine whether or not an object belongs to a specific group. Assuming that objects from one group have more in common with each other than with objects from other groups, the problem of assigning an unlabeled object to a group can be solved by first finding the attributes of the object in the form of a pattern and then deciding which group those attributes are most representative of (i.e., the recognition). The identification problem is straightforward if the universe of all conceivable objects and the groups to which they can be assigned is known, because the features that best discriminate between groups and the mapping from attributes to groups can then both be identified with certainty. Identifying patterns in a finite collection of data, however, poses a variety of particular problems. We look at three important properties that a pattern analysis algorithm must have in order to be considered effective.

(i) Computational efficiency: Pattern analysis algorithms must be able to handle very large datasets, because we are seeking practical solutions to real-world problems. An algorithm’s performance must therefore scale to large datasets and not only to small toy examples. According to the study of computational complexity, or scalability of algorithms, efficient algorithms have resource requirements that scale polynomially with the size of the input. This means that the number of steps and the memory required by the method can be described as a polynomial function of the dataset size and of other key quantities such as the number of features and the desired precision. Many pattern analysis algorithms fail to meet this seemingly innocuous requirement, and in certain cases there is no guarantee that a solution will be found at all.

(ii) Robustness: The second issue that a successful pattern analysis algorithm must overcome is that data in real-world applications is frequently contaminated by noise. By noise we mean that the feature values of individual data items may be affected by measurement errors or even by miscoding, for example through human error. This is related to the concept of approximate patterns, because even if the underlying relationship is exact, it inevitably becomes approximate and statistical once noise is introduced.

(iii) Statistical stability: The third criterion is arguably the most important: the patterns identified by the algorithm must be true patterns of the data source and not merely an accidental relationship that occurred in the small training sample. This property can be thought of as the statistical robustness of the output, in that the algorithm should detect a similar pattern if it is run again on a new sample from the same source. The algorithm’s output should therefore not be sensitive to the particular dataset, only to the underlying source of the data.

Robustness and statistical stability overlap in that both concern the sensitivity of the pattern function to the sampling process. The difference is that statistical stability assesses a pattern function’s capacity to reliably process unseen examples, whereas robustness examines the effect of sampling on the pattern function itself.

When knowledge of the identification problem is weak or incomplete, the attributes and mappings to use must be inferred from the known group membership of sample objects. Given the purpose of classifying objects based on their features, the functionality of an automated pattern recognition system can be separated into two basic tasks. The description task uses feature extraction techniques to generate attributes for an object, while the classification task uses a classifier to assign the object a group label based on those attributes. The pattern analysis system analyses each unlabeled object and determines the most appropriate label from the outcomes of the description and classification tasks. Researchers use pattern analysis to discover systematic regularities in a much larger data set. These methods frequently employ computer modeling and simulation techniques, as well as data mining, image processing, and network analysis. Various pattern analysis techniques are shown in Figure 6.
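The statistical stability criterion in Section 1.5 can be illustrated by extracting the same kind of pattern from two independent samples of one noisy source and comparing the results. The sketch below is an assumption-laden toy: the data source (y = 2x plus Gaussian noise), the sample size, and the choice of a least-squares slope as the "pattern" are all hypothetical.

```python
import random

def fit_slope(points):
    """Least-squares slope of y on x: the 'pattern' extracted from a sample."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

def draw_sample(rng, n):
    """Hypothetical data source: y = 2x contaminated by Gaussian measurement noise."""
    xs = [rng.random() for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.1)) for x in xs]

rng = random.Random(0)  # fixed seed so the run is repeatable
slope_a = fit_slope(draw_sample(rng, 500))
slope_b = fit_slope(draw_sample(rng, 500))

# A statistically stable algorithm reports nearly the same pattern on both samples.
print(round(slope_a, 3), round(slope_b, 3))
```

If the two slopes differed widely, the reported pattern would say more about the particular sample than about the underlying source.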

1.6. Feature Extraction

Feature extraction is a very important step in pattern analysis; its purpose is to extract features that are good for analysis and classification [20, 21]. Good features (as shown in Figure 7) have the following properties:

• Objects of the same class have similar feature values.
• Objects of different classes have different feature values.
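The two properties of a good feature can be turned into a single number by dividing the separation between classes by the spread within classes. The sketch below uses a Fisher-style score for this; the score is a common choice but is not named in the chapter, and the measurements are invented for illustration.

```python
from statistics import mean, pvariance

def fisher_score(values_a, values_b):
    """Between-class separation divided by within-class spread for one feature."""
    between = (mean(values_a) - mean(values_b)) ** 2
    within = pvariance(values_a) + pvariance(values_b)
    return between / within

# Hypothetical measurements of two candidate features for two object classes.
good_a, good_b = [1.0, 1.1, 0.9, 1.0], [3.0, 3.1, 2.9, 3.0]  # tight and well separated
bad_a, bad_b = [1.0, 3.0, 2.0, 4.0], [1.5, 3.5, 2.5, 3.5]    # overlapping classes

print(fisher_score(good_a, good_b) > fisher_score(bad_a, bad_b))
```

A high score corresponds to a "good" feature in the sense above, and a score near zero to a "bad" one.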


Figure 6. Pattern analysis techniques.
• Feature extraction: spatial features, transform features, edges and boundaries, shape features, moments, textures
• Segmentation: template matching, thresholding, boundary detection, clustering, quad-trees, texture matching
• Classification: clustering, statistical, decision trees, similarity measures, minimum spanning trees

Figure 7. Good and bad features.

Features are categorized as follows:

(i) Spatial features: The grey levels, their joint probability distribution, their spatial distribution, and other spatial characteristics of an object can be used to characterise it.

(ii) Transform features: Transform features provide frequency-domain information about the data and are also a vital element of analysis. They can be extracted by zonal filtering of the image in the chosen transform space.

(iii) Edges and boundaries:
• Edge detection reduces the quantity of data in an image and filters out irrelevant information while preserving the image’s key structural features.
• Edges are the lines that separate regions of different texture.
• Edges can also be described as discontinuities in image intensity from one pixel to the next.
• Edges are among the most important features of an image and correspond to its high-frequency content.
• Edge detection in noisy images is challenging because both the noise and the edges contain high-frequency information, which in turn complicates tasks that depend on edges, such as image segmentation, data reduction, matching, and image reconstruction.

2. Literature Review

Fowler [22] introduced analysis patterns with the help of real-world examples, illustrated as UML diagrams and Java interface declarations. The book responds to a need in the object-oriented community for a text that goes beyond the tools and techniques of a standard methodology book. In Analysis Patterns: Reusable Object Models, Fowler focuses on the models themselves, the end product of object-oriented analysis and design. He shares his extensive object modeling knowledge and his eye for detecting recurring problems and turning them into reusable models. Analysis Patterns is a collection of patterns that have arisen in a variety of domains, such as trading, measurement, accounting, and organisational relationships. Recognizing that conceptual patterns cannot exist in isolation, the author includes a set of “support patterns” that explain how to turn conceptual models into software that can then be integrated into a large-scale information systems architecture. Each pattern includes a rationale for its creation, rules for when and when not to use it, and implementation recommendations. The examples serve as a cookbook of usable models and give insight into the skill of reuse, both of which can improve analysis.

Ding et al. [23] discuss feature extraction in general, covering linear and nonlinear feature extraction as well as frontier methods in the area, and conclude with a discussion of the development trend of feature extraction. More experiments will be required to validate some ideas, while the theory behind some approaches still needs to be refined. Because most systems encountered in practice are nonlinear and time-varying, feature extraction and selection for high-dimensional nonlinear patterns is currently a popular research subject. From the standpoint of method theory, frontier research includes the organic integration of different types of theories, such as bringing information theory, neural networks, and other theories into feature extraction, along with the current manifold learning and independent component analysis methods.

Olszewski [24] discusses the limitations of applying a structural pattern recognition system to a new domain, since a successful implementation requires thorough knowledge of that domain. Two solutions are suggested. The first is a domain-independent approach to structural pattern recognition that can extract morphological characteristics and perform classification without the use of domain knowledge. The second, a natural solution, is a hybrid system that uses a statistical classification technique to accomplish structural feature-based discrimination. The effectiveness of the structure detectors in producing features useful for structural pattern recognition is assessed by comparing the classification accuracy achieved with the structure detectors against that achieved with commonly used statistical feature extractors. The evaluation uses data from two real-world datasets with very different properties and well-established ground truth. The classification accuracies obtained using the features of the structure detectors were consistently as good as or better than those obtained using the features of the statistical feature extractors, demonstrating that the suite of structure detectors effectively performs generalised feature extraction for structural pattern recognition in a variety of situations.

In [25], the author explains the physical mechanisms underlying subaqueous bed load movement and ripple development. The study used direct numerical simulation of horizontal channel flow over a thick bed of mobile sediment particles. The DNS-DEM method yielded results that were fully consistent with the data of the reference experiment. Once the Shields number exceeds its threshold value, the simulations show a cubic variation in particle flow rate (normalised by the square of the Galileo number times viscosity). The formation of patterns was then investigated using a variety of medium- to large-scale simulations in both the laminar and the turbulent regime. To model the phenomenon accurately, large computational domains with up to O(10^9) grid nodes were used, with up to O(10^6) freely moving spherical particles representing the mobile sediment bed. The results of the simulations are quite close to the experimental data in terms of pattern wavelength, amplitude, asymmetric shape, and propagation velocity. To determine the cutoff length for pattern formation, the size of the computational box was carefully tuned to find the smallest box dimension that accommodates an unstable pattern wavelength. To characterise the structure of the driving turbulent flow and its relation to the bed forms, a comprehensive dune-conditioned statistical analysis of the flow field and particle motion was carried out, taking into account the spatial and temporal variability of the sediment bed.

In [26], it is observed that several significant advances in the underlying algorithms and methodologies have led to a surge in practical applications of machine learning. Bayesian methods and graphical models, for example, have each matured into comprehensive frameworks for expressing and applying probabilistic techniques. Approximate inference methods such as variational Bayes and expectation propagation have substantially improved the practical usefulness of Bayesian techniques, while new kernel-based models have had a considerable impact on both algorithms and applications. The textbook includes these latest advances while providing a thorough overview of pattern recognition and machine learning. It is aimed at advanced undergraduates and first-year PhD students, as well as researchers and practitioners. No prior experience with pattern recognition or machine learning is assumed. A working knowledge of multivariate calculus and basic linear algebra is required; some familiarity with probability is helpful but not essential, because the book contains a self-contained introduction to basic probability theory.

Alexander et al. [27] describe what they term a pattern language, which is built from timeless entities known as patterns. As they write on page xxxv of the introduction, “All 253 patterns together form a language.” Each pattern describes a problem and then proposes a solution. The authors hope that in this way ordinary people, and not only experts, will be able to work with their neighbours to improve a town or neighborhood, design a home for themselves, or work with colleagues to design an office, a workshop, or a public building such as a school. The book contains 253 patterns, including Community of 7000 (Pattern 12), which is treated over several pages; on page 71 it asserts, “Individuals have no effective voice in any community of more than 5,000–10,000 people.” It is written in the form of a set of problems with detailed solutions.

Verginadis et al. [28] focus on collaborative work patterns as a way to capture best practices concerning recurring collaboration problems and their solutions among distributed groups. The authors compare relevant academic and commercial activities in the area of patterns that might be used to improve collaboration, and they present the findings of a survey of collaboration-related pattern techniques, models, and languages. They identify two dimensions along which existing work can be categorised. The first dimension contrasts approaches that seek to detect or mine patterns in order to discover deviations from recognised best practices and to suggest corrective steps, mostly to the designer or facilitator, with approaches that seek to actively assist participants. The second dimension distinguishes approaches that rely on manual engagement from those that can give participants automatic support. The authors argue that, given the increasing complexity of collaborative working environments (e.g., Virtual Breeding Environments, Virtual Organizations, and so on), the focus should shift towards automatically assisting participants and developing new tools that can proactively recommend corrective actions in ongoing collaborations. The study states that incorporating ontologies and taxonomies into collaboration patterns provides a technological foundation for recording and reasoning about defined patterns of collaborative activity.

3. Implementation

(i) Data set: The data sets include individual data variables, description variables with references, and, where applicable, dataset arrays comprising the data set and its description.

(ii) Sample images: Some sample images are shown in Figure 8.

Figure 8. Sample images.

(iii) Edge detection: An edge is a curve in an image that follows a path of rapid change in image intensity. Edges are often associated with the boundaries of objects in a scene. Edge detection is a technique for locating the edges in an image. The edge function can be used to locate edges. It searches the image for places where the intensity changes rapidly, using one of these two criteria:

(a) Places where the first derivative of the intensity is greater than a certain threshold
(b) Places where the second derivative of the intensity has a zero crossing

Each of these definitions is implemented by one of the derivative estimators provided by edge. For several of these estimators, we can specify whether the algorithm should be sensitive to horizontal edges, vertical edges, or both. The edge function produces a binary image with 1s where edges are found and 0s everywhere else. The most powerful edge detection method that edge provides is the Canny method. Unlike the other edge detection methods, the Canny method uses two thresholds to discriminate between strong and weak edges, and weak edges are included in the output only if they are connected to strong edges. As a result, this method is less susceptible to noise than the others and is better at detecting true weak edges.

(iv) Edge detectors:
• Sobel
• Prewitt
• Laplacian
• Canny
• Roberts
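The two criteria in item (iii) can be illustrated on a single row of pixel intensities. The sketch below is a one-dimensional simplification written for this purpose and is not the implementation behind the edge function; the function names, the sample row, and the threshold are hypothetical.

```python
def first_derivative_edges(row, threshold):
    """Criterion (a): indices i where |row[i + 1] - row[i]| exceeds the threshold."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) > threshold]

def zero_crossing_edges(row):
    """Criterion (b): pixels just left of a sign change in the second derivative."""
    # d2[k] is the discrete second derivative at pixel k + 1.
    d2 = [row[i - 1] - 2 * row[i] + row[i + 1] for i in range(1, len(row) - 1)]
    return [k + 1 for k in range(len(d2) - 1) if d2[k] * d2[k + 1] < 0]

# Hypothetical intensity profile: a dark region rising smoothly to a bright one.
row = [10, 10, 10, 12, 30, 60, 70, 70, 70]

print(first_derivative_edges(row, threshold=15))  # [3, 4]
print(zero_crossing_edges(row))                   # [4]
```

The first criterion marks every pixel on the steep part of the ramp, whereas the zero crossing pins the edge to a single location, which is why the two families of detectors produce edges of different thickness.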


Figure 9. Sobel edge detection.

3.1. Sobel Edge Detector

This technique detects edges along the horizontal and vertical axes (as shown in Figure 9). The image is convolved in both the horizontal and the vertical direction with a small, integer-valued filter (a 3 × 3 kernel). As a result, this detector requires little computation.
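A minimal sketch of Sobel filtering on a small image stored as nested lists is given below. The two 3 × 3 kernels are the standard Sobel kernels; the border handling (border pixels left at zero), the threshold value, and the test image are assumptions made to keep the example short.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # responds to horizontal edges

def filter3x3(image, kernel):
    """Correlate a 2-D list with a 3x3 kernel; border pixels are left at 0.

    Correlation and convolution differ only in the sign of the response for
    these kernels, so the gradient magnitude is the same either way.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                            for i in range(3) for j in range(3))
    return out

def sobel_edges(image, threshold):
    """Binary edge map: 1 where the gradient magnitude exceeds the threshold."""
    gx, gy = filter3x3(image, SOBEL_X), filter3x3(image, SOBEL_Y)
    return [[1 if (gx[r][c] ** 2 + gy[r][c] ** 2) ** 0.5 > threshold else 0
             for c in range(len(image[0]))] for r in range(len(image))]

# 5x6 test image: dark left half, bright right half (a vertical step edge).
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
result = sobel_edges(image, threshold=10)
for row in result:
    print(row)
```

On this image the interior rows come out as `[0, 0, 1, 1, 0, 0]`: the two columns on either side of the step respond, and the flat regions do not.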

3.2. Prewitt Edge Detector

The Prewitt edge detector (shown in Figure 10) is very similar to the Sobel edge detector. It detects horizontal and vertical edges and their orientations.
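The similarity between the two detectors is visible in their kernels: the standard Prewitt kernel weights all three rows equally, whereas the Sobel kernel doubles the weight of the centre row. The short sketch below compares their responses on one hypothetical 3 × 3 patch containing a vertical step.

```python
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]  # equal row weights
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # centre row weighted twice

def response(patch, kernel):
    """Response of a 3x3 kernel centred on a 3x3 image patch."""
    return sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))

# Hypothetical patch: dark on the left, bright in the right-hand column.
patch = [[0, 0, 9], [0, 0, 9], [0, 0, 9]]

print(response(patch, PREWITT_X))  # 27
print(response(patch, SOBEL_X))    # 36
```

Both kernels respond to the same edge; only the scale of the response and the amount of smoothing along the edge differ.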

3.3. Laplacian Edge Detector

The Laplacian operator (shown in Figure 11) finds the regions where the intensity changes rapidly. The detector combines Gaussian filtering with the Laplacian operator, which makes it well suited to detecting edges.
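The sketch below applies the common 4-neighbour Laplacian kernel and marks zero crossings along each row. It deliberately omits the Gaussian smoothing step described above and checks for sign changes in the horizontal direction only, so it is a simplified illustration rather than a full Laplacian-of-Gaussian detector; the test image is hypothetical.

```python
def laplacian(image):
    """4-neighbour Laplacian (kernel 0 1 0 / 1 -4 1 / 0 1 0); borders left at 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = (image[r - 1][c] + image[r + 1][c]
                         + image[r][c - 1] + image[r][c + 1]
                         - 4 * image[r][c])
    return out

def horizontal_zero_crossings(lap):
    """Mark pixel (r, c) when the Laplacian changes sign between c and c + 1."""
    return [[1 if c + 1 < len(row) and row[c] * row[c + 1] < 0 else 0
             for c in range(len(row))] for row in lap]

# 5x6 test image: dark left half, bright right half (a vertical step edge).
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
lap = laplacian(image)
edges = horizontal_zero_crossings(lap)

print(lap[2])    # [0, 0, 9, -9, 0, 0]
print(edges[2])  # [0, 0, 1, 0, 0, 0]
```

The Laplacian swings from positive to negative across the step, and the zero crossing localises the edge to a single pixel per row.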


Figure 10. Prewitt edge detection.

3.4. Canny Edge Detector

The Canny edge detector (Figure 12) can recognise a broad variety of true edges in images. The detector first removes unwanted noise by smoothing the image, which is necessary because noisy pixels produce false edges. Its signal-to-noise ratio is higher than that of the other methods, which is why this detector is widely used for edge detection in image processing applications.


Figure 11. Laplacian edge detection.

Figure 12. Canny edge detection.

The Canny method locates edges in an image as follows:

• To decrease the influence of noise, the image is first smoothed using an appropriate filter, such as a mean filter or a Gaussian filter.
• The local gradient and edge direction are then computed at each point. An edge point is a point whose strength is locally greatest in the direction of the gradient.
• These edge points give rise to ridges in the gradient magnitude image. The edge detector follows the tops of these ridges and sets all pixels that are not on a ridge top to zero, so that the output is a thin line.
• The ridge pixels are then thresholded using a lower threshold (T1) and an upper threshold (T2). Ridge pixels with values larger than T2 are classified as strong edge pixels, whereas ridge pixels with values between T1 and T2 are classified as weak edge pixels.
• Finally, the edges are linked by incorporating the weak pixels that are connected to strong pixels.
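The last two steps, double thresholding and edge linking, can be sketched as a breadth-first search that starts from the strong pixels and absorbs connected weak pixels. The sketch assumes that smoothing, gradient computation, and ridge thinning have already been done; the function name, the 8-connectivity, and the sample ridge values are assumptions.

```python
from collections import deque

def hysteresis(magnitude, t1, t2):
    """Keep strong pixels (> t2) and weak pixels (>= t1) connected to a strong one."""
    h, w = len(magnitude), len(magnitude[0])
    edges = [[0] * w for _ in range(h)]
    queue = deque()
    # Seed the search with every strong edge pixel.
    for r in range(h):
        for c in range(w):
            if magnitude[r][c] > t2:
                edges[r][c] = 1
                queue.append((r, c))
    # Grow each edge through 8-connected neighbours that are at least weak.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < h and 0 <= nc < w and not edges[nr][nc]
                        and magnitude[nr][nc] >= t1):
                    edges[nr][nc] = 1
                    queue.append((nr, nc))
    return edges

# Hypothetical thinned ridge magnitudes: one strong pixel followed by two weak
# pixels, plus an isolated weak pixel at the right-hand end of the row.
ridges = [
    [0, 0, 0, 0, 0, 0],
    [9, 5, 4, 0, 0, 5],
    [0, 0, 0, 0, 0, 0],
]
linked = hysteresis(ridges, t1=3, t2=8)
print(linked[1])  # [1, 1, 1, 0, 0, 0]
```

The weak pixels attached to the strong pixel survive, while the isolated weak pixel is discarded, which is how the method suppresses noise without losing faint parts of a real edge.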


Table 1 shows the comparison between the different edge detection techniques.

Table 1. Comparison among different edge detection techniques

Edge detector | Method | Advantages | Limitations
Sobel | Gradient based | Simple and easy to calculate; edges and their orientation are detected | Sensitive to noise; edge detection is less reliable
Prewitt | Gradient based | Simple and straightforward computation; edges and their orientation are detected | Sensitive to noise; edge detection is less reliable
Laplacian | Gradient based | Characteristics are constant in every direction; a large area surrounding a pixel can be tested; detecting edges and their orientation is straightforward | Edge magnitude degrades as noise increases; malfunctions at corners, at curves, and where the grey-level intensity function varies
Canny | Gradient based | More accurate; improved signal-to-noise ratio; better detection in the presence of noise | False zero crossings; slow and complex
Roberts | Gradient based | Simple and straightforward computation; edges and their orientation are detected | Sensitive to noise; edge detection is less reliable

Feature extraction can be summarised as follows:

• It is the process of selecting appropriate features for an object’s feature representation.
• It can be applied to unprocessed data such as photographs or time signals.
• There are numerous approaches for extracting features from images or from the data of any other pattern recognition system.

The goal of feature extraction and image analysis is to extract relevant information that can be used to solve problems in applications. The overall results of edge detection on various images are shown in Figures 13, 14 and 15.

Figure 13. Edge detection for image 1.

Figure 14. Edge detection for image 2.


Figure 15. Edge detection for image 3.

In each figure, the edges of the image are identified by five different edge detectors, namely Canny, Prewitt, Zerocross, Roberts and Bousta. The results show that the Canny edge detector provides the best output.

Conclusion

Pattern analysis has turned out to be a very important step in feature extraction. To extract features such as the edges of an image, the image must first be analysed thoroughly, which leads to better implementation and better understanding. Pattern analysis frequently requires a pre-processing stage in which features are extracted or selected to aid classification, prediction, or clustering, or to obtain a better representation of the data [29-32]. The explanation is simple: raw data are complex and difficult to work with unless appropriate features have been extracted or selected beforehand. To develop an effective classifier, feature extraction identifies the set of attributes that is most useful for classification. Preprocessing, feature extraction, and classification are the three essential components of pattern analysis. After the dataset has been acquired, it is preprocessed to make it suitable for the subsequent steps. Feature extraction is the next phase, in which the dataset is transformed into a set of feature vectors designed to represent the original data. These features are then used in the classification step to divide the data points into different classes.


References

[1] Pandey, B. K., Pandey, D., Wairya, S., and Agarwal, G. (2021). An advanced morphological component analysis, steganography, and deep learning-based system to transmit secure textual data. International Journal of Distributed Artificial Intelligence (IJDAI), 13(2), 40-62.
[2] Vyas, G., Anand, R., and Holȇ, K. E. Implementation of Advanced Image Compression using Wavelet Transform and SPHIT Algorithm. International Journal of Electronic and Electrical Engineering. ISSN 0974-2174.
[3] Kohli, L., Saurabh, M., Bhatia, I., Shekhawat, U. S., Vijh, M., and Sindhwani, N. (2021). Design and Development of Modular and Multifunctional UAV with Amphibious Landing Module. In Data Driven Approach Towards Disruptive Technologies (pp. 405-421). Springer, Singapore.
[4] Pramanik, S., Ghosh, R., Pandey, D., Samanta, D., Dutta, S., and Dutta, S. (2021). Techniques of Steganography and Cryptography in Digital Transformation. In Emerging Challenges, Solutions, and Best Practices for Digital Enterprise Transformation (pp. 24-44). IGI Global.
[5] Saroha, R., Singh, N., and Anand, R. Echo Cancellation by Adaptive Combination of Normalized Sub Band Adaptive Filters. International Journal of Electronics & Electrical Engineering, 2(9), 1-10.
[6] Kohli, L., Saurabh, M., Bhatia, I., Sindhwani, N., and Vijh, M. (2021). Design and development of modular and multifunctional UAV with amphibious landing, processing and surround sense module. Unmanned Aerial Vehicles for Internet of Things (IoT) Concepts, Techniques, and Applications, 207-230.
[7] Pandey, B. K., Pandey, D., and Agarwal, A. (2022). Encrypted Information Transmission by Enhanced Steganography and Image Transformation. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-14.
[8] Saini, M. K., Nagal, R., Tripathi, S., Sindhwani, N., and Rudra, A. (2008). PC Interfaced Wireless Robotic Moving Arm. In AICTE Sponsored National Seminar on Emerging Trends in Software Engineering (Vol. 50).
[9] Saroha, R., Malik, S., and Anand, R. (2012). Echo Cancellation by Adaptive Combination of NSAFs Adapted by Stocastic Gradient Method. International Journal of Computational Engineering Research, 2(4), 1079-1083.
[10] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., and Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.
[11] Jain, S., Sindhwani, N., Anand, R., and Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham.
[12] Madhumathy, P., and Pandey, D. (2022). Deep learning based photo acoustic imaging for non-invasive imaging. Multimedia Tools and Applications, 81(5), 7501-7518.

Copyright of this book belongs to Nova Science.

[13] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[14] Albermany, S. A., Hamade, F. R., and Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[15] Sayel, N. A., Albermany, S., and Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[16] Malik, S., Saroha, R., and Anand, R. (2012). A Simple Algorithm for reduction of Blocking Artifacts using SAWS Technique based on Fuzzy Logic. International Journal Of Computational Engineering Research, 2(4), 1097-1101.
[17] Pandey, D., Nassa, V. K., Jhamb, A., Mahto, D., Pandey, B. K., George, A. H., and Bandyopadhyay, S. K. (2021). An integration of keyless encryption, steganography, and artificial intelligence for the secure transmission of stego images. In Multidisciplinary Approach to Modern Digital Steganography (pp. 211-234). IGI Global.
[18] Singh, S. K., Thakur, R. K., Kumar, S., and Anand, R. (2022, March). Deep Learning and Machine Learning based Facial Emotion Detection using CNN. In 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom) (pp. 530-535). IEEE.
[19] Sindhwani, N., Rana, A., and Chaudhary, A. (2021, September). Breast Cancer Detection using Machine Learning Algorithms. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.
[20] Shukla, R., Dubey, G., Malik, P., Sindhwani, N., Anand, R., Dahiya, A., and Yadav, V. (2021). Detecting crop health using machine learning techniques in smart agriculture system. Journal of Scientific and Industrial Research (JSIR), 80(08), 699-706.
[21] Pandey, B. K., Pandey, D., Wairya, S., Agarwal, G., Dadeech, P., Dogiwal, S. R., and Pramanik, S. (2022). Application of Integrated Steganography and Image Compressing Techniques for Confidential Information Transmission. Cyber Security and Network Security, 169-191.
[22] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., and Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[23] Sayal, M. A., Alameady, M. H., and Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[24] Hussein, R. I., Hussain, Z. M., and Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[25] Sharma, M., Sharma, B., Gupta, A. K., and Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.

[26] Albermany, S., and Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[27] Pandey, D., and Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[28] Albermany, S. A., Hamade, F. R., and Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[29] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., and Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[30] Albermany, S. A., and Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
[31] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., and Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[32] Albermany, S., Ali, H. A., and Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).


Chapter 4

The Design of Microstrip Patch Antenna for 2.4 GHz IoT Based RFID and Image Identification for Smart Vehicle Registry

Manvinder Sharma1, Harjinder Singh2, Digvijay Pandey3,*, Binay Kumar Pandey4, A. Shaji George5 and Pankaj Dadheech6

1 Department of Electronics & Communications Engineering, Malla Reddy Engineering College and Management Sciences, Hyderabad, India
2 Department of Electronics Communication Engineering (ECE), Punjabi University, Patiala, Punjab, India
3 Department of Technical Education, Institute of Engineering and Technology (IET), Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, India
4 Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, India
5 Department of Information and Communication Technology, Crown University, Int’l. Chartered Inc. (CUICI), Santa Cruz, Argentina
6 Department of Computer Science and Engineering (NBA Accredited), Swami Keshvanand Institute of Technology, Management and Gramothan (SKIT), Jaipur, Rajasthan, India

* Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al. ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.


Abstract

In the contemporary technological era, numerous technologies are making everything more automated. There will always be a need for a Smart Vehicle Registry with optimal performance and no human intervention. Radio Frequency Identification (RFID) is widely employed in many areas, including manufacturing, distribution, and healthcare. Because RFID is extensively employed in the public sector, these systems require low-cost, low-profile antennas. RFID is regarded as the next generation of tracking and data collection technology, and RFID systems have been used in many healthcare facilities. This study presents the design of an inset-feed microstrip patch antenna for 2.4 GHz. The antenna has been designed and tested for RFID applications. The computed return loss (S11) is -20.5 dB, the calculated directivity is 8.3 dB, and the front-to-back ratio is 11 dB.

Keywords: Microstrip Patch Antenna, RFID, finite element method, EHF or VHF, GSM, return loss

1. Introduction

As the population grows, so do automobile sales, with many people wanting to own more than one vehicle. A Smart Vehicle Registry is an automated system that regulates the entry and exit of vehicles from a zone. The combination of RFID (Radio Frequency Identification) technology and image processing enables the management of a smart vehicle registry system [1-5]. RFID can gather data, automate tracking, monitor the behavior of an identity (item or person), and interface with application software. RFID technologies capture and send data without the need for human involvement. RFID systems are viewed as a disruptive innovation in communication and healthcare because they can bring enhanced safety, better monitoring, and cost savings [2, 3]. RFID tags can be embedded in assets, inventories, persons, or patients.

An antenna connects a channel to a communication system. Microstrip patch antennas are among the most widely used antennas due to their simplicity of design, low cost, ease of manufacture, and design diversity in terms of implementation [6-9]. Depending on the application, they can give effective directional patterns. RFID [10] is based on radio communication and has two major components. The first is the RFID transceiver and the second is the transponder; these are also known as the reader and the tag, respectively. A simple wireless connection is established between the reader and the transponder. The RFID reader's job is to read the recorded information, which is subsequently sent over an interface that can be wired or wireless [11-13]. The RFID transponder's job is to hold the data for the item it tags so that the data can be sent to the centralised database system. RFID has driven a significant shift in the field of wireless communication in recent years [14]. RFID systems have a wide range of applications, including security control, product identification, supply automation, and many more [15]. RFID-based systems are efficient, consume less time and labour, and make tracking easy and precise. They give full visibility of correct data and leave a lot of room for improvement in the stages of the whole supply chain [16].

Figure 1 depicts the three major components of an RFID system: a reader, RFID tags, and an application system. The reader is a transceiver built around a radio frequency interface module; its principal functions are to activate tags, transfer data between tags and application software, and arrange the tag communication sequence. RFID tags are transponders, which can be passive or active. Active tags have a battery and can interact with another tag as well as begin communication with the reader using their own power. Passive tags, on the other hand, do not include a battery. When a passive tag is brought near a reader, it is energized by the reader's field and communicates. A tag comprises a microchip and a coiled antenna. The application system, also known as the data processing system, is the third component of an RFID system. The application programme gives the reader commands. RFID systems provide a versatile, rapid, and dependable option for electronically detecting, tracking, controlling, and monitoring a wide range of products.

Figure 1. RFID components.


2. Smart Vehicle Registry

A Smart Vehicle Registry is an automated system that manages a vehicle's entrance to and exit from a zone. It is made possible by combining RFID (Radio Frequency Identification) technology and image processing. The approach places a gate or toll-like checkpoint at the entry and exit of a zone or restricted territory. When a person enters or leaves the zone, their RFID tag is read [17-21]. If the individual is approved, he or she may enter or leave the zone. Concurrently, the registration plate of the car in which the individual enters or leaves is scanned using optical character recognition (OCR), an image processing technique. When both steps are completed, the gate either opens or remains closed, based on the authorization. The system is designed so that analysis is simple, fraudulent entries can be discovered, and the required steps can be taken. The recorded data can also be evaluated to discover useful information and trends that aid in the development of the system.
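The gate decision described above can be sketched in a few lines. This is a minimal illustration, not the chapter's implementation: the names `GateEvent` and `decide_gate`, the tag IDs, and the plate strings are all hypothetical, and the sketch assumes the RFID reader and the OCR stage have already produced their outputs.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class GateEvent:
    """One vehicle arriving at the gate (hypothetical record layout)."""
    tag_id: str                # ID returned by the RFID reader
    plate_text: Optional[str]  # plate string returned by OCR, None if unreadable


def decide_gate(event: GateEvent, authorized_tags: Set[str]) -> dict:
    """Return the gate action and the record to be logged for one vehicle."""
    tag_ok = event.tag_id in authorized_tags
    plate_ok = event.plate_text is not None and event.plate_text.strip() != ""
    # The gate opens only when the tag is authorized AND the plate was recognized.
    gate_open = tag_ok and plate_ok
    return {
        "tag_id": event.tag_id,
        "plate": event.plate_text if plate_ok else None,
        "authorized": tag_ok,
        "gate": "open" if gate_open else "closed",
    }


# Illustrative data only: two authorized tags and three arrivals.
authorized = {"TAG-001", "TAG-002"}
log = [
    decide_gate(GateEvent("TAG-001", "AB12CD3456"), authorized),
    decide_gate(GateEvent("TAG-999", "XY98ZT7654"), authorized),
    decide_gate(GateEvent("TAG-002", None), authorized),
]
for record in log:
    print(record)
```

Keeping every decision as a record, including refused entries, is what makes the later analysis of fraudulent entries and traffic trends possible.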

Figure 2. Process flow of smart vehicle registry with RFID and image processing.


Figure 2 shows the process flow of the smart vehicle registry, a system that records a vehicle's entry to and exit from a certain zone or location. The system's initial step is to validate the driver's RFID tag using an RFID reader. If access is granted, the driver's personal information is shown on the front-end screen in a responsive manner. The second stage, scanning the vehicle's license plate, is carried out concurrently with the first. The license plate is scanned, identified, and digitally saved during this procedure, and the result is then passed to the front end. If access is approved and the plate recognition is complete, the gate opens and the car is permitted to pass through. If access is refused, data does not flow into the front end, and the gate remains closed.

The algorithm detects license plates and recognizes characters based on a combined feature extraction model and a back-propagation neural network (BPNN). This method is robust in both low-light conditions and complex backgrounds. As in the traditional method, the first stage is to preprocess the image by increasing contrast and filtering. The integral projection method is then used to localize the license plate. Finally, to identify the license plate characters correctly, feature extraction is performed and the resulting feature vectors are used to train the BPNN. This can be built with three sets of feature combinations.
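The integral projection step can be illustrated with a toy example. The sketch below is an assumption-laden simplification rather than the chapter's algorithm: it takes an already binarised edge map (1 for edge pixels), sums it along rows and then along columns, and keeps the longest run of each profile that exceeds half of the profile maximum. The function names and the 0.5 threshold ratio are hypothetical choices; preprocessing and the BPNN stage are not shown.

```python
from typing import List, Optional, Tuple


def row_projection(image: List[List[int]]) -> List[int]:
    """Horizontal integral projection: one sum per image row."""
    return [sum(row) for row in image]


def column_projection(image: List[List[int]]) -> List[int]:
    """Vertical integral projection: one sum per image column."""
    return [sum(col) for col in zip(*image)]


def longest_band(profile: List[int], ratio: float = 0.5) -> Optional[Tuple[int, int]]:
    """Return (start, end), inclusive, of the longest run above ratio * max(profile)."""
    if not profile:
        return None
    threshold = ratio * max(profile)
    best = None
    start = None
    for i, value in enumerate(list(profile) + [0]):  # trailing 0 closes an open run
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best


def locate_plate(edge_map: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, bottom, left, right) of the densest band, or None if nothing is found."""
    rows = longest_band(row_projection(edge_map))
    if rows is None:
        return None
    top, bottom = rows
    cols = longest_band(column_projection(edge_map[top:bottom + 1]))
    if cols is None:
        return None
    left, right = cols
    return top, bottom, left, right


# Toy 6 x 10 edge map: a dense block in rows 2-3, columns 3-7, plus one noise pixel.
edge_map = [[0] * 10 for _ in range(6)]
for r in (2, 3):
    for c in range(3, 8):
        edge_map[r][c] = 1
edge_map[0][0] = 1

print(locate_plate(edge_map))
```

On a real photograph the profiles would be smoothed first and the band would be checked against the expected plate aspect ratio; those refinements are left out to keep the projection idea visible.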

3. Background

When RFID-based systems are connected with the wireless domain, there is a chance to improve RFID-based applications. Their transponders or tags may communicate with external sensors for quantities such as light, temperature, and shock. Similarly, their readers or transceivers transmit and receive at the same frequency, as in wireless communication. To attain a long range for the RFID reader or transponder, high-gain directional antennas are used at the maximum power allowed. For successful detection and decoding of weak signals, the two channels, the transmitter and the receiver, must be isolated [22-24]. Figure 3 shows an RFID system with tags and a tag reader. RFID applications operate in the 2.4 to 5.8 GHz microwave range, the 860 MHz to 960 MHz ultra high frequency (UHF) band, the 13.56 MHz high frequency (HF) band, and the 120-150 kHz low frequency (LF) band. For RFID applications, the microwave band provides a greater operating range.

The Internet of Things (IoT) is a worldwide infrastructure that collects and communicates data from humans and objects. It is capable of sensing, object detection, identification, and communication, and it has a high level of network connectivity, data collection, and interoperability. The general architecture of an IoT system is separated into three layers: the perception layer, the network layer, and the application layer (service layer).

Figure 3. RFID system.

The essential layer of IoT is the perception layer. It is essentially the information origin layer. It perceives and gathers data from the real physical environment via the wireless sensor network, RFID tags, sensors, GPS modules, cameras, electronic data interface (EDI), objects, and so on. The network layer, also known as the transport layer, connects to the network and transmits data between associated devices. It receives data from the perception layer and transmits it via radio access networks, existing mobile communication networks, or other communication technologies such as Bluetooth, GSM, wireless fidelity (Wi-Fi), worldwide interoperability for microwave access (WiMAX), and Ethernet, among others. The application layer is also known as the service layer. It is further subdivided into application service and data management sub-layers. The application sub-layer converts data into content and provides a user interface (UI) for end users and upper-level enterprise applications such as disaster warning, medicine intake, health concerns, and so on, whereas the data management sub-layer extracts the information that is actually needed from sensor data, converting complicated data into meaningful information.

There are several antenna categories available, and their selection depends on the amount of power they transmit and on their beam width. Several approaches, such as chip and slot loading, teardrop dipoles, patch stacking, feed modification, and so on, have been employed to achieve wide bandwidth while minimizing antenna size [25-27]. Many frequency bands, including 125 kHz, 13.56 MHz, 869 MHz, 902-908 MHz, 2.4 GHz, and 5.8 GHz, have been assigned to RFID-based applications [28]. According to the ISO/IEC 18000 standard, the operating frequency range of the 2.4 GHz RFID band is 2400-2483.5 MHz, a bandwidth of 83.5 MHz, which is limited when compared to the UHF band of RFID, which is 860-960 MHz. Because the 2.4 GHz frequency is higher than the UHF RFID band, the tag antenna is smaller. As a result, the 2.4 GHz RFID system is mainly used for smaller items and systems.

Figure 4 depicts a microstrip patch antenna, which is constructed from a conducting ground plane with a dielectric substrate laid over it and a conductive patch on top. A microstrip patch antenna is a printed antenna. The antenna is useful for RFID applications due to its radiation control, ease of integration, and low-cost production and fabrication [29, 30]. In most cases the patch is a conductor such as copper or another conducting metal. The patch can take any form, such as round, rectangular, or oval. The shape of the patch conductor determines the resonance. Patch antenna input can be delivered in a variety of ways, including connected feed, probe or coaxial cable, aperture feed, and inset feed [31-34]. The patch antenna has the disadvantage of a limited bandwidth; however, for the specific use of RFID this disadvantage becomes a benefit, since the antenna only responds to signals that are inside the band and rejects signals that are out of band. This also improves the design's quality factor (Q). Figure 5 depicts the inset feeding approach. Like an edge feed, the inset feed keeps the structure planar, and it has the advantages of simplicity, ease of production, and improved impedance matching. Because of these benefits, patch antennas are an excellent choice for applications such as UWB, LTE, and WLAN [35-39].
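The comparison between the UHF and 2.4 GHz RFID bands made earlier in this section can be checked with a short calculation. The sketch below uses only the band edges quoted in the text; the helper names `wavelength_m` and `fractional_bandwidth` are illustrative, and free-space propagation is assumed for the wavelength.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength for a given frequency."""
    return C0 / freq_hz


def fractional_bandwidth(f_low_hz: float, f_high_hz: float) -> float:
    """Bandwidth divided by the centre frequency of the band."""
    centre = (f_low_hz + f_high_hz) / 2
    return (f_high_hz - f_low_hz) / centre


# Band edges as quoted in the text.
bands = {
    "UHF RFID (860-960 MHz)": (860e6, 960e6),
    "2.4 GHz RFID (2400-2483.5 MHz)": (2400e6, 2483.5e6),
}

for name, (f_low, f_high) in bands.items():
    centre = (f_low + f_high) / 2
    print(
        f"{name}: bandwidth {(f_high - f_low) / 1e6:.1f} MHz, "
        f"fractional bandwidth {100 * fractional_bandwidth(f_low, f_high):.1f} %, "
        f"free-space wavelength {100 * wavelength_m(centre):.1f} cm"
    )
```

The 2.4 GHz band comes out at roughly 3.4 % fractional bandwidth against about 11 % for the UHF band, and its free-space wavelength is well under half that of the UHF band, which is why a resonant tag antenna at 2.4 GHz can be physically smaller.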

Figure 4. Structure of microstrip patch antenna.

Figure 5. Inset-fed microstrip antenna.

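4. Design Parameters

4.1. Design Equation of Inset Feed Microstrip Patch Antenna

Because the current along the patch has a sinusoidal distribution, moving a distance R in from the end changes the current by the factor [40]

$$I = \cos\left(\frac{\pi R}{L}\right) \qquad (1)$$

Since the wavelength is 2L, the phase difference can be given as

$$\phi = \frac{2\pi R}{2L} = \frac{\pi R}{L} \qquad (2)$$

As the current increases, the voltage decreases in magnitude by the same factor. Using Z = V/I, the input impedance can be written as

$$Z_{m}(R) = \cos^{2}\left(\frac{\pi R}{L}\right) Z_{m}(0) \qquad (3)$$

4.2. Design of Microstrip

The length of the microstrip patch antenna can be calculated with [41]

$$L = \frac{v_0}{2 f_r \sqrt{\varepsilon_{reff}}} - 2\Delta L \qquad (4)$$

where $\varepsilon_{reff}$ and $\Delta L$ are given as

$$\varepsilon_{reff} = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2}\left[1 + 12\,\frac{h}{W}\right]^{-1/2} \qquad (5)$$

$$\Delta L = 0.412\,h\,\frac{\left(\varepsilon_{reff} + 0.3\right)\left(\frac{W}{h} + 0.264\right)}{\left(\varepsilon_{reff} - 0.258\right)\left(\frac{W}{h} + 0.8\right)} \qquad (6)$$

The width is calculated with [23]

$$W = \frac{v_0}{2 f_r}\sqrt{\frac{2}{\varepsilon_r + 1}} \qquad (7)$$

Here $v_0$ is the free-space velocity of light, $f_r$ is the resonant frequency, $\varepsilon_r$ is the relative permittivity of the substrate, and $h$ is the substrate thickness.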
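Equations (3) to (7) can be evaluated directly, as in the sketch below. Several inputs are assumptions made only for illustration: the chapter does not state the substrate permittivity, so a value of 4.4 (typical of FR-4) is used, and the 200 ohm edge impedance in the inset calculation is hypothetical. The substrate thickness of 0.1524 cm is taken from Table 1. The printed dimensions are therefore not expected to reproduce the values in Table 1.

```python
import math

C0 = 299_792_458.0  # free-space velocity of light v0, m/s


def patch_width(f_r: float, eps_r: float) -> float:
    """Equation (7): patch width W in metres."""
    return C0 / (2 * f_r) * math.sqrt(2 / (eps_r + 1))


def effective_permittivity(eps_r: float, h: float, w: float) -> float:
    """Equation (5): effective relative permittivity."""
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 * h / w) ** -0.5


def length_extension(eps_reff: float, h: float, w: float) -> float:
    """Equation (6): fringing-field length extension, in metres."""
    ratio = w / h
    return 0.412 * h * ((eps_reff + 0.3) * (ratio + 0.264)) / (
        (eps_reff - 0.258) * (ratio + 0.8)
    )


def patch_length(f_r: float, eps_reff: float, delta_l: float) -> float:
    """Equation (4): physical patch length L in metres."""
    return C0 / (2 * f_r * math.sqrt(eps_reff)) - 2 * delta_l


def inset_depth(length: float, z_edge: float, z_target: float) -> float:
    """Equation (3) solved for R: inset distance giving z_target from z_edge."""
    return length / math.pi * math.acos(math.sqrt(z_target / z_edge))


f_r = 2.4e9      # resonant frequency from the chapter, Hz
h = 1.524e-3     # substrate thickness from Table 1, m
eps_r = 4.4      # ASSUMPTION: not given in the chapter (typical FR-4 value)
z_edge = 200.0   # ASSUMPTION: hypothetical edge impedance, ohm

w = patch_width(f_r, eps_r)
eps_reff = effective_permittivity(eps_r, h, w)
delta_l = length_extension(eps_reff, h, w)
length = patch_length(f_r, eps_reff, delta_l)
r_inset = inset_depth(length, z_edge, 50.0)

print(f"W = {w * 1e3:.2f} mm, eps_reff = {eps_reff:.3f}")
print(f"delta L = {delta_l * 1e3:.3f} mm, L = {length * 1e3:.2f} mm")
print(f"inset depth for 50 ohm = {r_inset * 1e3:.2f} mm")
```

The order of evaluation matters: W is needed first because both the effective permittivity and the length extension depend on the ratio W/h. With a 200 ohm edge impedance, the cos-squared law puts the 50 ohm point exactly one third of the patch length in from the edge.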

5. Modeling and Analysis

The model is created in the COMSOL Multiphysics environment. Figure 6 depicts the microstrip patch antenna design specifications.

Figure 6. Design parameters of patch antenna.

Figure 7 shows the design structure of the microstrip patch antenna.

Figure 7. Structure of microstrip patch antenna.

The model is designed and analyzed using an electromagnetic frequency domain solver. The design characteristics of the proposed antenna are shown in Table 1. The modeled design uses a frequency of 2.4 GHz (RFID application). The wave equation (8) is used to analyze the structure.

Table 1. Model design description

Description            Value
Patch length           5.2 cm
Substrate thickness    0.1524 cm
50 ohm line width      0.32 cm
Patch width            5.3 cm
Tuning stub width      7 cm
Tuning stub length     1.9 cm
Substrate width        10 cm
Substrate length       10 cm
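For orientation, frequency-domain electromagnetic solvers of this kind typically solve the vector wave equation for the electric field in the form sketched below. This is the standard textbook form and is given here as an assumption about what the solver evaluates, not as a formula taken from the chapter. E is the electric field, μ_r and ε_r are the relative permeability and permittivity, σ is the conductivity, ω is the angular frequency, ε_0 is the permittivity of free space, and k_0 is the free-space wavenumber.

```latex
\nabla \times \left( \mu_r^{-1}\, \nabla \times \mathbf{E} \right)
  - k_0^{2} \left( \varepsilon_r - \frac{j\sigma}{\omega \varepsilon_0} \right) \mathbf{E} = \mathbf{0},
\qquad k_0 = \omega \sqrt{\varepsilon_0 \mu_0}
```

In a lossless dielectric substrate σ = 0 and the second term reduces to k_0² ε_r E, so the substrate permittivity directly sets the wavelength inside the patch cavity.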

The proposed antenna is meshed using a tetrahedral mesh. Figure 8 depicts the design meshing. During meshing, the structure is divided into 4860 prisms, 59822 tetrahedral elements, 8796 triangles, 676 edge elements, 500 quads, and 36 vertex elements. With a curvature factor of 0.6, the maximum element size is 23.060. The model was simulated on 4 x 2.60 GHz processors, with 4.8 GB of physical memory utilized during simulation. Figure 9 depicts the distribution of the electric field over the proposed antenna structure.

Figure 8. Meshing of design.

Figure 9. Electric field distribution for 2.4 GHz.

6. Results and Discussion

Figure 10 depicts the 2D radiation pattern. The radiation pattern shows that the antenna has a directional beam, because the ground plane blocks radiation from the back side and allows radiation only from the front side. The proposed antenna's estimated directivity is 8.3 dB, with a front-to-back ratio of more than 11 dB. The S11 value is determined to be -20.5 dB, which is significantly better than the intended -10 dB. Figure 11 depicts the three-dimensional radiation pattern.
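The decibel figures reported above can be converted to linear quantities to see what they mean in practice. The sketch below uses only the values quoted in the text and the standard definitions of reflection coefficient and VSWR; the function names are illustrative.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a decibel value to a linear power ratio."""
    return 10 ** (db / 10)


def reflection_magnitude(s11_db: float) -> float:
    """Magnitude of the reflection coefficient from S11 in dB."""
    return 10 ** (s11_db / 20)


def vswr(s11_db: float) -> float:
    """Voltage standing wave ratio corresponding to S11 in dB."""
    gamma = reflection_magnitude(s11_db)
    return (1 + gamma) / (1 - gamma)


s11_db = -20.5          # simulated S11 from the text
target_db = -10.0       # intended S11 from the text
directivity_db = 8.3    # simulated directivity from the text
front_to_back_db = 11.0 # simulated front-to-back ratio from the text

print(f"Reflected power at S11 = {s11_db} dB: {100 * db_to_power_ratio(s11_db):.2f} %")
print(f"Reflected power at S11 = {target_db} dB: {100 * db_to_power_ratio(target_db):.2f} %")
print(f"VSWR at S11 = {s11_db} dB: {vswr(s11_db):.3f}")
print(f"Directivity as a linear ratio: {db_to_power_ratio(directivity_db):.2f}")
print(f"Front-to-back power ratio: {db_to_power_ratio(front_to_back_db):.2f}")
```

An S11 of -20.5 dB corresponds to under 1 % of the incident power being reflected at the feed, compared with 10 % at the -10 dB target, which is the sense in which the simulated match is significantly better than required.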

Figure 10. 2-D Far Field radiation plot.

Figure 11. 3D Far Field radiation plot.


7. Other Applications

RFID systems may be utilised in healthcare applications such as patient tracking and history records, device or equipment tracking, asset tagging and the check-in/check-out procedure, and pharmaceutical and drug tracking. RFID tags can be used to authenticate patients during procedures such as blood sampling and medicine delivery to prevent misidentification. RFID systems may be used with real-time, continuous, automatic hand hygiene recorders to measure hand hygiene habits and compliance and so help avoid hospital-acquired illnesses. The RFID technology recorded hand hygiene with 92 percent accuracy [42]. For patient tracking and history records, an RFID band is tied around the patient's wrist, and it carries information such as name, allergies, medical record, and so on. The hospital continues to follow the patient using this RFID-based bracelet [43]. RFID tags can be used to track assets and medications in hospitals. RFID systems, which have surpassed barcode technology in popularity, can also be used for inventory management. By scanning numerous tags at once, RFID improves the overall efficiency of the system. Also, RFID tags can hold more data than barcodes [44]. RFID also provides efficient, rapid and accurate data tracking among healthcare supply chain members. RFID can be programmed to transmit specific information such as dressing changes, medication intake, patient transfers and patient checklists [45]. RFID monitoring data can provide a rapid warning notice and can also trigger an alarm if there is a sudden change in the patient's health, such as a change in heart rate, temperature, or breathing rate. RFID technologies increase efficiency and improve patient care. RFID systems are also beneficial to patients' families and others connected to them, since they can offer real-time information on the patient's position. Families are kept up to date, and healthcare service quality improves. RFIDs have the benefits of user friendliness, operability, recognition accuracy, medical error avoidance, and workflow efficiency.

RFID technology makes it simple to control a large and time-consuming manufacturing process. It provides all of the advantages of small-batch procedures and manufacturing. This sort of process aids in improved analysis, the reduction and elimination of bottlenecks, the reduction of time spent identifying components and products, and the installation of production process-based sensors to flag any irregularities. RFID integration with the aerospace sector and the Department of Defense has a number of potential benefits. According to the US Federal Aviation Administration, Boeing and Airbus make it obligatory to include a suitable tracking device to trace aeroplane parts.

Near field communication (NFC) technology, a subset of HF RFID, links to the Internet of Things (IoT) and may be utilised for more complicated, secure interactions [46-53]. An NFC tag and an item may communicate in both directions, making commonplace objects smarter and more trustworthy. NFC is confined to high-frequency, close-proximity communication, and only one NFC tag may be read at a time. NFC is extremely simple to use, and because most mobile devices are now NFC equipped, using the technology often does not require any extra infrastructure.

The use of RFID tags on livestock allows farmers to update, identify, and monitor their animals more easily. Manually updating a significant volume of data, especially at a remote location, is a difficult process. Weight, age, vaccination data, and other information may be quickly retrieved using a portable reader. Veterinary experts will be able to get pet information simply by scanning the tag, without going through records. Timing races and marathons is one of the most popular applications of RFID technology, yet many race participants are unaware that they are being timed using RFID, a tribute to its ability to create a seamless customer experience.

Conclusion and Future Outlook

Because of its compact profile, small size, low cost, ease of production, and excellent directional properties, the microstrip patch antenna (MPA) is frequently employed in wireless communication. Many frequency bands, including 125 kHz, 13.56 MHz, 869 MHz, 902-908 MHz, 2.4 GHz, and 5.8 GHz, have been assigned to RFID-based applications. We used the 2.4 GHz band for the design. An inset-feed microstrip patch antenna with a 2.4 GHz resonant frequency, which may also be utilized for healthcare applications, was analyzed and investigated through modeling and simulation. For the intended 2.4 GHz RFID band, quantities such as electric field intensity, directivity, front-to-back ratio, S11, and the far-field radiation plot were computed. The developed antenna has a directivity of 8.3 dB and a front-to-back ratio of 11 dB. The computed S11 is -20.5 dB, better than the intended -10 dB. The proposed microstrip patch antenna design is simple, easily produced, and may be utilized for RFID applications in smart vehicle registry and other applications.

References

[1] Albermany S, Ali HA and Hussain AK. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing, (pp. 1-12).
[2] Singh H, Pandey BK, George S, Pandey D, Anand R, Sindhwani N and Dadheech P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data, (pp. 185-192). Springer, Singapore.
[3] Gupta A, Srivastava A, Anand R and Chawla P. (2019). Smart Vehicle Parking Monitoring System using RFID. International Journal of Innovative Technology and Exploring Engineering, 8(9S), 225-229.
[4] Malik S, Saroha R and Anand R. (2012). A Simple Algorithm for reduction of Blocking Artifacts using SAWS Technique based on Fuzzy Logic. International Journal Of Computational Engineering Research, 2(4), 1097-1101.
[5] Pandey D, Wairya S, Al Mahdawi R, Najim SADM, Khalaf H, Al Barzinji S and Obaid A. (2021). Secret data transmission using advanced steganography and image compression. International Journal of Nonlinear Analysis and Applications, 12(Special Issue), 1243-1257.
[6] Sayel NA, Albermany S and Sabbar BM. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[7] Sindhwani N and Singh M. (2016). FFOAS: Antenna selection for MIMO wireless communication system using firefly optimisation algorithm and scheduling. International Journal of Wireless and Mobile Computing, 10(1), 48-55.
[8] Anand R, Sindhwani N and Dahiya A. (2022, March). Design of a High Directivity Slotted Fractal Antenna for C-band, X-band and Ku-band Applications. In 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom), (pp. 727-730). IEEE.
[9] Kirubasri G, Sankar S, Pandey D, Pandey BK, Singh H and Anand R. (2021, September). A Recent Survey on 6G Vehicular Technology, Applications and Challenges. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), (pp. 1-5). IEEE.
[10] Sharma, Manvinder, Digvijay Pandey, Pankaj Palta and Binay Kumar Pandey. "Design and Power Dissipation Consideration of PFAL CMOS V/S Conventional CMOS Based 2:1 Multiplexer and Full Adder." Silicon, (2021): 1-10.
[11] Jo M, Lim CG and Zimmers EW. "RFID tags detection on water content using a back-propagation learning machine," KSII Trans. on Internet and Information Systems, vol. 1, no. 1, pp. 19-32, 2007.

[12] Sindhwani N and Singh M. (2020). A joint optimization based sub-band expediency scheduling technique for MIMO communication system. Wireless Personal Communications, 115(3), 2437-2455.
[13] Sindhwani N. (2017). Performance analysis of optimal scheduling based firefly algorithm in MIMO system. Optimization, 2(12), 19-26.
[14] Carbunar B, Ramanathan MK, Koyuturk M, Hoffmann C and Grama A. "Redundant-Reader Elimination in RFID Systems," in Proc. 2nd Annu. IEEE Communications and Networks (SECON), 2005, pp. 176-184.
[15] Bendavid Y, Wamba SF and Lefebre LA. "Proof of concept of an RFID-enabled supply chain in a b2b e-commerce environment," in Proc. 8th Intl. Conf. on Electronic Commerce (ICEC06), 2006, pp. 564-568.
[16] Sharma, Manvinder, Harjinder Singh and Digvijay Pandey. "Parametric Considerations and Dielectric Materials Impacts on the Performance of 10 GHz SIW-LWA for Respiration Monitoring." Journal of Electronic Materials, 51, no. 5 (2022): 2131-2141.
[17] Albermany S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[18] Sharma M and Singh H. (2022). Contactless Methods for Respiration Monitoring and Design of SIW-LWA for Real-Time Respiratory Rate Monitoring, IETE Journal of Research, DOI: 10.1080/03772063.2022.2069167.
[19] Lin PJ, Teng HC, Huang YJ and Chen MK. "Design of patch antenna for RFID reader applications," in Proc 3rd Intl Conf on Anti-Counterfeiting, Security and Identification in Communication, 2009, pp. 193-196.
[20] Albermany SA, Hamade FR and Safdar GA. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[21] Gupta KA, Sharma M, Sharma A and Menon V. "A study on SARS-CoV-2 (COVID-19) and machine learning based approach to detect COVID-19 through Xray images." International Journal of Image and Graphics, (2020): 2140010.
[22] Sharma M, Sharma B, Gupta AK and Pandey D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[23] Anand R, Singh J, Pandey D, Pandey BK, Nassa VK and Pramanik S. (2022). Modern Technique for Interactive Communication in LEACH-Based Ad Hoc Wireless Sensor Network. In Software Defined Networking for Ad Hoc Networks, (pp. 55-73). Springer, Cham.
[24] Saini MK, Nagal R, Tripathi S, Sindhwani N and Rudra A. (2008). PC Interfaced Wireless Robotic Moving Arm. In AICTE Sponsored National Seminar on Emerging Trends in Software Engineering, (Vol. 50).
[25] Vaid V and Agarwal S. (2014). Bandwidth optimization using fractal geometry on rectangular microstrip patch antenna with DGS for wireless applications. International conference on medical Imaging, M-health and Emerging Communication Systems (MedCom), pp. 162-167.


[26] Anand R and Chawla P. (2020). A hexagonal fractal microstrip antenna with its optimization for wireless communications. International Journal of Advanced Science and Technology, 29(3s), 1787-1791.
[27] Anand R and Chawla P. (2020). Optimization of a slotted fractal antenna using LAPO technique. International Journal of Advanced Science and Technology, 29(5s), 21-26.
[28] Zaidi A and Alharbi M. "Numerical integration of locally peaked bivariate functions", in Int J Stat Appl Math, vol. 7, pp. 43-48, 2022.
[29] Sharma M, Pandey D, Khosla D, Goyal S, Pandey BK and Gupta AJ. "Design of a GaN-Based Flip Chip Light Emitting Diode (FC-LED) with au Bumps and Thermal Analysis with Different Sizes and Adhesive Materials for Performance Considerations." Silicon, (2021): 1-12.
[30] H. Kumar; N. Singh and S. Singh (2015). Slot Antenna For Frequency Switchable Active Antenna. IEEE, 13th international conference on Advanced Communication Technology (ICACT), phoenix Park, Korea, 2011b, ISBN 978-89-5519-154-7 Vol. I Issue X April 2015.
[31] Pandey D, Wairya S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra B, Tiwari M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical.
[32] Gupta N, Singh VK, Ali Z and Ahirwar J. (2016). Stacked Textile Antenna for Multi Band Application Using Foam Substrate. International Conference on Computational Modeling and Security (CMS 2016), Procedia Computer Science, 85 (2016) 871-877.
[33] Pandey BK, Pandey D, Nassa VK, George S, Aremu B, Dadeech P and Gupta A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data, (pp. 223-230). Springer, Singapore.
[34] Sharma M, Sharma B, Gupta AK and Singla BS. "Design of 7 GHz microstrip patch antenna for satellite IoT-and IoE-based devices." In The International Conference on Recent Innovations in Computing, pp. 627-637. Springer, Singapore, 2020.
[35] Singh S, Singla BS, Sharma M, Goyal S and Sabo A. "Comprehensive Study on Internet of Things (IoT) and Design Considerations of Various Microstrip Patch Antennas for IoT Applications." In Mobile Radio Communications and 5G Networks, pp. 19-30. Springer, Singapore, 2021.
[36] Sharma M and Singh H. "Substrate integrated waveguide based leaky wave antenna for high frequency applications and IoT." International Journal of Sensors Wireless Communications and Control 11, no. 1 (2021): 5-13.
[37] Sharma M, Sharma B, Gupta AK, Singh H and Khosla D. "Design of 7 GHz Microstrip Patch Antenna for Satellite IoT (SIoT) and Satellite IoE (SIoE)-Based Smart Agriculture Devices and Precision Farming." In Real-Life Applications of the Internet of Things, pp. 181-201. Apple Academic Press, 2022.
[38] Anand R and Chawla P. (2022). Bandwidth Optimization of a Novel Slotted Fractal Antenna Using Modified Lightning Attachment Procedure Optimization. In Smart Antennas, (pp. 379-392). Springer, Cham.


[39] [40]

[41]

[42]

[43] [44]

[45]

[46]

[47] [48]

[49]

[50]

[51]

[52]

Albermany SA and Safdar GA. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
Sharma, Manvinder, and Harjinder Singh. "Contactless Methods for Respiration Monitoring and Design of SIW-LWA for Real-Time Respiratory Rate Monitoring." IETE Journal of Research (2022): 1-11.
Albermany SA, Hamade FR and Safdar GA. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
Albermany S and Baqer FM. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
Hussein RI, Hussain ZM and Albermany SA. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
Zaidi A and Alharbi O. "Statistical Analysis of Linear Multi-Step Numerical Treatment", in J of Statistics Applications and Probability, Vol. 12, no. 1, 2023.
Vinodhini V, Kumar MS, Sankar S, Pandey D, Pandey BK and Nassa VK. (2022). IoT-based early forest fire detection using MLP and AROC method. International Journal of Global Warming, 27(1), 55-70.
Sennan SK, Alotaibi Y, Pandey D and Alghamdi S. (2022). EACR-LEACH: Energy-Aware Cluster-based Routing Protocol for WSN Based IoT. CMC-Computers, Materials and Continua, 72(2), 2159-2174.
Jain N, Chaudhary A, Sindhwani N and Rana A. (2021, September). Applications of Wearable devices in IoT. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), (pp. 1-4). IEEE.
Zaïdi A. "Mathematical Methods for IoT-based Annotating Object Datasets with Bounding Boxes", in Mathematical Problems in Engineering, vol. 2022, 2022.
Degerine S and Zaidi A. "Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach," in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.
Zaidi A. "Positive definite combination of symmetric matrices," in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.
Degerine S. and Zaidi A. "Sources colorees", a chapter of a collective work entitled "Séparation de sources 1 concepts de base et analyse en composantes indépendantes", Traité IC2, série Signal et image ["Separation of sources 1 basic concepts and independent component analysis", Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.
Zaïdi A. "Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices", in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: 10.1109/LSP.2019.2909651.
Degerine S and Zaidi A. "Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints", in SIAM Journal on Optimization, 2007, Vol. 17, No. 4: pp. 997-1014, doi: 10.1137/050622821.

[53] Zaïdi A. "Accurate IoU Computation for Rotated Bounding Boxes in R2 and R3", in Machine Vision and Applications, 32, 114, 2021.

Chapter 5

Kidney Stone Detection from Ultrasound Images Using Masking Techniques

Harshita Chaudhary* and Binay Kumar Pandey

Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Udham Singh Nagar, Uttarakhand, India

Abstract

This chapter uses masking techniques to detect stones in the kidney. Masking techniques are prominent approaches to contrast enhancement. The image is first converted to greyscale, and its contrast is then enhanced with Optimum Wavelet-Based Masking (OWBM) using the Enhanced Cuckoo Search Algorithm (ECSA). The cuckoo search algorithm provides global optimization of the contrast enhancement by optimizing the approximation coefficients. Image segmentation and image masking are then applied to detect the stone in the image. The objective of this project is to design and implement a method that detects the presence of a stone in an ultrasound image of a kidney, and in doing so to make the detection system more intelligent.

Keywords: kidney stone, ultrasound images, contrast enhancement, optimum wavelet-based masking, cuckoo search algorithm, image segmentation, thresholding

Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al. ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.


1. Introduction

Kidney stones are very common: about one in ten people is affected at some point in life, and more than half a million people visit emergency rooms each year with kidney stone problems. Stones form through physicochemical processes in the urinary system. The main cause is highly concentrated urine containing salts; when the urine is supersaturated, these salts precipitate and crystallise. Depending on the stone-promoting or stone-inhibiting agents present, the crystals are either excreted or grow into a stone. Kidney stones frequently have no single specific cause, though various factors may raise the risk. A kidney stone generally does not cause symptoms until it moves within the kidney or passes into a ureter, the tube that links the kidney and the bladder [1]. If it becomes lodged in the ureter, it can obstruct urine flow and cause the kidney to swell and the ureter to spasm, both of which can be very painful [1]. Treatment depends on the type and size of the stone: smaller stones do not need invasive treatment, whereas large stones require more extensive treatment. Calcium oxalate stones, which are the most problematic, account for approximately 80% of all kidney stones [2]. Their formation may be caused by genetic factors, and it also depends on age and geography. More important, however, are dietary and lifestyle factors, as well as acquired metabolic defects that lead to crystal formation and, eventually, a kidney stone. There is therefore a need for a program that makes the detection of kidney stones more intelligent.

The organization of this chapter is as follows. Section 2 describes ultrasound images. Section 3 describes contrast enhancement. Section 4 presents the proposed optimum wavelet-based masking in detail. Section 5 discusses the conventional cuckoo search algorithm, the need for adaptive rebuilding of the worst nests, and the enhanced cuckoo search algorithm in detail. Section 6 presents the results and discussion. Lastly, Section 7 presents the conclusions.

2. Ultrasound Images

Ultrasound is one of the most widely used imaging techniques in medicine. It is non-invasive, radiation-free, portable, and inexpensive compared with other medical imaging modalities. It shows the anatomical structures of organs in a tomographic view, and it can provide real-time images during interventional medical procedures. Speckle noise is common in medical ultrasound imaging; it lowers image resolution and contrast and significantly reduces the diagnostic value of the modality. Speckle noise reduction is therefore required when ultrasound imaging is used for tissue characterization. Among the many methods proposed for this task, one class of approaches uses a multiplicative model of speckled image formation and applies a logarithmic transformation to convert the multiplicative speckle noise into additive noise [3]. Although ultrasound has been shown to be relatively safe, no imaging technique that injects additional energy into the body can be considered completely risk-free. When deciding whether to acquire a diagnostic image, the physician should always consider whether the potential benefits of the imaging procedure outweigh any potential risks [4]. In ultrasound scanning, tissue boundaries generate the majority of the reflected sound waves. Smaller structures in relatively homogeneous regions, on the other hand, can produce echoes with random phases. These echoes interfere constructively and destructively, producing speckle patterns in the image. Speckle degrades tissue boundaries and gives homogeneous areas a rough appearance.
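The logarithmic step mentioned above can be illustrated with a short NumPy sketch. It is not taken from the chapter: the synthetic image, the gamma-distributed speckle and the 3x3 mean filter are all assumptions chosen only to show that a product of image and noise becomes a sum in the log domain, where any additive-noise filter can be applied.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noise-free image: a bright square on a darker background.
clean = np.full((64, 64), 100.0)
clean[16:48, 16:48] = 180.0

# Multiplicative speckle model (assumed): observed = clean * speckle,
# with the speckle drawn from a gamma distribution of mean 1.
speckle = rng.gamma(shape=20.0, scale=1.0 / 20.0, size=clean.shape)
observed = clean * speckle

# The logarithm turns the product into a sum:
# log(observed) = log(clean) + log(speckle)
log_observed = np.log(observed)
log_noise = log_observed - np.log(clean)


def mean_filter_3x3(img):
    """Plain 3x3 mean filter with edge replication (stand-in for any additive-noise filter)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    acc = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0


# Filter in the log domain, then return to the intensity domain.
despeckled = np.exp(mean_filter_3x3(log_observed))
```

The exponential of a log-domain mean is a geometric mean, so a small brightness bias remains; correcting it is left out of this sketch.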

3. Contrast Enhancement

Contrast enhancement is a technique for improving image quality in a variety of applications [5, 6, 7, 8, 9]. In medical image analysis, contrast enhancement algorithms are used as a preprocessing module, primarily to improve the clarity of diagnosis [10]. In general, contrast enhancement is divided into two categories: example-based [11] and intensity-based [12]. Histogram-based, transform domain-based, filter-based, and masking-based approaches are all examples of intensity-based enhancement. Unsharp masking is a masking technique in which a scale value is used to create the mask. Depending on the scale value, masks are classified as unsharp masks or high-boost masks. Traditional masking techniques rely on an arbitrarily selected, static scale value [13]. For image enhancement, the authors in [14] proposed an adaptive unsharp masking technique. An optimization algorithm can instead select the optimal scale value dynamically. Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Cuckoo Search Algorithm (CSA), and others are examples of optimization algorithms [15-23] used in image processing applications. The Enhanced Genetic Algorithm (EGA) is a modified technique that selects crossover and mutation ratios based on a threshold value [24, 25]. In this work, the traditional cuckoo search algorithm is modified with genetic operators and adaptive parameters to improve its performance.

By increasing the brightness difference between objects and their backgrounds, contrast enhancement improves the perceptibility of objects in the scene. It is usually done in two steps, a contrast stretch and a tonal enhancement, though the two can be performed simultaneously. A contrast stretch improves the brightness differences uniformly across the dynamic range of the image, whereas a tonal enhancement improves the brightness differences in the shadow (dark), midtone (grey), or highlight (bright) regions at the expense of the brightness differences in the other regions.
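The difference between a contrast stretch and a tonal enhancement can be sketched in a few lines of NumPy. This is a generic illustration, not the chapter's method: the percentile limits, the power-law (gamma) curve and the function names are assumptions.

```python
import numpy as np


def contrast_stretch(img, low_pct=2.0, high_pct=98.0):
    """Linearly map the [low, high] percentile range onto [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return np.zeros_like(img)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


def tonal_enhance(img01, gamma=0.5):
    """Power-law curve on a [0, 1] image.

    gamma < 1 expands brightness differences in the shadows and
    compresses them in the highlights; gamma > 1 does the opposite.
    """
    return np.power(img01, gamma)


# Low-contrast ramp that only occupies grey levels 100..150.
img = np.tile(np.linspace(100, 150, 256), (4, 1))
stretched = contrast_stretch(img, 0, 100)   # uniform gain over the whole range
toned = tonal_enhance(stretched, gamma=0.5)  # favours the dark end
```

After the stretch, neighbouring grey levels are equally spaced everywhere; after the tonal curve, the spacing is larger at the dark end and smaller at the bright end, which is the trade-off described above.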

4. Proposed Optimum Wavelet Based Masking

We propose the use of an optimization algorithm to select the scale in a dynamic unsharp masking technique [26]. The original image is decomposed into approximation (low-pass) coefficients and detail (high-pass) coefficients using the Discrete Wavelet Transform (DWT); the approximation coefficients are given by Eq. (1). The approximation reconstructed with the Inverse Discrete Wavelet Transform (IDWT) is given by Eq. (2).

\[
W_\varphi(j_0, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \varphi_{j_0, m, n}(x, y) \tag{1}
\]

\[
f_A(x, y) = \frac{1}{\sqrt{MN}} \sum_{m} \sum_{n} W_\varphi(j_0, m, n)\, \varphi_{j_0, m, n}(x, y) \tag{2}
\]

Here f(x, y) is the input spatial-domain image of size M × N with discrete variables x and y; W_φ(j0, m, n) are the approximation coefficients; φ_{j0,m,n}(x, y) is the scaling function; j0 is the arbitrary initial scale, with j0 = 0; m and n are the discrete variables of the wavelet domain; and f_A(x, y) is the reconstructed approximation. The reconstructed approximation is a wavelet low-pass filtered image. For the masking formulation, we use this wavelet-approximated image rather than the original image. The reconstructed low-pass signal is optimally scaled using an enhanced cuckoo search algorithm. Figure 1 shows a generalized enhancement technique in the form of a block diagram.

Figure 1. The generalized enhancement technique is depicted as a block diagram.

The mask that was created is an optimal wavelet mask that can improve image contrast dynamically. As shown in Figure 2, an optimal wavelet mask is applied to the original image, and the output image is an intensified image. Ultrasound images are being used to test the proposed technique.

Figure 2. Proposed enhancement technique is depicted as a block diagram.
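The low-pass reconstruction of Eqs. (1) and (2) can be illustrated with a single-level 2-D Haar transform written directly in NumPy. The Haar wavelet is used here only because it fits in a few lines; it is a stand-in for the db4 wavelet named in the algorithm below, and the function names and sub-band labels are assumptions (sub-band naming conventions vary between texts).

```python
import numpy as np


def haar_dwt2(img):
    """One level of the orthonormal 2-D Haar DWT (image sides must be even)."""
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # approximation coefficients
    lh = (a + b - c - d) / 2.0  # detail: difference between rows
    hl = (a - b + c - d) / 2.0  # detail: difference between columns
    hh = (a - b - c + d) / 2.0  # detail: diagonal
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


rng = np.random.default_rng(0)
img = rng.uniform(0.0, 255.0, size=(8, 8))

ll, lh, hl, hh = haar_dwt2(img)
zeros = np.zeros_like(ll)

# Reconstructed approximation f_A: invert the transform with all detail
# coefficients set to zero, so only the low-pass content survives.
approx = haar_idwt2(ll, zeros, zeros, zeros)
```

For the Haar wavelet the reconstructed approximation is simply each 2x2 block replaced by its mean, which makes the low-pass character of f_A easy to see. With a longer filter such as db4 the smoothing kernel is wider, but the role of f_A in the mask is the same.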


4.1. Proposed OWBM Algorithm

Input: original medical image.
Output: enhanced medical image.
Step 1: Read the input image.
Step 2: Decompose the image with the DWT (db4) into frequency sub-bands [LL, LH, HL, HH].
Step 3: Reconstruct the approximation coefficients LL.
Step 4: Using ECSA, choose the best scale value.
Step 5: Scale the approximated image by the selected scale value.
Step 6: Subtract the scaled approximation from the original image to obtain the mask.
Step 7: Apply the mask to the original image to obtain the enhanced image.
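The masking steps can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the step list leaves the exact arithmetic open, so the sketch uses the standard unsharp-masking form, enhanced = f + k (f - f_A), with a fixed scale value k in place of the ECSA search, and a Haar low-pass (2x2 block means) in place of the db4 reconstruction. The names `lowpass_haar` and `owbm_enhance` are hypothetical.

```python
import numpy as np


def lowpass_haar(img):
    """Level-1 Haar approximation: each 2x2 block is replaced by its mean.

    Stand-in for the db4 low-pass reconstruction; image sides must be even.
    """
    h, w = img.shape
    blocks = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.repeat(np.repeat(blocks, 2, axis=0), 2, axis=1)


def owbm_enhance(img, k):
    """Wavelet-based unsharp masking with scale value k (assumed form)."""
    f = img.astype(np.float64)
    f_a = lowpass_haar(f)        # low-pass (approximated) image
    mask = f - f_a               # high-frequency residual
    enhanced = f + k * mask      # add the scaled mask back to the original
    return np.clip(enhanced, 0.0, 255.0)


# Vertical step edge that falls inside a 2x2 block (columns 6 and 7).
edge = np.full((8, 16), 50.0)
edge[:, 7:] = 200.0
sharpened = owbm_enhance(edge, k=0.5)

# A flat image has no high-frequency content, so it is left unchanged.
flat = np.full((8, 8), 90.0)
unchanged = owbm_enhance(flat, k=0.5)
```

In the chapter the scale value is chosen by the enhanced cuckoo search rather than fixed; any optimizer that returns k in the range given in Table 1 could be plugged in where `k=0.5` appears.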

5. Cuckoo Search Algorithm

The first three subsections of this section cover the optimization algorithm: 1. The traditional cuckoo search algorithm. 2. The need for adaptive rebuilding of the worst nests. 3. The proposed enhanced cuckoo search algorithm. The remaining subsections discuss image segmentation and thresholding.

5.1. The Traditional Cuckoo Search Algorithm

The cuckoo search algorithm (CSA) was proposed by the authors in [27] as a biologically inspired optimization technique based on the brood parasitism of cuckoo birds. Cuckoos do not normally build nests, preferring instead to lay their eggs in the nests of other birds. If the host bird discovers that the eggs are not its own, it will either abandon the eggs or rebuild the nest elsewhere. Each egg in a nest represents a solution, and a cuckoo egg represents a new solution. In cuckoo search, a Lévy flight is used to generate the new solution x_i^(t+1):

\[
x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \text{Lévy}(\lambda) \tag{3}
\]

In Eq. (3), α is the step size, which should be related to the scale of the problem of interest. The product ⊕ denotes entry-wise multiplication. The Lévy flight is essentially a random walk whose random step length is drawn from the Lévy distribution:

\[
\text{Lévy} \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3 \tag{4}
\]

The goal of this technique is to replace a bad egg with a new, better solution. In the traditional form, each nest has one egg. The basic algorithm can, however, be extended to a wider range of applications [28, 29].
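A Lévy-flight update of the kind described above can be sketched with Mantegna's algorithm, a common way of drawing heavy-tailed steps in cuckoo search implementations. The chapter does not say how its steps are generated, so the use of Mantegna's algorithm, the index `beta` (assumed to correspond to λ = 1 + beta) and the default step size `alpha` are all assumptions.

```python
import math

import numpy as np


def levy_steps(size, beta=1.5, rng=None):
    """Draw heavy-tailed steps with Mantegna's algorithm (0 < beta <= 2)."""
    rng = np.random.default_rng() if rng is None else rng
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)


def cuckoo_move(x, alpha=0.01, beta=1.5, rng=None):
    """Eq. (3): new solution = current solution + alpha (entry-wise) Levy step."""
    return x + alpha * levy_steps(x.shape, beta, rng)


rng = np.random.default_rng(0)
steps = levy_steps(10_000, rng=rng)

nests = rng.uniform(0.0, 1.0, size=5)      # five candidate scale values
new_nests = cuckoo_move(nests, rng=rng)    # one Levy-flight move per nest
```

Most steps are small, but occasional very long jumps occur; these long jumps are what let the search escape the neighbourhood of a local optimum.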

5.2. Need for Adaptive Rebuilding of Worst Nests (ARWN)

In the traditional cuckoo search, the rate or probability (Pa) of nest rebuilding remains constant regardless of nest fitness [30]. Yang and Deb tested probability values ranging from 0.01 to 0.5 and found that 0.25 was the best fit for their needs. Rebuilding nests at a fixed rate can leave the search confined to the region of a local minimum or maximum. Li and Yin proposed a self-adaptive approach based on two learning parameters and static learning.

5.3. Enhanced Cuckoo Search Algorithm

In this section we propose an adaptive cuckoo search algorithm for scale value optimization. In every iteration, the fitness of each nest is estimated, and a threshold value is set based on the fitness values. Nests below the threshold are abandoned in each iteration, so the probability of nest rebuilding changes dynamically. The abandoned nests are replaced using adaptive crossover and mutation. In our method, the probability of nest rebuilding and the adaptive genetic operators share the same threshold value. As shown in Figure 3, our proposed technique, unlike the traditional cuckoo search algorithm, uses adaptive crossover and adaptive mutation to operate dynamically. The cuckoo search and genetic algorithm implementation parameters are shown in Table 1.

Figure 3. Flowchart of the proposed genetic operators-based enhanced cuckoo search.

Table 1. ECSA implementation parameters

Parameter                                Value
Number of nests                          50
Cuckoo iterations                        50
Pa (probability of nest rebuilding)      Dynamic
Range of scale value (k)                 0 < k < 1
GA population size                       15
Number of GA iterations                  15
Cross-over type                          2-point
Mutation type                            Single byte (flipping)
Cross-over ratio                         Dynamic
Mutation ratio                           Dynamic
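The adaptive abandonment idea can be sketched as a small real-valued search over the scale value k. This is a simplified illustration under several assumptions, not the authors' algorithm: the threshold is taken to be the mean fitness, arithmetic crossover and Gaussian mutation stand in for the 2-point crossover and flipping mutation of Table 1, the inner GA loop is collapsed to a single replacement step, and the toy fitness function is hypothetical (in the chapter the fitness would measure the contrast of the enhanced image).

```python
import math

import numpy as np


def levy(size, rng, beta=1.5):
    """Heavy-tailed steps via Mantegna's algorithm (assumed step generator)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)


def ecsa_sketch(fitness, n_nests=50, n_iter=50, alpha=0.05, seed=0):
    """Maximise fitness(k) for a scalar k kept inside (0, 1)."""
    rng = np.random.default_rng(seed)
    lo, hi = 1e-3, 1 - 1e-3
    nests = rng.uniform(lo, hi, n_nests)
    fit = np.array([fitness(k) for k in nests])

    for _ in range(n_iter):
        # Levy-flight proposal for every nest, kept only if it is better.
        cand = np.clip(nests + alpha * levy(n_nests, rng), lo, hi)
        cand_fit = np.array([fitness(k) for k in cand])
        better = cand_fit > fit
        nests[better], fit[better] = cand[better], cand_fit[better]

        # Adaptive abandonment: nests below the threshold are rebuilt, so the
        # rebuilt fraction (the effective Pa) changes from iteration to iteration.
        threshold = fit.mean()           # assumed threshold rule
        worst = fit < threshold
        n_bad = int(worst.sum())
        if 0 < n_bad < n_nests:
            good = nests[~worst]
            p1 = rng.choice(good, n_bad)
            p2 = rng.choice(good, n_bad)
            w = rng.uniform(0, 1, n_bad)
            children = w * p1 + (1 - w) * p2          # arithmetic crossover
            children += rng.normal(0, 0.02, n_bad)    # Gaussian mutation
            nests[worst] = np.clip(children, lo, hi)
            fit[worst] = np.array([fitness(k) for k in nests[worst]])

    best = int(np.argmax(fit))
    return float(nests[best]), float(fit[best])


# Hypothetical fitness with its maximum at k = 0.3.
best_k, best_fit = ecsa_sketch(lambda k: -(k - 0.3) ** 2)
```

The default nest count and iteration count follow Table 1; the remaining defaults (`alpha`, the mutation spread, the seed) are illustrative only.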

5.4. Image Segmentation

There are numerous image segmentation techniques in the literature. Some of these methods rely solely on the grey-level histogram, while others make use of spatial details and fuzzy set theoretic approaches. The majority of these techniques are ineffective in noisy environments [31]. Image segmentation is the process of dividing a digital image into multiple segments, or groups of pixels, that are similar according to homogeneity criteria such as color, intensity, or texture, in order to locate and identify objects and boundaries in the image [32]. Practical applications of image segmentation include filtering noisy images; medical applications (locating tumors and other pathologies, measuring tissue volumes, computer-guided surgery, diagnosis, treatment planning, and the study of anatomical structure); locating objects such as roads and forests in satellite images; face recognition; and fingerprint recognition. The choice of one segmentation technique over another, as well as the level of segmentation, is determined by the type of image and the characteristics of the problem at hand [33]. Image segmentation research has received a great deal of attention for many years. There are thousands of different segmentation techniques in the literature, but no single method works equally well for all types of images, and an algorithm developed for one type of image may not be applicable to another. Difficult issues therefore remain, such as the development of a unified approach to image segmentation that can be applied to all types of images, and the selection of an appropriate technique for a specific type of image [34-40].

5.5. Thresholding

Image thresholding is a quick and easy way to divide an image into foreground and background. It is an image segmentation technique that isolates objects by converting a colour or grayscale image into a binary image, which is simply black and white, so that the image is easier to analyse. Thresholding works best with images that have a lot of contrast. Histogram and multi-level thresholding are two common image thresholding algorithms. Thresholding is most commonly used to select areas of interest in an image while ignoring the parts that are not of interest. Basic thresholding can be done with a simple NumPy array manipulation, for example to separate the pixels belonging to a plant's root system from a black background; the skimage (scikit-image) library provides functions that perform the same task.
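Global thresholding by array comparison can be sketched in NumPy as follows. The threshold is chosen here with Otsu's method, which is one common way of picking a single global threshold; the chapter does not state which rule it uses, so this choice, the synthetic image and the function name are assumptions.

```python
import numpy as np


def otsu_threshold(img_u8):
    """Otsu's global threshold for an 8-bit grayscale image (uint8 array)."""
    hist = np.bincount(img_u8.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of the class "<= t"
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to t
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)                    # between-class variance per t
    valid = denom > 0
    sigma_b[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


# Synthetic "ultrasound" image: dark background with one bright region.
img = np.full((64, 64), 30, dtype=np.uint8)
img[20:30, 25:40] = 220                        # hypothetical bright object

t = otsu_threshold(img)

# Thresholding itself is a single array comparison: 1 = foreground, 0 = background.
binary = (img > t).astype(np.uint8)
```

The resulting binary image can be passed to any later step that only needs to know which pixels belong to the bright object.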

6. Results and Discussion

This section examines the performance of the proposed OWBM and ECSA. 1) Quantitative results were obtained using the enhanced cuckoo search algorithm. 2) Ultrasound images were used to perform a quantitative and qualitative analysis of OWBM.

Figure 4. Waveform for different types of detection.

According to the results, the model is 80% accurate. In the first run of the code, a small white region appears in the output for the ultrasound image, indicating that stones are present in the image; in the second run, the entire output is black, indicating that no stones are present. This is because the output values are configured so that a value of 1 means a stone is present and a value of 0 means no stone is present. The black patches in the output are a representation of the noise. The waveform of the different types of detection is shown in Figure 4.
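The decision rule described above (1 = stone, 0 = no stone) can be written as a small helper that summarises a binary output mask. The function name and the `min_fraction` parameter are hypothetical; the parameter is included only so that a few isolated noise pixels need not count as a stone.

```python
import numpy as np


def stone_report(binary_mask, min_fraction=0.0):
    """Summarise a 0/1 mask: fraction of white pixels and a presence flag."""
    white_fraction = float(np.count_nonzero(binary_mask)) / binary_mask.size
    return {
        "white_fraction": white_fraction,
        "stone_present": white_fraction > min_fraction,
    }


# First case: a small white region is present in the output mask.
with_stone = np.zeros((64, 64), dtype=np.uint8)
with_stone[30:34, 40:45] = 1

# Second case: the output mask is entirely black.
without_stone = np.zeros((64, 64), dtype=np.uint8)

report_a = stone_report(with_stone)
report_b = stone_report(without_stone)
```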


In this work, the λ (lambda) value used with the cuckoo search algorithm is image independent. If the λ value were image dependent, some images could be over-enhanced, which would directly affect the results or make them incorrect. In the algorithm, the DWT is used for filtering, and the IDWT is used to reconstruct the filtered image.

Conclusion and Future Outlook

In this work, ECSA is used to perform optimal wavelet-based masking, and image masking is used to detect a kidney stone. The proposed method uses the approach of Daniel et al. (the base paper) to obtain an optimized, contrast-enhanced image as a preprocessing step for kidney stone detection. To recover proper detail from the ultrasound image, wavelet-based masking with the enhanced cuckoo search algorithm is used to boost the contrast. Image thresholding and segmentation with the global thresholding method are then used to detect the presence of a stone in the kidney ultrasound image. The base paper discusses only optimized contrast enhancement, whereas the proposed method combines it with image segmentation techniques to detect a stone against a specific image background. In this way, we aim to make the system more intelligent so that it can detect the stone easily.

References

[1] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
[2] Brisbane, W., Bailey, M. R., & Sorensen, M. D. (2016). An overview of kidney stone imaging techniques. Nature Reviews Urology, 13(11), 654-662.
[3] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[4] Sorensen, M. D., Harper, J. D., Hsi, R. S., Shah, A. R., Dighe, M. K., Carter, S. J., & Bailey, M. R. (2013). B-mode ultrasound versus color Doppler twinkling artifact in detecting kidney stones. Journal of endourology, 27(2), 149-153.
[5] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., & Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.

[6] Gupta, M., & Anand, R. (2011). Color image compression using set of selected bit planes. International Journal of Electronics & Communication Technology, 2(3), 243-248.
[7] Malik, S., Saroha, R., & Anand, R. (2012). A Simple Algorithm for reduction of Blocking Artifacts using SAWS Technique based on Fuzzy Logic. International Journal Of Computational Engineering Research, 2(4), 1097-1101.
[8] Pandey, B. K., Mane, D., Nassa, V. K. K., Pandey, D., Dutta, S., Ventayen, R. J. M., & Rastogi, R. (2021). Secure text extraction from complex degraded images by applying steganography and deep learning. In Multidisciplinary Approach to Modern Digital Steganography (pp. 146-163). IGI Global.
[9] Pandey, D., Nassa, V. K., Jhamb, A., Mahto, D., Pandey, B. K., George, A. H., & Bandyopadhyay, S. K. (2021). An integration of keyless encryption, steganography, and artificial intelligence for the secure transmission of stego images. In Multidisciplinary Approach to Modern Digital Steganography (pp. 211-234). IGI Global.
[10] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[11] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[12] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[13] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[14] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[15] Srivastava, A., Gupta, A., & Anand, R. (2021). Optimized smart system for transportation using RFID technology. Mathematics in Engineering, Science & Aerospace (MESA), 12(4).
[16] Pandey, B. K., Pandey, D., Wariya, S., Aggarwal, G., & Rastogi, R. (2021). Deep Learning and Particle Swarm Optimisation-Based Techniques for Visually Impaired Humans’ Text Recognition and Identification. Augmented Human Research, 6(1), 1-14.
[17] Anand, R., & Chawla, P. (2020). Optimization of inscribed hexagonal fractal slotted microstrip antenna using modified lightning attachment procedure optimization. International Journal of Microwave and Wireless Technologies, 12(6), 519-530.
[18] Sindhwani, N., & Singh, M. (2020). A joint optimization based sub-band expediency scheduling technique for MIMO communication system. Wireless Personal Communications, 115(3), 2437-2455.

[19] Dahiya, A., Anand, R., Sindhwani, N., & Kumar, D. (2022). A Novel Multi-band High-Gain Slotted Fractal Antenna using Various Substrates for X-band and Ku-band Applications. MAPAN, 37(1), 175-183.
[20] Pandey, D., Pandey, B. K., & Wairya, S. (2021). Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Computing, 25(2), 1563-1580.
[21] Anand, R., & Chawla, P. (2022). Bandwidth Optimization of a Novel Slotted Fractal Antenna Using Modified Lightning Attachment Procedure Optimization. In Smart Antennas (pp. 379-392). Springer, Cham.
[22] Manchanda, N. S., Singh, M., & Sharma, S. Comparison of Various QPSK Schemes for Space Time Trellis Code over MIMO Channel. In ICCE, 47-50, 2012.
[23] Pandey, D., & Pandey, B. K. (2022). An Efficient Deep Neural Network with Adaptive Galactic Swarm Optimization for Complex Image Text Extraction. In Process Mining Techniques for Pattern Recognition (pp. 121-137). CRC Press.
[24] Sindhwani, N., Bhamrah, M. S., Garg, A., & Kumar, D. (2017, July). Performance analysis of particle swarm optimization and genetic algorithm in MIMO systems. In 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1-6). IEEE.
[25] Sindhwani, N., & Singh, M. (2014). Transmit antenna subset selection in MIMO OFDM system using adaptive mutation Genetic algorithm. arXiv preprint arXiv:1410.6795.
[26] Anand, R., Shrivastava, G., Gupta, S., Peng, S. L., & Sindhwani, N. (2018). Audio watermarking with reduced number of random samples. In Handbook of Research on Network Forensics and Analysis Techniques (pp. 372-394). IGI Global.
[27] Sindhwani, N., & Singh, M. (2014). Comparison of adaptive mutation genetic algorithm and genetic algorithm for transmit antenna subset selection in MIMO-OFDM. International Journal of Computer Applications, 97(22).
[28] Pandey, D., Pandey, B. K., & Wariya, S. (2019). Study of Various Types Noise and Text Extraction Algorithms for Degraded Complex Image. Journal of Emerging Technologies and Innovative Research, 6(6), 234-247.
[29] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[30] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[31] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[32] Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[33] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.

本书版权归Nova Science所有

116 [34]

[35]

[36] [37]

[38]

[39] [40]

Harshita Chaudhary and Binay Kumar Pandey Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore. Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12). Zaïdi A., “Accurate IoU Computation for Rotated Bounding Boxes in 𝑅2 and 𝑅3 ”, in Machine Vision and Applications, 32, 114, 2021. S. Degerine & A. Zaidi, “ Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints“, in SIAM Journal on Optimization, 2007, Vol. 17, No. 4 : pp. 997-1014, doi: 10.1137/050622821. A. Zaidi A., “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077. Zaidi & O. Alharbi, “ Statistical Analysis of Linear Multi-Step Numerical Treatment”, in J. of Statistics Applications & Probability, Vol. 12, no. 1, 2023. Degerine S. & A. Zaidi, “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.


Chapter 6

Biometric Technology and Trends

Mandeep Kaur1, Ritesh Sinha1, Syed Ali Rizvi1, Divyansh Jain1, Harsh Jindal1 and Digvijay Pandey2

1Department of Computer Science Engineering, Chandigarh Group of Colleges, Landran, Mohali, Punjab, India
2Department of Technical Education, Institute of Engineering and Technology (IET), Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, India

Corresponding Author's Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing
Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al.
ISBN: 979-8-88697-832-2
© 2023 Nova Science Publishers, Inc.

Abstract

In our highly integrated and interconnected world, establishing one's identity becomes more valuable with each passing day. Questions such as "Is she truly who she says she is?", "Is this individual authorized to use this facility?" and "Is he on the government's watch list?" are asked in many situations, from obtaining a driving license to entering a country. The demand for dependable user authentication techniques has grown with rising security concerns and rapid improvements in networking, communication, and mobility. Biometrics, defined as the study of establishing an individual's identity based on physiological and behavioural characteristics, has been gaining prominence as a reliable method of identification. Biometric systems are now used for identification in a variety of commercial, civilian, and forensic applications. In this paper, we present an overview of biometrics and describe some of the key research concerns that must be addressed for biometric technology to be an appropriate and efficient tool for data and information security. The main contributions of this overview are to: 1) analyze applications where biometric scans are used to solve information security issues and vulnerabilities; 2) list the underlying challenges faced by biometric systems in real-world applications; and 3) seek solutions to scalability and security issues in very large biometric systems.

Keywords: biometrics, facial recognition, fingerprints and retinal / iris recognition

1. Introduction

Biometrics is perhaps the most effective method for reliably and quickly identifying and validating individuals based on their unique biological traits. It enables a person to be authenticated and authorized using data that is specific to that person, identifiable, and verifiable. At present, biometrics is used routinely in countries such as the US, India, Russia, Taiwan, and many others [1].

• The Department of Homeland Security in the United States employs biometrics to deter unauthorized immigration.
• The Russian government has been keeping track of people who have been ordered to quarantine for 14 days using a network of 100,000 facial recognition cameras.
• Taiwan's government proposed utilizing retinal scans or fingerprints to scan data into the cards to assist government efforts to digitize records and services.
• In India, Aadhaar is currently the largest biometric database in the world.

The basis of the operation of biometrics is given below. Biometrics can perform two distinct functions, identification and verification, each of which answers one of the following questions:

"Who are you?"

• Identification: the person is picked out from among many (1:N match). The individual's biometric data is compared with that of other people in the same database, or possibly in other connected databases [2].

"Are you genuinely who you claim to be?"

• Verification: a person's identity is confirmed by comparing the newly presented biometric data with that person's previously recorded data (1:1 match).

2. History

It is human instinct to look for what is unique in ourselves and to point out what seems different in others. Biometric systems build on this innate fascination. In a sense, this is how we arrived at the idea of distinguishing individuals by biological criteria such as fingerprints, facial features, or retinal/iris patterns. Today, biometrics addresses the longstanding problem of conclusively proving one's identity through one's distinctive features. Looking back through history, it was the Chinese emperor Ts'In She who first used fingerprints to authenticate seals. In the commercial world, biometrics was first introduced in India in 1858 by William James Herschel, a British administrator. In criminal investigation, such techniques were first used in France in the late 19th century by a French police officer, Bertillon, who proposed the idea of scientific policing. Biometrics was later adopted for crime fighting in various countries such as the UK and US [3].

In India's case, the use of biometrics dates from 1858 (as shown in Figure 1). During the colonial era, William James Herschel, Chief Magistrate of the Hooghly district in Jangipur, appointed by the British Crown, was the first to use a handprint to seal official government documents. This created a sense of being recognized and identified as an individual, which had previously been lacking. Consequently, people living under British colonial rule felt more tightly bound by its systems because of fingerprint-based identification [5]. With the use of handprints and personal contact, people began to believe that documents were binding. Fingerprint identification became official in 1897, however, with the establishment of the world's first fingerprint bureau in Kolkata. These initial developments led to today's modern world and opened the door to many technologies that could not have been imagined at the time.

Figure 1. First biometric [4].

3. Conventional Biometrics and Modern Age Biometrics Distinguished

Biometrics is more secure and pertinent than conventional methods of human recognition. It can strengthen or replace existing technologies in a few instances. It is interesting to note that, although the term "biometrics" has existed since the 19th century, it was not used to describe these technologies until the 1980s; it first appeared in this sense in a 1981 article in The New York Times [6]. Table 1 compares modern age biometrics with conventional biometrics.

Table 1. Differences between Conventional and Modern Age Biometrics [7]

| Conventional Biometrics | Modern Age Biometrics |
| --- | --- |
| The first non-automated biometrics references were excavated in Nova Scotia in the form of prehistoric handwriting images with ridge patterns. | The Federal Bureau of Investigation (FBI) produced the first AFIS in 1974. |
| Biometric authentication is the conventional technique of identification, with origins dating back to at least 6000 B.C. | Because storing entire fingerprint scans would be expensive, it retained only the minutiae, the most crucial points of a fingerprint. |
| Chinese merchants used ink to stamp palm prints and children's footprints on paper to tell one child from another. | Biometrics has superseded such antiquated methods. Data is now kept in a computer system and may be accessed with a simple tap of the fingers. |
| Assessing fingerprints for a match in a paper database took police examiners a month or more. | Computers became able to compare prints against a database of 100,000 in less than 30 minutes. Nowadays, the same search may be completed in less than the blink of an eye. |
| After the introduction of machine recognition technology, which was faster and more convenient, the use and further practice of conventional biometric methodologies diminished. | One of the first companies to work on facial recognition software, in the 1960s, was Panoramic Research in Palo Alto, California. |
| Manual checks for biometric border control are both slower and less accurate. | Biometrics is already embedded in a chip in several passports. Such passports are, in fact, essential if one wants to visit the United States without obtaining a visa. The system then matches actual fingerprints and/or faces to those saved on the chip. |
| Both are used mainly to identify criminals, but the process of execution has changed over time. | Both are used mainly to identify criminals, but the process of execution has changed over time. |
| Accuracy: Old biometrics were somewhat less accurate than modern-day biometrics, if we ignore the system failures of automated biometric identification systems. | Accuracy: The likelihood of a hit is increased by improved algorithms and a comprehensive set of strong tools that examine and improve the quality of prints. |
| These were utilized in increasing numbers as law enforcement agencies joined in the use of fingerprints as a means of personal identification. Due to the slow pace of the process, however, it faded away over time. | Speed: Users may execute numerous tasks quickly and precisely while transactions are performed in the background. |
| Such technology was absent in conventional biometric identification systems, which mainly relied on record-keeping of individuals in hard copies. | Interoperability: External ABIS systems, criminal history systems, live scans, cell phones, internet solutions, and other information systems can all be interfaced with the system. |
| Flexibility in this regard can only be achieved by transferring the identification of individuals through traditional means such as passwords, PINs, keys, and tokens. | Flexibility: It works in a multitasking environment and is designed to scale up as demand and upgrade requirements evolve. |
| Advanced sensors and security systems have little place in conventional biometrics, since all it relied on was pen and paper (palm paper). | Scalability: In response to rising client expectations, it manages cost-effective growth. |

4. Biometrics: Trends and Prospects

Marketers and innovators are rolling out new ways to use biometrics on an everyday basis. Figure 2 shows the range of activities targeted for biometric identification. Security for personal devices is a common use of biometrics in today's market. iProov's app, available on both Apple and Android devices, lets users authenticate and unlock their devices via face recognition. Predictions for the spread of biometric-enabled smartphones suggest that consumers will have increasing opportunities to use biometrics to access their personal devices. Biometric applications in workplace identification, financial-sector authentication, and government services are also on the rise [8]. Biometric technology is likewise used by health providers and leisure destinations such as Disneyworld and Six Flags Magic Mountain to track visitors throughout their networks and facilities. Here, choice is a crucial point of differentiation. Consumers in some areas, such as financial services, can choose to provide biometrics or to select another service. Where government and the workplace are concerned, providing biometric identification is often a requirement for receiving benefits or a condition of employment. The use of biometrics to establish digital identity is one of the most promising prospects for the industry now and in the near future. While the technology helps to increase security and convenience in some ways, it also introduces new risks. The most prominent concerns among the general public are the use of biometric data for mass surveillance and the risk of misidentification. The benefits seem to outweigh the fears, driving revenue and high adoption rates each year. Software developers collaborate extensively with biometric technology providers to develop new applications and solutions [9]. Today's world is becoming more digital every day and requires increasingly sophisticated methods of securing and accessing information. Biometric technology has established itself as an integral security feature.

Authentication is the process of reliably confirming someone's identity. When authentication mechanisms are divided into what you know, what you have, and what you are, biometric devices authenticate you by what you are. You cannot "loan out" anything that would help someone fool a biometric device, nor can anything be stolen. Before any biometric method can be used for authentication, the corresponding attribute of every individual in the group must be available in the database. This is called enrollment [11].


Figure 2. Biometric Targeted Activities [10].

Authentication is done either by verification or by identification. In verification, a person's attribute is compared with a single record in a database to determine whether the person is who he or she claims to be. This is useful, for instance, when a bank needs to verify a customer's signature on a check. In identification, a person's attribute is matched against all records in a database to determine whether that person has a record there. This is useful, for instance, when a company needs to restrict access to a building to its employees [12].
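The three operations described above (enrollment, 1:1 verification, and 1:N identification) can be illustrated with a minimal sketch. It assumes that each biometric sample has already been reduced to a short numeric feature vector and that a plain Euclidean distance with a fixed threshold is an acceptable matcher; the store `templates`, the function names, and the threshold value are hypothetical and are not taken from any particular system.

```python
import math

# Hypothetical in-memory template store: user id -> enrolled feature vector.
templates = {}

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def enroll(user_id, features):
    """Enrollment: store the user's reference template."""
    templates[user_id] = list(features)

def verify(claimed_id, sample, threshold=0.5):
    """Verification (1:1): compare the sample with the claimed user's record only."""
    reference = templates.get(claimed_id)
    return reference is not None and distance(reference, sample) <= threshold

def identify(sample, threshold=0.5):
    """Identification (1:N): search every record and return the best match, if any."""
    best_id, best_dist = None, float("inf")
    for user_id, reference in templates.items():
        d = distance(reference, sample)
        if d < best_dist:
            best_id, best_dist = user_id, d
    return best_id if best_dist <= threshold else None

enroll("alice", [0.10, 0.80, 0.30])
enroll("bob", [0.90, 0.20, 0.60])
probe = [0.12, 0.79, 0.33]          # a fresh, slightly noisy sample
print(verify("alice", probe))       # claim checked against one record
print(verify("bob", probe))         # wrong claim, same sample
print(identify(probe))              # no claim: search the whole store
```

Note that verification touches a single record, while the cost of identification grows with the number of enrolled users.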

5. Types of Biometric

Figure 3 shows several common methods under each classification [13]. A wide variety of biometric devices is available. Most are still too expensive for everyday use, but in some cases prices are coming down to the point where we may start to see them in regular use. Biometric methods can be split into two broad categories: physiological and behavioral [14]. Accordingly, biometrics are used primarily in two ways:

1. Through biological/physiological traits: These include fingerprints, the shape of the hand and fingers, vein patterns, the eye (iris and retina), and the appearance of the face (facial recognition). Medical units and police forensics may employ DNA, blood, saliva, or urine to identify an individual uniquely [15].

[Figure 3: tree diagram. BIOMETRICS divides into PHYSIOLOGICAL (face, fingerprint, hand, iris, DNA) and BEHAVIORAL (keystroke, signature, voice).]

Figure 3. Several common strategies under a particular classification [13].

2. Through behavioral traits: Behavioral traits are attributes of how an individual acts, such as demeanor, tone of voice, keystroke dynamics, the sound of footsteps, signature dynamics, and so on. This area still needs a lot of development and enhancement, but it does provide a glimpse of how the future world is going to be [16].

Technology available today includes:

• Face Recognition: This procedure analyzes facial geometry based on the distances between facial features such as the nose, mouth, and eyes. Some systems combine geometric features with skin texture. The method works with standard video cameras and supports both verification and identification. Just don't come to work with a black eye and a swollen jaw! [17-20].
• Fingerprint Readers: Although there are many ways to measure the features of fingerprints, the two most common are minutiae-based and image-based. In a minutiae-based system, the system creates a graph indicating where individual ridges start, end, or branch. In an image-based system, the system creates a fingerprint image and searches a database for similar images. This might seem an obvious technology, since fingerprints have been in use for many years. Automating it, however, has never been very successful, despite the availability of commercial devices [21].
• Handprint Readers: These are more commonly used than fingerprint readers. They rely on measurements of the hand: finger length, width, and so on. They are not as accurate as fingerprints (they produce more false positives), yet they are less expensive and have fewer problems [22].
• Iris Scanner: Like a retinal scanner, this creates a map of the structure of the iris. Its significant advantage is a less intrusive user interface: instead of the user having to look into a scanner, the iris can be captured by a camera several meters away, possibly even covertly. Everyone has a unique iris pattern, which is measured through this process. Capture typically relies on a dedicated light source (near-infrared illumination in most commercial systems). The iris pattern remains very stable throughout life, and the method is comfortable to use. It supports both verification and identification [23].
• Voice: This is the science of using the human voice as a unique characteristic to confirm a person's identity. It measures pitch, cadence, and tone of voice. It can be used locally or remotely. This method is most often employed for verification [24].
• DNA: DNA is a chemical found in the nucleus of human cells and in those of many other living things. The pattern persists throughout life and even after death. It is extremely accurate and is used for both verification and identification. The only problem is that identical twins may have the same DNA [25].
• Keystroke timing: Keystroke timing captures a person's behavior when working at a keyboard. It can measure how long each key is held down, the time between keystrokes, the number and frequency of errors, key presses, and so on. It is inexpensive because it needs no additional equipment. However, it is less accurate because the feature may change over time. The exact way people type is idiosyncratic, and experiments have been carried out on identifying people by the way they type [26].
• Signatures: This is the classic human form of authentication, and there are human experts who are adept at judging whether two signatures were produced by the same person. Machines have so far not been able to match that level of precision. However, when not only the signature is recorded but also the actual movement of signing, there is sufficient information for verification. A few companies implement this by having the user sign on an electronic tablet. In the past, signatures were used in the banking industry to verify the identity of the check writer. There are still many human professionals today who can decide whether the signature on a check or document matches the signature on file. Biometric methods use signature tablets and special pens to identify a person. These devices compare not only the final product, the signature, but also behavioral characteristics, such as the time taken to write the signature [27].
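To make the keystroke timing entry in the list above more concrete, the following sketch derives two commonly used timing features from raw key events: dwell time (how long a key is held down) and flight time (the gap between releasing one key and pressing the next). The event format, the function names, and the sample timings are assumptions made for illustration only.

```python
def keystroke_features(events):
    """Compute dwell and flight times.

    events: list of (key, press_time, release_time) tuples, in typing order,
    with times in seconds (hypothetical input format).
    """
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return dwell, flight

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Illustrative timings for someone typing "pass".
sample = [("p", 0.00, 0.09), ("a", 0.15, 0.23), ("s", 0.31, 0.38), ("s", 0.47, 0.55)]
dwell, flight = keystroke_features(sample)
print("mean dwell :", round(mean(dwell), 3))
print("mean flight:", round(mean(flight), 3))
```

A profile built from such averages could then be compared with a fresh typing sample, although, as noted above, typing behavior drifts over time, so any real profile would need periodic updating.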

It may seem tempting to build a portable fingerprint reader that attaches to a laptop or PC, lets you insert your finger, and converts the print into bits. Such equipment is commercially available. The problem is that the resulting bits are then sent to a remote computer as the authentication, so anyone who captures them in transit could replay them. The only way a remote biometric device can be secure is for the device to be tamper-resistant and to communicate with the remote computer via a cryptographically protected exchange.
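One possible form of such a cryptographically protected exchange is a challenge-response protocol, sketched below under several assumptions that are not in the text: the reader performs the biometric match locally, the reader and the server share a secret key (`DEVICE_KEY`, hypothetical), and the result is authenticated with HMAC-SHA256 over a fresh server nonce. Because the nonce changes on every attempt, a captured response cannot be replayed.

```python
import hashlib
import hmac
import secrets

# Hypothetical key provisioned in the tamper-resistant reader and known to the server.
DEVICE_KEY = secrets.token_bytes(32)

def server_challenge():
    """Server side: a fresh random nonce for each authentication attempt."""
    return secrets.token_bytes(16)

def device_response(key, nonce, user_id, matched):
    """Device side: bind the local match result to the server's nonce."""
    message = nonce + user_id.encode() + (b"\x01" if matched else b"\x00")
    return hmac.new(key, message, hashlib.sha256).digest()

def server_accepts(key, nonce, user_id, tag):
    """Server side: accept only a valid tag over (nonce, user, matched=True)."""
    expected = device_response(key, nonce, user_id, True)
    return hmac.compare_digest(expected, tag)

nonce = server_challenge()
tag = device_response(DEVICE_KEY, nonce, "alice", matched=True)
print(server_accepts(DEVICE_KEY, nonce, "alice", tag))               # fresh response
print(server_accepts(DEVICE_KEY, server_challenge(), "alice", tag))  # replayed tag, new nonce
```

This is only a sketch of the idea: it does not address key provisioning, tamper resistance of the reader, or protection of the raw biometric data itself.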

• Accuracy: The accuracy of biometric techniques is measured using two parameters: false rejection rate (FRR) and false acceptance rate (FAR) [28].
• False Rejection Rate (FRR): This parameter measures how often a person who should be recognized is rejected by the system. FRR is the number of false rejections divided by the total number of attempts [29].
• False Acceptance Rate (FAR): This parameter measures how often a person who should not be recognized is accepted by the system. FAR is the number of false acceptances divided by the total number of attempts [30].
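The two accuracy parameters can be computed from a set of labelled matching scores, as in the following sketch. It assumes similarity scores (higher means a closer match) and an accept-if-at-or-above-threshold rule, and it follows the common convention of dividing false rejections by the number of genuine attempts and false acceptances by the number of impostor attempts; the score values are invented for illustration.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) at a given threshold.

    Scores are similarities: an attempt is accepted when score >= threshold.
    """
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(impostor_scores)
    return far, frr

genuine = [0.91, 0.85, 0.78, 0.66, 0.95]    # attempts by enrolled users
impostor = [0.12, 0.40, 0.71, 0.33, 0.25]   # attempts by other people
for t in (0.5, 0.7, 0.8):
    far, frr = far_frr(genuine, impostor, t)
    print(f"threshold={t}: FAR={far:.2f}, FRR={frr:.2f}")
```

Raising the threshold lowers FAR at the cost of a higher FRR, which is why the two rates are normally reported together.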

6. Trustworthiness and Challenges of Biometrics

Biometric recognition systems coexist with a variety of other authentication and identification technologies, each with its own capabilities and purposes. Authentication systems are typically based on one of three factors: something the person knows, such as a password; something the person possesses, such as a physical key or secure token; and something the person is or does. Biometric technologies use the last of these. Unlike password- or token-based systems, biometric systems can operate without active input, user cooperation, or even the user's knowledge that recognition is taking place. As a result, biometric systems are not a universal substitute for other authentication technologies, although combining biometric approaches with other techniques can improve security in situations where user cooperation can be assumed.

One important distinction between biometric and other authentication methods, such as tokens or passwords, is that the latter place trust in cooperative users, who produce what they possess or demonstrate what they know, and so rely on the user's safekeeping of a smartcard or password. These other types of authentication do not protect against the token or secret being shared or transferred, whereas biometric traits are tied to an individual: specifically, something that a person is or does. Unintentional disclosure of biometric data, on the other hand, may have more serious consequences, or be harder to remedy, than the loss of a token or the exposure of a password. A further significant distinction is that, because biometric systems are probabilistic, they are particularly vulnerable to deliberate attempts to undermine confidence in their reliability, and uncertainties can easily be amplified into a perception that biometrics is unreliable [31].

Conducting a threat analysis and developing threat models for the system, including an analysis of the feasibility of threats both against the resource being protected and against the system doing the protecting, is an important part of understanding the problem. Decisions about whether and how to deploy biometric methods should consider their appropriateness and proportionality for the problem to be solved, as well as the merits and risks of biometrics compared with other solutions, and they need to be considered by the wider information security community as well as within the biometrics community.

Moreover, biometric systems themselves (not just the resources they protect) are at risk of attacks aimed at undermining their reliability and integrity. For password- or token-based systems, a breach can usually be remedied by replacing the password or token with a new one. However, it is usually not possible to replace a compromised biometric trait. The problem is compounded by the fact that the same biometric trait can be used by different systems, so a weakness in one system can compromise the trait for use in another. Furthermore, such traits are not secret: they are revealed in everyday life. For example, we leave fingerprints on many surfaces we touch, faces can be photographed, and voices can be recorded [32]. However, it is as difficult for an impostor to develop a set of fingerprints that match stolen ones as it is for a person to grow a new and different set. It is therefore important to make sure that the trait presented for recognition genuinely belongs to the person and has not been fabricated by an impostor. This often requires a human operator to observe the presentation of the trait, which greatly constrains remote or distributed use of biometrics. Automatic verification that a live person is presenting what could be a synthesized artifact may be sufficient in some applications, but it will not replace human supervision where high confidence is required. It is important to manage the integrity of the whole process rather than focusing only on evaluating the presented biometric trait. Systems that use biometric recognition are usually designed with alternative procedures for use when the sensor fails or the person lacks the biometric trait. Adversaries may try to force the system into these failure modes in order to evade or achieve recognition, which means that the fallback procedures should be designed as rigorously as the primary procedure [33].

Another possible way to improve recognition is to use multiple biometric modalities and other demographic data to narrow the search space. This approach may have other benefits, such as extending population coverage beyond that offered by a single biometric and reducing the risk of spoofing attacks. It may also have drawbacks, including increased system complexity and cost. There are also open issues concerning the design and operation of multibiometric systems and the question of how best to model them. Understanding any statistical dependence between modalities is important when using multibiometrics.
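One simple way to combine several modalities is score-level fusion, sketched below. The sketch assumes that each matcher reports a raw score within a known range, that min-max normalization is adequate, and that a fixed weighted sum is used; the matcher names, ranges, weights, and decision threshold are all hypothetical. A weighted sum of this kind ignores statistical dependence between the modalities, which, as the text notes, is an important modelling question.

```python
def min_max_normalize(score, lo, hi):
    """Map a raw matcher score onto [0, 1] using that matcher's known score range."""
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))

def fuse(scores, ranges, weights):
    """Weighted-sum fusion of normalized scores; weights are assumed to sum to 1."""
    return sum(
        weights[m] * min_max_normalize(scores[m], *ranges[m]) for m in scores
    )

ranges = {"fingerprint": (0, 100), "face": (0.0, 1.0)}   # hypothetical matcher ranges
weights = {"fingerprint": 0.6, "face": 0.4}              # hypothetical weights
scores = {"fingerprint": 82, "face": 0.55}               # one authentication attempt

fused = fuse(scores, ranges, weights)
print("fused score:", round(fused, 3))
print("accepted:", fused >= 0.7)                         # hypothetical decision threshold
```

In this illustration a mediocre face score is compensated by a strong fingerprint score; whether that trade-off is acceptable depends on the application and on how the weights were chosen.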


7. Challenges and Countermeasures

The most pertinent and basic question to consider when designing a secure biometric protocol is: how can privacy be ensured without compromising the accuracy of the biometric authentication system? The most challenging problems in designing effective and privacy-preserving biometric authentication systems are (1) resistance to fraudulent attacks; (2) the consistency of biometric templates; and (3) ensuring that private data is kept confidential. Next, we list methods that have been used to achieve confidentiality and highlight the main advantages and disadvantages of each [34].

7.1. Biometric Template Protection

Many existing privacy-preserving authentication methods focus on storing and using a transformed version of the original biometric templates, in order to avoid the risks of eavesdropping on sensitive data or of a compromised database. One way to address the privacy issues of biometric authentication is to use biometric template protection schemes such as cancellable biometrics and biohashing. Ang et al. proposed cancellable fingerprints, whereas Connell et al. recommended cancellable iris biometrics. Various biohashing schemes have also been introduced. Even though biohashing offers low error rates together with a fast authentication and verification phase, such schemes are still vulnerable to attacks [35].
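The idea behind biohashing can be sketched as follows: the feature vector is projected onto directions derived from a user-specific seed (standing in for the user's token), and only the sign of each projection is kept. This is a simplified illustration rather than any published scheme in full: it omits the orthonormalization of the random directions, uses Python's non-cryptographic `random` module, and the feature values and seeds are invented.

```python
import random

def biohash(features, user_seed, n_bits=16):
    """Project features onto user-specific random directions and keep the signs."""
    rng = random.Random(user_seed)   # the seed plays the role of the user's token
    bits = []
    for _ in range(n_bits):
        direction = [rng.gauss(0.0, 1.0) for _ in features]
        projection = sum(f * d for f, d in zip(features, direction))
        bits.append(1 if projection >= 0.0 else 0)
    return bits

def hamming(a, b):
    """Number of positions in which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

enrolled = [0.8, -1.2, 0.5, 2.0, -0.3, 1.1]
fresh = [0.82, -1.18, 0.47, 2.05, -0.28, 1.07]    # same user, slightly noisy sample

code_old = biohash(enrolled, user_seed=1234)
print("same token, noisy sample:", hamming(code_old, biohash(fresh, user_seed=1234)))

# Cancellability: after a compromise, a new seed yields a new code from the same trait.
code_new = biohash(enrolled, user_seed=9999)
print("old vs re-issued code   :", hamming(code_old, code_new))
```

The stored code can thus be revoked and re-issued without changing the underlying biometric, which is the property that "cancellable" refers to.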

7.2. Error Correction Methods

Error-correcting codes are also an appealing way of dealing with the noisy nature of biometric features. Error correction can automatically correct minor errors in the template itself, solving the noisy-data problem. Systems can thus recover an exact value from a noisy biometric template and can therefore use cryptographic primitives effectively without affecting the biometric matching process. This is, for instance, the idea underlying Juels and Wattenberg's fuzzy commitment scheme. The biometric template serves as the witness used to commit to a secret codeword c. As long as the witness presented later is sufficiently similar to the original one, the same codeword c will be recovered, and the commitment scheme then uses that codeword. The recovered codeword is frequently used as an encryption/decryption key, as well as for user authentication. Cryptographic components (hashing and/or encryption) can then be applied despite the noisy nature of biometrics. In theory, these systems are relatively secure against biometric reference and sample recovery attacks: the attacker needs to know the user's biometric data in order to obtain the biometric template or key [36]. This theoretical confidentiality is not attained in practice, however, because biometric templates are not uniformly distributed and practical error-correcting codes do not have high correction capacity. Fuzzy commitment methods have been found to leak private information.
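A toy version of a fuzzy commitment can be written with a repetition code, as in the sketch below. It is only meant to show the mechanics (offset = template XOR codeword, hash of the codeword as the stored commitment) and makes several simplifying assumptions: the template is a short bit list, a 5-fold repetition code stands in for a real error-correcting code, and SHA-256 is used as the hash. None of these choices comes from the chapter.

```python
import hashlib
import secrets

REP = 5  # each key bit is repeated 5 times; up to 2 flipped bits per group are corrected

def encode(key_bits):
    """Repetition-code encoder."""
    return [b for b in key_bits for _ in range(REP)]

def decode(code_bits):
    """Majority-vote decoder for the repetition code."""
    groups = [code_bits[i:i + REP] for i in range(0, len(code_bits), REP)]
    return [1 if sum(g) > REP // 2 else 0 for g in groups]

def digest(bits):
    return hashlib.sha256(bytes(bits)).hexdigest()

def commit(template_bits):
    """Return (hash of codeword, helper offset); the codeword itself is not stored."""
    key_bits = [secrets.randbelow(2) for _ in range(len(template_bits) // REP)]
    codeword = encode(key_bits)
    offset = [t ^ c for t, c in zip(template_bits, codeword)]
    return digest(codeword), offset

def open_commitment(sample_bits, commitment):
    """Recover the codeword from a noisy sample and check it against the stored hash."""
    stored_hash, offset = commitment
    noisy_codeword = [s ^ o for s, o in zip(sample_bits, offset)]
    recovered = encode(decode(noisy_codeword))
    return digest(recovered) == stored_hash

template = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
commitment = commit(template)

noisy = template[:]
noisy[3] ^= 1            # two bit errors, in different groups
noisy[12] ^= 1
print(open_commitment(noisy, commitment))                       # close enough to open
print(open_commitment([1 - b for b in template], commitment))   # very different sample
```

The stored offset is exactly the kind of helper data through which information can leak when templates are not uniformly distributed, which is the practical weakness noted above.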

7.3. Other Non-Cryptographic Approaches

Given that OT is a well-established mechanism against user tracking and distinguishing attacks, many non-cryptographic privacy tools for biometric authentication systems (BAS) concentrate on combating template and sample recovery attacks. One proposal, for instance, defends against center search attacks by comparing fresh and stored templates using weighted distances, keeping the weights confidential and distinct for every user. Biometric verification approaches that use a standard Hamming distance or a weighted Euclidean distance [37] follow this procedure. Even though a center search attack is still feasible in certain situations [38, 39, 40, 41], it will only recover a small subset of the elements of the stored biometric template [42]. Another alternative is to add variability to the comparison process. In particular, if the matching method selects its distance at random from a specified set of distances in each verification operation, the attacker is unable to learn anything about the template in the database without first learning which distance was used. Likewise, a center search attack can be hindered by changing the threshold value used in the matching process for each authentication attempt. Even so, such methods may decrease the accuracy of biometric verification and/or increase the false acceptance and/or false rejection rates. Finally, for privacy protection, one may consider integrating Differential Privacy (DP) with biometric identification. DP allows users to query a database and receive noisy answers, so that details about individual records in the database are not disclosed [43]. While integrating DP into biometric authentication could provide resistance to statistical inference attacks (such as the center search attack), it may affect the accuracy of the verification process, so the trade-off between the utility obtained (accuracy) and data protection requires a much more comprehensive study.
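Two of the countermeasures described in this section, secret per-user weights and a threshold that changes on every attempt, can be sketched together as follows. The weights, the base threshold, and the jitter range are hypothetical values chosen for illustration, and Python's `random` module is used only for brevity; a real system would draw the threshold from a cryptographically secure source.

```python
import math
import random

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; the weights are a per-user secret."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def verify(stored, fresh, weights, base_threshold=0.5, jitter=0.1, rng=random):
    """Accept if the weighted distance is under a threshold redrawn on every attempt."""
    threshold = base_threshold + rng.uniform(-jitter, jitter)
    return weighted_distance(stored, fresh, weights) <= threshold

stored = [0.20, 0.70, 0.40, 0.90]
weights = [1.5, 0.5, 2.0, 1.0]     # hypothetical secret weights for this user

print(verify(stored, [0.22, 0.68, 0.41, 0.93], weights))   # genuine-looking sample
print(verify(stored, [0.90, 0.10, 0.80, 0.20], weights))   # distant sample
```

Samples whose distance falls inside the jitter band are accepted on some attempts and rejected on others, which illustrates the accuracy cost mentioned above.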

8. Biometric Technology in Different Spheres of Life

8.1. Commercial Applications

In a fast-changing world, industries can no longer rely on older methods of operation. Government institutions, educational institutions, and other sectors now rely on biometric technology more than ever for its reliability and security [44, 45]. Companies can choose among a variety of biometric options according to their business needs: the market currently offers fingerprint readers, palm readers, facial recognition, iris recognition, voice recognition, and signature recognition. In business environments, these are used for access to facilities, access to data systems, point-of-sale transactions, and tracking of worker activity. Competition between businesses is cut-throat, and no company can afford to fall behind its competitors; biometric technology helps them sustain both their productivity and their reputation.

1. Attendance: Fast attendance recording is one of the reasons so many companies are adopting biometric technology. It makes time and staff management easier, and current biometric tracking devices let companies download attendance data in seconds using data push technology [46].
2. Access control/security: Besides attendance, biometric technology has become a standard for security solutions. Biometric security systems are in use worldwide today, notably in financial institutions, Fortune 500 companies, multinational corporations (MNCs), government agencies, and so on [47].


Mandeep Kaur, Ritesh Sinha, Syed Ali Rizvi et al.

3. Time loss prevention: One of the main reasons companies use biometric technology is to save time and speed up their business. Unaware of these benefits, some businesses still rely on outdated manual record-keeping, with paper-based payroll and the uncertainty it brings [48].
4. Remote data accessibility: Biometric technology is a boon for HR departments because it reduces the manual work required to track staff attendance. Combined with data push technology, it saves further time and effort. Dishonest employees can no longer manipulate their attendance records and keep the company in the dark.

Some manufacturers of biometric solutions offer 24x7 customer support and a nationwide team of engineers. One such company, founded in 1998, analyzes the attendance and security needs and problems of various industries and develops devices to meet them, with the aim of producing instant attendance and access control solutions for various sectors. Its products are described as economical, reducing a company's annual expenses while also improving the efficiency of travel and security management systems [49].

8.2. Law Enforcement and Public Security (Criminal/Suspect Identification)

Law enforcement agencies around the world have long embraced biometric technology, incorporating it into investigations and forensic analysis. Fingerprints were first used to solve a crime in 1892 in Argentina. Today, the police face the challenge of efficiency: at the front line, first responders need to rely on speed and accuracy [50].

AFIS: Automated Fingerprint Identification Systems (AFIS) are used to store and process fingerprints. Digital fingerprints can be compared with those recorded in a database. AFIS supports two specific biometric operations: identification of fingerprints and verification of fingerprints [51].

Ten/latent prints: A ten-print is the complete set of fingerprints collected on one sheet. The person's identity is known, making it a 'known print.' Today, fingerprints are captured with a scanner instead of the traditional ink pad.


Latent prints are recovered at the crime scene using chemical and lighting methods [52].

Palm prints: Palm prints can be collected using the same methods as fingerprints and may be used for identification by investigators or forensic examiners [53].

The human touch: Relying on AFIS algorithms alone is not enough to conclude an investigation. The human factor is still important: image clarity and minutiae features are reviewed by an examiner to determine what to focus on [54].

DNA: DNA is a powerful investigative tool for law enforcement. Each person's DNA is unique, with the exception of identical twins. By analyzing DNA sequences, or loci, forensic scientists can create a profile that helps identify a suspect [55].

Chromosomes: Nearly every cell in the body has a nucleus that holds the chromosomes. Chromosomes contain regions of repeated DNA sequences, and the number of repeats varies from one individual to another [56].

Profiling: The first step in DNA analysis is to obtain samples. Only a small number of cells are required to create a unique profile. DNA is extracted from these cells and copied. Forensic scientists then produce a DNA profile that can be read by legal experts [57].

Databases: The growth of databases has supported the increased use of DNA profiles. For example, INTERPOL's DNA database contains more than 247,000 profiles [58].

Pros/cons: DNA profiling is an efficient and accurate method. Though extremely accurate, it is never foolproof. An incomplete profile, for example, may match many people and cannot serve as conclusive proof [59].

Voice/facial recognition: Both voice and face recognition tools serve a variety of purposes, such as investigating crimes and identifying victims [60].

Technology: Face recognition technology is complex software that compares an image of a suspect's face with others in a database. Using algorithms, the system extracts specific, distinctive details about a person's face. This information is converted into a statistical representation and compared with the available data [61].

Speech recognition: The first speech recognition systems were developed in the 1950s and could only understand digits. Modern solutions are highly developed and precise, recognizing speech even in environments with background noise. Reporting and writing are important aspects of law enforcement, and speech recognition technology (SRT) makes that process easier and faster [62].
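The two matching operations mentioned for AFIS, verification and identification, also apply to face templates once a face has been reduced to numbers. The sketch below assumes each template is a short feature vector and uses cosine similarity with an arbitrary threshold of 0.95; real systems use far longer vectors and their own scoring, so the vectors, names, and threshold here are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def verify(probe, enrolled, threshold=0.95):
    """1:1 check: does the probe match the template of one claimed identity?"""
    return cosine_similarity(probe, enrolled) >= threshold

def identify(probe, gallery, threshold=0.95):
    """1:N search: best-scoring identity in the gallery, or None if no score
    reaches the threshold."""
    best_id, best_score = None, threshold
    for identity, template in gallery.items():
        score = cosine_similarity(probe, template)
        if score >= best_score:
            best_id, best_score = identity, score
    return best_id

# Hypothetical gallery of enrolled templates.
gallery = {
    "subject_a": [0.9, 0.1, 0.3],
    "subject_b": [0.2, 0.8, 0.5],
}

probe = [0.88, 0.12, 0.31]                   # close to subject_a's template
print(verify(probe, gallery["subject_a"]))   # True
print(identify(probe, gallery))              # subject_a
print(identify([0.1, 0.1, 0.9], gallery))    # None: nobody scores high enough
```

Verification answers "is this the person they claim to be?", while identification answers "who, if anyone, in the database is this?". The second is harder because every additional gallery entry is another chance for a false match, which is one reason a human examiner still reviews candidate matches.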


Policies: The use of facial recognition by the police can be a sensitive topic for lawmakers, political parties, and the public. Take surveillance, for example. The UK Information Commissioner's Office launched an investigation into the use of facial recognition after it became known that members of the public were having their faces scanned without permission [63].

Ethics: For face recognition algorithms to work properly, the system must be trained and tested on large data sets of images captured under different conditions. This raises ethical questions about how the data is collected, stored, shared, and used [64].

8.3. Military (Enemy/Ally Identification)

Biometrics has numerous potential uses in the military domain. It can be used in combat operations, security operations, peace support operations, and peacetime military engagement. Several potential applications are relevant to situations of armed conflict in which international humanitarian law (IHL) applies. It is generally accepted that biometrics provides a high degree of confidence when verifying a person's identity or identifying an enemy threat [65]. Possible applications in armed conflict include, but are not limited to:

1. Identification of persons wishing to gain access to military facilities: Biometrics can be used to ensure that people entering military facilities are authorized to do so. United States forces in Iraq, for example, have used biometrics in this way.
2. Identification of persons applying for positions requiring security clearance: Biometrics can assist in identifying people who, for instance, will work closely with friendly forces. The U.S. armed forces have used biometrics for this purpose in Afghanistan, and also to vet local drivers in Syria who support U.S. forces [66].


8.4. Border, Travel, and Migration Control (Traveler/Migrant/Passenger Identification)

Since 1996, the US government has tried to implement a comprehensive system to track the arrival and departure of all immigrants. Today, U.S. Customs and Border Protection (CBP) collects certain biometric information from most foreign nationals entering the U.S. and associates it with data stored in a government database. However, after decades of legislation and regulation, the process is still full of gaps. The database is not fully compiled, and the U.S. still does not have an effective biometric data collection system for immigrants. Although technological advances have made such a program possible in recent years, the current data collection infrastructure is in a state of flux and implementation efforts are still ongoing [67]. Both biographic and biometric data on the arrival and departure of all foreign nationals are collected in a computer system called the Arrival and Departure Information System (ADIS). ADIS compiles data from a number of cross-border and immigration systems, including those used by CBP, ICE, and USCIS. ADIS uses this data to generate a daily list of people suspected of staying beyond their authorized period of admission. The records of the people on this list are then checked against other government information systems to see whether they have left or have changed their immigration status. Border management and immigration management thus jointly use some of the same biometric techniques.
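The daily list that ADIS is described as producing can be pictured as a simple filter over arrival and departure records. The sketch below is purely illustrative: the record layout, field names, and matching rule are assumptions for the example and do not describe the real system's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class TravelRecord:
    traveler_id: str
    arrival: date
    authorized_until: date
    departure: Optional[date] = None  # None means no departure on record

def suspected_overstays(records, today):
    """IDs of travelers with no timely departure whose authorized stay has lapsed."""
    flagged = []
    for r in records:
        left_in_time = r.departure is not None and r.departure <= r.authorized_until
        still_authorized = r.departure is None and today <= r.authorized_until
        if not (left_in_time or still_authorized):
            flagged.append(r.traveler_id)
    return flagged

# Hypothetical records.
records = [
    TravelRecord("T1", date(2023, 1, 1), date(2023, 3, 31), date(2023, 3, 15)),
    TravelRecord("T2", date(2023, 1, 10), date(2023, 4, 10)),                    # no exit
    TravelRecord("T3", date(2023, 4, 1), date(2023, 6, 30)),                     # still valid
    TravelRecord("T4", date(2023, 1, 5), date(2023, 2, 28), date(2023, 3, 5)),   # left late
]

print(suspected_overstays(records, today=date(2023, 5, 1)))  # ['T2', 'T4']
```

A missing departure record is only a suspicion, not proof, which is why the text notes that flagged records are cross-checked against other systems for an unrecorded exit or a change of immigration status.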

8.5. Civil Identification (Citizen/Resident/Voter Identification)

Since the beginning of human development and civilization, public identification has been part of society in one form or another. As societies advanced, the need for a method of public identification grew. Today, public identification is necessary for government services, voter identification, criminal identification, and so on. Furthermore, governments across the globe spend large sums to provide institutionalized services and benefits, so they want to be sure that only the right people have access to them. This requires a dependable public identification process. Biometric civil ID removes the need for any external artifact, such as a card or badge, to prove personal identity. It eliminates many problems such as lost or stolen identity cards, being barred


from gaining entry due to lost or forgotten IDs, and so on. It has long been established that human physical and behavioral characteristics differ enough to distinguish one person from another [68]. Not surprisingly, biometric civil identification is a growing trend, and it is notably more visible in developing countries than in developed economies. These countries have started using, or intend to use, biometrics as a civil identity to deliver government benefits and services such as welfare, assistance, food, and subsidies. They have adopted biometrics for public identification and for issuing card-based identity documents. Counterfeit and duplicate cards are often used to claim services and cash benefits fraudulently, which can feed a corrupt system. Biometrics makes it possible to identify individuals or verify their identity while leaving little room for deception. This helps ensure that the services and benefits provided by the national government reach the right people without fraud [69].

8.6. Healthcare and Subsidies (Patient/Beneficiary/Healthcare Professional Identification)

Healthcare worldwide has shifted from paper medical records to electronic health records (EHRs). The benefits of using EHRs for faster access to patient data include:

1. Accurate and up-to-date information at the point of care
2. Highly coordinated and efficient care
3. Secure sharing of patient data between physicians
4. Fewer medical errors
5. Safer decision-making

But these benefits depend on hospitals, doctors, and other healthcare providers accurately verifying a patient's identity during every medical interaction. Some healthcare facilities go beyond demographic identifiers (name, date of birth, Social Security number, etc.) and turn to a more specific type of ID: biometrics. Identifying patients by distinct biological traits (face, fingerprints, iris, voice) helps ensure that care is provided to the right people, leading to a safer and more effective healthcare environment.


When a patient enters a medical facility, a biometric search using the patient's biometric data can reliably find the matching identity in the master patient index. Such searches identify the patient's record with a high level of certainty [70]. Once a person is registered, medical staff can access existing health data and speed up quality care, which is especially helpful when patients are unable to state their identity themselves. The risk of creating a duplicate health record for a patient who already exists in the system is reduced; duplicates are harmful because health data needed for appropriate treatment may be overlooked. Biometric patient identification also exposes cases in which a false identity is used in an attempt to obtain free medical care. Biometric searches can detect these attempts by recognizing when someone presents a false identity.
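The workflow above (search the master patient index first, enrol only when nothing matches, and flag a mismatch between the claimed and stored identity) can be sketched as follows. Templates are modeled as fixed-length bit strings compared by Hamming distance; that representation, the tolerance value, and the class and method names are assumptions made for the illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

class MasterPatientIndex:
    def __init__(self, max_distance=2):
        self.max_distance = max_distance   # tolerance for sensor noise
        self._records = {}                 # patient_id -> (template, name)

    def search(self, template):
        """ID of the closest enrolled template within tolerance, else None."""
        best_id, best_dist = None, self.max_distance + 1
        for pid, (stored, _) in self._records.items():
            d = hamming(template, stored)
            if d < best_dist:
                best_id, best_dist = pid, d
        return best_id

    def register(self, template, claimed_name):
        """Return (patient_id, identity_mismatch); enrol only if no match exists."""
        existing = self.search(template)
        if existing is not None:
            stored_name = self._records[existing][1]
            return existing, stored_name != claimed_name
        pid = f"P{len(self._records) + 1:04d}"
        self._records[pid] = (template, claimed_name)
        return pid, False

mpi = MasterPatientIndex()
print(mpi.register("1011001110", "Alice"))   # ('P0001', False): new enrolment
print(mpi.register("1011001010", "Alice"))   # ('P0001', False): same patient, noisy scan
print(mpi.register("1011001110", "Bob"))     # ('P0001', True): claimed name does not match
print(mpi.register("0100110001", "Carol"))   # ('P0002', False): new enrolment
```

The second call shows how a duplicate record is avoided, and the third shows how a false identity claim surfaces: the biometric matches an existing record even though the name presented does not.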

8.7. Physical and Logical Access (Owner/User/Employee/Contractor/Partner Identification)

Access control and verification systems and services fall into two categories: physical and logical. Many analysts combine the two under the general theme of access control. Each has seen steady growth in the adoption of biometrics. Physical access control systems monitor and control access and movement throughout buildings and campuses. They can be used to track people and any portable assets. Data generated by these access systems can be used for office and campus planning, asset management, security audits, and forensic investigations [71-74]. Such systems can also make social engineering break-ins harder to carry out. It would be difficult to name an industry that does not use physical access control. In some industries, controlling entry to and exit from facilities is a practical requirement; these include healthcare, military and defense, finance, travel, aerospace, and medicine. Secondary industries include manufacturing, transportation, supply chain, and commercial buildings [75]. Logical access control systems serve the same purposes but control access to digital information and monitor network traffic and access. They often send warnings about unusual activity. Smart cards, RFID devices, and biometric systems are used to authenticate people against their stored credentials [76-78]. The range of biometric systems that can be deployed is limited only by budget and performance, but traditionally includes fingerprint scanning.
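As a small illustration of how access data supports security checks and forensic investigations, the sketch below queries a log of door events. The event format, identifiers, and function names are hypothetical; an actual access control product would expose its own log schema.

```python
from datetime import datetime

# Hypothetical access events, as a door controller might log them.
events = [
    {"person": "emp-017", "zone": "server-room", "time": datetime(2023, 5, 2, 8, 55), "granted": True},
    {"person": "emp-042", "zone": "server-room", "time": datetime(2023, 5, 2, 9, 10), "granted": True},
    {"person": "vis-003", "zone": "server-room", "time": datetime(2023, 5, 2, 9, 12), "granted": False},
    {"person": "emp-017", "zone": "lobby", "time": datetime(2023, 5, 2, 12, 30), "granted": True},
]

def entries_in_window(events, zone, start, end):
    """Sorted IDs granted entry to `zone` between `start` and `end` (inclusive)."""
    return sorted({e["person"] for e in events
                   if e["zone"] == zone and e["granted"] and start <= e["time"] <= end})

def denied_attempts(events, zone):
    """IDs whose attempts to enter `zone` were refused, in log order."""
    return [e["person"] for e in events if e["zone"] == zone and not e["granted"]]

window = (datetime(2023, 5, 2, 9, 0), datetime(2023, 5, 2, 10, 0))
print(entries_in_window(events, "server-room", *window))  # ['emp-042']
print(denied_attempts(events, "server-room"))             # ['vis-003']
```

The first query is the kind an investigator would run after an incident; the second is a simple basis for the warnings about unusual activity mentioned above.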


8.8. Technological Utilization

Mobile phones have transformed many aspects of daily life, including how we communicate, consume, and, perhaps most importantly, how we set passwords. Increasingly, we no longer rely on numbers, letters, or symbols as passwords; instead, we use our physical characteristics. Biometric technology on phones, which began with fingerprint recognition for unlocking the device, has evolved over time to include facial and iris identification [79].

8.8.1. Mobile Phones/Tablets

Over the past decade, smartphones have drastically changed the way people communicate with each other, work, bank, and even shop. Many people feel something is missing when they do not have their smartphone with them. Since the introduction of these handheld devices, the biggest change has been the arrival of biometrics. Earlier, device security had been limited to PINs, passwords, or patterns, but mobile biometrics is changing that. Mobile biometrics has matured and succeeded in a very short time, and manufacturers offer all kinds of mobile biometric solutions using different types of sensors and biometric imaging techniques [80]. Of all the biometric recognition methods available on today's mobile devices, fingerprint recognition is among the most reliable and is the most widely used. This is true not only for mobile devices but for biometric applications in general. It offers a good balance of security, convenience, and user acceptance compared with many other popular biometric recognition methods. It is not only popular but has also undergone significant improvement: it has matured over a long period and has been in use for over 100 years. The first fingerprint sensors on smartphones were limited to acting as a biometric lock. Today, fingerprint sensors on mobile devices can do much more than lock the phone, such as triggering the camera to take a photo. Built-in mobile fingerprint sensors provide limited functionality, as they scan and analyze only partial fingerprints. Nevertheless, mobile devices running popular operating systems such as Windows, iOS, and Android come with the built-in ability to process biometric data and support external devices. This allows us to connect an external fingerprint sensor


that can scan and process full fingerprints, turning the smartphone into a full biometric device [81].

8.8.2. Laptops/PCs

A fingerprint reader is the most common form of biometric security on computers. It is a small window near the bottom of the keyboard, or attached to a mouse, that scans fingerprints. Users enroll a fingerprint scan in secure storage and set up security as they would with a typed password [82]. To unlock the computer, the user simply places a finger on the scanner, which reads the unique features of the print and grants access.

8.8.3. Automobiles

Many cars can remember seat and mirror settings by linking them to a particular key. Biometrics allows the car to remember settings based on the user's face and automatically adjust them for whoever is in the driver's seat, so you no longer need to spend two minutes adjusting the mirrors at the start of a trip. The most common way to personalize a car today is to link the driver's settings to a specific key. This is now beginning to shift toward digital car keys, which can connect to a wider set of personal options. With biometrics, personalization can go well beyond the standard settings supported today [83].
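The personalization idea amounts to keying the stored settings on the recognized driver rather than on a physical key. A minimal sketch follows, assuming a separate face recognition step has already produced a driver ID (or None when nobody is recognized); the setting names and values are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CabinSettings:
    seat_position: int       # arbitrary units
    mirror_angle: int        # degrees
    climate_celsius: float

# Fallback used when the driver is unknown (hypothetical values).
DEFAULTS = CabinSettings(seat_position=5, mirror_angle=0, climate_celsius=21.0)

def settings_for(driver_id, profiles):
    """Stored settings for a recognized driver, otherwise the defaults."""
    if driver_id is None:    # the recognizer could not identify the driver
        return DEFAULTS
    return profiles.get(driver_id, DEFAULTS)

profiles = {
    "driver_a": CabinSettings(seat_position=3, mirror_angle=-4, climate_celsius=19.5),
    "driver_b": CabinSettings(seat_position=8, mirror_angle=6, climate_celsius=22.0),
}

print(settings_for("driver_a", profiles).seat_position)  # 3
print(settings_for(None, profiles) == DEFAULTS)          # True
```

Because the lookup key is the person rather than the key fob, two drivers sharing one key still get their own settings, and the profile could be extended with further personal options without changing the lookup.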

Conclusion and Future Outlook

Four technological developments will lead to the evolution of second-generation biometric systems: (i) the emergence of potentially new biometric traits, (ii) the added value provided by soft biometrics, (iii) the effective use of multiple biometric traits for large-scale human characterization, and (iv) technologies that ensure a high degree of security, privacy, and versatility in the use of biometric systems. The stakes for second-generation biometric technologies are significant, as are the hurdles. Rather than being the outcome of a single breakthrough, the development of second-generation biometric technology will be a cumulative and continuous effort [84, 85]. The low cost of biometric sensors and adequate matching performance have fueled the appeal of fingerprinting in commercial applications. Continued advances in matching performance and a progressive decrease in the cost of biometric sensors could be enough to


change the choice of biometric modalities in the future. Researchers will be able to exploit extended biometric features and build high-performance matchers that use efficient noise elimination techniques. First and foremost, the protection of personal information should be treated as an essential part of biometric technologies. Secondly, policy considerations (the ethical and legal framework) relating to the deployment of biometric technologies should be stated explicitly in order to avoid conflicts of interest among stakeholders [86-94]. The introduction of widely accepted biometric standards, practices, and policies should not only address identity theft but also ensure that the benefits of biometric technologies reach all members of society, particularly those who have been harmed by identity theft. Based on recent biometric deployments, we believe that the security and benefits they offer far outweigh social concerns about personal privacy. Hong Kong identity cards should serve as a useful example for assessing the benefits and concerns of future biometric deployments. The sensing, storage, and computational capabilities of biometric systems are likely to improve in the coming years. While this will significantly improve throughput and accessibility, the basic challenges of (i) biometric representation, (ii) robust matching, and (iii) adaptive multimodal systems remain. Progress on these challenges, together with the ability to detect behavioral patterns automatically, may prove essential for surveillance and many other large-scale identification applications.

References

[1] Choi, G. H., Moon, H. M., & Pan, S. B. (2017). Biometrics system technology trends based on biosignal. Journal of Digital Convergence, 15(1), 381-391.
[2] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[3] Di Nardo, J. V. (2008). Biometric technologies: functionality, emerging trends, and vulnerabilities. Journal of Applied Security Research, 4(1-2), 194-216.
[4] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[5] Fierrez, J., Morales, A., Vera-Rodriguez, R., & Camacho, D. (2018). Multiple classifiers in biometrics. Part 2: Trends and challenges. Information Fusion, 44, 103-112.
[6] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.

[7] Mordini, E., & Tzovaras, D. (Eds.). (2012). Second generation biometrics: The ethical, legal and social context (Vol. 11). Springer Science & Business Media.
[8] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[9] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[10] Masood, H., & Farooq, H. (2017). A proposed framework for vision based gait biometric system against spoofing attacks. In 2017 International Conference on Communication, Computing and Digital Systems (C-CODE). IEEE.
[11] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[12] Anand, R., Khan, B., Nassa, V. K., Pandey, D., Dhabliya, D., Pandey, B. K., & Dadheech, P. (2022). Hybrid convolutional neural network (CNN) for Kennedy Space Center hyperspectral image. Aerospace Systems, 1-8.
[13] Tulyakov, S., & Govindaraju, V. (2006). Classifier combination types for biometric applications. In 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06) (p. 58). IEEE.
[14] Ghayoumi, M. (2015, June). A review of multimodal biometric systems: Fusion methods and their applications. In 2015 IEEE/ACIS 14th International Conference on Computer and Information Science (ICIS) (pp. 131-136). IEEE.
[15] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[16] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[17] Zhao, W., Chellappa, R., Phillips, P. J., & Rosenfeld, A. (2003). Face recognition: A literature survey. ACM Computing Surveys (CSUR), 35(4), 399-458.
[18] Pandey, B. K., Pandey, D., Wariya, S., & Agarwal, G. (2021). A Deep Neural Network-Based Approach for Extracting Textual Images from Deteriorate Images. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 8(28), e3.
[19] Pandey, B. K., Mane, D., Nassa, V. K. K., Pandey, D., Dutta, S., Ventayen, R. J. M., & Rastogi, R. (2021). Secure text extraction from complex degraded images by applying steganography and deep learning. In Multidisciplinary Approach to Modern Digital Steganography (pp. 146-163). IGI Global.
[20] Pramanik, S., Ghosh, R., Pandey, D., & Ghonge, M. M. (2021). Data Hiding in Color Image Using Steganography and Cryptography to Support Message Privacy. In Limitations and Future Applications of Quantum Cryptography (pp. 202-231). IGI Global.
[21] Palka, S., Wechsler, H., & Booz Allen Hamilton. (2007). Fingerprint readers: vulnerabilities to front- and back-end attacks. In 2007 First IEEE International Conference on Biometrics: Theory, Applications, and Systems (pp. 1-5). IEEE.

[22] Nagel, A. (2021). The Hands of Albert Einstein: Einstein's Involvement with Hand Readers and a Dutch Psychic. Correspondences, 9(1).
[23] Benalcazar, D. P., Bastias, D., Perez, C. A., & Bowyer, K. W. (2019). A 3D iris scanner from multiple 2D visible light images. IEEE Access, 7, 61461-61472.
[24] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless Personal Communications, 79(3), 1713-1731.
[25] Wickenheiser, R. A. (2002). Trace DNA: a review, discussion of theory, and application of the transfer of trace quantities of DNA through skin contact. Journal of Forensic Science, 47(3), 442-450.
[26] Karnan, M., Akila, M., & Krishnaraj, N. (2011). Biometric personal authentication using keystroke dynamics: A review. Applied Soft Computing, 11(2), 1565-1573.
[27] Hema, C. R., Paulraj, M. P., & Kaur, H. (2008). Brain signatures: A modality for biometric authentication. In 2008 International Conference on Electronic Design (pp. 1-4). IEEE.
[28] Hillis, D. M., Huelsenbeck, J. P., & Cunningham, C. W. (1994). Application and accuracy of molecular phylogenies. Science, 264(5159), 671-677.
[29] Bolle, R. M., Pankanti, S., & Ratha, N. K. (2000). Evaluation techniques for biometrics-based authentication systems (FRR). In Proceedings 15th International Conference on Pattern Recognition. ICPR-2000 (Vol. 2, pp. 831-837). IEEE.
[30] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[31] Bruntha, P. M., Dhanasekar, S., Hepsiba, D., Sagayam, K. M., Neebha, T. M., Pandey, D., & Pandey, B. K. (2022). Application of switching median filter with L2 norm-based auto-tuning function for removing random valued impulse noise. Aerospace Systems, 1-7.
[32] Rathgeb, C., Pöppelmann, K., & Gonzalez-Sosa, E. (2020). Biometric technologies for elearning: State-of-the-art, issues and challenges. In 2020 18th International Conference on Emerging eLearning Technologies and Applications (ICETA) (pp. 558-563). IEEE.
[33] Laas-Mikko, K., Kalvet, T., Derevski, R., & Tiits, M. (2022). Promises, Social, and Ethical Challenges with Biometrics in Remote Identity Onboarding. In Handbook of Digital Face Manipulation and Detection (pp. 437-462). Springer, Cham.
[34] Hadid, A., Evans, N., Marcel, S., & Fierrez, J. (2015). Biometrics systems under spoofing attack: an evaluation methodology and lessons learned. IEEE Signal Processing Magazine, 32(5), 20-30.
[35] Breebaart, J., Yang, B., Buhan-Dulman, I., & Busch, C. (2009). Biometric template protection. Datenschutz und Datensicherheit-DuD, 33(5), 299-304.
[36] Grant, T., & Lebo, M. J. (2016). Error correction methods with political time series. Political Analysis, 24(1), 3-30.

[37] Anand, R., Singh, B., & Sindhwani, N. (2009). Speech Perception & Analysis of Fluent Digits' Strings using Level-By-Level Time Alignment. International Journal of Information Technology and Knowledge Management, 2(1), 65-68.
[38] Sindhwani, N., Bhamrah, M. S., Garg, A., & Kumar, D. (2017, July). Performance analysis of particle swarm optimization and genetic algorithm in MIMO systems. In 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1-6). IEEE.
[39] Anand, R., Sindhwani, N., & Dahiya, A. (2022, March). Design of a High Directivity Slotted Fractal Antenna for C-band, X-band and Ku-band Applications. In 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom) (pp. 727-730). IEEE.
[40] Anand, R., Arora, S., & Sindhwani, N. (2022, January). A Miniaturized UWB Antenna for High Speed Applications. In 2022 International Conference on Computing, Communication and Power Technology (IC3P) (pp. 264-267). IEEE.
[41] Sindhwani, N., & Singh, M. (2014). Transmit antenna subset selection in MIMO OFDM system using adaptive mutation Genetic algorithm. arXiv preprint arXiv:1410.6795.
[42] Blanchard, E., & Selker, T. (2020). Origami voting: a non-cryptographic approach to transparent ballot verification. In International Conference on Financial Cryptography and Data Security (pp. 273-290). Springer, Cham.
[43] Kansara, K., & Kadhiwala, B. (2020, October). Non-cryptographic Approaches for Collaborative Social Network Data Publishing-A Survey. In 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (pp. 348-351). IEEE.
[44] Prabhakar, S., Pankanti, S., & Jain, A. K. (2003). Biometric recognition: Security and privacy concerns. IEEE Security & Privacy, 1(2), 33-42.
[45] Juneja, S., Juneja, A., & Anand, R. (2019, April). Reliability modeling for embedded system environment compared to available software reliability growth models. In 2019 International Conference on Automation, Computational and Technology Management (ICACTM) (pp. 379-382). IEEE.
[46] Said, M. A. M., Misran, M. H., Othman, M. A., Ismail, M. M., Sulaiman, H. A., Salleh, A., & Yusop, N. (2014). Biometric attendance. In 2014 International Symposium on Technology Management and Emerging Technologies (pp. 258-263). IEEE.
[47] Noma-Osaghae, E., Robert, O., Okereke, C., Okesola, O. J., & Okokpujie, K. (2017). Design and implementation of an iris biometric door access control system. In 2017 International Conference on Computational Science and Computational Intelligence (CSCI) (pp. 590-593). IEEE.
[48] Prabhakar, S., Pankanti, S., & Jain, A. K. (2003). Biometric recognition: Security and privacy concerns. IEEE Security & Privacy, 1(2), 33-42.
[49] Xu, B., Agbele, T., & Jiang, R. (2020). Biometric blockchain: a secure solution for intelligent vehicle data sharing. In Deep Biometrics (pp. 245-256). Springer, Cham.
[50] Jennings, W. G., Khey, D. N., Maskaly, J., & Donner, C. M. (2011). Evaluating the relationship between law enforcement and school security measures and violent crime in schools. Journal of Police Crisis Negotiations, 11(2), 109-124.

本书版权归Nova Science所有

144 [51] [52] [53]

[54]

[55] [56]

[57] [58]

[59]

[60]

[61]

[62]

[63] [64]

[65]

Mandeep Kaur, Ritesh Sinha, Syed Ali Rizvi et al. Nosek, J., & Lukeš, P. (2019). A concept of a new AFIS tool for automated data record. MAD-Magazine of Aviation Development, 7(4), 12-14. Fieldhouse, Sarah. “Consistency and reproducibility in fingermark deposition.” Forensic Science International 207, no. 1-3 (2011): 96-100. Kanematsu, N., Yoshida, Y., Kishi, N., Kawata, K., Kaku, M., Maeda, K., Taoka, M. & Tsutsui, H., 1986. Study on abnormalities in the appearance of finger and palm prints in children with cleft lip, alveolus, and palate. Journal of maxillofacial surgery, 14, pp. 74-82. Lee, W. W., Alkureishi, M. L., Wroblewski, K. E., Farnan, J. M. & Arora, V. M., 2017. Incorporating the human touch: piloting a curriculum for patient-centered electronic health record use. Medical Education Online, 22(1), p. 1396171. Hashiyada, Masaki. “Development of biometric DNA ink for authentication security.” The Tohoku journal of experimental medicine 204, no. 2 (2004): 109-117. Bahado-Singh, R.O., Tan, A., Deren, O., Hunter, D., Copel, J. & Mahoney, M. J., 1996. Risk of Down syndrome and any clinically significant chromosome defect in pregnancies with abnormal triple-screen and normal targeted ultrasonographic results. American journal of obstetrics and gynecology, 175(4), pp. 824-829. Hildebrandt, Mireille. “Defining profiling: a new type of knowledge?.” In Profiling the European citizen, pp. 17-45. Springer, Dordrecht, 2008. Noore, A., Singh, R., & Vatsa, M. (2007). Robust memory-efficient data level information fusion of multi-modal biometric images. Information Fusion, 8(4), 337346. Arshad, M. J., Durrani, A. I., Khan, U. G., Farooq, A., Afzal, M., Wajid, A., & Abdulnasir, M. (2014). A Study of Internet Threats, Avoidance and Biometric Security Techniques-Comparison of Biometric Techniques. Journal of Faculty of Engineering & Technology, 21(2), 135-146. Karam, Walid, Hervé Bredin, Hanna Greige, Gérard Chollet, & Chafic Mokbel. 
“Talking-face identity verification, audiovisual forgery, and robustness issues.” EURASIP Journal on Advances in Signal Processing 2009 (2009): 1-15. Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17. Becerra, Aldonso, J. Ismael De La Rosa, & Efrén González. “Speech recognition in a dialog system: from conventional to deep processing.” Multimedia Tools and Applications 77, no. 12 (2018): 15875-15911. Razeghinejad, Mohammad Reza, & Mohammad Banifatemi. “Ocular biometry in angle closure.” Journal of ophthalmic & vision research 8, no. 1 (2013): 17. Cui, Y., Meng, Q., Guo, H., Zeng, J., Zhang, H., Zhang, G., & Lan, J. (2014). Biometry and corneal astigmatism in cataract surgery candidates from Southern China. Journal of Cataract & Refractive Surgery, 40(10), 1661-1669. Castano, E., Sacchi, S., & Gries, P. H. (2003). The perception of the other in international relations: Evidence for the polarizing effect of entitativity. Political psychology, 24(3), 449-468.

本书版权归Nova Science所有

Biometric Technology and Trends [66]

[67] [68]

[69]

[70]

[71]

[72]

[73]

[74]

[75]

[76]

[77]

[78]

[79]

145

Castano, Emanuele, Alain Bonacossa, & Peter Gries. “National images as integrated schemas: subliminal primes of image attributes shape foreign policy preferences.” Political Psychology 37, no. 3 (2016): 351-366. Vogel, D. (2000). Migration control in Germany and the United States. International Migration Review, 34(2), 390-422. Peeters, Bart, & Guido De Roeck. “Reference based stochastic subspace identification in civil engineering.” Inverse problems in Engineering 8.1 (2000): 4774. Nagarajaiah, Satish, & KalilErazo. “Structural monitoring and identification of civil infrastructure in the United States.” Structural Monitoring and Maintenance 3, no. 1 (2016): 51-69. Zandian, H., Olyaeemanesh, A., Takian, A., & Hosseini, M. (2016). Contribution of targeted subsidies law to the equity in healthcare financing in Iran: exploring the challenges of policy process. Electronic Physician, 8(2), 1892. Saini, M. K., Nagal, R., Tripathi, S., Sindhwani, N., & Rudra, A. (2008). PC Interfaced Wireless Robotic Moving Arm. In AICTE Sponsored National Seminar on Emerging Trends in Software Engineering (Vol. 50). Chaudhary, A., Bodala, D., Sindhwani, N., & Kumar, A. (2022, March). Analysis of Customer Loyalty Using Artificial Neural Networks. In 2022 International Mobile and Embedded Technology Conference (MECON) (pp. 181-183). IEEE. Pandey, D., & Wairya, S. (2022). A Novel Algorithm to Detect and Transmit Human-Directed Signboard Image Text to Vehicle Using 5G-Enabled Wireless Networks. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-11. Pandey, B. K., Pandey, D., & Agarwal, A. (2022). Encrypted Information Transmission by Enhanced Steganography and Image Transformation. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-14. David, M. W., G. A. Hussein, & K. Sakurai. 
“Secure identity authentication and logical access control for airport information systems.” In IEEE 37th Annual 2003 International Carnahan Conference onSecurity Technology, 2003. Proceedings., pp. 314-320. IEEE, 2003. Srivastava, A., Gupta, A., & Anand, R. (2021). Optimized smart system for transportation using RFID technology. Mathematics in Engineering, Science & Aerospace (MESA), 12(4). Goyal, B., Dogra, A., Khoond, R., Gupta, A., & Anand, R. (2021, September). Infrared and Visible Image Fusion for Concealed Weapon Detection using Transform and Spatial Domain Filters. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)(ICRITO) (pp. 1-4). IEEE. Kaura, C., Sindhwani, N., & Chaudhary, A. (2022, March). Analysing the Impact of Cyber-Threat to ICS and SCADA Systems. In 2022 International Mobile and Embedded Technology Conference (MECON) (pp. 466-470). IEEE. Podio, Fernando L. “Personal authentication through biometric technologies.” In Proceedings 2002 IEEE 4th International Workshop on Networked Appliances (Cat. No. 02EX525), pp. 57-66. IEEE, 2002.

本书版权归Nova Science所有

146 [80]

[81] [82]

[83]

[84]

[85]

[86]

[87]

[88]

[89]

[90]

[91]

Mandeep Kaur, Ritesh Sinha, Syed Ali Rizvi et al. Blanco-Gonzalo, Ramon, Oscar Miguel-Hurtado, Aitor Mendaza-Ormaza, & Raul Sanchez-Reillo. “Handwritten signature recognition in mobile scenarios: Performance evaluation.” In 2012 IEEE International Carnahan Conference on Security Technology (ICCST), pp. 174-179. IEEE, 2012. Buciu, I., & Gacsadi, A. (2016). Biometrics systems and technologies: A survey. International Journal of Computers Communications & Control, 11(3), 315-330. Elliott, Stephen J., Sarah A. Massie, and Mathias J. Sutton. “The perception of biometric technology: A survey.” In 2007 IEEE Workshop on Automatic Identification Advanced Technologies, pp. 259-264. IEEE, 2007. Nainan, Sumita, Akshay Ramesh, Vipul Gohil, & Jaykumar Chaudhary. “Speech controlled automobile with three-level biometric security system.” In 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), pp. 1-6. IEEE, 2017. Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/ 10.1007/978-981-19-0312-0_20. Pramanik, S., Ghosh, R., Pandey, D., Samanta, D., Dutta, S., & Dutta, S. (2021). Techniques of Steganography and Cryptography in Digital Transformation. In Emerging Challenges, Solutions, and Best Practices for Digital Enterprise Transformation (pp. 24-44). IGI Global. Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15. Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. 
In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore. Degerine S. & A. Zaidi, “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195. Zaidi, A. “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077. Degerine S. & A. Zaidi, “Sources colorees” a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes,” Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and analysis into independent components”, Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179. Degerine S. & A. Zaidi, “ Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints,” in SIAM Journal on Optimization, 2007, Vol. 17, No. 4 : pp. 997-1014, doi: 10.1137/050622821.

本书版权归Nova Science所有

Biometric Technology and Trends [92]

[93] [94]

147

Zaïdi, A. “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices,” in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863867, 2019, doi: 10.1109/LSP.2019.2909651. Zaïdi, A. “Accurate IoU Computation for Rotated Bounding Boxes in and ,” in Machine Vision and Applications, 32, 114, 2021. Jayapoorani, S., Pandey, D., Sasirekha, N. S., Anand, R., & Pandey, B. K. (2022). Systolic optimized adaptive filter architecture designs for ECG noise cancellation by Vertex-5. Aerospace Systems, 1-11.

本书版权归Nova Science所有

本书版权归Nova Science所有

Chapter 7

Comparison of Digital Image Watermarking Methods: An Overview

Vibha Aggarwal, Sandeep Gupta*, Navjot Kaur, Virinder Kumar Singla and Shipra

College of Engineering & Management, Punjabi University Neighborhood Campus, Rampura Phul, Punjab, India

Abstract

With the growing use of social media and the Internet, much current technology is focused on preventing unauthorized access to data. Digital watermarking is the most popular technology for keeping digital media safe online. In digital watermarking, a message is hidden within an image, video, or audio file. Considerations for watermarking digital images include imperceptibility, capacity, robustness, and security. This work provides an overview of image watermarks, watermark types, and the need for watermarking, and it introduces various digital image watermarking methods in the spatial and frequency domains.

Keywords: digital watermarking, discrete cosine transform, discrete wavelet transform, discrete Fourier transform, spatial domain method, frequency domain method

* Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al. ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.


1. Introduction

The everyday use of the Internet to upload and download audio, video, text, and images makes routine tasks easier, but it also makes digital data more susceptible to unauthorized use [1-4]. The rapid growth of communication networks allows multimedia data to be distributed widely and in many ways. Because digital content is easy to duplicate, discouraging the unauthorized copying and distribution of electronic documents is a real challenge: data on the Internet can be reached by anyone, whether or not they are authorized to access it. In recent years, the invisible watermark has become a widely used tool for protecting digital data from unauthorized use [5]. With this approach, the source image (the watermark) is hidden in the target image in such a way that it is very difficult to perceive.

A standard structure for digital watermarking is shown in Figure 1. On the embedding side, a message is embedded imperceptibly into the original multimedia content. The watermarked content is then sent across a noisy wired or wireless channel. This noisy channel stands for everything that may happen to the data in transit, such as degradation of the visual information, modifications, and possible signal processing operations. At the detector, an attempt is made to detect the watermark message in the received content. Depending on the application, the detector may also use the original data during detection.

Figure 1. Watermarking scheme.

2. Watermarking

Throughout history, cryptography and steganography [6-8] have been employed to keep communications secret in both war and peace. More effective methods of concealing information were created as technology advanced and detection procedures improved. Steganography has found new applications with the advent of the Internet. However, because the medium is relatively insecure, it is also subject to more powerful attacks. Furthermore, as digital media becomes more important, new issues emerge, because it is now relatively easy to replicate and even alter multimedia information. Watermarking technology plays a vital part in commercial security here, since it allows an invisible mark to be placed in multimedia data to identify the rightful owner, track authorized users, or detect malicious tampering. The main distinction between steganography and watermarking is the greater robustness of watermarking schemes.

Digital watermarking refers to a set of approaches for invisibly conveying information by embedding it in the cover data [5]. The bits of information are encoded so that they are imperceptible, and they are dispersed across the image in such a way that they cannot be detected and resist attempts to remove the hidden data [5]. A perfect watermarking method would embed a large quantity of data that could not be erased or changed without rendering the cover object useless. Watermarking can be thought of as a type of steganography in which one message is embedded in another and the two messages are related in some way. Watermarking digital content is comparable to watermarking physical objects, except that the mark is carried by digital content.

In digital watermarking, a low-energy signal is embedded in another signal. The low-energy signal is known as the watermark, and it carries metadata about the main signal, such as security or rights information. The main signal in which the watermark is embedded is referred to as the cover signal because it covers the watermark. The cover signal is usually a still image, audio clip, video sequence, or text document in digital format. The process of embedding information into a digital signal is known as digital watermarking [1]. If the signal is duplicated, the embedded information is duplicated as well. In visible watermarking, the information is visible in the picture or video, whereas in invisible watermarking, the information is added as digital data to the audio, picture, or video but cannot be perceived.


2.1. Classification of Digital Watermarking Techniques

Digital watermarking techniques can be classified in several ways, depending on the criterion used. The main types are listed below; each has its own set of applications.

2.1.1. Robust and Fragile Watermarking

In robust watermarking, the watermark survives changes made to the watermarked content. In fragile watermarking, on the other hand, the watermark is destroyed when the watermarked content is changed or tampered with.

2.1.2. Public and Private Watermarking

In public watermarking, users of the content are allowed to detect the watermark; in private watermarking, they are not.

2.1.3. Asymmetric and Symmetric Watermarking

Asymmetric watermarking (also known as asymmetric key watermarking) is a technique in which the watermark is embedded and detected using different keys. In symmetric watermarking (or symmetric key watermarking), the same key is used for both embedding and detecting the watermark.

2.1.4. Steganographic and Non-Steganographic Watermarking

Steganographic watermarking hides the presence of the watermark from users of the content. In non-steganographic watermarking, users are aware that a watermark is present. Steganographic watermarking is employed in fingerprinting applications, whereas non-steganographic watermarking techniques can be used to protect privacy.

2.1.5. Visible and Invisible Watermarking

Visible watermarks are embedded in visual content in such a way that they can be seen when the content is viewed. Invisible watermarks cannot be detected simply by looking at the digital material. Visible watermarking was the first and most fundamental method of watermarking: the embedding function takes the cover object and applies the watermark to it, so that the watermark is visible on the cover object. This worked well for identification, but not so well for steganography. Although visible watermarking has been around for a long time, it is not a secure method for copy protection and copyright protection, because the visible watermarking algorithm cannot be kept secret. This type of watermarking can only be used to identify the owner of a piece of artwork. Invisible watermarking is employed in all other applications.
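To make the fragile watermarking of Section 2.1.1 concrete, the following is a minimal sketch rather than a method taken from the chapter. It assumes an 8-bit grayscale image given as a flat list, and it assumes one particular design: a SHA-256 digest of the seven most significant bits of every pixel is stored in the least significant bits of the first pixels. The function names `embed_fragile` and `verify_fragile` are hypothetical.

```python
import hashlib

def _digest_bits(pixels, n_bits):
    """Hash the 7 most significant bits of every pixel; return the first n_bits of the digest."""
    msb_only = bytes(p & 0xFE for p in pixels)
    digest = hashlib.sha256(msb_only).digest()
    bits = []
    for byte in digest:
        for k in range(8):
            bits.append((byte >> (7 - k)) & 1)
    return bits[:n_bits]

def embed_fragile(pixels, n_bits=64):
    """Store a digest of the image content in the LSBs of the first n_bits pixels."""
    bits = _digest_bits(pixels, n_bits)
    marked = list(pixels)
    for i, b in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | b   # overwrite only the least significant bit
    return marked

def verify_fragile(pixels, n_bits=64):
    """Return True if the digest stored in the LSBs still matches the image content."""
    expected = _digest_bits(pixels, n_bits)
    stored = [p & 1 for p in pixels[:n_bits]]
    return stored == expected

# Toy 16x16 grayscale image, flattened (illustrative data only).
cover = [(37 * i + 11) % 256 for i in range(256)]
marked = embed_fragile(cover)

tampered = list(marked)
tampered[200] ^= 0x10                        # flip one higher-order bit of one pixel

print(verify_fragile(marked))                # watermark intact
print(verify_fragile(tampered))              # watermark broken by the modification
```

Because only least significant bits are overwritten, no pixel changes by more than one grey level. In this simplified form, changes confined to the least significant bits of pixels beyond the first `n_bits` go unnoticed; a practical scheme would need to cover those as well.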

2.2. Requirements

The following are some of the most typical criteria used to assess a digital watermarking process: perceptual transparency of the watermark, computational complexity of embedding and detection, false positive rate of watermark detection at the detector, data recovery with or without access to the original signal, bit rate of the embedding process, robustness against attacks, speed of the embedding and retrieval process, and consistency between embedding and detection. Considering the basic purpose of digital watermarking and most of its applications, imperceptibility, robustness against intentional or unintentional signal operations, capacity, and security are the most important and most common requirements for generating usable and effective watermarks.

Imperceptibility refers to the perceived similarity between the original and the watermarked content. The owner of the original data normally does not tolerate any degradation of the data, so the perceptual quality of the original and the watermarked content should be the same. Subjective experiments are commonly used to assess the perceptual quality of a watermark [9].

Robustness refers to the ability to detect the watermark after the watermarked content has passed through a particular signal processing operation. The types of attack that a watermarking system should withstand depend on how watermarking is used. The ability to survive transmission over a network is sufficient for a broadcast monitoring system based on digital watermarking, but it is not sufficient for copyright protection. In that case, the signal processing that will be applied is wholly unknown, and attackers usually have access to the watermarked data and can operate on it. Consequently, the watermarking approach must be robust to any signal processing operation that preserves the quality of the watermarked data.

Capacity is the ability to detect and distinguish between different watermarked versions of an image with a very low probability of error as the number of versions increases. In other words, it is the amount of data that can be hidden without producing noticeable changes in the content [10].

Security concerns resistance to deliberate attack. An attacker's objective is to remove the embedded data directly, so the watermark must be attack-resistant: an attacker should be unable to delete, recover, or change the watermark without the private key.

Computational cost matters as well. Watermarking methods built on a sophisticated algorithm have higher computing costs than those built on a simple one. Although the processing speed and memory size of consumer equipment have improved over time, growing algorithm complexity has made applications more resource-intensive. In resource-constrained environments such as mobile devices, computational simplicity is still preferred, because mobile applications must strike a balance between battery life, bandwidth usage, memory allocation, and a variety of other considerations. Extending image watermarking to video frame watermarking is also likely to require a low-complexity technique. Sufficiently fast watermark detection allows a smooth transition from one frame to the next in real time, whereas very complicated detection undermines the practical value of video frame watermarking. Computational cost can be determined by measuring the execution time of the watermark embedding and detection phases on minimally equipped systems. Furthermore, if implemented in software, watermarking applications should take up little hard disk space.

There is a notable trade-off between these criteria. If the capacity is raised, visible distortions may appear in the content [11-13]. Imperceptibility may also suffer when the watermark strength is increased for the sake of robustness. Robustness, in turn, is inversely related to capacity: after an attack on the content, retrieving the hidden watermark without bit errors becomes more difficult as the amount of hidden data increases. The application determines the optimum balance of these requirements. In some cases, such as broadcast monitoring, the number of hidden bits must be sufficient to identify all broadcasts, whereas in others, such as copyright protection, a single bit of hidden information identifying the source of the content may be enough. All of these trade-offs should be considered when choosing a watermarking approach.
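The trade-off between capacity and imperceptibility can be illustrated numerically. The sketch below is an illustration under stated assumptions, not an experiment from the chapter: it assumes least significant bit substitution (introduced in Section 2.3) as the embedding method, a synthetic random cover image, and mean squared distortion as a stand-in for perceptual quality. The helper names `embed_k_lsb` and `mse` are hypothetical.

```python
import random

def embed_k_lsb(pixels, payload_bits, k):
    """Replace the k least significant bits of each 8-bit pixel with k payload bits."""
    keep_mask = 0xFF ^ ((1 << k) - 1)        # keeps the upper (8 - k) bits
    out = []
    for idx, p in enumerate(pixels):
        chunk = payload_bits[idx * k:(idx + 1) * k]
        value = 0
        for b in chunk:
            value = (value << 1) | b
        out.append((p & keep_mask) | value)
    return out

def mse(a, b):
    """Mean squared difference between two equally long pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
cover = [rng.randrange(256) for _ in range(4096)]   # synthetic 64x64 image, flattened

results = {}
for k in (1, 2, 3, 4):
    payload = [rng.randrange(2) for _ in range(len(cover) * k)]
    marked = embed_k_lsb(cover, payload, k)
    results[k] = mse(cover, marked)
    print(f"{k} bit(s) per pixel: capacity {len(cover) * k} bits, MSE {results[k]:.2f}")
```

Each additional bit plane doubles the largest possible change to a pixel, so for random data the mean squared distortion grows roughly fourfold per extra bit of capacity per pixel. The sketch says nothing about robustness, which would have to be measured separately by attacking the marked image.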


2.3. Techniques

Watermarking techniques can be categorized according to different data security criteria, as shown in Figure 2. Depending on the domain in which the watermark is embedded, they are divided into spatial domain and frequency domain techniques [14].

In spatial domain watermarking, the watermark and the cover information are combined directly in the pixel values. Some image analysis methods (for example, edge detection) provide perceptual information about the image, which can then be used to embed a watermark directly in the intensity values of selected regions of the image. These methods are straightforward and effective for embedding a hidden watermark into a host image, but they are not resistant to ordinary image modifications. The most common method is least significant bit (LSB) modification, which changes the LSB of chosen pixels in an image. In additive watermarking, a pseudo-random noise pattern, which may be integer or floating-point valued, is added in the spatial domain. In the patchwork technique, which relies on Gaussian statistics, patches are inserted into the image in pairs: the intensity of one patch is increased and the intensity of the second is reduced. For images that also contain text, the watermark is combined with the text.

In frequency domain watermarking, the transform coefficients of the cover are modified rather than its pixel values. The watermarked image is then created by applying the inverse transform to the marked coefficients. The most common transforms used in image watermarking are the Singular Value Decomposition (SVD), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), Discrete Fourier Transform (DFT), Lifting Wavelet Transform (LWT), and Discrete Hadamard Transform (DHT) [15, 16].

The DCT divides the image into pseudo-frequency bands, and watermarks are embedded in the middle-frequency sub-bands. In global DCT, the watermark is applied to the entire image, whereas in block-based DCT, it affects only a local region [17]. The low- and mid-frequency components of the visual data contain most of the perceptually relevant information. Compression methods and other image processing operations can therefore have only a limited impact on this region without destroying much of the visual content. As a result, embedding a watermark in the significant coefficients of the transform domain enhances robustness.
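The block-based, mid-frequency DCT embedding described above can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the chapter: one bit is hidden per 8×8 block by forcing an order relation between two mid-band coefficients, the positions `POS_A` and `POS_B` and the `margin` value are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

N = 8

def _alpha(u):
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

# Orthonormal DCT-II basis: C[u][x] = alpha(u) * cos((2x + 1) * u * pi / (2N))
C = [[_alpha(u) * math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
     for u in range(N)]

def dct2(block):
    """2-D DCT of an 8x8 block, computed as C * block * C^T."""
    tmp = [[sum(C[u][x] * block[x][y] for x in range(N)) for y in range(N)] for u in range(N)]
    return [[sum(tmp[u][y] * C[v][y] for y in range(N)) for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse 2-D DCT, computed as C^T * coeffs * C."""
    tmp = [[sum(C[u][x] * coeffs[u][v] for u in range(N)) for v in range(N)] for x in range(N)]
    return [[sum(tmp[x][v] * C[v][y] for v in range(N)) for y in range(N)] for x in range(N)]

POS_A, POS_B = (4, 1), (3, 2)    # assumed mid-frequency coefficient positions

def embed_bit(block, bit, margin=12.0):
    """Hide one bit: coefficient A above B encodes 1, B above A encodes 0."""
    F = dct2(block)
    a, b = F[POS_A[0]][POS_A[1]], F[POS_B[0]][POS_B[1]]
    hi, lo = max(a, b), min(a, b)
    if hi - lo < margin:                     # enforce a gap so rounding cannot flip the bit
        mid = (hi + lo) / 2
        hi, lo = mid + margin / 2, mid - margin / 2
    a, b = (hi, lo) if bit == 1 else (lo, hi)
    F[POS_A[0]][POS_A[1]], F[POS_B[0]][POS_B[1]] = a, b
    pixels = idct2(F)
    return [[min(255, max(0, round(p))) for p in row] for row in pixels]

def extract_bit(block):
    """Recover the hidden bit by comparing the two mid-band coefficients."""
    F = dct2(block)
    return 1 if F[POS_A[0]][POS_A[1]] > F[POS_B[0]][POS_B[1]] else 0

# Smooth synthetic block with mid-range grey levels (illustrative data only).
block = [[100 + 5 * x + 3 * y + (x * y) % 7 for y in range(N)] for x in range(N)]
for bit in (0, 1):
    marked = embed_bit(block, bit)
    brighter = [[p + 3 for p in row] for row in marked]   # uniform brightness change
    print(bit, extract_bit(marked), extract_bit(brighter))
```

A uniform brightness change alters only the DC coefficient, which is why the hidden bit survives it in this sketch. Stronger processing would call for a larger `margin`, at the cost of more visible distortion.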


Figure 2. Image watermarking techniques.

In the DWT, the data is split into two parts, usually a high-frequency part and a low-frequency part. The edge components of the signal are mostly contained in the high-frequency part. The low-frequency part is then split again into high- and low-frequency parts. This operation is repeated until the data has been completely decomposed or the needs of the application are met. Generally, no more than five decomposition levels are computed for compression and watermarking applications.

The LWT scheme is an improvement on the DWT. In the LWT, redundancy is reduced by exploiting the correlation between pixel values. It uses the Euclidean algorithm to decompose any finite filter bank into lifting steps. Because each lifting step is a simple addition that can be undone exactly, perfect reconstruction is possible even when the intermediate filtering results are rounded.

The Fourier transform is widely recognised as one of the most important and most extensively used image processing techniques. It cannot, however, provide frequency information for a limited time interval. The Discrete Fourier Transform domain has been examined for watermarking because it is resistant to a variety of geometric attacks, including translation, cropping, rotation, and scaling [18, 19].
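Returning to the lifting scheme, the following sketch shows why rounding does not prevent exact reconstruction. It is limited to the simplest case, an integer Haar transform on a one-dimensional signal, which is an assumption made here for illustration; the chapter does not specify a particular wavelet, and the function names are hypothetical.

```python
def lift_forward(signal):
    """One level of the integer Haar transform via lifting; len(signal) must be even."""
    even, odd = signal[0::2], signal[1::2]
    detail = [o - e for e, o in zip(even, odd)]             # predict step: high-frequency part
    approx = [e + (d >> 1) for e, d in zip(even, detail)]   # update step: rounded-down average
    return approx, detail

def lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order with the signs flipped."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]   # undo update
    odd = [d + e for e, d in zip(even, detail)]             # undo predict
    out = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    return out

signal = [12, 14, 200, 190, 7, 7, 90, 255]
approx, detail = lift_forward(signal)
print(approx, detail)
print(lift_inverse(approx, detail) == signal)

# A second decomposition level is obtained by transforming the approximation again.
approx2, detail2 = lift_forward(approx)
print(approx2, detail2)
```

The update step discards a fractional part, yet the inverse subtracts exactly the same rounded quantity, so the original integers are recovered. This is the property that makes lifting attractive when the watermarked image must be stored as integers.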


The Discrete Hadamard Transform (DHT) is commonly used in image compression and processing. The Hadamard matrix on which it is based is an orthogonal square matrix of order n whose entries are +1 and -1. Compared with other methods, the DHT uses a smaller number of coefficients. It has low complexity because it requires only additions and subtractions. Compared with high-gain transforms such as the DWT and DCT at high noise levels, the DHT offers useful high-frequency and middle-frequency bands for hiding or inserting the watermark.
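The two properties mentioned above, a matrix of +1 and -1 entries and a transform that needs only additions and subtractions, can be checked with a short sketch. It assumes the Sylvester construction, which yields Hadamard matrices whose order is a power of two; the function names are hypothetical and the transform is left unnormalized.

```python
def hadamard_matrix(n):
    """Sylvester construction of an n x n Hadamard matrix; n must be a power of two."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def fwht(values):
    """Fast Walsh-Hadamard transform (unnormalized) using only additions and subtractions."""
    a = list(values)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a

H4 = hadamard_matrix(4)
v = [1, 2, 3, 4]
by_matrix = [sum(h * x for h, x in zip(row, v)) for row in H4]

print(H4)
print(by_matrix, fwht(v))       # both give the same coefficients
print(fwht(fwht(v)))            # applying the transform twice returns n times the input
```

Since the matrix is symmetric and its rows are mutually orthogonal, the inverse transform is the same operation followed by a division by n. For a two-dimensional image block, the one-dimensional transform would be applied to the rows and then to the columns.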

2.4. Applications

Over the last two decades, digital watermarks have grown in popularity. Digital content can be embedded with data that can be retrieved later. Text, logos, handwritten signatures, and numbers can all be used as watermarks, and they have a variety of applications, as noted below.

2.4.1. Copyright Protection

Digital watermarks are used for copyright protection. The watermark information identifies the owner (author) of the original image, which makes it possible to demonstrate ownership of the intellectual property [20].

2.4.2. Copyright Authentication

Digital watermarks are also used for authentication. Verification is the procedure of validating the digital identity of the source document. A digital watermark is a piece of data that can be used to prove that the content of a document has not been tampered with. This is accomplished by embedding watermark data in the content, which can later be recovered to verify the authenticity of the original data. The result can serve as evidence in a police investigation, for example [20].

2.4.3. Fingerprinting and Digital Signatures

Customers purchase a variety of media, including still photographs. Customer-specific data is embedded in each copy of the digital medium. This could be a model number from the original equipment manufacturer (OEM) or another useful identifier. The origin of leaked content can then be traced using this data. Large government entities frequently require this level of security. The data in this case relates to the legal recipient of the digital data rather than the source of the information, and it is used to uniquely identify distributed copies of the material. This is important for tracing or monitoring unlawfully made copies of the data [20]. In any legitimate sale, the watermark identifies the recipient: the owner or creator of the work applies a separate watermark to each copy. If a copy is used inappropriately, the owner will be able to determine who made the unlawful copy. The person who misuses a copy is referred to as a traitor, while the individual who receives the work from the traitor is referred to as a pirate. An adversary is someone who tries to remove or forge a watermark.
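The fingerprinting idea can be sketched with the additive spatial-domain watermark from Section 2.3. The code below is a hedged illustration, not the chapter's method: it assumes that each customer is assigned an integer key, that the key seeds a pseudo-random pattern of +1 and -1 values added to the pixels, and that a leaked copy is traced by correlating it with every customer's pattern. The keys, the `strength` value, and all function names are hypothetical.

```python
import random

def pn_pattern(key, length):
    """Key-seeded pseudo-random pattern of +1/-1 values (the per-customer fingerprint)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed_fingerprint(pixels, key, strength=4):
    """Add the customer's pattern to the pixels, clipped to the 8-bit range."""
    pattern = pn_pattern(key, len(pixels))
    return [min(255, max(0, p + strength * w)) for p, w in zip(pixels, pattern)]

def correlation(pixels, key):
    """Correlate the mean-removed image with the pattern generated from a key."""
    pattern = pn_pattern(key, len(pixels))
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) * w for p, w in zip(pixels, pattern)) / len(pixels)

def trace_copy(pixels, customer_keys):
    """Return the key whose pattern correlates most strongly with the given copy."""
    return max(customer_keys, key=lambda k: correlation(pixels, k))

rng = random.Random(7)
cover = [rng.randrange(100, 156) for _ in range(4096)]   # synthetic image, flattened
customer_keys = [101, 202, 303]                          # hypothetical customer identifiers

leaked = embed_fingerprint(cover, 202)                   # the copy sold to customer 202
traced = trace_copy(leaked, customer_keys)
print(traced)
for k in customer_keys:
    print(k, round(correlation(leaked, k), 2))
```

The correlation with the correct key is close to the embedding strength, while the correlations with the other keys stay near zero. The same key generates the pattern at embedding and at detection, which makes this a symmetric scheme in the sense of Section 2.1.3.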

2.4.4. Copy Protection and Device Control

Copy control is largely used to prevent illicit copying of digital data. A watermark can be used to signal to a recording device that the data must not be copied or edited. Compliant players, as required by the patent license, must verify the watermark in the media being played. A digital watermark can also be used to activate copy control devices. To use these methods and protect the global communications industry, manufacturers must integrate the new encoders required under the patent license into their devices [20].

2.4.5. Broadcast Monitoring

Production companies must prevent illegal re-broadcasting. Digital watermarks can be used to automatically monitor broadcasts on satellite networks across the world and detect any illegally broadcast content. Similarly, many organizations want a reliable way to confirm that they receive all of the airtime they pay for from broadcasters. There are two types of verification procedure: passive monitoring, which attempts to recognize the broadcast material directly, and active monitoring, which relies on associated information being broadcast along with the content.

3. Performance Metrics

3.1. Signal to Noise Ratio (SNR)

The signal-to-noise ratio (SNR), expressed in decibels (dB), is the ratio of the power of the desired signal to the power of the unwanted signal or background noise.
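The chapter gives this definition only in words. Under the usual power-ratio convention it can be written as below; the symbols P_signal and P_noise are introduced here for illustration, and treating the original image I as the signal and the difference between I and the watermarked image K as the noise is an assumption about how the metric would be applied to watermarking.

```latex
SNR_{dB} = 10 \log_{10}\!\left(\frac{P_{signal}}{P_{noise}}\right),
\qquad
P_{signal} = \sum_{i,j} I(i,j)^2, \quad
P_{noise} = \sum_{i,j} \bigl(I(i,j) - K(i,j)\bigr)^2
```

For example, a signal power 1000 times the noise power corresponds to 10 log10(1000) = 30 dB.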


3.2. Peak Signal to Noise Ratio (PSNR)

The PSNR is the ratio, in decibels (dB), between a signal's maximum achievable power and the power of the corrupting noise that affects the accuracy of its representation. The PSNR is the most commonly used metric for assessing the reconstruction quality of lossy compression. In this setting, the signal is the original data and the noise is the error introduced by compression. When evaluating compression algorithms, PSNR is only a rough estimate of human perception of reconstruction quality: a higher PSNR generally suggests a higher-quality reconstruction, yet one reconstruction may appear closer to the original than another despite having a lower PSNR [21]. The PSNR is most easily defined via the mean squared error (MSE), which, for two m×n monochrome images I and K, one of which is taken to be a noisy copy of the other, is defined as:

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \lVert I(i,j) - K(i,j) \rVert^2 \qquad (1)$$

The PSNR is defined as:

$$PSNR = 10 \log_{10}\left(\frac{MAX_I^2}{MSE}\right) = 20 \log_{10}\left(\frac{MAX_I}{\sqrt{MSE}}\right) \qquad (2)$$

Here, MAX_I is the maximum possible pixel value of the image. The MSE is zero if the two images are identical, giving an infinite PSNR. An image with all zero-valued pixels (black) does not normally yield a PSNR of zero when compared with its source. A PSNR of zero is obtained if I is completely white and K is completely black (or vice versa).
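The definitions in Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available and 8-bit images (so the maximum pixel value is 255); the function names `mse` and `psnr` and the parameter `max_value` are chosen here for illustration and do not come from the chapter.

```python
import numpy as np

def mse(original, noisy):
    """Mean squared error between two equally sized monochrome images (Eq. 1)."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(noisy, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((a - b) ** 2))

def psnr(original, noisy, max_value=255.0):
    """PSNR in dB (Eq. 2); infinite when the two images are identical."""
    err = mse(original, noisy)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

# The two limiting cases discussed in the text (8-bit images assumed).
white = np.full((4, 4), 255, dtype=np.uint8)
black = np.zeros((4, 4), dtype=np.uint8)
print(psnr(white, white))  # identical images: infinite PSNR
print(psnr(white, black))  # all-white vs. all-black: PSNR of zero
```

The inputs are converted to floating point before subtraction so that unsigned 8-bit pixel values do not wrap around.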

3.3. Weighted Peak Signal to Noise Ratio (WPSNR)

A simple way to adapt the classical PSNR to watermarking applications is to assign different weights to perceptually different regions, in contrast to the PSNR, where all regions carry the same weight [22]. Because the human eye is less sensitive to changes in textured areas than in smooth areas, WPSNR adds a parameter called the Noise Visibility Function (NVF), a texture-masking function that estimates the amount of texture in any area of an image using a Gaussian model. The value of the NVF is used as a penalization factor in the WPSNR.


Vibha Aggarwal, Sandeep Gupta, Navjot Kaur et al.

$$WPSNR = 10 \log_{10}\left(\frac{L_{max}^2}{MSE \times NVF}\right) \qquad (3)$$
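One possible reading of Eq. (3) is sketched below. Several choices here are assumptions rather than the chapter's method: the NVF is taken as 1 / (1 + θσ²), with σ² the local variance in a small window; the weighting is applied pixel by pixel to the squared error; and the window size and θ are arbitrary illustrative parameters. NumPy 1.20 or later is assumed for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_variance(image, window=3):
    """Variance of each pixel's window x window neighbourhood (reflect-padded)."""
    img = np.asarray(image, dtype=np.float64)
    pad = window // 2
    padded = np.pad(img, pad, mode="reflect")
    patches = sliding_window_view(padded, (window, window))
    return patches.var(axis=(-2, -1))

def nvf(image, window=3, theta=1.0):
    """Assumed NVF form: close to 1 in flat regions, close to 0 in textured ones."""
    return 1.0 / (1.0 + theta * local_variance(image, window))

def wpsnr(original, marked, max_value=255.0, window=3, theta=1.0):
    """WPSNR in dB with the squared error weighted pixel-wise by the NVF."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(marked, dtype=np.float64)
    weighted_mse = float(np.mean(nvf(a, window, theta) * (a - b) ** 2))
    if weighted_mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / weighted_mse))

# The same +1 distortion is penalized less on a textured image than on a flat one.
flat = np.full((8, 8), 100.0)
textured = (np.indices((8, 8)).sum(axis=0) % 2) * 200.0  # checkerboard
print(wpsnr(flat, flat + 1.0), wpsnr(textured, textured + 1.0))
```

On a perfectly flat image the NVF equals 1 everywhere, so under these assumptions the WPSNR reduces to the ordinary PSNR.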

3.4. Effectiveness

The digital watermarking mechanism depends on the input signal. Effectiveness is the ability to detect the watermark immediately after the embedding procedure. While 100 percent effectiveness is ideal, it is usually impossible to achieve. For example, a watermark cannot be embedded in a perfectly random signal, because such a signal has no redundancy. Because 100 percent effectiveness cannot be attained, most watermarking algorithms target specific applications and trade off robustness, security, and other attributes to remain commercially viable.

3.5. Efficiency

Efficiency refers to embedding capacity, that is, how much watermark data can be embedded. The required watermark size is determined by the application. In anti-copy applications, the author name or a serial number may simply be embedded in the original signal. In forensic applications, file descriptions and other pertinent information may be embedded in order to prevent editing. Although any amount of data can in principle be embedded, doing so usually sacrifices signal quality or imperceptibility. A popular watermark evaluation approach is to plot PSNR as a function of the embedded payload, so that both qualities can be examined together.

4. Comparison of Watermarking Techniques

Spatial-domain techniques are not robust against various attacks, but they are computationally simple and less tedious than frequency-domain or wavelet techniques. The LSB technique is simple to implement and causes little degradation of image quality, but it is sensitive to noise and prone to attacks such as cropping, shuffling, and scaling. The patchwork method is strong against attacks but has low capacity [35-44]. In texture mapping, data is added to a continuous texture pattern of an image; this is also its limitation, since the method needs large textured regions to be effective.
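The simplicity and the fragility of the LSB technique can both be seen in a short sketch. It assumes NumPy, an 8-bit grayscale cover image, and a watermark given as a flat sequence of bits written into the first pixels in raster order; the function names `embed_lsb` and `extract_lsb` are hypothetical, and practical schemes typically add a key-dependent pixel selection that is omitted here.

```python
import numpy as np

def embed_lsb(cover, bits):
    """Write watermark bits into the least significant bit of the first pixels."""
    cover = np.asarray(cover, dtype=np.uint8)
    bits = np.asarray(bits, dtype=np.uint8).ravel()
    if bits.size > cover.size:
        raise ValueError("watermark does not fit in the cover image")
    flat = cover.flatten()  # flatten() returns a copy, so the cover is untouched
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | (bits & 1)
    return flat.reshape(cover.shape)

def extract_lsb(marked, n_bits):
    """Read the first n_bits least significant bits back in raster order."""
    return np.asarray(marked, dtype=np.uint8).ravel()[:n_bits] & 1

cover = np.arange(16, dtype=np.uint8).reshape(4, 4) * 10
bits = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
marked = embed_lsb(cover, bits)
recovered = extract_lsb(marked, bits.size)

# Each pixel changes by at most one grey level, so degradation is small ...
max_change = int(np.abs(marked.astype(int) - cover.astype(int)).max())
# ... but a one-level perturbation of every pixel flips every embedded bit.
damaged = extract_lsb(marked + 1, bits.size)
print(recovered, max_change, damaged)
```

The last step illustrates the sensitivity to noise noted above: the watermark lives entirely in the lowest bit plane, which is the first to be destroyed by any distortion.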


Frequency-domain watermarking techniques have superseded spatial-domain techniques by distributing the data across bits of different orders in a way that is more robust. The available frequency-domain transforms have different advantages and disadvantages. The Discrete Cosine Transform (DCT) has a reasonable execution time, that is, moderate complexity, and tolerable robustness, and it is computationally more efficient than the DFT. However, the DCT suffers from blocking and picture-cropping effects: when an image is compressed at higher compression ratios, the blocks become visible [23-26]. The DWT offers better frequency analysis, good localization, and good energy compaction, but its processing cost is higher, its compression time is long, and it is less robust. The LWT is less complex, so its operation time is shorter; no border management is needed, all operations can be performed in parallel, and the inverse LWT is straightforward, but extracting information from the coefficients is more tedious [27-28]. The DFT is more robust and so performs better against attacks such as scaling, but it is not strong against attacks such as shearing and cropping. The DHT is more efficient, with a low computational cost because it is less complex [29-31].

Conclusion and Future Outlook

Digital watermarking has proved to be extraordinarily useful. Digital image watermarking is an effective technique for transmitting information securely. Security is the prime issue when data transmission is under consideration [32-34]. From this work it can be inferred that the spatial-domain technique, rather than the frequency-domain technique, is the most widely used, because image recovery is considerably good even when the image has been cropped or translated. Frequency-domain techniques, on the other hand, are more secure, but their complexity makes watermark recovery difficult.

References

[1] Anand, R., Shrivastava, G., Gupta, S., Peng, S. L., & Sindhwani, N. (2018). Audio watermarking with reduced number of random samples. In Handbook of Research on Network Forensics and Analysis Techniques (pp. 372-394). IGI Global.
[2] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., & Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.
[3] Gupta, M. & Anand, R. (2011). Image Compression using Set of Selected Bit Planes on Basis of Intensity Variations. Dronacharya Research Journal, 3(1), 35-40.
[4] Gupta, M. & Anand, R. (2011). Image Compression using Set of Selected Bit Planes using Adaptive Quantization Coding. In International Conference on Advanced Computing, Communication and Networks (pp. 457-461).
[5] Cox, I. J., Kilian, J., Leighton, F. T., & Shamoon, T. (1997). Secure spread spectrum watermarking for multimedia. IEEE Transactions on Image Processing, 6(12), 1673-1687.
[6] Pandey, B. K., Pandey, D., Wairya, S., Agarwal, G., Dadeech, P., Dogiwal, S. R., & Pramanik, S. (2022). Application of Integrated Steganography and Image Compressing Techniques for Confidential Information Transmission. Cyber Security and Network Security, 169-191.
[7] Pandey, B. K., Mane, D., Nassa, V. K. K., Pandey, D., Dutta, S., Ventayen, R. J. M., & Rastogi, R. (2021). Secure text extraction from complex degraded images by applying steganography and deep learning. In Multidisciplinary Approach to Modern Digital Steganography (pp. 146-163). IGI Global.
[8] Pandey, D., Nassa, V. K., Jhamb, A., Mahto, D., Pandey, B. K., George, A. H., & Bandyopadhyay, S. K. (2021). An integration of keyless encryption, steganography, and artificial intelligence for the secure transmission of stego images. In Multidisciplinary Approach to Modern Digital Steganography (pp. 211-234). IGI Global.
[9] Swanson, M. D., Kobayashi, M., & Tewfik, A. H. (1998). Multimedia Data-Embedding and Watermarking Technologies. Proceedings of the IEEE, 86(6), 1064-1087.
[10] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[11] Jain, S., Sindhwani, N., Anand, R., & Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham.
[12] Anand, R., Mann, A., & Sharma, K. (2020). Deep Metric Learning-based Face Recognition Pipeline with Anti-Spoofing on Raspberry-Pi Single-Board Computer. Test Engineering & Management, 82, 4302-4308.
[13] Anand, R., Singh, B., & Sindhwani, N. (2009). Speech perception & analysis of fluent digits’ strings using level-by-level time alignment. International Journal of Information Technology and Knowledge Management, 2(1), 65-68.
[14] Saqib, M., & Naaz, S. (2017). Spatial and frequency domain digital image watermarking techniques for copyright protection. Int. J. Eng. Sci. Technol. (IJEST), 9(2), 691-699.
[15] Gupta, G., & Aggarwal, H. (2009). Digital image watermarking using two dimensional Discrete Wavelet Transform, Discrete Cosine Transform and Fast Fourier Transform. International Journal of Recent Trends in Engineering, 1(1), 616.
[16] Juneja, S., & Anand, R. (2018). Contrast Enhancement of an Image by DWT-SVD and DCT-SVD. In Data Engineering and Intelligent Computing (pp. 595-603). Springer, Singapore.
[17] Pandey, D., & Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[18] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[19] Saroha, R., Singh, N., & Anand, R. Echo Cancellation By Adaptive Combination Of Normalized Sub Band Adaptive Filters. International Journal of Electronics & Electrical Engineering, 2(9), 1-10.
[20] Arnold, M. K., Schmucker, M., & Wolthusen, S. D. (2003). Techniques and Applications of Digital Watermarking and Content Protection. Artech House. ISBN: 1-58053-111-3.
[21] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[22] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[23] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[24] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[25] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[26] Jayapoorani, S., Pandey, D., Sasirekha, N. S., Anand, R., & Pandey, B. K. (2022). Systolic optimized adaptive filter architecture designs for ECG noise cancellation by Vertex-5. Aerospace Systems, 1-11.
[27] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[28] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless Personal Communications, 79(3), 1713-1731.
[29] Jain, S., Kumar, M., Sindhwani, N., & Singh, P. (2021, September). SARS-Cov-2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.
[30] Ratnaparkhi, S. T., Singh, P., Tandasi, A., & Sindhwani, N. (2021, September). Comparative analysis of classifiers for criminal identification system using face recognition. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-6). IEEE.
[31] Kohli, L., Saurabh, M., Bhatia, I., Shekhawat, U. S., Vijh, M., & Sindhwani, N. (2021). Design and Development of Modular and Multifunctional UAV with Amphibious Landing Module. In Data Driven Approach Towards Disruptive Technologies (pp. 405-421). Springer, Singapore.
[32] Pandey, D., & Pandey, B. K. (2022). An Efficient Deep Neural Network with Adaptive Galactic Swarm Optimization for Complex Image Text Extraction. In Process Mining Techniques for Pattern Recognition (pp. 121-137). CRC Press.
[33] Sharma, M., Sharma, B., Gupta, A. K., Khosla, D., Goyal, S., & Pandey, D. (2021). A Study and Novel AI/ML-Based Framework to Detect COVID-19 Virus Using Smartphone Embedded Sensors. In Sustainability Measures for COVID-19 Pandemic (pp. 59-74). Springer, Singapore.
[34] Pramanik, S., Ghosh, R., Pandey, D., Samanta, D., Dutta, S., & Dutta, S. (2021). Techniques of Steganography and Cryptography in Digital Transformation. In Emerging Challenges, Solutions, and Best Practices for Digital Enterprise Transformation (pp. 24-44). IGI Global.
[35] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher m Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[36] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[37] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[38] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[39] Degerine S. & Zaidi A., “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.
[40] Zaidi A., “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.
[41] Degerine S. & Zaidi A., “Sources colorees,” a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes,” Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and analysis into independent components”, Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.
[42] Degerine S. & Zaidi A., “Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints,” in SIAM Journal on Optimization, vol. 17, no. 4, pp. 997-1014, 2007, doi: 10.1137/050622821.
[43] Zaïdi A., “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices,” in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: 10.1109/LSP.2019.2909651.
[44] Zaïdi A., “Accurate IoU Computation for Rotated Bounding Boxes in R^2 and R^3,” in Machine Vision and Applications, 32, 114, 2021.


Chapter 8

Novel Deep Transfer Learning Models on Medical Images: DINET

Harmanpreet Kaur1,*, Reecha Sharma2,† and Gurpreet Kaur3,‡
1 Department of Computer Science and Engineering, Chandigarh University, Punjab, India
2 Department of Electronics & Communication Engineering, Punjabi University, Punjab, India
3 Department of Electronics & Communication Engineering, Chandigarh University, Punjab, India

Abstract

Transfer learning (TL) is currently very appealing in deep learning because of its ability to train deep neural networks in situations where large labelled datasets are scarce, as is the case for medical images. Fine-tuning pre-trained models on similar datasets has proven successful in addressing the challenges of limited data. The proposed deep model has shown that it can improve the characteristics learned from the medical image dataset, thereby avoiding the effort of training the network from scratch. In this work, we propose a novel deep model called the “Dense Block-Inception Network” (DINET) to demonstrate the effectiveness of transfer learning systems on medical images and to address the scarcity of domain expertise in the medical field, a major challenge in medical image analysis. Furthermore, we demonstrate that, with transfer learning, the use of non-medical data is a relevant strategy that can be leveraged to train and fine-tune deep neural networks to extract feature maps from medical images. Finally, we demonstrate that using pre-trained models in conjunction with transfer learning is a viable strategy for generating learned representations of chest X-ray images, with the potential to overcome the limited availability of large labelled datasets. This research focused on the implementation and evaluation of a novel deep model to assess the effectiveness of transfer learning systems on medical images. Future work may focus on improving the effectiveness of transfer learning by training with more data to improve the discriminative features and increase learned interactions on new data to solve medical image recognition tasks. Potential improvements may also focus on using different class-balancing methods, such as the Synthetic Minority Over-sampling Technique (SMOTE), to address the under-sampling and over-sampling of classes present in many deep learning datasets.

* Corresponding Author’s Email: [email protected].
† Corresponding Author’s Email: [email protected].
‡ Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing
Editors: Digvijay Pandey, Rohit Anand, Nidhi Sindhwani et al.
ISBN: 979-8-88697-832-2
© 2023 Nova Science Publishers, Inc.

Keywords: deep learning (DL), transfer learning (TL), CIFAR-10, image classification, dense block-inception network (DINET)

1. Introduction

Deep learning (DL) is an active and popular research area in the field of machine intelligence that provides researchers with tools and techniques to solve complex tasks involving very large datasets [1-3]. Deep learning is a subclass of machine learning that applies mathematics to mimic the cognitive abilities of the human brain in order to perform human-level tasks, such as speech and image recognition [4-6]. It trains a computer to mimic the structure and function of the human brain, especially biological neural networks, through the application of Artificial Neural Networks (ANNs) to solve computer vision and recognition problems [7-12]. For example, Waymo, formerly the Google self-driving car project, is an autonomous driving company that uses deep learning techniques to develop self-driving vehicles trained on millions of miles of public roads and over a billion simulated miles to handle complex scenarios such as traffic signs, pedestrians, and car surroundings under real-world driving conditions [13]. As stated in [14], deep learning allows computational models with many processing layers to learn representations of data at multiple levels of abstraction. Deep learning models, for example, DCNNs,


have shown remarkable results in the classification of medical images [15, 16]. Within DCNNs, the convolutional filters do the heavy lifting by convolving across the underlying images to extract the features needed for learning and prediction. Deep learning models can include hundreds of hidden layers or convolutions, generating millions of parameters that require powerful, large-scale computing resources to train. Additionally, deep learning frameworks such as TensorFlow [17], PyTorch [18], and Keras [19] have been optimized to scale up and speed up the use of federated and distributed learning algorithms for deep neural networks [20, 21]. With transfer learning, the CNN does not have to be retrained from scratch. Previous research has reported that different transfer learning approaches can enhance the performance [22] and accuracy of deep learning models. The following scenarios further categorize the settings in which transfer learning can be applied [23].

Instance transfer: This approach aims to reuse knowledge gained from parts of the source domain data to learn the tasks of a target domain. In other words, importance is given to re-weighting and sampling.

Feature-representation learning: The idea behind this approach is to apply optimal features learned from the source domain that can better represent features in the target domain. The objective, in this case, is to minimize error rates and significantly improve the performance of learned features on the target tasks.

Parameter transfer: The assumption in this scenario is that models used for similar tasks in the source and target domains share common parameters or prior distributions.
Essentially, this means that learned weights can be transferred across different domain tasks.

Table 1. Methodological approaches to transfer learning

Approach                        Inductive Transfer   Transductive Transfer   Unsupervised Transfer
Instance transfer               x                    x
Feature-Representation          x                    x                       x
Parameter-Transfer              x
Relational-Knowledge Transfer   x


Relational-knowledge transfer: Unlike the previous three approaches, this case concerns the contextual relationship between the source and target tasks, which is based on similar data. The implication is that relational knowledge learned from similar data can be transferred across source and target tasks. The relationship between the different settings and approaches to transfer learning is summarized in Table 1. Multi-Task Learning (MTL), an inductive transfer approach, is one of the most successful and widely adopted approaches because of its ability to improve generalization by utilizing domain-specific information obtained from the training signals of related tasks [24]. Successful applications of MTL range from drug discovery [25] to NLP [26]. Regardless of the selected TL approach, the goal is to find an accurate marginal or conditional distribution difference in the source domain or a combination of both [27]. In this work, we focus on the feature-representation approach to transfer learning, in which the source and target class labels are different but we pick the optimal features from the source domain to classify our target labels.

1.1. Medical Image Classification: Transfer Learning

Humans have the inherent capacity to conceptualize complex concepts learned in one domain and use that knowledge to solve another related task in either a similar or a different domain. For example, a person can learn how to play the guitar and use this knowledge to learn how to play the violin. The underlying principle is that cross-referencing related tasks and applying knowledge learned from them makes other related tasks much easier to solve. Researchers and data scientists believe that this concept of transferring knowledge from one domain to another to solve related tasks is paramount to achieving the goal of strong AI. Traditionally, in the context of deep learning, accessing very large datasets with labelled data for supervised learning is tedious, time-consuming, and expensive. Therefore, the concept of transfer learning is gaining wide adoption and finding success in computer recognition tasks. Given the formal definition and notations of transfer learning (TL) described earlier, TL can also be defined as the ability to identify deep connections [28] or the ability to extend what has been learned from one domain to a new domain [29]. The insight behind TL is to leverage pre-trained models


to transfer knowledge gained through solving a specific task and then reuse that knowledge to solve different problems, including problems from a different domain. One popular family of computer vision problems is medical image classification, localization, and segmentation, where knowledge learned from a non-medical image domain (source) can be used to make predictions in a medical image domain (target). Recent studies in the computer vision literature have provided empirical evidence of the success of using TL with CNNs to represent learned features trained on very large-scale datasets. For instance, several studies demonstrated the use of CNN architectures pre-trained on ImageNet either as feature extractors [30-32] or as networks to be fine-tuned [33, 34]. With the success of CNNs, the transferability of deep representations across tasks has been comprehensively investigated, especially using the transfer learning paradigm [22, 35, 36]. Also, the use of pre-trained networks in TL enables deep convolutional neural network (DCNN) models to improve their generalization performance on new classification tasks previously unseen by the model. In a pre-trained model, the trained weights enable the bottom hidden layers of the ConvNet to learn low-level universal features such as curves, edges, and lines that are useful for most image analysis tasks. The top convolutional layers tend to specialize in learning more abstract features (e.g., eyes, nose, ears) and fit those features to the specific classification task of interest (e.g., face or jawlines). Transfer learning improves the learning of relationships or patterns in the target domain by leveraging knowledge from related domains. In many image recognition tasks where only limited data exist, TL techniques have achieved considerable success in transferring knowledge from one domain to another to solve different image recognition problems.
Medical image classification is considered a sub-domain of image classification, so neural networks used for classification, such as CNNs, can also be applied to it. Furthermore, in the computer vision domain, and specifically in the medical domain, prior research suggests that using TL with ImageNet pre-trained models can have a significant impact on the success of medical image classification tasks [37, 38]. The literature also shows that a transfer learning system for classifying Optical Coherence Tomography (OCT) images yielded an accurate model that rivaled the judgment of six human experts [39]. Regarding medical image classification, [40] demonstrated a multi-label classification task using a DCNN architecture and evaluated its performance on the Chest X-ray 14 dataset.
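The feature-extractor strategy described above, in which pre-trained layers are kept fixed and only a new classification head is trained, can be illustrated with a deliberately small NumPy sketch. Everything in it is a toy assumption: the "pre-trained" layer is a fixed random projection rather than an ImageNet model, the data are synthetic, and the names `W_frozen`, `extract_features`, and `train_head` are hypothetical. It is not the DINET architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for pre-trained layers: a fixed projection plus ReLU.
W_frozen = rng.normal(size=(8, 16)) / np.sqrt(8)

def extract_features(x):
    """Forward pass through the frozen layer; its weights are never updated."""
    return np.maximum(x @ W_frozen, 0.0)

def train_head(features, labels, lr=0.1, epochs=200):
    """Train only a new logistic-regression head on top of the frozen features."""
    w = np.zeros(features.shape[1])
    b = 0.0
    eps = 1e-12  # avoids log(0)
    losses = []
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        losses.append(float(-np.mean(
            labels * np.log(probs + eps) + (1 - labels) * np.log(1 - probs + eps))))
        grad = probs - labels  # gradient of the loss with respect to the logits
        w -= lr * (features.T @ grad) / len(labels)
        b -= lr * grad.mean()
    return w, b, losses

# Toy "target task": synthetic inputs with a binary label.
x = rng.normal(size=(64, 8))
y = (x[:, 0] > 0).astype(float)
snapshot = W_frozen.copy()
w, b, losses = train_head(extract_features(x), y)
print(losses[0], losses[-1])
```

Fine-tuning would differ only in that some or all of the frozen weights would also receive gradient updates, usually with a smaller learning rate.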


Over the past years, several studies have contributed to the development of deep learning networks capable of learning representations of feature maps with multiple levels of abstraction [41], for example, AlexNet [42], GoogLeNet [43], VGGNet [44], ResNet [45], and DenseNet [46]. Representation learning refers to a collection of methods that allow machines to automatically discover insights from raw data [47]. Prior studies have also focused on applying pre-trained ResNet-50 and DenseNet-121 architectures to medical images to develop state-of-the-art models for classification and detection problems. For example, the ResNet-50 architecture, pre-trained on ImageNet, was leveraged to build a deep model using TL on the Chest X-ray 14 dataset. Moreover, weakly supervised learning has been used to examine pathology localisation through the classification of thoracic diseases [47-50]. In binary classification tasks, researchers used the Chest X-ray 14 dataset for pneumonia detection with the CheXNet model [51-53]. CheXNet is a recent DCNN effort on the classification of chest X-ray images using a fine-tuned DenseNet-121 with a modified fully connected layer. Researchers have found that higher-resolution images can improve model performance, especially when spatial location information is used, which greatly improves classification accuracy [54]. Our approach extends the depth of the proposed network and introduces active interactions among the learned feature maps. We use pre-trained models for our novel method and apply transfer learning strategies, which benefit from less time spent learning new tasks. The remainder of this chapter is organized as follows. Section 2 reviews the literature. Section 3 describes the methodology. Section 4 discusses the experiments and results. Section 5 discusses the findings. Finally, Section 6 presents the conclusion, limitations, and future research.

2. Literature Review

In recent years, the popularity of deep CNNs able to learn multi-scale features in different visual recognition tasks has given rise to the design of other multi-scale CNNs that address some of the inherent challenges of computer vision. The Inception network [43] is one such heavily engineered class of CNN, a 22-layer network built to solve classification and detection tasks. The Inception model, based on the prior work in [55], incorporated a dimensionality reduction layer (1 x 1 convolutional filter) to


reduce the expensive computation and training cost of the network. The other notable contributions of Inception networks were the introduction of inception modules and kernel tricks to improve performance. The introduction of auxiliary classifiers in the Inception network helped to mitigate the problem of vanishing gradients, in other words, to prevent parts of the network from ‘dying out’. Later versions of Inception networks introduced further improvements, such as residual connections, which dramatically improved the speed and efficiency of training [56]. An illustration of the inception module is shown in Figure 1.

Figure 1. An inception module with dimensionality reduction.
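The effect of the 1 x 1 dimensionality reduction can be made concrete with a small NumPy sketch. A 1 x 1 convolution is a linear map applied independently at every pixel across the channel axis, so it can be written as a single tensor contraction. The feature-map size (28 x 28 x 192), the reduced width (16), and the number of 5 x 5 filters (32) are illustrative values chosen here, and the operation counts ignore biases and padding details.

```python
import numpy as np

def conv1x1(feature_maps, weights):
    """1x1 convolution: a per-pixel linear map over channels.

    feature_maps has shape (H, W, C_in); weights has shape (C_in, C_out).
    """
    return np.einsum("hwc,cd->hwd", feature_maps, weights)

rng = np.random.default_rng(0)
x = rng.normal(size=(28, 28, 192))   # illustrative input feature maps
w = rng.normal(size=(192, 16))       # illustrative 1x1 reduction filters
reduced = conv1x1(x, w)              # spatial size kept, channels 192 -> 16

# Multiplications needed to produce 32 output channels with 5x5 filters:
h, wd, c_in, c_mid, c_out, k = 28, 28, 192, 16, 32, 5
direct = h * wd * k * k * c_in * c_out
bottleneck = h * wd * c_in * c_mid + h * wd * k * k * c_mid * c_out
print(reduced.shape, direct, bottleneck)
```

With these illustrative numbers the reduced path needs roughly a tenth of the multiplications of the direct 5 x 5 convolution, which is the saving the dimensionality reduction layer is meant to provide.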

Another class of deep CNN is the DenseNet architecture, which showed that CNNs can go substantially deeper and achieve higher accuracy without sacrificing performance, especially when training deep networks. DenseNet-121 comprises 121 layers whose feature maps are concatenated in a feed-forward fashion, as shown in Figure 2 [46].

Figure 2. 5-layer dense block with a growth rate of k = 4.
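The concatenation pattern of a dense block can be sketched as follows. This is a toy illustration, assuming NumPy: each "layer" is a random 1 x 1 projection followed by ReLU standing in for the real batch-norm, ReLU, and convolution sequence, and the initial channel count of 6 is an arbitrary choice. The point is only how the number of input channels grows by the growth rate k at every layer.

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block: every layer sees the concatenation of all earlier outputs."""
    features = [x]
    input_channels = []
    for _ in range(num_layers):
        stacked = np.concatenate(features, axis=-1)
        input_channels.append(stacked.shape[-1])
        # Stand-in for BN-ReLU-Conv: random 1x1 projection to growth_rate channels.
        w = rng.normal(size=(stacked.shape[-1], growth_rate))
        features.append(np.maximum(stacked @ w, 0.0))
    return np.concatenate(features, axis=-1), input_channels

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 6))  # 6 initial channels (arbitrary)
out, input_channels = dense_block(x, num_layers=5, growth_rate=4, rng=rng)
print(input_channels, out.shape)
```

Because each layer contributes only k new feature maps and reuses all earlier ones, the layers can stay narrow, which is where the parameter efficiency attributed to DenseNet comes from.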


Like Inception, DenseNet has the advantage of alleviating the problem of vanishing gradients. Other compelling benefits of DenseNet include feature reuse, strong feature propagation, and a significant reduction in the number of parameters, which improves the efficiency of training deep networks [46]. Our proposed method incorporates the beneficial aspects of DenseNet, such as parameter efficiency and the concatenation of feature maps, which promotes feature reuse, and replaces the inception modules with DenseNet modules. TL techniques and results for medical image classification have been reported in the literature, but it is not clear to what extent the findings help the models generalize in the medical image domain. In this study, our central objective is to determine the effectiveness of transfer learning using our novel deep model on medical images and to find an optimal cut-off point through fine-tuning. We develop a deep model based on a deep transfer learning method with increased depth for medical image classification. We then investigate the effectiveness of TL on medical image classification and potentially advance the learned knowledge (model) to generalize to other unrelated problems in the medical image domain. Finally, in medical image diagnosis, it is worth noting that a deep learning system that minimizes the occurrence of false positives is much more beneficial in mitigating the risks associated with misdiagnosis.

3. Methodology

This section presents the steps and procedures taken for the medical image classification task. Section 3.1 describes the datasets and the data pre-processing steps. Sections 3.2 and 3.3 describe the experimental setup and the implementation environment. In this work, we apply a machine learning methodology, specifically using non-linearity functions, to medical image recognition tasks. A pre-trained DIM network is used for transfer learning, leveraging features learned from the source data (non-medical images) for our target data (medical images). Finally, the DIM network is trained using versions of the novel deep architecture to investigate the effectiveness of fine-tuning on the medical images.

3.1. Datasets

To ensure the robustness of our proposed method, we trained and evaluated it on two publicly available datasets: CIFAR-10 and ChestX-ray14. The datasets were randomly split in proportions of 90:10 (CIFAR-10) and 80:10:10 (ChestX-ray14) for training and validation. Both datasets are used in the training and validation phases of our method. The datasets and the pre-processing applied to them are described below:

• CIFAR-10: The CIFAR-10 dataset comprises 60,000 color images of diverse objects, each 32 × 32 pixels in size, categorized into 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck) with 6,000 images per class [57]. During the training of the proposed method, the dataset was automatically split into 50,000 training images and 10,000 test images.

• ChestX-ray14: Chest X-ray is one of the most popular imaging modalities because of its cost-effectiveness in medical examinations. Although diagnosing from chest X-rays can be challenging, this dataset is one of the largest publicly available collections of medical images focused on the clinical diagnosis of chest X-rays. It consists of 112,120 frontal-view chest X-ray images from 30,805 unique patients, spanning 14 classes of thoracic pathologies: Atelectasis, Cardiomegaly, Consolidation, Edema, Emphysema, Effusion, Fibrosis, Hernia, Infiltration, Nodule, Mass, Pleural thickening, Pneumonia, and Pneumothorax [40]. Accordingly, this dataset is considered suitable for weakly supervised learning, with labels that have acceptable accuracy for research efforts.

• Data pre-processing and transformation: Data pre-processing is important and is an ongoing research area in machine learning. A well-balanced dataset is critical for obtaining an accurate model on image recognition tasks. According to [58], data pre-processing may include data cleaning, integration, transformation, and reduction operations. Quality decisions depend on quality data, which in turn depends on transforming the data into a form that computers can process efficiently and accurately. Figure 3 shows the distribution of all the diagnoses associated with the Chest X-rays.

As illustrated in Figure 3, the classes in the Chest X-ray dataset are heavily imbalanced, a condition known as linear imbalance, in which minority classes are nearly equal in size and a large gap exists between majority and minority classes. For example, the majority class, No Finding, contains over 60,000 images, whereas rare classes such as Pneumonia are sparsely represented, so adjustments are required to balance the distribution.

Figure 3. The probability distribution of chest x-ray classes. (Bar chart; axis: "Count of Chest X-rays per…", 0 to 70,000; class labels shown include Atelectasis, Nodule, Pneumothorax, Pleural Thickening, and Hernia.)

To overcome this problem of class imbalance, we applied the undersampling method on the No Finding class to standardize the count around the mean. After the data pre-processing step, the results are displayed in Figure 4 where the goal is to distribute the weights across the classes to minimize the large differences between the majority and minority class.

Figure 4. Adjusted distribution after pre-processing. (Bar chart titled "Adjusted Frequency of Diseases in Patient Group"; axis: Frequency (%), 0 to 80.)
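The undersampling step can be sketched as follows. This is only an illustration under stated assumptions: the chapter does not say how the mean was computed, so the sketch takes it to be the mean sample count over all classes, and the record layout, function names, seed, and toy counts are all hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Sample = Tuple[str, str]  # (image_id, label); a hypothetical record layout


def mean_class_count(samples: List[Sample]) -> int:
    """Average number of samples per class, rounded to the nearest integer."""
    counts = Counter(label for _, label in samples)
    return round(sum(counts.values()) / len(counts))


def undersample(samples: List[Sample], label: str, target: int, seed: int = 0) -> List[Sample]:
    """Randomly keep at most `target` samples of `label`; other classes are untouched."""
    rng = random.Random(seed)  # fixed seed so the retained subset is reproducible
    positions = [i for i, (_, y) in enumerate(samples) if y == label]
    if len(positions) <= target:
        return list(samples)
    keep = set(rng.sample(positions, target))
    return [s for i, s in enumerate(samples) if s[1] != label or i in keep]


# Toy data with made-up counts (not the real dataset statistics).
toy_counts = {"No Finding": 60, "Infiltration": 20, "Effusion": 14, "Pneumonia": 6}
data = [(f"{name}_{i}", name) for name, n in toy_counts.items() for i in range(n)]

target = mean_class_count(data)  # 100 samples / 4 classes = 25
balanced = undersample(data, "No Finding", target)
print(Counter(label for _, label in balanced))
```

Only the majority class shrinks; the minority classes keep every sample, so the gap narrows without discarding rare findings.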

Figure 5. Images showing types of chest x-ray diagnoses.

In Figure 5, the sample images show the different types of thoracic pathologies with their associated labels. For data transformation, the Keras flow_from_dataframe method was used to convert the categorical labels into binary vectors by one-hot encoding. One-hot encoding transforms categorical variables into a form that machine learning algorithms can process (i.e., vectors of 1s and 0s). Afterward, the images were resized to 112 × 112 pixels to match the input layer of the pre-trained model and to reduce overhead during training, thereby improving the overall learning speed. Additional pre-processing steps included normalization and standardization of the training and validation datasets.
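The label and pixel transformations described above can be illustrated without any framework. The sketch below is not the Keras pipeline used in this work; it is a plain-Python approximation, and the helper names, the three-class subset, and the sample values are assumptions chosen for illustration.

```python
from typing import List, Sequence


def one_hot(labels: Sequence[str], classes: Sequence[str]) -> List[List[int]]:
    """Encode each categorical label as a binary vector with a single 1."""
    index = {name: i for i, name in enumerate(classes)}
    vectors = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        vectors.append(row)
    return vectors


def rescale(pixels: Sequence[int]) -> List[float]:
    """Normalization: map 8-bit intensities from [0, 255] to [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]


def standardize(values: Sequence[float]) -> List[float]:
    """Standardization: shift to zero mean and scale to unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5 or 1.0  # avoid dividing by zero on constant input
    return [(v - mean) / std for v in values]


classes = ["Hernia", "Mass", "Nodule"]       # a small illustrative subset
print(one_hot(["Mass", "Hernia"], classes))  # [[0, 1, 0], [1, 0, 0]]
print(rescale([0, 51, 255]))                 # [0.0, 0.2, 1.0]
```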

3.2. Deep CNN Phase

For our experimental setup, we developed our model using stacked convolutional, pooling, and fully connected layers. Next, we prepared our novel deep model by training it on the CIFAR-10 dataset so that it could serve as a pre-trained network. Many CNN architectures have been proposed in the literature for different deep learning tasks [43]. In this work, we use our novel architecture with additional structural and functional improvements, obtained by introducing a DenseNet module (DIM module) into the network. The DIM modules provide an efficient yet simple block that maximizes the multiscale representations of learned features.

3.3. Implementation Details

We implemented our proposed model in Python using Keras [59], a deep learning framework, with TensorFlow [17] as the backend. Following prior work, we trained our network using stochastic gradient descent (SGD) with a weight decay of 0.0001, a momentum of 0.9 [45, 60], and a mini-batch size of 1024. Although the Adam and RMSprop optimizers tend to converge faster, SGD was selected for its better generalization properties [61]. All experiments were conducted using model and data parallelism. The setup included a multi-GPU environment running the Windows 10 operating system, with dual NVIDIA TITAN V 12GB/16GB GPUs for training and transfer learning and an Intel Core i7 processor. Additionally, more rigorous training of the proposed method was carried out on the Lawrence supercomputer, which runs the CentOS Linux operating system and comprises over 2,000 CPU cores, 1.5TB of memory, and multiple GPU accelerators (2x NVIDIA Tesla P100 16GB/1x NVIDIA Tesla V100 32GB). On this GPU platform, we pre-trained a DIM network on CIFAR-10 and re-trained it on the Chest X-ray dataset using fine-tuning approaches. The transfer learning approach involved jointly training the pre-trained model with our new classifier on Chest X-ray images and then fine-tuning the higher layers to find the optimal cut-off layers. Moreover, we used an adaptive learning schedule in which training ran for 200 epochs and the learning rate decreased by 5% every 10 epochs.
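As a rough illustration of the optimizer settings and learning schedule, the sketch below implements one common reading of them: a step decay that multiplies the learning rate by 0.95 every 10 epochs, and a single SGD update with momentum 0.9 and L2 weight decay 0.0001 on a scalar weight. The initial learning rate of 0.1 is a hypothetical value, since the text does not state one, and the function names are illustrative.

```python
from typing import Tuple


def step_decay(initial_lr: float, epoch: int, drop: float = 0.05, every: int = 10) -> float:
    """Learning rate after reducing it by `drop` (5%) once per `every` (10) epochs."""
    return initial_lr * (1.0 - drop) ** (epoch // every)


def sgd_momentum_step(weight: float, grad: float, velocity: float, lr: float,
                      momentum: float = 0.9, weight_decay: float = 1e-4) -> Tuple[float, float]:
    """One SGD update with momentum; weight decay is added to the gradient as an L2 term."""
    grad = grad + weight_decay * weight
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


initial_lr = 0.1  # assumed starting value, not taken from the chapter
for epoch in (0, 10, 100, 199):
    print(epoch, round(step_decay(initial_lr, epoch), 6))

w, v = sgd_momentum_step(weight=1.0, grad=0.5, velocity=0.0, lr=initial_lr)
print(round(w, 5), round(v, 5))
```

Under this reading, the rate at the last of the 200 epochs is the initial rate multiplied by 0.95 nineteen times, or roughly 38% of its starting value.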

4. Experiment Results

This section discusses the transfer learning approach, experiments, and metrics used to measure the performance of the proposed DINET. Section 4.1 explains the transfer learning approach selected for the medical image classification task. Section 4.2 and section 4.3 describe the experiments and evaluation used in this work. The results are presented in Section 4.4. We trained and fine-tuned three versions of the DIM network as shown in Table 2. Notably, all the experiments were fine-tuned based on the pre-trained model trained on the CIFAR-10 dataset.

4.1. Transfer Learning Approach

In the suggested approach, the lower-level layers of the DCNN encode low-level features such as colors, visual edges, contours, shapes, and textures, which are universal across most image recognition problems. The higher-level layers, by contrast, detect more complex, abstract concepts and objects (such as “human eye” or “human ear”). Therefore, the lower-level layers were trained in a feed-forward fashion to discover the universal features that are reused when training on the medical images (target domain).

In this work, we adopted transfer learning with fine-tuning, a popular strategy for model reuse in which the weights of the pre-trained model are adjusted by unfreezing the higher-level layers of the network for training on the medical image classification task. The common practice is to remove the last layer (SoftMax layer) of the pre-trained model and replace it with a new SoftMax layer relevant to the problem under investigation. The new classifier is then retrained on top of the network with the new dataset (medical images). Another standard practice is to freeze the weights of the lower-level layers of the pre-trained network, because the basic features relevant to our problem, such as edges and curves, are already captured there. By unfreezing the higher-level layers of the pre-trained network for training, or jointly training them with the classifier, and continuing with backpropagation, the network can focus on learning the data-specific features of our medical images. Figure 6 illustrates the transfer learning workflow for common practices when training a pre-trained network [62].
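The freeze-and-replace workflow can be sketched in a framework-agnostic way. The `Layer` class, the helper names, the five-layer toy network, and the 14-class head below are all hypothetical stand-ins; in Keras the analogous switch is each layer's `trainable` attribute, but this sketch does not reproduce the actual DINET code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    trainable: bool = True


def replace_classifier(layers: List[Layer], num_classes: int) -> List[Layer]:
    """Drop the old SoftMax head and append a new, trainable one for the target task."""
    return layers[:-1] + [Layer(f"softmax_{num_classes}", trainable=True)]


def apply_cutoff(layers: List[Layer], cutoff_name: str) -> List[Layer]:
    """Freeze every layer below `cutoff_name`; leave it and all higher layers trainable."""
    names = [layer.name for layer in layers]
    if cutoff_name not in names:
        raise ValueError(f"unknown layer: {cutoff_name}")
    cut = names.index(cutoff_name)
    for i, layer in enumerate(layers):
        layer.trainable = i >= cut
    return layers


# Toy pre-trained network: four convolutional layers and a 10-class head.
network = [Layer(f"conv_{i}") for i in range(1, 5)] + [Layer("softmax_10")]
network = replace_classifier(network, num_classes=14)
network = apply_cutoff(network, "conv_3")
print([(layer.name, layer.trainable) for layer in network])
```

Moving the cut-off name up or down the stack changes how much of the pre-trained representation is reused as-is and how much is adapted to the target images.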

4.2. Experiments

We set up several experiments, training the pre-trained model for the three DINET versions, in which the DIM modules were strategically positioned at different parts (top, middle, and bottom) of the proposed architecture. The original size of the CIFAR-10 images was 32 × 32; the images were upsampled to 112 × 112. Similarly, the medical images were resized and cropped to 112 × 112. For data augmentation, we employed the following variations: rescaling, rotation, width and height shifts, zoom, horizontal flip, and fill mode.

In experiment (1), the DIM ver1 network (top part) was pre-trained on the CIFAR-10 dataset (source domain) in preparation for learning our medical images (target domain). Once the model had been trained to convergence, its performance and trained weights were saved to disk in preparation for transferring the model to the problem of interest (medical image classification). The fully connected layers of the pre-trained model were truncated, and a new classifier was defined on top of the network. We used dropout regularization with a ratio of 0.25 in the fully connected layers and created a SoftMax layer in the new classifier [63]. We also added a batch normalization layer, consistent with current practice, to accelerate learning during the training of the network. Next, the new classifier was trained with the medical image dataset to ensure that minimal error signal was propagated throughout the network during the training process. Lastly, the fine-tuning step was applied: several blocks of the pre-trained network (higher layers) were unfrozen and retrained on our medical images while the rest of the network remained frozen. The process was iterated until the best-performing OCL was identified.

In experiment (2), we experimented with DIM ver2 and DIM ver3. First, for DIM ver2, the DIM module was positioned in the middle of the network, and the training procedure of experiment (1) was followed. Second, for DIM ver3, the DIM module was positioned at the bottom of the network, and the same training procedure was used. The results of both experiments are shown in Table 2. In summary, the main application of our novel method is the use of transfer learning systems in the diagnosis of medical images, more specifically Chest X-ray images.
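The iteration over candidate cut-off layers can be sketched as a simple search loop. Here `fine_tune_and_score` is a hypothetical callback that would fine-tune the network from the named layer upward and return its validation accuracy; so that the example runs without a GPU, it is stubbed with the DIM v1 accuracies reported in Table 2. The candidate ordering assumes that higher layer indices lie closer to the output.

```python
from typing import Callable, Dict, Sequence, Tuple


def find_ocl(candidates: Sequence[str],
             fine_tune_and_score: Callable[[str], float]) -> Tuple[str, Dict[str, float]]:
    """Score each candidate cut-off layer and return the best one with all scores."""
    scores: Dict[str, float] = {}
    for layer_name in candidates:
        scores[layer_name] = fine_tune_and_score(layer_name)
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Stub standing in for real training runs: DIM v1 accuracies from Table 2.
reported_accuracy = {"Conv_155": 0.6851, "Conv_154": 0.6851, "Conv_152": 0.6871}

best_layer, all_scores = find_ocl(["Conv_155", "Conv_154", "Conv_152"],
                                  reported_accuracy.__getitem__)
print(best_layer)  # Conv_152
```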

4.3. Evaluation Procedures and Techniques

To assess our proposed network, we empirically examined the effectiveness of knowledge transfer through incremental fine-tuning, evaluating how learned feature representations mitigate the problem of vanishing gradients and identifying the OCL that achieves performance improvements in the multi-class classification problem. Classification loss (categorical cross-entropy) was also used as an evaluation metric. The network was incrementally fine-tuned, beginning with the top layer and proceeding to the next few upper layers. With the available GPU capabilities, each training scenario took hours, since different higher-level layers of the network were fine-tuned to determine the OCL.
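For reference, the two metrics reported later in Table 2 can be computed as in the minimal sketch below, which averages the categorical cross-entropy over a batch of one-hot targets and measures top-1 accuracy. The function names, the clipping constant `eps`, and the toy predictions are assumptions made for illustration.

```python
import math
from typing import Sequence


def categorical_cross_entropy(y_true: Sequence[Sequence[int]],
                              y_prob: Sequence[Sequence[float]],
                              eps: float = 1e-12) -> float:
    """Mean of -sum(t * log(p)) over the batch; probabilities are clipped at `eps`."""
    total = 0.0
    for targets, probs in zip(y_true, y_prob):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(targets, probs))
    return total / len(y_true)


def accuracy(y_true: Sequence[Sequence[int]], y_prob: Sequence[Sequence[float]]) -> float:
    """Fraction of samples whose highest-probability class matches the one-hot target."""
    hits = sum(1 for targets, probs in zip(y_true, y_prob)
               if list(targets).index(max(targets)) == list(probs).index(max(probs)))
    return hits / len(y_true)


y_true = [[1, 0, 0], [0, 1, 0]]
y_prob = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
print(round(categorical_cross_entropy(y_true, y_prob), 4))
print(accuracy(y_true, y_prob))  # 1.0
```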

4.4. Results

The results reveal that when our proposed model was trained using variants of DIMx modules strategically positioned in the network, DIM v1 produced better performance while alleviating the problem of vanishing gradients.

Table 2. Performance of different DIMx versions

DIMx Version   Fine-tuning layer   Epochs   Parameters   Accuracy   Loss
DIM v1         Conv_152            200      6,195,615    0.6871     1.7961
DIM v1         Conv_154            200      6,195,615    0.6851     1.8665
DIM v1         Conv_155            200      6,195,615    0.6851     1.8076
DIM v3         Conv_264            200      6,004,095    0.5899     1.70

The classification accuracy that gave the best bias-variance trade-off was 68.71%. After each fine-tuning step, we identified the OCL (see Table 2) that gave the best model performance while balancing overfitting against underfitting. We believe these results are significant and promising for medical image tasks: they show that, with correct hyper-parameter tuning of our proposed model, transfer learning approaches can produce respectable performance while mitigating the problem of vanishing gradients.

5. Discussion

Leveraging transfer learning to reuse the weights of pre-trained models and re-train the networks on a new problem of interest has received increased attention in the medical imaging domain. TL is currently very attractive in deep learning because it makes it possible to train deep neural networks where very large labelled datasets, such as medical images, remain scarce. Fine-tuning pre-trained models on similar datasets has nevertheless proven effective in addressing these data challenges. The proposed deep model has shown that it can increase the learned characteristics of the medical image dataset, thereby avoiding the effort of training the network from scratch. In particular, the transfer learning approach we used enhanced the robustness of the model by alleviating the problem of vanishing gradients, thereby improving the convergence of the model. Here, convergence means that no significant change in the error and no further increase in performance are observed. Another observation was that retraining a few of the batch normalization layers jointly with the new classifier improved accuracy and reduced overfitting. We also noted that introducing a batch normalization layer on top of our classifier made learning more efficient, leading to better model performance. This finding can be observed in the experiments on DIM v1, where we achieved better results than with all the other DIMx versions while mitigating the problem of vanishing gradients.

Extracting knowledge from models pre-trained on natural images seems a viable strategy compared with the traditional method of training from scratch, especially in instances where large labelled domain-specific datasets are limited. Recent studies that pre-trained Inception V3 on natural images and fine-tuned it on medical data demonstrated the success of this approach. In this work, we also showed that transfer learning from nonmedical images that are considerably different from Chest X-rays can be successfully employed to learn or generalize contextual information from medical images and achieve comparable classification performance. Therefore, this approach to transfer learning could be a viable solution when domain-specific datasets and computational resources are limited or severely constrained.

Conclusion

In this work, we proposed a novel deep model called the “Dense Block-Inception Network” (DINET) to demonstrate the effectiveness of transfer learning systems on medical images and to address the scarcity of domain expertise in the medical field, a major challenge in medical image analysis. The innovation of DINET is its ability to combine the parameter efficiency of Dense Blocks with the feature map extraction abilities of Inception networks while mitigating the vanishing gradient problem. The experiments have empirically confirmed that performance improvements can be achieved when DIM modules are strategically positioned in the network. The proposed deep model was able to classify medical images after pre-training on non-medical image data, using a supervised inductive transfer learning approach. Furthermore, we demonstrated that, with transfer learning, nonmedical data can be leveraged to train and fine-tune deep neural networks to extract feature maps from medical images. However, there is room for improvement, for example by combining features extracted with multi-task and multi-source approaches and then jointly training with the medical images. The results achieved by our proposed model are promising for the classification of thoracic diseases. Healthcare practitioners such as radiologists can potentially benefit from these DCNN models, which can provide an automatic diagnosis of diseases, thereby saving time, reducing the workload required to analyze medical images, and allowing practitioners to shift their focus to other high-risk medical images.

Finally, the findings in this work have implications for theory and practice. First, our findings confirm that transfer learning systems can be effective for medical image tasks when leveraging models pre-trained on natural images. This was successfully established by the authors in [38], who demonstrated the benefits of transfer learning systems in transferring learned representations from natural images that are disparate from medical datasets. Second, while transfer learning systems can offer satisfactory performance with the correct amount of hyper-parameter tuning, applying different kernel tricks could potentially help to extract relevant contextual information and discriminative features for solving various medical image recognition tasks. To this end, we have demonstrated that pre-trained models in combination with transfer learning are a relevant strategy for generating learned representations of Chest X-ray images and can potentially overcome the limited availability of large labelled (domain-specific) data.

Limitations

Despite the promising performance of our proposed deep model, this work has limitations. Training deep neural networks often requires very large datasets, which in most cases are difficult to access or acquire. While transfer learning may reduce the need for large amounts of labelled data, training still requires considerable time and technology infrastructure (e.g., more GPUs) to evaluate a robust model. Although we achieved the research objectives of this work, the CheXpert dataset was not used to further evaluate the generalizability of our deep model because of memory limitations and time constraints. Data imbalance and data quality were another observed limitation. A noisy dataset can negatively impact the performance of deep learning models. Moreover, unbalanced data may lead to biases, especially during the learning process. Despite our efforts to use available techniques to address the class imbalances, the data quality may have affected the ability of the model to accurately classify all the thoracic pathologies. Researchers widely believe that the medical image dataset has several problematic issues relating to unlabeled images and noise, which could prevent deep neural networks from achieving the desired performance. To motivate future research advancements, other datasets could be made available to support the evaluation of the performance and generalizability of different DCNN architectures and of the effectiveness of transfer learning systems.

Future Outlook

As mentioned previously, this work focused on the implementation and evaluation of a novel deep model to evaluate the effectiveness of transfer learning systems on medical images. Future work may focus on improving the effectiveness of transfer learning by training with more data to improve the discriminative features and increase learned interactions on new data for solving medical image recognition tasks. Potential improvements may also come from different class balancing techniques, such as the Synthetic Minority Oversampling Technique (SMOTE), to address the under- and over-representation of classes present in many deep learning datasets. In future research, we may consider other approaches to transfer learning, such as multi-task learning [24, 51], multi-source learning [64], and multi-source domain adaptation [65], to examine in more depth the effectiveness of cross-domain transfer of features and of fine-tuning pre-trained models on other classes of deep neural networks. In the literature survey, we observed limited application of unsupervised transfer learning, which could be another attractive research area. Moreover, the impact of data quality and dataset size on transfer learning systems may be investigated further to measure the effectiveness of the underlying functions from the input (source domain) to the outputs (target domain) [66-75]. Ultimately, we believe that input from domain experts should be included in the evaluation efforts to enhance the design, development, and implementation of deep neural networks for solving medical imaging problems. Finally, although some of these issues continue to exist, we believe our findings may create and drive opportunities that motivate future research directions on medical image tasks.

References

[1] Sindhwani, N., Verma, S., Bajaj, T., & Anand, R. (2021). Comparative analysis of intelligent driving and safety assistance systems using YOLO and SSD model of deep learning. International Journal of Information System Modeling and Design (IJISMD), 12(1), 131-146.
[2] Jain, S., Kumar, M., Sindhwani, N., & Singh, P. (2021, September). SARS-Cov-2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.
[3] Chaudhary, A., Bodala, D., Sindhwani, N., & Kumar, A. (2022, March). Analysis of Customer Loyalty Using Artificial Neural Networks. In 2022 International Mobile and Embedded Technology Conference (MECON) (pp. 181-183). IEEE.
[4] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., & Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.
[5] Pandey, D., & Pandey, B. K. (2022). An Efficient Deep Neural Network with Adaptive Galactic Swarm Optimization for Complex Image Text Extraction. In Process Mining Techniques for Pattern Recognition (pp. 121-137). CRC Press.
[6] Pandey, B. K., Pandey, D., Wariya, S., Aggarwal, G., & Rastogi, R. (2021). Deep Learning and Particle Swarm Optimisation-Based Techniques for Visually Impaired Humans' Text Recognition and Identification. Augmented Human Research, 6(1), 1-14.
[7] Jain, S., Sindhwani, N., Anand, R., & Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham.
[8] Pandey, B. K., Pandey, D., Wairya, S., & Agarwal, G. (2021). An advanced morphological component analysis, steganography, and deep learning-based system to transmit secure textual data. International Journal of Distributed Artificial Intelligence (IJDAI), 13(2), 40-62.
[9] Sindhwani, N., Anand, R., Shukla, R., Yadav, M., & Yadav, V. (2021). Performance Analysis of Deep Neural Networks Using Computer Vision. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 8(29), e3.
[10] Pandey, B. K., Pandey, D., Wairya, S., Agarwal, G., Dadeech, P., Dogiwal, S. R., & Pramanik, S. (2022). Application of Integrated Steganography and Image Compressing Techniques for Confidential Information Transmission. Cyber Security and Network Security, 169-191.
[11] Sindhwani, N., Rana, A., & Chaudhary, A. (2021, September). Breast Cancer Detection using Machine Learning Algorithms. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.
[12] Pramanik, S., Ghosh, R., Pandey, D., & Ghonge, M. M. (2021). Data Hiding in Color Image Using Steganography and Cryptography to Support Message Privacy. In Limitations and Future Applications of Quantum Cryptography (pp. 202-231). IGI Global.
[13] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.

[14] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[15] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[16] Ker, J., Wang, L., Rao, J., & Lim, T. (2018). Deep Learning Applications in Medical Image Analysis. IEEE Access, 6, 9375–9389.
[17] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[18] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[19] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[20] Gupta, O., & Raskar, R. (2018). Distributed learning of deep neural network over multiple agents. arXiv:1810.06060.
[21] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[22] Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). How transferable are features in deep neural networks? arXiv:1411.1792.
[23] Pan, S. J., & Yang, Q. (2010). A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng., 22(10), 1345–1359.
[24] Caruana, R., Pratt, L., & Thrun, S. (1997). Multitask Learning. Machine Learning, 28, 41–75.
[25] Ramsundar, B., Kearnes, S., Riley, P., Webster, D., Konerding, D., & Pande, V. (2015). Massively Multitask Networks for Drug Discovery. arXiv:1502.02072.
[26] McCann, B., Keskar, N. S., Xiong, C., & Socher, R. (2018). The Natural Language Decathlon: Multitask Learning as Question Answering. arXiv:1806.08730.
[27] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[28] Cook, D., Feuz, K. D., & Krishnan, N. C. (2013). Transfer learning for activity recognition: a survey. Knowl. Inf. Syst., 36(3), 537–556.
[29] Jayapoorani, S., Pandey, D., Sasirekha, N. S., Anand, R., & Pandey, B. K. (2022). Systolic optimized adaptive filter architecture designs for ECG noise cancellation by Vertex-5. Aerospace Systems, 1-11.
[30] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[31] Razavian, A. S., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN Features Off-the-Shelf: An Astounding Baseline for Recognition. In 2014 IEEE Conf. Comput. Vis. Pattern Recognit. Work.

[32] Zhou, B., Li, Y., & Wang, J. (2018). A Weakly Supervised Adaptive DenseNet for Classifying Thoracic Diseases and Identifying Abnormalities. arXiv:1807.01257.
[33] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[34] Oquab, M., Bottou, L., Laptev, I., & Sivic, J. (2014). Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks. In 2014 IEEE Conf. Comput. Vis. Pattern Recognit.
[35] Azizpour, H., Razavian, A. S., Sullivan, J., Maki, A., & Carlsson, S. (2016). Factors of Transferability for a Generic ConvNet Representation. IEEE Trans. Pattern Anal. Mach. Intell., 38(9), 1790–1802.
[36] Huh, M., Agrawal, P., & Efros, A. A. (2016). What makes ImageNet good for transfer learning? arXiv:1608.08614 [cs].
[37] Bar, Y., Diamant, I., Wolf, L., Lie, S., & Greenspan, H. (2015). Chest pathology detection using deep learning with non-medical training. IEEE Xplore, pp. 294–297.
[38] Bruntha, P. M., Dhanasekar, S., Hepsiba, D., Sagayam, K. M., Neebha, T. M., Pandey, D., & Pandey, B. K. (2022). Application of switching median filter with L2 norm-based auto-tuning function for removing random valued impulse noise. Aerospace Systems, 1-7.
[39] Anand, R., Khan, B., Nassa, V. K., Pandey, D., Dhabliya, D., Pandey, B. K., & Dadheech, P. (2022). Hybrid convolutional neural network (CNN) for Kennedy Space Center hyperspectral image. Aerospace Systems, 1-8.
[40] Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., & Summers, R. M. (2017). ChestX-Ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localisation of Common Thorax Diseases. In 2017 IEEE Conf. Comput. Vis. Pattern Recognit.
[41] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. The MIT Press, Cambridge, Massachusetts.
[42] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Commun. ACM, 60(6), 84–90.
[43] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., & Anguelov, D. (2015). Going deeper with convolutions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–9).
[44] Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. arXiv e-prints.
[45] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 770–778).
[46] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely Connected Convolutional Networks. In 2017 IEEE Conf. Comput. Vis. Pattern Recognit.
[47] Hwang, S., & Kim, H.-E. (2016). Self-Transfer Learning for Fully Weakly Supervised Object Localisation. arXiv:1602.01625.

[48] Sedai, S., D. Mahapatra, Z. Ge, R. Chakravorty, & R. Garnavi: Deep multiscale convolutional feature learning for weakly supervised localisation of chest pathologies in X-ray images, arXiv:1808.08280, (2018).
[49] Yan, C., J. Yao, R. Li, Z. Xu, & J. Huang: Weakly Supervised Deep Learning for Thoracic Disease Classification and Localisation on Chest X-rays, Proc. 2018 ACM Int. Conf. Bioinformatics, Comput. Biol. Heal. Informatics, (2018).
[50] Yao, L., J. Prosky, E. Poblenz, B. Covington, & K. Lyman: Weakly Supervised Medical Diagnosis and Localisation from Multiple Resolutions, arXiv:1803.07703, (2018).
[51] Guan, Q., Y. Huang, Z. Zhong, Z. Zheng, L. Zheng, & Y. Yang: Diagnose like a Radiologist: Attention Guided Convolutional Neural Network for Thorax Disease Classification, arXiv:1801.09927, (2018).
[52] Guan, X., J. Lee, P. Wu, & Y. Wu: Machine Learning for Exam Triage, arXiv:1805.00503, (2018).
[53] Pandey, D., Wairya, S., Sharma, M., Gupta, A. K., Kakkar, R., & Pandey, B. K. (2022). An approach for object tracking, categorization, and autopilot guidance for passive homing missiles. Aerospace Systems, 5(4), 553-566.
[54] Kaur, J., Jaskaran, Sindhwani, N., Anand, R., Pandey, D. (2023). Implementation of IoT in Various Domains. In: Sindhwani, N., Anand, R., Niranjanamurthy, M., Chander Verma, D., Valentina, E.B. (eds) IoT Based Smart Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-04524-0_10
[55] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[56] Ioffe, S., & C. Szegedy: Batch normalization: Accelerating deep network training by reducing internal covariate shift, ICML’15: Proc. of the 32nd International Conference on International Conference on Machine Learning. JMLR.org, pp. 448–456, (2015).
[57] Krizhevsky, A.: Learning Multiple Layers of Features from Tiny Images, (2009).
[58] García, S., J. Luengo, & F. Herrera: Data Preprocessing in Data Mining. Springer International Publishing, Cham, (2015).
[59] Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20
[60] Szegedy, C., S. Ioffe, V. Vanhoucke, & A. Alemi: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, arXiv:1602.07261, (2016).
[61] Luo, L., Y. Xiong, Y. Liu, & X. Sun: Adaptive Gradient Methods with Dynamic Bound of Learning Rate, arXiv:1902.09843, (2019), http://arxiv.org/abs/1902.09843.
[62] Transfer Learning, (2020), https://www.mathworks.com/discovery/transferlearning.html.

[63] Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, & R. Salakhutdinov: Dropout: A simple way to prevent neural networks from overfitting, J. Mach. Learn. Res., vol. 15, pp. 1929–1958, (2014).
[64] Christodoulidis, S., M. Anthimopoulos, L. Ebner, A. Christe, & S. Mougiakakou: Multisource Transfer Learning With Convolutional Neural Networks for Lung Pattern Analysis, IEEE J. Biomed. Heal. Informatics, vol. 21, no. 1, pp. 76–84, (2017).
[65] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[66] Sharma, M., Sharma, B., Gupta, A. K., Khosla, D., Goyal, S., & Pandey, D. (2021). A Study and Novel AI/ML-Based Framework to Detect COVID-19 Virus Using Smartphone Embedded Sensors. In Sustainability Measures for COVID-19 Pandemic (pp. 59-74). Springer, Singapore.
[67] Pandey, D., Pandey, B. K., & Wairya, S. (2021). Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Computing, 25(2), 1563-1580.
[68] Jain, N., Chaudhary, A., Sindhwani, N., & Rana, A. (2021, September). Applications of Wearable devices in IoT. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-4). IEEE.
[69] Kaura, C., Sindhwani, N., & Chaudhary, A. (2022, March). Analysing the Impact of Cyber-Threat to ICS and SCADA Systems. In 2022 International Mobile and Embedded Technology Conference (MECON) (pp. 466-470). IEEE.
[70] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[71] Zaidi, A. “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.
[72] Degerine, S. & A. Zaidi, “Sources colorees”, a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes”, Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and analysis into independent components”, Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.
[73] Degerine, S. & A. Zaidi, “Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints”, in SIAM Journal on Optimization, 2007, Vol. 17, No. 4, pp. 997-1014, doi: 10.1137/050622821.
[74] Zaïdi, A. “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices”, in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: 10.1109/LSP.2019.2909651.
[75] Zaïdi, A. “Accurate IoU Computation for Rotated Bounding Boxes in 𝑅2 and 𝑅3”, in Machine Vision and Applications, 32, 114, 2021.

Chapter 9

A Review of the Application of Deep Learning in Image Processing

Harmanpreet Kaur1,*, Reecha Sharma2 and Lakwinder Kaur1
1Department of Computer Science and Engineering, Punjabi University, Punjab, India
2Department of Electronics & Communication Engineering, Punjabi University, Punjab, India

Abstract

Deep learning (DL) is one of the most significant advances in artificial intelligence of the past decades and a critical research area within the field. It has proved particularly effective in image processing, and image processing applications have in turn contributed to the rapid development of network architectures, layer designs, and training approaches. The primary objective of this chapter is to review recent findings and future prospects in deep learning. It begins with a discussion of the three basic deep learning models, namely multilayer perceptrons, convolutional neural networks (CNN), and recurrent neural networks (RNN), before delving deeper into deep learning applications in a variety of artificial intelligence domains, such as speech processing, computer vision, and natural language processing. Challenges and issues in image processing using deep learning are then presented, together with promising future directions.

* Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editor: Digvijay Pandey ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.

Keywords: recurrent neural networks (RNN), convolutional neural networks (CNN), AlexNet, ResNet, VGGNet, long short-term memory network (LSTM)

1. Introduction

Deep learning (DL) is an important area of research in artificial intelligence (AI). AI has many applications, such as speech processing, computer vision, and natural language processing. The first computational model of a neuron was created by Warren McCulloch (a neuroscientist) and Walter Pitts (a logician) [1]. In October 2016, the United States government published the “National Artificial Intelligence Research and Development Strategic Plan”. High-tech companies such as Google, IBM, and Amazon have been increasing their investment in artificial intelligence in the past few years, and an increasing number of artificial intelligence start-up companies are emerging.

In 1958, Rosenblatt introduced the single-layer perceptron, the first generation of neural networks. A neural network of the first generation is capable of identifying basic geometric shapes like triangles and squares [2]. The goal was to design neural networks capable of learning, understanding, and remembering information, but the shortcomings of the first generation prevented this. A perceptron is a single-layer network and can only classify linearly separable problems. Even a simple logic function such as XOR cannot be solved with a perceptron and requires a higher-order classifier. First-generation networks also rely on a fixed feature layer, which is incompatible with the idea of an intelligent machine [3]. The authors in [4] presented a second-generation neural network in 1986, which replaced the single fixed feature layer with multiple hidden layers. Since then, neural networks have been broadly used in a variety of machine learning applications, such as image processing, control, and optimization.

To solve the problem of nonlinear classification, the sigmoid function was used as the activation function and the network was trained with the error back-propagation algorithm. It was noted, however, that back-propagation suffers from the problem of vanishing gradients [5]. In 1995, Cortes and Vapnik developed the support vector machine, and several other shallow machine learning algorithms followed [6]. During this period, research on neural networks largely stagnated. Hinton (2005) studied graphical models in the brain [7]. In 2006, G. E. Hinton developed an auto-encoder to reduce the dimensionality of data [8] and suggested pre-training as an efficient way to train deep belief networks [9] and to overcome the vanishing gradient problem. Bengio et al. showed that the pre-training method is also suitable for unsupervised learning models such as auto-encoders [10]. The authors in [11] used an energy-based model to learn sparse representations efficiently. Together, these works provided the framework for deep learning, which has developed rapidly in recent years.

The US Department of Defense’s DARPA program funded the first deep learning project. Glorot et al. presented the ReLU activation function in 2011, which largely alleviated the vanishing gradient problem [12]. Deep learning made its first major breakthrough in voice recognition: the authors in [13, 14] were the first to use deep learning to reduce the error rate of voice recognition by 20% to 30%. Deep learning then reduced the top-5 error rate on the ImageNet [15] image classification task from 26% to 15% [16]. It has also been shown that local minima are usually not a serious problem in practice, which removed a long-standing concern about training neural networks [17, 18].
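The linear-separability limitation of the single-layer perceptron mentioned above can be illustrated with a small experiment. The following is a minimal sketch, assuming the classical perceptron update rule; the function name `train_perceptron`, the learning rate, and the epoch limit are illustrative choices and are not taken from the cited works. The perceptron learns the linearly separable AND function but never reaches zero errors on XOR.

```python
from itertools import product

def train_perceptron(samples, epochs=50, lr=1.0):
    """Classical perceptron rule on 2-input binary samples (illustrative sketch)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - pred
            if delta != 0:
                errors += 1
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
        if errors == 0:          # every sample classified correctly
            return w, b, True
    return w, b, False           # no error-free pass within the epoch limit

inputs = list(product([0, 1], repeat=2))
and_data = [((a, b), a & b) for a, b in inputs]   # linearly separable
xor_data = [((a, b), a ^ b) for a, b in inputs]   # not linearly separable

and_w, and_b, and_converged = train_perceptron(and_data)
xor_w, xor_b, xor_converged = train_perceptron(xor_data)
print("AND converged:", and_converged)
print("XOR converged:", xor_converged)
```

Because no single line separates the XOR classes, at least one sample is misclassified in every pass, whatever the epoch limit; adding a hidden layer removes this restriction.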

Figure 1. An inception module with dimensionality reduction.

The development history of deep learning is illustrated in Figure 1. Solid circles represent the level of success that deep learning achieved in a given year, whereas hollow circles mark the key turning points in the rise and fall of deep learning. Diagonally ascending straight lines indicate that deep learning was on the rise, while diagonally descending straight lines indicate that it was in decline.

Figure 2. Forward propagation of MLP.

1.1. Basic Network Structure: Multi-Layer Perceptron (MLP)

The multi-layer perceptron (MLP) [2], also known as a forward propagation network or deep feed-forward network, is the most basic deep learning network structure and is shown in Figure 2. It consists of several layers, each consisting of a number of neurons. An MLP that uses a radial basis function as its activation function is known as a radial basis function network. In neural network training, the main objective is to decrease the loss function, and the optimization technique is typically batch gradient descent.
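Forward propagation and batch gradient descent can be sketched in a few lines of NumPy. The example below is only an illustration under stated assumptions: one hidden layer, sigmoid activations, a mean squared error loss, and a toy XOR data set. The layer sizes, learning rate, and function names are hypothetical and do not come from the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Forward propagation through a one-hidden-layer MLP."""
    h = sigmoid(x @ params["W1"] + params["b1"])   # hidden activations
    y = sigmoid(h @ params["W2"] + params["b2"])   # network output
    return h, y

def mse(y, t):
    return float(np.mean((y - t) ** 2))

def gradient_step(x, t, params, lr=0.2):
    """One batch gradient descent step on the mean squared error."""
    h, y = forward(x, params)
    dz2 = 2.0 * (y - t) / y.size * y * (1.0 - y)   # gradient at output pre-activation
    dz1 = (dz2 @ params["W2"].T) * h * (1.0 - h)   # back-propagated to hidden pre-activation
    grads = {"W2": h.T @ dz2, "b2": dz2.sum(axis=0),
             "W1": x.T @ dz1, "b1": dz1.sum(axis=0)}
    return {k: params[k] - lr * grads[k] for k in params}

rng = np.random.default_rng(0)
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)    # XOR targets
params = {"W1": rng.normal(scale=0.5, size=(2, 4)), "b1": np.zeros(4),
          "W2": rng.normal(scale=0.5, size=(4, 1)), "b2": np.zeros(1)}

loss_before = mse(forward(x, params)[1], t)
for _ in range(200):                               # the whole data set is used in every step
    params = gradient_step(x, t, params)
loss_after = mse(forward(x, params)[1], t)
print(loss_before, loss_after)
```

Each step uses the full data set to compute the gradient, which is what distinguishes batch gradient descent from its stochastic and mini-batch variants.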

1.2. Convolutional Neural Network (CNN)

The convolutional neural network (CNN) is widely used in image processing and is closely related to time-delay networks. A CNN is primarily composed of convolutional layers and pooling layers, which are influenced by principles of visual neuroscience. The convolutional layer extracts the local features of the image while preserving its spatial continuity. The pooling layer uses maximum or average pooling. It reduces the dimensionality of the intermediate hidden layers, reduces the amount of computation in subsequent layers, and provides a degree of invariance to small shifts in the input. Figure 3 illustrates the convolution and pooling operations, in which a 3 × 3 convolution kernel is used in conjunction with a 2 × 2 pooling operation. There are approximately 60 thousand trainable parameters in this model [19].
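The two operations described above can be written out directly. The sketch below assumes a single-channel image, no padding, stride 1 for the convolution, and non-overlapping 2 × 2 max pooling; the averaging kernel and the function names are illustrative choices. As in most deep learning frameworks, the "convolution" is implemented as a sliding dot product without flipping the kernel.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3)) / 9.0           # 3 x 3 averaging kernel (illustrative)
features = conv2d_valid(image, kernel)   # 6 x 6 input -> 4 x 4 feature map
pooled = max_pool2d(features)            # 4 x 4 feature map -> 2 x 2 output
print(features.shape, pooled.shape)
```

The shapes show how each stage shrinks the representation: the 3 × 3 kernel removes a one-pixel border, and the 2 × 2 pooling halves each spatial dimension.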

Figure 3. Illustration of convolution and pooling operation.

Figure 4. RNN and its extended form: (a) General form of RNN, (b) and (c) its extended form.

CNN provides hierarchical representations of visual data. Each CNN layer has its own weights and learns different features of the image, and the features become more specific as the layers get deeper. The image is processed layer by layer until the whole object is identified [20]. The second layer of a CNN can identify corners, edges, and colours; the third layer can identify more complex invariances such as texture; the fourth layer can identify class-specific features such as the face of a dog; the fifth layer can identify entire objects such as keyboards or dogs. In the case of face recognition, a CNN first recognizes points, edges, colours, and corners, then the corners of the eyes, the lips, and the nose, and finally the entire face. CNN is easy to implement and accelerate on FPGAs and other hardware [21]. Within a convolutional layer the weights are shared across all image positions, since they are simply the weights of the convolution kernel [22].

1.3. Recurrent Neural Network (RNN)

Recurrent neural networks [4] are suitable for the analysis of time series data and are widely used in speech and natural language processing, since human speech and language are inherently sequential in time. Figure 4 illustrates the RNN and its expanded (unrolled) diagram. The RNN uses the output of the hidden layer at the previous moment as an input of the hidden layer at the present moment, so it can use information from the past and therefore has a memory. As the RNN shares its weights across all time steps, the number of model parameters is greatly reduced. However, RNN training is still difficult, and the authors in [23] proposed an improved RNN training method.
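The recurrence described above, in which the previous hidden state feeds the current one and the same weights are reused at every time step, can be sketched as follows. This is a minimal illustration assuming a simple tanh recurrence; the dimensions and variable names are hypothetical.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h, h0):
    """Run a simple RNN over a sequence, reusing the same weights at every step."""
    h = h0
    states = []
    for x_t in xs:                                  # one time step at a time
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)    # previous hidden state feeds the current one
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
input_dim, hidden_dim, steps = 3, 4, 5
W_xh = rng.normal(scale=0.5, size=(input_dim, hidden_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
xs = rng.normal(size=(steps, input_dim))

states = rnn_forward(xs, W_xh, W_hh, b_h, np.zeros(hidden_dim))
n_params = W_xh.size + W_hh.size + b_h.size   # does not depend on the sequence length
print(states.shape, n_params)
```

Because the parameter count is independent of the number of time steps, the same network can process sequences of any length.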

2. Network Structure Improvements

2.1. Improvement of Convolutional Neural Network

The ILSVRC (ImageNet Large Scale Visual Recognition Challenge) has greatly contributed to the development of convolutional neural networks. Results on ImageNet have improved continuously with each new convolutional neural network, from AlexNet [16] and ZFNet [20] to VGGNet [24], GoogleNet [25], and ResNet [26]. Over this period the number of network layers has steadily increased, and the model capabilities have improved accordingly. AlexNet demonstrated for the first time the powerful capabilities of deep learning. ZFNet is the result of visualizing and interpreting convolutional neural networks. VGGNet demonstrated that network depth can enhance the effectiveness of deep learning. GoogleNet was the first to break with the convention of stacking convolutional and pooling layers in sequence. ResNet was the first to successfully train a neural network with 152 layers. R-CNN [27] is the principal method for applying CNNs to object detection, along with its improvements Fast R-CNN [28], Faster R-CNN [29], and Mask R-CNN [30]. These improvements progressively replace shallow machine learning models with deep learning models, move towards end-to-end training, and increase the speed of the process. In addition, the network-in-network structure [31] creatively proposed the idea of nesting networks within networks. The spatial transformer network [32] demonstrates that the model can be improved without necessarily changing the structure of the network, by transforming the input data instead.

Figure 5. CNNs based architectures model: (i) AlexNet model, (ii) VGGNet model, (iii) ResNet model and (iv) ReLU model.

2.1.1. AlexNet Model

To verify the effectiveness of deep learning, Hinton's group entered the 2012 ILSVRC and won first place. The neural network model used is called AlexNet [16] and is shown in Figure 5. The network consists of five convolutional layers, interleaved with max-pooling layers, followed by three fully connected layers with dropout. There are 1,000 neurons in the output layer, which correspond to 1,000 categories, and the probability of each category is determined using the Softmax function. Dropout is used to avoid overfitting, data augmentation is used to enlarge the training set, and the model is trained using batch gradient descent with momentum and weight decay. AlexNet was trained using two GPUs running in parallel for six days, and ReLU was used as the activation function, which reduced training time by a factor of six compared with Tanh. This series of techniques adopted by AlexNet is still widely used.

VGGNet model: Simonyan et al. successively increased the number of convolutional layers relative to AlexNet, evaluated six different network configurations, and investigated the influence of the depth of the network. The results indicate that a deeper neural network achieves better results: with 16 and 19 layers, performance improves significantly [24]. In VGGNet, the convolution kernel is strictly 3×3 with a stride of 1 and a padding of 1, and 2×2 max-pooling with a step size of 2 is used. In comparison with the 7×7 kernel of ZFNet, VGGNet uses only 3×3 kernels, which reduces the number of model parameters: two successive 3×3 convolutional layers have the receptive field of a 5×5 kernel, and three successive layers have that of a 7×7 kernel. Later networks also adopted the 3×3 kernel. VGGNet is implemented with Caffe, and image jittering is used to augment the training data, which gives good results in both image classification and object recognition.

2.1.2. ZFNet Model

ILSVRC 2013 was won by ZFNet with an error rate of 11.2%. ZFNet is a fine-tuned version of AlexNet, and there are still eight layers in the network. In order to understand the role of each layer of the CNN, Zeiler and Fergus used a de-convolutional network to visualize the CNN [33]. This visualization was used to identify a better network structure, ZFNet, which requires less training data: ZFNet was trained on only 1.3 million images. In the first layer, ZFNet uses a 7×7 convolution kernel, while AlexNet uses an 11×11 kernel. Due to the smaller convolution kernel, ZFNet is able to retain a greater amount of relevant information in the first layer.

GoogleNet: GoogleNet is the champion of ILSVRC 2014 with a top-5 error rate of 6.7%, and it has 22 layers. GoogleNet illustrates that CNNs do not necessarily need to stack convolutional layers and pooling layers in sequence [25]. GoogleNet is built from Inception modules. Within each module, convolutions and a pooling operation are applied in parallel to the same input, so there is no need to specify whether a given layer is a convolutional layer or a pooling layer.
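The saving obtained by stacking small kernels, as discussed for VGGNet above, follows from simple arithmetic. The sketch below assumes stride-1 convolutions, the same number of input and output channels in every layer, and no biases; the channel count of 64 and the function names are illustrative assumptions.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions with equal kernel size."""
    return 1 + num_layers * (kernel_size - 1)

def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count (biases ignored) when every layer maps `channels` to `channels`."""
    return num_layers * kernel_size * kernel_size * channels * channels

channels = 64                               # illustrative channel count
rf_two = stacked_receptive_field(3, 2)      # two 3x3 layers see a 5x5 window
rf_three = stacked_receptive_field(3, 3)    # three 3x3 layers see a 7x7 window
single_7x7 = conv_weights(7, channels)      # 49 * C^2 weights
three_3x3 = conv_weights(3, channels, 3)    # 27 * C^2 weights
print(rf_two, rf_three, single_7x7, three_3x3)
```

Under these assumptions, three stacked 3×3 layers cover the same 7×7 window with roughly 45% fewer weights than a single 7×7 layer, and they add two extra non-linearities.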

2.1.3. Deep Residual Network (ResNet)

The ResNet model is the winner of ILSVRC 2015, where it took first place in three tasks: image classification, object localization, and object detection [26]. Its error rate on the image classification task was 3.57%, which is better than the human error rate of 5.1%. The ResNet network consists of 152 or even 1,000 layers. Networks this deep normally suffer from vanishing gradients, which ResNet mitigates with shortcut connections between layers.

For object detection in computer vision, Girshick et al. proposed R-CNN. Object detection is the process of creating a rectangle around each object in an image. It is separated into two steps: the first is to create a box around the object, and the second is to classify the framed object to identify which object it is [27]. In R-CNN, approximately 2000 boxes are generated, a trained CNN such as AlexNet is used to extract features from the image region in each box, and the features are then passed to an SVM for classification and, at the same time, to a regressor that refines the box.

Fast R-CNN was developed by combining the steps of CNN feature extraction, SVM classification, and regression in R-CNN into an overall model that is fast and accurate [28]. Fast R-CNN takes the entire image and several boxes as input. As a first step, a feature map is created by applying several convolutional layers and pooling layers to the entire image, and a region of interest pooling layer is then applied to each box to create a feature map with a fixed size. In the second step, several fully connected layers are attached to determine the probability of each category and the four values that describe each box.

Faster R-CNN also processes the entire image by first applying convolutional layers and pooling layers. On the resulting feature map, a region proposal network is used to generate boxes [29]. All other operations are performed in a similar way to Fast R-CNN. Faster R-CNN has therefore also replaced the method of generating boxes with a deep learning model, and box generation has moved from the entire image to a smaller feature map, which further increases the speed of the model. The Mask R-CNN method adds a parallel segmentation branch to Faster R-CNN, so that a segmentation task is included alongside the original box generation, classification, and regression tasks and object detection and segmentation are achieved simultaneously.

Furthermore, there are other improvements to convolutional neural networks [33], such as de-convolutional neural networks [34], Stack-convolutional autoencoders [35], SRCNN [36], OverFeat [37], FlowNet [38], MR-CNN [39], FV-CNN [40], DeepEdge [41], DeepContour [42], Deep Parsing Network [43], TCNN [44], 3D-CNN [45], and so on.
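The region of interest pooling step described for Fast R-CNN above turns boxes of different sizes into feature maps of one fixed size. The following is a simplified sketch, not the implementation from [28]: it assumes a single-channel feature map, integer box coordinates given as `(y0, x0, y1, x1)` with exclusive end points, and boxes that are at least as large as the output grid. The function name `roi_max_pool` is hypothetical.

```python
import numpy as np

def roi_max_pool(feature_map, box, output_size=(2, 2)):
    """Max-pool the region `box` = (y0, x0, y1, x1) onto a fixed-size grid."""
    y0, x0, y1, x1 = box
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = output_size
    # Integer bin edges that split the region into out_h x out_w roughly equal bins.
    y_edges = [i * region.shape[0] // out_h for i in range(out_h + 1)]
    x_edges = [j * region.shape[1] // out_w for j in range(out_w + 1)]
    pooled = np.empty(output_size)
    for i in range(out_h):
        for j in range(out_w):
            pooled[i, j] = region[y_edges[i]:y_edges[i + 1],
                                  x_edges[j]:x_edges[j + 1]].max()
    return pooled

feature_map = np.arange(36, dtype=float).reshape(6, 6)
square_box = roi_max_pool(feature_map, (1, 1, 5, 5))   # 4 x 4 region
wide_box = roi_max_pool(feature_map, (0, 0, 3, 5))     # 3 x 5 region
print(square_box.shape, wide_box.shape)
```

Both boxes produce a 2 × 2 output even though their regions differ in size and aspect ratio, which is what allows the subsequent fully connected layers to have a fixed input dimension.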

2.2. Improvement of Recurrent Neural Network

Recurrent neural networks suffer from vanishing or exploding gradients [46] and therefore cannot make use of long-term information from the past. In the case of the sigmoid function, the derivative is a number less than 1, and the multiplication of many derivatives less than 1 leads to vanishing gradients. There are several solutions to this problem, including LSTM [47] and hierarchical RNN [48, 49]. In general, RNNs are only capable of processing one-dimensional data such as time series, and multi-dimensional RNNs [50] have been proposed for the processing of multi-dimensional objects such as images. Natural language processing tasks require the use of context information. Training an RNN is complex and requires a large amount of computation, since gradients have to be calculated repeatedly to achieve high accuracy. The recurrent neural network also lacks reasoning capabilities, so it cannot perform tasks that require reasoning. Memory networks [51] and neural Turing machines [52] can solve this problem by adding memory modules.
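The effect of multiplying many sigmoid derivatives can be checked numerically. The sketch below makes a simplifying assumption: it ignores the weight matrices and considers only the product of activation derivatives along a chain of time steps, evaluated at the point where the sigmoid derivative is largest. The function names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # never exceeds 0.25, reached at z = 0

def chained_gradient(pre_activations):
    """Product of sigmoid derivatives along a chain of time steps (weights taken as 1)."""
    g = 1.0
    for z in pre_activations:
        g *= sigmoid_grad(z)
    return g

best_case_10 = chained_gradient([0.0] * 10)   # 0.25 ** 10
best_case_50 = chained_gradient([0.0] * 50)   # 0.25 ** 50
print(best_case_10, best_case_50)
```

Even in this best case the factor shrinks geometrically with the number of steps, which is why information from many steps in the past contributes almost nothing to the gradient.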

2.2.1. Long Short-Term Memory Network (LSTM)

Long short-term memory networks use LSTM units in place of the neurons in a RNN. In order to manage information flow, input gates, output gates, and forget gates are added, which control what information is written, output, and forgotten [51]. There are two transmission states in the LSTM: the cell state and the hidden state. The cell state changes slowly over time, whereas the hidden state at different times may be very different. The gate mechanism achieves a trade-off between the input at earlier times and the input at the current time. As the sequence is encoded, the focus of the memory is adjusted according to the training goal. A LSTM can remember important information for a long time and forget unimportant information, it can deal with vanishing and exploding gradients, and it performs better than a RNN on longer sequences. The gated recurrent unit (GRU) [53] is a lightweight version of LSTM that has only two gates: the update gate and the reset gate. In the GRU, the update gate controls how much information from the past is kept and how much is taken in from the input layer. Reset gates are similar to forget gates. The GRU has no output gate, so it outputs the entire state. By using fewer connections and parameters, the GRU enables faster and more efficient training.
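The interaction of the three gates with the cell state and the hidden state can be sketched as a single LSTM time step. This is a minimal illustration of the commonly used formulation, assuming one weight matrix `W` that maps the concatenated previous hidden state and current input to the four stacked gate pre-activations; the layout of `W` and the function name `lstm_step` are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [h_prev, x] to four stacked gate pre-activations."""
    n = h_prev.shape[0]
    z = np.concatenate([h_prev, x]) @ W + b
    i = sigmoid(z[0 * n:1 * n])        # input gate: how much new content to write
    f = sigmoid(z[1 * n:2 * n])        # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * n:3 * n])        # output gate: how much cell state to expose
    g = np.tanh(z[3 * n:4 * n])        # candidate cell content
    c = f * c_prev + i * g             # cell state: changes slowly when f is near 1
    h = o * np.tanh(c)                 # hidden state: gated view of the cell state
    return h, c

hidden_dim, input_dim = 3, 2
W = np.zeros((hidden_dim + input_dim, 4 * hidden_dim))
# Biases chosen so that the input gate is closed and the forget gate is open.
b = np.concatenate([np.full(hidden_dim, -50.0), np.full(hidden_dim, 50.0),
                    np.zeros(hidden_dim), np.zeros(hidden_dim)])
c_prev = np.array([0.3, -1.2, 2.0])
h, c = lstm_step(np.ones(input_dim), np.zeros(hidden_dim), c_prev, W, b)
print(c, h)
```

With the forget gate open and the input gate closed, the cell state is carried over unchanged, which is the mechanism that lets an LSTM retain information over long sequences.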

2.2.2. Hierarchical RNN

Hierarchical RNN [49] utilizes the prior knowledge that time dependence is hierarchically structured, so that long-term dependence can be represented by a variable. The hierarchical multi-scale RNN captures the hierarchical structure in a sequence by encoding the time dependence at different time scales. Hierarchical RNNs and multi-scale RNNs can alleviate vanishing gradients.

2.2.3. Bi-Directional RNN

The bi-directional RNN processes the same sequence forwards and backwards, which is equivalent to having two RNNs that feed the same output layer. In order to avoid forming an information loop, there is no connection between the forward and backward hidden layers. In natural language processing tasks, for example, both the preceding and the following text affect a word, and the bi-directional RNN can use this context information because it captures the information before and after the word at the same time. The bi-directional LSTM [50, 51] and the bi-directional GRU [54] have also been proved to be very effective.

2.2.4. Multi-Dimensional RNN

A RNN is suitable for processing one-dimensional data, such as time series, whereas a multi-dimensional RNN [50] can process multi-dimensional data, such as images, videos, and medical images. In a multi-dimensional RNN, the aim is to transform multi-dimensional data into one-dimensional data in a specific order that can be handled by a RNN. It is essential that this ordering maintains the spatial continuity within the multi-dimensional data.
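One simple ordering that keeps spatial continuity when a two-dimensional image is turned into a sequence is a serpentine scan, in which every other row is traversed in reverse. The sketch below is only an illustration of this idea and is not necessarily the scheme used in [50]; the function names are hypothetical.

```python
import numpy as np

def serpentine_flatten(image):
    """Flatten a 2-D array row by row, reversing every other row so that
    consecutive sequence elements are always neighbouring pixels."""
    rows = [row if r % 2 == 0 else row[::-1] for r, row in enumerate(image)]
    return np.concatenate(rows)

def serpentine_coords(height, width):
    """Pixel coordinates in the same visiting order as `serpentine_flatten`."""
    coords = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        coords.extend((r, c) for c in cols)
    return coords

image = np.arange(12).reshape(3, 4)
sequence = serpentine_flatten(image)
coords = serpentine_coords(3, 4)
print(sequence)
```

With a plain row-major scan, the last pixel of one row and the first pixel of the next are far apart in the image; in the serpentine order every pair of consecutive elements is adjacent, so the sequence seen by the RNN has no spatial jumps.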

3. Applications of Deep Learning in Image Processing

3.1. Speech Processing

Deep learning is broadly used in several areas of speech processing [13], on both standard small datasets [55] and large datasets [14]. Speech processing involves two main tasks: speech recognition and speech synthesis. In recent years, deep learning has been widely applied to speech recognition [56]. In 2016, the speech recognition system of Microsoft [57] reached human-level performance for the first time, with a word error rate of 5.9%. Deep learning is also used by major companies to synthesize speech, such as Google [58], Apple [59], and Google DeepMind [60]. IFLYTEK [61] developed a parallelized WaveNet model for the synthesis of speech, and Ping et al. developed a real-time speech synthesis system known as DeepVoice3 [62].

3.2. Computer Vision

Deep learning is broadly used in several areas of computer vision, such as traffic sign detection and classification [63], face recognition [64], face detection [65], image classification [26], multi-scale transform image fusion [60], object detection [29], image semantic segmentation [43], real-time multi-person pose estimation [66], pedestrian detection [67], scene recognition [68], object tracking [44], end-to-end video classification [69], human action recognition in video [70], and so on. Furthermore, there are applications such as automatically colouring black and white images [71], converting graffiti into artistic paintings [72], transferring artistic styles [73], and removing mosaics from images [74]. Additionally, Oxford University and Google DeepMind developed LipNet [75], a system that is able to read lips with an accuracy rate of 93%, well beyond the human level of 52%.

3.3. Natural Language Processing

NEC Labs America [76] first applied deep learning to the field of natural language processing. Presently, when dealing with natural language, word2vec [77] is typically used to convert words into word vectors, which can then be used as word features. Deep learning techniques are broadly used for natural language processing tasks. These include part-of-speech tagging [78], syntactic parsing [79], named entity recognition [80], semantic role labeling [81], learning language models from distributed representations that use only letters [82], Twitter sentiment analysis [83], text classification [84], reading comprehension [85], automatic question answering [78], [86], dialogue systems [87], etc.

Deep learning is also applicable to other fields, such as bioinformatics, where it is used for predicting drug activity [88], and the prediction of human eye fixation locations [39]. Because of the large amount of data available, deep learning has many applications in finance, such as financial market forecasting [89], securities investment portfolios [90], and insurance loss forecasting [91], and a number of financial technology start-ups have emerged as a result. In addition, a real-time power generation dispatching algorithm based on deep learning can reduce total pollutant emissions while still meeting the real-time power generation task, and thus achieve the goals of energy conservation and emission reduction. Deep learning can also be used to diagnose problems with electric submersible plunger pumps, prevent malfunction accidents, and extend inspection intervals. Furthermore, deep learning is capable of achieving very good accuracy when applied to the modeling of strongly nonlinear and complex chemical processes.

4. Existing Problems and Future Directions of Deep Learning

Deep learning has made significant advances in a number of fields. However, there are still open issues that researchers should focus on resolving, and this section outlines both the problems and possible solutions to them. The problems are classified into four types: training problems, deployment problems, functional problems, and domain problems. Training problems include issues such as the long training time of deep learning models. Deployment problems refer to issues that limit the practical implementation of deep learning. Functional problems refer to tasks that deep learning currently cannot accomplish well. Problems in specific domains, such as computer vision and natural language processing, are referred to as domain problems.

4.1. Training Problem

Large amount of computing power and training time: To train a model on the WMT'14 data set, Google [92] used 96 K80 GPUs for 6 days, and fine-tuning the model took a further 3 days. Because a model is rarely trained only once and various hyper-parameters must be adjusted, the total training time is very long, and the cost of 96 K80 GPUs is also very high. Running deep learning models requires a lot of resources, and the training process takes too long. To accelerate training, new hardware, algorithms, and system designs are required. On the hardware side, for example, the GPU is not the only option: there is also the FPGA, an efficient, high-performance chip with low power consumption, and the ASIC, which is tailored to specific requirements. Google developed the TPU chip and Cloud TPU to support its TensorFlow deep learning framework. ASICs are also used in autonomous driving applications. NVIDIA has released the Drive PX Pegasus AI processor for L5 fully autonomous driving.
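As a rough worked example of the cost described above, the figures quoted for the WMT'14 model can be converted into GPU-hours. The sketch assumes that all 96 GPUs are busy for the whole of both periods, which the text does not state, and the helper name `gpu_hours` is illustrative.

```python
def gpu_hours(num_gpus, days):
    """GPU-hours consumed if every GPU is busy for the whole period."""
    return num_gpus * days * 24

training = gpu_hours(96, 6)      # initial training run
fine_tuning = gpu_hours(96, 3)   # assumes the same 96 GPUs are used
total = training + fine_tuning

print(training, fine_tuning, total)  # 13824 6912 20736
```

Under this assumption a single training and fine-tuning cycle consumes about twenty thousand GPU-hours, before any repeated runs for hyper-parameter adjustment are counted.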

4.1.1. The Gradient Disappearance Problem

Deep models are very difficult to train, and models with too many network layers suffer from vanishing gradients. Several training methods, techniques, and network structures can alleviate the problem. Using ReLU [12] instead of Sigmoid as the activation function, together with Dropout and batch normalization [93], can reduce gradient disappearance. Highway networks introduce a carry gate and a transform gate into the data flow, so that the output combines the unchanged input with its transformation. LSTM networks use a gate mechanism, adding input gates, output gates, and forget gates that control how much information is passed on. ResNet [26] adds a linear connection across two or more layers so that gradients can be transmitted to the bottom layers through the linear path.

4.1.2. Use Large-Scale Labelled Training Datasets

Training deep learning models requires large labelled datasets. Labelling data manually is difficult, time consuming, and costly; in some fields, such as incurable diseases, it is nearly impossible to collect sufficient labelled data. Hence, unsupervised learning is an important direction for deep learning research. To this end, Goodfellow et al. proposed the generative adversarial network [94], and Microsoft proposed a new learning paradigm [95].

4.1.3. Distributed Training Problem

A deep learning model cannot be trained adequately on a single computer when there is too much training data or when the model is too large to fit. Consequently, large-scale distributed training of deep learning models is required. Distributed parallel training can be classified into three types: data parallel, model parallel, and hybrid parallel. In hybrid parallelism, data parallelism is combined with model parallelism. Data parallelism can be used across multiple machines, whereas model parallelism can also be used on a single machine. Currently, data parallelism is the most widely used technique in distributed systems. Data parallelism has several variants: parameter averaging or update-based methods, synchronous or asynchronous updates, and centralized or distributed synchronization. In the parameter averaging method, the parameters of each node are transmitted to a parameter server, and the global parameters are then obtained by averaging them. The update method transmits not the parameters but the parameter updates. Microsoft [96] proposed an asynchronous update method with delay compensation. Strom removed the central parameter server and distributed parameter updates evenly among the nodes; because the updates exchanged between nodes are highly compressed, network communication was reduced by three orders of magnitude [97].
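The difference between the parameter averaging method and the update method can be sketched on a toy problem. The example below simulates four worker nodes in a single process with NumPy; the linear model, the synthetic data, the learning rate, and the function names are all assumptions made for illustration, and no real parameter server or network communication is involved.

```python
import numpy as np

def local_gradient(w, X, y):
    """Gradient of mean squared error for a linear model on one data shard."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def parameter_averaging_step(w, shards, lr):
    """Each worker updates its own copy; the server averages the parameters."""
    local_params = [w - lr * local_gradient(w, X, y) for X, y in shards]
    return np.mean(local_params, axis=0)

def update_averaging_step(w, shards, lr):
    """Each worker sends only its update; the server averages and applies it."""
    updates = [-lr * local_gradient(w, X, y) for X, y in shards]
    return w + np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
shards = []
for _ in range(4):                       # four simulated worker nodes
    X = rng.normal(size=(32, 2))
    shards.append((X, X @ true_w))       # noise-free targets

w_avg = np.zeros(2)
w_upd = np.zeros(2)
for _ in range(200):                     # synchronous rounds
    w_avg = parameter_averaging_step(w_avg, shards, lr=0.05)
    w_upd = update_averaging_step(w_upd, shards, lr=0.05)

print(w_avg, w_upd)
```

With synchronous rounds and one local step per round, as here, the two methods give the same result, because averaging `w - lr * g_i` over workers equals `w - lr * mean(g_i)`. They diverge once workers take several local steps between synchronizations or update asynchronously, which is where the choice between them matters.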

4.2. Landing Problem

4.2.1. Too Many Hyper-Parameters

Deep learning involves many hyper-parameters and design choices that must be considered, such as how data is collected, generated, selected, and divided; the network structure (MLP, CNN, RNN); the number of layers and the number of neurons in each layer; weight initialization; weight decay; momentum; learning rate; longing radiation; dropout; the number of iterations; batch size; whether to use SGD or Adam; and the best way to aggregate distributed data. The hyper-parameters can be learned by another neural network in order to automate neural network design. To accelerate convergence, DeepMind [25] uses the learning-to-learn algorithm in conjunction with another network. Google has released Cloud AutoML, which can find optimal network structures automatically. To optimize the network structure, Han Honggui et al. proposed dynamically adding or deleting hidden-layer neurons using a competition mechanism. Zhang et al. proposed a method for designing multi-layer adaptive modular neural networks.

4.2.2. Reliability is Insufficient

Although deep learning achieves high average accuracy, it may still produce poor predictions in some test cases [98-100]. High-reliability applications that cannot tolerate abnormal results include unmanned vehicles, remote surgery, satellite launches, etc. Therefore, it is essential to improve the reliability of deep learning, so that its failures are less severe and its range of applications can be expanded.
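The gap between high average accuracy and poor behaviour on particular cases can be illustrated with a small made-up example. The labels and predictions below are invented for illustration, and the helper names `overall_accuracy` and `per_class_recall` are assumptions rather than part of any cited work.

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """Fraction of all examples predicted correctly."""
    return float(np.mean(y_true == y_pred))

def per_class_recall(y_true, y_pred):
    """Fraction of each class's examples that were predicted correctly."""
    return {int(c): float(np.mean(y_pred[y_true == c] == c))
            for c in np.unique(y_true)}

# Made-up labels: 95 routine cases (class 0) and 5 rare cases (class 1).
y_true = np.array([0] * 95 + [1] * 5)
# A hypothetical model that gets every routine case right but only 1 of 5 rare cases.
y_pred = np.array([0] * 95 + [1, 0, 0, 0, 0])

print(overall_accuracy(y_true, y_pred))   # 0.96
print(per_class_recall(y_true, y_pred))   # {0: 1.0, 1: 0.2}
```

An average accuracy of 96% hides the fact that four out of five rare cases are missed, which is the kind of failure that matters in high-reliability applications.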

4.2.3. Poor Interpretability

Deep learning is a black-box algorithm that is difficult to interpret. Deep learning models are typically complex and include millions of parameters. If a deployed model has a serious impact on a particular user, it is impossible to determine which parameter is at fault, so no individual parameter can be adjusted to solve that user's problem. Combining deep learning with symbolic learning and interpretable algorithms can provide powerful expressive capability together with a certain degree of interpretability [101-103]. Microsoft uses the Graph Learning Machine to combine machine learning and knowledge graphs. Wu et al. [104] proposed combining deep learning with an interpretable decision tree algorithm, improving the interpretability of deep learning through tree regularization.

4.2.4. Model Size Is Too Large

Deep learning models can be so large that they do not fit on a GPU and cannot be used on mobile devices. Language models in particular have a large vocabulary and a large number of output neurons, resulting in a very large model. It is therefore necessary to compress the model while preserving accuracy. Several types of compression methods are used. Compact convolutional filters reduce storage and computation costs by giving the convolutional filters a special structure; knowledge distillation extracts the knowledge of a large model in order to train a smaller, more compact model.
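The idea of transferring a large model's knowledge to a smaller one, usually called knowledge distillation, can be sketched as a loss that pulls the small (student) model's output distribution toward the large (teacher) model's. The chapter does not give a formula, so the version below, a KL divergence between temperature-softened softmax outputs, is one common formulation chosen as an assumption; the function names and the temperature value are illustrative.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL divergence KL(teacher || student) between softened outputs."""
    p = softmax(teacher_logits, temperature)   # soft targets from the large model
    q = softmax(student_logits, temperature)   # predictions of the small model
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
student = np.array([[2.0, 1.5, 0.5], [0.5, 1.0, 0.3]])

print(distillation_loss(student, teacher))   # positive: the student differs
print(distillation_loss(teacher, teacher))   # zero: the outputs are identical
```

In practice this term is typically added to the ordinary supervised loss on the true labels, so that the small model learns from both the data and the large model's softened predictions.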

4.3. Functional Problem

Deep learning cannot learn from small samples as humans do: it requires large numbers of training samples and its sample utilization rate is relatively low, whereas other machine learning methods require fewer samples. There is currently no unified framework for providing domain prior knowledge to deep learning algorithms. By combining deep learning with knowledge graphs, logical reasoning, symbolic learning, etc., and making use of both the available data and the available knowledge, it may become possible to learn from small samples as humans do.


4.3.1. Lack of Ability to Solve Logical Problems

Currently, deep learning is effective only in perceptual tasks, such as image classification and speech recognition. Lacking the ability to think logically, it cannot perform tasks that require logical reasoning. Deep learning algorithms can be improved by adding logical reasoning, knowledge graphs that store knowledge, and storage modules such as neural Turing machines [105] and memory networks [106].

4.3.2. Small Data Challenges

Deep learning can extract image features automatically, which makes it appropriate for tasks that require feature extraction, such as image recognition. However, for some tasks, such as predicting insurance customer churn, effective features can be extracted from the training data manually. On such small-data problems with existing features, DL is not as effective as GBDT, XGBoost, and LightGBM [107], and can be even less effective than ordinary shallow machine learning algorithms.

4.3.3. Unable to Handle Multiple Tasks Simultaneously

The human brain can recognize speech, recognize images, and understand text at the same time. DL models, by contrast, are currently trained on a particular data set to complete a particular task, and the trained model can perform only that task. Neural networks with different functions could be connected into one larger neural network in order to accomplish multiple tasks and move toward general artificial intelligence. Google [108] used a sparse gating matrix and back-propagation to combine multiple multilayer perceptron sub-networks into a single network.

4.3.4. Ultimate Algorithm

Machine learning can be categorized into five schools: symbolism, connectionism, evolutionism, Bayesianism, and analogism. Deep learning belongs to connectionism. It is highly extensible and can serve as a basis for incorporating the other schools into a master algorithm that combines all five [109]. An OpenAI [110] study showed that evolutionary algorithms can be more effective than back-propagation for training deep reinforcement learning models. A number of probabilistic modeling and inference libraries have been open-sourced, including Edward by Google [111], ZhuSuan by Tsinghua University [112], and Pyro by Uber and Stanford, all of which show that the Bayesian approach can be combined with deep learning. Furthermore, deep learning can also be combined with generalized linear models such as logistic regression [91].
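The evolutionary alternative to back-propagation mentioned in Section 4.3.4 can be illustrated with a simplified evolution-strategy loop that improves parameters using only fitness evaluations. This is a toy sketch and not the OpenAI implementation: the quadratic objective, the mirrored-sampling estimator, the hyper-parameter values, and the names `evolution_strategy` and `toy_fitness` are all assumptions chosen for illustration.

```python
import numpy as np

def evolution_strategy(fitness, theta, sigma=0.1, lr=0.05,
                       pairs=50, steps=300, seed=0):
    """Maximize `fitness` using only function evaluations (no back-propagation)."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta, dtype=float)
    for _ in range(steps):
        noise = rng.normal(size=(pairs, theta.size))
        # Mirrored sampling: evaluate each random perturbation and its negation.
        diffs = np.array([fitness(theta + sigma * n) - fitness(theta - sigma * n)
                          for n in noise])
        # Weighted sum of perturbations approximates the fitness gradient.
        grad_estimate = noise.T @ diffs / (2.0 * pairs * sigma)
        theta = theta + lr * grad_estimate
    return theta

target = np.array([1.0, -2.0, 0.5])

def toy_fitness(w):
    """Toy objective with its maximum at `target`."""
    return -float(np.sum((w - target) ** 2))

best = evolution_strategy(toy_fitness, np.zeros(3))
print(best)
```

Because only fitness values are needed, the evaluations inside each step are independent and could be distributed across many workers, which is one reason such methods are attractive for reinforcement learning, where gradients of the reward are often unavailable.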

4.4. Domain Issues

4.4.1. Image Understanding Issues

Deep learning currently performs well in image recognition and other visual tasks, but it has achieved much less in image understanding, such as visual relationship understanding, image content interpretation, and visual attention point prediction [113-115]. Understanding visual relationships requires first identifying the primary objects and then identifying the relationships among them. Question answering about image content means providing answers to questions about the selected image. Visual attention point prediction means predicting which part of a given image people are most interested in. All of these require a good understanding of image content, and the problem of image understanding requires further exploration by scholars.

4.4.2. Natural Language Processing Issues

Compared with speech and images, language is a higher-level, non-natural signal: it is a symbol system generated and processed entirely by the human brain. Deep learning has not yet had as significant an impact on natural language processing as it has on speech and images. However, because deep learning algorithms are inspired by the human brain, deep learning is expected to yield more results in natural language processing.

Conclusion

This chapter has given an insight into deep learning, its network structures and their improvement techniques, and an overview of deep learning applications in image processing. Deep learning currently performs well in image recognition and other visual tasks, but it has achieved much less in image understanding, such as visual relationship understanding, image content interpretation, and visual attention point prediction. In contrast to speech and images, natural language processing has not yet been significantly affected by deep learning. Some issues also remain; for example, training a model requires a large amount of data and a large amount of computing power and time. Researchers should focus their efforts on resolving these issues in the future. Deep learning technology still faces many opportunities and challenges in its future development, and it is highly promising.

References

[1] McCulloch, W., and Pitts, W. (1990). A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biol., vol. 52, no. 1–2, pp. 99–115.
[2] Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev., vol. 65, no. 6, pp. 386–408.
[3] Minsky, M., and Papert, S. A. (1969). Perceptrons: An Introduction to Computational Geometry. The MIT Press.
[4] Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, vol. 323, no. 6088, pp. 533–536.
[5] Cecotti, H. (2015). Handwritten digit recognition of Indian scripts: A cascade of distances approach. 2015 International Joint Conference on Neural Networks (IJCNN).
[6] Cortes, C., and Vapnik, V. (1995). Support-vector networks. Mach. Learn., vol. 20, no. 3, pp. 273–297.
[7] Hinton, G. E. (2005). What kind of a graphical model is the brain? IJCAI'05: Proc. of the 19th International Joint Conference on Artificial Intelligence, vol. 5, pp. 1765–1775.
[8] Hinton, G. E. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, vol. 313, no. 5786, pp. 504–507.
[9] Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A Fast Learning Algorithm for Deep Belief Nets. Neural Comput., vol. 18, no. 7, pp. 1527–1554.
[10] Bengio, Y., Lamblin, P., Popovici, D., and Larochelle, H. (2006). Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems, pp. 153–160.
[11] Ranzato, M., Poultney, C., Chopra, S., and Le Cun, Y. (2007). Efficient learning of sparse representations with an energy-based model. Adv. Neural Inf. Process. Syst., vol. 19, p. 1137.
[12] Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, vol. 15, pp. 315–323.

[13] Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A., and Jaitly, N. (2012). Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups. IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82–97.
[14] Dahl, G. E., Yu, D., Deng, L., and Acero, A. (2012). Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. IEEE Trans. Audio. Speech. Lang. Processing, vol. 20, no. 1, pp. 30–42.
[15] Deng, J., Dong, W., Socher, R., Li, L. J., and Li, K. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255.
[16] Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Commun. ACM, vol. 60, no. 6, pp. 84–90.
[17] Dauphin, Y., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., and Bengio, Y. (2014). Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. Annual Conference on Neural Information Processing System, pp. 2933–2941.
[18] Choromanska, A., Henaff, M., Mathieu, M., Arous, G., and Lecun, Y. (2014). The Loss Surfaces of Multilayer Network. Artif. Intell. Stat., vol. 38, pp. 192–204.
[19] Lecun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proc. IEEE, vol. 86, no. 11, pp. 2278–2324.
[20] DiCecco, R., Lacey, G., Vasiljevic, J., Chow, P., Taylor, G., and Areibi, S. (2016). Caffeinated FPGAs: FPGA framework For Convolutional Neural Networks. 2016 International Conference on Field-Programmable Technology (FPT), pp. 265–268.
[21] Yan, F., Jin, L. P., and Dong, J. (2017). Review of Convolutional Neural Network. Chinese J. Comput.
[22] Sutskever, I. (2013). Training Recurrent Neural Networks. tspace.library.utoronto.ca.
[23] Pascanu, R., Mikolov, T., and Bengio, Y. (2013). On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning, vol. 28, pp. 1310–1318.
[24] Simonyan, K., and Zisserman, A. (2015). Very Deep Convolutional Networks For Large-Scale Image Recognition. arXiv e-prints.
[25] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., and Anguelov, D. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9.
[26] He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778.
[27] Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587.
[28] Girshick, R. (2015). Fast R-CNN. 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1440–1448.

[29] Ren, S., He, K., Girshick, R., and Sun, J. (2017). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149.
[30] He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017). Mask R-CNN. 2017 IEEE International Conference on Computer Vision (ICCV). Piscataway, NJ, USA: IEEE, pp. 2980–2988.
[31] Lin, M., Chen, Q., and Yan, S. (2014). Network in network. arXiv preprint arXiv:1312.440.
[32] Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015). Spatial transformer networks. International Conference on Computer Vision, Montreal, Canada. Piscataway, NJ, USA: IEEE. MIT Press, pp. 2017–2025.
[33] Zeiler, M. D., Krishnan, D., Taylor, G. W., and Fergus, R. (2010). Deconvolutional networks. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2528–2535.
[34] Lin, T. Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature Pyramid Networks for Object Detection. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Masci, J., Meier, U., Cireşan, D., and Schmidhuber, J. (2011). Stacked convolutional auto-encoders for hierarchical feature extraction. International Conference on Artificial Neural Networks, vol. 6791. Springer Berlin Heidelberg, pp. 52–59.
[36] Dong, C., Loy, C. C., He, K., and Tang, X. (2014). Learning a Deep Convolutional Network for Image Super Resolution. Computer Vision – ECCV 2014. Springer International Publishing, Cham, pp. 184–199.
[37] Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and Lecun, Y. (2013). OverFeat: Integrated recognition, localization and detection using convolutional networks. International Conference on Learning Representations (ICLR).
[38] Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., and Golkov, V. (2015). FlowNet: Learning Optical Flow with Convolutional Networks. 2015 IEEE International Conference on Computer Vision (ICCV).
[39] Liu, N., Han, J., Zhang, D., Wen, S., and Liu, T. (2015). Predicting eye fixations using convolutional neural networks. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 362–370.
[40] Cimpoi, M., Maji, S., and Vedaldi, A. (2015). Deep filter banks for texture recognition and segmentation. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3828–3836.
[41] Bertasius, G., Shi, J., and Torresani, L. (2015). DeepEdge: A multi-scale bifurcated deep network for top-down contour detection. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4380–4389.
[42] Shen, W., Wang, X., Wang, Y., Bai, X., and Zhang, Z. (2015). DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3982–3991.

[43] Liu, Z., Li, X., Luo, P., Loy, C. C., and Tang, X. (2015). Semantic Image Segmentation via Deep Parsing Network. 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1377–1385.
[44] Nam, H., Baek, M., and Han, B. (2016). Modeling and propagating CNNs in a tree structure for visual tracking. arXiv:1608.07242.
[45] Tran, D., Bourdev, L., Fergus, R., Torresani, L., and Paluri, M. (2015). Learning spatiotemporal features with 3D convolutional networks. International Conference on Computer Vision. Piscataway, NJ, USA: IEEE, pp. 1635–1643.
[46] Bengio, Y., Simard, P., and Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Networks, vol. 5, no. 2, pp. 157–166.
[47] Hochreiter, S., and Schmidhuber, J. (1997). Long Short-Term Memory. Neural Comput., vol. 9, no. 8, pp. 1735–1780.
[48] El Hihi, S., and Bengio, Y. (1995). Hierarchical recurrent neural networks for long-term dependencies. Annual Conference on Neural Information Processing Systems, Denver, Colorado. Cambridge, USA: MIT Press, pp. 493–499.
[49] Chung, J., Ahn, S., and Bengio, Y. (2017). Hierarchical multiscale recurrent neural networks. arXiv:1609.01704.
[50] Graves, A., Fernandez, S., and Schmidhuber, J. (2007). Multi-dimensional recurrent neural networks. International Conference on Artificial Neural Networks, pp. 549–558.
[51] Thireou, T., and Reczko, M. (2007). Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins. IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 4, no. 3, pp. 441–446.
[52] Graves, A., Wayne, G., and Danihelka, I. (2014). Neural Turing machines. arXiv:1412.3555.
[53] Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2015). Gated feedback recurrent neural networks. Proc. 32nd Int. Conf. Mach. Learn., Lille, France, vol. 37, pp. 2067–2075.
[54] Vukotić, V., Raymond, C., and Gravier, G. (2016). A step beyond local observations with a dialog aware bidirectional GRU network for Spoken Language Understanding. Conference of the International Speech Communication Association. Piscataway, USA: IEEE, pp. 3241–3244.
[55] Mohamed, A., Dahl, G. E., and Hinton, G. (2012). Acoustic modeling using deep belief networks. IEEE Trans. Audio. Speech. Lang. Processing, vol. 20, pp. 14–22.
[56] Graves, A., Mohamed, A., and Hinton, G. (2013). Speech recognition with deep recurrent neural networks. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649.
[57] Jayapoorani, S., Pandey, D., Sasirekha, N. S., Anand, R., and Pandey, B. K. (2022). Systolic optimized adaptive filter architecture designs for ECG noise cancellation by Vertex-5. Aerospace Systems, 1–11.
[58] Ze, H., Senior, A., and Schuster, M. (2013). Statistical parametric speech synthesis using deep neural networks. 2013 IEEE Int. Conf. Acoust. Speech Signal Process., pp. 7962–7966.

[59] Bruntha, P. M., Dhanasekar, S., Hepsiba, D., Sagayam, K. M., Neebha, T. M., Pandey, D., and Pandey, B. K. (2022). Application of switching median filter with L2 norm-based auto-tuning function for removing random valued impulse noise. Aerospace Systems, 1–7.
[60] Sreeja, G., and Saraniya, O. (2019). Image Fusion Through Deep Convolutional Neural Network. Deep Learn. Parallel Comput. Environ. Bioeng. Syst., pp. 37–52.
[61] Liu, L. J., Ding, C., Jiang, Y., Zhou, M., and Wei, S. (2017). The IFLYTEK system for blizzard challenge 2017.
[62] Anand, R., Khan, B., Nassa, V. K., Pandey, D., Dhabliya, D., Pandey, B. K., and Dadheech, P. (2022). Hybrid convolutional neural network (CNN) for Kennedy Space Center hyperspectral image. Aerospace Systems, 1–8.
[63] Zhu, Z., Liang, D., Zhang, S. H., Huang, X., Li, B., and Hu, S. (2016). Traffic-sign detection and classification in the wild. 2016 IEEE Conf. Comput. Vis. Pattern Recognit., pp. 2110–2118.
[64] Kaur, J., Jaskaran, Sindhwani, N., Anand, R., and Pandey, D. (2023). Implementation of IoT in Various Domains. In: Sindhwani, N., Anand, R., Niranjanamurthy, M., Chander Verma, D., Valentina, E. B. (eds) IoT Based Smart Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-04524-0_10.
[65] Yang, S., Luo, P., Loy, C. C., and Tang, X. (2018). Faceness-net: Face detection through deep facial part responses. IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, pp. 1845–1859.
[66] Cao, Z., Simon, T., Wei, S. E., and Sheikh, Y. (2017). Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields. 2017 IEEE Conf. Comput. Vis. Pattern Recognit.
[67] Tian, Y., Luo, P., Wang, X., and Tang, X. (2015). Deep Learning Strong Parts for Pedestrian Detection. 2015 IEEE Int. Conf. Comput. Vis.
[68] Zhou, B., Lapedriza, À., Xiao, J., Torralba, A., and Oliva, A. (2014). Learning deep features for scene recognition using places database. Int. Conf. Mach. Learn., pp. 1187–1196.
[69] Fernando, B., and Gould, S. (2016). Learning end-to-end video classification with rank-pooling. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1219–1225.
[70] Lan, Z., Zhu, Y., Hauptmann, A. G., and Newsam, S. (2017). Deep Local Video Feature for Action Recognition. 2017 IEEE Conf. Comput. Vis. Pattern Recognit. Work., pp. 1219–1225.
[71] Cheng, Z., Yang, Q., and Sheng, B. (2015). Deep Colorization. 2015 IEEE Int. Conf. Comput. Vis., pp. 415–423.
[72] Champandard, A. J. (2016). Semantic style transfer and turning two-bit doodles into fine artworks. arXiv:1603.01768, vol. abs/1603.0
[73] Gatys, L. A., Ecker, A. S., and Bethge, M. (2016). Image Style Transfer Using Convolutional Neural Networks. IEEE Xplore, pp. 2414–2423.
[74] Dahl, R., Norouzi, M., and Shlens, J. (2017). Pixel Recursive Super Resolution. 2017 IEEE Int. Conf. Comput. Vis.
[75] Assael, Y. M., Shillingford, B., Whiteson, S., and De Freitas, N. (2016). LipNet: End-to-End Sentence-level Lipreading. arXiv:1611.01599, pp. 1–13.

[76] Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., and Kuksa, P. P. (2011). Natural language processing (almost) from scratch. J. Mach. Learn. Res., vol. 12, pp. 2493–2537.
[77] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Annual Conference on Neural Information Processing Systems, pp. 2493–2537.
[78] Kumar, A., Irsoy, O., Ondruska, P., Iyyer, M., Bradbury, J., and Gulrajani, I. (2016). Ask me anything: Dynamic memory networks for natural language processing. International Conference on Machine Learning. New York, NY, USA: ACM, pp. 1378–1387.
[79] Weiss, D., Alberti, C., Collins, M., and Petrov, S. (2015). Structured Training for Neural Network Transition-Based Parsing. Proc. 53rd Annu. Meet. Assoc. Comput. Linguist. 7th Int. Jt. Conf. Nat. Lang. Process., vol. 1, pp. 3111–3119.
[80] Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., and Dyer, C. (2016). Neural Architectures for Named Entity Recognition. ACLWeb, San Diego, California, pp. 260–270.
[81] He, L., Lee, K., Lewis, M., and Zettlemoyer, L. (2017). Deep Semantic Role Labeling: What Works and What's Next. Proc. 55th Annu. Meet. Assoc. Comput. Linguist., vol. 1, pp. 473–483.
[82] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250–260.
[83] Severyn, A., and Moschitti, A. (2015). Twitter Sentiment Analysis with Deep Convolutional Neural Networks. Proc. 38th Int. ACM SIGIR Conf. Res. Dev. Inf. Retr., pp. 959–962.
[84] Sayel, N. A., Albermany, S., and Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868–5881.
[85] Hermans, M., and Schrauwen, B. (2013). Training and analyzing deep recurrent neural networks. Proc. 26th Int. Conf. Neural Inf. Process. Syst., Lake Tahoe, Nevada, vol. 1, pp. 190–198.
[86] Albermany, S., Ali, H. A., and Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1–12).
[87] Zhou, X., Dong, D., Wu, H., Zhao, S., Yu, D., and Tian, H. (2016). Multi-view Response Selection for Human-Computer Conversation. Proc. 2016 Conf. Empir. Methods Nat. Lang. Process., pp. 372–381.
[88] Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E., and Svetnik, V. (2017). Deep Neural Nets as a Method for Quantitative Structure–Activity Relationships. J. Chem. Inf. Model., vol. 55, no. 2, pp. 263–274.
[89] Dixon, M. F., Klabjan, D., and Bang, J. H. (2017). Classification-based financial markets prediction using deep neural networks. Algorithmic Financ., vol. 6, pp. 67–77.
[90] Sayal, M. A., Alameady, M. H., and Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).

[91] Zhang, R., Li, W., Tan, W., and Mo, T. (2017). Deep and Shallow Model for Insurance Churn Prediction Service. 2017 IEEE Int. Conf. Serv. Comput., pp. 346–353.
[92] Hussein, R. I., Hussain, Z. M., and Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48–59.
[93] Pandey, D., and Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol. 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[94] Albermany, S. A., Hamade, F. R., and Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[95] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., and Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223–230). Springer, Singapore.
[96] Albermany, S. A., and Safdar, G. A. (2014). Keyless security in wireless networks. Wireless Personal Communications, 79(3), 1713–1731.
[97] Strom, N. (2015). Scalable distributed DNN training using commodity GPU cloud computing. Conf. Int. Speech Commun. Assoc.
[98] Juneja, S., Juneja, A., and Anand, R. (2019, April). Reliability modeling for embedded system environment compared to available software reliability growth models. In 2019 International Conference on Automation, Computational and Technology Management (ICACTM) (pp. 379–382). IEEE.
[99] Pandey, B. K., Pandey, D., Wairya, S., and Agarwal, G. (2021). An advanced morphological component analysis, steganography, and deep learning-based system to transmit secure textual data. International Journal of Distributed Artificial Intelligence (IJDAI), 13(2), 40–62.
[100] Pramanik, S., Ghosh, R., Pandey, D., and Ghonge, M. M. (2021). Data Hiding in Color Image Using Steganography and Cryptography to Support Message Privacy. In Limitations and Future Applications of Quantum Cryptography (pp. 202–231). IGI Global.
[101] Jain, S., Kumar, M., Sindhwani, N., and Singh, P. (2021, September). SARS-Cov2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1–5). IEEE.
[102] Singh, S. K., Thakur, R. K., Kumar, S., and Anand, R. (2022, March). Deep Learning and Machine Learning based Facial Emotion Detection using CNN. In 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom) (pp. 530–535). IEEE.
[103] Shukla, R., Dubey, G., Malik, P., Sindhwani, N., Anand, R., Dahiya, A., and Yadav, V. (2021). Detecting crop health using machine learning techniques in smart agriculture system. Journal of Scientific and Industrial Research (JSIR), 80(08), 699–706.


[100]

[101]

[102]

[103] [104] [105]

[106]

[107]

[108]

[109]

[110]

[111]

[112]

Albermany, S. A., Hamade, F. R., and Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
A. Graves, G. Wayne, and I. Danihelka, Neural turing machines, arXiv:1410.5401, (2014).
Sharma, M., Sharma, B., Gupta, A. K., and Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
Pandey, D., Wairya, S., Sharma, M., Gupta, A. K., Kakkar, R., and Pandey, B. K. (2022). An approach for object tracking, categorization, and autopilot guidance for passive homing missiles. Aerospace Systems, 5(4), 553-566.
Albermany, S., and Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
Domingos, P. (2017). The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World: Semantic Scholar.
Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., and Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
Tran, D., Kucukelbir, A., Dieng, A. B., Rudolph, M., Liang, D., and Blei, D. M. (2017). Edward: A library for probabilistic modeling, inference, and criticism, arXiv e-prints.
Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., and Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
Pandey, D., Pandey, B. K., and Wairya, S. (2021). Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Computing, 25(2), 1563-1580.
Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., and Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.
Sindhwani, N., Rana, A., and Chaudhary, A. (2021, September). Breast Cancer Detection using Machine Learning Algorithms. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.
Degerine, S., and Zaidi, A. (2004). “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June, doi: 10.1109/TSP.2004.827195.
Zaidi, A. (2005). “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov., doi: 10.1109/TSP.2005.855077.


[113] Degerine, S., and Zaidi, A. (2007). “Sources colorees” a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes”, Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and analysis in independent components”, Treatise IC2, Signal and image series], Hermes Science, ISBN: 2746215179.
[114] Zaïdi, A. (2019). “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices”, in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, doi: 10.1109/LSP.2019.2909651.
[115] Degerine, S. and Zaidi, A. (2007). “Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints”, in SIAM Journal on Optimization, Vol. 17, No. 4, pp. 997-1014, doi: 10.1137/050622821.


Chapter 10

The Survey and Challenges of Crop Disease Analysis Using Various Deep Learning Techniques

P. Venkateshwari
G.B. Pant DSEU Okhla-I Campus, New Delhi, India

Abstract

Crop yield must keep pace with population; otherwise, a country suffers from food deprivation. Keeping yield and population in balance is impeded by various issues, such as heavy rainfall, floods, other natural calamities, nutritional deficiency, and infections caused by insects. This work discusses crop infections caused by insects and the methods used to detect them. Finally, it summarizes existing techniques, their limitations, and the prospects for more advanced crop infection detection.

Keywords: CNN, deep learning, neural network, Image processing

Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing
Editor: Digvijay Pandey
ISBN: 979-8-88697-832-2
© 2023 Nova Science Publishers, Inc.

1. Introduction

Plant disease is a common and recurring problem for farmers, who often need to learn more about plant science to manage it. Abiotic factors are natural stresses caused by changes in the weather, whereas biotic factors are microorganisms that infect plants. According to the Food and Agriculture Organization, pests destroy 40% of crop yield. Invasive insects cause a crop yield loss of 70 billion dollars, while biotic factors cause a loss of 220 billion dollars [1]. Crop yields are also lower where pollination is lacking. The contribution of wild pollinators to crop yield is estimated at around 1.5 billion dollars [2].

Figure 1. Research progress in the area of plant disease detection and identification.

The IEEE digital library was used to search for and select papers, with "plant disease" as the search keyword. The number of conference and journal papers was recorded for each year by restricting the year field. Figure 1 shows the resulting research progress in plant disease detection and identification.

An outbreak of crop disease has a large impact on agricultural production. A severe outbreak may cause irreparable loss, with all previously planted crops lost at once, while a small-scale disease may reduce quality and yield. Whether the infection is small or large, it should be found and treated in time so that quality, crop yield, and population remain balanced [3, 4].

Artificial intelligence [5-8] has grown rapidly in recent years, and AI techniques are now applied in almost every application area, making everyday life easier. In agriculture, the main goals of image recognition are to detect and classify images, to identify the type of crop and the type of disease, and to estimate the severity of the disease. Corresponding countermeasures can then be formulated, which helps improve crop yield.


2. Literature Survey

2.1. Leaf Diseases

2.1.1. Grape

Grape leaf diseases such as black rot, Esca, and leaf blight severely affect the yield and growth of the grape plant. The visible symptoms of infection appear mainly on the leaves. Computer vision and image processing techniques are used to detect the infected region as early as possible and so prevent the disease from spreading further through the plant.

Anil Bharate et al. [9] proposed a method to classify healthy and unhealthy grape leaves. The authors extracted texture and colour features and fed them to two classifiers, SVM and KNN. The experimental results showed 90% accuracy for SVM and 96.7% for KNN.

Ujjwal Singh et al. [10] proposed a method to detect black measles disease in grape leaves. The region of interest (ROI) is segmented first, statistical features are extracted from it, and these features are given as input to a supervised SVM classifier. The authors reported 97% accuracy in classifying leaves infected with black measles.

Khaing Zin Thet et al. [11] proposed a CNN technique for disease identification in grape leaves. The images are resized to 150 x 150 pixels, and data augmentation is used to avoid overfitting. The images are then used to train a VGG16 architecture with global average pooling (GAP), fully connected (FC) layers, and SVM. VGG16 with FC, GAP, and SVM achieved 95%, 98%, and 86.8% accuracy, respectively.

Bin Liu et al. [12] proposed the Leaf GAN technique to distinguish real from fake (generated) diseased-leaf images. Xception is used as the discriminative model for recognising and classifying the images and achieved an accuracy of 98.70%. Generative models (Leaf GAN, LGRN, LG-PR, DCGAN, and WGAN) were used to generate additional training images, and Leaf GAN achieved a better FID score than the other methods.

Zinon Zinonos et al. [13] proposed grape leaf disease spot identification using Grad-CAM on the last layer of a CNN architecture. The authors also used LoRa technology to obtain images from remote areas. ResNet50 with and without data augmentation achieved classification accuracies of 98.77% and 98.65%, respectively.
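The feature-extraction-plus-classifier pipeline described for [9] can be illustrated with a very small example. The sketch below is not the cited authors' implementation: the mean-RGB colour feature, the toy pixel values, and the helper names `mean_rgb` and `knn_predict` are assumptions made purely for illustration.

```python
import math
from collections import Counter

def mean_rgb(pixels):
    """Colour feature: the average of each channel over a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training features."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: healthy leaves are mostly green, diseased ones brownish.
train = [
    (mean_rgb([(40, 160, 50), (50, 170, 60)]), "healthy"),
    (mean_rgb([(45, 150, 55), (35, 165, 45)]), "healthy"),
    (mean_rgb([(130, 90, 40), (140, 100, 50)]), "diseased"),
    (mean_rgb([(120, 85, 35), (150, 95, 45)]), "diseased"),
]

query_leaf = mean_rgb([(135, 95, 42), (125, 88, 38)])
prediction = knn_predict(train, query_leaf, k=3)
print(prediction)
```

A real system would replace the mean-RGB feature with richer colour and texture descriptors and would evaluate the classifier on held-out images.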


2.1.2. Citrus

Several common and exotic conditions of citrus leaves, such as anthracnose, greening, canker, and overwatering, affect the yield and growth of the citrus plant.

Harpeet Singh et al. [14] proposed a method based on colour and feature extraction from citrus leaves, in which CES enhancement is used to improve the chromatic image features. The diseased region is segmented out of the infected leaf images with the K-means clustering technique, and the ANOVA F-test is employed for feature selection ahead of the classification phase. An LDA classifier is then used to detect the various diseases. The experimental results showed that the LDA classifier achieved 86% accuracy in detecting the diseases listed above and performed better than other classifiers such as SVM, KNN, and MLP.

Quen Chen et al. [15] used a DNN to identify diseased citrus leaves. The authors used a seven-layer architecture consisting of one input layer, three convolutional layers, two fully connected layers, and one output layer, trained with the Adam optimization algorithm. Three citrus leaf diseases were addressed: canker, scab, and anthracnose. The reported classification accuracies for canker, scab, and anthracnose were 98.60%, 9.25%, and 95.12%, respectively.

Luaibi et al. [16] proposed deep learning techniques, AlexNet and ResNet, for identifying disease. The authors considered three types of citrus leaf condition: Phyllocnistis citrella, scale insects, and lack of elements. AlexNet with data augmentation reached an accuracy of 97.92%, better than the 95.83% of ResNet.

M. A. S. Adhiwibawa et al. [17] proposed a Hotelling T2 multivariate control method that detects anomalies by comparison with healthy leaves. The multivariate control method performed better than the single-variate method.

2.1.3. Apple

Apples are a fruit of high nutritional and medicinal value. They are used in many food industries to produce jam, vinegar, sauce, juice, and other products. Apple yields are reduced by Alternaria leaf spot, grey spot, mosaic, brown spot, and rust.

Peng Jiang et al. [18] used the ALDD data set and a deep CNN. The deep CNN uses the INAR-SSD model to detect common apple leaf diseases. The detection results cover a single class with a single object, a single class with multiple objects, and multiple classes with multiple objects. INAR-SSD uses the VGG-INEP feature extractor and achieves 78.80% mAP at 23.13 FPS.

Yuanqiu Luo et al. [19] improved the Res block. The original Res block has two 1x1 convolution layers, one 3x3 convolution layer, and batch normalization with the ReLU activation function. The improved design has three stages of Res blocks and an improved downsampling step consisting of 2x2 max pooling, 1x1 convolution, and batch normalization. Pyramid convolution is used in place of the 3x3 convolution.

Siddharth Singh Chouhan et al. [20] proposed an RBFNN method for identifying and classifying apple leaf disease. BFO optimization is used to further improve the accuracy. The experimental results showed that BRBFNN performs better than the GA and SVM methods. The authors addressed only the identification and classification of fungal diseases.

M. A. Khan et al. [21] used contrast stretching, segmentation, and optimization stages. The infected area is enhanced, segmentation is performed with the SCP technique, and a GA is then applied to optimize the result. The optimized image is classified with the M-SVM method, which achieved an accuracy of 97.20%.

Junsoo Lee [22] used the OCT and LAMP methods to find infected apple leaves and calculated the infection rates and spread ratio.
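Several of the works surveyed here rely on clustering to separate diseased tissue from healthy leaf area, for example the K-means step used for citrus leaves in [14]. The following is a minimal sketch of plain K-means on colour pixels, assuming two clusters and hand-picked initial centroids; the function name `kmeans` and the pixel values are hypothetical and do not come from the cited work.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Recompute each centroid as its cluster mean; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(channel) / len(cluster) for channel in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical (r, g, b) pixels: three green "healthy" and three brown "lesion" pixels.
pixels = [(40, 160, 50), (50, 170, 60), (45, 165, 55),
          (130, 90, 40), (140, 100, 50), (135, 95, 45)]

# Initial centroids are chosen by hand here; practical code usually picks them at random.
centroids, clusters = kmeans(pixels, centroids=[pixels[0], pixels[-1]])
print(centroids)
print([len(c) for c in clusters])
```

In a segmentation setting, the pixels assigned to the lesion-coloured cluster form the candidate diseased region that is passed on to feature extraction.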

2.1.4. Other

Rice and wheat are the main food crops and are closely tied to social conditions, national development, and agricultural security. The TNAU agricultural portal lists common paddy diseases such as brown spot, sheath rot, sheath blight, false smut, grain discolouration, and leaf streak.

G. Zhou et al. [23] applied a 2D filter, 2DFM-AMMF, and 2D-Otsu segmentation to reduce noise interference in the background, and then used the K-means clustering algorithm to optimize the distance.

Prabhakar M [24] presented a system for assessing the severity of disease in tomato leaves using a foldscope and the ResNet101 CNN.

A. Khattak et al. [25] presented a CNN model with two convolutional layers for separating low-level and high-level attributes. The model distinguishes healthy from diseased leaves.

Ashraf Darwish [26] used a CNN to identify and classify maize crop disease. The CNN hyperparameters are optimized with the OLPSO method.

V. P. Kour [27] combined PSO with SVM to identify and classify tomato plant disease.

Xuan Nie [28] proposed a method to detect a particular soil-borne disease in strawberry crops, strawberry Verticillium wilt, using Faster R-CNN and a multi-task learning method.

Lakshay Goyal [29] proposed a deep learning architecture to classify wheat images into 10 classes: tan spot, powdery mildew, wheat loose smut, Fusarium head blight, crown and root rot, black chaff, wheat streak mosaic, leaf rust, healthy wheat, and Karnal bunt.

Eisha Akanksha [30] proposed an OPNN method for detecting maize plant disease. A PNN classifier classifies the disease, and AJO, a parameter optimization technique, is used for smoothing the parameters used in the work.
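The threshold-based segmentation mentioned for rice disease detection in [23] builds on Otsu's method. The sketch below shows the simpler one-dimensional Otsu threshold rather than the 2D-Otsu variant used in the cited work; the function name `otsu_threshold` and the grey-level values are assumptions for illustration only.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the grey level t that maximises the between-class variance
    when pixels are split into background (<= t) and foreground (> t)."""
    hist = [0] * levels
    for value in gray_values:
        hist[value] += 1

    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue                      # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                         # every pixel is already background
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Hypothetical grey levels: four dark pixels and four bright pixels.
gray = [20, 22, 25, 30, 200, 210, 220, 205]
threshold = otsu_threshold(gray)
mask = [value > threshold for value in gray]   # True marks the brighter class
print(threshold, mask)
```

The binary mask produced this way is typically the input to later steps such as clustering or region-based feature extraction.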

3. Four Phase Technique

Deep learning techniques are used to identify and recognize images. They are applied to large datasets and produce higher accuracy, and a DNN model extracts more features for better performance and classification. The collected information is presented in four phases:

i. Data Collection
ii. Data Augmentation
iii. Data Detection and Classification
iv. Optimization

3.1. Data Collection

The first phase deals with the collection of datasets. Images are captured directly in the field. Image quality is the most important factor for further image processing and is affected by factors such as illumination, differences between sensors, differences in distance, and climate. Table 1 lists the crop types and datasets used for analysis.

3.2. Data Augmentation

Data augmentation is an important technique for enlarging a dataset by adding modified copies of its images. Augmentation methods include resizing, segmenting, cropping, rotating, shearing, shifting, normalizing, zooming, flipping, image sharpening, contrast adjustment, and noise removal. Table 2 shows the augmentation techniques used in each work.
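A few of the geometric augmentations named above can be sketched on a small image stored as a nested list. This is a minimal illustration under the assumption that an image is a list of pixel rows; the function names are hypothetical, and practical pipelines would normally use an image library instead.

```python
def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def vertical_flip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the original image together with three modified copies."""
    return [img, horizontal_flip(img), vertical_flip(img), rotate90(img)]

# Hypothetical 2 x 3 single-channel image.
image = [[1, 2, 3],
         [4, 5, 6]]

for variant in augment(image):
    print(variant)
```

Each call to `augment` turns one image into four training samples, which is how augmentation multiplies the size of a dataset.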


Table 1. Comparison of crops, no. of images and datasets

| Crop name | Datasets | No. of images | Reference |
|---|---|---|---|
| Citrus leaves | Plant Village and collected dataset | 2293-3171 | [15, 25] |
| Grape leaves | Images collected; Plant Village dataset | 4062-6000 | [11, 12] |
| Apple leaves | ALDD dataset, FGVC7 and Baidu AI Studio | 26377 | [18, 19] |
| Rice leaves | Plant Village | 7448 | [23] |
| Strawberry | Images collected | 3531 | [28] |
| Wheat | LWDCD2020 | 12000 | [29] |
| Tomato | Plant Village | 50000 | [24] |
| Corn | Kaggle | 174000 | [26] |

Table 2. Comparison of different types of data augmentation techniques

| Crop name | Image size | Augmentation technique | References |
|---|---|---|---|
| Citrus leaves | 224x224, 154x272, 128x128 | Image resize, segmentation, translation, rotation, scaling, cropping, normalization | [15-17] |
| Grape leaves | 150x150 | Shearing, shifting, zooming, horizontal flip | [11] |
| Apple leaves | 224x224 | Rotation, horizontal and vertical flipping, brightness, sharpness and contrast | [18] |
| Rice leaves | 2400x1600 | Rotation and cropping | [23] |
| Strawberry | 1024x768 | Different angle images, different background | [28] |
| Wheat | 224x224 | N/A | [29] |
| Tomato | 227x227 | Rotation, translation, scaling | [24] |
| Corn | N/A | Resize, segmentation, crop, flipping, rotation, zooming, noise removal, background removal | [26] |

3.3. Data Detection and Classification

This phase detects plant disease using various neural network techniques. Each crop is detected and classified with a neural network technique, and the accuracy of identification and classification is recorded in Table 3.


Table 3. Comparison of deep learning techniques and accuracy achieved

| Crop name | Technique | Accuracy | References |
|---|---|---|---|
| Citrus leaves | 7 layer CNN | 98.60% | [15] |
| Grape leaves | VGG16; SVM; Leaf GAN | 94.5%; 97.2%; 98.25% | [10-12] |
| Apple leaves | INAR-SSD; Improved ResNet | 78.80%; 94.23% | [18, 19] |
| Rice leaves | Faster-RCNN | 96.71%-98.26% | [23] |
| Strawberry | Faster-RCNN | 99.95% | [28] |
| Wheat | CNN with 21 convolution layers, 7 MP and 3 FC layers | Training 98.62%, Testing 97.88% | [29] |
| Tomato | ResNet101 | Training 97.6%, Testing 94.6% | [24] |
| Corn | VGG16; VGG19; AE Model | 97.9%; 97.7%; 98.2% | [26] |
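The accuracy values compared in Table 3 are ratios of correctly classified samples to all samples. As a minimal sketch of how such a figure is computed, the example below evaluates overall accuracy and per-class recall; the labels are invented for illustration and are not data from any cited paper, and the function names are hypothetical.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical labels for eight test leaves.
y_true = ["canker", "canker", "scab", "scab", "healthy", "healthy", "healthy", "scab"]
y_pred = ["canker", "scab", "scab", "scab", "healthy", "healthy", "canker", "scab"]

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(f"accuracy = {overall:.2%}")
print(recall)
```

Reporting per-class values alongside the overall figure matters when classes are imbalanced, because a high overall accuracy can hide a poorly detected disease.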

3.4. Optimization

This is the last phase of the four phase technique. It searches for a satisfactory output by setting an objective function [31-37] and iterating the values against it. Table 4 lists the technique used to optimize the results and the accuracy achieved with it.

Table 4. Comparison of various optimization techniques and accuracy achieved

| Crop name | Technique | Accuracy | References |
|---|---|---|---|
| Citrus leaves | Stochastic gradient algorithm such as Adam Grad method | 98.60% | [15] |
| Grape leaves | VGG with GAP | 98.9% | [11] |
| Apple leaves | BRBFNN | 87.05% | [20] |
| Tomato | P-SVM | 97.42% | [27] |
| Corn | Artificial Jelly Fish optimization | 95% | [30] |
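The idea of iterating values against an objective function can be made concrete with the Adam update rule, which Table 4 lists for citrus leaves. The sketch below applies the standard Adam update to a one-dimensional toy objective; the objective, the starting point, and the hyperparameter values are assumptions chosen for illustration and are not taken from [15].

```python
import math

def adam_minimize(grad, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimise a 1-D objective with the Adam update rule, given its gradient."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g          # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g      # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)             # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)             # bias-corrected second moment
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Hypothetical objective f(w) = (w - 3)^2, whose minimum is at w = 3.
def objective_grad(w):
    return 2.0 * (w - 3.0)

w_star = adam_minimize(objective_grad, w0=0.0)
print(w_star)
```

In network training the scalar `w` is replaced by all the weights of the model and the gradient comes from backpropagation on mini-batches, but the update applied to each weight has the same form.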


4. Issues Related to Plant Disease Identification

1) Most works use predefined datasets such as Plant Village and Crowd AI, or images generated with a GAN. In practice, the results are affected by many factors, such as geographic area and range, differences between sensors, differences in illumination, hybrid crops, and climate. Original datasets should therefore be used for both training and testing with any detection and classification method [38-45].
2) Various traditional augmentation methods are available, so training images can be created by combining several of them, such as scaling, translation, rotation, and flipping. Most works use only one or two methods to create their datasets [46-50].
3) High-end hardware is required to classify images and must be deployable on agricultural land for real-time disease identification. High-end cameras are needed, and multiple cameras should be deployed to cover the entire geographical area of the land and its crops, so that disease can be detected and identified in real time at the earliest stage.
4) Most papers address the identification and classification of a single disease and do not address multiple diseases. Multiple-disease identification is complex and has therefore been ignored, so researchers can devise a method to identify multiple diseases.
5) Disease identification and detection is considerably easier for broad leaves than for crops with small, minute leaves, long narrow leaves, or curly leaves, especially lettuce.
6) Some distinct and complex plant diseases still need to be classified and segmented properly.
7) Plant diseases with the same symptoms, and two or more overlapping infections in one plant, need to be segmented and classified using deep learning techniques.
8) An IoT-based real-time crop monitoring and diagnosis system needs to be built.
9) Economically important plants should be diagnosed at an early stage. To detect the severity of a plant disease, other parts of the plant should also be taken into account.


Conclusion and Future Outlook

To manage population pressure, agriculture should be automated. Automatic plant disease detection will promote production and help decrease agricultural loss. It is achieved through image processing and detection with deep learning methods. This comparative study highlights the implementation of various image processing methods, image augmentation methods, deep neural network techniques, and optimization methods. The study also indicates that, to achieve high accuracy, an optimization technique should be used along with the DNN method. The collected information is presented in four phases: data collection, data augmentation, data detection, and optimization.

References

[1] (2021) Climate change fans spread of pests and threatens plants and crops, new FAO study. [online] Available: https://www.fao.org/news/story/en/item/1402920/icode/.
[2] Jayapoorani, S., Pandey, D., Sasirekha, N. S., Anand, R., & Pandey, B. K. (2022). Systolic optimized adaptive filter architecture designs for ECG noise cancellation by Vertex-5. Aerospace Systems, 1-11.
[3] Yang G., G. Chen, Y. He, Z. Yan, Y. Guo & J. Ding, “Self-Supervised Collaborative Multi-Network for Fine-Grained Visual Categorization of Tomato Diseases,” in IEEE Access, vol. 8, pp. 211912-211923, 2020, doi: 10.1109/ACCESS.2020.3039345.
[4] Li L., S. Zhang & B. Wang, “Plant Disease Detection and Classification by Deep Learning—A Review,” in IEEE Access, vol. 9, pp. 56683-56698, 2021, doi: 10.1109/ACCESS.2021.3069646.
[5] Pandey, D., Nassa, V. K., Jhamb, A., Mahto, D., Pandey, B. K., George, A. H., & Bandyopadhyay, S. K. (2021). An integration of keyless encryption, steganography, and artificial intelligence for the secure transmission of stego images. In Multidisciplinary Approach to Modern Digital Steganography (pp. 211-234). IGI Global.
[6] Pandey, B. K., Pandey, D., & Agarwal, A. (2022). Encrypted Information Transmission by Enhanced Steganography and Image Transformation. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-14.
[7] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., & Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer.


[8] Sindhwani, N., Anand, R., Shukla, R., Yadav, M., & Yadav, V. (2021). Performance Analysis of Deep Neural Networks Using Computer Vision. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 8(29), e3.
[9] Bharate A. A. & M. S. Shirdhonkar, “Classification of Grape Leaves using KNN and SVM Classifiers,” 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), 2020, pp. 745-749, doi: 10.1109/ICCMC48092.2020.ICCMC-000139.
[10] Singh U., A. Srivastava, D. Chauhan & A. Singh, “Computer Vision Technique for Detection of Grape Esca (Black Measles) Disease from Grape Leaf Samples,” 2020 International Conference on Contemporary Computing and Applications (IC3A), 2020, pp. 110-115, doi: 10.1109/IC3A48958.2020.233281.
[11] Thet K. Z., K. K. Htwe & M. M. Thein, “Grape Leaf Diseases Classification using Convolutional Neural Network,” 2020 International Conference on Advanced Information Technologies (ICAIT), 2020, pp. 147-152, doi: 10.1109/ICAIT51105.2020.9261801.
[12] Liu B., C. Tan, S. Li, J. He & H. Wang, “A Data Augmentation Method Based on Generative Adversarial Networks for Grape Leaf Disease Identification,” in IEEE Access, vol. 8, pp. 102188-102198, 2020, doi: 10.1109/ACCESS.2020.2998839.
[13] Zinonos Z., S. Gkelios, A. F. Khalifeh, D. G. Hadjimitsis, Y. S. Boutalis & S. A. Chatzichristofis, “Grape Leaf Diseases Identification System Using Convolutional Neural Networks and LoRa Technology,” in IEEE Access, vol. 10, pp. 122-133, 2022, doi: 10.1109/ACCESS.2021.3138050.
[14] Singh H., Rani R., Mahajan S. (2020) Detection and Classification of Citrus Leaf Disease Using Hybrid Features. In: Pant M., Sharma T., Verma O., Singla R., Sikander A. (eds) Soft Computing: Theories and Applications. Advances in Intelligent Systems and Computing, vol 1053. Springer, Singapore. https://doi.org/10.1007/978-981-15-0751-9_67.
[15] Bruntha, P. M., Dhanasekar, S., Hepsiba, D., Sagayam, K. M., Neebha, T. M., Pandey, D., & Pandey, B. K. (2022). Application of switching median filter with L2 norm-based auto-tuning function for removing random valued impulse noise. Aerospace Systems, 1-7.
[16] Luaibi, Ahmed & Salman, Tariq & Miry, Abbas. (2021). Detection of citrus leaf diseases using a deep learning technique. International Journal of Electrical and Computer Engineering. 11. 1719-1727. 10.11591/ijece.v11i2.pp1719-1727.
[17] Adhiwibawa M. A. S., W. H. Nugroho & Solimun, “Detection of Anomalies in Citrus Leaves Using Digital Image Processing and T2 Hotelling Multivariate Control Chart,” 2019 International Conference of Artificial Intelligence and Information Technology (ICAIIT), 2019, pp. 310-314, doi: 10.1109/ICAIIT.2019.8834453.
[18] Jiang P., Y. Chen, B. Liu, D. He & C. Liang, “Real-Time Detection of Apple Leaf Diseases Using Deep Learning Approach Based on Improved Convolutional Neural Networks,” in IEEE Access, vol. 7, pp. 59069-59080, 2019, doi: 10.1109/ACCESS.2019.2914929.
[19] Luo Y., J. Sun, J. Shen, X. Wu, L. Wang & W. Zhu, “Apple Leaf Disease Recognition and Sub-Class Categorization Based on Improved Multi-Scale Feature Fusion Network,” in IEEE Access, vol. 9, pp. 95517-95527, 2021, doi: 10.1109/ACCESS.2021.3094802.
[20] Chouhan S. S., A. Kaul, U. P. Singh & S. Jain, “Bacterial Foraging Optimization Based Radial Basis Function Neural Network (BRBFNN) for Identification and Classification of Plant Leaf Diseases: An Automatic Approach Towards Plant Pathology,” in IEEE Access, vol. 6, pp. 8852-8863, 2018, doi: 10.1109/ACCESS.2018.2800685.
[21] Anand, R., Khan, B., Nassa, V. K., Pandey, D., Dhabliya, D., Pandey, B. K., & Dadheech, P. (2022). Hybrid convolutional neural network (CNN) for Kennedy Space Center hyperspectral image. Aerospace Systems, 1-8.
[22] Pandey, D., Wairya, S., Sharma, M., Gupta, A. K., Kakkar, R., & Pandey, B. K. (2022). An approach for object tracking, categorization, and autopilot guidance for passive homing missiles. Aerospace Systems, 5(4), 553-566.
[23] G. Zhou, W. Zhang, A. Chen, M. He & X. Ma, “Rapid Detection of Rice Disease Based on FCM-KM and Faster R-CNN Fusion,” in IEEE Access, vol. 7, pp. 143190-143206, 2019, doi: 10.1109/ACCESS.2019.2943454.
[24] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[25] Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[26] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[27] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[28] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[29] Lakshay Goyal, Chandra Mani Sharma, Anupam Singh, Pradeep Kumar Singh, “Leaf and spike wheat disease detection & classification using an improved deep convolutional architecture”, Informatics in Medicine Unlocked, Volume 25, 2021, 100642, ISSN 2352-9148, https://doi.org/10.1016/j.imu.2021.100642.
[30] Akanksha E., N. Sharma & K. Gulati, “OPNN: Optimized Probabilistic Neural Network based Automatic Detection of Maize Plant Disease Detection,” 2021 6th International Conference on Inventive Computation Technologies (ICICT), 2021, pp. 1322-1328, doi: 10.1109/ICICT50816.2021.9358763.


[31] Pandey, D., Pandey, B. K., & Wairya, S. (2021). Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Computing, 25(2), 1563-1580.
[32] Anand, R., & Chawla, P. (2022). Bandwidth Optimization of a Novel Slotted Fractal Antenna Using Modified Lightning Attachment Procedure Optimization. In Smart Antennas (pp. 379-392). Springer, Cham.
[33] Sindhwani, N., & Singh, M. (2020). A joint optimization based sub-band expediency scheduling technique for MIMO communication system. Wireless Personal Communications, 115(3), 2437-2455.
[34] Srivastava, A., Gupta, A., & Anand, R. (2021). Optimized smart system for transportation using RFID technology. Mathematics in Engineering, Science & Aerospace (MESA), 12(4).
[35] Shukla, R., Dubey, G., Malik, P., Sindhwani, N., Anand, R., Dahiya, A., & Yadav, V. (2021). Detecting crop health using machine learning techniques in smart agriculture system. Journal of Scientific and Industrial Research (JSIR), 80(08), 699-706.
[36] Madhumathy, P., & Pandey, D. (2022). Deep learning based photo acoustic imaging for non-invasive imaging. Multimedia Tools and Applications, 81(5), 7501-7518.
[37] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[38] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
[39] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[40] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[41] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[42] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[43] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[44] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[45] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[46] Degerine S. & A. Zaidi, “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.

[47] Zaidi A., “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.
[48] Degerine S. & A. Zaidi, “Sources colorees” a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes”, Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and analysis into independent components”, Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.
[49] Zaïdi A., “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices”, in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: 10.1109/LSP.2019.2909651.
[50] Degerine S. & A. Zaidi, “Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints”, in SIAM Journal on Optimization, 2007, Vol. 17, No. 4: pp. 997-1014, doi: 10.1137/050622821.

Chapter 11

Image Processing and Computer Vision: Relevance and Applications in the Modern World

Sukhvinder Singh Deora1,* and Mandeep Kaur2,†

1Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak, India
2Department of Computer Applications, Panipat Institute of Engineering and Technology, Samalkha, Panipat, India

Abstract

Humans see and understand the world better than most other species. They see and interpret images for many purposes throughout their lives. The use, storage, interpretation and sharing of images changed drastically after the invention and spread of digital computers and cameras. It remains challenging to give machines the capability to interpret the visual information embedded in still images, graphics, and video produced by the various capturing devices. Digital methods based on electronic sensor components have replaced the corresponding analog, physical approach to forming images. Still, there remains room for understanding and using digital images in day-to-day life: households, industry, information exchange, sales, marketing, defense, and medical science. This chapter provides a concise overview of these technologies, their application areas and their future scope.

* Corresponding Author’s Email: [email protected].
† Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editor: Digvijay Pandey ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.


Keywords: image processing, computer vision, artificial intelligence, smart applications

1. Introduction

The word “image” comes from the Latin word “imago”. An image is an artifact that depicts visual perception and resembles some subject of interest: an object, person, building, or area [1], much like a photograph or picture. In physics, an image is a distribution of color amplitudes formed when light reflected from an object enters our eyes and forms a pattern of colors on the retina that reminds the brain of a natural object. In computer science, such a pictorial script is a collection of colored symbols with various semantics that produces alphabets, drawings, maps, graphs, charts and banners on display devices. A digital image is an image composed on a display device from a finite array of picture elements called pixels. Each pixel holds discrete numeric quantities such as intensity or gray level, color, hue, saturation, and lightness [2]. A digital image is thus a two-dimensional function of the spatial coordinates along the x- and y-axes.

1.1. Digital Image Processing

Digital image processing uses digital computers to process images by means of algorithms [3-5]. It covers a comprehensive range of algorithms applied to image data [6] and can deal with noise and distortion in images. It draws on developments in computer science, mathematics, hardware and statistics, and it has applications in people management, building maintenance, environmental protection, crop yield prediction, the military, industry and medical science. Basic image processing involves the formation, acquisition, storage, transmission, recognition, interpretation and modification of images. Various methods process images digitally to extract different types of information. Pre-processing involves operations on brightness and luminance, color quantization and control, and the management of colors in images. The major operations that lead to the extraction of information are filtering, histogram analysis, image transformation, segmentation and classification [7, 8].


1.2. Image Acquisition

Digital image acquisition captures a physical scene and creates a digital image of it, which then undergoes processing such as compression, storage, printing, and display [9]. Image-capturing devices such as cameras, scanners and photocopiers produce these images. Analyzing them requires an understanding of the mathematics of mono or stereo cameras. Acquisition also relies on special chips that convert the captured photons into images with the help of software.

1.3. Digital Histogram Plots

A histogram gives a digital trace of a gray image by recording the frequency of each of its gray levels in the range [0, L-1]. It is a discrete function defined by:

$H(r_k) = n_k$    (1)

where $r_k$ is the $k$th gray level and $n_k$ is the number of pixels in the image having level $r_k$. The normalized version of the histogram is given by:

$P(r_k) = n_k / n$    (2)

for $k = 0, 1, 2, \ldots, L-1$, where $n$ is the total number of pixels in the image and $P(r_k)$ estimates the probability of occurrence of gray level $r_k$. A histogram plot (see Figure 1) shows $H(r_k) = n_k$ on the y-axis against $r_k$ on the x-axis.
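Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, assuming the image is a list of rows of integer gray levels; the function names and the tiny 3x3 image are hypothetical and not part of the chapter.

```python
def gray_histogram(image, levels=256):
    """Return H where H[k] = n_k, the number of pixels with gray level k."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


def normalized_histogram(image, levels=256):
    """Return P where P[k] = n_k / n, the estimated probability of level k."""
    hist = gray_histogram(image, levels)
    n = sum(hist)  # total number of pixels in the image
    return [count / n for count in hist]


# Hypothetical 3x3 image with L = 4 gray levels (values 0..3).
tiny_image = [
    [0, 1, 1],
    [2, 1, 0],
    [3, 3, 1],
]
H = gray_histogram(tiny_image, levels=4)        # [2, 4, 1, 2]
P = normalized_histogram(tiny_image, levels=4)  # entries sum to 1
```

Plotting H against the level index gives the kind of histogram plot the section describes.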

1.4. Characteristics

The sum of all components of a normalized histogram is equal to 1. If an image is dark, the histogram has higher frequencies towards the lower gray levels; if it is bright, the frequencies concentrate at the other end of the histogram (refer to Figure 1). Normalized histograms are helpful for various purposes of image analysis. Transforming an image into a high-contrast image spreads its gray values over a wider dynamic range, so the image appears crisper. Histograms constructed for bright, low-contrast and high-contrast images can be analyzed and used in image processing.
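One simple way to turn a low-contrast image into a higher-contrast one is a linear (min-max) stretch of its gray values. The sketch below is an assumption chosen for illustration, since the chapter does not name a specific transformation; the function name and sample values are hypothetical.

```python
def stretch_contrast(image, levels=256):
    """Linearly map the darkest pixel to 0 and the brightest to levels - 1."""
    flat = [value for row in image for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch.
        return [row[:] for row in image]
    scale = (levels - 1) / (hi - lo)
    return [[round((value - lo) * scale) for value in row] for row in image]


# Hypothetical low-contrast image: gray values occupy only 100..130.
low_contrast = [[100, 110], [120, 130]]
stretched = stretch_contrast(low_contrast)  # [[0, 85], [170, 255]]
```

After the stretch, the histogram of the image occupies the full range of gray levels instead of a narrow band.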

Figure 1. Histogram of an image in low/high brightness and contrast.


2. Image Storage and Manipulation

Different environments and platforms use different image file formats, which result in different file types. These formats differ in image handling features such as brightness, luminance, contrast, color space, color mapping, and color management. Operations such as filtering, histogram analysis, and transformations support image analysis and its applications in daily life and industry. Image processing can automate operations in medical science, production, quality management, security, and surveillance. Modern industrial applications use image processing for quality control, production management, access control, surveillance of employees at work, identification of objects, and security. In day-to-day life it has applications in home security, parental control, automatic access management, and object recognition. In medical science, image processing algorithms help detect diseases and give early predictions of high-risk conditions such as lung and heart diseases [10-12]. Soil segmentation, crop yield estimation, smoke and fire detection, land use control, house mapping for house tax assessment, and road construction supervision use satellite imagery. Traffic light control, traffic management, and locating accident sites for ambulance services use the Internet of Things and mobile technologies, which combine location, images, and other information.

2.1. Image Segmentation

Image segmentation divides an image into regions of interest. It allows the use of many complex algorithms that offer sophisticated performance in various scenarios involving smart things, the Internet, and images acquired from various sensors and computing networks.

2.2. Feature Extraction

If the initial set of raw image data is very complex, it must be divided and reduced to manageable groups. This reduction of the dimensionality of an image, or of part of it, is called feature extraction. The number of feature values is reduced by keeping only the most essential variables, which are studied more precisely to solve a particular type of problem; the others may be ignored.


2.3. Multi-Scale Signal Analysis

A critical aspect of image analysis can be the embedding of an original property of an image (such as a signal) into a set of derived signals in order to understand thoroughly the behavior of the feature of interest [13]. The original signal is added to a family of signals derived from the image, which makes it possible to analyze the image at different levels of representation, and with greater ease, in consonance with other parameters of interest.

2.4. Pattern Recognition

Pattern recognition is another technique for the identification, recognition and analysis of images. It involves the automated recognition of patterns and regularities in image-related data [14]. Patterns in data help in creating applications that involve statistical data analysis, signal processing, image analysis, information retrieval, bioinformatics, data compression, computer graphics and machine learning. They can be used to identify both general patterns and outliers that break a pattern, which mark points of interest during image processing.

2.5. Projection

Projection reduces the number of parameters of an image data set. It transforms image characteristics into simpler data forms with generally fewer profile aspects [15]. This operation is helpful for segmenting the objects present in a view and for object matching algorithms.

3. Image Processing Techniques

The processing of images involves various techniques. A concise introduction to some of them follows.


3.1. Anisotropic Diffusion

Anisotropic diffusion is a non-linear and space-variant transformation of the original image. It reduces noise in the image without removing significant content such as edges, lines and other points of interest [16-18]. The process resembles the construction of a scale-space, a parameterized family of successively more blurred images derived from the original image. In a conventional scale-space each image is produced by a Gaussian filter, which is a linear, space-invariant transformation. In anisotropic diffusion, each image in the family is instead a combination of the original image and a filter that depends on the local content of the original image. This removes noise from digital images without blurring their edges. With a constant diffusion coefficient the process reduces to ordinary Gaussian blurring, so the edge-preserving behavior depends on a coefficient that varies with the local image content.
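The idea can be sketched with a small Perona-Malik style iteration. This is a minimal illustration under stated assumptions: the exponential edge-stopping function, the parameter names `kappa` and `step`, their default values, and the test image are choices made here, not details given in the chapter.

```python
import math


def perona_malik(image, iterations=10, kappa=20.0, step=0.2):
    """Smooth a gray image (list of rows) while keeping strong edges."""
    rows, cols = len(image), len(image[0])
    current = [[float(value) for value in row] for row in image]

    def conductance(difference):
        # Edge-stopping function: close to 1 in flat regions (diffuse freely),
        # close to 0 across large intensity jumps (do not blur the edge).
        return math.exp(-((difference / kappa) ** 2))

    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                centre = current[r][c]
                flow = 0.0
                # Sum the weighted differences to the four nearest neighbours.
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        difference = current[nr][nc] - centre
                        flow += conductance(difference) * difference
                updated[r][c] = centre + step * flow
        current = updated
    return current


# Hypothetical test image: a dark and a bright half with one noisy pixel.
noisy_step = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
noisy_step[1][1] = 6
smoothed = perona_malik(noisy_step)
```

In this example the isolated noisy value is spread out over the dark region, while the jump between the dark and bright halves stays almost unchanged because the conductance across it is nearly zero. The step size is kept at or below 0.25, which is the usual stability limit for a four-neighbour scheme.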

3.2. Hidden Markov Models

In this well-known method, image data arranged as a one-dimensional sequence is processed to categorize images based on some key features. For classification and retrieval tasks, it uses statistical distance measures between images, based on the similarity of their statistical models. It has applications in segmentation, recognition of printed words, and video analysis.

3.3. Image Editing

Image editing refers to the use of different techniques, tools, and software to manipulate digital images. Image-capturing devices such as scanners and digital cameras produce good enough images, but editing them requires special software tools.

3.4. Image Restoration

Image restoration converts a blurred image back into a better image using the point spread function (PSF). It uses an inverse filter, a pseudo-inverse filter or a Wiener filter to suppress high frequencies in the image (smoothing it) or low frequencies (enhancing its edges). Estimating the original image from the corresponding degraded image has applications in medical imaging, astronomical imaging, and forensic science, where the image data captured at crime locations is often of poor quality.
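In the frequency domain, the inverse and Wiener filters mentioned above are commonly written as follows. The symbols are assumptions introduced here for illustration: $G(u,v)$ is the spectrum of the degraded image, $H(u,v)$ is the Fourier transform of the PSF, $\hat{F}(u,v)$ is the estimate of the original spectrum, and $K$ is a constant that approximates the noise-to-signal power ratio.

```latex
\hat{F}(u,v) = \frac{G(u,v)}{H(u,v)}
\quad \text{(inverse filter)}
\qquad
\hat{F}(u,v) = \frac{H^{*}(u,v)}{\lvert H(u,v) \rvert^{2} + K} \, G(u,v)
\quad \text{(Wiener filter)}
```

Setting $K = 0$ in the Wiener form recovers the inverse filter. A pseudo-inverse filter typically applies the inverse form only where $\lvert H(u,v) \rvert$ exceeds a small threshold and sets the estimate to zero elsewhere, which avoids amplifying noise at frequencies the blur has almost removed.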


3.5. Independent Component Analysis

Independent component analysis is a computational method that decomposes complex image datasets by separating a multi-dimensional signal into additive components that are statistically independent of each other.

3.6. Linear Filtering

Linear filtering is an image enhancement method that processes a part of the signal frequency spectrum using a transfer function. Each output pixel value is a linear combination of the values of the pixels in the neighbourhood of the current pixel position.
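The "linear combination of neighbouring pixels" can be made concrete with a small sketch. It assumes a square, odd-sized kernel and replicates border pixels at the image edge; the function name, the border rule and the 3x3 mean kernel are illustrative choices, not prescribed by the chapter.

```python
def linear_filter(image, kernel):
    """Replace each pixel by a weighted sum of its neighbourhood."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(-half, half + 1):
                for j in range(-half, half + 1):
                    # Clamp coordinates so border pixels are replicated.
                    rr = min(max(r + i, 0), rows - 1)
                    cc = min(max(c + j, 0), cols - 1)
                    total += kernel[i + half][j + half] * image[rr][cc]
            output[r][c] = total
    return output


# A 3x3 mean (box) kernel: every neighbour contributes equally.
mean_kernel = [[1 / 9] * 3 for _ in range(3)]
spike = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
blurred = linear_filter(spike, mean_kernel)
```

Here the single bright pixel is averaged into its surroundings, which is the smoothing (low-pass) behavior of a mean filter; other kernels, such as sharpening or edge kernels, plug into the same function.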

3.7. Neural Networks

A neural network is a computing design that uses a structure resembling the human brain to recognize patterns in data for image classification problems. In its basic form it consists of three kinds of layers: input, hidden, and output. Generally, a new image is compared with a database of images to match some features and classify it. Applications include security-related problems, identification of objects, detection of defective objects and more.
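The input, hidden and output layers can be illustrated with a single forward pass. This is only a sketch: the weights below are hypothetical hand-picked numbers rather than trained values, biases are omitted, and the tanh and softmax functions are assumed choices.

```python
import math


def forward(pixels, hidden_weights, output_weights):
    """One pass through input -> hidden -> output layers; returns class probabilities."""
    # Hidden layer: weighted sum of the inputs followed by a tanh non-linearity.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, pixels)))
        for weights in hidden_weights
    ]
    # Output layer: one score per class.
    scores = [sum(w * h for w, h in zip(weights, hidden)) for weights in output_weights]
    # Softmax turns the scores into probabilities that sum to 1.
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical two-pixel "image" and hand-picked weights for two classes.
pixels = [1.0, 0.0]
hidden_weights = [[2.0, -2.0], [-2.0, 2.0]]
output_weights = [[1.0, -1.0], [-1.0, 1.0]]
probabilities = forward(pixels, hidden_weights, output_weights)
```

With these weights the first class receives the higher probability. In practice the weights would be learned from a labeled set of images.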

3.8. Point Feature Matching

Point feature matching matches features in a source dataset of images with those in a target dataset. A correspondence is established by transferring attributes from the source to the target data (object). It has applications in image registration, camera calibration and object recognition problems.
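A minimal sketch of such matching is shown below, assuming each feature is described by a short numeric descriptor. The nearest-neighbour rule with a ratio test is a commonly used heuristic adopted here as an assumption; the function name, the `ratio` parameter and the sample descriptors are hypothetical.

```python
import math


def match_features(source, target, ratio=0.75):
    """Pair each source descriptor with its nearest target descriptor.

    A pair is kept only if the best distance is clearly smaller than the
    second-best one, which discards ambiguous matches.
    Returns a list of (source_index, target_index) pairs.
    """
    matches = []
    for i, descriptor in enumerate(source):
        distances = sorted(
            (math.dist(descriptor, candidate), j)
            for j, candidate in enumerate(target)
        )
        best = distances[0]
        second = distances[1] if len(distances) > 1 else (math.inf, -1)
        if best[0] < ratio * second[0]:
            matches.append((i, best[1]))
    return matches


# Hypothetical two-dimensional descriptors.
source_descriptors = [(0.0, 1.0), (5.0, 5.0), (2.6, 3.0)]
target_descriptors = [(5.1, 4.9), (0.1, 1.0), (9.0, 0.0)]
pairs = match_features(source_descriptors, target_descriptors)  # [(0, 1), (1, 0)]
```

The third source descriptor is almost equally close to two targets, so it is left unmatched. The resulting pairs are the correspondences that registration or calibration steps would then use.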

3.9. Principal Components Analysis

This statistical technique helps in identifying similarities and differences in datasets. It reduces the dimensionality with little loss of information by summarizing the data with a smaller set of indices that can be analyzed visually for decision-making. It eliminates data dimensions that are correlated with others and therefore contribute little to the decision-making process.

There are other image processing methods, such as self-organizing maps and wavelets, for understanding images better. An exploratory reader may read further to learn about other image processing tools and techniques for research and industrial use.
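As an illustration of the principal components analysis described in Section 3.9, the sketch below finds the first principal component with power iteration on the covariance matrix and projects the data onto it. The function names, the fixed iteration count and the sample points are assumptions made for this example.

```python
def first_principal_component(samples, iterations=100):
    """Return (mean, direction), where direction is the unit vector of largest variance."""
    n, d = len(samples), len(samples[0])
    mean = [sum(sample[k] for sample in samples) / n for k in range(d)]
    centred = [[sample[k] - mean[k] for k in range(d)] for sample in samples]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(row[a] * row[b] for row in centred) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the all-ones start vector is not orthogonal to it.
    vector = [1.0] * d
    for _ in range(iterations):
        product = [sum(cov[a][b] * vector[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    return mean, vector


def project(samples, mean, direction):
    """Reduce each sample to a single score along the principal direction."""
    return [
        sum((value - m) * w for value, m, w in zip(sample, mean, direction))
        for sample in samples
    ]


# Hypothetical 2-D data lying on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
mean, direction = first_principal_component(points)
scores = project(points, mean, direction)
```

Because the two coordinates are perfectly correlated in this example, one score per point captures all of the variation, which is the dimensionality reduction the section refers to.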

4. Newer Applications of Image Processing

Image processing is used for real-time problem solving in society, industry, educational institutions, production houses, supply chain management, quality control and e-governance. Many solutions based on image analysis can be identified; a few of them are discussed in this section.

Figure 2. Identification of active learning participation: the learner’s point of gaze is identified along with lip movement and speech audio from the learner’s end.


Figure 3. Workplace surveillance camera that records the visuals and sound in the room to check for the amount of work done by the employees.

4.1. Active Learning Participation

Online learning has changed how the world learns with computer-based tools. Educators are learning to use various tools for teaching, evaluation, and monitoring [19, 20]. Identifying actual participation in the learning process is still an open area of research. Active learning is evaluated from the video stream captured at the learner’s end by verifying that the learner’s eyes are focused on part of the screen. Speech captured from the learner’s side is tested for synchronization between mouth movement and the audible sound. Together, these two parameters can detect the active participation of a learner during online sessions (Figure 2 shows the evaluations). The time the learner spent participating is expressed as the percentage of focused attention during the online session.
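The percentage of focused attention can be sketched as a simple ratio over sampled video frames. The per-frame flags and the rule that both must hold are assumptions made for illustration; the chapter does not specify how the gaze and lip-sync checks are combined.

```python
def focused_attention_percent(frames):
    """frames: list of (gaze_on_screen, speech_in_sync) booleans, one per sampled frame."""
    if not frames:
        return 0.0
    attentive = sum(
        1 for gaze_on_screen, speech_in_sync in frames
        if gaze_on_screen and speech_in_sync
    )
    return 100.0 * attentive / len(frames)


# Hypothetical session of four sampled frames.
session = [(True, True), (True, False), (False, True), (True, True)]
attention = focused_attention_percent(session)  # 50.0
```

The two flags would come from upstream gaze-tracking and lip-synchronization detectors, which are outside the scope of this sketch.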

4.2. Workplace Surveillance

Electronic surveillance is used to ensure security in certain scenarios. It involves recording activity with cameras to monitor the employees and everyone else who enters the premises of an organization (refer to Figure 3). Nowadays, AI-enabled applications can use the video stream from a camera to identify the activities of employees and can report their movement through trigger-based notifications. However, research findings show that the efficiency of employees decreases when they are under observation at work. Companies should try to evaluate the ethical values of candidates before selection, so that employees are self-regulated and feel responsible for delivering their best at the workplace [21].

Figure 4. Identification of popular buildings (monuments, temples, places, etc.) using pattern matching and edge detection techniques; the extracted features are compared with the feature sets of images of such buildings in image datasets.

4.3. Building Recognition

Researchers have developed many algorithms for human face recognition that use feature matching techniques. A similar approach can identify famous buildings by matching the structural design information in an image against a pool of tagged images in a database (see Figure 4). This domain of research involves the recognition of buildings in different environments. Automatic building detection from localization information and architectural design [22] is done in real time. It involves region-based schemes that use the Hough transformation: a precise shape of the building is derived and compared with a database of famous buildings. This method is robust and performs well on high-resolution images [23]. Geo-tagging is another application, which identifies image data geographically using GPS technology (see Figure 5). An alternative approach is to use the similarity of feature descriptors with an existing dataset of geo-tagged images [24]. The environmental context can also be used for location identification, which at the same time ensures higher recognition accuracy.
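The Hough transformation mentioned above can be sketched for straight lines, which are the basic elements of building outlines. This is a minimal illustration, not the region-based scheme of the cited work: it assumes edge pixels are already available as (x, y) points, uses the normal form rho = x cos(theta) + y sin(theta), and the function names and the sample edge are hypothetical.

```python
import math


def hough_votes(edge_points, theta_steps=180):
    """Accumulate votes for lines in (rho, theta-index) space."""
    votes = {}
    for x, y in edge_points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            # Every edge point votes for all lines that pass through it.
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    return votes


def strongest_line(edge_points, theta_steps=180):
    """Return (rho, theta in degrees, vote count) of the most supported line."""
    votes = hough_votes(edge_points, theta_steps)
    (rho, t), count = max(votes.items(), key=lambda item: item[1])
    return rho, 180.0 * t / theta_steps, count


# Hypothetical horizontal roof edge: 100 edge pixels on the line y = 5.
roof_edge = [(x, 5) for x in range(100)]
rho, theta_degrees, count = strongest_line(roof_edge)
```

For this edge the peak is found at a normal angle of 90 degrees and a distance of 5, supported by all 100 points. Several such peaks together describe the straight segments of a building shape that can then be compared with stored shapes.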


Figure 5. Location Identification using pattern matching.

Figure 6. Identification of breakage of building edges and presence of cracks by comparing the current image of the building with its old images to automate the maintenance action plans.

4.4. Image Change Detection

Images of the same object captured at different points in time can be compared with each other to detect changes in the object. This approach can detect changes in buildings, objects, rooms, etc., using modern change detection techniques [25]. It relies on feature analysis of the images for their logical inspection. Applications include automatic crack detection (see Figure 6), detection of rusting in steel and iron objects, observation of the growth of crops, plants and micro-organisms, and observation of disease symptoms. They use image processing techniques such as Min-Max Gray Level Discrimination (M2GLD) together with feature detection to cope with image problems such as low contrast, uneven illumination and noise pollution [26].
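The simplest form of change detection is pixel-wise differencing of two aligned images. The sketch below uses that baseline, not the M2GLD method named above; it assumes both images have the same size and are already registered, and the threshold value and sample images are hypothetical.

```python
def change_mask(before, after, threshold=30):
    """Return a binary mask: 1 where the gray level changed by more than threshold."""
    return [
        [1 if abs(a - b) > threshold else 0 for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


def changed_fraction(mask):
    """Fraction of pixels flagged as changed."""
    pixels = [value for row in mask for value in row]
    return sum(pixels) / len(pixels)


# Hypothetical wall patch: one pixel darkens strongly (for example, a new crack).
wall_before = [[200, 200, 200], [200, 200, 200]]
wall_after = [[200, 60, 200], [198, 200, 205]]
mask = change_mask(wall_before, wall_after)  # [[0, 1, 0], [0, 0, 0]]
```

Small differences caused by noise or lighting stay below the threshold, while the strong local change is flagged. A maintenance trigger could then be based on the changed fraction or on the shape of the flagged region.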

Figure 7. Object and race identification across human races using a world feature database.

4.5. Human Race Detection

Human face detection is taken further by ethnicity prediction and age or age-group determination. In this approach, features such as colour, head-body proportion, colour attributes, and skull size and proportions are extracted and classified for different ethnic groups (refer to Figure 7). These feature classifications are used to predict which of the groups a person belongs to. Masood et al. attempted similar research, achieving an accuracy of 98.6% for some specific ethnic groups using the FERET database and a Multilayer Perceptron (MLP), a class of feed-forward ANN [27-29]; an ensemble framework based on Linear Discriminant Analysis (LDA) was applied to a dataset of 263 subjects and 2630 facial images [30]. Tariq et al. provided a pattern recognition approach for gender and race identification, in which a qualitative analysis of colour and facial views is done for gender identification using four different representations [31]. Other demographic characteristics, including race, gender, and age, are predicted in the study by H. Tin et al. [32] using a face image dataset of 250 subjects collected from the Internet. New applications may be developed for security and for the identification and recognition of persons where people from different countries, races and cultures mix, as in international airports and on ship voyages.

4.6. Rusting of Steel

One of the early industrial applications of image processing is the identification of rust on iron and steel objects such as bridges, buildings and monuments of archaeological importance constructed at different places (see Figure 8). It removes the need for experts to travel and inspect such objects in person, and it avoids human errors caused by fatigue during inspections; the objects can instead be photographed by other persons. Khayatazad et al. built an automated image processing technique [33] that avoids the time-consuming and subjective manual approach. They introduced an artificial intelligence based algorithm that recognizes corrosion damage in colour images using a histogram of the colours that represent corrosion. It combines colour analysis with roughness analysis based on the gray-level co-occurrence matrix of the image to locate corroded areas more accurately. Such solutions are cost-effective for industry and quality management departments, because image analysis can replace the traditional inspection approach and lower its expenses [34].
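The gray-level co-occurrence matrix (GLCM) used for roughness analysis can be sketched as follows. The example counts how often pairs of gray levels occur next to each other and summarizes the matrix with its energy (also called uniformity), one common GLCM statistic; whether the cited work uses exactly this statistic is not stated here, and the function names and sample patches are hypothetical.

```python
def glcm(image, levels, offset=(0, 1)):
    """Normalized co-occurrence matrix for one pixel offset (default: right neighbour)."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                counts[image[r][c]][image[rr][cc]] += 1
                total += 1
    return [[value / total for value in row] for row in counts]


def energy(matrix):
    """Sum of squared entries: 1.0 for a perfectly uniform texture, lower for rough ones."""
    return sum(value * value for row in matrix for value in row)


# Hypothetical patches with 4 gray levels: a smooth surface and a rough one.
smooth_patch = [[1, 1, 1], [1, 1, 1]]
rough_patch = [[0, 3, 1], [2, 0, 3]]
smooth_energy = energy(glcm(smooth_patch, levels=4))  # 1.0
rough_energy = energy(glcm(rough_patch, levels=4))    # 0.375
```

A low energy value indicates an irregular texture, which is how a rough, corroded surface can be told apart from smooth, intact metal before the colour analysis is applied.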

Figure 8. Rusting of iron object.


Figure 9. Object deformation detection using geometric design comparison. In the above image, the case of the electric meter is checked for deformity, and the relevant smart device generates a trigger for maintenance.

4.7. Object Deformation

Objects and products manufactured in industry have design aspects based on geometry and symmetry. These geometric definitions can be used to identify the deformity of objects such as the smart electricity meters placed on electric poles in states like Haryana, Andhra Pradesh and Telangana. Vehicles and stray animals that hit these meters often cause serious issues such as electric shock, defects in the meter, or a broken case (see Figure 9). Such conditions can be recognized easily by combining image manipulation with object variability identification. CNN-based image manipulation models that record contour representations [35] have been used for general objects, for recognition in handwritten documents, and for medical imaging [36].

4.8. Tourism Management

Several applications have been developed for tourism management that guide tourists about tourist places [37], but further applications are needed that use other aspects of technology to ensure safety, location tracing, activity monitoring, etc. (refer to Figure 10). A new domain of application of image processing is the analysis of images with deep learning models for tourism management and for studying the behavior of tourists at different locations [38-44]. With specific changes to the existing solutions, image analysis can identify the whereabouts of tourists, manage their activities, and ensure their safety at dangerous points of hills, river banks, sea coasts, dams, etc. (see Figure 11) during disasters and natural calamities.

Figure 10. Tourist activity management and safety.

Figure 11. Tourists’ whereabouts and disaster management.

Conclusion and Future Outlook

The evolution of digital devices has brought a major shift in the utilization, storage, interpretation and sharing of images. These digital devices have replaced the earlier analog devices for forming images. Digital images are used in daily-life applications in households, industry, information exchange, sales and marketing, defense, the medical industry, etc. This chapter has reviewed these technologies, their applications in various fields and their future scope.

References [1] [2]

[3]

[4]

[5] [6]

[7]

[8]

[9] [10]

[11]

[12]

Image - Wikipedia. (2007). Retrieved 20th February 2022, from https://en.wikipedia. org/wiki/Image. Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12). Vyas, G., Anand, R., & Holȇ, K. E. Implementation of Advanced Image Compression using Wavelet Transform and SPHIT Algorithm. International Journal of Electronic and Electrical Engineering. ISSN, 0974-2174. Malik, S., Saroha, R., & Anand, R. (2012). A Simple Algorithm for reduction of Blocking Artifacts using SAWS Technique based on Fuzzy Logic. International Journal Of Computational Engineering Research, 2(4), 1097-1101. Kumar, R., Anand, R., & Kaushik, G. (2011). Image Compression Using Wavelet Method & SPIHT Algorithm. Digital Image Processing, 3(2), 75-79. Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore. Pandey, B. K., Mane, D., Nassa, V. K. K., Pandey, D., Dutta, S., Ventayen, R. J. M., & Rastogi, R. (2021). Secure text extraction from complex degraded images by applying steganography and deep learning. In Multidisciplinary Approach to Modern Digital Steganography (pp. 146-163). IGI Global. Pandey, B. K., Pandey, D., Wariya, S., & Agarwal, G. (2021). A Deep Neural Network-Based Approach for Extracting Textual Images from Deteriorate Images. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 8(28), e3. Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881. Jain, S., Sindhwani, N., Anand, R., & Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. 
In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham. Anand, R., Mann, A., & Sharma, K. (2020). Deep Metric Learning-based Face Recognition Pipeline with Anti-Spoofing on Raspberry-Pi Single-Board Computer. Test Engineering & Management, 82, 4302-4308. Sharma, M., Sharma, B., Gupta, A. K., Khosla, D., Goyal, S., & Pandey, D. (2021). A Study and Novel AI/ML-Based Framework to Detect COVID-19 Virus Using

本书版权归Nova Science所有

250

[13]

[14] [15]

[16]

[17]

[18]

[19] [20] [21]

[22]

[23]

[24]

[25] [26]

Sukhvinder Singh Deora and Mandeep Kaur Smartphone Embedded Sensors. In Sustainability Measures for COVID-19 Pandemic (pp. 59-74). Springer, Singapore. Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260. Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5). Pandey, D., Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10. 1007/978-981-19-0312-0_20. Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2). Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence. 12 (7): 629–639. (1990) doi:10.1109/34.56205. Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59. Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6. Madhumathy, P., & Pandey, D. (2022). Deep learning based photo acoustic imaging for non-invasive imaging. Multimedia Tools and Applications, 81(5), 7501-7518. Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17. 
Li, J., Huang, W., Shao, L., Allinson, N.: Building recognition in urban environments: A survey of state-of-the-art and future challenges. Information Sciences. Vol 277. pp. 406-420. ISSN 0020-0255 https://doi.org/10.1016/j.ins.2014.02.112 (2014).
Cui S. Y., Yan Q., Liu Z. J., Li M.: Building Detection And Recognition From High Resolution Remotely Sensed Imagery. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B3b. Beijing 2008, pp. 411-416. (2008).
Gordan, P., Boros, H., Giosan, I.: ViewTM to create a dataset. The dataset and the user VISAPP 2020. 15th International Conference on Computer Vision Theory and Applications. pp. 268-275. (2020).
Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
Hoang, N. D.: Detection of Surface Crack in Building Structures Using Image Processing Technique with an Improved Otsu Method for Image Thresholding.


[27]

[28]

[29]

[30] [31]

[32] [33]

[34]

[35]

[36]

[37]

[38]


Advances in Civil Engineering. Article ID 3924120, 10 pages, https://doi.org/10.1155/2018/3924120 (2018).
Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
Chaudhary, A., Bodala, D., Sindhwani, N., & Kumar, A. (2022, March). Analysis of Customer Loyalty Using Artificial Neural Networks. In 2022 International Mobile and Embedded Technology Conference (MECON) (pp. 181-183). IEEE.
Jain, S., Kumar, M., Sindhwani, N., & Singh, P. (2021, September). SARS-Cov-2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.
Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
Tariq, U., Hu, Y., Huang, T. S.: Gender and Race Identification by Man and Machine. In: Wang, P. S. P. (eds) Pattern Recognition, Machine Intelligence and Biometrics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22407-2_13 (2011).
Tin, H., Sein, M.: Race Identification for Face Images. ACEEE Int. J. on Information Technology. Vol. 01. No. 02. Sep 2011. pp. 35-37. (2011).
Khayatazad, M., Pue, L. D., Waele, W. D.: Detection of corrosion on steel structures using automated image processing. Developments in the Built Environment. Vol. 3. 100022, ISSN 2666-1659, https://doi.org/10.1016/j.dibe.2020.100022 (2020).
Kohli, L., Saurabh, M., Bhatia, I., Shekhawat, U. S., Vijh, M., & Sindhwani, N. (2021). Design and Development of Modular and Multifunctional UAV with Amphibious Landing Module. In Data Driven Approach Towards Disruptive Technologies (pp. 405-421). Springer, Singapore.
Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023).
Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
Hou, T.: Construction of Practical Teaching System for Smart Tourism Management Major in the Era of Big Data. In 2021 2nd Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC2021). Association for Computing Machinery, New York, NY, USA, 929–933. https://doi.org/10.1145/3452446.3452669 (2021).
Zhang, K., Chen, Y., Li, C.: Discovering the tourists’ behaviors and perceptions in a tourism destination by analyzing photos’ visual content with a computer deep learning model: The case of Beijing. Tourism Management. 75, 595-608. https://doi.org/10.1016/j.tourman.2019.07.002 (2019).


[39]

[40]

[41]

[42]

[43]

[44]

Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
Degerine S. & Zaidi A., “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.
Zaidi A., “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.
Degerine S. & Zaidi A., “Sources colorees” a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes,” Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and analysis into independent components”, Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.
Zaïdi A., “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices,” in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: 10.1109/LSP.2019.2909651.
Degerine S. & Zaidi A., “Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints,” in SIAM Journal on Optimization, 2007, Vol. 17, No. 4: pp. 997-1014, doi: 10.1137/050622821.


Chapter 12

Optimization Practices Based on the Environment in Image Processing

Shipra1, Priyanka1,*, Navjot Kaur1, Sandeep Gupta1 and Reecha Sharma2

1Punjabi University Neighborhood Campus, Rampura Phul, Punjab, India
2Department of Electronics and Communication Engineering, Punjabi University Patiala, Punjab, India

Abstract

Optimization algorithms are high-performance methods for solving very difficult optimization problems. Nature-inspired algorithms are a family of problem-solving methods and tools modelled on natural phenomena, and they play an important role in image processing. By reducing image noise and blur, they support image enhancement, restoration, segmentation, edge detection, image generation, denoising, pattern recognition, thresholding, and related tasks. Numerous optimization techniques have been described for various image processing applications. This chapter examines several nature-inspired optimization strategies and gives a brief overview of each.

* Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editor: Digvijay Pandey ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.



Keywords: optimization, swarm intelligence, bat algorithm, ant colony optimization, elephant herding optimization and grey wolf optimization

1. Introduction

Nature-inspired algorithms are highly effective at locating optimal solutions to multivariate and multifunctional problems, and they play an important role in image processing. By reducing image noise and distortion, they support image enhancement, restoration, pattern classification, thresholding, edge detection, image formation, compression, and related tasks. Many researchers have contributed original ideas for performing such operations on images, and a variety of novel nature-inspired techniques have become popular in recent years. After each generation or iteration, the best solutions among a large group of candidates are carried forward, and the remaining candidates need not be retained. The most recent algorithms are considerably more effective than early nature-inspired algorithms, and over the past few years they have gained favor as a way to tackle difficult real-world optimization problems. Each algorithm has its own image processing applications. Image compression, for example, is used in internet browsing, medical science, naval applications, television transmission, and many other areas, and scholars have proposed several image compression approaches over the years.

2. Environment-Based Optimization Techniques

Novel nature-inspired approaches and algorithms have become popular in recent years [1-3] as solutions to a variety of difficult real-world optimization problems. After each generation or iteration, the best solutions among a large group of candidates are carried forward, and the most recent algorithms are considerably more successful than early nature-inspired algorithms. All of them are classified as meta-heuristic algorithms [4-8]. Several nature-inspired optimization strategies have been developed and tested so far, and they fall into two types: Swarm Intelligence Algorithms and Evolutionary Algorithms. This chapter gives a quick review of a few nature-inspired optimization techniques that can help with image processing.

2.1. Evolutionary Algorithms

Evolutionary algorithms are modelled on biological processes: reproduction, mutation, recombination, and selection. Guided by a fitness function, evolutionary algorithms have solved a range of difficult problems, and the field that uses them as a platform for problem solving is known as Evolutionary Computation [9-11]. The fitness value is the cornerstone of evolutionary computation: improving it drives the search toward the best solution. An evolutionary algorithm (EA) has four major stages: initialization, selection, genetic operators, and termination. Put simply, fitter members of the population survive and reproduce, whereas unfit members die off and do not contribute to the gene pool of later generations, as in natural selection.
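The four stages named above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration rather than part of the chapter: the bit-string encoding, the toy `one_max` fitness, binary tournament selection, one-point crossover, elitism, and all parameter values.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption): number of 1-bits, higher is fitter."""
    return sum(bits)

def evolve(fitness, n_bits=20, pop_size=30, generations=60,
           mutation_rate=0.02, target=None, seed=0):
    rng = random.Random(seed)
    # Stage 1, initialization: a random population of bit strings.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select():
        # Stage 2, selection: binary tournament between two random members.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Stage 4, termination: stop early once the target fitness is reached.
        if target is not None and fitness(best) >= target:
            break
        children = [best[:]]  # elitism: the best member always survives
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # Stage 3, genetic operators: one-point crossover, then mutation.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best
```

Calling `evolve(one_max, target=20)` runs until a string of all ones is found or the generation budget is spent. Because the best member is copied unchanged into each new generation, the best fitness never decreases from one generation to the next.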

2.1.1. Genetic Algorithm (GA)

The Genetic Algorithm (GA) was introduced by J. Holland in the 1970s and was further developed and popularized by K. De Jong and D. Goldberg (1989). A genetic algorithm is a type of evolutionary algorithm based on natural selection: a Darwinian-inspired computational approach that relies on mutation, crossover, and selection. GA performance is determined by four factors: population size, generation time, mutation rate, and number of generations. A GA starts from a randomly generated set of candidate solutions to the problem. Each solution is evaluated by a fitness function, and the best solutions from one stage are used to produce new candidate solutions. The procedure is repeated until a satisfactory solution is found [12]. In other words, the genetic algorithm iteratively modifies a population of initial solutions. At each step it selects parents at random from the current population to generate the children of the next generation, and over successive generations the population “evolves” toward an optimal solution. A typical genetic algorithm requires the following:

1. A genetic representation of the solution domain;
2. A fitness function for evaluating the solution domain.

Timetabling and scheduling problems appear to be particularly well suited to genetic algorithms. GAs have also been used in engineering and are frequently applied to global optimization problems.

Genetic programming (GP) is a type of genetic algorithm that evolves a population of programs and selects the best of a set of alternatives, using principles of biological evolution to solve complex problems. John Koza pioneered the use of GP in 1992. Its operations are the selection of the fittest programs for reproduction (crossover) and mutation according to a predefined fitness measure, typically proficiency at the target task. Crossover swaps random parts of selected pairs (parents) to produce new and different offspring that become part of the next generation of programs. Mutation replaces a random part of a program with another random part of a program. Some programs that were not selected for reproduction are copied unchanged from one generation to the next. Selection and the other operations are then applied repeatedly to each new generation. Members of each new generation are typically fitter than members of earlier generations, and the best-of-generation program is frequently better than the best-of-generation programs of earlier generations. The process ends when an individual program reaches a predefined proficiency or fitness level. Liang et al. [13] and Mahmood et al. [14] present applications of GP in image processing, including image analysis, figure-ground segmentation, image segmentation, and image acquisition. Data mining, software synthesis and repair, predictive modelling, financial planning, soft sensors, design, and image analysis are just a few of the areas to which GP was applied early on.

Evolution strategies (ES) apply mutation and recombination in order to produce progressively more effective solutions. They use natural problem-dependent representations and, as search operators, primarily mutation and selection.
As in other evolutionary algorithms, the operators are applied in a loop, and one iteration of the loop is called a generation. The sequence of generations continues until a termination criterion is met. Selection in evolution strategies is deterministic and depends only on fitness rankings, not on actual fitness values; the resulting algorithm is therefore invariant to monotonic transformations of the objective function. The simplest evolution strategy operates on a population of size two: the current point (the parent) and the result of its mutation. The mutant becomes the parent of the next generation only if its fitness is at least as good as that of the parent. Naidu et al. [15] and Sarkar et al. [16] describe the use of evolution strategies (ES) in image processing tasks such as image segmentation and medical imaging.
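The two-member scheme described above (one parent, one mutant, and acceptance only when the mutant is at least as good) can be sketched as follows. The Gaussian mutation with a fixed step size `sigma`, the `sphere` test objective, and the parameter values are assumptions chosen for illustration, and the sketch minimizes rather than maximizes.

```python
import random

def sphere(x):
    """Stand-in objective (an assumption); lower is better."""
    return sum(v * v for v in x)

def one_plus_one_es(objective, x0, sigma=0.3, iterations=500, seed=0):
    rng = random.Random(seed)
    parent, f_parent = list(x0), objective(x0)
    for _ in range(iterations):            # one pass of the loop = one generation
        # Mutation: add Gaussian noise to every coordinate of the parent.
        mutant = [v + rng.gauss(0.0, sigma) for v in parent]
        f_mutant = objective(mutant)
        # Deterministic selection: only the ranking of the two values matters.
        if f_mutant <= f_parent:
            parent, f_parent = mutant, f_mutant
    return parent, f_parent
```

Because acceptance depends only on which of the two values is smaller, replacing `sphere` with an order-preserving transformation of it, for example `lambda v: 2.0 * sphere(v)`, produces exactly the same sequence of parents for a given seed.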

2.2. Swarm Intelligence Algorithms in Image Processing

Gerardo Beni and Jing Wang introduced the concept of swarm intelligence in 1989. A swarm is a group of homogeneous, intelligent agents that interact locally with one another and with their environment. There is no centralized control: each agent follows its own rules and acts independently, and the local interactions among agents give rise to global behavior, which manifests as collective intelligence. Swarm-based algorithms are a recent class of population-based, nature-inspired methods that can produce fast and accurate solutions at moderate cost for a wide range of complex problems. Because of the extraordinary effectiveness of natural swarm systems, researchers have long studied the behavior of insect species. Image segmentation, the division of an image into several disjoint regions, is a typical difficulty in digital image processing. Each region should be homogeneous with respect to some specific properties. Although the general problem remains hard, several good solutions have been presented for specific types of segmentation problems. The following subsections give a general categorization of swarm intelligence algorithms.

2.2.1. Bat Algorithm in Image Processing

The Bat Algorithm (BA), created by Xin-She Yang in 2010, belongs to the Swarm Intelligence family. It is an optimization algorithm based on the echolocation behavior of microbats and on the way they vary their pulse emission rate and loudness. BA has been used to optimize the peak signal-to-noise ratio of vector quantization. The technique was investigated by tuning all the Bat algorithm parameters for fast codebook design and vector quantization of the training set. The frequency-tuning and loudness parameters are used to intensify and diversify the search, respectively. In terms of peak signal-to-noise ratio and reconstructed image quality, the Bat-based method (BA-LBG) outperforms LBG, PSO-LBG, QPSO-LBG, HBMO-LBG, and FA-LBG. According to simulation results, BA-LBG converges 1.841 times faster than HBMO-LBG and FA-LBG. However, BA-LBG requires more parameters than PSO-LBG, QPSO-LBG, and FA-LBG.

Alihodzic and Tuba [17] compared the bat algorithm against other state-of-the-art algorithms for multilevel image thresholding. In its pure form the bat algorithm performed well, but its results were slightly below average, especially when Kapur’s criterion was used. Senthilnath et al. [18] proposed a novel BA-based clustering technique for crop type classification using a multispectral satellite image. The proposed partitional clustering approach extracts information in the form of optimal cluster centers from training samples, and the extracted cluster centers are then validated on test samples. A real-time multispectral satellite image and one benchmark data set from the University of California, Irvine (UCI) repository are used to demonstrate the technique. The performance of BA is compared with that of two other nature-inspired metaheuristic techniques: genetic algorithms and particle swarm optimization.
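The roles of frequency tuning, loudness, and pulse rate mentioned above can be made concrete with a small sketch of a bat-style search. It is an illustration under stated assumptions, not the BA-LBG method or any cited implementation: the `sphere` objective, the parameter values, the initial pulse rate `r0`, and the choice to pull each bat toward the current best are all assumptions.

```python
import math
import random

def sphere(x):
    """Stand-in objective (an assumption); lower is better."""
    return sum(v * v for v in x)

def bat_algorithm(objective, dim=2, bounds=(-5.0, 5.0), n_bats=15,
                  iterations=100, f_min=0.0, f_max=2.0,
                  alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [objective(p) for p in pos]
    loudness = [1.0] * n_bats   # A_i: shrinks after each accepted move
    rate = [0.0] * n_bats       # r_i: pulse emission rate, grows over time
    b = min(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[b][:], fit[b]

    for t in range(1, iterations + 1):
        mean_loud = sum(loudness) / n_bats
        for i in range(n_bats):
            # Frequency tuning scales the step taken relative to the best bat.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (g - x) * freq for v, x, g in zip(vel[i], pos[i], best)]
            cand = [x + v for x, v in zip(pos[i], vel[i])]
            if rng.random() > rate[i]:
                # Local random walk around the best, scaled by mean loudness.
                cand = [g + rng.uniform(-1, 1) * mean_loud for g in best]
            cand = [min(hi, max(lo, c)) for c in cand]
            f_new = objective(cand)
            if f_new <= fit[i] and rng.random() < loudness[i]:
                pos[i], fit[i] = cand, f_new
                loudness[i] *= alpha                         # bat gets quieter
                rate[i] = r0 * (1 - math.exp(-gamma * t))    # and pulses more often
            if f_new < best_fit:
                best, best_fit = cand[:], f_new
    return best, best_fit
```

For a thresholding task, `objective` would be replaced by the negative of a criterion such as Kapur's entropy evaluated on the image histogram, with each coordinate of a bat's position interpreted as one threshold.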

2.2.2. Ant Colony Optimization (ACO) in Image Processing

The ant colony optimization (ACO) technique was inspired by the foraging behaviour of ants. After studying how ants gather food and find the best path between their nest and a food source, biologists concluded that a single ant is a simple organism whose actions appear random, disorganized, and purposeless. An ant colony, on the other hand, is highly organized, complex, and efficient: the ants converge on the shortest path between their colony and a food source. They communicate indirectly by depositing pheromone; an ant that finds a food source lays pheromone on its way back to the nest. Marco Dorigo, A. Colorni, and V. Maniezzo proposed ACO in 1991, inspired by observations of real ants solving discrete optimization problems. ACO’s discrete and parallel nature makes it well suited to image processing, because it not only searches intelligently but also has other desirable properties. Experimental findings reported for an ACO-based approach show better performance in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) for varied densities of Gaussian noise.

Ramos and Almeida [19] carried out ACO research on digital image processing. Jue [20] described Self-adaptive Ant Colony Optimization (SACO), a new approach to image segmentation based on ACO. It is a distributed algorithm in which every ant constructs a candidate partition using pheromone information deposited by the others, and the best partition emerges after several iterations. New cluster centres and small windows are generated to help the algorithm converge, and a variable evaporation coefficient prevents premature convergence. According to the reported results, SACO is competitive with many other global optimization algorithms for image segmentation. Fernandes et al. [21] present an extended model of an artificial ant colony system that evolves on digital image habitats. They show that the swarm can adapt its population size to the type of image on which it evolves, respond rapidly to changing images and so converge more quickly to new target regions, and regulate the number of its image-foraging agents.
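The pheromone mechanism described above, with deposition on the way back from the food source and evaporation over time, can be sketched on the smallest possible example: ants choosing repeatedly among a few alternative nest-to-food paths. The path lengths, the deposit rule of 1/length per trip, and every parameter value are assumptions made for illustration; this is not SACO or any cited algorithm.

```python
import random

def double_bridge(lengths=(1.0, 2.0), n_ants=20, iterations=50,
                  evaporation=0.1, alpha=1.0, seed=0):
    """Ants repeatedly pick one of several nest-to-food paths."""
    rng = random.Random(seed)
    pheromone = [1.0] * len(lengths)        # equal trails at the start
    for _ in range(iterations):
        weights = [p ** alpha for p in pheromone]
        deposits = [0.0] * len(lengths)
        for _ in range(n_ants):
            # Each ant picks a path with probability proportional to pheromone.
            k = rng.choices(range(len(lengths)), weights=weights)[0]
            # Shorter paths receive more pheromone per trip.
            deposits[k] += 1.0 / lengths[k]
        # Evaporation, then reinforcement by this iteration's ants.
        pheromone = [(1.0 - evaporation) * p + d
                     for p, d in zip(pheromone, deposits)]
    total = sum(pheromone)
    return [p / total for p in pheromone]   # share of pheromone per path
```

With the default lengths, the shorter path typically ends up holding the larger share of pheromone, which is the positive feedback that lets a colony settle on a short route. A larger `evaporation` value forgets old trails faster and slows that lock-in.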

2.2.3. Artificial Bee Colony (ABC) Optimization in Image Processing

Dervis Karaboga proposed the ABC algorithm in 2005 [22]. It is a population-based meta-heuristic algorithm inspired by the foraging behaviour of honey bees. ABC involves two processes: a variation process, which explores different locations in the search space, and a selection process, which ensures that prior experience is exploited. ABC has four phases: the initialization phase, the employed bee phase, the onlooker bee phase, and the scout bee phase. Honey bees use collective intelligence to find food sources, and they have a variety of skills, including exchanging information through dancing, memorizing the location of and route to food sources, and decision making. They adjust their positions in response to environmental changes and assign tasks dynamically. A honey bee searches for food sources (flowers) and gathers information such as the quantity of nectar, the ease of extracting it, and the distance and direction from the hive. The profitability of a food source is determined by criteria such as location, richness, and ease of extraction. Bees are of two types: employed and unemployed. An employed bee exploits a discovered food source and keeps track of its nectar level; in the dance area of the hive it shares information about the distance, position, and profitability of the source with unemployed bees. Unemployed bees are always looking for a food source to exploit, and if they find one they share the knowledge with employed bees.

Sajedi et al. [23] proposed RISAB, a novel region-based image steganalysis technique built on an artificial bee colony algorithm. RISAB uses an artificial bee colony to select a sub-image based on density with respect to the cover, stego, and difference images. Two steganalyzers, SPAM and CC-PEV, are used. The performance of the method was compared with that of other methods, and the experimental results show that RISAB outperforms them. Mostafa et al. [24] illustrate the application of ABC in image processing with a liver segmentation technique based on clustering by an artificial bee colony algorithm. After a pre-processing stage that removes image annotations and connects the ribs, ABC is used to obtain an initial segmentation of the liver area, which a region-growing step then refines into the final segmentation. Experiments on a difficult dataset of 38 CT images indicate that the approach gives excellent segmentation results and outperforms numerous existing segmentation algorithms.
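The four ABC phases described above can be sketched in a few dozen lines. The sketch is a simplified illustration under stated assumptions: it minimizes a non-negative stand-in objective (`sphere`), uses 1/(1 + cost) as the quality that attracts onlooker bees, and all parameter values, including the abandonment `limit`, are chosen arbitrarily.

```python
import random

def sphere(x):
    """Stand-in objective (an assumption); non-negative, lower is better."""
    return sum(v * v for v in x)

def abc(objective, dim=2, bounds=(-5.0, 5.0), n_sources=10,
        iterations=100, limit=20, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    new_source = lambda: [rng.uniform(lo, hi) for _ in range(dim)]
    # Initialization phase: one food source per employed bee.
    src = [new_source() for _ in range(n_sources)]
    cost = [objective(s) for s in src]
    trials = [0] * n_sources          # failed improvement attempts per source
    b = min(range(n_sources), key=cost.__getitem__)
    best, best_cost = src[b][:], cost[b]

    def try_neighbour(i):
        # Perturb one coordinate relative to a different, random source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = src[i][:]
        cand[d] += rng.uniform(-1, 1) * (src[i][d] - src[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        c = objective(cand)
        if c < cost[i]:               # greedy selection keeps the better source
            src[i], cost[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(iterations):
        for i in range(n_sources):                  # employed bee phase
            try_neighbour(i)
        # Onlooker bee phase: richer (lower-cost) sources attract more bees.
        quality = [1.0 / (1.0 + c) for c in cost]
        for i in rng.choices(range(n_sources), weights=quality, k=n_sources):
            try_neighbour(i)
        for i in range(n_sources):                  # scout bee phase
            if trials[i] > limit:                   # exhausted source is abandoned
                src[i] = new_source()
                cost[i], trials[i] = objective(src[i]), 0
        b = min(range(n_sources), key=cost.__getitem__)
        if cost[b] < best_cost:
            best, best_cost = src[b][:], cost[b]
    return best, best_cost
```

In a clustering-based segmentation such as the one described above, each food source would encode a set of cluster centres and the cost would measure how poorly the pixels fit those centres.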

2.2.4. Cuckoo Optimization in Image Processing

An improved cuckoo search (ICS) has been described for multi-threshold image segmentation; it addresses the shortcomings of segmentation under multiple thresholds, such as long running time and low quality. Image enhancement has also been researched intensively in recent years and is addressed in a number of recent studies. Lin [25] proposed a method for improving the quality of infrared (IR) images for a long-range surveillance network. The most important feature of the method is that it requires neither prior information about the IR image nor the setting of any parameters. The two fundamental goals of that work are adaptive contrast enhancement using Adaptive Histogram-Based Equalization (AHBE) and enhancement of the high spatial-frequency content of infrared images while preserving the details of the original input images. Zhao [26] proposed a novel image enhancement method based on the Gravitational Search Algorithm (GSA), in which GSA optimizes the parameters of the normalized incomplete Beta function used for the generalized greyscale transformation.

Cuckoo search (CS) is a meta-heuristic method named after cuckoo birds, which are often described as “brood parasites.” Rather than building nests of their own, cuckoos lay their eggs in the nests of other birds. Some host birds confront an intruding cuckoo directly. A host bird can also determine whether the eggs in its nest are its own; if they are not, it either throws them out or abandons the nest and builds a new one. In the algorithm, every egg already in a nest represents a solution, and a cuckoo egg represents a new, feasible solution. Each nest thus contains one cuckoo egg alongside other eggs, which together represent a set of solutions. Every new egg laid by a cuckoo acts as a randomly generated candidate for the search, and a probability rule determines how many eggs remain before the next step. The population of the next generation may therefore be represented by a different number of eggs, and results improve as the number of iterations increases. Iterations are repeated until the desired level of optimization is reached.

In summary, CS algorithms combined with morphological operations significantly improve the brightness and intensity of an image. The CS algorithm first converts the full-colour input image to a greyscale image and then computes the image contrast parameter using its fitness function; the image is then passed to the cuckoo search with Lévy flights to produce the enhanced image. The main goal of the CS algorithm was to improve image quality by finding the most effective contrast factor, complementing the morphological operations, which were carried out by modifying intensity parameters. The CS algorithm selected the optimal parameter and contrast value and gave the best image noise-reduction results.
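The ingredients described above, namely new eggs generated by Lévy flights, replacement of a nest when the new egg is better, and abandonment of a fraction of nests, can be sketched as follows. The sketch assumes Mantegna's algorithm for drawing Lévy-distributed steps, a stand-in `sphere` objective to minimize, and illustrative values for the discovery fraction `pa` and the step scale; none of these come from the chapter.

```python
import math
import random

def sphere(x):
    """Stand-in objective (an assumption); lower is better."""
    return sum(v * v for v in x)

def levy_step(rng, beta=1.5):
    """One Levy-distributed step length drawn with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12        # guard against an exact zero
    return u / abs(v) ** (1 / beta)

def cuckoo_search(objective, dim=2, bounds=(-5.0, 5.0), n_nests=15,
                  iterations=100, pa=0.25, step_scale=0.01, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    clip = lambda x: [min(hi, max(lo, v)) for v in x]
    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    cost = [objective(n) for n in nests]
    for _ in range(iterations):
        best = nests[min(range(n_nests), key=cost.__getitem__)]
        for i in range(n_nests):
            # A cuckoo lays a new egg reached by a Levy flight from nest i.
            egg = clip([x + step_scale * levy_step(rng) * (x - b)
                        for x, b in zip(nests[i], best)])
            c = objective(egg)
            j = rng.randrange(n_nests)      # compare with a randomly chosen nest
            if c < cost[j]:
                nests[j], cost[j] = egg, c
        # Host birds discover a fraction pa of the worst nests and rebuild them.
        order = sorted(range(n_nests), key=cost.__getitem__)
        for i in order[n_nests - int(pa * n_nests):]:
            nests[i] = [rng.uniform(lo, hi) for _ in range(dim)]
            cost[i] = objective(nests[i])
    b = min(range(n_nests), key=cost.__getitem__)
    return nests[b], cost[b]
```

For contrast enhancement of the kind described above, each nest would hold the enhancement parameters (for example a contrast factor) and the cost would be the negative of an image-quality measure computed on the enhanced greyscale image.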

2.2.5. Firefly Optimization Algorithm in Image Processing

Xin-She Yang [27] first proposed the Firefly algorithm (FA), a nature-inspired optimization technique, in 2007. The method was inspired by the flashing behavior of fireflies. Because of its high convergence rate and short computing time, FA is one of the most suitable algorithms for image segmentation. Firefly with Lévy flight has been used with the maximum entropy objective function to optimize both the parameters and the computing time; compared with earlier methods, this approach is more efficient and takes less time. FA is a well-known stochastic optimization approach. Most fireflies produce light, which they use to attract other fireflies and prey, and every firefly can send light signals to other fireflies. Brightness and attractiveness are therefore the key factors in FA, which rests on the following well-known rules:

(a) All fireflies are unisex, so any firefly can attract any other regardless of sex.
(b) Attractiveness is proportional to light intensity (brightness): the brighter the light, the more attractive the firefly. A firefly with lower light intensity moves toward a brighter one.
(c) The brightness of a firefly is determined by the cost (fitness) function used for the search.

The effectiveness of the firefly method has been demonstrated in a variety of applications, including power price forecasting [28], manufacturing cell formation [29], the economic dispatch problem, and image analysis [30-31].
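Rules (a) to (c) translate almost directly into code. In the sketch below, brightness is taken as the negative of a stand-in cost (`sphere`), attractiveness decays with squared distance as `beta0 * exp(-gamma * r2)`, and a small random walk of size `alpha` is added to each move; the decay form, the shrinking of `alpha`, and all parameter values are assumptions for illustration.

```python
import math
import random

def sphere(x):
    """Stand-in objective (an assumption); lower cost means a brighter firefly."""
    return sum(v * v for v in x)

def firefly(objective, dim=2, bounds=(-5.0, 5.0), n_fireflies=15,
            iterations=60, beta0=1.0, gamma=0.5, alpha=0.2, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(p) for p in pos]      # rule (c): cost sets brightness
    b = min(range(n_fireflies), key=cost.__getitem__)
    best, best_cost = pos[b][:], cost[b]
    for _ in range(iterations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):    # rule (a): any j may attract any i
                if cost[j] < cost[i]:       # rule (b): move toward brighter j
                    r2 = sum((a - c) ** 2 for a, c in zip(pos[i], pos[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # fades with distance
                    pos[i] = [min(hi, max(lo, a + beta * (c - a)
                                          + alpha * (rng.random() - 0.5)))
                              for a, c in zip(pos[i], pos[j])]
                    cost[i] = objective(pos[i])
                    if cost[i] < best_cost:
                        best, best_cost = pos[i][:], cost[i]
        alpha *= 0.97                       # gradually reduce the random walk
    return best, best_cost
```

The best position is tracked separately because a firefly's random walk can move it to a worse position. For maximum-entropy thresholding, the cost would be the negative entropy of the classes produced by the thresholds that a firefly's position encodes.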

2.2.6. Elephant Herding Optimization (EHO) Algorithm in Image Processing

The Elephant Herding Optimization (EHO) approach, which is based on the herding behavior of elephants, was presented by Wang et al. [32]. An elephant herd is divided into clans, each led by a matriarch and made up of female elephants and their calves. Female members prefer to live with their families, whereas male members tend to be nomadic and solitary: as they grow up they become increasingly independent, eventually either travelling alone or joining a small group of male elephants. To solve an optimization problem, the herding behavior is modelled by two operators, a clan updating operator and a separating operator, under the following assumptions:

• In each generation, a certain number of male elephants live outside the clan; they are treated as the individuals with the lowest fitness value (for a maximization problem).
• The matriarchs are treated as the individuals with the highest fitness value.
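The two operators and the assumptions listed above can be sketched as follows. This is an illustrative reading of EHO, not the authors' code: the scale factors `alpha` and `beta`, the use of the clan centre to update the matriarch, the replacement of one elephant per clan, and the stand-in `sphere` objective are assumptions. The sketch minimizes, so the "lowest fitness" elephant of the text corresponds to the highest cost here.

```python
import random

def sphere(x):
    """Stand-in objective (an assumption); lower is better."""
    return sum(v * v for v in x)

def eho(objective, dim=2, bounds=(-5.0, 5.0), n_clans=3, clan_size=5,
        iterations=100, alpha=0.5, beta=0.1, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    rand_pos = lambda: [rng.uniform(lo, hi) for _ in range(dim)]
    clip = lambda x: [min(hi, max(lo, v)) for v in x]
    clans = [[rand_pos() for _ in range(clan_size)] for _ in range(n_clans)]
    best = min((e for c in clans for e in c), key=objective)[:]
    best_cost = objective(best)
    for _ in range(iterations):
        for clan in clans:
            clan.sort(key=objective)        # clan[0] is the matriarch (fittest)
            matriarch = clan[0]
            centre = [sum(e[d] for e in clan) / clan_size for d in range(dim)]
            # The matriarch is updated from the centre of her clan.
            updated = [clip([beta * c for c in centre])]
            for e in clan[1:]:
                # Clan updating operator: each elephant follows the matriarch.
                r = rng.random()
                updated.append(clip([x + alpha * (m - x) * r
                                     for x, m in zip(e, matriarch)]))
            # Separating operator: the worst elephant leaves the clan and is
            # replaced by one at a random position.
            worst = max(range(clan_size), key=lambda k: objective(updated[k]))
            updated[worst] = rand_pos()
            clan[:] = updated
            cand = min(clan, key=objective)
            if objective(cand) < best_cost:
                best, best_cost = cand[:], objective(cand)
    return best, best_cost
```

The separating operator keeps injecting fresh random positions, which preserves diversity, while the clan updating operator concentrates each clan around its matriarch.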

Eva et al. [33] adapted the elephant herding optimization technique to multilevel thresholding using Kapur’s and Otsu’s [34, 35] criteria. The technique solved multilevel image thresholding problems in more instances than previous algorithms and with lower variance in almost every example, which confirms the robustness of the algorithm. To address the multilevel image thresholding problem for image segmentation, Chakraborty et al. [36] proposed an improved elephant herding optimization (IEHO) that uses opposition-based learning (OBL) and dynamic Cauchy mutation (DCM). Experiments show that IEHO surpasses earlier approaches in terms of optimal solution, peak signal-to-noise ratio, and structural and feature similarity indices.
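The two thresholding criteria named above are the objective functions that a metaheuristic such as EHO would maximize. A minimal sketch of both, written for a grey-level histogram and a list of thresholds, is given below. The function names and the convention that a threshold `t` starts a new class at grey level `t` are assumptions, and the exhaustive search is only a baseline for tiny histograms that an optimizer would replace.

```python
import math
from itertools import combinations

def _classes(hist, thresholds):
    """Split the normalized histogram into classes at the sorted thresholds."""
    total = float(sum(hist))
    p = [h / total for h in hist]
    edges = [0] + sorted(thresholds) + [len(hist)]
    return [(lo, p[lo:hi]) for lo, hi in zip(edges, edges[1:])]

def otsu_objective(hist, thresholds):
    """Otsu's criterion: between-class variance (to be maximized)."""
    classes = _classes(hist, thresholds)
    mu_total = sum((lo + i) * pi for lo, ps in classes for i, pi in enumerate(ps))
    var = 0.0
    for lo, ps in classes:
        w = sum(ps)                                   # class probability
        if w > 0:
            mu = sum((lo + i) * pi for i, pi in enumerate(ps)) / w
            var += w * (mu - mu_total) ** 2
    return var

def kapur_objective(hist, thresholds):
    """Kapur's criterion: sum of the class entropies (to be maximized)."""
    total = 0.0
    for _, ps in _classes(hist, thresholds):
        w = sum(ps)
        total -= sum((pi / w) * math.log(pi / w) for pi in ps if pi > 0)
    return total

def best_thresholds(hist, k, objective):
    """Exhaustive baseline over all k-threshold combinations."""
    return max(combinations(range(1, len(hist)), k),
               key=lambda t: objective(hist, list(t)))
```

The number of threshold combinations grows rapidly with the number of thresholds and grey levels, which is why multilevel thresholding is usually handed to a metaheuristic instead of being searched exhaustively.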


2.2.7. Grey Wolf Optimization Algorithm in Image Processing

The original GWO method is an intelligent optimization algorithm that emulates the grey wolf’s social hierarchy and efficient collective hunting behavior. Grey wolves belong to the Canidae family. They are apex predators, which means they sit at the very top of the food chain. Grey wolves prefer to live in a pack, with an average group size of 5-12, and they follow a very strict social dominance hierarchy. The leaders, known as alphas, are a male and a female. The alpha is mostly in charge of decisions about hunting, sleeping arrangements, and waking times, among other things, and the pack follows the alpha’s decisions. Beta is the second rank in the grey wolf hierarchy. Betas, which may be male or female, assist the alpha in decision-making and other pack duties. The beta is the alpha’s counsellor and the pack’s disciplinarian: it enforces the alpha’s commands throughout the pack and gives feedback to the alpha. Omega is the lowest-ranking grey wolf. The omega plays the role of scapegoat: omega wolves must always submit to all the more dominant wolves, and they are the last wolves allowed to eat. Although the omega may appear to be an unimportant member of the group, it has been observed that the entire group suffers from internal conflict and difficulties if the omega is lost. This is because the omega serves as an outlet for the aggression and frustration of all the other wolves, which helps keep the pack content and preserves the dominance structure. In some cases the omegas are also the pack’s babysitters. Based on social standing, grey wolves are categorized into four groups: alpha, beta, delta, and omega. In the GWO algorithm, hunting is modelled as tracking and encircling the prey, after which the attack is completed.

Thresholding is one of the most widely used strategies for image segmentation in current research. Many threshold-selection techniques and algorithms have been proposed. Wang et al. [37] investigated the maximum between-class variance approach, Dunn et al. [38] investigated threshold selection based on homogenization error, and Sahoo et al. [39] presented a Renyi entropy-based threshold segmentation method. These methods have drawbacks, however: selecting several thresholds dramatically increases the amount of computation required. Threshold selection is therefore an optimization problem, and the goal is to find the ideal thresholds rapidly and precisely. An improved grey wolf optimization algorithm, based on weight improvement, has been applied to multi-threshold image segmentation. The segmentation approach based on this algorithm not only provides better segmentation quality than the PSO method, but also has clear benefits in terms of execution time and completion. This advanced technique helps in secure data transmission and execution [40-53].
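The threshold search described above can be sketched in a few lines. The example below is a minimal illustration and not the improved, weight-based variant referred to in the text: it uses the standard GWO position update, and it assumes the between-class variance of the grey-level histogram as the objective. The function names, the pack size, and the iteration count are all hypothetical choices made for this sketch.

```python
import random


def between_class_variance(hist, thresholds):
    """Between-class variance of the classes cut out of `hist` by `thresholds`."""
    total = sum(hist)
    mean_all = sum(i * h for i, h in enumerate(hist)) / total
    edges = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        count = sum(hist[lo:hi])
        if count == 0:
            continue  # an empty class contributes nothing
        mean = sum(i * hist[i] for i in range(lo, hi)) / count
        variance += (count / total) * (mean - mean_all) ** 2
    return variance


def gwo_thresholds(hist, k=2, wolves=20, iterations=60, seed=0):
    """Search for k thresholds with the standard (unweighted) GWO update."""
    rng = random.Random(seed)
    top = len(hist) - 1

    def fitness(position):
        return between_class_variance(hist, [int(round(p)) for p in position])

    # Each wolf is a candidate vector of k thresholds.
    pack = [[rng.uniform(1, top) for _ in range(k)] for _ in range(wolves)]
    best, best_fit = None, float("-inf")
    for t in range(iterations):
        pack.sort(key=fitness, reverse=True)
        leaders = [pack[0][:], pack[1][:], pack[2][:]]  # alpha, beta, delta
        if fitness(leaders[0]) > best_fit:  # remember the best solution seen
            best, best_fit = leaders[0], fitness(leaders[0])
        a = 2.0 * (1 - t / iterations)  # shrinks from 2 towards 0
        for wolf in pack:
            for d in range(k):
                estimate = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    estimate += leader[d] - A * abs(C * leader[d] - wolf[d])
                # Average of the three leader-guided moves, kept in range.
                wolf[d] = min(max(estimate / 3, 1), top)
    final = max(pack, key=fitness)
    if fitness(final) > best_fit:
        best = final
    return sorted(int(round(p)) for p in best)


if __name__ == "__main__":
    # Synthetic histogram with three grey-level clusters (illustrative data).
    hist = [1 + 50 * (abs(i - 40) < 10) + 50 * (abs(i - 128) < 10)
            + 50 * (abs(i - 210) < 10) for i in range(256)]
    print(gwo_thresholds(hist, k=2))
```

The three best wolves guide every other wolf, which mirrors the hierarchy described in this section; the omegas are simply the remaining members of the pack.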

Conclusion

Optimization is among the most useful techniques in image processing, particularly in areas such as object matching. Grey wolf optimization, Bat optimization, Ant colony optimization, Artificial Bee Colony optimization, Firefly optimization, the Cuckoo Search Algorithm, and Elephant Herding optimization were all discussed briefly, together with the different image processing applications of each algorithm. Developing these nature-inspired algorithms requires considerable effort as well as strong programming skills.

References

[1] Anand, R., and Chawla, P. (2016, March). A review on the optimization techniques for bio-inspired antenna design. In 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom) (pp. 2228-2233). IEEE.
[2] Sindhwani, N., and Bhamrah, M. S. (2017). An optimal scheduling and routing under adaptive spectrum-matching framework for MIMO systems. International Journal of Electronics, 104(7), 1238-1253.
[3] Anand, R., Arora, S., and Sindhwani, N. (2022, January). A Miniaturized UWB Antenna for High Speed Applications. In 2022 International Conference on Computing, Communication and Power Technology (IC3P) (pp. 264-267). IEEE.
[4] Pandey, D., Pandey, B. K., and Wairya, S. (2021). Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Computing, 25(2), 1563-1580.
[5] Pandey, B. K., Pandey, D., Wariya, S., Aggarwal, G., and Rastogi, R. (2021). Deep Learning and Particle Swarm Optimisation-Based Techniques for Visually Impaired Humans’ Text Recognition and Identification. Augmented Human Research, 6(1), 1-14.
[6] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., and Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.
[7] Jain, S., Sindhwani, N., Anand, R., and Kannan, R. (2022). COVID Detection Using Chest X-Ray and Transfer Learning. In International Conference on Intelligent Systems Design and Applications (pp. 933-943). Springer, Cham.
[8] Pandey, D., and Pandey, B. K. (2022). An Efficient Deep Neural Network with Adaptive Galactic Swarm Optimization for Complex Image Text Extraction. In Process Mining Techniques for Pattern Recognition (pp. 121-137). CRC Press.
[9] Pandey, B. K., Pandey, D., Wairya, S., Agarwal, G., Dadeech, P., Dogiwal, S. R., and Pramanik, S. (2022). Application of Integrated Steganography and Image Compressing Techniques for Confidential Information Transmission. Cyber Security and Network Security, 169-191.
[10] Sindhwani, N., and Singh, M. (2017, March). Performance analysis of ant colony based optimization algorithm in MIMO systems. In 2017 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET) (pp. 1587-1593). IEEE.
[11] Sindhwani, N., and Singh, M. (2020). A joint optimization based sub-band expediency scheduling technique for MIMO communication system. Wireless Personal Communications, 115(3), 2437-2455.
[12] Maihami, V., and Yaghmaee, F. (2017). A Genetic-Based Prototyping for Automatic Image Annotation, pp. 1–13. Elsevier, New York.
[13] Liang, Y., Zhang, M., and Browne, W. N. (2017). Image Feature Selection Using Genetic Programming for Figure-Ground Segmentation. Eng. Appl. Artif. Intell., 62, 96–108.
[14] Mahmooda, M. T., Majid, A., Han, J., and Choi, Y. K. (2013). Genetic programming based blind image deconvolution for surveillance systems. Eng. Appl. Artif. Intell., 26, 1115–1123.
[15] Naidu, M. S. R., Rajesh Kumar, P., and Chiranjeevi, K. (2017). Shannon and Fuzzy Entropy Based Evolutionary Image Thresholding for Image Segmentation. 1–13. Elsevier, New York.
[16] Sarkar, S., Das, S., and Chaudhuri, S. S. (2016). Multi-level thresholding with a decomposition-based multi-objective evolutionary algorithm for segmenting natural and medical images. Appl. Soft Comput., 50, 142–157.
[17] Adis Alihodzic, and Milan Tuba. (2014). “Improved Bat Algorithm Applied to Multilevel Image Thresholding”, The Scientific World Journal, vol. 2014, Article ID 176718, 16 pages, https://doi.org/10.1155/2014/176718.
[18] Senthilnath, J., Kulkarni, S., Benediktsson, J. A., and Yang, X. S. (2016). A novel approach for multispectral satellite image classification based on the bat algorithm. IEEE Geosci. Remote Sens. Lett., 13(4), 599–603.
[19] Ramos, V., and Almeida, F. (2000). Artificial Ant Colonies in Digital Image Habitats—A mass Behavior Effect Study on Pattern Recognition, in: Proc. of 2nd Int. Wksp. on Ant Algorithms, Belgium, September, pp. 113–116.
[20] Jue, L. U. (2005). An Self-Adaptive Ant Colony Optimization Approach for Image Segmentation, International Conference on Space Information Technology, 5985, pp. 647–652.
[21] Fernandes, C., Ramos, V., and Rosa, A. C. (2005). Self-regulated artificial ant colonies on digital image habitats. International Journal of Lateral Computing, 2(1), pp. 1-8.
[22] Karaboga, D. (2005). An idea based on honey bee swarm for numerical optimization, in: Technical report-tr06, Erciyes university, engineering faculty, computer engineering department. 1-10.
[23] Sajedi, H., and Ghareh Mohammadi, F. (2016). Region based image steganalysis using artificial bee colony. J. Vis. Commun. Image Represent. 1–25.
[24] Mostafa, A., Fouad, A., Elfattah, M. A., Hassanien, A. E., Hefny, H., Zhu, S. Y., and Schaefer, G. (2015). CT liver segmentation using artificial bee colony. Proc. Comput. Sci., 60, 1622–1630.
[25] Lin, C. L. (2011). An approach to adaptive infrared image enhancement for long-range surveillance. Infrared Phys. Technol., 54, 84–91.
[26] Zhao, W. (2011). Adaptive image enhancement based on gravitational search algorithm. Procedia Eng., 15, 3288–3292.
[27] Yang, X. S. (2009). Firefly algorithms for multimodal optimization in Stochastic Algorithms: Foundations and Applications. Lecture Notes in Computer Science, 5792, 169–178.
[28] Mandal, P., Haque, A. U., Meng, J., Srivastava, A. K. and Martinez, R. (2013). A novel hybrid approach using wavelet, firefly algorithm, and fuzzy ARTMAP for day-ahead electricity price forecasting. IEEE Transactions on Power Systems, 28(2), 1041–1051.
[29] Sayadi, M. K., Hafezalkotob, A. and Naini, S. G. J. (2013). Firefly-inspired algorithm for discrete optimization problems: an application to manufacturing cell formation. Journal of Manufacturing Systems, 32(1), 78–84.
[30] Horng, M. H. (2012). Vector quantization using the firefly algorithm for image Compression. Expert Systems with Applications, 39(1), 1078–1091.
[31] Pandey, D., and Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In: Mishra, B., Tiwari, M. (eds) VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.
[32] Hassanzadeh, T., Vojodi, H. and Mahmoudi, F. (2011). Non-linear grayscale image enhancement based on firefly algorithm in Swarm, Evolutionary, and Memetic Computing. Springer, Berlin, Germany. pp. 174–181.
[33] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., and Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[34] Kapur, J., Sahoo, P., and Wong, A. (1985). “A new method for gray-level picture thresholding using the entropy of the histogram”, Computer vision graphics and image processing, vol. 29, no. 3, pp. 273-285.
[35] Sharma, M., Sharma, B., Gupta, A. K., and Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.
[36] Chakraborty, F., Roy, P. K. and Nandi, D. (2019). Oppositional elephant herding optimization with dynamic Cauchy mutation for multilevel image thresholding. Evol. Intel. 12, 445–467. https://doi.org/10.1007/s12065-019-00238-1.
[37] Wang, G. G., Deb, S., Geo, X. Z., and Coelho, L. D. S. (2016). A new metaheuristic optimization algorithm motivated by elephant herding behavior. International Journal of Bio-inspired Computation, 8(6), 394–409.
[38] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., and Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.
[39] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., and Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[40] Albermany, S. A., and Safdar, G. A. (2014). Keyless security in wireless networks. Wireless personal communications, 79(3), 1713-1731.
[41] Albermany, S. A., Hamade, F. R., and Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[42] Albermany, S., and Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[43] Hussein, R. I., Hussain, Z. M., and Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[44] Sayal, M. A., Alameady, M. H., and Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[45] Albermany, S. A., Hamade, F. R., and Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[46] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.
[47] Sayel, N. A., Albermany, S., and Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[48] Albermany, S., Ali, H. A., and Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[49] Degerine, S., and Zaidi, A. (2004). “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June, doi: 10.1109/TSP.2004.827195.
[50] Zaidi, A. (2005). “Positive definite combination of symmetric matrices,” in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov., doi: 10.1109/TSP.2005.855077.
[51] Degerine, S., and Zaidi, A. (2007). “Sources colorees” a chapter of a collective work entitled “Séparation de sources 1 concepts de base et analyse en composantes indépendantes”, Traité IC2, série Signal et image [“Separation of sources 1 basic concepts and independent component analysis”, Treatise IC2, Signal and image series], Hermes Science, ISBN: 2746215179.
[52] Zaïdi, A. (2019). “Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices”, in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, doi: 10.1109/LSP.2019.2909651.
[53] Degerine, S., and Zaidi, A. (2007). “Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints”, in SIAM Journal on Optimization, Vol. 17, No. 4, pp. 997-1014, doi: 10.1137/050622821.

Chapter 13

Simulating the Integration of Compression and Deep Learning Approaches in IoT Environments for Security Systems

Harinder Singh1, Rohit Anand2,†, Vivek Veeraiah3,‡, Veera Talukdar4,¶, Suryansh Bhaskar Talukdar5,#, Sushma Jaiswal6,• and Ankur Gupta7,§

1 Department of Computer Science & Information Technology, Sant Baba Attar Singh Khalsa College Sandaur, Punjab, India
2 Department of Electronics and Communication Engineering, G. B. Pant DSEU Okhla-I Campus (Formerly G. B. Pant Engineering College), New Delhi, India
3 Department of Research & Development Computer Science, Adichunchanagiri University, Mandya, Karnataka, India
4 Department of Information Technology and Management, Shri Ram College of Commerce, Science and Arts, Mumbai, Maharashtra, India
5 School of Computer Science and Engineering, Vellore Institute of Technology (VIT) Bhopal, Madhya Pradesh, India
6 Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, Chhattisgarh, India
7 Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India

Corresponding Author’s Email: [email protected].
† Corresponding Author’s Email: [email protected].
‡ Corresponding Author’s Email: [email protected].
¶ Corresponding Author’s Email: [email protected].
# Corresponding Author’s Email: [email protected].
• Corresponding Author’s Email: [email protected].
§ Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing
Editor: Digvijay Pandey
ISBN: 979-8-88697-832-2
© 2023 Nova Science Publishers, Inc.

Abstract

Deep learning has frequently been used for classification and prediction in prior research. In the present work, a compression mechanism is integrated with a deep learning approach in order to enhance the security of the IoT environment. Moreover, the deep learning model uses a filtered dataset to improve accuracy when classifying attacks. The present research is capable of detecting and classifying different attacks such as brute force, man-in-the-middle, and SQL injection. Detecting such attacks restricts unauthenticated data transmission. The simulation classifies attacks into different categories on the basis of a trained deep learning model.

Keywords: compression, deep learning, IoT, security, accuracy, F1 Score, recall value, precision

1. Introduction

Bit-Swap [1] is a deep learning-based lossless data compression method. Using bits-back coding and asymmetric numeral systems, it builds upon earlier work on practical compression using latent variable models. Its reported results show Bit-Swap outperforming state-of-the-art image compressors on a wide variety of test images. To encourage further research and development of these compression ideas, the code for the approach and the optimized models have been made available, together with a demo and a pre-trained model for applying Bit-Swap compression and decompression to one's own images. The many sensors of the Internet of Things (IoT) generate large volumes of information useful for many purposes. Deep Learning (DL) can help make sense of this mountain of data, leading to better IoT systems. For effective analysis and storage, big IoT data must be routinely transferred between the edge and the cloud. In low-bandwidth wide area network settings, data transportation is expensive. One way to alleviate the bandwidth crunch is data compression, which drastically reduces the amount of data being sent.

1.1. Deep Learning

Deep learning is one of several subfields of machine learning. It learns representations directly from data, and it is also referred to as deep neural learning or as a deep neural network [2-4]. Algorithms are essential to the process of machine learning, and deep learning in particular requires an enormous quantity of data. According to recent research, automated machine learning still depends on human oversight. A deep learning model is not programmed with hard and fast rules; it learns from examples. To help a computer find a picture of a dog, for instance, as many detailed examples as possible should be provided. As enormous data sets become more accessible to automated programs, more accurate results can be obtained. With sufficient data, the speed and accuracy of deep learning (Figure 1) can surpass those of traditional machine learning. Because the model learns in successive layers, each building on the output of the previous one, the method is often described as “hierarchical learning.” The breadth of uses for machine learning has expanded beyond what was formerly imagined, and this strategy relies heavily on artificial neural networks.

Figure 1. Deep learning.


Machine learning has long been recognized as a common application of AI. “Artificial Intelligence” (AI) is a catchall phrase for the use of human-level intelligence in computer systems. This section groups together some related ideas. To some extent, such techniques might reduce storage needs in reinforcement learning applications. Machine learning is an essential use of AI: it gives a system the opportunity to improve from experience in situations its explicit programming does not cover. Machine learning focuses on developing computer programs that can access data and use it to learn for themselves [5, 6]. The primary goal of this research is to develop methods by which machines can acquire knowledge and skills without human intervention or assistance.
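The layered structure behind the term “hierarchical learning” can be illustrated with a forward pass through a small network. The following is a minimal sketch with hand-picked weights, assuming ReLU hidden layers and a softmax output; it is not the model trained in this chapter, and the function names and layer sizes are hypothetical.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Keep positive activations and zero out the rest."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into class probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, layers):
    """Apply hidden layers with ReLU, then a softmax output layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


if __name__ == "__main__":
    # Two features, one hidden layer of two units, two output classes.
    layers = [
        ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
        ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ]
    print(forward([2.0, -1.0], layers))
```

In a trained classifier the weights would be learned from labelled traffic rather than written by hand; only the flow of data through the layers is shown here.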

1.2. IoT

The term “Internet of Things” refers to a system in which electronic devices, software, a network of items, and various sensors enable objects (which may or may not be people) to collect and share data [7-9]. Connecting devices to the internet is nothing new, but the IoT makes it easier than ever. It allows items to recognise one another and share information when they are collocated. The ideas of the Internet of Things make it possible to link things at any time, in any place, with any other resource (Figure 2). Because these things interact intelligently with one another and support decisions on a relative basis, the IoT has an influence on several areas of education, technology, and industrial applications. The Internet of Things appears to be one of the most significant technological advances of our time. Data collected by networked smart devices has the potential to improve healthcare delivery in a number of ways, from allowing patients to make better-informed decisions to helping businesses save money without sacrificing quality. An IoT network is shown in Figure 3. Home automation includes controlling the environment of a home, including the lighting, HVAC, audio/visual, and security systems [8]. Automatically shutting down lights and gadgets, or informing the inhabitants about their own usage, might lead to long-term savings. A smart toilet seat can measure vitals, including blood pressure, oxygen levels, weight, and heart rate. A platform or hub that controls electronics and home appliances might form the core of a smart or automated house’s design. To allow users to manage their home’s electronics using iOS devices like the iPhone and Apple Watch, several manufacturers are considering incorporating Apple’s HomeKit technology into their products. Siri and other native iOS capabilities may be used for this purpose. Lenovo’s Smart Home Essentials is an example of a smart home device bundle that can be operated by Apple’s Home app or Siri without a WiFi bridge. Examples of smart home hubs that can be used on their own to connect smart devices of one’s choice are the Amazon Echo, Google Home, Apple HomePod, and the Samsung SmartThings Hub. Home Assistant, openHAB, and Domoticz are just a few of the open-source alternatives to expensive proprietary commercial solutions. Through the internet, the IoT aims to make it possible for everyday objects and machines to exchange data with one another. Smart gadgets such as smartphones, tablets, laptops, and wireless sensor networks are all part of the IoT.

Figure 2. IoT concepts.

Figure 3. IoT network.

1.3. Compression

Data compression reduces the number of bits needed to represent information. Compression may be either lossless or lossy. When data is compressed using a lossy method, some information is lost, whereas with a lossless method all information is preserved [10-13]. The word “lossless” refers to the fact that no information is lost during the compression process; lossless methods rely instead on cleverer ways to encode the data. Compression’s key benefits are that it requires less space for data storage, less time to transmit that data, and less bandwidth for communication [13]. The potential savings are substantial: storing compressed files instead of the equivalent uncompressed data greatly reduces storage costs.

1.3.1. Lossless Compression

As their name suggests, lossless compression methods keep all data intact throughout the compression process. With lossless compression, the original data can be reconstructed exactly from the compressed data. Data loss is unacceptable in many applications, so developers often resort to lossless compression [14]. Text compression is a key application area. If even a single letter is changed in the reconstruction, the statement’s meaning might change dramatically. To illustrate, compare the phrases “Do not send money” and “Do now send money.” A similar argument applies to computer files and to certain kinds of data, such as financial records. Maintaining the integrity of data is also crucial if the data are to be “enhanced” or processed in the future to provide additional insights. Suppose lossy compression is applied to a radiology image and the resulting reconstruction, Y, appears indistinguishable from the original, X. If this image is later enhanced, differences that were previously undetectable may produce artifacts that significantly mislead the radiologist. Since human lives are on the line, it is important to be cautious when choosing a compression strategy whose reconstruction differs from the original. Likewise, when scientists analyse satellite data, they may derive a variety of quantitative measures of things like plant cover, tree loss, and more. During such processing, discrepancies between the original data and the reconstructed data may be “enhanced.” It is not always possible to go back and obtain the same data again, so it is inadvisable to allow any differences to be introduced during compression.
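The requirement that the reconstruction match the original exactly can be checked directly. The following is a minimal sketch using Python's standard `zlib` module (DEFLATE), chosen here only for illustration since the text does not name a specific lossless codec; the helper name `roundtrip` and the sample message are assumptions.

```python
import zlib
from typing import Tuple


def roundtrip(text: str) -> Tuple[bytes, str]:
    """Compress text losslessly and reconstruct it from the compressed bytes."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level=9)  # 9 = strongest compression setting
    restored = zlib.decompress(packed).decode("utf-8")
    return packed, restored


if __name__ == "__main__":
    message = "Do not send money. " * 50
    packed, restored = roundtrip(message)
    # Lossless: every character survives, so the meaning cannot drift.
    print("identical:", restored == message)
    print(len(message.encode("utf-8")), "bytes ->", len(packed), "bytes")
```

Because the restored string is compared with the original character by character, a single changed letter of the kind discussed above would be detected immediately.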

1.3.2. Lossy Compression

Lossy compression, also known as irreversible compression, is a kind of data compression that reduces the amount of data by making approximations and discarding part of the original. These methods are used to lessen the volume of material that must be stored, processed, and transmitted. Increasing the degree of approximation produces progressively coarser images. In contrast, lossless compression (also known as reversible compression) keeps all of the original information intact while reducing file size. Lossy compression may provide substantially larger reductions in data size than lossless methods [15].

Figure 4. Lossy data compression.

A well-designed lossy compression algorithm (like that shown in Figure 4) can often shrink files drastically without the end user noticing. Even when the degradation is noticeable to the user, further data reduction may still be desirable. Since it was first published by Nasir Ahmed, T. Natarajan, and K. R. Rao in 1974, the discrete cosine transform (DCT) has become the most widely used lossy compression method. A novel family of sinusoidal-hyperbolic transform functions, with features and performance on par with the DCT, was suggested in 2019 for lossy compression.
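The trade-off between approximation and data size can be made concrete with the DCT. The following is a minimal sketch of transform coding on a one-dimensional signal, assuming that the lossy step is simply discarding the highest-frequency coefficients; practical image codecs apply a two-dimensional DCT to pixel blocks and quantize the coefficients instead. The function names and the sample row of pixel values are hypothetical.

```python
import math


def _scale(k, n):
    """Normalisation factor that makes the DCT-II orthonormal."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct(signal):
    """Orthonormal DCT-II of a list of samples."""
    n = len(signal)
    return [_scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal))
            for k in range(n)]


def idct(coeffs):
    """Inverse of `dct` (the transpose of the orthonormal transform)."""
    n = len(coeffs)
    return [sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def lossy_roundtrip(signal, keep):
    """Keep only the first `keep` coefficients, then reconstruct."""
    coeffs = dct(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)


def rms_error(original, reconstructed):
    """Root-mean-square difference between two equally long signals."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed))
                     / len(original))


if __name__ == "__main__":
    row = [52, 55, 61, 66, 70, 61, 64, 73]  # illustrative pixel intensities
    for keep in (8, 4, 2):
        approx = lossy_roundtrip(row, keep)
        print(keep, "coefficients kept, RMS error", round(rms_error(row, approx), 3))
```

Keeping every coefficient reproduces the input up to rounding error, while keeping fewer stores less data at the cost of a larger reconstruction error, which is the irreversibility that defines lossy compression.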


1.4. Role of Compression in IoT

For effective analysis and storage, big IoT data must be routinely transferred between the edge and the cloud. In low-bandwidth wide area network settings, data transportation is expensive. Data compression can alleviate the bandwidth crunch by drastically decreasing the amount of data being sent. When data is compressed, the number of bits used to represent it is decreased. Digital encoding uses bits (binary digits), that is, patterns of 0s and 1s. Data compression may lessen the need for expensive storage hardware and increase the speed of information transfers, all while saving space. Today, data compression is crucial for storing information digitally on computer systems and sending it across communications networks, since a single uncompressed digital picture may take tens of megabytes. Compressed files can be exchanged between computers more rapidly and take up less storage space than their uncompressed counterparts [16].
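The effect on an edge-to-cloud transfer can be estimated from the payload size and the link rate, since the transfer time is the number of bits divided by the bandwidth. The following sketch assumes a batch of JSON-encoded sensor readings, `zlib` as the codec, and a 50 kbit/s uplink; the field names, the sensor identifier, and the link rate are hypothetical values chosen for illustration.

```python
import json
import zlib


def transfer_seconds(num_bytes, link_bits_per_second):
    """Time to send a payload over a link, ignoring protocol overhead."""
    return num_bytes * 8 / link_bits_per_second


# A batch of 500 readings from one simulated temperature sensor.
readings = [{"sensor": "temp-01", "t": i, "value": 20.0 + (i % 5) * 0.1}
            for i in range(500)]
raw = json.dumps(readings).encode("utf-8")
packed = zlib.compress(raw)

if __name__ == "__main__":
    link = 50_000  # assumed uplink rate in bits per second
    print("raw:   ", len(raw), "bytes,", round(transfer_seconds(len(raw), link), 2), "s")
    print("packed:", len(packed), "bytes,", round(transfer_seconds(len(packed), link), 2), "s")
```

Sensor batches of this kind are highly repetitive, which is why a general-purpose lossless codec already shortens the transfer noticeably before any learned compression is considered.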

1.5. Role of Security in IoT

For IoT devices to function properly, security must be included at every level, including hardware, software, and connectivity. Without proper protections for the Internet of Things, any appliance or robot in a factory might be compromised. Once hackers take over, they may steal the user’s digital information and even render the device useless. An IoT ecosystem is formed by internet-connected smart devices whose embedded systems, including CPUs, sensors, and communication hardware, gather, transmit, and act on data they obtain from their surroundings. IoT technology improves the efficacy of video surveillance. The ability of smart cameras and linked apps to interpret visual information without human interaction facilitates the introduction of several automated procedures. To fully realise the potential of these cameras, their security must be ensured. IoT security covers the protection, identification, and monitoring of hazards, as well as assistance in fixing vulnerabilities across the wide variety of devices that may pose security dangers to an organization [16]. Security for IoT devices may be achieved by incorporating active security mechanisms within the devices’ software. One technique to defend devices from attacks is to install security features such as password protection for the software. Physical security systems that combine the Internet of Things with video surveillance will be better equipped to aid in complicated tasks including operations management, predictive maintenance, risk reduction, cost reduction, and conflict management. Among IoT uses, environmental monitoring is one of the most valuable. It fosters improved sustainability by using cutting-edge sensor technology to detect environmental hazards in the air and water. Employing a sophisticated environmental monitoring system can help maintain a safe and sanitary workplace.

1.6. Different Types of Cyber Crimes

• Criminals use phishing, also known as “spoofing,” to trick people into giving over their personal information through email by making the message seem to come from a reputable company [17].
• Cyber stalkers are those who harass others over the Internet, usually via electronic means like email, social media, or websites.
• Ransomware is a sort of malicious software that cybercriminals install on a computer in order to hold the user’s data hostage until the user pays a ransom.
• Many escorts offer their services in online ads, social media forums, or on their own personal websites, making it easier for clients to discover them in a discreet manner.
• In the last year, the National Center for Missing and Exploited Children received almost 10 million reports of possible child sexual exploitation.
• Theft of intellectual property, or piracy, is rampant on the Internet, with many works of literature, films, and other media accessible for free download.
• Account hacking: someone who gains unauthorized access to another person’s email, social media accounts, or computer may go to prison for “account hacking.”
• Drug trafficking: the use of cryptocurrencies has enabled a meteoric growth in the internet drug trade over the last several years.
• Spyware, malicious software installed unintentionally on a victim’s computer or portable device, is the starting point for half of all credit card theft.


1.7. Status of Cyber Crimes

Technology has become essential to the success of today’s organizations. It is crucial for a company’s operations that this technology be protected, supported, and kept up to date.

• Seventy-nine percent of SMBs feel they are susceptible to hackers and the expanding cybercrime sector because of security flaws.
• Human error accounts for 85% of all security incidents, and 61% of those incidents involve the use of credentials.
• As in previous years, financially driven attacks remain the most prevalent kind, with 80 percent of malicious actors belonging to the category of organised crime.
• Email is responsible for the delivery of 75% of all malware, while online downloads account for the remaining 25%.
• Hacking tools and bitcoin mining software are partly to blame for the 24 percent increase in browser-based threats.
• A staggering 77% of businesses lose access to systems and networks after a ransomware attack.
• An overwhelming majority of businesses (83%) say that safeguarding their ecosystems is just as important as securing their own operations.

1.8. Advantages of Cyber Security

• Career prospects for those trained in cyber security are currently excellent.
• After completing a cyber security degree, graduates are free to choose between part-time and full-time work on their own terms.
• Pay and benefits are competitive.
• Graduates of a cyber security programme have access to many opportunities in a variety of fields.
• Cyber security experts are crucial because they serve as early warning systems and providers of digital safety.
• To keep up with evolving cyber threats and the rapid pace at which new technologies emerge, cyber security experts take part in ongoing training and education.

1.9. Disadvantages of Cyber Security

• Setting up a firewall correctly is difficult.
• A firewall that is not properly configured may prevent users from accessing specific websites or online services.
• Security measures can make the system more sluggish.
• New software requires periodic updates to keep it protected.
• The price may be too high for the typical consumer.

1.10. Different Cyber Attacks

1.10.1. Ransomware

Ransomware [17] is a kind of malicious software that encrypts data in order to prevent the owner (or an authorized user) from accessing it. By encrypting information and demanding a ransom in exchange for the decryption key, cybercriminals force businesses into a situation where handing over the money is the least difficult and most cost-effective option.

1.10.2. Brute Force Attack

The overview of this attack is shown in Figure 5.

Figure 5. Brute force attacks.

1.10.3. Man-in-the-Middle Attack

Man-in-the-middle (MitM) attacks (as shown in Figure 6) [17] are a form of cyberattack in which an adversary covertly intercepts and relays communications between targets who think they are interacting directly with one another.

Figure 6. Man-in-middle attack.

1.10.4. Man in the Middle Attack Prevention

• To protect online communications, connect to the Internet over a virtual private network (VPN).
• A VPN significantly reduces an attacker's ability to monitor or alter data sent over the Internet.
• Have a cyber security incident response plan in place to lessen the likelihood of data loss.
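The measures above aim to stop an interceptor from silently reading or altering traffic. As a minimal illustration of how alteration can be detected, the sketch below attaches a message authentication code (HMAC) to each message. This is a complementary technique, not one the chapter prescribes, and the shared key, function names, and messages are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared secret, assumed to be exchanged over a trusted channel.
SHARED_KEY = secrets.token_bytes(32)

def sign(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Return an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Compare tags in constant time; False means the message was altered."""
    return hmac.compare_digest(sign(message, key), tag)

original = b"transfer 100 to account 42"
tag = sign(original)

# An interceptor without the key can change the message but cannot forge the tag.
tampered = b"transfer 900 to account 66"

print(verify(original, tag))   # True
print(verify(tampered, tag))   # False
```

The receiver discards any message whose tag does not verify. An HMAC provides integrity only, so confidentiality still depends on an encrypted channel such as the VPN mentioned above.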

1.10.5. SQL Injection

In a SQL injection attack (Figure 7), the attacker manipulates a SQL query to steal sensitive information, such as a username and password, and can then log in under false pretences. SQL injection vulnerabilities, attacks, and techniques arise in many contexts and take many forms. Common instances of SQL injection include:

• Retrieving hidden data: modifying a SQL query so that it returns additional results.
• Subverting application logic: altering a query to interfere with the application's logic.
• UNION attacks: retrieving data from different database tables.
• Examining the database: extracting information about its version and internal structure.
• Blind SQL injection: the results of the manipulated query are not returned in the responses of the targeted application.

Figure 7. SQL injection.

1.10.6. Solution for SQL Injection Attacks

• Restricting SQL syntax in submitted form input may protect against SQL injection attacks.
• Wildcard characters can be prohibited in input forms.
• An on-screen keyboard can be offered as an input option.
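To make the attack concrete, the following sketch contrasts a query built by string concatenation with a parameterized query. Parameterized queries are a widely used defence that complements the input restrictions listed above; they are not a measure named in this chapter. The table, column names, and credentials are hypothetical, and the password is stored in plain text only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(name: str, password: str) -> bool:
    # Unsafe: user input is pasted directly into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0

def login_parameterized(name: str, password: str) -> bool:
    # Safer: placeholders keep the input as data, never as SQL syntax.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

payload = "' OR '1'='1"
print(login_vulnerable("alice", payload))      # True: the logic was subverted
print(login_parameterized("alice", payload))   # False: the payload is just a string
```

In the vulnerable version the payload turns the WHERE clause into a condition that is always true, which is the "subverting application logic" case described in Section 1.10.5.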

1.11. Role of Encryption in Cyber Security

Cryptography [17] has become increasingly important in protecting sensitive data online. It ensures that only the sender and the receiver can read the contents of a message, which protects sensitive information. The term derives from the Greek word kryptos, meaning "hidden" or "secret." The classification of cryptography is shown in Figure 8.

Figure 8. Classification of cryptography.
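The idea that only the sender and the receiver can read a message can be illustrated with symmetric-key encryption, one branch commonly found in classifications of cryptography. The sketch below is a one-time pad built from XOR, chosen only because it fits in a few lines of standard-library Python. The message and function name are assumptions, and practical systems would normally use a vetted cipher from a maintained cryptographic library.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key byte at the same position."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must match the message length")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"meet at noon"
key = secrets.token_bytes(len(message))  # shared secret, to be used once only

ciphertext = xor_bytes(message, key)     # sender encrypts
recovered = xor_bytes(ciphertext, key)   # receiver decrypts with the same key

print(recovered == message)  # True
```

The same key both encrypts and decrypts, which is what makes the scheme symmetric. The key must be random, as long as the message, and never reused.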

1.12. Role of Firewall in Cyber Security

A firewall's importance in cyber security cannot be overstated. A modern application firewall offers the following:

• High scalability and the ability to block new types of attacks.
• Automatic protection of web apps and APIs using cutting-edge intelligence.
• Cloud-agnostic deployment with no holes in the defenses.
• A round-the-clock support system with continuous monitoring.

A firewall connecting a public and a private network is shown in Figure 9.

Figure 9. Firewall connecting public and private network.

1.13. Intrusion Detection System (IDS)

An intrusion detection system (IDS) is a network security technology originally developed to detect exploits of vulnerabilities in a target application or machine. IDS (Figure 10) are of the following types:

• Signature-based intrusion detection method.
• Anomaly-based intrusion detection method.
• Hybrid detection method.

Figure 10. IDS Architecture.
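The three detection methods listed above can be sketched in a few lines. The example below is a toy illustration rather than a description of any particular IDS product. The signature list, the traffic figures, and the three-standard-deviation threshold are all assumptions.

```python
from statistics import mean, pstdev

# Hypothetical signatures: byte patterns associated with known attacks.
SIGNATURES = [b"' OR '1'='1", b"UNION SELECT", b"../../"]

def signature_alert(payload: bytes) -> bool:
    """Signature-based: flag payloads that contain a known-bad pattern."""
    return any(sig in payload for sig in SIGNATURES)

def anomaly_alert(baseline: list[float], observed: float, k: float = 3.0) -> bool:
    """Anomaly-based: flag values more than k standard deviations from the baseline mean."""
    mu, sigma = mean(baseline), pstdev(baseline)
    return abs(observed - mu) > k * sigma

def hybrid_alert(payload: bytes, baseline: list[float], observed: float) -> bool:
    """Hybrid: raise an alert if either detector fires."""
    return signature_alert(payload) or anomaly_alert(baseline, observed)

# Requests per minute seen during normal operation (illustrative numbers).
normal_rates = [48, 52, 50, 47, 53, 49, 51]

print(signature_alert(b"id=1 UNION SELECT password FROM users"))  # True
print(anomaly_alert(normal_rates, 50))                            # False
print(anomaly_alert(normal_rates, 400))                           # True
print(hybrid_alert(b"GET /index.html", normal_rates, 400))        # True
```

Signature matching only catches patterns that are already known, while the anomaly check can flag unfamiliar behaviour at the cost of false alarms. The hybrid method combines the two.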

1.14. Role of Machine Learning in Cyber Security

Machine learning allows cyber security systems to recognize patterns, adapt to new situations, and thwart future attacks. This helps cyber security teams stay vigilant about potential dangers and react more quickly to actual attacks when they happen.

2. Literature Review

Rwan Mahmoud, et al. (2016) focused on IoT security: current status, challenges, and prospective measures. Their research examines the present status of IoT security and the challenges facing it, and provides expert analysis of those challenges. The overall goal of the IoT framework is to connect everything, everywhere, to everyone. The perception, network, and application layers make up the standard three-tier architecture for IoT. To make IoT secure, a set of rules must be followed at each layer. Security threats associated with the IoT framework must be addressed and managed if its future is to be assured. Many researchers have tried to work out how to protect the layers and devices that make up the Internet of Things from the specific security threats they face. The article provides a high-level review of IoT security principles, technological and security challenges, potential solutions, and future prospects [18].

C. K. M. Lee, et al. (2016) studied the development of an IoT-based cyber-physical system for industrial informatics analytics. Their research examined the state of big data analytics in industry and proposed a cyber-physical system that allows companies to mix and match modules from different types of analytics software depending on their specific requirements. The study creates a novel context intelligence framework for mining big data that takes into account location, sensor, and unstructured information in the context of industrial informatics. The authors use a case study to demonstrate the cyber-physical system they propose. They conclude that further research into system integration and the transition from traditional to smart factories is necessary to bring about the next paradigm change in industry [19].

Kumar Mandula, et al. (2016) observed that, with the advent of affordable and widely available smartphones and fast mobile networks such as 3G and LTE, the mobile sector has expanded greatly and put a wide range of useful services and apps within reach of the general public.
Connecting, controlling, and managing intelligent devices that have an IP address is one of the potential applications of the emerging field of IoT. IoT allows more efficient, hands-free delivery of services in a wide variety of contexts, including smart government, smart education, smart agriculture, smart healthcare, and smart housing. The authors discuss IoT and how it can be used to automate a house with an inexpensive Arduino board and free smartphone software. They describe an indoor Bluetooth version and an outdoor Ethernet version of a home automation system [20].

Ioannis Andrea, et al. (2016) noted that IoT has attracted wide interest ever since it was first conceived as a concept in the 1990s, because it allows many different electronic devices to be connected through many different technologies. However, IoT has advanced rapidly over the last decade without due consideration of the substantial security goals and difficulties involved. Their research examines the goals of IoT security and offers a new categorization of threats and defenses. It then goes on to examine future security trends and the difficulties that must be met to alleviate worries about the safety of such networks and encourage widespread use of the Internet of Things [21].

Victor R. Kebande, et al. (2016) proposed a generic digital forensic investigation framework for the Internet of Things (DFIF-IoT). Although the Internet of Things has been studied extensively, applying digital forensic (DF) theory to digital forensic investigations (DFIs) in IoT-based infrastructures has received surprisingly little attention. Because IoT networks are heterogeneous and distributed, existing DF tools and methodologies have not been fully adapted to the IoT. The complex nature of IoT environments makes it difficult for DF investigators and law enforcement agencies (LEAs) to collect, assess, and analyze evidence for use in legal proceedings. The study seeks to close this gap by creating a DF framework for conducting DFIs in an IoT environment, where none exists at present. On this foundation the authors build a generic DFIF-IoT that is intended to support future IoT investigation capabilities with a degree of certainty. One advantage of the proposed framework is that it complies with ISO/IEC 27043:2015, the international standard for information technology, security techniques, and incident investigation principles and processes. The authors argue that incorporating the proposed framework into future DF tool development might improve the effectiveness of digital forensic crime investigation in IoT networks [22].

King-Hang Wang, et al. (2018) examined the security of a new ultra-lightweight authentication protocol for RFID tags in an Internet of Things setting.
A lightweight mutual authentication protocol for RFID tags was recently presented by Tewari and Gupta for use in Internet of Things settings. It is designed to reduce the cost of storing and processing information while still maintaining privacy. Wang et al. exploit a flaw in the protocol: their attack can reveal the secret shared between a tag and the database server. They also investigate whether the protocol can be repaired with some adjustments [23].

Amin Azmoodeh, et al. (2018) addressed the detection of crypto-ransomware in Internet of Things networks by measuring power consumption. An IoT architecture may include a broad variety of Internet-enabled devices, and those with greater processing capabilities and storage capacity, such as Android devices, are more likely to be targeted by ransomware writers. The authors present a machine learning-based technique that monitors the power usage of Android devices in order to detect ransomware. To distinguish malicious code from legitimate programs, the proposed system tracks the amount of energy used by each process. They show that the method outperforms K-Nearest Neighbors, Neural Networks, Support Vector Machine, and Random Forest in terms of accuracy, recall, precision, and F-measure [24].

Amichai Painsky, et al. (2019) studied lossless compression of random forests. Ensemble methods are among the state-of-the-art predictive modelling techniques. When applied to today's big data, these techniques often need a very large number of sub-learners, each of which grows in complexity with the size of the dataset. As a consequence, demand for storage capacity rises, which can drive up storage costs significantly. The issue arises most often in a subscriber-based setting, where a user-specific ensemble must be kept on a device with limited space (such as a cellular device). The authors provide a new approach to lossless compression of tree-based ensemble methods, with a special emphasis on random forests. Their strategy first models the trees in the ensemble probabilistically and then clusters the models using Bregman divergence. This yields a small but sufficient collection of models that adequately describes the trees and can be stored and updated easily. On a wide range of contemporary datasets, the compression method achieves very high compression ratios. Predictions can be made from the compressed format, and the original ensemble can be perfectly reconstructed. The authors also provide a theoretically sound lossy compression strategy that gives control over the distortion-coding-rate tradeoff [25].

R. Naveen Kumar, et al. (2019) developed a lossless image compression technique based on wavelets and the fractional Fourier transform. Compression algorithms are crucial to meeting the need for rapid data transport in today's ever-evolving information technology landscape. A challenging aspect of data compression is preserving information integrity while reconstructing data at a high compression rate. The authors present a novel lossless image compression approach that combines wavelet and fractional transforms. Although wavelets are well suited to feature extraction across several frequency resolutions, most current compression techniques ignore the low-frequency sub-bands of the wavelet decomposition. The fractional Fourier transform, a generalization of the Fourier transform, can be used to code the source image efficiently while keeping the compression lossless. To compress these potentially vulnerable sub-bands of the wavelet transform, the authors use the discrete fractional Fourier transform [26].

A. Krishnamoorthy, et al. (2019) focused on an automated shopping experience using real-time IoT. Using an ultrasonic sensor and a Raspberry Pi, they create a fully automated shopping experience based on embedded sensors. A grocery shop, for example, would benefit from this kind of system because it lessens the burden on staff of running the whole store. Real-time checkout assistance for customers is the primary focus. In addition, analysing data in real time helps in learning about and responding to users' wants, requirements, and interests. The system recognises an item as it is taken off the shelf and adds it to the user's virtual cart, which automates the billing process and saves the customer time. This feature increases the system's appeal from both the seller's and the buyer's points of view. Once the customer has left the shop, the total cost of the purchases is deducted from the payment method registered in the app. A camera and a face recognition algorithm called Kairos keep track of who enters and leaves the shop. Google's Firebase serves as the database that allows these instantaneous updates to occur. The authors detail the structure of the system and the apparatus used in the experiments [27].

Sachin Kumar, et al. (2019) described IoT as a revolutionary approach for future technology enhancement. The Internet of Things (IoT) represents a shift in cultural norms that makes a new high-tech way of life possible. IoT has enabled several paradigm shifts, including the smart city, the smart home, pollution control, energy conservation, and smart transportation. Important research and studies have been conducted to improve technology using IoT.
However, several obstacles remain that must be overcome before the full promise of IoT can be realized. Addressing them requires considering all the dimensions of IoT: its uses, challenges, enabling technologies, and social and environmental effects. The major goal of the review is an in-depth discussion from both a technical and a sociological perspective. The many challenges associated with the Internet of Things are discussed, along with key architectures and application areas. The article also surveys the existing literature and shows how it has contributed to different aspects of the Internet of Things. In addition, the authors investigate the role that big data and analytics play in the context of the Internet of Things. The article is intended as a valuable resource for anyone unfamiliar with IoT and its applications [28].

Deepa Pavithran, et al. (2020) worked towards building a blockchain framework for IoT. Blockchain is a promising technology with applications outside the financial sector. Its role in IoT-based networks, however, is not well understood and requires more study, because of the restricted capabilities of IoT devices and the distributed ledger architecture of the blockchain protocol. If blockchain capabilities are tailored to the Internet of Things, the advantages could be significant, and many of the issues affecting the Internet of Things today might be resolved. However, a number of obstacles may remain when putting blockchain to use for IoT. The study provides a comprehensive overview of the most recent literature on blockchain in the Internet of Things. In particular, the authors highlight the five most important aspects of an IoT blockchain architecture, together with the design considerations and problems associated with each. They also identify the obstacles to developing a trustworthy blockchain infrastructure for the Internet of Things. Their simulations of two distinct blockchain deployment models showed that a device-to-device architecture provides much higher throughput than gateway-based models [29].

Fadi Al-Turjman, et al. (2020) presented 5G/IoT-enabled UAVs for multimedia delivery in industry-oriented applications. The Industrial Internet of Things (IIoT) is a rapidly expanding network of devices linked together through embedded sensors to gather and share data. A wide variety of items are linked through a complex communication system in order to monitor, gather, exchange, analyse, and distribute a valuable and challenging quantity of information. Because of their autonomy, mobility, and communication and processing capability, UAVs are expected to be involved in a variety of IIoT-related applications in which multimedia and video streaming play a vital role.
The authors focus mainly on the multimedia routing used by the IIoT and its supporting infrastructure outside regular business hours. They offer a Canonical Particle Swarm (CPS) optimization data delivery system tailored to the needs of industry, which can be used for recovering, building, and choosing k-disjoint paths that tolerate failures while still fulfilling quality of service requirements. A multi-swarm technique chooses the best direction for multipath routing while communicating with the UAV. Test results validating the proposed technique indicate that it outperforms both standard CPMS optimization and FMPS optimization [30].

Kobra Mabodi, et al. (2020) showed that the IoT may be protected against malicious actors by a multi-tiered trust-based intelligence architecture that relies on cryptographic authentication. They propose a hybrid approach consisting of four distinct phases: node trust verification inside the IoT, route testing, grey hole attack detection, and malicious attack elimination within MTISS-IoT. The effectiveness of the technique is evaluated using extensive simulations in the NS-3 environment. Evidence from four independent tests shows that the MTISS-IoT technique has a 94.5 percent detection rate during a grey hole attack, with a false positive rate of 14.1% and a false negative rate of 17.49% [31].

Maninder Jeet Kaur, et al. (2020) focused on the convergence of IoT, machine learning (ML), and the digital twin, in which data is transformed into useful knowledge and then put into practice. Technologies such as IoT, blockchain, and AI may require a re-evaluation of the goals and course of globalization. Because a digital twin makes a digital replica of the physical model for the purposes of remote monitoring, viewing, and control, it is widely expected to influence enterprises throughout the world. This "living model" employs ML and AI to forecast the future behavior of the related physical systems based on real-time data from a broad range of IoT sensors and devices. The authors investigate the challenges and opportunities involved in establishing a digital twin with Internet of Things capabilities, including its architecture and its applications. Big data and the cloud, data fusion, and digital twin security are among the research frontiers explored. With the aid of AI, new models and technical systems may be developed for intelligent production [32].

Amaal Al Shorman et al. (2020) presented a one-class support vector machine (OCSVM) with Grey Wolf optimization for unsupervised machine learning in IoT botnet detection.
They propose an unsupervised evolutionary approach to the problem of detecting IoT botnets. The method uses a recent swarm intelligence algorithm, the Grey Wolf Optimization (GWO) algorithm, to optimize the hyperparameters of the OCSVM and, simultaneously, to find the features that best describe the IoT botnet problem. In this way it improves the ability to detect IoT botnet attacks launched from compromised IoT devices. To show that the technique works, the authors applied it to a recently updated version of a real-world benchmark dataset and scored it using standard evaluation measures for anomaly detection. In terms of true positive rate, false positive rate, and G-mean, the technique surpasses all the other algorithms tested on a large set of IoT devices. It also achieves the shortest detection time while drastically reducing the number of features that must be used [33].

Abhishek Verma et al. (2020) focused on ML-based intrusion detection systems for IoT applications. They explore the potential of machine learning classification methods for defending IoT against DoS attacks. Classifiers that may aid the progress of anomaly-based intrusion detection systems (IDSs) are investigated thoroughly. Classifier performance is measured with standard metrics and validated in various ways, with popular datasets such as CIDDS-001, UNSW-NB15, and NSL-KDD employed as benchmarks. The statistical significance of differences between classifiers is examined using the Friedman and Nemenyi tests. Classifier response times on IoT-specific hardware are also evaluated using a Raspberry Pi. The authors also describe a process for determining which classifier is best suited to a given application. The primary objectives of the research were to encourage IoT security researchers to develop IDSs using ensemble learning and to provide suitable techniques for the statistical evaluation of classifier performance [34].

Ruchi Vishwakarma, et al. (2020) provided a survey of DDoS attack techniques and defence mechanisms in IoT networks. The Internet of Things has emerged as a game-changing development for realizing the promise of wireless media in daily life. From almost anywhere in the world, people can control their immediate environment through a variety of smart apps hosted on their own servers. Because of its widespread use, IoT presents a prime opportunity for expanding criminal networks. IoT vulnerabilities originate from factors such as restricted resources and weak security, and malicious actors exploit them to gain access to legitimate devices.
A DDoS attack on an IoT network attempts to take down the network's servers by overwhelming them with fake requests sent over the network's communication channels. With the disruption of several well-known servers reported over the last few years, research into methods of protecting against DDoS in the IoT has become an urgent field of study. The authors explore how distributed DoS in the IoT can result from malicious software and botnets. Different methods of protecting against distributed denial of service (DDoS) attacks are compared in order to highlight the weaknesses of each. The survey also details the open questions and difficult problems that must be solved before a truly effective defense against DDoS attacks is available [35].

Sharu Bansal, et al. (2020) presented a survey of the IoT ecosystem covering devices, gateways, operating systems, middleware, and communication. Machine-to-machine (M2M) technologies are the foundation of IoT in today's interconnected world. As IoT grows, it brings together large-scale technologies such as big data, AI, and ML to handle massive amounts of data and devices. The article begins with a brief introduction to the classification schemes used in the IoT ecosystem. It then explains the technical characteristics of IoT devices, gateways, operating systems, middleware, platforms, data storage, security, communication protocols, and interfaces. The survey also discusses the primary challenges that must be addressed in order to expand IoT. Big data, cloud computing, and fog computing, along with their roles in the Internet of Things, are briefly examined. Finally, it shows the expanding range of uses made possible by the Internet of Things [36].

Aljaz Jeromel et al. (2020) proposed an efficient lossy method for compressing cartoon images. The image is first split into regions of approximately the same colour. The chain codes of the region boundaries are then determined. Compression is achieved by applying the Burrows-Wheeler Transform and run-length encoding to the sequence of chain code symbols. If further compression of the binary stream is desired, an arithmetic encoder can be used. Because not all compression phases have a counterpart during decompression, the approach is asymmetric. Compared with JPEG, JPEG2000, WebP, SPIHT, PNG, and two algorithms designed specifically for the compression of cartoon drawings, the proposed method achieved much higher compression ratios [37].
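Two of the building blocks named in the last summary, the Burrows-Wheeler Transform and run-length encoding, can be sketched briefly. The code below is a naive illustration and not the implementation from [37]. The chain code string is hypothetical, and applying the transform before the run-length step is an assumption about the ordering.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler Transform: last column of the sorted rotations."""
    s = text + sentinel  # the sentinel must not occur in text; it sorts before digits
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def rle_encode(text: str) -> list[tuple[str, int]]:
    """Run-length encoding: collapse each run into a (symbol, length) pair."""
    runs: list[tuple[str, int]] = []
    for symbol in text:
        if runs and runs[-1][0] == symbol:
            runs[-1] = (symbol, runs[-1][1] + 1)
        else:
            runs.append((symbol, 1))
    return runs

def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (symbol, length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)

# Hypothetical 8-direction chain code describing a region boundary.
chain = "0000222244446666"
transformed = bwt(chain)
runs = rle_encode(transformed)

print(transformed)
print(runs)
print(rle_decode(runs) == transformed)  # True
```

The transform tends to group identical symbols together, which gives the run-length step longer runs to collapse. Both steps shown here are lossless. In the method summarized above, the loss arises earlier, when the image is split into regions of approximately uniform colour.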

3. Problem Statement

Several studies in the area of IoT security rely on data encryption. However, these studies provide limited security because they cannot handle all types of attack. A mechanism is needed that classifies the different attacks in order to enhance security. Moreover, data should be compressed before a deep learning model is applied so that system performance remains high. Filtering the dataset during training also reduces time consumption and increases accuracy. Given these issues in existing research on IoT security, there is a need to integrate compression and deep learning to improve the security and performance of IoT systems.

4. Proposed Work

The IoT environment faces several security threats, and there is a need to improve performance while reducing space consumption. The proposed work (as shown in Figure 11) therefore focuses on simulating the integration of compression and a deep learning approach in an IoT environment for a security system. Such a system is expected to provide a more reliable and flexible approach by reducing space consumption and increasing performance. Moreover, the proposed work would improve accuracy in detecting security threats.

1. Start.
2. Get the dataset of transmission history.
3. Filter the compressed dataset before training.
4. Apply the LSTM model to train the network.
5. Perform training with 70% of the dataset and testing with the remaining 30%.
6. Check whether the transaction is normal. If it is, perform the transmission operation and store the status in the learning system; otherwise, classify the type of attack.
7. Stop.

Figure 11. Process flow of proposed work.
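The data preparation steps of Figure 11 can be sketched with the Python standard library. The records, field names, and filtering rule below are assumptions made for illustration, zlib stands in for whichever compression scheme is actually used, and the LSTM training step is deliberately left out because it depends on a deep learning framework.

```python
import json
import random
import zlib

# Hypothetical transmission-history records; field names are illustrative.
records = [
    {"src": f"10.0.0.{i % 20}", "bytes": 100 + i, "label": label}
    for i, label in enumerate(
        ["normal", "sql_injection", "brute_force", "mitm"] * 25
    )
]
records.append({"src": None, "bytes": -1, "label": "normal"})  # malformed row

# Compress the dataset for storage or transfer (lossless DEFLATE via zlib).
raw = json.dumps(records).encode("utf-8")
compressed = zlib.compress(raw, level=9)

# Restore the data, then filter out malformed rows before training.
restored = json.loads(zlib.decompress(compressed))
filtered = [r for r in restored if r["src"] is not None and r["bytes"] >= 0]

# Shuffle with a fixed seed and split 70% for training, 30% for testing.
rng = random.Random(0)
rng.shuffle(filtered)
cut = len(filtered) * 7 // 10
train, test = filtered[:cut], filtered[cut:]

print(len(raw), len(compressed))   # the compressed form is smaller here
print(len(train), len(test))       # 70 30
```

The `train` and `test` lists would then be encoded as sequences and passed to the LSTM classifier described in the flow above.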

The proposed study identifies and categorizes several cyber attacks using a machine learning model applied to a dataset that includes a variety of transaction types. The simulation took into account both filtered and unfiltered datasets. The accuracy of detection and classification is expected to be high once the dataset has been filtered using an optimizer. The proposed hybrid model provides a means of mitigating attacks such as brute force, man in the middle, and SQL injection. An LSTM-based machine learning model is used for classification. Logins are restricted to legitimate IP addresses. The user's login information is encrypted in the database to prevent the database administrator (DBA) from misusing it. Because of the security risks associated with the login process, the proposed architecture denies access to any unauthenticated user. The proposed system includes a virtual keyboard and limits keystroke input. Wildcard characters are disabled in the login procedure.
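The login restrictions described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation. The allowed network, the set of wildcard characters, and the function names are hypothetical. Where the text speaks of encrypting login information, the sketch swaps in a salted one-way hash (PBKDF2), which likewise keeps readable passwords away from the database administrator.

```python
import hashlib
import hmac
import ipaddress
import secrets

# Hypothetical allowlist of networks permitted to log in.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.168.1.0/24")]
WILDCARDS = set("%_*?")  # characters treated as wildcards in this sketch

def ip_allowed(address: str) -> bool:
    """Accept only addresses that fall inside an allowed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)

def has_wildcard(text: str) -> bool:
    return any(ch in WILDCARDS for ch in text)

def hash_password(password: str, salt: bytes) -> bytes:
    # Salted PBKDF2 hash: the database never stores the readable password.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def login(address: str, username: str, password: str, store: dict) -> bool:
    if not ip_allowed(address) or has_wildcard(username) or has_wildcard(password):
        return False
    entry = store.get(username)
    if entry is None:
        return False
    salt, expected = entry
    return hmac.compare_digest(hash_password(password, salt), expected)

salt = secrets.token_bytes(16)
store = {"alice": (salt, hash_password("correct horse", salt))}

print(login("192.168.1.10", "alice", "correct horse", store))  # True
print(login("203.0.113.5", "alice", "correct horse", store))   # False: IP not allowed
print(login("192.168.1.10", "ali%", "correct horse", store))   # False: wildcard rejected
```

Rejecting a request before any database lookup keeps untrusted addresses and wildcard input away from the query layer altogether.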

4.1. Features of Proposed Model

The proposed model is flexible because it can withstand several threats. It outperforms the conventional model because it employs a machine learning technique to categorize and limit attacks. Encryption keeps information safe from prying eyes such as governments and corporations [38, 39]. The proposed model limits the use of invalid input that might be used to compromise security. By switching to a user-defined port rather than a shared one, the proposed model lowers the risk of an attack [40-53].

5. Results and Discussion

The proposed research uses a machine learning model applied to a dataset of different transactions to detect and classify several cyber attacks. Two scenarios have been simulated: one in which the dataset is filtered and another in which it is not. Using an optimizer to filter the dataset is expected to improve detection and classification accuracy.

5.1. Confusion Matrix of Unfiltered Dataset

Table 1 shows the confusion matrix of the unfiltered dataset. Table 2 lists the accuracy, precision, recall, and F1-score derived from Table 1.


Table 1. Confusion matrix of unfiltered dataset

| | SQL injection | Brute force attack | Man in middle attack | Normal |
|---|---|---|---|---|
| SQL injection | 2233 | 197 | 381 | 189 |
| Brute force attack | 168 | 2242 | 191 | 399 |
| Man in middle attack | 196 | 353 | 2303 | 148 |
| Normal | 369 | 194 | 197 | 2240 |

Results: TP = 9018; overall accuracy = 75.15%.

Table 2. Accuracy of confusion matrix of unfiltered dataset

| Class | n (truth) | n (classified) | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|
| 1 | 2966 | 3000 | 87.5% | 0.74 | 0.75 | 0.75 |
| 2 | 2986 | 3000 | 87.48% | 0.75 | 0.75 | 0.75 |
| 3 | 3072 | 3000 | 87.78% | 0.77 | 0.75 | 0.76 |
| 4 | 2976 | 3000 | 87.53% | 0.75 | 0.75 | 0.75 |
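The step from Table 1 to Table 2 can be reproduced with a short sketch. It assumes that each row of the confusion matrix holds the samples classified as that class and each column the samples that truly belong to it; this reading is inferred from the fact that the rows of Table 1 sum to n (classified) = 3000 and the columns to the n (truth) values of Table 2. The function name `per_class_metrics` is illustrative and not taken from the source.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Per-class metrics; cm[i][j] counts samples classified as i whose true class is j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        classified = sum(cm[k])                   # row sum: n (classified)
        truth = sum(cm[i][k] for i in range(n))   # column sum: n (truth)
        fp = classified - tp
        fn = truth - tp
        tn = total - tp - fp - fn
        precision = tp / classified
        recall = tp / truth
        results.append({
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "recall": recall,
            "f1": 2 * precision * recall / (precision + recall),
        })
    return results


# Table 1 (unfiltered dataset); class order: SQL injection, brute force,
# man in middle, normal.
table1 = [
    [2233, 197, 381, 189],
    [168, 2242, 191, 399],
    [196, 353, 2303, 148],
    [369, 194, 197, 2240],
]

overall = sum(table1[k][k] for k in range(4)) / sum(map(sum, table1))
print(f"overall accuracy: {overall:.2%}")
for k, m in enumerate(per_class_metrics(table1), start=1):
    print(k, f"{m['accuracy']:.2%}",
          *(f"{m[name]:.2f}" for name in ("precision", "recall", "f1")))
```

Note that the per-class accuracy counts true negatives as correct, which is why it is higher than the overall accuracy of 75.15%. The sketch does not guard against a class with no samples, which would cause a division by zero.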

5.2. Confusion Matrix of Filtered Dataset

Table 3 shows the confusion matrix of the filtered dataset. Table 4 lists the per-class accuracy, precision, recall, and F1-score derived from Table 3.

Table 3. Confusion matrix of filtered dataset

| | SQL injection | Brute force attack | Man in middle attack | Normal |
|---|---|---|---|---|
| SQL injection | 2491 | 112 | 299 | 98 |
| Brute force attack | 97 | 2586 | 102 | 215 |
| Man in middle attack | 109 | 241 | 2557 | 93 |
| Normal | 198 | 59 | 97 | 2646 |

Results: TP = 10280; overall accuracy = 85.67%.

Table 4. Accuracy of confusion matrix of filtered dataset

| Class | n (truth) | n (classified) | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|
| 1 | 2895 | 3000 | 92.39% | 0.83 | 0.86 | 0.85 |
| 2 | 2998 | 3000 | 93.12% | 0.86 | 0.86 | 0.86 |
| 3 | 3055 | 3000 | 92.16% | 0.85 | 0.84 | 0.84 |
| 4 | 3052 | 3000 | 93.67% | 0.88 | 0.87 | 0.87 |


5.3. Comparison Analysis

Table 5 compares the accuracy obtained with the unfiltered dataset (previous work) and with the filtered dataset (proposed work) for each of the four classes (1, 2, 3, and 4). The filtered dataset yields higher accuracy than the unfiltered one for every class.

5.3.1. Accuracy

Figure 12, drawn from the data in Table 5, shows the accuracy of the filtered dataset in comparison with the unfiltered one.

Table 5. Comparison analysis of accuracy

| Class | Unfiltered dataset | Filtered dataset |
|---|---|---|
| 1 | 87.5% | 92.39% |
| 2 | 87.48% | 93.12% |
| 3 | 87.78% | 92.16% |
| 4 | 87.53% | 93.67% |

Figure 12. Comparison analysis of accuracy.

The precision of the previous work and the proposed work for class 1, class 2, class 3, and class 4 is shown in Table 6. The precision of the filtered dataset is higher than that of the unfiltered dataset for every class.

5.3.2. Precision

Figure 13, drawn from Table 6, visualizes the precision of the filtered dataset relative to the unfiltered dataset.

Table 6. Comparison analysis of precision

| Class | Unfiltered dataset | Filtered dataset |
|---|---|---|
| 1 | 0.74 | 0.83 |
| 2 | 0.75 | 0.86 |
| 3 | 0.77 | 0.85 |
| 4 | 0.75 | 0.88 |

Figure 13. Comparison analysis of precision.

The recall of the previous work and the proposed work for class 1, class 2, class 3, and class 4 is shown in Table 7. The recall of the filtered dataset is higher than that of the unfiltered dataset for every class.

5.3.3. Recall Value

Figure 14, drawn from Table 7, visualizes the recall value of the filtered dataset relative to the unfiltered dataset.

Table 7. Comparison analysis of recall value

| Class | Unfiltered dataset | Filtered dataset |
|---|---|---|
| 1 | 0.75 | 0.86 |
| 2 | 0.75 | 0.86 |
| 3 | 0.75 | 0.84 |
| 4 | 0.75 | 0.87 |

The F1-score of the previous work and the proposed work for class 1, class 2, class 3, and class 4 is shown in Table 8. The F1-score of the filtered dataset is higher than that of the unfiltered dataset for every class.


Figure 14. Comparison analysis of recall value.

5.3.4. F1-Score

Table 8 shows the comparison analysis of F1-score. Figure 15, drawn from Table 8, visualizes the F1-score of the filtered dataset relative to the unfiltered dataset.

Table 8. Comparison analysis of F1-Score

| Class | Unfiltered dataset | Filtered dataset |
|---|---|---|
| 1 | 0.75 | 0.85 |
| 2 | 0.75 | 0.86 |
| 3 | 0.76 | 0.84 |
| 4 | 0.75 | 0.87 |

Figure 15. Comparison analysis of F1-Score.


Conclusion and Future Outlook

The compression technique used in the security system has reduced the probability of attack [40], and the deep learning mechanism provides a better way to classify attacks in order to reduce the probability of brute force, man in the middle, and SQL injection attacks. The present work concludes that the accuracy parameters are better for the filtered dataset than for the unfiltered one. The proposed work thus achieves high performance by integrating data compression, and high accuracy by integrating data filtering, with the deep learning system. The need for security in IoT systems grows day by day because of the many security challenges they face. Several healthcare, commercial, and educational applications make use of IoT systems [41], and the proposed security mechanism would be applicable to them. Moreover, the deep learning model that classifies attacks is capable of restricting external attacks. The compression mechanism reduces space consumption and improves performance during classification with the deep learning model.

References

[1] Kingma, F., Abbeel, P., & Ho, J. (2019, May). Bit-swap: Recursive bits-back coding for lossless compression with hierarchical latent variables. In International Conference on Machine Learning (pp. 3408-3417). PMLR.

[2] Sindhwani, N., Anand, R., Shukla, R., Yadav, M., & Yadav, V. (2021). Performance Analysis of Deep Neural Networks Using Computer Vision. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 8(29), e3.

[3] Jain, S., Kumar, M., Sindhwani, N., & Singh, P. (2021, September). SARS-Cov-2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.

[4] Singh, S. K., Thakur, R. K., Kumar, S., & Anand, R. (2022, March). Deep Learning and Machine Learning based Facial Emotion Detection using CNN. In 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom) (pp. 530-535). IEEE.

[5] Tiwari, I., Juneja, S., Juneja, A., & Anand, R. (2020). A statistical-oriented comparative analysis of various machine learning classifier algorithms. Journal of Natural Remedies, 21(3(S1)), 139-144.

[6] Sindhwani, N., Rana, A., & Chaudhary, A. (2021, September). Breast Cancer Detection using Machine Learning Algorithms. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.

[7] Gupta, A., Asad, A., Meena, L., & Anand, R. (2023). IoT and RFID-Based Smart Card System Integrated with Health Care, Electricity, QR and Banking Sectors. In Artificial Intelligence on Medical Data (pp. 253-265). Springer, Singapore.

[8] Srivastava, A., Gupta, A., & Anand, R. (2021). Optimized smart system for transportation using RFID technology. Mathematics in Engineering, Science & Aerospace (MESA), 12(4).

[9] Gupta, B., Chaudhary, A., Sindhwani, N., & Rana, A. (2021, September). Smart Shoe for Detection of Electrocution Using Internet of Things (IoT). In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-3). IEEE.

[10] Pandey, B. K., Pandey, D., Wairya, S., & Agarwal, G. (2021). An advanced morphological component analysis, steganography, and deep learning-based system to transmit secure textual data. International Journal of Distributed Artificial Intelligence (IJDAI), 13(2), 40-62.

[11] Pramanik, S., Ghosh, R., Pandey, D., Samanta, D., Dutta, S., & Dutta, S. (2021). Techniques of Steganography and Cryptography in Digital Transformation. In Emerging Challenges, Solutions, and Best Practices for Digital Enterprise Transformation (pp. 24-44). IGI Global.

[12] Gupta, M., & Anand, R. (2011). Image Compression using set of selected bit planes on basis of intensity variations. Dronacharya Research Journal, 3(1), 35-40.

[13] Anand, R., Shrivastava, G., Gupta, S., Peng, S. L., & Sindhwani, N. (2018). Audio watermarking with reduced number of random samples. In Handbook of Research on Network Forensics and Analysis Techniques (pp. 372-394). IGI Global.

[14] Malik, S., Singh, N., & Anand, R. Compression Artifact Removal Using SAWS Technique Based On Fuzzy Logic. International Journal of Electronics and Electrical Engineering, 2(9), 11-20.

[15] Pandey, D., & Pandey, B. K. (2022). An Efficient Deep Neural Network with Adaptive Galactic Swarm Optimization for Complex Image Text Extraction. In Process Mining Techniques for Pattern Recognition (pp. 121-137). CRC Press.

[16] Mahmoud, R., Yousuf, T., Aloul, F., & Zualkernan, I. (2016). Internet of things (IoT) security: Current status, challenges and prospective measures. 2015 10th International Conference for Internet Technology and Secured Transactions, ICITST 2015, 336–341. https://doi.org/10.1109/ICITST.2015.7412116.

[17] Lee, C. K. M., Yeung, C. L., & Cheng, M. N. (2016). Research on IoT based Cyber Physical System for Industrial big data Analytics. IEEE International Conference on Industrial Engineering and Engineering Management, 2016-Janua, 1855–1859. https://doi.org/10.1109/IEEM.2015.7385969.

[18] Mandula, K., Parupalli, R., Murty, C. H. A. S., Magesh, E., & Lunagariya, R. (2016). Mobile based home automation using Internet of Things (IoT). 2015 International Conference on Control Instrumentation Communication and Computational Technologies, ICCICCT 2015, 340–343. https://doi.org/10.1109/ICCICCT.2015.7475301.

[19] Andrea, I., Chrysostomou, C., & Hadjichristofi, G. (2016). Internet of Things: Security vulnerabilities and challenges. Proceedings - IEEE Symposium on Computers and Communications, 2016-Febru, 180–187. https://doi.org/10.1109/ISCC.2015.7405513.

[20] Kebande, V. R., & Ray, I. (2016). A generic digital forensic investigation framework for Internet of Things (IoT). Proceedings - 2016 IEEE 4th International Conference on Future Internet of Things and Cloud, FiCloud 2016, 356–362. https://doi.org/10.1109/FiCloud.2016.57.

[21] Wang, K. H., Chen, C. M., Fang, W., & Wu, T. Y. (2018). On the security of a new ultra-lightweight authentication protocol in IoT environment for RFID tags. Journal of Supercomputing, 74(1), 65–70. https://doi.org/10.1007/s11227-017-2105-8.

[22] Azmoodeh, A., Dehghantanha, A., Conti, M., & Choo, K. K. R. (2018). Detecting crypto-ransomware in IoT networks based on energy consumption footprint. Journal of Ambient Intelligence and Humanized Computing, 9(4), 1141–1152. https://doi.org/10.1007/s12652-017-0558-5.

[23] Pandey, B. K., Pandey, D., Nassa, V. K., George, S., Aremu, B., Dadeech, P., & Gupta, A. (2023). Effective and secure transmission of health information using advanced morphological component analysis and image hiding. In Artificial Intelligence on Medical Data (pp. 223-230). Springer, Singapore.

[24] Naveen Kumar, R., Jagadale, B. N., & Bhat, J. S. (2019). A lossless image compression algorithm using wavelets and fractional Fourier transform. SN Applied Sciences, 1(3). https://doi.org/10.1007/s42452-019-0276-z.

[25] Krishnamoorthy, A., Vijayarajan, V., & Sapthagiri, R. (2019). Automated shopping experience using real-time IoT. In Advances in Intelligent Systems and Computing (Vol. 862). Springer Singapore. https://doi.org/10.1007/978-981-13-3329-3_20.

[26] Kumar, S., Tiwari, P., & Zymbler, M. (2019). Internet of Things is a revolutionary approach for future technology enhancement: a review. Journal of Big Data, 6(1). https://doi.org/10.1186/s40537-019-0268-2.

[27] Pavithran, D., Shaalan, K., Al-Karaki, J. N., & Gawanmeh, A. (2020). Towards building a blockchain framework for IoT. Cluster Computing, 23(3), 2089–2103. https://doi.org/10.1007/s10586-020-03059-5.

[28] Al-Turjman, F., & Alturjman, S. (2020). 5G/IoT-enabled UAVs for multimedia delivery in industry-oriented applications. Multimedia Tools and Applications, 79(13–14), 8627–8648. https://doi.org/10.1007/s11042-018-6288-7.

[29] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.

[30] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.

[31] Sharma, M., Sharma, B., Gupta, A. K., & Pandey, D. (2022). Recent developments of image processing to improve explosive detection methodologies and spectroscopic imaging techniques for explosive and drug detection. Multimedia Tools and Applications, 1-17.

[32] Verma, A., & Ranga, V. (2020). Machine Learning Based Intrusion Detection Systems for IoT Applications. Wireless Personal Communications, 111(4), 2287–2310. https://doi.org/10.1007/s11277-019-06986-8.

[33] Vishwakarma, R., & Jain, A. K. (2020). A survey of DDoS attacking techniques and defence mechanisms in the IoT network. Telecommunication Systems, 73(1), 3–25. https://doi.org/10.1007/s11235-019-00599-z.

[34] Bansal, S., & Kumar, D. (2020). IoT Ecosystem: A Survey on Devices, Gateways, Operating Systems, Middleware and Communication. International Journal of Wireless Information Networks, 27(3), 340–364. https://doi.org/10.1007/s10776-020-00483-7.

[35] Pandey, D., & Wairya, S. (2023). Perfomance Analysis of Text Extraction from Complex Degraded Image Using Fusion of DNN, Steganography, and AGSO. In Mishra, B., & Tiwari, M. (Eds.), VLSI, Microwave and Wireless Technologies. Lecture Notes in Electrical Engineering, vol 877. Springer, Singapore. https://doi.org/10.1007/978-981-19-0312-0_20.

[36] Pandey, B. K., Pandey, D., & Agarwal, A. (2022). Encrypted Information Transmission by Enhanced Steganography and Image Transformation. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-14.

[37] Pandey, B. K., Pandey, D., Wairya, S., Agarwal, G., Dadeech, P., Dogiwal, S. R., & Pramanik, S. (2022). Application of Integrated Steganography and Image Compressing Techniques for Confidential Information Transmission. Cyber Security and Network Security, 169-191.

[38] Kaura, C., Sindhwani, N., & Chaudhary, A. (2022, March). Analysing the Impact of Cyber-Threat to ICS and SCADA Systems. In 2022 International Mobile and Embedded Technology Conference (MECON) (pp. 466-470). IEEE.

[39] Pandey, D., Aswari, A., Taufiqurakman, M., Khalim, A., & Azahrah, F. F. (2021). System Of Education Changes Due to Covid-19 Pandemic. Asian Journal of Advances in Research, 10-15.

[40] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless Personal Communications, 79(3), 1713-1731.

[41] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.

[42] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.

[43] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.

[44] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).

[45] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).

[46] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.

[47] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.

[48] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).

[49] Degerine, S., & Zaidi, A. "Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach," in IEEE Transactions on Signal Processing, vol. 52, no. 6, pp. 1499-1512, June 2004, doi: 10.1109/TSP.2004.827195.

[50] Zaidi, A. "Positive definite combination of symmetric matrices," in IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4412-4416, Nov. 2005, doi: 10.1109/TSP.2005.855077.

[51] Degerine, S., & Zaidi, A. "Sources colorees," a chapter of a collective work entitled "Séparation de sources 1 concepts de base et analyse en composantes indépendantes," Traité IC2, série Signal et image ["Separation of sources 1 basic concepts and analysis into independent components", Treatise IC2, Signal and image series], Hermes Science, 2007, ISBN: 2746215179.

[52] Zaïdi, A. "Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices," in IEEE Signal Processing Letters, vol. 26, no. 6, pp. 863-867, 2019, doi: 10.1109/LSP.2019.2909651.

[53] Degerine, S., & Zaidi, A. "Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints," in SIAM Journal on Optimization, 2007, vol. 17, no. 4, pp. 997-1014, doi: 10.1137/050622821.


Chapter 14

A Review of Various Text Extraction Algorithms for Images

Binay Kumar Pandey1, Digvijay Pandey2, Vinay Kumar Nassa3, A. Shahul Hameed4, A. Shaji George5, Pankaj Dadheech6 and Sabyasachi Pramanik7

1 Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, India
2 Department of Technical Education, Institute of Engineering and Technology (IET), Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, India
3 Rajarambapu Institute of Technology, Rajaramnagar (Islampur-Maharashtra), India
4 Department of Telecommunication, Consolidated Techniques Co. Ltd. (CTC), Riyadh, Kingdom of Saudi Arabia
5 Department of Information and Communication Technology, Crown University, Int'l. Chartered Inc. (CUICI), Santa Cruz, Argentina
6 Department of Computer Science and Engineering (NBA Accredited), Swami Keshvanand Institute of Technology, Management and Gramothan (SKIT), Jaipur, Rajasthan, India
7 Department of Computer Science and Engineering, Haldia Institute of Technology, West Bengal, India

In: The Impact of Thrust Technologies on Image Processing
Editor: Digvijay Pandey
ISBN: 979-8-88697-832-2
© 2023 Nova Science Publishers, Inc.

Abstract

In recent years there has been a growing demand to preserve historical documents, books, screenshots, bill receipts, magazines, etc. by converting them into digital format. Moreover, the fast development of information technology and the rapid spread of the Internet have led to a huge amount of image and video data. The text present in images and videos helps in analysing them and is also used in indexing, archiving, and retrieval. The process of converting the text in an image into plain text is called text extraction. Searching for text in images, editing, and document archiving all depend mainly on text extraction. However, extracting text from complex degraded images is a laborious task. The orientation of the text, the font size, the variety of the background, low image quality, the different colours of the letters, and the interference of noise are the key obstacles to text detection and extraction from complex degraded images. Different types of noise, such as Gaussian noise, salt-and-pepper noise, shot noise, speckle noise, quantization noise, and periodic noise, can readily damage an image. Several image filtering techniques, including the Gaussian filter, mean filter, median filter, maximum and minimum filters, adaptive filter, and Wiener filter, are used to remove these types of noise from images. After noise has been removed from a complex degraded image, the text is extracted in binary form. Despite the many efforts spent on this topic, work remains to be done to improve text extraction techniques. Even a very good optical character recognition system is of no use if text extraction is not performed well. We discuss an innovative text extraction method for complex degraded images in this paper.

Keywords: optical character recognition, machine learning, deep learning, noise and image processing

1. Introduction

The advancement of the Internet has led to a huge increase in image and video databases [1-3]. Most images and videos contain a great deal of text information [4-7]. This text information can be used as an important parameter, along with visual features such as orientation, colour, and size, to improve the retrieval of particular images and videos. Moreover, in recent times there has been an increase in demand to preserve historical documents and books by converting them into digital format [8-11]. However, extracting text from natural images, scanned documents, or videos is tedious work. Noise often causes image distortion during image acquisition, processing, transmission, and reproduction [12-15]. Restoring the original image once the noise has been eliminated is one of the fundamental objectives of image processing. Text detection locates the regions of a challenging image that carry significant textual data, whereas text extraction separates that text from the image [16-18]. Text extraction benefits character identification. Text recognition is known as optical character recognition (OCR) [19]. Besides orientation, line direction (vertical or horizontal), and text size, noise also affects the accuracy and effectiveness of these systems. Text extraction plays a major role in finding important and valuable information in a document image. Many search engines are text based only, and a major problem is that different users choose different keywords for the same image or video. Generating the keywords from the document image itself can therefore be a good approach. These keywords are used as an important feature to improve the retrieval of the relevant images and videos. Text extraction from document images has many applications, such as sign translation, video skimming, robotics, navigation, and image-to-audio aids for visually impaired people. There is therefore a huge demand for the extraction of text from images.

2. Literature Review

The literature review of a research work must exhibit a thorough understanding of the subject field and solid arguments to support the investigation. A review of the literature helps in locating and summarising the background research on a given issue. To obtain reliable data, the review should draw on reputable sources such as IEEE, Scopus, and ACM publications and other standard journals or books.


2.1. Comprehensive Study of the Existing Work

A comprehensive study of the existing research work is presented below.

Kumar and Singh [20] worked on handwritten text in Gurmukhi script. Because a document image consists of a large set of pixel values, they located the maximum and minimum coordinates that correspond to the white gaps between text lines and used them to segment the lines. Words are then formed from each text line.

Saha et al. [21] removed background information using image binarization, applied edge detection during the pre-processing stage, and then segmented words using the Hough transform. Finally, a bounding box was drawn around each word.

Tripathi et al. [22] proposed an algorithm that uses the coordinates of white and black pixels in both the horizontal and vertical directions to detect straight lines and skew. The white pixels in each text line are then counted, and text line segmentation is performed by comparing the line's features with standard symbols.

Alaei et al. [23] suggested an approach in which complicated images are first segmented using a piecewise linear segmentation algorithm and then binarized. Binarization may cause some text loss in complicated images, so a dilation operation is performed to eliminate text line discontinuities. Text line separators are also employed. Finally, the contour points between touching text components are used to separate overlapping text lines.

Pan et al. [24] located characters in an image using partition schemes based on colour and on gradient; the colour-based partition improves overall efficiency. The characters are then processed further to eliminate non-text regions and are grouped. Finally, text features such as height, width, and area are calculated to group the characters into text.
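The white-gap idea behind the line segmentation in [20] can be illustrated with a horizontal projection profile. The sketch below is a generic illustration rather than the authors' algorithm: it assumes the image is already binarized (1 for ink, 0 for background) and free of skew, and the function name `segment_text_lines` is hypothetical.

```python
from typing import List, Tuple


def segment_text_lines(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each text line in a binary image.

    `binary` holds 1 for ink (text) pixels and 0 for background.
    A row without ink is treated as a white gap between text lines.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in binary]

    lines = []
    start = None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # a white gap closes the line
            start = None
    if start is not None:                  # text touches the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


# Toy 7x6 image with two text lines separated by a blank row.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(segment_text_lines(image))
```

Applying the same profile to the columns of a single text line gives the gaps between words. Noisy scans would need a small ink threshold instead of the strict zero test used here.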
Sarkar et al. [25] proposed an algorithm for document images in Bangla. They first found the number of zones. To avoid confusion, they then classified the connected components into two categories, Do Not Segment (DNS) and Segment Further (SF), using a Multi-Layer Perceptron (MLP) classifier. Finally, a fuzzy membership function was used to create character segmentation points. This method can be used for other scripts as well.

Roy et al. [26] first corrected the skew of words and then created groups of individual and overlapping characters using an SVM classifier. They used convex hull shape analysis to find the cavity region of overlapping characters and calculated the first segmentation point. After that, they found the other segmentation points by joining the lines. Finally, the confidence levels of the segmented characters, including the wrongly segmented ones, are checked, and out-of-order segmented characters are corrected using a dynamic programming algorithm.

Biswas and Das [27] used two methods of text segmentation from land maps: a morphological approach and gray level analysis. In the first method, pre-processing is performed on the document image, after which the morphological approach is used to extract the text. In the second method, text extraction is performed using intensity or gray level analysis. For text extraction, two morphological operations, horizontal and vertical, are used for horizontal and vertical text respectively. The map document image is divided into a number of small blocks of size n x n, where the value of n can be chosen as required.

Priya and Gobu [28] used quantization to reduce the image's colour palette. The image is then decomposed using the wavelet transform to create feature vectors. Using FCM, the foreground and background clusters are created from these feature vectors. Erosion is also used to remove any leftover noise.

Lee and Kim [29] first divided the complex image into regions and then assigned labels to each segmented region using a CRF model. Each region provides a candidate character region, and these candidates are organised to create text lines. This segmentation also helps in handling photographs with complex backgrounds.

Choudhary et al. [30] applied binarization to the document image. The area of each connected component region is then determined, and the overly large or undersized regions are eliminated.
The true connected components are finally split into separate characters.

Gomathi et al. [31] proposed a trimmed mean approach for word segmentation. They started with image binarization and then performed skew detection and correction. The core regions are identified using the trimmed mean approach, and the gap width between words is calculated using the vertical projection profile. Finally, words are segmented from text lines using the trimmed mean as the threshold.

Gonzalez et al. [32] located the characters of a complicated image using the Maximally Stable Extremal Regions (MSER) technique. The MSER algorithm is used because it has superior stability and can detect several scales without smoothing them. Characters are then combined for text segmentation, which is performed with a classifier, and words are extracted from the resulting text.

Sun et al. [33] also performed image binarization first and then removed noise from the document image. They then found the connected components within the document image and created a bounding box around each one. The bounding boxes are used to find the average height and width of the connected components. Finally, components are merged to obtain the word segmentation.

Kaur and Mahajan [34] focused on various image binarization methods. Existing research has demonstrated that no single approach is ideal for all sorts of images. Although a few studies employed image filters to decrease noise, the guided filter was not used, even though it has the potential to increase the accuracy of current binarization approaches. Contrast enhancement is either done in the traditional way or not at all; adaptive contrast enhancement is therefore required. The majority of approaches do not employ an edge map that can efficiently map a specific character. To improve accuracy, the proposed approach incorporates complex image gradients as well as image contrast enhancement.

Ryu et al. [35] measured the spaces between and within words. Then, using feature vectors such as projection profiles, distances, and gap ratios that correspond to the adjacent pixels, they assigned labels to each gap by evaluating their cost functions.

Islam et al. [36] suggested an algorithm that combines connected component and edge approaches. An algorithm based on connected components can only separate text from non-text, while edge-based algorithms can only extract texture information; combining the two approaches therefore increases the accuracy of text extraction.

A. K. Chaubey [37] proposed a method for medical image segmentation.
Image segmentation is a primary and essential step in medical image processing. It plays an important role in determining image quality, but it is a challenging task. The proposed algorithm extracts the texture from a medical image. The author performed the extraction on an ultrasound image in portable gray map (PGM) format, using local and global thresholding and Otsu's algorithm. M. R. Gaikwad et al. [38] proposed a method of text extraction for comic images. Since comic images contain very high levels of noise, a preprocessing step is performed using a median filter. Median
filter is used here because of its edge-preserving properties. The authors worked on comic images from different newspapers and comic books and used a region-based method for text extraction. The extracted text is then recognized by an optical character recognition (OCR) process in which line segmentation is performed first, followed by word segmentation and then character segmentation. Finally, the text is recognized and stored in a database. The proposed methodology extracts text with good accuracy.

Mol et al. [39] proposed a method of text detection and recognition for scene text. The method has two broad stages: text detection and character recognition. It starts with a preprocessing step that uses fractional Poisson enhancement. Features are then extracted using the maximally stable extremal regions (MSER) technique, and region filtering is used to further enhance the text image. Next, segmentation is performed on the basis of connected components, and finally the characters are recognized using OCR. The method works well for high-resolution images as well as images with complex backgrounds.

Chiatti et al. [40] proposed a robust method of text extraction for screenshots taken from smartphones. The method is based on the OpenCV library and the Tesseract OCR module. The algorithm starts by converting the screenshot from RGB to grayscale. In the next step, the grayscale image is binarized, with thresholding done by a hybrid method that combines inverse thresholding and Otsu's binarization. Segmentation is then carried out using a connected-component approach. Finally, the segmented regions are passed to the OCR engine and the text is recognized with good accuracy. In the work by Akash et al.
[41], a mechanism was created to translate traffic instructions written in Bangla into English. The procedure has three steps: text extraction from a sign image, machine translation, and language-model post-processing. The Canny edge detector is used in the pre-filtering stage to detect edges while limiting the effect of noise. Natei et al. [42] proposed an approach that combines edge-based and connected-component algorithms and performs well in text extraction. The algorithm was tested on different images from the DIBCO 2017 dataset and produced good results. Manikandan et al. [43] proposed a novel approach that extracts text from a scene image and translates it into a desired
language. The method runs on the Android platform. First, text is extracted with the help of the stroke width transform (SWT) and a connected-component (CC) based approach. The extracted text is then recognized using the open-source OCR engine Tesseract. Finally, the recognized text is translated into the desired language with the help of Google's translation.

Kiumarsi and Alaei [44] presented a hybrid text-line segmentation approach for text extraction from handwritten documents. The method is based on grouping and projection profile analysis. It starts with a pre-processing step in which connected-component labelling is applied to the handwritten document image so that all the CCs can be extracted. Chains of connected components are then created, and separator lines are evaluated. Next, the projection profile is calculated and its maxima are detected to perform text-line segmentation, on the basis of which the text is extracted. Some tall CCs may be wrongly divided into two parts and assigned to more than one text line, so a post-processing stage is used to correct these segmentation errors and improve the overall performance.

The work by Koshy et al. [45] focused on the effect of preprocessing techniques such as thresholding, blurring and morphology. Because the accuracy of text extraction from a complex image is strongly affected by image quality, the image quality can be enhanced with better pre-processing. The quality of the pre-processed images is determined by image quality assessment (IQA), which can be subjective or objective. Subjective evaluation is based on how humans perceive image quality, whereas objective evaluation is based on an analysis of image quality carried out by computational models and algorithms.
Here, a reference-based IQA method is used, in which the original image is compared with a reference image. The authors applied different pre-processing techniques to a dataset and then used IQA to analyse the images. From the results they concluded that morphological opening gives better accuracy than the other techniques: it not only keeps the image intact but also enhances its quality. The other techniques behave differently depending on their function and make significant changes in noise reduction, image enhancement or binarization. Ghai and Jain [46] proposed an efficient method for text extraction from complex coloured images with different font sizes, alignments, colours and orientations, complex backgrounds, and various environmental conditions.
In this method, a sliding window is used to filter the high-frequency components. Feature extraction is performed using wavelet coefficients. Based on these features, k-means clustering is used to classify the complex image into text, simple background and complex background. Finally, text regions are located using voting decisions and area-based filtering. This method can also be used to extract multilingual as well as handwritten text. Deepa and Lalwani [47] proposed a technique that starts by classifying images on the basis of their features using a convolutional neural network. Text extraction is then performed with the OCR tool Tesseract, and the extracted text is stored.

Bolan et al. [48] discussed a method to extract information from invoices based on template matching. Pre-processing is performed first, followed by template matching and optical character recognition, and the information is finally stored in a database. Template matching is done using normalized correlation coefficient matching, which improves the accuracy of the overall system. The method accurately provides comprehensive information from an invoice, such as the amount, products, date and buyer details. Chakraborty et al. [49] discussed a text extraction algorithm implemented in MATLAB. In essence, they used OCR in addition to the template matching method. They first extract the MSERs from the text image, then segment the text using the Canny edge detector, and then filter the letters using the stroke width transform. Finally, the text regions are passed to OCR for extraction. The approach was applied to a variety of text image types and produced successful results. With traditional OCR techniques it is difficult to obtain ideal results for scene text recognition.
To address the speed and memory requirements of the text extractor, a compression technique that uses boundary samples and extractors was discussed by Waghade et al. [50]. The number of pseudo-samples required is reduced, and the integrated extractor is compressed into a more efficient extractor. A text feature based on local characteristics and a spatio-temporal histogram are also presented in this work. The results of integrating these two techniques show that the accuracy of scene text recognition can be significantly improved. Ranjitha et al. [51] proposed a method to extract text from images of various orientations and to classify it. Noise is first removed from the complex image in the pre-processing stage. MSER is then used to segment the image and extract key features. The result is then processed using SWT, and geometrical characteristics
are compared to the regions. Finally, all the processed regions are combined to obtain the text, which is then extracted using OCR. This method achieves 95.42% accuracy. Mukherjee et al. [52] proposed a method of feature extraction from handwritten text images. Features such as slant angle, skew angle, baseline and writing pressure are analysed, and the handwritten text images are treated as sets of white and black pixels. The variation in skew angle and baseline is limited; slant and writing pressure, however, show much variation, and this can be used to distinguish different handwritings.

Sumathi et al. [53] employed a hybrid technique for text extraction from scene images, combining guided image filtering with MSER. Text features such as stroke width, compactness and colour divergence are used to distinguish text from non-text, and an AdaBoost classifier performs the text/non-text classification. The method gives good text-detection accuracy by removing connected characters using the guided image filter. However, it is not suitable if the image is heavily blurred or has a coloured background. Text extraction techniques have also made a great contribution to image processing and to applications such as secure data transmission, data analysis and number-plate detection [54-67].
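Several of the pipelines reviewed above share a common core: binarize the image with Otsu's threshold, label the connected components, and put a bounding box around each component. The following is a minimal sketch of that core in plain Python on a tiny hand-made grayscale array. The function names (`otsu_threshold`, `connected_components`) and the toy image are illustrative assumptions and are not taken from any of the cited papers, which typically rely on libraries such as OpenCV.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of intensities in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def connected_components(binary):
    """Find 4-connected foreground regions with a breadth-first search.

    Returns one bounding box (top, left, bottom, right) per region,
    with inclusive coordinates, in row-major discovery order.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy image (assumed data): dark "characters" on a light background.
image = [
    [200, 200, 200, 200, 200, 200, 200],
    [200,  30,  30, 200, 200,  40, 200],
    [200,  30,  30, 200, 200,  40, 200],
    [200, 200, 200, 200, 200,  40, 200],
    [200, 200, 200, 200, 200, 200, 200],
]

threshold = otsu_threshold([p for row in image for p in row])
binary = [[p <= threshold for p in row] for row in image]  # True = text pixel
boxes = connected_components(binary)

heights = [bottom - top + 1 for top, left, bottom, right in boxes]
widths = [right - left + 1 for top, left, bottom, right in boxes]
avg_height = sum(heights) / len(heights)
avg_width = sum(widths) / len(widths)
```

The average height and width computed at the end correspond to the statistic that the component-merging approach is described as using. The rule for merging components into words is not specified in the text above, so it is left out of the sketch.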

Conclusion and Future Outlook

The literature reviewed above shows that researchers have worked on a variety of text detection and extraction techniques and have made a great contribution to the area of image processing. They have worked on a number of filtering algorithms and text extraction methods for coloured images, degraded historical documents, screenshots, land map images and so on. Noise such as Gaussian noise, exponential noise and impulse noise degrades and contaminates document images. Because image filters alone cannot recover the text borders, we must investigate how different filters behave when extracting text from images. Taking this into account, we will investigate the filters' performance on coloured complex images with varying background colours, intensities and lighting, as well as text with varying font sizes, shapes and alignments.
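The point that different filters behave differently on noisy text images can be made concrete with a small experiment. The sketch below, in plain Python, applies a 3x3 mean filter and a 3x3 median filter to a toy image that contains a vertical edge and a single impulse-noise pixel. The helper name `filter3x3`, the border handling and the test image are assumptions chosen for illustration only.

```python
from statistics import mean, median


def filter3x3(image, reducer):
    """Apply a 3x3 neighbourhood filter to a 2-D list of intensities.

    `reducer` maps the nine window values to one output value.
    Border pixels are handled by replicating the nearest valid pixel.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = reducer(window)
    return out


# Clean toy image (assumed data): dark left half, bright right half.
clean = [[10, 10, 10, 200, 200, 200] for _ in range(4)]

# Same image with one impulse ("salt") pixel at row 1, column 1.
noisy = [row[:] for row in clean]
noisy[1][1] = 255

median_out = filter3x3(noisy, median)
mean_out = filter3x3(noisy, mean)
```

On this toy input the median filter removes the impulse and leaves the edge where it was, while the mean filter spreads the impulse into its neighbours and blurs the edge. Real document images, and Gaussian or exponential noise in particular, may favour different filters, which is the question the outlook above leaves open.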


References

[1] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.

[2] Chibber, A., Anand, R., & Singh, J. Smart Traffic Light Controller Using Edge Detection in Digital Signal Processing. In Wireless Communication with Artificial Intelligence (pp. 251-272). CRC Press.

[3] Pandey, D., Wairya, S., Al Mahdawi, R., Najim, S. A. D. M., Khalaf, H., Al Barzinji, S., & Obaid, A. (2021). Secret data transmission using advanced steganography and image compression. International Journal of Nonlinear Analysis and Applications, 12(Special Issue), 1243-1257.

[4] Singh, S. K., Thakur, R. K., Kumar, S., & Anand, R. (2022, March). Deep Learning and Machine Learning based Facial Emotion Detection using CNN. In 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom) (pp. 530-535). IEEE.

[5] Pandey, B. K., Pandey, D., Wairya, S., & Agarwal, G. (2021). A deep neural network-based approach for extracting textual images from deteriorate images. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 8(28), e3e3.

[6] Anand, R., Singh, B., & Sindhwani, N. (2009). Speech perception & analysis of fluent digits' strings using level-by-level time alignment. International Journal of Information Technology and Knowledge Management, 2(1), 65-68.

[7] Jain, S., Kumar, M., Sindhwani, N., & Singh, P. (2021, September). SARS-Cov-2 detection using Deep Learning Techniques on the basis of Clinical Reports. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.

[8] Anand, R., Shrivastava, G., Gupta, S., Peng, S. L., & Sindhwani, N. (2018). Audio watermarking with reduced number of random samples. In Handbook of Research on Network Forensics and Analysis Techniques (pp. 372-394). IGI Global.

[9] Pandey, B. K., Pandey, D., Wairya, S., & Agarwal, G. (2021). An advanced morphological component analysis, steganography, and deep learning-based system to transmit secure textual data. International Journal of Distributed Artificial Intelligence (IJDAI), 13(2), 40-62.

[10] Pandey, D., Nassa, V. K., Jhamb, A., Mahto, D., Pandey, B. K., George, A. H., & Bandyopadhyay, S. K. (2021). An integration of keyless encryption, steganography, and artificial intelligence for the secure transmission of stego images. In Multidisciplinary Approach to Modern Digital Steganography (pp. 211-234). IGI Global.

[11] Madhumathy, P., & Pandey, D. (2022). Deep learning based photo acoustic imaging for non-invasive imaging. Multimedia Tools and Applications, 81(5), 7501-7518.

[12] Ratnaparkhi, S. T., Singh, P., Tandasi, A., & Sindhwani, N. (2021, September). Comparative analysis of classifiers for criminal identification system using face recognition. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-6). IEEE.

[13] Sindhwani, N., Rana, A., & Chaudhary, A. (2021, September). Breast Cancer Detection using Machine Learning Algorithms. In 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1-5). IEEE.

[14] Choudhary, P., & Anand, M. R. (2015). Determination of rate of degradation of iron plates due to rust using image processing. International Journal of Engineering Research, 4(2), 76-84.

[15] Pramanik, S., Ghosh, R., Pandey, D., & Ghonge, M. M. (2021). Data Hiding in Color Image Using Steganography and Cryptography to Support Message Privacy. In Limitations and Future Applications of Quantum Cryptography (pp. 202-231). IGI Global.

[16] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., & Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.

[17] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.

[18] Meivel, S., Sindhwani, N., Anand, R., Pandey, D., Alnuaim, A. A., Altheneyan, A. S., & Lelisho, M. E. (2022). Mask Detection and Social Distance Identification Using Internet of Things and Faster R-CNN Algorithm. Computational Intelligence and Neuroscience, 2022.

[19] Selva Kumari, R. S., & Sangeetha, R. (2015). Optical character recognition for document and newspaper. International Journal of Applied Engineering Research, 10, 15279-15285.

[20] Kumar, R., & Singh, A. (2010). Detection and segmentation of lines and words in Gurmukhi handwritten text. In IEEE 2nd International Advance Computing Conference (IACC), Patiala, India, Feb. 19-20, 2010 (pp. 353-356).

[21] Saha, S., Basu, S., Nasipuri, M., & Basu, D. K. (2010). A Hough transform based technique for text segmentation. J. Comput., 2(2), 134-141.

[22] Tripathi, S., Kumar, K., Singh, B. K., & Singh, R. P. (2012). Image segmentation: A review. International Journal of Computer Science and Management Research, 1, 838-843.

[23] Alaei, A., Pal, U., & Nagabhushan, P. (2011). A new scheme for unconstrained handwritten text-line segmentation. Pattern Recogn., 44(4), 917-928.

[24] Pan, Y. F., Liu, C. L., & Hou, X. (2010). Fast scene text localization by learning-based filtering and verification. In Proceedings of IEEE 17th International Conference on Image Processing (ICIP), Hong Kong, Sept. 26-29, 2010 (pp. 2269-2272).

[25] Sarkar, R., Malakar, S., Das, N., Basu, S., Kundu, M., & Nasipuri, M. (2011). Word extraction and character segmentation from text lines of unconstrained handwritten Bangla document images. J. Intell. Sys., 20, 227-260.


[26] Roy, P. P., Pal, U., Llados, J., & Delalandre, M. (2012). Multi-oriented touching text character segmentation in graphical documents using dynamic programming. Pattern Recogn., 45(5), 1972-1983.

[27] Biswas, S., & Das, A. K. (2012). Text extraction from scanned land map images. 2012 International Conference on Informatics, Electronics and Vision, ICIEV 2012, 231-236.

[28] Priya, M., & Gobu, C. K. (2013). A wavelet based method for text segmentation in color images. International Journal of Computer Applications, 69(3).

[29] Lee, S. H., & Kim, J. H. (2013). Integrating multiple character proposals for robust scene text extraction. Image Vision Comput., 31(11), 823-840.

[30] Choudhary, A., Rishi, R., & Ahlawat, S. (2013). A new approach to detect and extract characters from off-line printed images and text. In Procedia Computer Science, First International Conference on Information Technology and Quantitative Management (ITQM), Suzhou, China, May 16-18, 2013, Vol. 17 (pp. 434-440).

[31] Gomathi, S., Devi, R. S. U., & Mohanavel, S. (2013). Trimming approach for word segmentation with focus on overlapping characters. In International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, Jan. 4-6, 2013 (pp. 1-4).

[32] Gonzalez, A., Bergasa, L. M., & Yebes, J. J. (2014). Text Detection and Recognition on Traffic Panels From Street-Level Imagery Using Visual Appearance. IEEE Transactions on Intelligent Transportation Systems, 15(1), 228-238.

[33] Sun, Y., Mao, X., Hong, S., Xu, W., & Gui, G. (2019). Template matching-based method for intelligent invoice information identification. IEEE Access, 7, 28392-28401.

[34] Kaur, E. J., & Mahajan, R. (2014). Improved Degraded Document Image Binarization Using Guided Image Filter. International Journal of Scientific Research and Education, 2(07).

[35] Ryu, J., Koo, H. I., & Cho, N. I. (2015). Word Segmentation Method for Handwritten Documents based on Structured Learning. IEEE Signal Processing Letters, 22(8), 1161-1165.

[36] Islam, R., Islam, M. R., & Talukder, K. H. (2016). An approach to extract text regions from scene image. International Conference on Computing, Analytics and Security Trends, CAST 2016, 138-143.

[37] Chaubey, A. K. (2016). Comparison of The Local and Global Thresholding Methods in Image Segmentation. World Journal of Research and Review (WJRR), 2(1), 1-4.

[38] Gaikwad, M. R. (2016). Text extraction and recognition using median filter. International Research Journal of Engineering and Technology, 717-721.

[39] Mol, J., Mohammed, A., & Mahesh, B. S. (2018). Text recognition using poisson filtering and edge enhanced maximally stable extremal regions. 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies, ICICICT 2017, 2018-Janua, 302-306.

[40] Chiatti, A., Yang, X., Brinberg, M., Cho, M. J., Gagneja, A., Ram, N., Reeves, B., & Giles, C. L. (2017). Text extraction from smartphone screenshots to archive in situ Media Behavior. Proceedings of the Knowledge Capture Conference, K-CAP 2017.

[41] Akash, S. S., Kabiraz, S., Islam, S. A., Siddique, S. A., Huda, M. N., & Alam, I. (2019). A Real Time Approach for Bangla Text Extraction and Translation from Traffic Sign. 2018 21st International Conference of Computer and Information Technology, ICCIT 2018, 1-7.

[42] Natei, K. N., J, V., & S, S. (2018). Extracting Text from Image Document and Displaying its Related Information. Journal of Engineering Research and Applications, 8(5), 27-33.

[43] Manikandan, V., Venkatachalam, V., Kirthiga, M., Harini, K., & Devarajan, N. (2010). Word segmentation in a document image using spectral partitioning. In IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Coimbatore, India, Dec. 28-29, 2010 (pp. 1-4).

[44] Kiumarsi, E., & Alaei, A. (2018). A hybrid method for text line extraction in handwritten document images. Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR, 2018-August, 241-246.

[45] Koshy, A., N. B. M. J., S. A., & John, A. (2019). Preprocessing Techniques for High Quality Text Extraction from Text Images. 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT), pp. 1-4. doi: 10.1109/ICIICT1.2019.8741488.

[46] Ghai, D., & Jain, N. (2019). Comparative Analysis of Multi-scale Wavelet Decomposition and k-Means Clustering Based Text Extraction. In Wireless Personal Communications (Vol. 109, Issue 1). Springer US.

[47] Deepa, R., & Lalwani, K. N. (2019). Image Classification and Text Extraction using Machine Learning. Proceedings of the 3rd International Conference on Electronics and Communication and Aerospace Technology, ICECA 2019, 680-684.

[48] Su, B., Lu, S., & Tan, C. L. (2010). Binarization of historical document images using the local maximum and minimum. International Workshop on Document Analysis Systems, 159-166.

[49] Chakraborty, G., Panda, S., & Roy, S. (2020). Text Extraction From Image Using MATLAB. SSRN Electronic Journal, 1-6. https://doi.org/10.2139/ssrn.3525969

[50] Waghade, A. G., Zopate, A. V, Titare, A. G., & Shelke, S. A. (2018). Text Extraction from Text Based Image Using Android. 3-6.

[51] Ranjitha, P., Shamjiith, & Rajashekar, K. (2020). Multi-oriented text recognition and classification in natural images using MSER. 2020 International Conference for Emerging Technology, INCET 2020, 1-5.

[52] Mukherjee, S., & Ghosh, I. De. (2020). Feature Extraction from Text Images to Study Individuality of Handwriting. 4th International Conference on Computational Intelligence and Networks, CINE 2020, 3-8.

[53] Sumathi, C. P., Santhanam, T., & Devi, G. (2012). A Survey On Various Approaches of Text Extraction In Images. International Journal of Computer Science and Engineering Survey.

[54] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless Personal Communications, 79(3), 1713-1731.


[55] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).

[56] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.

[57] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.

[58] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.

[59] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).

[60] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Detials in Signtures Based. American Academic Scientific Research Journal for Engineering, Technology, and Sciences, 26(1), 250-260.

[61] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.

[62] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).

[63] Degerine, S., & Zaidi, A. (2004). Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach. IEEE Transactions on Signal Processing, 52(6), 1499-1512. doi: 10.1109/TSP.2004.827195.

[64] Zaidi, A. (2005). Positive definite combination of symmetric matrices. IEEE Transactions on Signal Processing, 53(11), 4412-4416. doi: 10.1109/TSP.2005.855077.

[65] Degerine, S., & Zaidi, A. (2007). Sources colorées. In Séparation de sources 1: concepts de base et analyse en composantes indépendantes, Traité IC2, série Signal et image [Separation of sources 1: basic concepts and analysis into independent components, Treatise IC2, Signal and image series]. Hermes Science. ISBN: 2746215179.

[66] Zaïdi, A. (2019). Necessary and Sufficient Conditions for the Existence of Robust Whitening Matrices. IEEE Signal Processing Letters, 26(6), 863-867. doi: 10.1109/LSP.2019.2909651.

[67] Degerine, S., & Zaidi, A. (2007). Determinant Maximization of a Nonsymmetric Matrix with Quadratic Constraints. SIAM Journal on Optimization, 17(4), 997-1014. doi: 10.1137/050622821.


Chapter 15

Machine Learning in the Detection of Diseases

Aakash Joon, Nidhi Sindhwani and Komal Saxena
Amity University Noida, Uttar Pradesh, India

Abstract

In the field of clinical imaging, computer-aided diagnosis (CAD) is a dynamic and rapidly developing area of research. In recent years, considerable effort has gone into improving computer-aided diagnostic applications, because errors in clinical diagnostic systems can lead to seriously misleading treatment. Machine learning plays an important role in computer-aided diagnosis. Machine learning (ML) is the study of computer algorithms that improve automatically through experience and the use of data. It is regarded as a part of artificial intelligence (AI). ML algorithms build a model from sample data, called training data, in order to make predictions or decisions without being explicitly programmed to do so. They are used in a wide variety of applications, for example in medicine, email filtering, speech recognition and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the required tasks. A subset of machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization provides methods, theory and application domains to the field of machine learning. Data mining is a related field of study that focuses on exploratory data analysis through unsupervised learning. Some ML applications use data and neural networks in a way that mimics the working of a biological brain. In its application to business problems, ML is also referred to as predictive analytics. Pattern recognition is essentially about learning from examples. In biomedicine, pattern recognition and machine learning promise to improve the accuracy of disease perception and diagnosis, and they add objectivity to the decision-making process. For the analysis of high-dimensional and multimodal biomedical data, machine learning offers a good approach to building sophisticated, automated algorithms. This chapter gives a comparative analysis of various ML algorithms for diagnosing different diseases such as heart disease, diabetes, liver disease, dengue fever and hepatitis.

Keywords: machine learning, artificial intelligence, clinical imaging

Corresponding Author's Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editor: Digvijay Pandey ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.

1. Introduction

Artificial intelligence (AI) can enable computers to think, and it is making computers more intelligent [1, 2]. Machine learning (ML) is a subfield of AI. Many researchers hold that intelligence cannot be developed without learning. Disease diagnosis is the process of determining which disease explains a person's symptoms. Some symptoms and signs are non-specific, which makes diagnosis a very challenging problem. Identifying a disease correctly is essential to treating it. ML is a field that can help predict a diagnosis from past training data. Researchers have developed many ML algorithms that work effectively for the diagnosis of various diseases [3-5]. ML models learn from patterns in training examples without explicit instructions and then use inference to produce useful predictions. Classification techniques are widely used in medicine to detect and better predict disease [4, 5]. Diseases and health problems such as liver disease, chronic kidney disease, breast cancer, diabetes and heart disease significantly affect a person's health and may lead to death if ignored. Classification is the most popular ML technique in clinical applications because it fits the problems that arise in everyday practice. A classification algorithm first uses the training data to build a model, and afterwards the model is applied to the test data to obtain a prediction. The predictions and results are very promising: these methods can reduce diagnostic errors, and results can be obtained in a short time [4, 5]. ML algorithms have been developed and used to analyze clinical data [6]. Today, machine learning offers a variety of tools for analyzing data effectively. In particular, recent advances in computing have provided relatively inexpensive and accessible means of collecting and processing data. Computers for collecting and analyzing data have been installed in new and existing hospitals, enabling them to collect and share large amounts of data. ML methods are very effective at analyzing clinical data, and much work has been done on diagnostic tasks. One of the many ML applications is building classifiers that separate data on the basis of their attributes. Such classifiers are used for clinical data analysis and disease detection [7].
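The train-then-test workflow described above can be sketched with a very small classifier. The example below is a minimal nearest-centroid classifier in plain Python. The two features, the toy values and the labels are invented purely for illustration and are not clinical data, and nearest-centroid is chosen only because it is short, not because the sources cited in this chapter use it.

```python
def fit_centroids(samples, labels):
    """Training step: compute the mean feature vector (centroid) per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Prediction step: return the label whose centroid is closest to x."""
    def squared_distance(label):
        return sum((c - v) ** 2 for c, v in zip(centroids[label], x))
    return min(centroids, key=squared_distance)


# Hypothetical toy data: two measurements per patient; 1 = diseased, 0 = healthy.
train_x = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [(0.9, 1.1), (3.2, 3.1), (1.2, 1.0), (2.8, 3.3)]
test_y = [0, 1, 0, 1]

# Build the model on the training data only.
model = fit_centroids(train_x, train_y)

# Apply the model to the held-out test data and measure accuracy.
predictions = [predict(model, x) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
```

Keeping the test samples out of `fit_centroids` is the essential design choice here: accuracy measured on data the model has already seen would overstate how well it predicts new cases.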

2. Types of Machine Learning Techniques

Various types of machine learning techniques are shown in Figure 1.

Figure 1. Types of machine learning techniques.


Machine learning is a branch of artificial intelligence. It gives machines a human-like view of the world and allows them to make decisions independently, without human intervention. It is the most common way of making machines act automatically. In practice, machine learning builds computer programs that can access data and use it to learn. There are many types of machine learning, and they are discussed in this chapter.

1. Supervised learning: A set of training examples is provided together with the correct targets, and on the basis of this training set the algorithm learns to respond correctly to all possible inputs. Supervised learning is also called learning from examples [8]. Classification and regression are the two types of supervised learning. Classification gives a yes-or-no prediction, e.g., "Is this tumor cancerous?" or "Does this treatment meet our quality standards?". Regression answers "how many" and "how much" questions.
2. Unsupervised learning: No correct responses or targets are provided. The method tries to find similarities among the data and groups the data according to those similarities. This is also known as density estimation. Unsupervised learning includes clustering, which creates groups based on similarity [8].
3. Semi-supervised learning: Semi-supervised learning is a class of supervised learning techniques that also uses unlabeled data for training, typically a small amount of labeled data together with a large amount of unlabeled data. It lies between unsupervised learning (unlabeled data) and supervised learning (labeled data) [9].
4. Reinforcement learning: This type of learning is inspired by behavioral psychology [9]. The algorithm is told when an answer is wrong, but not how to correct it. It must explore and test different possibilities until it finds the right answer. This is also known as learning with a critic, because no suggestions for improvement are offered. Reinforcement learning differs from supervised learning in that correct input/output pairs are never presented and suboptimal actions are not explicitly corrected. Additionally, it focuses on online performance.
5. Evolutionary learning: Biological evolution can be viewed as a learning process in which organisms adapt to improve their chances of survival and reproduction [10]. This model can be implemented on a computer by using a fitness function to check how good a candidate solution is.
6. Deep learning: This branch of machine learning is based on a set of algorithms that model high-level abstractions in data. It uses a deep graph with multiple processing layers, composed of many linear and nonlinear transformations.

3. Machine Learning Algorithms

3.1. K Nearest Neighbor Algorithm (KNN)

KNN [7] is a simple model and one of the most widely used machine learning methods for classification, pattern recognition, and regression. KNN finds neighbors among the data points using the Euclidean distance between them. The value of k (a constant defined by the user) determines how many of the most similar existing cases are compared with the new case, and the new case is assigned to the class most common among those k neighbors. The value of k is therefore critical and should be chosen carefully, because it strongly affects the result. The method has several limitations, for instance poor performance when the training dataset is large. The computation cost is also very high, since the distance from every training sample to each query example must be computed.
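The KNN procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the cited studies: the function name `knn_predict`, the toy feature values, and the choice k = 3 are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    # (this full scan is the cost the text warns about).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two features per case, label 0 = healthy, 1 = disease.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels = [0, 0, 0, 1, 1, 1]
prediction = knn_predict(points, labels, (4.9, 5.1), k=3)
```

Because every query scans the whole training set, the running time grows linearly with the number of training samples, which is the limitation noted above.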

3.2. K-Means Clustering Algorithm

The K-means clustering algorithm [7] is generally regarded as an unsupervised learning algorithm. It groups data by nearest neighborhood: the data are divided into k clusters according to the similarity between them. K is an integer, and its value must be known in advance for the algorithm to work. K-means is the most commonly used algorithm for clustering and for grouping new data. The initial k centroids are chosen at random. Each point is then assigned to its nearest centroid, and the centroid of each resulting cluster is recalculated. K-means is particularly sensitive to noise and outliers, because they affect the mean. The advantage of K-means is that it is easy to implement and computationally simple and efficient.
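The assign-then-recompute loop described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `k_means`, the fixed random seed, the iteration cap, and the toy points are hypothetical and do not come from the chapter.

```python
import math
import random

def k_means(points, k, max_iterations=100, seed=0):
    """Cluster `points` (tuples of floats) into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids picked at random

    def nearest(point):
        # Index of the centroid closest to `point` (Euclidean distance).
        return min(range(k), key=lambda i: math.dist(point, centroids[i]))

    for _ in range(max_iterations):
        labels = [nearest(p) for p in points]
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid = coordinate-wise mean of the cluster members.
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, [nearest(p) for p in points]

# Hypothetical toy data with two well-separated groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centers, cluster_labels = k_means(data, k=2)
```

Since each centroid is a mean, a single extreme point can pull it away from the bulk of its cluster, which is why the text describes K-means as sensitive to outliers.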

3.3. Support Vector Machine

SVM [7] has proven reliable for many classification problems. It tries to find the best separating hyperplane between classes by looking at the points that lie on the edge of the class descriptors. The distance between the classes is known as the margin, and a larger margin yields better classification accuracy. The data points lying on the margin boundary are called support vectors. SVM is used for both regression and classification problems, and it can handle both linear and nonlinear datasets. The SVM algorithm uses different kernel types, such as linear, radial basis function (RBF), and sigmoid, to build a prediction model. SVM works in a high-dimensional feature space and picks the best hyperplane for separating the data points into two classes. It is efficient for small datasets as well as for larger ones that are otherwise hard to handle.
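To make the terms above concrete, the sketch below writes out the three kernel types named in the text, the decision rule for a linear hyperplane, and the margin width. It is a minimal illustration, not a training procedure: the weight vector `w` and bias `b` are hypothetical values standing in for an already-trained linear SVM, and the kernel parameters `gamma` and `coef0` are assumed defaults.

```python
import math

def dot(x, z):
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    return dot(x, z)

def rbf_kernel(x, z, gamma=0.5):
    # Radial basis function: similarity decays with squared distance.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, z, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)

def classify(w, b, x):
    """Side of the hyperplane w.x + b = 0 on which x falls (+1 or -1)."""
    return 1 if dot(w, x) + b >= 0 else -1

def margin_width(w):
    """Distance between the margin boundaries w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(dot(w, w))

# Hypothetical parameters of an already-trained linear SVM.
w, b = (3.0, 4.0), -12.0
side = classify(w, b, (4.0, 4.0))
width = margin_width(w)
```

The margin width 2 / ||w|| shows why training minimizes the norm of w: a smaller norm means a wider margin between the two classes.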

3.4. Naive Bayes Algorithm

Naïve Bayes (NB) [11] is a simple probabilistic and statistical method for classification. In this machine learning approach, all features of the classifier are assumed to contribute equally and independently to the final decision. This simplicity translates into computational efficiency, which makes the NB approach attractive and suitable for a variety of fields. The main components of NB classification are the prior, posterior, and class-conditional probabilities. The method has many advantages. It is simple to use and well suited to large datasets. It can be used for both binary and multiclass classification problems. It requires relatively little training data and can be used for both discrete and continuous data. It can be used, for example, to filter and classify spam email.
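The roles of the prior, class-conditional, and posterior probabilities can be illustrated with a small categorical Naive Bayes sketch. Everything specific here is an assumption made for the example: the function names, the two binary symptom features, the labels, and the use of add-one (Laplace) smoothing, which the chapter does not mention.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
    return class_counts, value_counts, len(labels)

def posterior(model, row, n_values=2):
    """Posterior probability of each class for `row` (features assumed binary)."""
    class_counts, value_counts, n = model
    scores = {}
    for label, count in class_counts.items():
        score = count / n  # prior probability of the class
        for i, value in enumerate(row):
            # Class-conditional probability with add-one smoothing (an assumption).
            score *= (value_counts[(label, i)][value] + 1) / (count + n_values)
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

# Hypothetical toy data: (fever, cough) as 0/1 flags.
rows = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
labels = ["disease", "disease", "disease", "healthy", "healthy", "healthy"]
model = train_naive_bayes(rows, labels)
result = posterior(model, (1, 1))
```

The product over features is where the independence assumption enters: each feature contributes its own factor, regardless of the values of the others.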

3.5. Decision Tree Algorithm

A decision tree (DT) [12] is a supervised machine learning algorithm that handles regression and classification problems by splitting the data recursively according to specific variables. The data are split into nodes, and the leaves of the tree hold the final decision. The purpose of a decision tree is to build a model that can be used to predict the target variable. The tree is built from the training data during the training process. A terminal node contains the class label, whereas a decision node is a non-terminal node. Relationships between attributes do not affect its efficiency, and no prior knowledge of the data is required. A tree is always built recursively and is prone to overfitting.
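One step of the recursive splitting described above can be sketched as choosing the threshold on a single feature that gives the purest child nodes. The Gini impurity used here is one common splitting criterion and is an assumption of this sketch; the chapter does not say which criterion the surveyed tools use. The function names and the toy glucose values are likewise hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best `value <= threshold` split."""
    best_threshold, best_impurity = None, float("inf")
    # The largest value is skipped so that the right child is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity

# Hypothetical toy data: one numeric feature and a binary class label.
glucose = [85, 90, 110, 150, 160, 170]
diabetic = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(glucose, diabetic)
```

A full tree applies the same search recursively to each child until the nodes are pure or a stopping rule is met. Growing until every leaf is pure is what makes an unpruned tree prone to overfitting.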

3.6. Logistic Regression (LR)

Logistic regression [13] is a supervised learning algorithm. It is a statistical model that, in its basic form, uses a logistic function to model a binary dependent variable, although more complex extensions exist. Logistic regression is a predictive model: it estimates the probability that a given data item belongs to a particular class. It uses the sigmoid function to model the data. Logistic regression has several significant advantages, including ease of implementation, computational efficiency, a simple relationship to the input features, and no need for feature scaling. However, it cannot solve nonlinear problems and is prone to overfitting.
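The role of the sigmoid function can be illustrated with a one-feature logistic regression fitted by plain gradient descent. This is a minimal sketch under stated assumptions: the function names, learning rate, number of epochs, and toy data are hypothetical, and real tools typically use more refined optimizers.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical toy data: a single feature and a binary outcome.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * 0.5 + b)   # predicted probability for a low feature value
p_high = sigmoid(w * 4.0 + b)  # predicted probability for a high feature value
```

A case is assigned to the positive class when the predicted probability exceeds a threshold, commonly 0.5. The decision boundary w * x + b = 0 is linear, which is why the method struggles with nonlinear problems.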

4. Diagnosis of Diseases by Using Different Machine Learning Algorithms

Many researchers have applied various machine learning algorithms to the diagnosis of diseases and have concluded that these algorithms perform well across a range of diseases. The diseases covered in this review chapter are heart disease, diabetes, liver disease, dengue, and hepatitis.

4.1. Heart Disease

The authors in [14] presented a system for analysis and monitoring. The proposed system detects and monitors coronary artery disease. The heart disease dataset used consists of 303 cases and 76 attributes/features, of which 13 are used. Two tests with three algorithms, Bayes Net, support vector machine (SVM), and functional trees (FT), were performed for detection, using the WEKA tool. In the holdout test, SVM achieved an accuracy of 88.3%. In the cross-validation test, both SVM and Bayes Net gave an accuracy of 83.8%, while FT achieved 81.5%. The seven best features were then selected using the best-first selection algorithm, and cross-validation was used for validation. On these seven selected features, Bayes Net achieved 84.5% accuracy, SVM 85.1%, and FT 84.5%.

The authors in [15] worked on diagnosing heart disease using the Naive Bayes algorithm. Naive Bayes is based on Bayes' theorem and therefore makes a strong independence assumption. The dataset used comes from one of the leading diabetes research institutes in Chennai and includes 500 patients. WEKA was used as the tool, and classification was performed with a 70% percentage split. Naive Bayes gave an accuracy of 86.419%.

In [16], the authors suggested using data mining approaches to detect heart disease. The WEKA data mining tool, which contains a collection of machine learning algorithms, was used for this purpose. Naive Bayes, J48, and bagging were applied. The UCI Machine Learning Repository provides a heart disease dataset with 76 attributes, of which only 11 were used for prediction. Naive Bayes gave an accuracy of 82.31% and J48 an accuracy of 84.35%.

The authors in [17] attempted to diagnose heart disease in diabetic patients using machine learning methods. The Naive Bayes and SVM algorithms were implemented with WEKA. The dataset of 500 patients was collected at a research institute in Chennai. The disease was present in 142 patients and absent in 358. Naive Bayes achieved an accuracy of 74%, while SVM gave the highest accuracy, 94.60%.
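The two evaluation protocols mentioned in these studies, a holdout (percentage) split and cross-validation, differ only in how the sample indices are partitioned. The sketch below shows both for a dataset of 303 cases. The function names, the 70% training fraction for the holdout example, and k = 10 folds are assumptions for illustration, and the sketch omits the shuffling and stratification that are usually applied in practice.

```python
def holdout_split(n_samples, train_fraction=0.7):
    """Split indices once into a training part and a test part."""
    cut = int(n_samples * train_fraction)
    indices = list(range(n_samples))
    return indices[:cut], indices[cut:]

def k_fold_splits(n_samples, k=10):
    """Return k (train, test) index pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    splits, start = [], 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder evenly
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, test))
        start += size
    return splits

train_idx, test_idx = holdout_split(303)
folds = k_fold_splits(303, k=10)
```

Because the two protocols train and test on different subsets, the same algorithm can report different accuracies under each, as the SVM figures above show.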

4.1.1. Analysis

Among the existing approaches, SVM offers the highest accuracy, 94.60%, reported in 2012. SVM shows excellent performance in many application areas. The attributes or features used by Parthiban and Srivatsa in 2012 respond well to SVM. On the selected features in [14], SVM offers an accuracy of 85.1%, which is lower than the 2012 result. The training and testing sets of the two datasets are different, and so are the data types.

4.2. Diabetes Disease

The authors in [18] worked on predicting diabetes using decision tree and Naive Bayes. The disease occurs when insulin production is insufficient or insulin is used improperly. The dataset used in this work is the Pima Indian diabetes dataset. Various tests were performed using the WEKA data mining tool. On this dataset, the percentage split (70:30) predicted better than cross-validation. J48 showed 74.8698% and 76.9565% accuracy using cross-validation and percentage split, respectively. Naive Bayes gave 79.5652% accuracy using percentage split. The algorithms showed their highest accuracy with the percentage split test.

Meta-learning algorithms for diagnosing diabetes were examined by the authors in [19]. The Pima Indian diabetes dataset, available from the UCI Machine Learning Repository, was used, and WEKA was used for the analysis. The CART, AdaBoost, LogitBoost, and grading learning algorithms were used to predict whether or not patients had diabetes. CART gave an accuracy of 78.646%, AdaBoost 77.864%, and LogitBoost 77.479%. Grading had a correct classification rate of 66.406%. CART thus gave the highest accuracy, 78.646%, and a misclassification rate of 21.354%, which is lower than that of the other techniques.

Experimental work was performed by the authors in [20] to predict diabetes. SVM was the machine learning method used in this experiment, with the RBF kernel used for classification. The Pima Indian diabetes dataset was taken from the machine learning repository of the University of California, Irvine. MATLAB 2010a was used to run the experiment. SVM gave 78% accuracy.

The authors in [21] worked on Naive Bayes for predicting type 2 diabetes. Diabetes has three types: type 1 diabetes, type 2 diabetes, and gestational diabetes. Type 2 diabetes results from increased insulin resistance. The dataset consists of 415 cases collected from different parts of India. MATLAB was used with SQL Server for model development. Naive Bayes achieved 95% correct predictions.
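Accuracy and misclassification rate, as reported above, are complementary: they always sum to 100%. The sketch below computes both from lists of true and predicted labels. The counts are hypothetical and were chosen only so that the arithmetic reproduces the 78.646% / 21.354% pair quoted in the text. They assume a dataset of 768 instances, which the chapter does not state, and they are not data from the cited study.

```python
def accuracy_and_error(y_true, y_pred):
    """Return (accuracy, misclassification rate) as fractions of all cases."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    return accuracy, 1.0 - accuracy

# Illustrative counts only: 604 correct predictions out of an assumed 768 cases.
y_true = [1] * 768
y_pred = [1] * 604 + [0] * 164
accuracy, error = accuracy_and_error(y_true, y_pred)
accuracy_percent = round(accuracy * 100, 3)
error_percent = round(error * 100, 3)
```

Accuracy alone does not show which class is being misclassified. On imbalanced clinical datasets it is usually read together with sensitivity and specificity.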

4.2.1. Analysis

The system based on Naive Bayes is useful for diagnosing diabetes. Naive Bayes offered the highest accuracy, 95%, in 2012. The results show that this system can make good predictions with minimal error and that the technique is important for diagnosing diabetes. In 2015, however, the accuracy given by Naive Bayes was lower, at 79.5652%. The proposed diabetes detection model requires more training data for its creation and testing.

4.3. Liver Disease

The authors in [22] predicted liver disease using support vector machine and Naive Bayes classification algorithms. The ILPD dataset was obtained from UCI and comprises 560 instances and 10 attributes. The comparison was based on accuracy and execution time. Naive Bayes showed an accuracy of 61.28% in 1670.00 ms, while SVM achieved 79.66% in 3210.00 ms. MATLAB was used for the implementation. SVM showed the highest accuracy compared with Naive Bayes for predicting liver disease. In terms of execution time, Naive Bayes took less time than SVM.

The authors in [23] used the Naive Bayes, K-Star, and FT tree data mining algorithms to classify liver disease. The dataset, taken from UCI, contains 345 instances and 7 attributes. Cross-validation was performed using the WEKA tool. Naive Bayes gave 96.5% accuracy, and FT tree achieved 97.10% in 0.2 seconds. The K-Star algorithm classified instances with about 83.47% accuracy. Based on these results, the FT tree gives the highest classification accuracy on the liver disease dataset compared with the other data mining algorithms.

4.3.1. Analysis

For diagnosing liver disease, the FT tree algorithm gives the best result compared with the other algorithms. When the FT tree algorithm is applied to the liver disease dataset, the time taken to produce the result or build the model is shorter than for the other algorithms. Given the attributes of the dataset, it shows improved performance. The algorithm classifies the attributes correctly and gives an accuracy of 97.10%. Based on these results, the algorithm plays an important role in achieving reliable classification on this dataset.

4.4. Dengue

Dengue is a serious infectious disease. In [24], the WEKA data mining tool was used with two tests. Using 10-fold cross-validation, the decision tree (DT) gave an accuracy of 99.95% and rough set (RS) theory an accuracy of 100%. In addition to the percentage split (PS) test, a neural network approach was proposed for making predictions in dengue patients, using the MATLAB neural network toolbox.

4.4.1. Analysis

For dengue classification, the RS classifier is, in theory, in a strong position. The selected classifier is used for prototyping. RS is a promising rule-based method, and it is also better than the neural network in terms of time. The algorithm is suited to complex problems.

4.5. Hepatitis Disease

The data mining algorithms used for hepatitis diagnosis in [25] are Naive Bayes, Naive Bayes Updateable, FT tree, K-Star, J48, LMT, and neural network (NN). The hepatitis dataset was taken from the UCI Machine Learning Repository. Classification results were assessed in terms of accuracy and time. The analysis was carried out using neural networks and the WEKA data mining tool. The results obtained with the neural network were lower than those of the algorithms used in WEKA. In this analysis of hepatitis diagnosis, the second technique used was rough set theory, implemented with WEKA. The rough set technique performs better than NN, especially for medical data analysis. Naive Bayes gave an accuracy of 96.52% in 0 sec, and Naive Bayes Updateable 84% in 0 sec. FT tree gave 87.10% in 0.2 sec, and K-Star 83.47% in 0 sec. J48 achieved 83% in 0.03 sec, and LMT 83.6% in 0.6 sec. The neural network showed 70.41% accuracy. Naive Bayes is the best classification algorithm used with the rough set technique: it offers high accuracy in the least time.

The authors used the C4.5, ID3, and CART algorithms for diagnosing hepatitis. This study used the UCI hepatitis patient dataset, and the WEKA tool was used for the analysis. CART handled missing values very well and therefore showed the highest classification accuracy, 83.2%. ID3 gave 64.8% accuracy and C4.5 gave 71.4%. In a binary decision tree produced by the CART algorithm, each node has exactly two children or none, whereas a DT built by C4.5 or ID3 can have two or more children. CART performs well in terms of both accuracy and time complexity.

5. Discussion and Analysis of Machine Learning Techniques

Many machine learning algorithms work very well for diagnosing heart disease, diabetes, liver disease, dengue, and hepatitis. The existing literature shows that Naive Bayes and SVM are widely used for disease detection, and both are more accurate than the other algorithms. Artificial neural networks are also very useful for prediction and achieve the highest output, but they take longer than other algorithms. Tree algorithms are also used but are not widely accepted because of their complexity. Handling the dataset attributes correctly also improves accuracy. RS theory is not widely used, but it provides the best performance.

Machine learning has emerged in the medical sector as a source of tools for analyzing disease-related data, and machine learning algorithms therefore play an important role in early disease detection. This chapter reviews various machine learning algorithms for disease prediction, using standard datasets for diseases such as liver disease, chronic kidney disease, breast cancer, heart disease, brain tumors, and many others. The results reported by researchers for disease detection have been tabulated by ML algorithm [26]. A comparison of the different disease prediction models shows that some algorithms, namely SVM, K-nearest neighbors, random forests, and decision trees, predict with excellent accuracy. However, the accuracy of a given algorithm can vary from dataset to dataset, because many important factors, such as the dataset itself, feature selection, and the number of features, affect the accuracy and performance of the model. Another important finding of this review is that the accuracy and performance of a model can be improved by combining different algorithms into a single ensemble model.

This chapter described related studies on disease detection using a variety of machine learning algorithms, covering works published in the period 2018-2020. The peer-reviewed studies examined use algorithms such as logistic regression, K-nearest neighbor, support vector machine, K-means, decision tree, random forest, and others. These algorithms have been applied to several standard datasets, and accuracy and sensitivity were chosen as the performance measures. For KNN and SVM, the algorithms were applied to two different datasets and feature sets. The results show that SVM is better, exceeding KNN by around 3%. In one comparison, the accuracy of decision trees is around 1% better than that of random forest, while in another, random forest outperforms decision trees by 12%. In general, SVM and logistic regression are primarily binary classifiers. Decision tree, random forest, KNN, Naive Bayes, and CNN can classify data into three or more classes, and K-means produces multiple clusters. All of the algorithms work well when the dataset is small, but for large datasets it is sensible to use deep learning algorithms such as CNN. Overall, the accuracy of an algorithm depends on the size of the dataset, the number of features, and the outcomes to be predicted.
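The observation that combining different algorithms into a single ensemble can improve accuracy can be illustrated with majority voting, one of the simplest ensemble schemes. This is a sketch under stated assumptions: the chapter does not say which ensemble method the surveyed studies used, and the function name and the three prediction lists below are hypothetical.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model prediction lists into one list by majority vote per case."""
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*predictions_per_model)
    ]

# Hypothetical predictions of three classifiers for the same four patients.
svm_pred = [1, 0, 1, 1]
knn_pred = [1, 1, 0, 1]
tree_pred = [0, 0, 1, 1]
ensemble_pred = majority_vote([svm_pred, knn_pred, tree_pred])
```

With an odd number of binary classifiers there are no ties. The vote helps most when the individual models make different kinds of errors, so that a mistake by one is outvoted by the others.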

6. Benefits of Machine Learning in Diagnosis of Diseases

6.1. Identifying and Diagnosing Diseases

Machine learning can identify and diagnose diseases and conditions that are otherwise difficult to diagnose. This includes cancers that are hard to detect in their early stages, as well as other genetic disorders. IBM Watson Genomics is probably the best example of cognitive computing in this area.

6.2. Drug Discovery and Manufacturing

Early-stage drug discovery is one of the main clinical benefits of machine learning in medicine. It also draws on R&D technologies such as next-generation sequencing and precision medicine, and it can help find alternative treatment options for complex diseases. Even with unsupervised learning, patterns can be found in the data without making predictions, which makes it one of the most useful machine learning techniques. Project Hanover is a Microsoft initiative that uses ML-based technologies in several projects. These include AI-based technology for cancer treatment and personalized drug combinations for acute myeloid leukemia.

6.3. Medical Imaging Diagnostics

Computer vision is a breakthrough technology that builds on both machine learning and deep learning. Microsoft's InnerEye is perhaps the best-known example of computer vision in this field. It focuses on diagnostic imaging tools for image analysis. As ML becomes more accessible and more capable, more data sources from the full range of medical images can be expected to become part of the AI-driven diagnostic process.

6.4. Personalized Medicine

ML in healthcare supports the development of personalized treatments, which are more efficient and effective when combined with predictive analytics. It also opens the way for further research and better assessment of disease. At present, physicians are limited to choosing from a restricted set of diagnoses, or to estimating the risk to the patient from their clinical history and the available genetic information. Biosensors and devices with advanced health-monitoring capabilities will soon enter the market, which will further increase the data available to the latest ML-based healthcare technologies.

6.5. Smart Health Records

Maintaining up-to-date health records is a tedious process. Technology has helped with data entry, but most of these processes still take a great deal of time to complete. In healthcare, ML streamlines the process and saves time, money, and effort. Document classification approaches that use ML-based OCR technology are gradually being adopted for record keeping.

6.6. Clinical Trials and Research

There are concerns about how ML works and is applied in the healthcare industry. Even so, ML has a wide range of applications in clinical trials and research. Clinical trials are expensive and can take years to complete. ML-based predictive analytics can be used to identify potential candidates for clinical trials, helping researchers build candidate pools from a variety of data sources, including social media, previous doctor visits, and more. ML is also used to ensure real-time data access and monitoring of trial participants. It helps determine the best sample size to test and makes use of electronic records. All of this reduces data errors.

6.7. Crowdsourced Data Collection

Crowdsourcing is widespread in the medical field today. It gives researchers and practitioners access to a vast amount of information, and such real-time health data will also influence the future of medicine. Apple offers its own research kit, which gives users access to interactive apps that use ML-based facial recognition to help treat Parkinson's disease and Asperger's syndrome. AI-based neural networks are valuable for data collection and disease prediction. They can predict everything from disease outbreaks to serious chronic infections. Outbreak prediction is especially important in underdeveloped countries. ProMED-mail is an internet-based reporting platform that helps monitor evolving and emerging infectious diseases and reports outbreaks in real time.

6.8. Drug Products

ML applications are already used in healthcare for drug development. ML also has a number of applications in anesthesiology.

7. Challenges of Machine Learning in Detection and Diagnosis of Diseases

7.1. Data Inconsistency

Artificial intelligence and machine learning can improve all processes in healthcare. ML requires high-quality structured data to make accurate predictions, and gaps in clinical data can lead to incorrect predictions, which can seriously affect clinical decisions. The lack of homogeneity of healthcare data currently limits the widespread adoption of ML. Handwritten records and fragmented data often lead to incomplete insights and wrong conclusions. Moreover, before machine learning algorithms can make use of the data, it must first be organized and cleaned.

7.2. Lack of Qualified Leaders

The shortage of good data engineers is also an obstacle for ML. Likewise, one of the main barriers to the spread of artificial intelligence and machine learning is the lack of trained professionals.

7.3. Provider Resistance

Overcoming providers' resistance to adopting machine learning in healthcare is the most difficult obstacle. Clinical institutions need to upgrade or replace legacy systems in order to adopt machine learning. This requires resources that may be scarce, especially during public health emergencies.

7.4. Data Security

Data security issues can be summarized as data confidentiality, data authentication, data integrity, and data freshness [27, 28]. Cryptographic techniques are the best way to meet these security requirements [29-37]. Cryptographic algorithms are the basis for security and privacy protection in smart city applications, since they keep untrusted parties out during data storage, processing, and sharing. In what follows, we attempt to summarize the cryptographic techniques currently used in smart city architectures. IoT-based systems and biometrics can also be used for authentication. In particular, these technologies can be used to identify a person automatically by their unique behavioral and biological characteristics. Biometric data are extracted from fingerprints, faces, voices, handwritten signatures, and so on. Blockchain addresses the basic challenges of privacy, security, and transparency for personal, organizational, and operational data. Various kinds of smart city transactions can be recorded on the blockchain. Through smart contracts, complex legal processes can be encoded, and data exchange then takes place accordingly. With smart contracts and decentralized applications, blockchain provides strong support for carrying out smart transactions during smart city operation. Blockchain can offer features such as reliable authentication, privacy, security, consistent operation, and streamlining [38-42].

Conclusion and Outlook

Statistical estimation models that fail to deliver good performance results have flooded the assessment area. Statistical models cannot handle categorical data, missing values, or large numbers of data points. ML plays an important role in many applications, such as image detection, data mining, natural language processing, and disease diagnosis, and in all of these areas it offers possible solutions. This chapter presents an overview of different machine learning techniques for diagnosing diseases such as heart disease, diabetes, liver disease, dengue, and hepatitis. Many algorithms show good results because they identify the attributes accurately. In the studies surveyed, SVM gave the best accuracy, 94.60%, for the detection of heart disease. Diabetes was diagnosed most accurately by Naive Bayes, with the highest classification accuracy of 95%. FT gave 97.10% accuracy in diagnosing liver disease. For dengue detection, RS theory achieved 100% accuracy. For hepatitis (HCV), a feed-forward neural network classified accurately, giving 98% accuracy. This survey highlights the strengths and weaknesses of these algorithms. An overview of improvements to machine learning algorithms for disease prediction is also presented. The analysis shows that these algorithms improve accuracy for a variety of diseases.

ML applications in healthcare have the potential to transform diagnosis and treatment. The technology is already in use and has shown its effectiveness in the early detection of chest, skin, and liver diseases. In diagnostic imaging and surgical suturing, ML algorithms already outperform specialists, and they achieve better results considerably faster. Machine vision is a common thread in these diagnostic applications. It is important that advances in machine learning be tied to reliable diagnostic applications. Finally, the adoption of this approach in practice and the extent of its role in diagnosis will depend on the outcome of trials.


Chapter 16

Applications for Text Extraction of Complex Degraded Images

Binay Kumar Pandey1, Digvijay Pandey2, Vinay Kumar Nassa3, A. Shaji George4, PhD, Sabyasachi Pramanik5 and Pankaj Dadheech6

1Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, India
2Department of Technical Education, Institute of Engineering and Technology (IET), Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, India
3Computer Science Engineering Department, Rajarambapu Institute of Technology, Rajaramnagar (Islampur-Maharashtra), India
4Department of Information and Communication Technology, Crown University, Int’l. Chartered Inc. (CUICI), Santa Cruz, Argentina
5Department of Computer Science and Engineering, Haldia Institute of Technology, West Bengal, India
6Department of Computer Science & Engineering (NBA Accredited), Swami Keshvanand Institute of Technology, Management and Gramothan (SKIT), Jaipur, Rajasthan, India

Corresponding Author’s Email: [email protected].

In: The Impact of Thrust Technologies on Image Processing Editor: Digvijay Pandey ISBN: 979-8-88697-832-2 © 2023 Nova Science Publishers, Inc.

Abstract

In recent years, there has been an increase in demand for preserving historical documents and books and converting them to digital format. Furthermore, the rapid development of information technology and the rapid spread of the Internet have resulted in massive amounts of image and video data. The text in images and videos aids in analysing them, as well as in indexing, archiving, and retrieval. Various types of noise, such as Gaussian noise, salt and pepper noise, and speckle noise, can easily affect an image. To remove such noise, image filtering algorithms such as the Gaussian filter, mean filter, and median filter are used. This chapter examines the impact of pre-processing techniques such as thresholding, morphology, and deblurring on text extraction. The experimental results show that pre-processing improves the visual and structural quality of the document.

Keywords: complex degraded image, thresholding, morphology, edge detection, OCR

1. Introduction

The growth of the Internet has led to a huge increase in image and video databases, and most of these images and videos contain text. Moreover, in recent times there has been an increase in demand to preserve historical documents and books by converting them into digital format. However, extracting text from natural images, scanned documents, or videos is tedious work. Noise often distorts an image during acquisition, processing, transmission, and reproduction, and restoring the original image by eliminating that noise is one of the fundamental objectives of image processing. Despite the numerous algorithms developed in the past, more work is needed to improve text extraction from images and videos. The main challenges in extracting text from complex degraded images are word orientation, font size, background variety, low image quality, differences in text colour, and noise interference. Pre-processing approaches are therefore crucial for improving the quality of text extraction [1-3]. In this work, we look at how fundamental pre-processing approaches affect image quality and, as a result, text extraction.

1.1. Noise

Noise is a random variation in a digital image that causes pixels to take intensity values different from their true values. Several types of noise affect images in different ways. The following are among the most prevalent:

1.1.1. Gaussian Noise

This noise mainly arises during image acquisition. It is additive by nature. Gaussian noise models are widely used chiefly because they are mathematically tractable in both the spatial and frequency domains. The PDF of a Gaussian random variable, shown in Figure 1, is defined by the following equation:

$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(z-\bar{z})^{2}}{2\sigma^{2}}}$$

where
$z$ = intensity
$\bar{z}$ = mean value of $z$
$\sigma$ = standard deviation
$\sigma^{2}$ = variance of $z$

Figure 1. PDF of Gaussian noise.
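The Gaussian PDF and the additive nature of this noise can be sketched in plain Python. This is a minimal illustration rather than the chapter's MATLAB implementation; the function names, the list-of-rows image layout, the clipping to [0, 255], and the fixed seed are all assumptions made for the example.

```python
import math
import random


def gaussian_pdf(z, mean, sigma):
    """Evaluate the Gaussian PDF p(z) for the given mean and standard deviation."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((z - mean) ** 2) / (2.0 * sigma ** 2))


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise to an 8-bit grayscale image (list of rows)."""
    rng = random.Random(seed)  # fixed seed (assumption) keeps the sketch repeatable
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            value = pixel + rng.gauss(0.0, sigma)  # additive noise model
            # Clip back into the valid 8-bit range [0, 255].
            noisy_row.append(min(255, max(0, int(round(value)))))
        noisy.append(noisy_row)
    return noisy


image = [[128] * 4 for _ in range(4)]          # flat mid-grey test image
noisy = add_gaussian_noise(image, sigma=10.0)  # same size, perturbed intensities
```

A larger `sigma` spreads the noisy intensities further from the true pixel values, which matches the role of the standard deviation in the PDF.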


1.1.2. Impulse Noise

This noise is also known as “salt and pepper” noise because it causes black dots to appear in bright areas and white dots to appear in dark areas. It primarily occurs during transmission or conversion. In an 8-bit image, the typical salt pixel value is 255, while the typical pepper pixel value is 0. The PDF of impulse noise, shown in Figure 2, is as follows:

$$p(z) = \begin{cases} P_a & \text{for } z = a \\ P_b & \text{for } z = b \\ 0 & \text{otherwise} \end{cases}$$

Figure 2. PDF of impulse noise.
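A short sketch of how impulse noise following this PDF might be injected into an 8-bit image is given here. It assumes a = 0 (pepper) with probability P_a and b = 255 (salt) with probability P_b; the function name, parameter names, and fixed seed are hypothetical and chosen only for illustration.

```python
import random


def add_salt_and_pepper(image, p_pepper, p_salt, seed=0):
    """Corrupt a grayscale image (list of rows) with impulse noise.

    Each pixel becomes 0 with probability p_pepper, 255 with probability
    p_salt, and keeps its original value otherwise.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            u = rng.random()  # uniform draw in [0, 1)
            if u < p_pepper:
                noisy_row.append(0)        # pepper: black dot
            elif u < p_pepper + p_salt:
                noisy_row.append(255)      # salt: white dot
            else:
                noisy_row.append(pixel)    # pixel left untouched
        noisy.append(noisy_row)
    return noisy


clean = [[100, 150], [200, 50]]
corrupted = add_salt_and_pepper(clean, p_pepper=0.1, p_salt=0.1)
```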

2.1. Pre-Processing Methods

Image pre-processing techniques can be applied to a wide range of advanced image processing tasks. The nature of these algorithms, and how they may be used to accelerate the development of image processing, is explained in [1]. Text extraction approaches use either generic or task-specific pre-processing steps.

3. Blurring

Blurring removes high-frequency components of the image, such as noise. Various filters can be used for blurring, but only two of them, the Gaussian filter and the median filter, are examined in this work.

3.1. Gaussian Filter

The Gaussian filter blurs an image using a Gaussian kernel and is able to suppress Gaussian noise. The Gaussian function in two dimensions is defined as follows:

$$G(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

where
$x$ = distance along the horizontal axis
$y$ = distance along the vertical axis
$\sigma$ = standard deviation
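To make the definition concrete, the following sketch samples G(x, y) on a small grid and uses the result to blur an image. It is a dependency-free illustration under stated assumptions: the kernel size is odd, the kernel is normalised to sum to 1, and borders are handled by edge replication. None of these choices is specified in the chapter, and the function names are hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample G(x, y) on a size x size grid centred at the origin, then normalise."""
    half = size // 2  # size is assumed to be odd
    kernel = [
        [
            math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
            for x in range(-half, half + 1)
        ]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]  # weights now sum to 1


def gaussian_blur(image, kernel):
    """Filter a grayscale image (list of rows), replicating edge pixels at the border."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii = min(h - 1, max(0, i + di))  # clamp row index
                    jj = min(w - 1, max(0, j + dj))  # clamp column index
                    acc += image[ii][jj] * kernel[di + half][dj + half]
            out[i][j] = acc
    return out


kernel = gaussian_kernel(size=3, sigma=1.0)
spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 255                       # single bright pixel
blurred = gaussian_blur(spike, kernel)  # the spike is spread over its neighbours
```

Because the kernel is symmetric, correlation and convolution give the same result here, so the sketch does not flip the kernel.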

3.2. Median Filter

The median filter first calculates the median of all the pixels inside the defined neighbourhood and then replaces the centre pixel with that median value. This preserves edges while reducing noise, and it is effective at removing salt and pepper noise. The median filter can be defined by the following expression:

$$\hat{f}(x, y) = \underset{(s,t) \in S_{xy}}{\operatorname{median}} \{ g(s, t) \}$$

where $g$ is the noisy input image and $S_{xy}$ is the set of coordinates in the window centred at $(x, y)$.
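The median filter described here can be sketched as follows. The 3 x 3 window, the edge-replication border handling, and the function name are assumptions for the example, not details taken from the chapter.

```python
from statistics import median


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    h, w = len(image), len(image[0])
    half = size // 2  # size is assumed to be odd
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Collect the window, clamping indices so border pixels are replicated.
            window = [
                image[min(h - 1, max(0, i + di))][min(w - 1, max(0, j + dj))]
                for di in range(-half, half + 1)
                for dj in range(-half, half + 1)
            ]
            out[i][j] = median(window)
    return out


# A flat image corrupted by one salt pixel (255) and one pepper pixel (0).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
restored = median_filter(noisy)
```

In this example every window contains at most two outliers out of nine values, so the median is always the background value and both impulses disappear.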

4. Thresholding

Thresholding methods convert a grayscale image into binary form using pixel intensity as the criterion. Thresholding is a basic way of segmenting an image into two regions, foreground and background, and the output is determined by an intensity threshold [4]. If a pixel's intensity is greater than the threshold value, the pixel is replaced by a white pixel; if it is less than the threshold value, it is replaced by a black pixel. The most popular and widely used thresholding method is discussed below.

4.1. Global Thresholding

Global thresholding classifies the pixels of an image as object (foreground) pixels or background pixels. Global binarization uses a single threshold value for the entire document, whereas local binarization selects a different threshold value for every pixel in the image [5]. Otsu’s method is an image binarization technique that selects its threshold automatically: it chooses the optimal value from the possible range of threshold values, i.e., from 0 to 255. The method is based on global thresholding and is used to threshold the whole image [6], converting a grayscale image to a binary image. It assumes that the image consists of two classes of pixels, foreground and background, and finds the threshold that best separates them by minimizing the intra-class variance or, equivalently, maximizing the inter-class variance [7].
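A compact sketch of Otsu's threshold selection is given below. It searches all thresholds from 0 to 255 and keeps the one that maximises the between-class variance, which is equivalent to minimising the within-class variance. The function names and the tie-breaking rule (the first maximising threshold wins) are assumptions of this example.

```python
def otsu_threshold(image):
    """Return the threshold t in 0..255 that maximises the between-class variance."""
    hist = [0] * 256
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with intensity <= t (background class)
    sum0 = 0  # sum of the intensities in the background class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0  # number of pixels in the foreground class
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(image, t):
    """Pixels brighter than t become white (255); all others become black (0)."""
    return [[255 if p > t else 0 for p in row] for row in image]


gray = [[50, 50, 200, 200], [50, 200, 50, 200]]
t = otsu_threshold(gray)
binary = binarize(gray, t)
```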

5. Morphological Operations

The term “morphology” describes a wide range of image processing techniques that modify images based on their forms or shapes. It has many uses, including noise reduction, boundary extraction, and texture analysis [8–9].

5.1. Dilation

Dilation is one of the essential operations in morphology. Although it can also be applied to grayscale images, it is most frequently used on binary images. Dilation causes objects to grow: it gradually enlarges the boundaries of regions of foreground pixels, so that the regions become larger and the gaps within them become smaller [10].
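Binary dilation can be illustrated with a few lines of Python. The sketch assumes a binary image stored as lists of 0/1 values, a symmetric structuring element whose origin is its centre, and pixels outside the image treated as OFF; the function name is hypothetical.

```python
def dilate(image, selem):
    """Binary dilation: a pixel turns ON if the structuring element hits any ON pixel."""
    h, w = len(image), len(image[0])
    sh, sw = len(selem), len(selem[0])
    ci, cj = sh // 2, sw // 2  # origin of the structuring element (its centre)
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            hit = False
            for a in range(sh):
                for b in range(sw):
                    ii, jj = i + a - ci, j + b - cj
                    # Positions outside the image count as OFF.
                    if selem[a][b] and 0 <= ii < h and 0 <= jj < w and image[ii][jj]:
                        hit = True
            out[i][j] = 1 if hit else 0
    return out


square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # 3 x 3 structuring element
dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1                                # a single foreground pixel
grown = dilate(dot, square)                  # grows into a 3 x 3 block
```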

5.2. Erosion

Erosion causes objects to shrink. It erodes the boundaries of regions of foreground pixels, so that the regions become smaller and the gaps within them become wider [11]. Erosion is the opposite of dilation: while dilation widens boundaries and fills holes, erosion narrows boundaries and widens holes [12-13]. If the structuring element does not entirely overlap ON-valued pixels, erosion sets an ON pixel to OFF.
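The overlap rule stated above translates directly into code. As before, this is only a sketch: the image is a list of 0/1 rows, the structuring element is centred on each pixel, pixels outside the image are treated as OFF, and the function name is an assumption.

```python
def erode(image, selem):
    """Binary erosion: a pixel stays ON only if the structuring element fits entirely on ON pixels."""
    h, w = len(image), len(image[0])
    sh, sw = len(selem), len(selem[0])
    ci, cj = sh // 2, sw // 2  # origin of the structuring element (its centre)
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            fits = True
            for a in range(sh):
                for b in range(sw):
                    if not selem[a][b]:
                        continue  # positions outside the element's shape are ignored
                    ii, jj = i + a - ci, j + b - cj
                    inside = 0 <= ii < h and 0 <= jj < w
                    if not (inside and image[ii][jj]):
                        fits = False  # element does not entirely overlap ON pixels
            out[i][j] = 1 if fits else 0
    return out


square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
block = [[1 if 1 <= i <= 3 and 1 <= j <= 3 else 0 for j in range(5)] for i in range(5)]
shrunk = erode(block, square)  # only the centre of the 3 x 3 block survives
```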

5.3. Opening and Closing

More complex operations can be built by combining the two primary operations, dilation and erosion. The most useful of these for morphological filtering are opening and closing. An opening is an erosion followed by a dilation, both performed with the same structuring element; a closing is the reverse sequence, a dilation followed by an erosion with the same structuring element [14-20].
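The composition of the two primitives can be sketched as follows. The helper `_morph` and all function names are hypothetical; the sketch assumes binary images stored as lists of 0/1 rows, a symmetric structuring element, and pixels outside the image treated as OFF. It uses the standard definition of closing as dilation followed by erosion.

```python
def _morph(image, selem, combine):
    """Shared helper: combine is `any` for dilation and `all` for erosion."""
    h, w = len(image), len(image[0])
    ci, cj = len(selem) // 2, len(selem[0]) // 2
    # Offsets of the ON positions of the structuring element relative to its centre.
    offsets = [(a - ci, b - cj)
               for a, row in enumerate(selem)
               for b, v in enumerate(row) if v]

    def on(i, j):
        return 0 <= i < h and 0 <= j < w and image[i][j] == 1

    return [[1 if combine(on(i + di, j + dj) for di, dj in offsets) else 0
             for j in range(w)] for i in range(h)]


def dilate(image, selem):
    return _morph(image, selem, any)


def erode(image, selem):
    return _morph(image, selem, all)


def opening(image, selem):
    """Erosion followed by dilation: removes objects smaller than the element."""
    return dilate(erode(image, selem), selem)


def closing(image, selem):
    """Dilation followed by erosion: fills holes smaller than the element."""
    return erode(dilate(image, selem), selem)


square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# A 3 x 3 block plus one isolated speck: opening keeps the block and drops the speck.
speckled = [[1 if 1 <= i <= 3 and 1 <= j <= 3 else 0 for j in range(7)] for i in range(7)]
speckled[5][5] = 1
opened = opening(speckled, square)

# A 5 x 5 block with a one-pixel hole: closing fills the hole.
holed = [[1 if 1 <= i <= 5 and 1 <= j <= 5 else 0 for j in range(7)] for i in range(7)]
holed[3][3] = 0
closed = closing(holed, square)
```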

6. Methodology

In the proposed method, the complex degraded input image is first filtered with a suitable filter to remove noise and, if it is a colour image, converted into a grayscale image. Otsu’s thresholding technique is then applied to the grayscale image, the horizontal and vertical gradients are calculated, and edges are detected on the basis of these gradients. The next stage consists of the morphological operations: dilation is performed first to enlarge the boundaries of the foreground pixels, erosion is then applied to the masked image to narrow the boundaries, and finally the closing operation is performed to improve the edges [21-24]. After that, a post-processing step further improves the image quality by reducing uneven illumination. Finally, the text is recognized with the help of an OCR process. The flow chart of the proposed methodology is depicted in Figure 3.

Figure 3. Flowchart of the proposed methodology.
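Two steps of this pipeline, grayscale conversion and gradient-based edge detection, can be sketched in plain Python. The chapter does not name the gradient operator or the grayscale weights, so the Sobel kernels, the 0.299/0.587/0.114 luminance weights, the edge threshold, and all function names below are assumptions made for illustration.

```python
import math

# Sobel kernels for the horizontal (x) and vertical (y) gradients (assumed operator).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to 8-bit luminance values."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb_image]


def correlate3x3(image, kernel):
    """Apply a 3 x 3 kernel to the interior pixels; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = sum(kernel[a][b] * image[i + a - 1][j + b - 1]
                            for a in range(3) for b in range(3))
    return out


def edge_map(image, threshold):
    """Mark pixels whose gradient magnitude exceeds the threshold."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return [[1 if math.hypot(gx[i][j], gy[i][j]) > threshold else 0
             for j in range(len(image[0]))] for i in range(len(image))]


# A dark-to-bright vertical step: edges appear on both sides of the transition.
step = [[0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_map(step, threshold=100)
```

In a full pipeline the resulting edge map would be passed on to the morphological stage and then to OCR, which are outside the scope of this sketch.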

7. Experimental Analysis

To test our text extraction technique on complex degraded images, we used random images from various Internet sources, covering a wide range of background illumination, colour, image quality, font size, and orientation. The proposed algorithm is implemented in MATLAB R2018a. All experiments are carried out on a standard computer (Intel Core i7-4770, 3.40 GHz CPU, 4 GB RAM, and Windows 8.1 Pro, 64-bit OS). Some of the complex degraded images and the extracted texts are shown in Table 1 and Table 2.

8. Results and Discussion

The proposed method has its own advantages and disadvantages. It works well on a wide range of inputs such as screenshots, invoices, documents, and hoarding banners. However, it may not be very accurate on complex degraded images with poor image quality and uneven edges. A comparative analysis of accuracy for various images is given in Tables 3 and 4 and Figures 4 and 5.

Table 1. Results of the suggested approach on a variety of complex degraded images with Gaussian noise

Original Image | Image with the Gaussian noise | Extracted Text
[image] | [image] | MIDDLEBOROUGH
[image] | [image] | HYUNDAI I10 IS THE MOST AWARDED CAR OF TH3 YEAR 2008 I 10
[image] | [image] | INCOME TAX DEPART-MENT GOVT. OF INDIA Permanent Account Number Card ABCDE1234F APPLICANT NAME /Fathers Name APPLICANT’S FAlHER NAME 01/06/1995 Signature
[image] | [image] | Name: SurpritKaur / DOB: 09-12-1989 / Male 18oo.12oo.13o1. ADHAR CARD MAKER FRANK


Table 2. Results after employing the suggested technique on a variety of intricately degraded photos with salt and pepper noise

Original Image | Image with salt and pepper noise | Extracted Text
[image] | [image] | MIDDLEBOROUGH
[image] | [image] | HYUNDAI I10 IS THE MOST AWARDED CAR OF TH3 YEAR 2008
[image] | [image] | INCOME TAX DEPART-MENT GOVT. OF INDIA Permanent Account Number Card ABCDE1234F APPLICANT NAME APPLICANTS FAlHER NAME 01/06/1995 Signature
[image] | [image] | Name: Surprit Kaur /DOB: 09-12-1989 / Male 18oo.12oo.13o1 ADHAR CARD MAKER FRANK

Table 3. Comparison of accuracy for different images

Image Id | Accuracy (%), original image (without noise) | Accuracy (%), image with Gaussian noise
1 | 100.02 | 91.3077
2 | 99.6157 | 98.2308
3 | 99.0981 | 97.7988
4 | 98.6714 | 97.7751


Figure 4. Comparative analysis of various images on the basis of accuracy.

Figure 5. Comparative analysis of various images on the basis of accuracy.

Table 4. Comparison of accuracy for different images

Image Id | Accuracy (%), original image (without noise) | Accuracy (%), image with salt and pepper noise
1 | 100.00 | 93.3077
2 | 98.6154 | 98.2308
3 | 98.0991 | 97.3964
4 | 99.5714 | 96.5000
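The chapter reports accuracy percentages without stating how they were computed. One common convention, assumed here purely for illustration, is character-level accuracy based on the edit distance between the ground-truth text and the OCR output. The function names are hypothetical, and the reference string "THE YEAR" is an assumed ground truth for the extracted fragment "TH3 YEAR".

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(reference, extracted):
    """Character-level accuracy in percent (an assumed metric, not the chapter's)."""
    if not reference:
        return 100.0 if not extracted else 0.0
    errors = levenshtein(reference, extracted)
    return max(0.0, 100.0 * (1.0 - errors / len(reference)))


# One substituted character out of eight.
score = char_accuracy("THE YEAR", "TH3 YEAR")
```

Under this convention a score can never exceed 100%, so any metric that produces larger values must be defined differently.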


Conclusion and Future Outlook

A performance optimization technique for text extraction from complex degraded images is discussed in this paper. The performance and efficiency of the suggested techniques have been illustrated through testing on a variety of data sets. The accuracy of the proposed method is quite good, and in some cases it exceeds 99%. The proposed algorithm, however, does not perform well on all types of complex degraded images and documents. Future work is divided into the three parts given below.

I. Addressing the method’s existing shortcomings, such as variation of light, unconnected components, and the orientation of the text present in complex degraded images.
II. We have used only two types of noise in the original image, Gaussian noise and salt and pepper noise, and only the Gaussian filter and the median filter to remove them; future work can therefore be extended to other types of noise and other filtering algorithms.
III. We have only trained and evaluated text detection from complex images for English; hence, multi-language text extraction can also be explored in future research.

References

[1] Vyas, G., Anand, R., & Holȇ, K. E. Implementation of Advanced Image Compression using Wavelet Transform and SPHIT Algorithm. International Journal of Electronic and Electrical Engineering. ISSN, 0974-2174.
[2] Sheikh, H. & Bovik, A., “Image information and visual quality,” IEEE Transactions on Image Processing, 15, 430-444, 2006.
[3] A. Koshy, N. B. M.J., S. A. & A. John, “Preprocessing Techniques for High Quality Text Extraction from Text Images,” 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT), 2019, pp. 14, doi: 10.1109/ICIICT1.2019.8741488.
[4] Albermany, S. A., & Safdar, G. A. (2014). Keyless security in wireless networks. Wireless Personal Communications, 79(3), 1713-1731.
[5] S. Grover, K. Arora, & S. Mitra, “Text Extraction from Document Images using Edge Information,” in Annual IEEE India Conference (INDICON), Vol. 1-4, IEEE, Gujarat, India (2009).
[6] Albermany, S. (2016). A Technique for Classifying and Retrieving of Malware Details in Signatures Based. American Scientific Research Journal for Engineering, Technology, and Sciences (ASRJETS), 26(1), 250-260.
[7] N. Otsu, “A Threshold Selection Method from Gray-Level Histograms,” IEEE Trans. Syst. Man. Cybern. 9(1), 62-66 (1979).
[8] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017, April). New random block cipher algorithm. In 2017 International Conference on Current Research in Computer Science and Information Technology (ICCIT) (pp. 174-179). IEEE.
[9] Pandey, B. K., Pandey, D., & Agarwal, A. (2022). Encrypted Information Transmission by Enhanced Steganography and Image Transformation. International Journal of Distributed Artificial Intelligence (IJDAI), 14(1), 1-14.
[10] Albermany, S., & Baqer, F. M. (2021). EEG authentication system using fuzzy vault scheme. Journal of Discrete Mathematical Sciences and Cryptography, 1-6.
[11] Sayal, M. A., Alameady, M. H., & Albermany, S. A. (2020). The Use of SSL and TLS Protocols in Providing a Secure Environment for e-commerce Sites. Webology, 17(2).
[12] H. Kawano, H. Orii, H. Maeda, & N. Ikoma, “Text Extraction from Degraded Document Image Independent of Character Color Based on MAP-MRF Approach,” IEEE, Jeju Island, Aug. 20-24 (2009) 165-168.
[13] Pandey, B. K., Mane, D., Nassa, V. K. K., Pandey, D., Dutta, S., Ventayen, R. J. M., & Rastogi, R. (2021). Secure text extraction from complex degraded images by applying steganography and deep learning. In Multidisciplinary Approach to Modern Digital Steganography (pp. 146-163). IGI Global.
[14] De Baets, Bernard, Kerre, E., & Gupta, M. (1995). The fundamentals of fuzzy mathematical morphology, part 1: basic concepts. International Journal of General Systems, 23(2), 155–171.
[15] Hussein, R. I., Hussain, Z. M., & Albermany, S. A. (2020). Performance of Differential CSK under Color Noise: A Comparison with CSK. Journal of Engineering and Applied Sciences, 15(1), 48-59.
[16] Albermany, S. A., Hamade, F. R., & Safdar, G. A. (2017). New Block Cipher Key with RADG Automata. Asian Journal of Information Technology, 16(5).
[17] Singh, H., Pandey, B. K., George, S., Pandey, D., Anand, R., Sindhwani, N., & Dadheech, P. (2023). Effective Overview of Different ML Models Used for Prediction of COVID-19 Patients. In Artificial Intelligence on Medical Data (pp. 185-192). Springer, Singapore.
[18] Raghavan, R., Verma, D. C., Pandey, D., Anand, R., Pandey, B. K., & Singh, H. (2022). Optimized building extraction from high-resolution satellite imagery using deep learning. Multimedia Tools and Applications, 1-15.
[19] Sayel, N. A., Albermany, S., & Sabbar, B. M. (2021). A Comprehensive Survey on EEG Biometric Authentication and Identification. Design Engineering, 5868-5881.
[20] Albermany, S., Ali, H. A., & Hussain, A. K. (2003, December). Identity hiding by blind signature scheme. In Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing (pp. 1-12).
[21] Gupta, A., Anand, R., Pandey, D., Sindhwani, N., Wairya, S., Pandey, B. K., & Sharma, M. (2021). Prediction of Breast Cancer Using Extremely Randomized Clustering Forests (ERCF) Technique: Prediction of Breast Cancer. International Journal of Distributed Systems and Technologies (IJDST), 12(4), 1-15.
[22] Jayapoorani, S., Pandey, D., Sasirekha, N. S., Anand, R., & Pandey, B. K. (2022). Systolic optimized adaptive filter architecture designs for ECG noise cancellation by Vertex-5. Aerospace Systems, 1-11.
[23] Bruntha, P. M., Dhanasekar, S., Hepsiba, D., Sagayam, K. M., Neebha, T. M., Pandey, D., & Pandey, B. K. (2022). Application of switching median filter with L2 norm-based auto-tuning function for removing random valued impulse noise. Aerospace Systems, 1-7.
[24] Anand, R., Khan, B., Nassa, V. K., Pandey, D., Dhabliya, D., Pandey, B. K., & Dadheech, P. (2022). Hybrid convolutional neural network (CNN) for Kennedy Space Center hyperspectral image. Aerospace Systems, 1-8.

本书版权归Nova Science所有

Index

A accuracy, xiii, 14, 32, 42, 45, 48, 70, 95, 121, 126, 130, 131, 132, 142, 159, 169, 172, 173, 175, 182, 200, 202, 203, 205, 206, 221, 222, 224, 225, 226, 228, 243, 245, 246, 270, 271, 286, 291, 292, 293, 294, 295, 298, 305, 308, 309, 310, 311, 312, 323, 326, 327, 328, 329, 330, 331, 335, 349, 350, 351, 352 Alexnet, 192, 222 ant colony optimization, 53, 254, 258, 264 artificial intelligence, xii, xiii, xv, 19, 20, 22, 28, 29, 38, 45, 56, 60, 80, 81, 82, 97, 99, 114, 116, 142, 145, 146, 162, 163, 186, 189, 190, 191, 192, 207, 209, 215, 216, 220, 228, 229, 230, 234, 246, 249, 251, 266, 267, 272, 299, 300, 301, 313, 314, 319, 320, 322, 332, 335, 336, 337, 338, 353 authentication, xi, 2, 9, 11, 14, 17, 20, 21, 47, 55, 81, 82, 97, 100, 114, 115, 117, 120, 122, 123, 126, 127, 129, 130, 140, 141, 142, 144, 145, 157, 163, 164, 187, 214, 216, 231, 249, 250, 267, 285, 289, 300, 301, 302, 317, 338, 353, 354

B bat algorithm, 254, 257, 265 biometric(s), vi, xii, 21, 42, 55, 81, 97, 115, 117, 118, 119, 120, 121, 122, 123, 126, 127, 128, 129, 130, 131, 132, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 164, 187, 214, 231, 249, 251, 267, 302, 317, 335, 338, 354

blurring, xiv, 11, 37, 239, 310, 345

C camera surveillance, xi, 26, 42 CIFAR-10, 168, 175, 178, 179, 180 clinical imaging, 319, 320 complex degraded image, 19, 114, 141, 162, 249, 304, 342, 347, 348, 352, 353 compression, vi, ix, xiii, 15, 19, 36, 37, 46, 54, 56, 80, 97, 114, 155, 157, 159, 161, 162, 235, 238, 249, 254, 266, 269, 270, 273, 274, 275, 276, 286, 291, 292, 298, 299, 300, 311, 313, 352 computer vision, xii, 22, 38, 40, 45, 46, 47, 50, 168, 171, 172, 191, 192, 199, 202, 203, 221, 234, 266 content based image recognition, 26 contrast enhancement, xi, 103, 104, 105, 113, 260, 308 convolutional neural networks (CNN), xii, 17, 18, 20, 22, 81, 141, 169, 172, 173, 178, 187, 188, 190, 191, 192, 194, 195, 196, 198, 199, 205, 210, 211, 213, 214, 215, 219, 221, 222, 223, 226, 229, 230, 247, 298, 313, 314, 331, 336, 354 countermeasures, 35, 47, 129, 220 cuckoo, xi, 103, 104, 105, 107, 108, 109, 110, 112, 113, 260, 264 cuckoo search algorithm, xi, 103, 104, 106, 107, 108, 109, 112, 113

D degraded, vii, xiv, 21, 56, 82, 99, 115, 146, 163, 189, 215, 230, 239, 250, 266, 301,

本书版权归Nova Science所有

356 304, 312, 315, 341, 342, 348, 349, 350, 352, 353 dense block-inception network (DINET), vi, xii, 167, 168, 179, 180, 183 diagnosis, 37, 42, 105, 111, 136, 174, 175, 181, 184, 189, 325, 331, 334, 338 digital image processing, 2, 26, 35, 36, 37, 38, 234, 257, 258 digital watermarking, x, xii, 149, 150, 151, 152, 153, 161 discrete cosine transform, 149, 275 discrete wavelet transform, 149

E edge detection, ix, xiii, 49, 50, 51, 52, 53, 68, 72, 73, 74, 75, 76, 77, 78, 79, 155, 243, 250, 253, 254, 306, 342 EHF or VHF, 84 elephant herding optimization and grey wolf optimization, 254 evolutionary, 64, 207, 254, 255, 256, 265, 266, 289, 323

F F1 Score, 270, 294 facial recognition, 118, 119, 121, 123, 133, 134 feature extractions, 60 fingerprints and retinal / iris recognition, 118 finite element method, 84 forensic, v, x, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 19, 20, 21, 22, 117, 132, 137, 142, 144, 160, 239, 285, 300 frequency domain method, 149

G GSM, 84, 88

I image classification, 168, 171, 174, 179, 180, 193, 198, 199, 202, 207, 240, 265 image enhancement, xi, xiii, 2, 26, 35, 36, 105, 240, 253, 260, 266, 310 image processing, ix, x, xi, 1, 2, 3, 4, 5, 8, 16, 17, 19, 21, 26, 35, 37, 38, 39, 44, 46, 48, 51, 52, 53, 54, 57, 67, 81, 84, 86, 98, 106, 114, 144, 155, 156, 162, 163, 188, 191, 192, 194, 216, 219, 221, 224, 228, 230, 234, 236, 237, 238, 241, 245, 246, 248, 250, 251, 253, 254, 255, 256, 258, 259, 262, 264, 266, 267, 300, 305, 308, 312, 314, 342, 344, 346 image segmentation, xi, xiii, 69, 103, 111, 113, 238, 253, 256, 257, 258, 260, 262, 263, 308 integrating, v, xi, 10, 25, 26, 31, 42, 43, 46, 77, 127, 130, 298, 311, 315 Internet of Things (IoT), v, vi, x, xi, xiii, xv, 6, 9, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 44, 47, 48, 49, 50, 51, 52, 54, 55, 57, 80, 83, 87, 88, 96, 99, 100, 143, 189, 190, 213, 227, 237, 269, 270, 272, 273, 276, 277, 283, 284, 285, 287, 288, 289, 290, 291, 292, 298, 299, 300, 301, 314, 335, 338, 364, 366 investigation, v, ix, 1, 2, 5, 7, 10, 12, 13, 17, 21, 43, 63, 65, 120, 122, 133, 134, 157, 180, 285, 300, 305, 319, 320, 329, 332, 333

K kidney stone, 103, 104, 113

L lifecycle, 11, 12 long and short-term memory network (LSTM), 192, 200, 201, 204, 293

M machine learning, 304 machine vision, 26, 44, 47, 336 manipulation, 15, 112, 142, 237, 247 masking, v, xi, 15, 103, 105, 106, 107, 113, 159 medical, vi, ix, x, xii, xiii, 2, 20, 31, 37, 41, 42, 46, 48, 54, 55, 56, 81, 82, 95, 97, 98, 99, 104, 105, 108, 111, 114, 116, 123, 136, 137, 142, 144, 146, 163, 167, 169, 170, 171, 172, 174, 175, 179, 180, 181, 182, 183, 184, 185, 187, 188, 189, 190, 201, 215, 216, 230, 233, 234, 237, 239, 247, 249, 251, 254, 257, 265, 266, 267, 299, 300, 308, 314, 320, 321, 330, 332, 336, 338, 353, 363 microstrip patch antenna, v, xi, 83, 84, 90, 92, 98, 99 mobile device, 4, 6, 8, 9, 96, 138, 154, 206 morphological, 20, 56, 61, 70, 80, 81, 99, 114, 142, 163, 186, 189, 215, 230, 249, 261, 267, 299, 300, 307, 313, 346, 347 morphology, ix, xiv, 310, 342, 346, 353

N network, xii, 4, 6, 7, 8, 10, 18, 26, 30, 31, 33, 35, 43, 44, 47, 48, 54, 57, 67, 81, 88, 98, 115, 118, 137, 141, 143, 153, 161, 162, 164, 167, 169, 172, 174, 178, 179, 180, 181, 182, 183, 185, 186, 189, 191, 192, 193, 194, 196, 198, 199, 200, 204, 205, 207, 208, 210, 211, 212, 213, 214, 223, 225, 228, 229, 230, 249, 260, 265, 270, 272, 273, 276, 278, 280, 282, 283, 284, 288, 290, 299, 301, 313, 329, 335, 337, 366 neural network, x, xii, 54, 64, 69, 115, 141, 167, 168, 169, 171, 172, 182, 183, 184, 185, 187, 188, 190, 192, 194, 196, 198, 199, 200, 205, 207, 212, 213, 214, 216, 219, 228, 230, 231, 264, 271, 311, 313, 329, 330, 354 noise and image processing, 304

O optical character recognition (OCR), 86, 304, 305, 309, 310, 311, 312, 314, 332, 342, 348 optimum wavelet-based masking, 103, 104

P parameter(s), 14, 17, 27, 48, 90, 91, 106, 109, 110, 126, 127, 159, 169, 174, 182, 183, 184, 195, 196, 198, 201, 203, 205, 206, 223, 238, 242, 257, 260, 261, 288, 289, 298, 304 pattern, v, xi, xiii, 2, 16, 22, 38, 46, 48, 55, 57, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 78, 79, 93, 115, 123, 125, 141, 142, 155, 160, 164, 186, 187, 188, 190, 210, 211, 213, 238, 243, 244, 245, 250, 251, 253, 254, 265, 299, 314, 315 pattern analysis, xi, 60, 61, 62, 63, 64, 65, 66, 67, 68, 79 pattern recognition, xiii, 22, 38, 46, 48, 55, 60, 63, 64, 65, 67, 70, 71, 78, 238, 245, 253 photogrammetry, 2, 16, 250 plant disease, 219, 220, 223, 225, 227, 228, 230 precision, 28, 31, 32, 43, 45, 66, 99, 126, 129, 270, 286, 293, 294, 295, 296, 324, 326, 327, 328, 329, 331

R radio frequency identification (RFID), v, xi, 26, 29, 54, 83, 84, 85, 86, 87, 88, 89, 92, 95, 96, 97, 98, 114, 137, 145, 231, 285, 299, 300 recall value, 270, 296, 297 reconstruction, ix, xi, 8, 16, 36, 45, 69, 156, 159, 274 recurrent neural networks (RNNs), xii, 191, 192, 196, 200, 201, 210, 212, 214 ResNet, 17, 172, 189, 192, 196, 197, 199, 204 resolution, v, xi, 15, 22, 38, 39, 41, 56, 59, 82, 105, 116, 130, 146, 162, 172, 190, 211, 213, 216, 231, 243, 250, 252, 267, 300, 309, 313, 336, 353 return loss, 84

S security, v, vi, xi, xiii, 25, 34, 42, 46, 81, 98, 99, 118, 122, 132, 136, 140, 143, 144, 146, 161, 162, 186, 265, 269, 276, 278, 279, 281, 282, 283, 285, 299, 301, 315, 334, 365, 366 security enhancement, 26 signal and image series, 22, 100, 146, 164, 190, 217, 232, 252, 268, 302, 317, 339 simulating, vi, xiii, 269, 292 smart applications, 47, 234 smart vehicle, v, xi, 83, 84, 86, 87, 97 spatial domain method, 15, 149 sphere(s), 131 storage, xii, 26, 35, 36, 43, 48, 140, 206, 207, 209, 233, 234, 235, 237, 248, 270, 272, 274, 276, 285, 286, 291 structure, 6, 34, 41, 46, 60, 62, 70, 71, 89, 92, 111, 125, 135, 150, 168, 194, 196, 197, 198, 201, 204, 205, 206, 212, 214, 227, 240, 262, 263, 281, 287, 323, 326 swarm intelligence, 254, 257, 289

T text extraction and restoration, 2 thresholding, xiii, xiv, 103, 111, 113, 250, 253, 254, 258, 262, 265, 266, 267, 308, 309, 310, 315, 342, 345, 346, 347 transfer learning (TL), vi, xii, 20, 55, 80, 162, 167, 168, 169, 170, 171, 172, 174, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 249, 265, 336 Treatise IC2, 22, 100, 146, 164, 190, 217, 232, 252, 268, 302, 317, 339

U ultra sound images, 103 ultrasound, v, xi, 46, 103, 104, 107, 112, 113, 308

V VGGNet, 172, 192, 196, 197, 198

W watermarking, vi, ix, x, xii, 15, 17, 46, 54, 115, 149, 150, 151, 152, 153, 154, 155, 156, 159, 160, 161, 162, 163, 299, 313 wavelet, xi, 19, 80, 103, 106, 107, 113, 155, 160, 162, 249, 266, 286, 307, 311, 315, 316, 352

About the Editors

Dr. (h.c.) Digvijay Pandey is currently working as Acting Head of Department in the Department of Technical Education, Kanpur, Government of Uttar Pradesh, India. Before this, he joined TCS in 2012 as an IT analyst and worked on various US/UK/Canada projects until 2016. He is also a faculty member at IERT Allahabad. He has more than 10 years of teaching and industry experience. He has written 14 book chapters and 60 papers that have been published in Science Direct (Elsevier)/SCI/UGC/Scopus Indexed Journals, and he acts as an editor for a peer-reviewed international journal. He has presented several research papers at national and international conferences. He chaired a session at the IEEE International Conference on Advance Trends in Multidisciplinary Research and Innovation (ICATMRI-2020). He has four patents that have been published in The Patent Office Journal, and one that is currently being processed in the Australian Patent Office Journal. He serves as a reviewer for a number of prestigious journals, including Scientific Reports (Nature Publication), Clinical and Translational Imaging (Springer), and IJLTER (Scopus Indexed), among others. His research interests include Medical Image Processing, Image Processing, Text Extraction, IoT, and Devices. Email: [email protected].

Dr. Rohit Anand (PhD) is currently working as an Assistant Professor in the Department of Electronics and Communication Engineering at G.B. Pant Engineering College (Government of NCT of Delhi), New Delhi, India. He received his PhD (ECE) from IKG Punjab Technical University, Kapurthala, Punjab, India. He has more than 19 years of teaching experience, including UG and PG courses. He is a Life Member of the Indian Society for Technical Education (ISTE). He has published 6 book chapters in reputed books, 12 papers in Scopus/SCI Indexed Journals, and 1 patent. He has presented various research papers at national and international conferences. He is presently a reviewer for many reputed and highly indexed international journals. He has chaired sessions at two international conferences. His research areas include IoT, Wireless Communication, Electromagnetic Field Theory, Antenna Theory and Design, Image Processing, Optimization, and Optical Fiber Communication. Email: [email protected].

Dr. Nidhi Sindhwani (PhD) is currently working as an Assistant Professor in Amity School of Engineering and Technology (ASETD) (Delhi), Amity University, Noida, India. She has more than 14 years of teaching experience. She received her BE degree in Electronics and Communication Engineering in 2004 and her ME degree in Signal Processing in 2008, both from Maharishi Dayanand University, Rohtak, India. She received her PhD (ECE) from UCOE, Punjabi University, Patiala, India in 2018. She is also a Life Member of the Indian Society for Technical Education (ISTE). She has published numerous papers with good impact factor in reputed international journals and conferences. Her current research interests include Wireless Networks and Communication, IoT, Machine Learning, and Signal Processing. Email: [email protected].

Prof. Binay Kumar Pandey is an Assistant Professor in the Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Udham Singh Nagar, Uttarakhand, India. He is a Life Member of The Indian Society for Technical Education (ID-LM 90463), a Life Member of The Institution of Engineers (India) (Id. No. 1685295), and a Member of the ACM, Association for Computing Machinery (9809629). He has 6 book chapters and 42 papers published in UGC/SCI/Scopus Indexed Journals. He has presented several research papers at national and international conferences. He is currently guiding one MTech student and has previously guided eight MTech students. He has served as the Session Chair for the IEEE International Conference on Advance Trends in Multidisciplinary Research and Innovation (ICATMRI-2020). He has three patents that have been published in The Patent Office Journal, and one that is currently being processed in the Australian Patent Office Journal. He serves as a reviewer for several prestigious Springer Publication journals, including Environmental Monitoring and Assessment, Regenerative Engineering and Translational Medicine, and Sleep and Vigilance. Image Processing, Text Extraction, Biomedical Image Processing, and Information Security are among his research interests. Email: [email protected].

Dr. Reecha Sharma (PhD) is currently working as an Assistant Professor in the Department of Electronics and Communication Engineering at Punjabi University, Patiala, India. She received her PhD (ECE) from Punjabi University, Patiala, India in 2016. She is a Member of the Institute of Electronics and Telecommunication Engineers (IETE). She has published 9 books and 10 papers in SCI Indexed Journals. She has presented various research papers at national and international conferences. She is currently guiding 4 PhD students and 2 MTech students. She has delivered expert lectures on various topics. Her research areas include Digital Image Processing, Feature Extraction, Face Recognition, Filter Designing, Artificial Neural Networks, and Optimization. Email: [email protected].

Dr. Pankaj Dadheech (PhD) completed his PhD degree in Computer Science & Engineering from Suresh Gyan Vihar University (Accredited by NAAC with ‘A’ Grade), Jaipur, Rajasthan, India. He completed his MTech degree in Computer Science & Engineering from Rajasthan Technical University, Kota, and his BE in Computer Science & Engineering from the University of Rajasthan, Jaipur. He has more than 16 years of experience in teaching. He is currently working as an Associate Professor & Dy. HOD in the Department of Computer Science & Engineering (NBA Accredited), Swami Keshvanand Institute of Technology, Management & Gramothan (SKIT), Jaipur, Rajasthan, India. He has published 21 patents at Intellectual Property India, Office of the Controller General of Patents, Design and Trade Marks, Department of Industrial Policy and Promotion, Ministry of Commerce and Industry, Government of India. He has published 4 Australian patents at Commissioner of Patents, Intellectual Property Australia, Australian Government. He has published 1 German patent. He has also registered and been granted 2 research copyrights at Registrar of Copyrights, Copyright Office, Department for Promotion of Industry and Internal Trade, Ministry of Commerce and Industry, Government of India. He has presented 53 papers at various national and international conferences. He has 57 publications in various international and national journals. He has published 5 books and 11 book chapters. He is a member of many professional organizations, including the IEEE Computer Society, CSI, ACM, IAENG, and ISTE. He has been appointed as a Ph.D. Research Supervisor in the Department of Computer Science & Engineering at SKIT, Jaipur (Recognized Research Centre of Rajasthan Technical University, Kota). He has also guided various MTech research scholars. He has chaired technical sessions at various international conferences and contributed as a resource person in various FDPs, workshops, STTPs, and conferences. He also acts as a Guest Editor for various reputed journal publishing houses and conference proceedings, and as a Bentham Ambassador for Bentham Science Publisher. His areas of interest include High Performance Computing, Cloud Computing, Information Security, Big Data Analytics, Intellectual Property Rights, and the Internet of Things. Email: [email protected].
