Genetic and Evolutionary Computing: Proceedings of the Thirteenth International Conference on Genetic and Evolutionary Computing, November 1-3, 2019, Qingdao, China (Advances in Intelligent Systems and Computing, 1107) [1 ed.] 9811533075, 9789811533075

This book gathers papers presented at the 13th International Conference on Genetic and Evolutionary Computing (ICGEC 2019), held in Qingdao, China, on November 1-3, 2019.


English · 608 pages [587] · 2020


Table of contents:
Preface
Conference Organization
Honorary Chairs
Advisory Committee Chairs
General Chairs
Program Chairs
Local Organization Chairs
Electronic Media Chairs
Invited Session Chairs
Publication Chair
Finance Chair
Program Committee Members
Contents
Nature Inspired Constrained Optimization
Shortest Path Searching for Logistics Based on Simulated Annealing Algorithm
Abstract
1 Introduction
2 Data Acquisition
2.1 Acquisition of SF Logistics Points
2.2 Distance Between Logistics Points
3 Methodology
3.1 Solution Space and Initial Solution
3.2 Objective Function
3.3 Generation of New Solutions
3.4 Target Function Difference
3.5 Metropolis Acceptance Criterion
3.6 Algorithm Pseudo Code
4 Simulation Experiment Results
5 Conclusion
Acknowledgment
References
Ant Colony Optimization Algorithm for Network Planning in Heterogeneous Cellular Networks
Abstract
1 Introduction
2 Related Works
2.1 Relay Technique
2.2 Ant Colony Optimization
3 Problem Definition and Proposed Method
3.1 Network Architecture
3.2 Definition of Multi-objective Optimization Problem
3.3 Proposed Ant Colony Optimization Algorithm
4 Simulation Results
4.1 Simulation Parameters and Algorithms Compared
4.2 Planning Case
5 Conclusions
Acknowledgements
References
An Algorithm for Path Planning Based on Improved Q-Learning
Abstract
1 Introduction
2 Related Work
3 Q-Learning Algorithm
3.1 Basic Principle
3.2 Improved Q-Learning Algorithm
4 Experimental Results and Analysis
5 Summary
Acknowledgment
References
Ranking Based Opinion Mining in Myanmar Travel Domain
Abstract
1 Introduction
2 Problem Issues and Related Work
3 Opinion Mining in Travel Related Domains
4 Proposed Architecture
4.1 Review Data Collection from Users by Crowd Sourcing Method
4.2 Creating Ontology from Visitors’ Reviews
4.3 Mining Opinion from Visitors’ Reviews
5 Evaluation Results
6 Conclusion
References
Recent Advances on Evolutionary Optimization Technologies
Energy Efficient Container Consolidation Method in Cloud Environment Based on Heuristic Algorithm
Abstract
1 Introduction
2 Related Work
3 Proposed Algorithm
3.1 Host Overload Detection Algorithm
4 Experimental Results
5 Conclusion
Acknowledgement
References
An Improved Bat Algorithm Based on Hybrid with Ant Lion Optimizer
Abstract
1 Introduction
2 Related Work
2.1 Bat Algorithm
2.2 Ant Lion Optimizer
3 Hybrid Bat Algorithm with Ant Lion Optimizer
4 Experiment Results
5 Conclusion
References
A Parallel Strategy Applied to APSO
1 Introduction
2 Review of APSO and Parallel PSO
2.1 Adaptive Particle Swarm Optimization
2.2 Parallel PSO
3 Parallel APSO
4 Conclusion
References
A Binary Particle Swarm Optimization with the Hybrid S-Shaped and V-Shaped Transfer Function
Abstract
1 Introduction
2 Binary Particle Swarm Optimization
3 The Analysis of BPSO
3.1 BPSO Performance Analysis
3.2 S-Shaped Transfer Function Analysis
3.3 V-Shaped Transfer Function Analysis
4 A New Binary PSO with Hybrid S-Shaped and V-Shaped Transfer Function
4.1 The Hybrid Strategy of S-Shaped and V-Shaped Transfer Function
4.2 An Adaptive Mutation Strategy for the New BPSO
5 Experimental and Result Analysis
5.1 Related Parameter Settings and Raw Data Sets
5.2 Experimental Protocol
5.3 Analysis of Experimental Results
5.3.1 Comparison of Classification Accuracy of Different Schemes
5.3.2 Comparison of Experimental Curves
6 Conclusion
Acknowledgement
References
Method for Calculating Grounding Quantity of TN-C System Based on Multi-objective Optimization
Abstract
1 Introduction
2 The Establishment and Solution of Mathematical Models
2.1 Establishment of Mathematical Model
2.1.1 Establishment of the Objective Function
2.1.2 Constraints
2.1.3 Mathematical Model of Grounding Quantity Configuration
2.2 Contact Voltage Calculation
2.3 Solving the Mathematical Model
2.3.1 Establishment of an Unconstrained Optimization Model
2.3.2 Algorithm Flow
3 Example Analysis
3.1 Analysis of Short Circuit Fault Results
3.1.1 Single Phase Ground Short Circuit
3.1.2 Two-Phase Grounding Short Circuit
3.2 Analysis of the Results of Disconnection Failure
3.2.1 Single-Phase Disconnection
3.2.2 Two-Phase Disconnection
4 Conclusion
Acknowledgement
References
A Multiobjective-Based Group Trading Strategy Portfolio Optimization Technique
Abstract
1 Introduction
2 Two Objective Functions Used in Proposed Approach
3 The Proposed Approach
4 Experimental Results
5 Conclusion and Future Work
Acknowledgments
References
Research on Optimizing Parameters of Pitch Angle Controller Based on Genetic Algorithm
Abstract
1 Introduction
2 Model of the Wind Turbine System
2.1 Pitch Actuator
2.2 Drive Train
3 Optimization of the Fuzzy Controller Parameters Using Genetic Algorithm
3.1 Genetic Algorithm Programming
3.1.1 Combination of Simulink and Genetic Algorithm
4 Fuzzy Controller Design
5 Simulation and Discussion
6 Conclusion
References
Software Development and Reliability Systems
Research and Simulation Analysis of Brake Energy Recovery Control Strategy for Pure Electric Vehicles
Abstract
1 Introduction
2 Braking Theory Analysis
3 Control Strategy and Vehicle Modeling
3.1 Braking Energy Recovery Control Strategy Modeling
3.2 Vehicle Dynamics Modeling
4 Simulation Analysis
4.1 NEDC Cycle Condition Verification
5 Conclusion
References
Research on Coupling Vibration of Disk Crack and Shaft Crack of Rotor System Based on Finite Element Method
Abstract
1 Introduction
2 Theories Analysis
3 Numerical Results
4 Conclusion
Acknowledgements
References
Structure Design and Analysis of Elderly Electric Vehicle Based on Solar Energy
Abstract
1 Introduction
2 Theoretical Model Analysis
3 Finite Element Analysis
3.1 Static Analysis
3.1.1 Static Analysis Under Bending Conditions
3.1.2 Static Analysis Under Torsional Conditions
4 Dynamic Analysis
5 Conclusion
Acknowledgements
References
Structural Design and Analysis of Ring Steering Wheel for Old-Age Walking Vehicle
Abstract
1 Introduction
2 Theories Analysis
2.1 Theories Research
2.2 The Establishment of a Model
3 Finite Element Analysis
4 Static Analysis
4.1 Static Analysis Under Bilateral Loading
4.2 Static Analysis Under Unilateral Loading
4.3 Dynamic Analysis
5 Conclusion
Acknowledgements
References
Rule-Based Graded Braking for Unsignalized Intersection Collision Avoidance via Vehicle-to-Vehicle Communication
Abstract
1 Introduction
2 Methodology
2.1 Scenario Description
2.2 Model Formulation
2.3 Intersection Warning System
3 Experiments and Results
3.1 Simulation Platform
3.2 Simulation Setup
3.3 Results
4 Conclusions
Acknowledgment
References
A Novel Measure for Trajectory Similarity
Abstract
1 Introduction
2 Related Work
3 Definition
4 Algorithm
4.1 Known Algorithms
4.2 Novel Optimization
5 Experiment
5.1 Effectiveness Tests
5.2 Efficiency Tests
6 Conclusion
References
TS-DBSCAN: To Detect Trajectory Anomaly for Transportation Vehicles
Abstract
1 Introduction
2 Replaying of Trajectory
3 Methodology
3.1 Model Analysis
3.2 TS-DBSCAN Clustering
4 Experiments and Results
4.1 Trajectory Data Preprocessing
4.2 Clustering Experiment and Results
5 Conclusion
Acknowledgment
References
Abnormal Analysis of Electricity Data Acquisition in Electricity Information Acquisition System
Abstract
1 Introduction
2 Causes of Abnormal Electricity Data
2.1 Abnormal Electric Energy Metering Device
2.2 Data Channel Exception
2.3 Abnormalities Caused by Environmental Factors
2.4 Abnormalities Caused by Human Factors
3 Analysis of Abnormal Electricity Data
3.1 Example Introduction
3.2 Cause Analysis
4 Improvement Measures
4.1 Combining with the Actual Situation, Reduce the Device Failure
4.2 Optimizing the External Environment and Reasonable Overall Planning
5 Conclusion
Acknowledgment
References
A Method of Power Network Security Analysis Considering Cascading Trip
Abstract
1 Introduction
2 Mathematical Representation of Branch Chain Trip
3 Considering the Critical State of Cascading Trips in All Branches
4 Power Network Security Index Considering Cascading Trips
5 Calculation Algorithm of Security Index
6 Example
7 Conclusion
Acknowledgment
References
Construction Area Identification Method Based on Spatial-Temporal Trajectory of Slag Truck
Abstract
1 Introduction
2 Related Definition
2.1 Spatial-Temporal Trajectory Sequence
2.2 Grid Division
2.3 Grid Density
3 Methodology
4 Experiment
4.1 Experimental Data Description
4.2 Experimental Results and Analysis
5 Conclusion
Acknowledgment
References
Monitoring Applications and Mobile Apps
Study on Hazardous Scenario Analysis of High-Tech Facilities and Emergency Response Mechanism of Science and Technology Parks Based on IoT
Abstract
1 Introduction
2 Methodology
3 High-Tech Plant Hazard Level, Severity and Likelihood Assessment
4 Emergency Response Level of High-Tech Plant
5 Conclusions
References
Power Flow Calculation Based Newton-Raphson Calculation
Abstract
1 Introduction
2 The Basic Principles and Steps of Newton’s Calculation
3 Power Flow Calculation Simulation Results and Analysis
4 Conclusion
References
Research on the Application of Instance Segmentation Algorithm in the Counting of Metro Waiting Population
Abstract
1 Introduction
2 Related Work
3 Research on Pixel-Level Instance Segmentation Algorithm
3.1 Algorithm Application
3.2 Improved FPN Process
3.3 Improved NMS Process
4 Experimental Results
5 Conclusion
References
Image and Video Processing
Implementation of Android Audio Equalizer Based on FFmpeg
Abstract
1 Introduction
2 FFmpeg Development and Use
2.1 FFmpeg Usage
2.2 The Overall Design
3 Analysis of FFmpeg High Order Audio Parameters Multiband Equalizer Algorithm
3.1 Operational Process Analysis
3.2 Algorithm Simulation Analysis
4 Development Process and Results Analysis
4.1 FFmpeg Compilation
4.2 Android FFmpeg Command Line Use
4.3 Analysis of Results
5 Conclusion
Acknowledgement
References
Can Deep Neural Networks Learn Broad Semantic Concepts of Images?
Abstract
1 Introduction
2 POPORO Image Dataset and Its Semantic Relatedness Rating
3 Experiments and Result Analysis
3.1 Are Similarity Measures of Images in Accordance to Semantic Relatedness?
3.2 Can DNN Learn Broad Image Semantic Relatedness?
3.3 Learn Broad Image Semantic Relatedness with Regression DNN
4 Related Work
5 Conclusion
Compliance with Ethical Standards
References
A Fast Data Hiding Algorithm for H.264/AVC Video in Bitstream Domain
Abstract
1 Introduction
2 Proposed Scheme
2.1 Embedding Position Selection
2.2 Embedding Process
2.3 Extracting Process
3 Experimental Results
3.1 Embedding and Extraction Speed
3.2 Embedding Capacity and Visual Quality
3.3 Analysis
4 Conclusion
References
An Efficient Pattern Recognition Technology for Numerals of Lottery and Invoice
Abstract
1 Introduction
2 System Algorithm
3 Experimental Test
4 Conclusion
Acknowledgments
References
An Improved YOLOv3 Algorithm for Pedestrian Detection on UAV Imagery
1 Introduction
2 YOLOv3-tiny
3 The Proposed Method
3.1 Clustering of Pedestrian
3.2 Dense-YOLOv3-tiny Network
3.3 Soft-NMS
4 Experiment and Analysis
4.1 Platform and Dataset
4.2 Training
4.3 Results
5 Conclusion
References
Fast Ship Detection in Optical Remote Sensing Images Based on Sparse MobileNetV2 Network
Abstract
1 Introduction
2 Methodology
2.1 Depth Separable Convolution
2.2 Bottleneck Residual Module
3 Proposed Method
3.1 Image Segmentation and Decimation
3.2 Model Training and Verification
3.3 Model Compression
4 Experiments
4.1 Dataset, Evaluation Metrics and Experiment Environment
4.2 Experiments and Analysis
5 Conclusion
References
Image Matching Using Phase Congruency and Log-Gabor Filters in the SAR Images and Visible Images
1 Introduction
2 Image Matching Based on Phase Congruency and Log-Gabor Filters for SAR and Visible Images
2.1 Corners Detection Based on Phase Congruency
2.2 Feature Descriptor Using Log-Gabor Filters and the Corresponding Keypoints Detection with RANSAC Algorithm
3 Experimental Results and Discussion
4 Conclusion
References
Pattern Recognition
WetlandNet: Semantic Segmentation for Remote Sensing Images of Coastal Wetlands via Improved UNet with Deconvolution
Abstract
1 Introduction
2 Relevant Techniques
2.1 Encoder-Decoder
2.2 Deconvolution
2.3 Depthwise Separable Convolution
3 The Proposed Method
3.1 WetlandNet Model Structure
3.2 Specific Parameters of WetlandNet
4 Experiments
4.1 Experimental Image
4.2 Training Datasets
4.3 Model Training
4.4 Experimental Results and Analysis
5 Conclusion
References
Intelligent Multimedia Tools and Applications
Digital Multimeter Automatic Verification Device Design
Abstract
1 Introduction
2 Technical Solutions
2.1 The Hardware Structure of the Device
2.2 Device Program and Its Composition
3 Specific Implementation
3.1 Automatic Verification Device Workflow
4 Experiments and Results Analysis
4.1 Automatic Handling Multimeter Function Test and Analysis
4.2 Automatic Plug-In Function Test and Analysis
4.3 Automatic Range Adjustment Function Test and Analysis
5 Conclusion
References
A Method to Prevent Cascading Trip in Power Network Based on Nodal Power
Abstract
1 Introduction
2 An Index to Measure the Safety of Power Grid When the Cascading Trip Occurs
3 Optimization Model for Preventing Grid Cascading Trip
4 Solution of Optimization Model
5 Example
6 Conclusion
Acknowledgment
References
Insulation Faults Diagnosis of Power Transformer by Decision Tree with Fuzzy Logic
Abstract
1 Introduction
2 The Basic Principle of Three Ratios Method
2.1 Definition of Decision Tree
2.2 Establishment of Decision Tree
3 The Ratio Range Classification Improved
3.1 Definition of Fuzzy Logic
3.2 Solution of Fuzzy Logic Under Decision Tree
4 Comprehensive Fault Diagnosis Results
4.1 Case Verification
4.2 Result Discussions
5 Conclusions
References
Smart City and Grids
Statistics Algorithm and Application of People Flow Detection Based on Wi-Fi Probe
Abstract
1 Introduction
2 Wi-Fi Probe Working Principle
3 System Structure
4 People Flow Statistics and Judgment Algorithm
4.1 Determination of the Range
4.2 Determine the Presence of Equipment
4.3 Population Statistics Processing (Data Processing)
5 Test Results and Data Analysis
5.1 Establishing Test Environment
5.2 Comparison Test Data
5.3 Analysis of Data
6 Conclusion
Acknowledgement
References
Research on the Status Quo and Problems of Power Grid Production in China
Abstract
1 Introduction
2 Overview of Cutting-Edge Technology for Smart Grid Development
2.1 Internet of Things
2.2 Cloud Computing
2.3 QR Code and Others
3 Storage and Protection of Big Data in the Power Grid
3.1 Data Storage Issues
3.2 Data Preservation Issues
4 Generation Gap Problem of Power Grid Equipment
4.1 Differences Between Knife Switches at Different Times
4.2 Differences Between Relay Protection Devices in Different Periods
5 Image Classification and Storage Problem After Drone Inspection Operation
5.1 The Grid Needs to Establish a System Image Data Platform
5.2 Remote Patrol and Fault Identification Technology Based on 5G Technology
6 Conclusions
References
Technologies for Next-Generation Network Environments
Small File Read Performance Optimization Based on Redis Cache in Distributed Storage System
Abstract
1 Introduction
2 Related Work
3 Problem Description for Small File Read
4 Cache Optimization Method/Scheme Based on Redis Cache Mechanism
4.1 The Designed Cache Replacement Optimization Algorithm in Redis
4.2 The Designed Redis Three-Level Cache Dynamic Elimination Scheme
5 Experiment and Evaluation
5.1 Experimental Environment and Dataset
5.2 Cache Hit Ratio
5.3 Redis Cache Hit Ratio
6 Conclusion
Acknowledgement
References
Neural Network and Deep Learning
Lightweight Real-Time Object Detection Based on Deep Learning
1 Introduction
2 Related Work
2.1 Brief Introduction of YOLO Algorithm
2.2 The Network of the Darknet19
3 The Proposed YOLO-light Network
3.1 Confusing of Features in Multiple Layers
3.2 The Framework of Multi-scale Prediction
3.3 Targets for Anchors Clustering
4 Experimental Verification and Result Analysis
4.1 Comparison in mAP
4.2 Comparison of Speeds
4.3 Comparisons of IoU Curves
4.4 Test Effect Charts
5 Conclusions
References
Design of Intelligent Water Purification Control System for Small Waterworks Based on LSTM
Abstract
1 Introduction
2 Task Objectives
3 Water Purification Process
4 Control System Composition
5 Calculation of Dosage Based on LSTM
6 System Programming
7 Conclusion
Acknowledgement
References
An Efficient and Flexible Diagnostic Method for Machinery Fault Detection Based on Convolutional Neural Network
1 Introduction
2 The Proposed Machinery Fault Diagnostic Method Based on CNN
3 Experimental Demonstrations
3.1 Data Preparation
3.2 Model Construction
3.3 Experimental Result
4 Conclusions
References
Image Super-Resolution Reconstruction Based on Multi-scale Convolutional Neural Network
Abstract
1 Introduction
2 Image Super-Resolution Reconstruction Based on Multi-scale Convolutional Neural Network
2.1 Model Architecture
2.2 Network Training and Loss Function
3 Experimental Results and Analysis
3.1 Data Set
3.2 Experimental Results and Analysis
3.3 Conclusion and Outlook
References
PLSTM: Long Short-Term Memory Neural Networks for Propagatable Traffic Congested States Prediction
Abstract
1 Introduction
2 Preliminaries
3 Methodology
3.1 Input Series Construction
3.2 PLSTM Predicting Component
3.3 Loss Function
4 Experiments
4.1 Data Set and Settings
4.2 Rationality of Input Series Construction
4.3 Comparison with Other Predictors
5 Conclusion
Acknowledgment
References
Artificial Intelligence Applications
Analysis of Dynamic Movement of Elevator Doors Based on Semantic Segmentation
Abstract
1 Introduction
2 Distance Estimation
2.1 Semantic Segmentation
2.2 Image Erosion
2.3 Prewitt Edge Detection Operator
3 Results and Discussion
3.1 Semantic Segmentation
3.2 Image Erosion
3.3 Prewitt Edge Detection Operator
3.4 Result
4 Conclusion
References
A Multi-component Chiller Status Prediction Method Using E-LSTM
Abstract
1 Introduction
2 Methodology
2.1 LSTM Neural Network
2.2 Improvement of LSTM Model
2.3 Loss Function
3 Experiments and Analysis
3.1 Dataset Description
3.2 Experiment Setup
3.3 Experimental Results
4 Conclusion
Acknowledgement
References
Improvement of Chromatographic Peaks Qualitative Analysis for Power Transformer Base on Decision Tree
Abstract
1 Introduction
2 Introduce Decision Tree to Chromatographic Peak
3 The Processes of Improve Algorithm
3.1 Data Preparation and Feature Selection
3.2 Construction of the H2 Component Peak Under the Decision Tree Model
4 Test Results and Discussions
5 Conclusions
References
A Review of the Development and Application of Natural Language Processing
Abstract
1 Introduction
2 Development of NLP
3 Application of NLP
3.1 Lexical Analysis
3.2 Syntax Analysis
3.3 Automatic Question and Answer
3.4 Text Summary
3.5 Machine Translation
3.6 Emotion Analysis
4 Problems and Prospects
4.1 Semantic Understanding Problems and Prospects
4.2 Low Resource Issues and Prospects
4.3 Other Areas of Future Applications
5 Conclusion
Acknowledgements
References
Decision Support Systems and Data Security
Rapid Consortium Blockchain for Digital Right Management
Abstract
1 Introduction
2 DRM System Requirement Analysis and Resolving Scheme
2.1 DRM System Requirement
2.2 Blockchain Selection
3 Digital Right Management System
3.1 Rapid Consortium Blockchain
3.2 Master-Slave Rapid Consortium Blockchain
4 Analysis of RCBDRM
5 Conclusion
References
Cryptanalysis of an Anonymous Message Authentication Scheme for Smart Grid
1 Introduction
2 Review of Wu et al.'s Scheme
2.1 Network Model
2.2 Detailed Scheme
3 Cryptanalysis of Wu et al.'s Scheme
4 Conclusion
References
Cryptanalysis of an Anonymous Mutual Authentication Scheme in Mobile Networks
1 Introduction
2 Review of Chung et al.'s Scheme
2.1 Registration
2.2 Authentication and Session Key Establishment
3 Cryptanalysis of Chung et al.'s Scheme
3.1 Violating Perfect Forward Secrecy
3.2 Replay Attack
4 Conclusion
References
A Lightweight Anonymous Mutual Authentication Scheme in Mobile Networks
1 Introduction
2 The Proposed Scheme
2.1 Registration Phase
2.2 Authentication Phase
3 Informal Security Analysis of the Proposed Scheme
3.1 Replay Attack
3.2 Denial of Service Attack
3.3 User Impersonation Attack
3.4 Knowing Session Key Attack
3.5 Anonymity
4 Performance Analysis
5 Conclusion
References
Forward Privacy Analysis of a Dynamic Searchable Encryption Scheme
Abstract
1 Introduction
2 DSSE Scheme
3 Forward Privacy
4 Discussion
5 Conclusion
References
Data Classification and Clustering
A Novel Topic Number Selecting Algorithm for Topic Model
Abstract
1 Introduction
2 Related Works
3 MTN (Multiple-Topic-Number) Algorithm
3.1 Basic Theory
3.2 Process of MTN
4 Experimental Results
4.1 Experiment of Chinese
4.2 Experiment of English
5 Conclusion
Acknowledgement
References
A Multivariate Time Series Classification Method Based on Self-attention
1 Introduction
2 Related Work
2.1 Convolutional Neural Network for Multivariate Time Series Classification
2.2 Self-attention
3 Self-attention Temporal Convolutional Networks
3.1 Network Structure
3.2 Temporal Convolutional Networks
3.3 Self-attention
4 Performance Evaluation
4.1 Datasets
4.2 Model Setup
4.3 Evaluation Metrics
4.4 Results
5 Conclusion
References
Butterfly Detection and Classification Based on Integrated YOLO Algorithm
1 Introduction
2 Data Set, Data Annotation and Data Preprocessing
2.1 Data Set
2.2 Data Annotations
2.3 Data Preprocessing
3 Butterfly Detection and Classification Method
3.1 YOLO Model
3.2 Integrated YOLO Algorithm
4 Experimental Results and Analysis
4.1 Evaluation Index
4.2 YOLO V3 Model Effect Experiment
4.3 Integrated Model Effectiveness Experiments
5 Summary and Prospect
References
Text Classification Based on Topic Modeling and Chi-square
Abstract
1 Introduction
2 Methodology
2.1 Text Preprocessing
2.2 Feature Extraction
2.3 Dimension Reduction
2.4 Chi-square
2.5 Classification
2.6 t-SNE
3 Experimental Results and Analysis
4 Conclusion
References
Big Data Analysis and Ontology System
Application Research of Ripley’s K-function in Point Cloud Data Shape Analysis
Abstract
1 Introduction
2 Ripley’s K-function Application Analysis
2.1 Ripley’s K-function in 2D
2.2 Ripley’s K-function in 3D
3 Experimental Results and Analysis
3.1 2D Point Cloud
3.2 3D Point Cloud
4 Conclusion
Acknowledgement
References
InSAR Phase Unwrapping with the Constrained Nonlinear Least Squares Method
Abstract
1 Introduction
2 Terrain Factors Estimation by LS
3 Experiments and Analysis
3.1 PIFE Model Experiment
3.2 Unwrapping Experiment and Analysis
4 Conclusion
Acknowledgment
References
Research and Implementation of Distributed Intelligent Processing Architecture
Abstract
1 Introduction
2 Introduction of the Whole Idea
3 Practical Application Results
4 Conclusions
Acknowledgments
References
Sentiment Analysis System for Myanmar News Using Support Vector Machine and Naïve Bayes
Abstract
1 Introduction
2 Related Works
3 Proposed System
3.1 Training Dataset
3.2 Preprocessing
3.3 Feature Extraction and Transformation
3.4 Classification
4 Experimental Result
5 Conclusion and Future Work
References
State Evaluation of Electric Bus Battery Capacity Based on Big Data
Abstract
1 Introduction
2 Data Preprocessing
2.1 Data Background
2.2 Data Processing
3 Establishment of Regression Model
3.1 Support Vector Regression
3.2 Model Evaluation
3.3 Battery Capacity Estimation
4 Conclusions
References
High-Utility Itemset Mining in Big Dataset
1 Introduction
2 Related Work
3 Proposed MapReduce Framework
3.1 Reveal the Set of 1-HTWUIs
3.2 Generate the Task Files
3.3 Execute MapReduce Framework
4 Conclusion
References
Study on Health Protection Behavior Based on the Big Data of High-Tech Factory Production Line
Abstract
1 Introduction
2 Methodology
3 Statistical Analysis and Results in First Generation
4 Conclusions
References
Social Network and Stock Analytics
A New Convolution Neural Network Model for Stock Price Prediction
1 Introduction
2 Related Work
3 Stock Prediction Model Based on Convolutional Neural Network
3.1 Stock Indexes and the Input Image
3.2 Normalization Function
3.3 The Flow Chart of SSACNN
4 Experimental Results
5 Conclusion
References
Author Index


Advances in Intelligent Systems and Computing 1107

Jeng-Shyang Pan · Jerry Chun-Wei Lin · Yongquan Liang · Shu-Chuan Chu, Editors

Genetic and Evolutionary Computing: Proceedings of the Thirteenth International Conference on Genetic and Evolutionary Computing, November 1-3, 2019, Qingdao, China

Advances in Intelligent Systems and Computing Volume 1107

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/11156

Jeng-Shyang Pan · Jerry Chun-Wei Lin · Yongquan Liang · Shu-Chuan Chu

Editors

Genetic and Evolutionary Computing: Proceedings of the Thirteenth International Conference on Genetic and Evolutionary Computing, November 1-3, 2019, Qingdao, China


Editors

Jeng-Shyang Pan
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China

Jerry Chun-Wei Lin
Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway

Yongquan Liang
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China

Shu-Chuan Chu
School of Computer Science, Engineering and Mathematics, Flinders University, Bedford Park, Australia

ISSN 2194-5357  ISSN 2194-5365 (electronic)
Advances in Intelligent Systems and Computing
ISBN 978-981-15-3307-5  ISBN 978-981-15-3308-2 (eBook)
https://doi.org/10.1007/978-981-15-3308-2

© Springer Nature Singapore Pte Ltd. 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

This volume constitutes the proceedings of the Thirteenth International Conference on Genetic and Evolutionary Computing (ICGEC 2019), which was hosted by Shandong University of Science and Technology and held in Qingdao, Shandong, China, on November 1-3, 2019. ICGEC 2019 was technically co-sponsored by Shandong University of Science and Technology (China), the Fujian Provincial Key Lab of Big Data Mining and Applications (Fujian University of Technology, China), the National Demonstration Center for Experimental Electronic Information and Electrical Technology Education (Fujian University of Technology, China), National University of Kaohsiung (Taiwan), Western Norway University of Applied Sciences (Norway), and Springer. It aimed to bring together researchers, engineers, and policymakers to discuss the related techniques, to exchange research ideas, and to make friends.

More than seventy excellent papers were accepted for the final proceedings. Two plenary talks were kindly offered by Professor Kuo-Hui Yeh (National Dong Hwa University, Hualien, Taiwan, IEEE Senior Member) and Professor Zheming Lu (Zhejiang University, China).

We would like to thank the authors for their tremendous contributions. Furthermore, we express our sincere appreciation to the reviewers, program committee members, and local committee members for making this conference a success.

October 2019

Jerry Chun-Wei Lin
General Chair


Conference Organization

Honorary Chairs
Jeng-Shyang Pan (Shandong University of Science and Technology, China)
Leon S. L. Wang (National University of Kaohsiung, Taiwan)
Yao Qingguo (Fujian University of Technology, China)

Advisory Committee Chairs
Tzung-Pei Hong (National University of Kaohsiung, Taiwan)
Josh Han-Chieh Chao (National Dong Hwa University, Taiwan)

General Chairs
Jerry Chun-Wei Lin (Western Norway University of Applied Sciences, Norway)
Shu-Chuan Chu (Flinders University, Australia)
Liang Yongquan (Shandong University of Science and Technology, China)

Program Chairs
I-Hsin Ting (National University of Kaohsiung, Taiwan)
Philippe Fournier-Viger (Harbin Institute of Technology (Shenzhen), China)
Jimmy Ming-Tai Wu (Shandong University of Science and Technology, China)
Hong-Bo Liu (Dalian Maritime University, China)

Local Organization Chairs
Weimin Zhang (Shandong University of Science and Technology, China)
Chien-Ming Chen (Shandong University of Science and Technology, China)

Electronic Media Chairs
Tsu-Yang Wu (Shandong University of Science and Technology, China)
Thi-Thi Zin (University of Miyazaki, Japan)
Matin Pirouz Nia (California State University, Fresno, USA)

Invited Session Chairs
Peng Yanjun (Shandong University of Science and Technology, China)
Voznak Miroslav (VSB-Technical University of Ostrava, Czech Republic)

Publication Chair
Zhongying Zhao (Shandong University of Science and Technology, China)

Finance Chair
Jui-Fang Chang (National Kaohsiung University of Science and Technology, Taiwan)

Program Committee Members
Yongtang Bao (Shandong University of Science and Technology, China)
Binge Cui (Shandong University of Science and Technology, China)
Hsing-Chung Chen (Asia University, Taiwan)
Chun-Hao Chen (Tamkang University, Taiwan)
Chien-Ming Chen (Shandong University of Science and Technology, China)
Youcef Djenouri (Norwegian University of Science and Technology, Norway)
Hamido Fujita (Iwate Prefectural University, Japan)
Vicente Garcia Diaz (University of Oviedo, Spain)
Philippe Fournier-Viger (Harbin Institute of Technology (Shenzhen), China)
Tzung-Pei Hong (National University of Kaohsiung, Taiwan)
Jean Lai (Hong Kong Baptist University, Hong Kong)
Jerry Chun-Wei Lin (Western Norway University of Applied Sciences, Norway)
Wei-Cheng Lin (National Kaohsiung University of Science and Technology, Taiwan)
Irfan Mehmood (Bradford University, UK)
Weijian Ni (Shandong University of Science and Technology, China)
Matin Pirouz Nia (California State University, USA)
Seungmin Rho (Sejong University, South Korea)
Ja-Hwung Su (Cheng Shiu University, Taiwan)
Wei Song (North China University of Technology, China)
I-Hsing Ting (National University of Kaohsiung, Taiwan)
Pei-Wei Tsai (Swinburne University of Technology, Australia)
Cun-Wei Tsai (National Sun-Yat Sen University, Taiwan)
Raylin Tso (National Chengchi University, Taiwan)
Bay Vo (Ho Chi Minh City University of Technology, Vietnam)
Miroslav Voznák (VSB-Technical University of Ostrava, Czech Republic)
Chen-Shu Wang (National Taipei University of Technology, Taiwan)
Mu-En Wu (National Taipei University of Technology, Taiwan)
Wei-min Chen (Yili Vocational and Technical College, China)
Wen-quan Zeng (Guangdong University of Science and Technology, China)
Tsu-Yang Wu (Shandong University of Science and Technology, China)
Jimmy Ming-Tai Wu (Shandong University of Science and Technology, China)
Cheng-Wei Wu (National Ilan University, Taiwan)
Chen Xin (Shandong University of Science and Technology, China)
Lu Yan (Shandong University of Science and Technology, China)
Ji Zhang (University of Southern Queensland, Australia)
Thi-Thi Zin (University of Miyazaki, Japan)
Zhongying Zhao (Shandong University of Science and Technology, China)
Jianli Zhao (Shandong University of Science and Technology, China)
Weimin Zheng (Shandong University of Science and Technology, China)

Contents

Nature Inspired Constrained Optimization

Shortest Path Searching for Logistics Based on Simulated Annealing Algorithm . . . 3
Weihui Xu, Rong Hu, Zhihui Chen, Hanlin Chen, Sijie Luo, Haiyan Yang, and Jinjuan Wen

Ant Colony Optimization Algorithm for Network Planning in Heterogeneous Cellular Networks . . . 11
Fan-Hsun Tseng, Fan-Yi Kao, Tsung-Ta Liang, and Han-Chieh Chao

An Algorithm for Path Planning Based on Improved Q-Learning . . . 20
Shimin Gu

Ranking Based Opinion Mining in Myanmar Travel Domain . . . 30
Nilar Aye and Thinn Thu Naing

Recent Advances on Evolutionary Optimization Technologies

Energy Efficient Container Consolidation Method in Cloud Environment Based on Heuristic Algorithm . . . 41
Linlin Tang and Yao Meng

An Improved Bat Algorithm Based on Hybrid with Ant Lion Optimizer . . . 50
Thi-Kien Dao, Shu-Chuan Chu, Jeng-Shyang Pan, Trong-The Nguyen, Truong-Giang Ngo, Trinh-Dong Nguyen, and Huu-Trung Tran

A Parallel Strategy Applied to APSO . . . 61
Qing-Wei Chai, Jeng-Shyang Pan, Wei-Min Zheng, and Shu-Chuan Chu

A Binary Particle Swarm Optimization with the Hybrid S-Shaped and V-Shaped Transfer Function . . . 69
Lei Jiang, Jianhua Liu, Dongli Cui, Guannan Bu, Dongyang Zhang, and Renyuan Hu

Method for Calculating Grounding Quantity of TN-C System Based on Multi-objective Optimization . . . 78
Jielin Hua, Rongjin Zheng, Huiqiong Deng, and Siyu Chen

A Multiobjective-Based Group Trading Strategy Portfolio Optimization Technique . . . 87
Chun-Hao Chen, Munkhjargal Gankhuyag, Tzung-Pei Hong, Mu-En Wu, and Jimmy Ming-Tai Wu

Research on Optimizing Parameters of Pitch Angle Controller Based on Genetic Algorithm . . . 94
Shiguang Zheng, Chih-Yu Hsu, Jeng-Shyang Pan, and Joe-Yu

Software Development and Reliability Systems

Research and Simulation Analysis of Brake Energy Recovery Control Strategy for Pure Electric Vehicles . . . 103
Qi-Chao Li and Yi-Jui Chiu

Research on Coupling Vibration of Disk Crack and Shaft Crack of Rotor System Based on Finite Element Method . . . 111
Peng-Fei Peng, Yi-Jui Chiu, and Xiao-Yun Li

Structure Design and Analysis of Elderly Electric Vehicle Based on Solar Energy . . . 117
Qi-Chao Li, Yi-Jui Chiu, Guo-Wei Weng, and Hao-Da Huang

Structural Design and Analysis of Ring Steering Wheel for Old-Age Walking Vehicle . . . 125
Guo-Wei Weng, Yi-Jui Chiu, Qi-chao Li, and Wen-jun Liu

Rule-Based Graded Braking for Unsignalized Intersection Collision Avoidance via Vehicle-to-Vehicle Communication . . . 134
Bijun Chen, Lyuchao Liao, Fumin Zou, and Yuxin Zheng

A Novel Measure for Trajectory Similarity . . . 143
Sijie Luo, Fumin Zou, Qiqin Cai, Feng Guo, Weihui Xu, and Yong Li

TS-DBSCAN: To Detect Trajectory Anomaly for Transportation Vehicles . . . 151
Xinke Wu, Lyuchao Liao, Fumin Zou, Jiurui Liu, Bijun Chen, and Yuxin Zheng

Abnormal Analysis of Electricity Data Acquisition in Electricity Information Acquisition System . . . 161
GuanWei Xu and Rongjin Zheng

A Method of Power Network Security Analysis Considering Cascading Trip . . . 171
Hui-Qiong Deng, Xing-Ying Lin, Peng-Peng Wu, Qin-Bin Li, and Chao-Gang Li

Construction Area Identification Method Based on Spatial-Temporal Trajectory of Slag Truck . . . 181
Jinjuan Wen, Fumin Zou, Lyuchao Liao, Rong Hu, Zhiyuan Hu, Zhihui Chen, Qiqin Cai, and Jierui Liu

Monitoring Applications and Mobile Apps

Study on Hazardous Scenario Analysis of High-Tech Facilities and Emergency Response Mechanism of Science and Technology Parks Based on IoT . . . 191
Kuo-Chi Chang, Kai-Chun Chu, Yuh-Chung Lin, Trong-The Nguyen, Tien-Wen Sung, Yu-Wen Zhou, and Jeng-Shyang Pan

Power Flow Calculation Based Newton-Raphson Calculation . . . 200
Peidong Sun, Pengcheng Cao, and Peiqiang Li

Research on the Application of Instance Segmentation Algorithm in the Counting of Metro Waiting Population . . . 210
Yan Cang, Chan Chen, and Yulong Qiao

Image and Video Processing

Implementation of Android Audio Equalizer Based on FFmpeg . . . 221
Weida Zhuang, Xingli He, and Jinyang Lin

Can Deep Neural Networks Learn Broad Semantic Concepts of Images? . . . 228
Longzheng Cai, Shuyun Lim, Xuan Wang, and Longmei Tang

A Fast Data Hiding Algorithm for H.264/AVC Video in Bitstream Domain . . . 237
Kai An, Zheming Lu, Faxin Yu, and Xuexue Luo

An Efficient Pattern Recognition Technology for Numerals of Lottery and Invoice . . . 245
Yi-Nung Chung, Ming-Sung Chiu, Chien-Chih Lin, Jhen-Yang Wang, and Chao-Hsing Hsu

An Improved YOLOv3 Algorithm for Pedestrian Detection on UAV Imagery . . . 253
Yulin Yang, Baolong Guo, Cheng Li, and Yunpeng Zhi

Fast Ship Detection in Optical Remote Sensing Images Based on Sparse MobileNetV2 Network . . . 262
Jinxiang Yu, Tong Yin, Shaoli Li, Shuo Hong, and Yu Peng

Image Matching Using Phase Congruency and Log-Gabor Filters in the SAR Images and Visible Images . . . 270
Xiaomin Liu, Huaqi Zhao, Huibin Ma, and Jing Li

Pattern Recognition

WetlandNet: Semantic Segmentation for Remote Sensing Images of Coastal Wetlands via Improved UNet with Deconvolution . . . 281
Binge Cui, Yonghui Zhang, Xinhui Li, Jing Wu, and Yan Lu

Intelligent Multimedia Tools and Applications

Digital Multimeter Automatic Verification Device Design . . . 295
Qingdan Huang, Liqiang Pei, Yuqing Chen, Rui Rao, and Huiyuan Lv

A Method to Prevent Cascading Trip in Power Network Based on Nodal Power . . . 302
Hui-Qiong Deng, Peng-Peng Wu, Xing-Ying Lin, Qin-Bin Lin, and Chao-Gang Li

Insulation Faults Diagnosis of Power Transformer by Decision Tree with Fuzzy Logic . . . 310
Cheng-Kuo Chang, Jie Shan, Kuo-Chi Chang, and Jeng-Shyang Pan

Smart City and Grids

Statistics Algorithm and Application of People Flow Detection Based on Wi-Fi Probe . . . 321
Wenbin Zheng, Xiao Liu, Peng Li, Li Lin, and Hancong Wang

Research on the Status Quo and Problems of Power Grid Production in China . . . 333
Bo-Lin Xie, Yuh-Chung Lin, Jeng-Shyang Pan, and Kuo-Chi Chang

Technologies for Next-Generation Network Environments

Small File Read Performance Optimization Based on Redis Cache in Distributed Storage System . . . 343
Bizhong Wei, Liqiang Deng, Ya Fan, and Miao Ye

Neural Network and Deep Learning

Lightweight Real-Time Object Detection Based on Deep Learning . . . 355
Zhefei Wei, Zhe Huang, Baolong Guo, Cheng Li, and Geng Wang

Design of Intelligent Water Purification Control System for Small Waterworks Based on LSTM . . . 366
Ying Ma, Zhigang He, Jianxing Li, Kan Luo, Zhengshan Chen, and Lisang Liu

An Efficient and Flexible Diagnostic Method for Machinery Fault Detection Based on Convolutional Neural Network . . . 380
Geng Wang, Baolong Guo, Cheng Li, Zhe Huang, and Jie Hu

Image Super-Resolution Reconstruction Based on Multi-scale Convolutional Neural Network . . . 389
Jianqiao Song and Feng Wang

PLSTM: Long Short-Term Memory Neural Networks for Propagatable Traffic Congested States Prediction . . . 399
Yuxin Zheng, Lyuchao Liao, Fumin Zou, Ming Xu, and Zhihui Chen

Artificial Intelligence Applications

Analysis of Dynamic Movement of Elevator Doors Based on Semantic Segmentation . . . 409
Chih-Yu Hsu, Joe-Yu, and Jeng-Shyang Pan

A Multi-component Chiller Status Prediction Method Using E-LSTM . . . 416
Chenrui Xu, Kebin Jia, Zhuozheng Wang, and Ye Yuan

Improvement of Chromatographic Peaks Qualitative Analysis for Power Transformer Base on Decision Tree . . . 429
Jie Shan, Cheng-Kuo Chang, Hao-Min Chen, and Jeng-Shyang Pan

A Review of the Development and Application of Natural Language Processing . . . 437
Wei-Wen Guo, Li-Li Huang, and Jeng-Shyang Pan

Decision Support Systems and Data Security

Rapid Consortium Blockchain for Digital Right Management . . . 447
Yue Wu, Zheming Lu, Faxin Yu, and Xuexue Luo

Cryptanalysis of an Anonymous Message Authentication Scheme for Smart Grid . . . 455
Xiao-Cong Liang, Tsu-Yang Wu, Yu-Qi Lee, Tao Wang, Chien-Ming Chen, and Yeh-Cheng Chen

Cryptanalysis of an Anonymous Mutual Authentication Scheme in Mobile Networks . . . 462
Lei Yang, Tsu-Yang Wu, Zhiyuan Lee, Chien-Ming Chen, King-Hang Wang, Jeng-Shyang Pan, Shu-Chuan Chu, and Mu-En Wu

A Lightweight Anonymous Mutual Authentication Scheme in Mobile Networks . . . 468
Zhiyuan Lee, Tsu-Yang Wu, Lei Yang, Chien-Ming Chen, King-Hang Wang, Jeng-Shyang Pan, Shu-Chuan Chu, and Yeh-Cheng Chen

Forward Privacy Analysis of a Dynamic Searchable Encryption Scheme . . . 474
Zhuoyu Tie, Eric Ke Wang, Jyh-Haw Yeh, and Chien-Ming Chen

Data Classification and Clustering

A Novel Topic Number Selecting Algorithm for Topic Model . . . 483
Linlin Tang and Liang Zhao

A Multivariate Time Series Classification Method Based on Self-attention . . . 491
Huiwei Lin, Yunming Ye, Ka-Cheong Leung, and Bowen Zhang

Butterfly Detection and Classification Based on Integrated YOLO Algorithm . . . 500
Bohan Liang, Shangxi Wu, Kaiyuan Xu, and Jingyu Hao

Text Classification Based on Topic Modeling and Chi-square . . . 513
Yujia Sun and Jan Platoš

Big Data Analysis and Ontology System

Application Research of Ripley's K-function in Point Cloud Data Shape Analysis . . . 523
Linlin Tang, Xupeng Tong, and Jingyong Su

InSAR Phase Unwrapping with the Constrained Nonlinear Least Squares Method . . . 534
Xiaoqing Zhang, Weike Liu, Yongguo Zheng, and Zhiyong Wang

Research and Implementation of Distributed Intelligent Processing Architecture . . . 543
Chunyu Chen, Yong Zhang, Yulong Qiao, Hengxiang He, and Xingfu Zhang

Sentiment Analysis System for Myanmar News Using Support Vector Machine and Naïve Bayes . . . 551
Thein Yu and Khin Thandar Nwet

State Evaluation of Electric Bus Battery Capacity Based on Big Data . . . 558
Yifan Li, Weidong Fang, Fumin Zou, Sheng Wang, Chaoda Xu, and Weisong Dong

High-Utility Itemset Mining in Big Dataset . . . 567
Jimmy Ming-Tai Wu, Min Wei, Jerry Chun-Wei Lin, and Chien-Ming Chen

Study on Health Protection Behavior Based on the Big Data of High-Tech Factory Production Line . . . 571
Kuo-Chi Chang, Kai-Chun Chu, Yuh-Chung Lin, Trong-The Nguyen, Tien-Wen Sung, Yu-Wen Zhou, and Jeng-Shyang Pan

Social Network and Stock Analytics

A New Convolution Neural Network Model for Stock Price Prediction . . . 581
Jimmy Ming-Tai Wu, Zhongcui Li, Jerry Chun-Wei Lin, and Matin Pirouz

Author Index . . . 587

Nature Inspired Constrained Optimization

Shortest Path Searching for Logistics Based on Simulated Annealing Algorithm

Weihui Xu(1,2), Rong Hu(1,2) (corresponding author), Zhihui Chen(1,2), Hanlin Chen(1,2), Sijie Luo(1,2), Haiyan Yang(1,2), and Jinjuan Wen(1,2)

1 Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, China
2 Fujian Provincial Big Data Research Institute of Intelligent Transportation, Fujian University of Technology, Fuzhou 350118, China

Abstract. With the rapid development of the economy, the logistics industry is growing and the distribution network is becoming more and more complex. The traditional random transportation mode is prone to reverse flow and detour transportation, which leads to low transportation efficiency. To solve this problem, this paper proposes a logistics shortest path search algorithm based on simulated annealing. Simulation experiments show that the optimal route can be found in a short time. As a result, the driving distance can be reduced and the distribution speed can be accelerated. Most importantly, the distribution efficiency of SF logistics in Fuzhou can be improved.

Keywords: Logistics · Simulated annealing algorithm · Shortest path

1 Introduction

With the growing professionalism of logistics technology, the development of the market economy, and the rise of e-commerce, the logistics and distribution industry has developed rapidly [11]. The problem that follows is how to deliver goods quickly to the distribution points of different express delivery companies and on to the next city; preventing the losses caused by overdue delivery is essential. Moreover, the transportation efficiency of logistics is closely related to urban traffic problems [1], and the associated conflicts, such as vehicle noise, traffic jams, energy waste, and automobile exhaust pollution, have become more and more serious. Nowadays, urban traffic congestion keeps worsening, so the logistics and distribution industry, an important pillar of the national economy, will face a huge challenge [2].

Facing these problems, choosing an inappropriate transportation path increases the driving distance, places an unnecessary burden on the road network, and greatly increases the cost of transportation [10]. It is therefore important to choose an optimal and efficient path, which can not only reduce delivery time but also reduce logistics cost and improve logistics efficiency [3].

The Traveling Salesman Problem (TSP) represents a class of combinatorial optimization problems. It has important engineering and theoretical value in logistics distribution, computer networks, electronic maps, traffic guidance, electrical wiring, VLSI cell layout, and so on [4], and it has attracted many scholars' attention. TSP is simply described as follows: a salesman wants to visit n different cities to sell goods, where d(i, j) is the distance between two different cities i and j [5]. The problem is how to choose the shortest route so that the salesman passes through every city and returns to the starting point [6].

SF Logistics has many distribution points in Fuzhou. Distribution trucks arriving in Fuzhou from the north need to pass through the various SF distribution points in Fuzhou on their way to Quanzhou. To improve transportation efficiency, finding the shortest path is particularly important: it increases the efficiency of transportation and gives customers a better experience. The SF Logistics problem is basically the same as TSP. In TSP, the tour starts from a city and returns to the same place after visiting all cities; in the SF Logistics problem, the route starts from a fixed point and passes through all points without returning to the start. We handle this difference by fixing the starting point and the end point.

TSP is a typical combinatorial optimization problem and an NP-hard problem. Because TSP is simple to describe, early researchers used exact algorithms to solve it; common methods include branch and bound, linear programming, and dynamic programming [6]. However, the total number of possible paths increases exponentially with the number of cities, so when the number of cities is large, it is generally difficult to find the global optimum exactly.

With the development of artificial intelligence, many problem-independent intelligent optimization algorithms have emerged, such as ant colony algorithms, genetic algorithms, simulated annealing, neural networks, particle swarm optimization, and immune algorithms. They solve complex combinatorial optimization problems by simulating or interpreting natural phenomena or processes [7], providing new ideas and methods; each has its own advantages and disadvantages. The earliest idea of simulated annealing was proposed by Metropolis in 1953, and in 1983 Kirkpatrick successfully introduced the annealing idea into the field of combinatorial optimization. The simulated annealing algorithm is an extension of local search [8] and, in theory, a global optimization algorithm. This paper uses the simulated annealing algorithm to solve the shortest path problem of SF logistics.
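For reference, the informal TSP statement above corresponds to the standard formulation below. This formalization is ours (the paper does not display it), using the distance d(i, j) just defined and a permutation π of the n cities:

    minimize over all permutations π:
    L(π) = Σ_{k=1}^{n-1} d(π(k), π(k+1)) + d(π(n), π(1))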

2 Data Acquisition

2.1 Acquisition of SF Logistics Points

This paper analyzes the delivery mode of SF logistics vehicles, which is similar to TSP, so we first need the distribution points of SF Logistics in Fuzhou. Firstly, we connect to the application programming interface (API) of the Amap developer platform. Then, using the relevant geocoding services, we grab the distribution points of SF Logistics in the Gulou, Taijiang, Cangshan, Mawei, Jin'an, and Minhou districts of Fuzhou with the keyword "SF". Finally, we obtain the latitude and longitude of each SF Logistics point in Fuzhou.

After obtaining the distribution points of SF Logistics in these districts of Fuzhou, we crawl the latitude and longitude coordinates of the points and save them into CSV documents for the next stage of our research. In the end, six distribution points are found in Gulou District, seven in Taijiang District, five in Cangshan District, two in Mawei District, six in Jin'an District, and seven in Minhou District, for a total of 39 distribution points. The distribution points of SF Logistics in Fuzhou are shown in Fig. 1, where each green point is an SF distribution point. From the figure, we can see that the distribution points are denser near the city center, and the overall distribution is relatively scattered.

Fig. 1. Distribution points of SF Logistics in Fuzhou

2.2

Distance Between Logistics Points

In order to find the shortest path for SF Logistics, we need to calculate the distance between these 39 points. The result of only calculating their linear distance or calculating their European distance is not in line with the actual situation. So we connect the application programming interface of Amap Developer Platform and choose Web API for connection service. The API of Amap has the function of path planning and distance measurement. The API of Amap can freely choose the latitude and longitude of the starting point and the end point as well as what transportation tool to use. So it can plan the path between two places with latitude and longitude, and calculate the distance between two points. This distance is the actual driving distance, which accords with the actual situation, and the calculated data can be useful. This paper calculates the distance between the 39 distribution points of SF logistics through the functions provided by the api of Amap and forms a 39*39 matrix. As shown in Table 1 below, the first row and the first column in the table represent 39 SF logistics distribution points, and the numbers in the matrix are the distances between 39 distribution points in meters. For example, the distance from the first logistics point to itself is 0 m, and the distance to the second logistics point is 4653. Because it is the actual map distance, the navigation path may be different, so the second logistics point to the first point is 5063.

6

W. Xu et al. Table 1. Distance between SF logistics points 1 2 3 … 38 39

1 0 5063 7252 … 30663 32672

2 4653 0 4227 … 30183 32192

3 10729 3885 0 … 35366 37375

… … … … … … …

38 28113 28891 35041 … 0 5632

39 30704 31482 37632 … 5852 0

3 Methodology The shortest path solving problem of SF Logistics is a combinatorial optimization problem, and the simulated annealing algorithm is a random method to solve the combinatorial optimization problem. Combine the shortest path solution of SF Logistics with the simulated annealing algorithm to solve it and find a shortest path. SF Logistics has a total of 39 logistics distribution points in Fuzhou. After determining the starting point and the end point, the remaining 37 points are randomly arranged, and a sequence is initialized. Given an initial temperature, two points are randomly changed as the temperature drops. Generate a new solution and compare it with the old solution. If the total distance is smaller, accept the new solution. If the total distance is more, accept it with a certain probability to prevent it from falling into local optimum. The final temperature is reduced to zero and the path with the shortest total distance is found. 3.1

Solution Space and Initial Solution

The solution space S of the shortest path solving problem of SF Logistics is to visit all the distribution points of SF in Fuzhou. The starting point and the ending point have been fixed, that is, the collection of all other SF points except the starting point and the ending point. The solution space S can be expressed as a set of all permutations and combinations of fp1 ; p2 ; . . .; pn g. The optimal solution of the simulated annealing algorithm is independent of the initial state, so the initial solution S_0 can generate a random sequence by a random function [9]. S0 ¼ ðp1 ; p2 ; . . .; pn Þ

ð1Þ

pi indicates that the nth SF logistics point is indicated. 3.2

Objective Function

The objective function of the shortest path solving problem of SF Logistics is to access the total length of all Fuzhou SF logistics points [9].

Shortest Path Searching for Logistics Based on Simulated Annealing Algorithm

Dðp1 ; p2 ; . . .; pn Þ ¼

Xn þ 1 i¼1

d ð pi ; pi þ 1 Þ

7

ð2Þ

Finally, all the SF logistics points in Fuzhou are accessed through the simulated annealing algorithm, and the minimum value of the objective function Dðp1 ; p2 ; . . .; pn Þ is the shortest path. The corresponding S ¼ ðp1 ; p2 ; . . .; pn Þ sequence is the optimal solution for the shortest path of SF Logistics. 3.3

Generation of New Solutions

The new solution is the new SF logistics distribution path, and the generation of new solutions is very important for solving problems. The new solution can be generated by using the following two methods separately or alternately: (1) Two transformation method By randomly changing the order of delivery of the two SF logistics points, a new set of delivery sequences can be obtained, namely: optional serial number u, v (set u < v < n), exchange access order between u and v, if The solution before the exchange is: Si ¼ ðp; p2 ; . . .; pu ; . . .; pv ; . . .; pn Þ, the path after the exchange is the new path, that is [9]: S0i ¼ ðp1 ; . . .; pu1 ; pv ; pv1 ; . . .; pu þ 1 ; pu ; pv þ 1 ; . . .; pn Þ

ð3Þ

(2) Three transformation method Randomly select three SF logistics points, and select two of them to insert into the third point, namely: optional serial numbers u, v and x ðu 6 v \ xÞ, insert the path between u and v after accessing x, Si ¼ ðp1 ; p2 ; . . .; pu ; . . .; pv ; . . .; pw ; . . .; pn Þ is the original SF logistics distribution path, and the new path after exchange is [9]: S0i ¼ ðp1 ; . . .; pu1 ; pv þ 1 ; . . .; pw ; pu ; . . .; pv ; pw þ 1 ; . . .; pn Þ

3.4

ð4Þ

Target Function Difference

A new SF logistics point distribution sequence is generated by transformation, and the distance of the new sequence is subtracted from the total distance of the old sequence, and the difference is obtained, that is [9]:   DD0 ¼ D S0i  DðSi Þ

ð5Þ

the total distance difference between the two new and old paths is DD0 . 3.5

Metro Polis Accepts Guidelines

The generation of a new sequence does not necessarily result in a smaller total distance than the original distribution sequence. In order to prevent falling into a local optimal

8

W. Xu et al.

solution, a poor probability is accepted with a certain probability. The acceptance criteria are [9]: ( P¼

 1; 0  exp  DD T ;

DD0 \0 DD0 [ 0

ð6Þ

T even the current temperature of the simulated annealing algorithm. 3.6

Algorithm Pseudo Code

Generate S produces new solutions through two-variation and three-variation methods, cur_T is the temperature in the simulated annealing algorithm, alpha is the attenuation coefficient, and round is the number of iterations.

Initiation: cur_T ,S ,alpha,round While cur_T >1: for i in range(round): new_S = generate_S delta_distance = new_distance – cur_distance if delta_distance 15, the e-Q-learning algorithm shows its obvious advantages, that is, the cumulative V value of the path is obviously greater than the traditional Q-learning algorithm. At the same time, it is also found from Fig. 2 that when n > 25, the V value accumulation of the e-Q-learning algorithm no longer changes, but the V value accumulation of the Q-learning algorithm still increases slightly, indicating that the former converges faster. At the same time, in order to show that the optimal path found by the e-Q-learning algorithm is superior to the optimal path found by the Q-learning algorithm, we set a loss function, which can compare the optimal path artificially set by us with the optimal path obtained by the two algorithms. The experimental results are shown in Fig. 3.

Fig. 2. Convergence comparison of algorithms

Fig. 3. Comparison of loss function values of the algorithm

28

S. Gu

According to Fig. 3, we can clearly see that when the number of iterations n  15, the loss function values of the two algorithms do not show much difference. However, when n > 15, the loss function value of the e-Q-learning algorithm decreases rapidly, that is, from this time on, the e-Q-learning algorithm’s path is better than that of the Q-learning algorithm. At the same time, it is also found from Fig. 3 that when the loss function values of both algorithms converge, the loss function value of the e-Q-learning algorithm is significantly lower than that of the Q-learning algorithm, which indicates that the e-Q-learning algorithm has a stronger Learning ability. In short, the experimental results show that the optimal path found by the e-Q-learning algorithm is not only better than the optimal path found by the traditional Q-learning algorithm, but also faster in convergence speed in the simulation environment of the maze for the robot with 54 states.

5 Summarizes In this paper, an improved e-Q-learning algorithm is proposed, and dynamic e is introduced to improve the efficiency of the algorithm. In particular, when calculating the loss function value, the e-Q-learning algorithm calculates the value of V at the same time, so that the feedback of the environment is more fully grasped. Once the positive feedback is obtained, the convergence speed of the system is improved by increasing the exploration factor e to limit the random search. Experimental results show that compared with the existing Q-learning algorithm, the improved e-Q-learning can not only obtain a better optimization path, but also has a faster convergence rate. Acknowledgment. This work is supported by the National Natural Science Foundation of China (61773415).

References 1. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT press, Cambridge (1998) 2. Gao, Y., Chen, S., Lu, X.: Review of reinforcement learning. J. Autom. 30(1), 86–100 (2004) 3. Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8(3–4), 279–292 (1992) 4. Qiao, J., Hou, Z., Yan, X.: Application of reinforcement learning based on neural network in obstacle avoidance. J. Tsinghua Univ. Nat. Sci. Ed. 48(S2), 1747–1750 (2008) 5. Fakoor, M., Kosari, A., Jafarzadeh, M.: Humanoid robot path planning with fuzzy Markov decision processes. J. Appl. Res. Technol. 14(5), 300–310 (2016) 6. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: a survey. J. Artif. Intell. Res. 4, 237–285 (1996) 7. He, D., Sun, S.: An online self-learning fuzzy navigation method for mobile robots. J. Xi’an Univ. Technol. 27(4), 325–329 (2007) 8. Hao, C., Fang, Z., Li, P.: Three-dimensional track planning algorithm of UAV based on Q learning. J. Shanghai Jiaotong Univ. 46(12), 1931–1935 (2012)

An Algorithm for Path Planning Based on Improved Q-Learning

29

9. Liu, Z., Li, H., Liu, Q.: Reinforcement learning algorithm research. Comput. Eng. Des. 29(22), 5805–5809 (2008) 10. Roozegar, M., et al.: XCS-based reinforcement learning algorithm for motion planning of a spherical mobile robot. Appl. Intell. 45, 736–746 (2016) 11. Chen, C.L.: Autonomous learning and navigation control of mobile robot based on reinforcement learning. University of Science and Technology of China (2006) 12. Castronovo, M., Maes, F., Fonteneau, R., et al.: Learning exploration/exploitation strategies for single trajectory reinforcement learning. In: European Workshop on Reinforcement Learning, pp. 1–10 (2013)

Ranking Based Opinion Mining in Myanmar Travel Domain Nilar Aye1(&) and Thinn Thu Naing2 1

2

University of Computer Studies, Yangon, Myanmar [email protected] University of Computer Studies, Taunggyi, Myanmar [email protected]

Abstract. The travel and tour industry is one of the world’s largest service industries. It takes part as a vital role of state’s economy for the developing countries like Myanmar. Therefore, a travel information support system becomes essential that construction of Myanmar travel domain has been considered to be developed which is enriching with ancient heritage and known for many natural tourist destinations. Besides, it extracts travelers’ opinions with a purpose of promoting and improving all related services. We intend to implement the ontology-based opinion mining for Myanmar travel domain. The main contribution of this paper is collecting valuable travel information and tourists’ opinions of Myanmar by crowd sourcing method which has to be used for ranking based opinion mining to support travelers. This proposed method is important for travel industry in terms of services quality, promotion of the business and facilitate the industry that can be ranked based on the factors of determining many various opinions. Thus, opinion mining is affected not only the review comments created by visitors, but also the ranking of travel items given by every stake holder in the industry. Keywords: Opinion mining

 Crowd sourcing  Myanmar travel domain

1 Introduction Travel domain is an increasingly important one in the modern age. It is centered on the movement of people from one location to another, as well as the services they require along the way. Sectors of Travel Industry include transportation, accommodation, food and beverage, attractions, entertainment, and so on. The World Travel & Tourism Council’s (WTTC) research reveals that the sector accounted for 10.4% of global GDP and 319 million jobs, or 10% of total employment in 2018. Domestic tourism, which represented 71.2% of all tourism spending in 2018 and had the strongest growth in developing nations, continues to support opportunities by spreading development and regional economic benefits and building national pride [13]. As travelling is one of the important domains referring many interesting factors of every state’s economy, a useful travel database is needed to promote tourism industry with easier travelling for tourists. Meanwhile, the openness of social media give un-expectable opportunities for users to publicly voice their opinions, but when it comes to making sense of these opinions, © Springer Nature Singapore Pte Ltd. 2020 J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 30–37, 2020. https://doi.org/10.1007/978-981-15-3308-2_4

Ranking Based Opinion Mining in Myanmar Travel Domain

31

serious difficulties arises. Here, we can apply opinion mining techniques as an opportunity for making sense of these thousand voices from the public. From a piece of user-entered text, opinion-mining systems can analyze which part is being commented by whom with what opinion. The popular global travel opinion site named TripAdvisor, gives online services to travelers all over the world. They can browse hundreds of millions of traveler reviews and opinions, compare low prices on hotels, flights, and cruises, book popular hotels, etc. However, they cannot give the taste of specific places, like the nature or culture or business of a specific region which is one of the important factors for making a decision to plan for a trip. Thus, we would like to fill this gap by constructing travel opinion mining with case study of Myanmar-centric travel information. There is also another requirement that the viewer can’t easily detect the high quality information of good review opinions from the past travelers. Therefore, we would like to fulfill the requirement by constructing travel opinion mining with ranking based approach. The paper is organized as follows. Section 2 describes the problem issues and related works to clarify more on the approach. In Sect. 3, we describe the background theory of opinion mining in travel domain. Section 4 presents the proposed system of opinion mining for Myanmar travel domain with system implementation and experiments are described in Sect. 5. Finally, Sect. 6 concludes with summary of the system and proposes a future work.

2 Problem Issues and Related Work Travel domain is one of the interesting application domains since tourism industry can contribute country’s GDP growth and create many job opportunities locally [11]. Having well developed tourism information system and telecom infrastructure, the search can be facilitated travelers in finding the right travel information. However, tourism resources are highly heterogeneous, interdependent and scattered from many web sites that taking time to search one after another to get complete information. In addition, information is mostly prepared individually by travel agency, hotel, restaurant, online sales agent which are more focuses on their business with own interest that not all sources are reliable. Moreover, some information is changing in short noticed because of weather, crises, and lunar calendar-based holidays that immediate update is needed. In order to solve these problems, we would like to apply a relatively new management concept, namely “Crowd Sourcing”. In [9], ontology based approach for managing crowd sourcing is proposed as a management framework. Our system proposes the travel ontology domain with a crowd sourced and opinion mining approach to retrieve the useful information. This paper contributes the framework for creation of the travel domain which allows addition of review comments via crowd. Finally, the rating and sentiment results will be displayed using opinion mining. Each travel destination has her own beauty and attractions for travelers, bounded by her unique traditional and cultural values. The global travel advisors cannot give such valuable information to travelers. In [6] the design of ontology for Thailand travel industry on both general information and specific dynamic local information is

32

N. Aye and T. T. Naing

proposed using Domain Ontology Graph (DOG). In order to provide special local dynamic information such as cultural and festival constraints, ontology is developed to serve the travelers as a dynamic sematic database. Different from this approach, to provide local updated information to travelers, we use crowd source to get data together with opinion mining approach to evaluate the reviews/comments and provide relevant rating information. Nowadays, to offer efficient conceptual information to travelers, Travel Ontology needs to be extendable and upgradable. Some resources need to be added, deleted, and updated according to their domain’s requirements. Some research try to carry out this work. An approach in [1] generates ontology from relational database (RDB) while keeping the semantics present in the RDB schema and links between records at the data level. Moreover, some travelers cannot express their opinions by writing review comments because of language familiarity. Their opinion expression can be limited by the word fluency and grammatical construction of review language, mostly for English. In this case, the rating scores can show their opinions more accurately and easily. Our approach aims to construct ontology for Myanmar travel domain based on the relational database schema. The advantages of our approach is being described as follows; 1. Myanmar travel related information is collected from traveler’s reviews comments. So, we can get the Myanmar-value information that is used to build Myanmar travel ontology. 2. Crowd sourcing method allows to gather reviews from many people digitally around the world. So, we can access opinions from different people from different places. 3. Instead of creating ontology directly by jena OntClass, we first build the MySQL database and using this database schema to construct the ontology. So, the ontology can be extendable and upgradable by adding tables and attributes in the database, without physically writing OntClass at the code level. 4. To solve language issue, the opinion mining not only extract opinions from review comments, but also collect valuable feature of rating scores given by the travelers. These rating scores have certain effects on determining their opinions.

3 Opinion Mining in Travel Related Domains According to its formal definition in [7], opinion mining is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals and emotions towards entities such as products, services, organizations, individuals, issues, events and their attributes. It mainly focuses on opinions which express or imply positive or negative sentiments. These days, businesses always want to find consumer or public opinions about their products and services to identify their business. On the other hand, consumers also want to know the opinions of existing users’ recommendations before buying it. Because of its high impact, recent research trends to mine opinion in Travel domain. Most of their framework are in similar manner. They all focus to find out opinions only in travelers’ review comments.

Ranking Based Opinion Mining in Myanmar Travel Domain

33

The authors of [4] proposes a platform for extraction and summarizing of opinions expressed by users in tourism related online platforms. They use an unsupervised method and a lexical resource to get opinions from user reviews posted on TripAdvisor website. A content acquisition module collects the reviews from website by means of web crawler, and an analysis module pre-processes the extracted data and implements opinion mining process with the use of SentiWordNet lexical database. The work of researchers from [5] focus on the aspect-based opinion mining of restaurant reviews. To get the important aspects, this method uses SentiWordNet for assigning priority scores to opinion words. Finally, aggregates the scores of each aspect in all reviews and produce an aspect-based summary. In [10], the authors try to conduct an assessment of consumer sentiment taken from social media in assessing a culinary food that is very valuable in improving food quality. The sentiment of satisfaction assessment is classified by Naive Bayes classification algorithm. To achieve the optimal level of accuracy in classification of consumer ratings, this paper proposed a hybrid feature selection models by using a combination of information gain (IG) and genetic algorithm (GA) algorithms. In a master degree thesis of [8], Twitter platform will be used as a source of opinions. The main contribution is an investigation of classification algorithms (Naïve Bayes and convolutional neural network (CNN)) for extracting opinions from tweets and movie reviews. Then, accuracy of algorithms and their performance time are analyzed. Finally, it can be observed that Naïve Bayes approach gave quite good results. A methodology for the classification of sentiments was developed in [12] for reviews on purchased items in Indian market. The streamed reviews were filtered for relevant content and stored in a database. TF-IDF score-based approach was utilized and the score was calculated for each review. Feature Selection was applied on it using Chi Square method and information gain. The extracted feature forms a term document matrix which is utilized in Naïve Bayes and CNN classification algorithms. Thus, an automated system is designed for sentiment mining related to reviews on different purchased items.

4 Proposed Architecture In this paper, travel domain for Myanmar is selected as a case study which is not only similar travel characteristics of other countries’ but also enrich unique culture activities with insufficient digital information. Besides, Myanmar tourism industry development status is still in early stage. In such situation, local information support with the help of opinion ranking is essential for the convenience of travelers. This paper proposes a framework for mining travelers’ opinions by processing information updates with crowdsourcing approach to save time, cost and provide efficient supports to travelers. Our proposed approach consists of three parts as shown in (Fig. 1) such as, • Collection of users’ review data by Crowd Sourcing Method • Creating Ontology from Visitors’ Reviews • Mining Opinion from Visitors’ Reviews

34

N. Aye and T. T. Naing

Review Database

Crowd

Tourism Review

Travel Domain Ontology

Opinion Mining

Travel Update

Opinion

Travel Opinion Information

Fig. 1. Proposed system’s framework

4.1

Review Data Collection from Users by Crowd Sourcing Method

Review comments from travelers are crucial for travel domain. We need to evaluate opinions via their comments about hotels they stay, destinations they visited around, and so on. With technological advancement, travel information and its sources are collected within short period of time via crowd of people by using a collaborative approach due to the heterogeneity, volume and dynamic nature of resources. Crowdsourcing creates an avenue through which consumers can actively engaged in the work companies do. As the internet continues to bridge communication divides between consumers and businesses, crowdsourcing becomes more prevalent resource to be utilized. Crowdsourcing provides for instance knowledge about markets, for company. It includes three main categories, which are knowledge, resource, and funding focused [9]. 4.2

Creating Ontology from Visitors’ Reviews

Fig. 2. Ontology creation based on relational database

In order to manage information easily, MySQL database is created in which contents and ratings are extracted from travelers’ reviews by storing in the respective tables. We use Jena ontology tools for this work. First, we extract the table metadata from the database with java. Then, OntoClasses are created according to these meta data as shown in (Fig. 2). Here, primary keys and foreign keys of tables are marked and their

Ranking Based Opinion Mining in Myanmar Travel Domain

35

relations are recorded as every foreign key (FK) that refer to a Primary Key in other table will be mapped into two Object-Properties (mutually inverse) [3]. Next, an OntoModel is constructed using these OntoClasses. This OntoModel can then be stored as an ontology.owl file. 4.3

Mining Opinion from Visitors’ Reviews

After collecting and storing travelers’ review, Natural Language Processing (NLP) steps are carried out. In our proposed system, Apache OpenNLP is used. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and co-reference resolution. The Word polarity step is used to filter out non-slang words and words polarity scores in general (positive and negative), that we use SentiwordNet in our proposed system. The review texts are stored into review database according to their word scores as seen in (Fig. 3).

Fig. 3. Opinion mining process

The system provides review opinion by summarizing these word scores. According to their impacts upon opinion, our system emphasizes only on four types-of speeches; noun, verb, adjective, and adverb. Again, each review content can be partitioned as two part: title and body text. The score of them are calculated as the summation of all of their noun, adjective, verb, and adverb scores. For the score of whole comments, title score and text score are added again. Here, we can get three possibilities in these score; greater than 0, less than 0 and equal zero, which mean the senses of comment positive, negative and natural. But, some travelers are not from the English spoken countries. So, they have some difficulties in language fluency and cannot express their opinion by writing comment. To fill up this gap, our system did not rely only on the review content. We intend to add the ratings 1 to 5 (1 is worse; 2 is not good; 3 is natural; 4 is good; 5 is excellent) for some important factors (such as service, cleanliness, location, friendliness, and facilities for hotel) that have impacts on the opinion determination. These ratings are averaging and determined positive, negative, and natural as greater than 3, less than 3 and equal to 3.

36

N. Aye and T. T. Naing

For final opinion determination, positive, negative, and natural senses are represented as 1, −1, and 0. Then, our proposed opinion determination is as the below equation. Opinion Mining for the whole review ð100%Þ ¼ ð0:4  Overall Comment scoreÞ þ ð0:6  All Ratings ScoreÞ

ð1Þ

5 Evaluation Results The performance of the proposed system should be evaluated for understanding the effectiveness of our system by measuring accuracy, precision, recall and F-measure as in [2]. As for experiments, we use 300 review comments and ranks collected from the travelers of various countries with different purposes of travel with diverse opinions. We first divide 10 partitions on these data. Then, we process these data by additive manner, that is, first tenfold is added to second tenfold and process again and so on. To check whether the comments in reviews means really positive, negative, or neutral as they would like to say, we apply the expert judgments as admin. As for comparison, we apply the opinion mining methods of without using ranking and with ranking. Without ranking method only rely on the review comments for extracting opinions. With ranking method apply ranks of 1 to 5 on some important factors of travel industry like service, cleanliness, and location of hotels; seasonal, historic, and safely of travel destination, etc. According to the experiments results shown in (Fig. 4), with ranking method has more precision (1.0), recall (0.9875), F-measure (0.9939), and accuracy (0.97) over without ranking method; precision (0.9294), recall (0.9518), F-measure (0.9404), Accuracy (0.8854). This is because people cannot solely express their opinions by writing reviews comment as what they want to say. They can more expressive when using ranking.

1

1

0.95

0.95

0.9

0.9

0.85

0.85

0.8

0.8

Precision

Recall

Precision

Recall

F_Measure

Accuracy

F_Measure

Accuracy

Fig. 4. Experimental results of (a) without ranking method (b) with ranking method

Ranking Based Opinion Mining in Myanmar Travel Domain

37

6 Conclusion In this paper, we introduced the construction of ranking based opinion mining for Myanmar Travel domain. First, we develop ontology for opinion mining based on the relational database schema. Then, opinions on review comments are extracted with the help of SentiWordNet scores. Some senses of opinions can be omitted because of this NLP tool, and we try to overcome this fact by adding ranking methodology in this approach. According to the experimental results we have made, the effectiveness of with ranking method is proved. As for further extension, we intend to apply our method in other domain – especially entertainments - such as movies recommendations, food and other recommendations, and so on to support the growth of businesses as well as to create reliable system for consumers.

References 1. Bakka, J., Bahaj, M.: Generating of RDF graph from a relational database using Jena API. Int. J. Eng. Technol. (IJET) 5(2), 1970–1975 (2013). ISSN 0975-4024 2. Baranikumar, P., Gobi, N.: Feature extraction of opinion mining using ontology. Int. J. Adv. Comput. Electr. Eng. 1(1), 18–22 (2016) 3. Boumlik, A., Bahaj, M.: Advanced set of rules to generate ontology from relational database. J. Softw. 11(1), 27–43 (2016) 4. Bucur, C.: Using opinion mining techniques in tourism. Proc. Econ. Finance 23, 1666–1673 (2015) 5. Chinsha, T.C., Joseph, S.: Aspect based opinion mining from restaurant reviews. In: Advanced Computing and Communication Techniques for High Performance Applications (ICACCTHPA-2014) (2014). ISSN 0975–8887. Int. J. Comput. Appl. 6. Khruahong, S., Kong, X., Hoang, D.: Ontology design for Thailand travel industry. Int. J. Knowl. Eng. 1(3), 191–196 (2015) 7. Liu, B.: Sentiment Analysis and Opinion Mining. Morgan & Claypool Publishers, San Rafael (2012) 8. Shepelenko, O.: Opinion mining and sentiment analysis using Bayesian and neural networks approaches. Master thesis, University of Tartu, Institute of Computer Science (2017) 9. Sivula, A., Kantola, J.: Ontology focused crowdsourcing management. In: 6th International Conference on Applied Human Factors and Ergonomic and the Affiliated Conferences, pp. 632–638 (2015) 10. Somantri, O., Apriliani, D.: Opinion mining on culinary food customer satisfaction using Naïve Bayes based-on hybrid feature selection. Indonesian J. Electr. Eng. Comput. Sci. 15 (1), 468–475 (2019) 11. Vidushi, Sodhi G.S.: Sentiment mining of online reviews using machine learning algorithms. Int. J. Eng. Devel. Res. IJEDR 5(2), 1321–1334 (2017) 12. World Travel and Tourism Council: Travel and Tourism Economic Impact 2019 (2019) 13. Travel & Tourism Economic Impact (2017). https://www.wttc.org/-/media/files/reports/ economic-impact-research/regions-2017/world2017.pdf. Accessed 30 Sept 2017

Recent Advances on Evolutionary Optimization Technologies

Energy Efficient Container Consolidation Method in Cloud Environment Based on Heuristic Algorithm Linlin Tang(&) and Yao Meng Harbin Institute of Technology, Shenzhen, China [email protected]

Abstract. While container-based clouds are increasingly gaining popularity, minimizing the power consumption of the data center is still one of the major challenges for cloud providers. Dynamic consolidation of containers presents a significant opportunity to save energy in data centers and there are researches use optimization metaheuristic algorithm to find a near-optimal solution for this problem. However, the computation time and complexity of such algorithms increase exponentially with the number of containers. In this paper, a heuristic dynamic container consolidation method which consists of four algorithms that solve the problems in different stages of container consolidation has been proposed. We migrate redundant containers from the hosts before they overload and place them to other host to guarantee QoS requirements. An adaptive reserved resources to prevent re-overload of hosts has also been applied. Experimental results demonstrate that our proposed approach can lead to further energy saving with QoS guarantees compared with some existing approaches. Keywords: Container-based cloud

 Energy consumption  Heuristic

1 Introduction With the cloud computing increasingly gaining popularity for its elasticity, availability, and scalability, a significant challenge for the growth of capacity of cloud data centers is the high energy consumption of these large-scale infrastructures. High energy consumption not only translates to a high operating cost, but also leads to higher carbon emissions [1]. In the last few years, a new Cloud computing paradigm based on containers has gradually emerged as a flexible and efficient approach for energy efficient resource sharing. The container-based virtualization is tremendously accelerated by the advance of Docker, whose enterprise adoption has doubled to 27% in 2017 from 13% in 2015 [2]. Unlike VMs, containers are lightweight in nature and enable high-density deployment, essentially reducing the thirst for large quantities of physical machines (PMs). As illustrated in Fig. 1, CaaS services are usually provided on top of IaaS’ virtual machines (VMs). A recent study [3] show that VM-Container configurations obtain close to, or even better performance, than native Docker (container) deployments. © Springer Nature Singapore Pte Ltd. 2020 J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 41–49, 2020. https://doi.org/10.1007/978-981-15-3308-2_5

42

L. Tang and Y. Meng

Fig. 1. Container as a service (CaaS) model [5]

For the purpose of reducing energy consumption, Cloud providers tend to consolidate the execution of multiple VMs on the same physical host, i.e. VM consolidation [4]. Similarly, container consolidation is to consolidate multiple containers to a set of VMs which are then consolidated to a set of PMs so that the overall utilization of both VMs and PMs is maximized. Like any consolidation solution, our framework of the consolidation problem is divided in three stages [5]. Firstly, it should identify the situations in which container migration should be triggered. Secondly, it should select a number of containers to migrate in order to resolve the situation. Finally, it should find migration destinations (host/VM) for the selected containers. The rest of the paper is organized as follows. Section 2 introduces the related work. Section 3 presents system objective and problem formulation. Section 3.1 presents the LRMAD algorithm. Section 4 discusses the experiment results. Finally, Sect. 5 concludes the work and possible future directions.

2 Related Work In recent years, a few heuristic and metaheuristic approaches have been proposed to solve the energy-aware container consolidation problem. Piraghaj et al. [5] proposed a framework of container consolidation with several heuristic algorithms to tackle the consolidation problem. They applied a static threshold algorithm in both overload and underutilization detection and a correlation calculation method “MCor” to select containers for migration or hosts for destination. The results show that their algorithms can improve the energy consumption while maintain a low level of SLA violation. However, the static threshold may not adapt to cloud environment with different kinds of workload. Mann et al. [6] propose a greedy algorithm, named COMBINED, to the problem of energy-aware container consolidation. The basic idea of the COMBINED algorithm is considering all fitted PMs and their VMs as possible hosts for one incoming container, then the consolidation decision is made based on First Fit Decreasing (FFD). At the same time, power consumption, sizing aspects, colocation constraints, license costs and hardware affinity relations are taken into account. Tao et al. [7] proposed a Two-stage Multi-type Particle Swarm Optimization (TMPSO)

Energy Efficient Container Consolidation Method in Cloud Environment

43

model to solve container consolidation problem. The results show that this PSO-based algorithm is able to provide solutions in a more energy-efficient way. But both of above two algorithms only implemented the initial allocation of the containers, which lack the dynamic optimization of container consolidation. A Non-dominated Sorting Genetic Algorithm II (NSGA-II) based approach is proposed in [8]. They consider container consolidation as one multi-objective optimization problem with the objectives of minimizing the total energy consumption and the total number of container migrations within the certain period of time. The widely used benchmark dataset shows that their method can find a solution with less energy consumption and fewer container migrations. But they have not discussed the computational overhead of the genetic algorithm, which is non negligible when the scale of the problem is growing large. As we mentioned above, existing works either study the initial allocation of the containers, or based on some optimization metaheuristic algorithms have significant drawback in computational overhead. So, there is a need of finding effective approach to resolve container consolidation problem in more real-world scenarios. Data Center Power Model The energy consumption model of a physical machine PM we use is fully compatible with VM-based energy model proposed by Blackburn [9]. The power consumption of the data center at time t is calculated as follows: Pdc ðtÞ ¼

X Ns i¼1

Pt ðtÞ

ð1Þ

According to Blackburn [9], the energy consumption is linearly related to the utilization of the CPU of the PMs. Ui;t is the CPU utilization of server i and the power consumption of the server is estimated through Eq. (2).  Pdc ðtÞ ¼

   Ui;t Pidle þ Pmax  Pidle i i i 0 Nvm ¼ 0

Nvm [ 0

ð2Þ

The energy efficiency of the consolidation algorithms is evaluated based on the data center energy consumption obtained by Eq. (1). SLA Metric In order to simplify the definition of the SLA metric while we do not have any knowledge of the applications running inside the containers, SLA is violated only if the virtual machine on which the container is hosted on do not get the required amount of CPU that it requested. In this respect, the SLA metric is defined as the fraction of the difference between the requested and the allocated amount of CPU for each VM (Equation (4) [13]).     XNs XNvm XNv CPUr vmj;i ; tp  CPUa vmj;i ; tp   SLA ¼ i¼1 j¼1 p¼1 CPUr vmj;i ; tp

ð3Þ

44

L. Tang and Y. Meng

Problem Formulation In order to minimize the power consumption of a data center with M containers, N VMs and K servers, we formulate the problem as follows:   X Ns min Pdc ðtÞ ¼ P ð t Þ i i¼1

ð4Þ

Considering the following constraints: XNvm

Uvmj;i ðtÞ\Sði;rÞ ; 8i 2 ½1; Ns ; 8r 2 fCPU g

ð5Þ

vmðj;i;rÞ \Sði;rÞ ; 8i 2 ½1; Ns ; 8r 2 fBW; Memory; Disk g

ð6Þ

Ucðk;j;iÞ ðtÞ\vmðj;i;rÞ ; 8j 2 ½1; Nvm ; 8i 2 ½1; Ns ; 8r 2 fCPU g

ð7Þ

j¼1

XNvm j¼1

X Nc k¼1

XNc

c \vmðj;i;rÞ ; 8j k¼1 ðk;j;i;rÞ

2 ½1; Nvm ; 8i 2 ½1; Ns ; 8r 2 fBW; Memory; Disk g

ð8Þ

3 Proposed Algorithm 3.1

Host Overload Detection Algorithm

We use Simple Linear Regression(SLR) to predict future workload of a host in order to prevent the deterioration of the service level. SLR is a regression analysis of the dependence of independent and dependent variables on a set of statistical data which is used to predict trends in data changes [14]. The best fitting line which is called regression function shown in Eq. (9). ybi ¼ b0 þ b1 xi

ð9Þ

Where b0 , b1 represent the intercept and slope of the line, respectively, and ybi is the predicted value of the regression equation for the independent variable xi . When we use Eq. (9) to predict the actual response, the residual error can be calculated as: ei ¼ yi  ybi

ð10Þ

One way to find the best fitting line is to use the least squares criterion method [14]. Therefore, the Q value should be minimized: Q¼

Xn i¼1

ðyi  ybi Þ2

ð11Þ

When Eqs. (12) and (13) are satisfied, the values of b0 and b1 can be calculated by Eqs. (14) and (15).

Energy Efficient Container Consolidation Method in Cloud Environment

45

Xn @Q ¼ 2 ð y  b1 x i  b0 Þ ¼ 0 i¼1 i @b0

ð12Þ

Xn @Q ¼ 2 ðy  b1 xi  b0 Þxi ¼ 0 i¼1 i @b1

ð13Þ

b0 ¼ y  b1x Pn ðxi  xÞðyi   yÞ b1 ¼ i¼1 Pn 2 xÞ i¼1 ðxi  

ð14Þ ð15Þ

Where x, y are the means of the observations xi and yi . In terms of container consolidation, xi represents the time period, yi is the CPU usage for the corresponding time period, and ybi is the predicted value of CPU utilization for the next phase. If the predicted value of one host satisfies Eq. (16), the host will be considered as overloaded. s  ybi  1

ð16Þ

Where s is the safety factor and is used to reserve a certain amount of CPU capacity to avoid SLA violations. Host Selection Algorithm The idea of the host selection algorithm is to reserve a few CPU capacities to handle the fluctuations in the workload. The capacity of the CPU reserved is calculated by the method of Median Absolute Deviation (MAD). The MAD is a measure of statistical dispersion and is more robust than the standard deviation or sample variance [15]. For a set of data sets h ¼ fh1 ; h2 ; . . .hn g, the median absolute deviation is defined as follows:   ð17Þ MADðhÞ ¼ mediani ðhi  medianj ðhj ÞÞ where, h represents the host, hi represents the CPU utilization history of the i-th time period of the host, and the reserved CPU utilization hs which prevents the host reoverload is calculated by Eq. (19): hs ¼ l  MADðhÞ

ð18Þ

where l is the coefficient that adjusts the amount of reserved resources and was set to 5.5 based on research [16]. The available CPU resource of host hau is calculated as: hau ¼ 1  hs  hu

ð19Þ

where hu is the current CPU utilization. After the container c is allocated to the host h, the CPU utilization increase of the host h is Dhu :

46

L. Tang and Y. Meng

Dhu ¼

cu  cmc hmc

ð20Þ

where cu represents the current CPU utilization of the container, cmc represents the total amount of CPU requested by the container, and hmc is the total CPU capacity of the host. u represents the remaining CPU utilization of the host after the container c is allocated to the host h, and is calculated by Eq. (21). u ¼ hau  Dhu

ð21Þ

We need to fill up the host with containers as much as possible while keeping it not overloaded. So we will traversing the candidate host list and choose the one with the minimum u while u [ 0: Container Selection Algorithm This algorithm is responsible for selecting a number of containers to migrate from the over-loaded hosts so that the host is no longer over-loaded. We adopt the Maximum Usage(MU) policy which is proved to result in less energy consumption, fewer container migrations and SLA violations in ContainerCloudSim [5]. MU policy selects the container that has the maximum CPU utilization and added to the migration list. Host Underutilization Detection Algorithm We apply a simple approach for determining underutilized hosts [13]. First, we select all the overloaded hosts by overload detection algorithm. Then, the system finds the host with the minimum utilization compared with the other hosts and tries to place the containers from this host on other hosts while keeping them not overloaded. If this can be accomplished, the containers are set for migration to the determined target hosts, and the source host is switched off once all the migrations have been completed. If there are containers cannot be placed on other hosts, the host is kept active. This process is iteratively repeated for all hosts that have not been considered as being overloaded.

4 Experimental Results We investigate the impact of the algorithms on the system performance and data center energy consumption by ContainerCloudSim, a container cloud simulation software. Each experiment is repeated 30 times and results are compared considering three metrics, namely SLA violations, energy consumption, container migrations during 3 to 24 h’ simulation period. To evaluate the performance of our proposed approach, we compare the three metrics mentioned before with a static threshold heuristic algorithm called Least Full Host Selection (LFHS) algorithm [5] and a NSGA-II based algorithm [8] which both have state-of-the-art performance. The experimental results are shown in Tables 1, 2 and 3. We named our algorithm LRMAD. The previous value is the mean of 30 rounds of experiment and the latter value is the standard deviation. Compared with the static threshold algorithm LFHS, the LRMAD algorithm has a significant effect on energy saving, which can reach about

Energy Efficient Container Consolidation Method in Cloud Environment

47

12%, close to the performance of the intelligent algorithm NSGA II. It is well known that intelligent algorithms have relatively large defects in computational overhead, which will increase exponentially as the scale of the problem increases while our heuristic-based algorithm can always get an acceptable solution in a reasonable time. The SLA violation data was not provided in the paper of NSGA II algorithm. However, by comparing with the LFHS algorithm, it can be seen that the SLA violation of the LRMAD algorithm is always higher than that of the LFHS, which means that the more aggressive consolidation algorithm leads to a decline in the service level to some extent.

Table 1. Energy consumption (KWh) Hours 3 6 9 12 15 18 21 24

LRMAD 19.43 ± 0.21 32.39 ± 0.87 43.45 ± 1.04 53.73 ± 1.85 61.76 ± 2.08 71.34 ± 1.80 79.09 ± 2.02 85.58 ± 2.53

LFHS 20.9 ± 35.53 ± 48.75 ± 60.53 ± 71.06 ± 80.74 ± 89.63 ± 96.64 ±

0.21 0.8 1.39 1.45 2.34 2.42 2.63 2.93

NSGA II (min-Energy) 15.82 ± 0.19 25.33 ± 0.35 35.21 ± 0.45 45.08 ± 0.6 55.2 ± 0.71 65.28 ± 0.83 74.71 ± 0.99 82.9 ± 1.36

NSGA II (min-Migration) 16.9 ± 0.09 25.67 ± 0.26 35.71 ± 0.42 45.88 ± 0.56 55.97 ± 0.63 66.03 ± 0.74 75.97 ± 0.87 83.62 ± 1.18

Table 2. Container migrations Hours LRMAD LFHS NSGA II (min-Energy) NSGA II (min-Migration) 3 4091 ± 191 4263 ± 173 3343 ± 115 3315 ± 56 6 5886 ± 223 6159 ± 167 4603 ± 168 4561 ± 76 9 7205 ± 325 7411 ± 225 5604 ± 265 5552 ± 125 12 8181 ± 371 8443 ± 341 6734 ± 296 6679 ± 149 15 8689 ± 486 9344 ± 396 7837 ± 314 7758 ± 148 18 9250 ± 543 10256 ± 403 8968 ± 406 8866 ± 198 21 9877 ± 538 11165 ± 468 10418 ± 429 10334 ± 210 24 10158 ± 663 11979 ± 540 11256 ± 492 11178 ± 253

Table 3. SLA violation Hours LRMAD LFHS 24 0.031 0.008

48

L. Tang and Y. Meng

5 Conclusion A heuristic-based container consolidation algorithm under the VM-Container configuration has been proposed in this paper. The approach aims to minimize the energy consumption and the SLA violation. A thorough experimental evaluation using the well-known benchmark dataset and simulator shows that our proposed method can find container consolidation solutions with less energy consumption and a small increase in SLA violation comparing with recent proposed approaches. As future work, we will improve the underutilization detection and container selection algorithms to reduce computational overhead and decrease SLA violation. Acknowledgement. This work was supported by Shenzhen Science and Technology Plan under grant number JCYJ20180306171938767 and the Shenzhen Foundational Research Funding JCYJ20180507183527919.

References 1. Zheng, K., Wang, X., Li, L., Wang, X.: Joint power optimization of data center network and servers with correlation analysis. In: Proceedings of the 2014 IEEE International Conference on Computer Communications (INFOCOM), pp. 2598–2606. IEEE (2014) 2. Brown, R.: Running containers on bare metal vs. VMs: performance and benefits (2017). https://www.stratoscale.com/blog/containers/running-containerson-bare-metal/ 3. Ali, Q.: Scaling web 2.0 applications using docker containers on vsphere 6.0 (2015). http:// blogs.vmware.com/performance/2015/04/scaling-web-2-0-applications-using-dockercontainers-vsphere-6-0.html 4. Corradi, A., Fanelli, M., Foschini, L.: VM consolidation: a real case based on openstack cloud. Future Gener. Comput. Syst. 32, 118–127 (2014) 5. Piraghaj, S.F., Dastjerdi, A.V., Calheiros, R.N., et al.: A framework and algorithm for energy efficient container consolidation in cloud data centers. In: 2015 IEEE International Conference on Data Science and Data Intensive Systems, pp. 368–375. IEEE (2015) 6. Mann, Z.Á.: Resource optimization across the cloud stack. IEEE Trans. Parallel Distrib. Syst. 29(1), 169–182 (2017) 7. Shi, T., Ma, H., Chen, G.: Energy-aware container consolidation based on PSO in cloud data centers. In: 2018 IEEE Congress on Evolutionary Computation (CEC), pp. 1–8. IEEE (2018) 8. Shi, T., Ma, H., Chen, G.: Multi-objective container consolidation in cloud data centers. In: Australasian Joint Conference on Artificial Intelligence, pp. 783–795. Springer, Cham (2018) 9. Blackburn, M., Grid, G.: Five ways to reduce data center server power consumption. Green Grid 42, 12 (2008) 10. Sharma, N.K., Reddy, G.R.M.: Multi-objective energy efficient virtual machines allocation at the cloud data center. IEEE Trans. Serv. Comput. 12(1), 158–171 (2016) 11. Svärd, P., Li, W., Wadbro, E., et al.: Continuous datacenter consolidation. In: 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), pp. 387–396. IEEE (2015) 12. Fan, X., Weber, W.D., Barroso, L.A.: Power provisioning for a warehouse-sized computer. ACM SIGARCH Comput. Archit. News 35(2), 13–23 (2007)

Energy Efficient Container Consolidation Method in Cloud Environment

49

13. Beloglazov, A., Buyya, R.: Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in cloud data centers. Concurr. Comput.: Pract. Exp. 24(13), 1397–1420 (2012) 14. Li, L., Dong, J., Zuo, D., Wu, J.: SLA-aware and energy-efficient VM consolidation in cloud data centers using robust linear regression prediction model. IEEE Access 7, 9490–9500 (2019) 15. Huber, P.J.: Robust Statistics. Wiley, New York (1981) 16. Beloglazov, A., Abawajy, J., Buyya, R.: Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Gener. Comput. Syst. 28 (5), 755–768 (2012) 17. Park, K., Pai, V.S.: CoMon: a mostly-scalable monitoring system for PlanetLab. ACM SIGOPS Oper. Syst. Rev. 40(1), 65–74 (2006) 18. Fotuhi Piraghaj, S.: Energy-efficient management of resources in container-based clouds. Ph. D. dissertation (2016)

An Improved Bat Algorithm Based on Hybrid with Ant Lion Optimizer Thi-Kien Dao1, Shu-Chuan Chu2, Jeng-Shyang Pan1,2, Trong-The Nguyen1,3(&), Truong-Giang Ngo3, Trinh-Dong Nguyen3, and Huu-Trung Tran3 1

Fujian Provincial Key Lab of Big Data Mining and Applications, Fujian University of Technology, Fujian, China [email protected], [email protected], [email protected] 2 College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China [email protected] 3 Department of Information Technology, University of Manage and Technology, Haiphong, Vietnam {vnthe,giangnt,dongnt,trungth}@hpu.edu.vn

Abstract. Bat Algorithm (BA) is one of the fundamental algorithms for solving optimization problems. However, the BA still exists weaknesses in terms of exploitation and exploration. In this paper, an enhancing capability of exploration and exploitation for BA by hybridizing BA with Ant Lion Optimizer (ALO) is proposed for the global optimization problems. In the experimental section, several benchmark functions are used to test the performance of the proposed approach. Compared results with other algorithms literature show that the proposed method provides a new competitive algorithm. Keywords: Improved Bat algorithm  Optimization  Ant Lion Optimizer  Bat algorithm

1 Introduction The metaheuristic algorithm is one of the promising solutions to challenging optimization problems [1]. Metaheuristic algorithms can be an efficient way to produce acceptable solutions by trial and error to a complex problem in a reasonably practical time [2]. The balance of some local searches and some global exploration need to be fixed in the metaheuristic algorithms [3, 4]. The variety of solutions is often figured out via randomization that is an evolving collection of methodologies, which aims to exploit tolerance for imprecision, uncertainty, and partial truth to achieve robustness, tractability, close to the human mind, and low cost [5, 6]. The interest of problem complexity makes it impossible to search for possible solutions that aim is to find an excellent feasible solution in an acceptable timescale from the combination. There is no guarantee that the best solutions figured out as expected quality [7]. However, metaheuristic algorithm is proving robust in delivering optimal global solutions [8]. © Springer Nature Singapore Pte Ltd. 2020 J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 50–60, 2020. https://doi.org/10.1007/978-981-15-3308-2_6

An Improved Bat Algorithm Based on Hybrid with Ant Lion Optimizer

51

The Bat algorithm (BA) [9] is one of the most successful metaheuristic algorithms regarding optimization that has some deficiencies in utilization and exploration. On the other hand, the results of the Ant lion optimizer (ALO) [10] show that it has a stronger operator in exploring and exploiting that can pose as an appropriate method for the BA deficiency. In this paper, the natural processes of ALO with exploring and exploiting would be applied for improving the utilization and exploration of BA. The best individual considered as a light source that some closest individuals are always willing to fly in the form of the Lévy flight in their locations. The rest of this paper is organized as follows. Section 2 describes the standard version of BA and ALO. Section 3 presents the improvement of BA algorithm. Section 4 presents the experimental results. Section 5 draws a conclusion.

2 Related Work 2.1

Bat Algorithm

The BA is a metaheuristic optimization algorithm that the bats operate according to the three criteria of reflection of the sound and the pulse rate of emission and sound loudness [9]. The local search performed with the “random walk” scheme. Selecting the best mode will continue until it reaches one of the best stop conditions. Bats spread short pulses to find prey while they move. Of course, it should be noted that the time of approaching the victim, bats increase their emitted pulse rate and reset the frequency. These feature of sound reflection in the BA modeled as following three laws. • All bats use the reflection of sound to calculate distance. Detecting the difference between food (and)/prey and the obstacles in the path is part of the extraordinary capabilities of bats. • The bats fly randomly with velocity speed vi at position xi , with constant frequency fmin , with various wavelengths k and loudness of voice A0 to search for bait. They can automatically adjust the emitted wavelength (or frequency) and the released pulse rate [0, 1] r 2 according to the proximity of the prey. • Although the volume of the sound can be different, however, it is assumed that the volume of sound is changing from a positive A0 to a minimum value of Amin . For each bat (i), a position xi and the velocity vi are considered in a d-dimensional search space, and these values are updated subsequently. fi ¼ fmin þ ðfmax  fmin Þ  b 

vti ¼ vt1 þ xt1 i x

 

 fi

ð1Þ ð2Þ

where b is a random number in the range [0, 1] as the result of a uniform distribution. Figure 1 summarizes the steps of the BA as the pseudo-code.


Fig. 1. Pseudo-code of the bat algorithm:

1. Objective function f(x), x = (x1, …, xD)
2. Initialize the bat population xi (i = 1, 2, …, n) and vi
3. Define pulse frequency fi at xi
4. Initialize pulse rates ri and the loudness Ai
5. While (t < maximum number of iterations)
6.   Generate new solutions by adjusting frequency and updating velocities and locations
7.   If (rand > ri)
8.     Select a solution among the best solutions
9.     Generate a local solution around the selected best solution
10.  End if
11.  Generate a new solution by flying randomly
12.  If (rand < Ai and f(xi) < f(x*))
13.    Accept the new solution
14.    Increase ri and reduce Ai
15.  End if
16. End while; rank the bats and output the current best x*

$$h(x) > 0 \qquad (6)$$

In the formula, $f_1(x)$ indicates the total grounding cost of the TN-C system and $f_2(x)$ the contact voltage.

Method for Calculating Grounding Quantity of TN-C System

2.2 Contact Voltage Calculation

A typical TN-C system fault equivalent diagram is shown in Fig. 1.

Fig. 1. TN-C system fault equivalent diagram

It can be seen from Fig. 1 that the sequence impedances of the system, seen from the point of failure, are:

$$z_{R(1)} = X_{T(1)} + X_{l(1)} \qquad (7)$$

$$z_{R(2)} = X_{T(2)} + X_{l(2)} = z_{R(1)} \qquad (8)$$

$$z_{R(0)} = X_{T(0)} + X_{l(0)} + 3Z_{eq} = X_{T(1)} + 3X_{l(1)} + 3Z_{eq} \qquad (9)$$

In these formulas, $X_{T(1)}$, $X_{T(2)}$, $X_{T(0)}$ are the per-unit positive-, negative-, and zero-sequence equivalent reactances of the transformer; $X_{l(1)}$, $X_{l(2)}$, $X_{l(0)}$ are the per-unit positive-, negative-, and zero-sequence reactances of the line; and $Z_{eq}$ is the equivalent impedance. For the typical system described above, the symmetrical component method applies: when a short-circuit fault occurs, the human body contact voltage $\dot{U}_C$ is obtained from the fault voltage $\dot{U}_f$ through the voltage-divider relations among $Z_{eq}$, $Z_g$, $Z_{(1)}$, $Z_{(2)}$, $X_{PEN}$, and $R$, as given in Eq. (10).
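A trivial sketch of Eqs. (7)–(9) follows (not from the paper; the per-unit values in the example call are illustrative assumptions):

```python
def sequence_impedances(X_T1, X_l1, X_T0, X_l0, Z_eq):
    """Per-unit sequence impedances seen from the fault point, Eqs. (7)-(9)."""
    z_R1 = X_T1 + X_l1             # positive sequence, Eq. (7)
    z_R2 = z_R1                    # negative sequence equals positive, Eq. (8)
    z_R0 = X_T0 + X_l0 + 3 * Z_eq  # zero sequence, Eq. (9)
    return z_R1, z_R2, z_R0

# Illustrative per-unit values (assumed)
print(sequence_impedances(X_T1=0.04, X_l1=0.08, X_T0=0.04, X_l0=0.24, Z_eq=0.1))
```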
Research and Simulation Analysis of Brake Energy Recovery Control Strategy

1) Braking torque

Ideally, when the braking force of the driven axle is provided entirely by the motor and the influence of the rotating parts is ignored, the longitudinal dynamic equation of the vehicle is:

$$mgz - F_w - F_f = F_m + F_{zr} z \qquad (1)$$

where $F_{zr}$ is the rear-axle vertical load:

$$F_{zr} = \frac{mg(a - z h_g)}{L} \qquad (2)$$

$F_f$ and $F_w$ are the rolling resistance and air resistance, respectively:

$$F_f = mgf \qquad (3)$$

$$F_w = \frac{C_D A v^2}{21.15} \qquad (4)$$

and $F_m$ is the motor braking force:

$$F_m = \frac{T_m i_g i_0 \eta_t}{R} \qquad (5)$$

The theoretical optimal braking strength is obtained from Eqs. (1)–(5):

$$z_{opt} = \frac{-(L - a) + \sqrt{(L - a)^2 + \dfrac{4 h_g L (F_f + F_w + F_m)}{mg}}}{2 h_g} \qquad (6)$$
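For concreteness, here is a small Python sketch (not from the paper) evaluating Eqs. (3)–(6); the axle distance, CG height, rolling-resistance coefficient, speed, and motor force in the example call are illustrative assumptions, while the mass, wheelbase, frontal area, and drag coefficient come from Table 2 below.

```python
import math

def optimal_braking_strength(m, a, L, h_g, f, C_D, A, v_kmh, F_m, g=9.81):
    """Theoretical optimal braking strength z_opt from Eqs. (3)-(6).

    m: mass [kg]; a: CG-to-axle distance [m]; L: wheelbase [m];
    h_g: CG height [m]; f: rolling-resistance coefficient;
    C_D, A: drag coefficient and frontal area [m^2];
    v_kmh: speed [km/h]; F_m: motor braking force [N].
    """
    F_f = m * g * f                       # Eq. (3): rolling resistance
    F_w = C_D * A * v_kmh ** 2 / 21.15    # Eq. (4): air resistance (v in km/h)
    disc = (L - a) ** 2 + 4 * h_g * L * (F_f + F_w + F_m) / (m * g)
    return (-(L - a) + math.sqrt(disc)) / (2 * h_g)   # Eq. (6)

# Illustrative call with assumed a, h_g, f, v and F_m
print(optimal_braking_strength(m=1200, a=1.2, L=2.467, h_g=0.55,
                               f=0.012, C_D=0.248, A=1.97,
                               v_kmh=60, F_m=2000))
```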

3 Control Strategy and Vehicle Modeling

3.1 Braking Energy Recovery Control Strategy Modeling

When the car is driving, the control strategy first determines which mode the system is in. In drive mode, the input signal is assigned to the motor and the strategy exits. In braking mode, the strategy checks whether the braking feedback condition is satisfied; if not, the brake is controlled mechanically and the strategy exits. When the feedback condition is satisfied, the wheel-side demand torque is calculated from the formula, and the motor braking demand torque is derived from the gear ratio and the efficiency. This demand torque is then limited by the vehicle speed, the acceleration, and the maximum recovery torque of the motor. Finally, the strategy checks whether the motor braking torque covers the total vehicle braking torque: if so, braking is performed entirely by energy recovery; if not, the remainder is supplemented by mechanical braking. The control flow chart is shown in Fig. 1, and a sketch of this decision logic follows the figure.


Fig. 1. Brake energy recovery control strategy flow chart
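A minimal Python sketch of the decision logic in Fig. 1 (an illustration, not the authors' Simulink model); the function name, the limit flags, and all numbers in the example call are assumptions.

```python
def brake_strategy(mode, wheel_torque_req, gear_ratio, efficiency,
                   motor_max_regen_torque, speed_limit_ok, accel_limit_ok):
    """Split a braking request between regenerative and mechanical braking.

    Returns (motor_torque, mechanical_torque): motor-side and wheel-side
    torques in N*m. The speed/acceleration limit flags are assumed
    precomputed upstream.
    """
    if mode == "drive":
        return 0.0, 0.0                        # drive mode: strategy exits

    # Braking feedback condition: speed/acceleration windows must hold
    if not (speed_limit_ok and accel_limit_ok):
        return 0.0, wheel_torque_req           # pure mechanical braking

    # Motor brake demand from gear ratio and driveline efficiency
    motor_demand = wheel_torque_req / (gear_ratio * efficiency)

    # Limit by the maximum recovery torque of the motor
    motor_torque = min(motor_demand, motor_max_regen_torque)

    # Supplement any shortfall with mechanical braking at the wheel
    mech_torque = max(0.0, wheel_torque_req
                      - motor_torque * gear_ratio * efficiency)
    return motor_torque, mech_torque

# Example: 800 N*m wheel demand, ratio 8.3, efficiency 0.92, 60 N*m cap
print(brake_strategy("brake", 800, 8.3, 0.92, 60, True, True))
```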

The Simulink control strategy model established according to the flow chart is shown in Fig. 2.

Fig. 2. Brake energy recovery control strategy

There are 11 inputs and 3 outputs in the control strategy. The data type at the inputs and outputs is double, in order to reduce rounding error. The sources of the input signals and the targets of the output signals are detailed in the vehicle model description below.

3.2 Vehicle Dynamics Modeling

This paper uses AVL Cruise to build the whole-vehicle model; the basic vehicle parameters are shown in Table 2. The Simulink control strategy is then compiled into a DLL file and imported through the MATLAB DLL interface in Cruise. The vehicle model is shown in Fig. 3.

Table 2. Vehicle basic parameters
Parameter            Value
Curb weight/kg       1200
Wheel base/mm        2467
Frontal area/m²      1.97
Rolling radius/mm    287
Drag coefficient     0.248

Fig. 3. Vehicle model

After the mechanical and electrical connections of the vehicle model are made in Cruise, the key step is the signal connection. To connect the individual signals properly, one must know the source of each input signal and the target of each output signal. Tables 3 and 4 describe each signal.

Table 3. Input signal (Cruise to Simulink)
Number  Name                     Source module
0       Load signal              Cockpit
1       EM_Max_Req_Torque        Electric machine
2       Velocity                 Vehicle
3       Brake_Pressure_Req       Cockpit
4       SOC                      Battery
5       Single ratio             Constants
6       iTR                      Constants
7       Transmission_efficiency  Constants
8       EM_actual_Torque         Electric machine
9       eBrake_Coeff_vel         General map
10      eBrake_Coeff_Accel       General map

Table 4. Output signal (Simulink to Cruise)
Number  Name                   Target module
0       EM_Load_signal         Electric machine
1       State                  Monitor
2       Remain_Brake_Pressure  Brake

4 Simulation Analysis

4.1 NEDC Cycle Condition Verification

The New European Driving Cycle (NEDC) was used to verify the energy recovery strategy. The vehicle speed profile over time is shown in Fig. 4.

Fig. 4. The characteristic of NEDC


After the pure electric vehicle completes one NEDC cycle, the simulation results for the motor torque are shown in Fig. 5: (a) without brake energy recovery, (b) with brake energy recovery.

Fig. 5. Comparison of simulation results

In (b), in contrast to (a), the motor produces negative torque, indicating that braking energy is being recovered.


To describe the feasibility of the braking energy recovery control strategy more intuitively, the electricity consumption per 100 km of the two cases is compared, as shown in Fig. 6.

Fig. 6. Electricity consumption per 100 km

In Fig. 6, (a) indicates no brake feedback, and (b) indicates brake feedback. The simulation results show that the power consumption of the braking energy recovery control strategy is less than the power consumption without braking feedback, which verifies the correctness of the control strategy.

5 Conclusion

Brake energy recovery is one of the important means of energy saving and emission reduction for new energy vehicles, and a correctly constructed control strategy is the key to realizing it. In this paper, the braking energy recovery control strategy is built in Simulink and the vehicle model in AVL Cruise. The joint simulation results show the feasibility of the control strategy.

References
1. Chen, Q.Z., He, R.: Motor and hydraulic braking force distribution in car regenerative braking system. J. Jiangsu Univ. 29(5), 394–397 (2008)
2. Kumar, C.N., Subramanian, S.C.: Cooperative control of regenerative braking and friction braking for a hybrid electric vehicle. Proc. Inst. Mech. Eng. Part D: J. Autom. Eng. 230(1), 103–116 (2015)
3. Yao, T.F.: Distributed MIMO chaotic radar based on wavelength division multiplexing technology. Opt. Lett. 40(8), 1631–1634 (2015)
4. Sun, D.X., Lan, F.C., Chen, J.Q.: Research on braking energy recovery strategy of four-wheel drive pure electric vehicle based on I-line braking force distribution. Autom. Eng. 35(12), 1057–1061 (2013)
5. Liu, W., Sun, F.C., He, H.W.: An integrated control strategy for the composite braking system of an electric vehicle with independently driven axles. Veh. Syst. Dyn. 54(8), 1031–1052 (2016)
6. Massimo, C.: Vehicle control via second-order sliding-mode technique. IEEE Trans. Ind. Electron. 55(11), 3908–3916 (2008)

Research on Coupling Vibration of Disk Crack and Shaft Crack of Rotor System Based on Finite Element Method

Peng-Fei Peng, Yi-Jui Chiu, and Xiao-Yun Li

School of Mechanical and Automotive Engineering, Xiamen University of Technology, No. 600, Ligong Road, Xiamen 361024, Fujian, China
[email protected]

Abstract. The authors explore the coupled vibration of a rotor system with disk and shaft cracks, building on the previous studies from Chiu and Huang (2008) to Chiu et al. (2019). The finite element method (FEM) is used to analyze the changes in the frequencies and modes in which the shaft and disk predominate. Both qualitative and quantitative overviews of several interesting phenomena are provided.

Keywords: ANSYS · Rotor systems · Coupled vibration · Disk crack · Shaft crack

1 Introduction

At high engine speed and heavy load, abnormal vibration in the rotor system may affect the normal working condition of the machine, and the effect of cracks is particularly important. It is therefore essential to study the crack propagation direction and crack depth under given conditions, and this remains a key research topic in academia and industry. Xie et al. [1] established an RBC model that considers the centrifugal force and the bending stress caused by the breathing effect and the rotation of the blade, and compared it with the OC and BC models to demonstrate the necessity and accuracy of the RBC model. Nabian and Hashemi [2] used fracture mechanics to study the influence of longitudinal and circumferential cracks on the torsional resonance frequency of shafts, and the reasons behind the phenomena. Foletti et al. [3] carried out FEM analysis and a series of experiments on turbine disks under three different load conditions, with the aim of reducing design errors and predicting disk life by deriving the disk singular integral equation. Datsyshyn and Rudavs'ka [4] derived the stress intensity factor of a disk subjected to a non-radial edge crack under uniform pressure and compared the result with the cases of radial and oblique edge cracks. In the authors' laboratory, Chiu and Huang [5] explored how a single blade with an open crack affects coupled vibration in a shaft-disk rotor with a single flexible blade, and Chiu and Huang [6] investigated how a mistuned blade affects the coupled vibration of the rotor. In 2019, Chiu et al. [7] expanded this work to the coupled vibration of a multi-disk system with blades carrying multiple cracks. The intention of this paper is to discuss the influence of disk and shaft cracks in a blade-disk-shaft rotor. Because the cracks and the disk flexibility are difficult to treat with analytic methods, only the finite element method is used.

2 Theoretical Analysis

The rotor system with disk and shaft cracks, comprising shaft, disk, and blades, is shown in Fig. 1. The actual geometry and material properties are given in Table 1.

Fig. 1. (a) The rotor system with shaft and disk crack (b) the stagger angle

Table 1. Rotor system material and geometric parameters
Shaft   Density (ρs)          7850 kg/m³
        Shear modulus Gs      75 GPa
        Shaft length Ls       0.6 m
        Radius rs             0.04 m
Disk    Density (ρd)          7850 kg/m³
        Young's modulus Ed    200 GPa
        Location Zd           0.3 m
        Outer radius rd       0.2 m
        Thickness hd          0.03 m
        Poisson's ratio ν     0.3
Blade   Density (ρb)          7850 kg/m³
        Young's modulus Eb    200 GPa
        Blade outer end rb    0.4 m
        Cross section Ab      1.2 × 10⁻⁴ m²
        Stagger angle β       60°
Crack   Depth ratio ε         0.1–0.8

Following references [5] and [7], which used the assumed mode method, the equations of motion in matrix notation are:

$$[M]\ddot{q} - \Omega[P]\dot{q} + \left([K^i] + [K^e] - \Omega^2 [K^{\Omega}] - [K^{cr}]\right) q = 0 \qquad (1)$$

Here $[M]$ is the mass matrix. The Coriolis term $\Omega[P]$ drives the eigenvalue bifurcation; $[P]$ vanishes when the disk is rigid, so it is the disk flexibility that causes the frequency bifurcation. $[K^i]$ is the stiffness arising from the initial stress due to rotation, and $[K^e]$ is the elastic stiffness at low rotational speed. The term $\Omega^2[K^{\Omega}]$ comes from the rotation of the rotor, and its softening effect becomes very obvious at high speed. $[K^{cr}]$ is the cracked-blade stiffness matrix. After the model with the disk and shaft cracks is meshed, the FE model is as shown in Fig. 2; the disk crack is 0.2 mm and the shaft crack 0.1 mm. The physical field is set to mechanical, the mesh size follows the dimensions of the different sections of the model, and the mesh quality is checked before the simulation of the overall geometric model is carried out. The six-blade rotor system model contains 27,488 elements and 132,958 nodes.

Fig. 2. (a) Mesh division of the rotor system; (b) disk crack; (c) shaft crack

3 Numerical Results

Figure 3 shows how the frequencies change as the shaft crack grows. The symbols 1s51 and 1s61 denote the first shaft-predominated natural frequencies of the five- and six-blade rotor systems, respectively. Figure 3 shows that once the shaft crack exceeds 40%, the frequencies drop markedly. Figure 4 shows the frequency changes as the disk crack grows. The symbols 1d51, 1d61, 1d52 and 1d62 denote the first and second disk-predominated natural frequencies of the five- and six-blade rotor systems. Figures 4(a) and (b) reveal a very interesting phenomenon: the second disk-predominated natural frequency drops until it approaches the first one when the disk crack reaches 70%/80% in the five-/six-blade system (Table 2). Figures 5 and 6 show the first and second disk-predominated modes of the five- and six-blade rotor systems with disk crack degrees from 0% to 80%. Two interesting phenomena are found. Consider first the first disk-predominated modes in Fig. 5(a): the disk mode is the (0, 0) mode when the disk crack degree is 0%, and as the crack degree increases, a localization phenomenon appears in the disk.

Fig. 3. Frequency change of the five-/six-blade rotor system with shaft crack: (a) scale 0–210 Hz, (b) scale 170–220 Hz

Fig. 4. Frequency change of the rotor system with disk crack: (a) five-blade, (b) six-blade

Table 2. Natural frequencies (Hz) of shaft disk, disk, and clamped blade
Component's n.f.                    ω₁       ω₂
Shaft disk (torsion) (w/o blades)   207.418  2645.690
Disk (shaft rigid)                  921.227  974.922
Clamped blade (shaft-disk rigid)    81.538   510.99

Figure 5(b) shows that the second disk-predominated modes exhibit no disk deformation when the disk crack degree is 0%; as the crack degree increases, the disk deforms, and at a crack degree of 40% the (0, 2) mode appears. The blades undergo transverse deformation in these modes. Figures 6(a) and (b) show a similar phenomenon, which is not repeated here. The first shaft-predominated modes of the five- and six-blade rotor systems with shaft crack degrees from 0% to 80% are shown in Figs. 7(a) and (b). When the shaft crack degree is 0%, the disk mode in the first shaft-predominated mode is likewise the (0, 0) mode; when the shaft crack degree increases to 50%, the (0, 1) mode appears.


Fig. 5. First and second disk-predominated modes of the five-blade rotor versus disk crack degree

Fig. 6. First and second disk-predominated modes of the six-blade rotor versus disk crack degree

Fig. 7. First shaft-predominated modes of the five-/six-blade rotor versus shaft crack degree


4 Conclusion

This paper explored the changes in modes and frequencies caused by disk and shaft cracks of different depths. Firstly, the second disk-predominated natural frequency drops until it approaches the first disk-predominated natural frequency. Secondly, as the disk crack degree increases, a localization phenomenon appears in the disk. Lastly, when the shaft crack degree increases to 50%, the mode changes from the (0, 0) mode to the (0, 1) mode. The authors believe these results should be useful for rotor system design.

Acknowledgements. This study is sustained by the Scientific Research Climbing Project of Xiamen University of Technology, No. XPDKT18016, and the Graduate Technology Innovation Project of Xiamen University of Technology, No. 40318027.

References
1. Xie, J.S., Zi, Y.Y., Zhang, M.Q., Luo, Q.Y.: A novel vibration modeling method for a rotating blade with breathing cracks. Sci. China 62, 333–348 (2019)
2. Nabian, M., Hashemi, H.N.: The effects of longitudinal and circumferential cracks on the torsional dynamic response of shafts. Condens. Matter 0503 (2018)
3. Foletti, S., Beretta, S., Scaccabarozzi, S., Rabbolini, S., Traversone, L.: Fatigue crack growth in blade attachment of turbine disks: experimental tests and life prediction. Mater. Perform. Charact. 2165–3992 (2015)
4. Datsyshyn, O.P., Rudavs'ka, I.A.: Nonradial edge crack in a circular disk. Mater. Sci. 47(4), 509–513 (2012)
5. Chiu, Y.J., Huang, S.C.: The influence of a cracked blade on rotor's free vibration. J. Vib. Trans. ASME 130(5), 054502 (2008)
6. Chiu, Y.J., Huang, S.C.: The influence of a mistuned blade's staggle angle on the vibration and stability of a shaft-disk-blade assembly. Shock Vib. 15(1), 3–17 (2008)
7. Yu, G.F., Chiu, Y.J., Yang, C.H., Sheng, J., Li, X.Y.: Exploration of coupled-vibration phenomena in multi-disc rotor with blades with multi-cracks. Adv. Mech. Eng. 11(4), 1–22 (2019)

Structure Design and Analysis of Elderly Electric Vehicle Based on Solar Energy

Qi-Chao Li, Yi-Jui Chiu, Guo-Wei Weng, and Hao-Da Huang

School of Mechanical and Automotive Engineering, Xiamen University of Technology, No. 600, Ligong Road, Xiamen 361024, Fujian, China
[email protected]

Abstract. This paper presents a simulation analysis of an elderly electric vehicle. The designed 3D model is imported into ANSYS, and the stiffness of the body structure is calculated by static analysis. In the modal analysis, the bending and torsion frequencies of the body are calculated separately. The simulation results show that the elderly electric vehicle will not be damaged under normal driving conditions.

Keywords: Solar energy · Elderly electric vehicle · Structural design · ANSYS

1 Introduction

Nowadays, as the elderly population grows, elderly electric vehicles have appeared on the market to meet the needs of older people for outdoor travel. Some manufacturers, seeking profit, simply convert two-wheeled electric vehicles into elderly scooters without any dedicated design for safety, stability, or comfort, which does not meet the physiological and psychological needs of the elderly. At present, scooters in the Chinese market rarely take the driving habits of the elderly into account. Therefore, it is of great significance and promise to design an elderly scooter that suits older users and conforms to the national standards for electric bicycles and electric wheelchairs. Solar-powered elderly electric vehicles are considered a promising direction: they offer low emissions, little pollution, a relatively simple structure, and light weight. Research in this field includes the following. Huang et al. [1] carried out a modification experiment on an existing electric tricycle, measured the experimental data, and on this basis selected the rear axle to design the frame structure of a solar four-wheeled vehicle. Firat [2] studied environmentally friendly electric vehicle charging and household solar photovoltaic hybrid systems based on machine learning, using Homer Pro and PVSOL software for the numerical studies. Zhou et al. [3] formulated a battery swap station (BSS) location and electric vehicle handling routing problem to determine the routing of electric fleets within a limited battery driving range together with the BSS locations. Kanchwala et al. [4] proposed a method of controlling the rate at which acceleration is achieved, ensuring enhanced ride comfort


for the elderly; the vehicle was modeled in CarSim and integrated with an MPC modeled in Simulink. Jiang et al. [5] used a Cyclone-series FPGA chip as the core of the control unit of an intelligent car, realizing monitoring and remote control according to the car's movement status and direction. Lauer et al. [6] noted that electric scooters are becoming more and more popular among the elderly and people with mobility problems. This paper analyzes the mechanical design of the elderly scooter body and performs static and modal analyses in ANSYS to check whether the body meets the strength and stiffness requirements.

2 Theoretical Model Analysis

This paper uses the safety factor to determine whether the maximum stress, i.e., the largest stress experienced by the body, lies within the allowable stress range of the material. In the static analysis, the allowable stress of the body is

$$[\sigma] = \frac{\sigma_b}{n} \qquad (1)$$

where $\sigma_b$ is the ultimate strength and $n$ is the safety factor. Owing to the complexity of the scooter body structure, it is simplified into a mathematical model, and the system equation is derived as

$$[M]\{\ddot{X}\} + [K]\{X\} = 0 \qquad (2)$$

The position vector $\{X\}$ is defined as $[D]\{u\}$, where $[D]$ is the modal matrix of the system. Equation (2) then becomes

$$[I]\{\ddot{u}\} + [A]\{u\} = 0 \qquad (3)$$

in which:

$$[D]^T [M] [D] = [I] = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & 1 \end{bmatrix} \qquad (4)$$

$$[D]^T [K] [D] = [A] = \begin{bmatrix} \omega_1^2 & 0 & \cdots & 0 \\ 0 & \omega_2^2 & \cdots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & \omega_n^2 \end{bmatrix} \qquad (5)$$


The natural frequency of the mistuned system is expressed as

$$\omega_n = \bar{\omega}_n \sqrt{\frac{EI}{\rho A L^4}}, \quad n = 1, 2, 3, \ldots \qquad (6)$$
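As a hedged illustration of the modal decoupling in Eqs. (2)–(5) (not the ANSYS solver itself), the following Python sketch extracts natural frequencies and mass-normalized modes from assumed small mass and stiffness matrices via the generalized eigenvalue problem.

```python
import numpy as np
from scipy.linalg import eigh

# Assumed toy 3-DOF system: [M]{X''} + [K]{X} = 0 (Eq. (2))
M = np.diag([2.0, 1.5, 1.0])                     # mass matrix [kg]
K = np.array([[ 400., -200.,    0.],
              [-200.,  400., -200.],
              [   0., -200.,  200.]])            # stiffness matrix [N/m]

# Generalized eigenproblem K phi = w^2 M phi; eigh returns w^2 ascending
w2, D = eigh(K, M)
freqs_hz = np.sqrt(w2) / (2 * np.pi)             # natural frequencies [Hz]

# Mass-normalized modes satisfy D^T M D = I and D^T K D = diag(w^2),
# i.e., Eqs. (4)-(5)
print("natural frequencies (Hz):", np.round(freqs_hz, 3))
print("D^T M D = I:", np.allclose(D.T @ M @ D, np.eye(3)))
```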

3 Finite Element Analysis

The 3D geometric model built in UG is imported into ANSYS, as shown in Fig. 1. The elderly electric vehicle designed in this paper takes lightweight factors into account; the design parameters are listed in Table 1 and the structural material parameters in Table 2. After meshing, the finite element model is as shown in Fig. 2. The model contains 117,298 elements and 203,372 nodes, using 3D hexahedral solid elements.

Fig. 1. Model import

Table 1. The design parameters of elderly electric vehicle
Length             ≤ 1600 mm
Width              ≤ 900 mm
Height             ≤ 1500 mm
Carrying capacity  ≥ 150 kg
Maximum gradient   ≥ 15°
Average speed      = 30 km/h

Table 2. Structure material parameter
Body            Density     Shear modulus   Young modulus  Poisson ratio
Aluminum alloy  2.77 g/cm³  2.67 × 10¹⁰ Pa  7.1 × 10¹⁰ Pa  0.33


Fig. 2. Mesh division

3.1 Static Analysis

3.1.1 Static Analysis Under Bending Conditions

The loading mode of the body under the bending condition is shown in Fig. 3. The boundary condition constrains the front and rear support seats. The load condition applies a 300 N force to the roof, a 3000 N force to the center of the seat, a 500 N force to the rear of the body, and a 200 N force to the front of the body. Solving in the ANSYS processor yields the strain and stress deformation under bending conditions, as shown in Figs. 4(a) and (b).

Fig. 3. Bending load constraint

Fig. 4. Static analysis under bending conditions: (a) bending strain; (b) bending stress

The electric car body is made of aluminum alloy, whose strength limit is σb = 375 MPa; under static load the allowable stress is [σ] = σb/n = 150–312.5 MPa. According to the calculated contour results, the maximum stress of the body under the bending condition is 28.279 MPa, far less than the allowable stress, so the strength requirement is met. The deformation contour of the body shows that the maximum deformation, 0.57 mm, occurs at the center of the base; reducing the wall thickness of the frame while ensuring strength can therefore be considered to achieve weight reduction.

3.1.2 Static Analysis Under Torsional Conditions

The loading mode of the body under the torsional condition is shown in Fig. 5. The boundary condition restrains the left side of the body; a force of 300 N is applied

Fig. 5. Torsion load constraint

on the right side of the ceiling, and forces of 3000 N and 1500 N are applied to the right side of the body. Solving in the ANSYS processor yields the strain and stress deformation under torsional conditions, as shown in Figs. 6(a) and (b).

Fig. 6. Static analysis under torsional conditions: (a) torsion strain; (b) torsion stress

According to the contour map, the maximum stress of the body under the torsional condition is 31.08 MPa, occurring at the right sill; this is lower than the yield strength of the aluminum alloy, so the strength requirement is met. The maximum deformation, 0.34 mm, occurs at the front right of the base. In summary, the designed body meets the strength and stiffness requirements under bending and torsional conditions, ensuring the safety and reliability of the elderly scooter.

4 Dynamic Analysis

Table 3 lists the frequencies and modes of the car body, and Fig. 7 shows the first six mode shapes. The dynamic performance of the structure depends mainly on the low-order vibration modes; the main concern in body design is the lowest frequencies that can cause the body to resonate. Therefore, the first six modes are extracted in the modal analysis.

Table 3. The frequencies and modes of the vehicle body
Mode  Frequency (Hz)  Maximum deformation (mm)  Maximum deformation position
1     44.601          11.715                    Swing up and down the top of the body
2     55.038          4.9656                    First-order torsion at the bottom of the car
3     82.88           9.1909                    Swing up and down the bottom of the car
4     85.968          20.943                    First-order bending at the top of the car body
5     88.69           10.116                    First-order twist on the top of the car
6     106.53          36.913                    The front window of the car body swings up and down

Fig. 7. The first six modes of the structure: (a) 1st, (b) 2nd, (c) 3rd, (d) 4th, (e) 5th, (f) 6th

The modal analysis shows that the roof of the car is the most prone to resonance deformation. This is also where the solar panels are installed; if resonance occurs, the connectors will fail and the solar panel structure will be destroyed. The average speed of the elderly scooter is 20–30 km/h, so the natural frequency of the structure should avoid the band 17.36–26.04 Hz. Because the first mode frequency of the frame is 44.601 Hz, the excitation from road roughness is unlikely to cause resonance. Although the solar panel mounting position experiences the most severe resonance, its stability under normal driving is guaranteed.
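The quoted band follows from the speed-to-frequency relation of road excitation; the roughness wavelength of about 0.32 m used below is an assumption back-derived from the paper's numbers:

$$f = \frac{v}{\lambda}: \quad \frac{20/3.6\ \mathrm{m/s}}{0.32\ \mathrm{m}} \approx 17.36\ \mathrm{Hz}, \qquad \frac{30/3.6\ \mathrm{m/s}}{0.32\ \mathrm{m}} \approx 26.04\ \mathrm{Hz}$$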

5 Conclusion

This paper designs a new body for the elderly electric vehicle. The simulation results show that the top of the vehicle and the middle of the chassis are the areas most prone to damage, but under normal road and driving conditions the body will not be damaged and driving safety is satisfied.

Acknowledgements. This project is sustained by the Graduate Technology Innovation Project of Xiamen University of Technology, No. 40318032.

References
1. Huang, J.B., Tang, W.: Design and trial production of solar electric vehicle. Internet Things Technol. 7(2), 89–91 (2017)
2. Firat, Y.: Utility-scale solar photovoltaic hybrid system and performance analysis for eco-friendly electric vehicle charging and sustainable home. Energy Sour. Part A Recovery Util. Environ. Eff. 41(6), 734–745 (2019)
3. Zhou, B.H., Tan, F.: Electric vehicle handling routing and battery swap station location optimisation for automotive assembly lines. Int. J. Comput. Integr. Manuf. 31(10), 978–991 (2018)
4. Kanchwala, H., Ogai, H.: Development of an intelligent transport system for EV. SAE Int. J. Passeng. Cars Electron. Electr. Syst. 9(1), 9–21 (2016)
5. Jiang, M., Zhang, J.M.: Design of intelligent obstacle avoidance car based on FPGA. Appl. Mech. Mater. 3365(1207), 1233–1236 (2014)
6. Lauer, P., Meiller, S.G.: Effective safety for lift users with mobility scooters as well as for lift operators. Lift Rep. (3), 102–103 (2014)

Structural Design and Analysis of Ring Steering Wheel for Old-Age Walking Vehicle

Guo-Wei Weng, Yi-Jui Chiu, Qi-Chao Li, and Wen-Jun Liu

School of Mechanical and Automotive Engineering, Xiamen University of Technology, No. 600, Ligong Road, Xiamen 361024, Fujian, China
[email protected]

Abstract. In recent years, the aging of China's population has intensified, and the needs of the elderly are receiving more and more attention. To meet the travel needs of the elderly, the old-age walking vehicle came into being, but many products on the market have problems. Starting from the body dimensions and the psychological and behavioral characteristics of the elderly, and based on the design concepts of humanization, emotional design, and green design, this paper uses SolidWorks to design a ring steering wheel structure and ANSYS finite element analysis to carry out mechanical and modal analyses of the ring steering wheel mechanism of the old-age walking vehicle. The results show that the design meets the working requirements.

Keywords: Steering wheel · Old-age walking vehicle · Structural design

1 Introduction

At present, population aging has emerged in China's social development. According to Liu [1], the latest 2019 survey data show that over the past 14 years China's population aging has become among the most serious in the world, with an evident upward trend expected in the future. Accordingly, people pay more and more attention to products for the aged, the walking vehicle being one of them; however, most old-age vehicles on the market are not specifically designed for safety, stability, and comfort, and do not meet the physiological and psychological needs of the elderly. Therefore, it is of great significance to design an old-age vehicle that meets the needs of the elderly and the national standards. The steering wheel is the key component of the elderly walking vehicle during steering, and it carries the human-machine interface: the elderly can communicate with the machine through the speed, battery, odometer, and speed regulation settings on the interface. Research on the steering wheel of elderly walking vehicles currently aims mainly at good comfort and a clear, versatile human-machine interface. Zhou et al. [2] outlined the ergonomic design factors of automobile steering wheels, including size, shape, the comfortable operating area, multi-function key design, and decorative parts, providing a useful design reference. Ma and Xu [3] pointed out that, to suit the psychology of the elderly, the operating panel should avoid metal from the viewpoints of color and the user, since metal feels cold and uncomfortable. Wang [4] analyzed people's emotions about products and put forward an emotional design concept aimed at grasping users' emotions and providing good products. Yang [5] proposed sorting and arranging keys to simplify operation, using digital instead of pointer displays with clear, eye-catching fonts, moderately increasing the volume of the control system, and adopting a combined audio-visual display. Wu [6] proposed that multiple controls concentrated for manual operation require coding design (shape, size, color, logo, etc.) to reduce operational errors. The purpose of this paper is to design a ring steering wheel structure for the old-age vehicle. After several schemes are assessed for stiffness, strength, and feasibility, the final scheme is determined and vibration analysis is carried out.

2 Theoretical Analysis

2.1 Theoretical Research

In this paper, the safety factor method is used to determine whether the stress meets the strength requirements, i.e., whether the maximum stress borne by the steering wheel is within the allowable stress range of the material. In the static analysis, the allowable stress of the steering wheel is

$$[\sigma] = \frac{\sigma_b}{n} \qquad (1)$$

where $\sigma_b$ is the ultimate strength and $n$ is the safety factor. Because the steering wheel structure is complex, it is first simplified, and the system equation is derived as

$$[M]\{\ddot{X}\} + [K]\{X\} = 0 \qquad (2)$$

Defining the position vector $\{X\}$ as $[D]\{u\}$, where $[D]$ is the modal matrix of the system, Eq. (2) becomes

$$[I]\{\ddot{u}\} + [A]\{u\} = 0 \qquad (3)$$


in which:

$$[D]^T [M] [D] = [I] = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & 1 \end{bmatrix} \qquad (4)$$

$$[D]^T [K] [D] = [A] = \begin{bmatrix} \omega_1^2 & 0 & \cdots & 0 \\ 0 & \omega_2^2 & \cdots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & \omega_n^2 \end{bmatrix} \qquad (5)$$

The natural frequency of the mistuned system is expressed as

$$\omega_n = \bar{\omega}_n \sqrt{\frac{EI}{\rho A L^4}}, \quad n = 1, 2, 3, \ldots \qquad (6)$$

2.2 The Establishment of the Model

The handles on the circular steering wheel form a regular closed curve and can be grasped at different angles. Compared with the straight handles of a traditional electric vehicle steering wheel, they are more comfortable and stable, and better suited to the needs of the elderly. The three-dimensional model of the ring steering wheel is shown in Fig. 1.

Fig. 1. Three-dimensional model of ring steering wheel for old-aged bicycles


According to the human body percentiles from geriatric research and the body dimensions of the 26–35 age group in the GB10000-88 standard, the main design parameters of the circular steering wheel are listed in Table 1.

Table 1. The main parameters of ring steering wheel
Length                                   520 mm
Soft start rotary handle diameter        40 mm
Controllable perimeter                   3500 mm
Handle bevel angle                       75°
Net weight                               9.4 kg
Maximum turning angle of steering wheel  50°
Brake handle length                      125 mm
Minimum torque of handbrake              50 N·cm

3 Finite Element Analysis

The three-dimensional geometric model established in SolidWorks is imported into ANSYS. The frame of the ring steering wheel is made of aluminum alloy, which is light and strong, coated with a layer of polyester resin for a soft touch; the parameters of the two materials are given in Table 2. Because the proportion of polyester resin in the steering wheel is much smaller than that of aluminum alloy, and its mechanical properties are weaker, the steering wheel material is treated as aluminum alloy in this paper. After meshing, the finite element model is as shown in Fig. 2. Because the grip and brake handle are the main load points, their mesh is locally refined. The model has 98,580 elements and 65,142 nodes.

Fig. 2. Mesh division


Table 2. Structure material parameter
Material         Density           Shear modulus   Young modulus  Poisson ratio
Resin polyester  1.2 × 10⁻⁹ g/cm³  1.14 × 10⁹ Pa   3 × 10⁹ Pa     0.316
Aluminum alloy   2.77 g/cm³        2.67 × 10¹⁰ Pa  7.1 × 10¹⁰ Pa  0.33

4 Static Analysis

4.1 Static Analysis Under Bilateral Loading

Loading on the ring steering wheel under bilateral loading is shown in Fig. 3. The boundary condition is that the bottom end of the ring steering wheel is fixed on the car body and fixed constraints are imposed on the bottom of the model. The load condition is that 1500 N is applied to the center of the ring steering wheel, the tail end of the soft start turning handle and the left grip respectively (about the average weight of two elderly people). Through the solution of ANSYS, the stress and strain distributions under bilateral loading can be obtained, as shown in Fig. 4(a) and (b).

Fig. 3. Load and constraint of ring steering wheel under bilateral loading conditions

Fig. 4. Static analysis of the ring steering wheel under bilateral loading: (a) stress; (b) strain

4.2 Static Analysis Under Unilateral Loading

Under unilateral loading, the load on the ring steering wheel is shown in Fig. 5. The boundary condition is that the bottom end of the ring steering wheel is fixed on the car body and fixed constraints are imposed on the bottom of the model. The load condition is that 1500 N is applied to the center of the ring steering wheel and 1500 N is applied to the tail end of the soft start handlebar. Through the solution of ANSYS, the stress and strain distribution under unilateral loading can be obtained, as shown in Fig. 6(a) and (b).

Fig. 5. Loads and constraints imposed by a ring steering wheel under unilateral loading

Fig. 6. Static analysis of the ring steering wheel under unilateral loading: (a) stress; (b) strain

Based on the static analysis results under the two loading conditions, the maximum stress of the steering wheel occurs at the grip: 4.5687 MPa under bilateral loading and 3.1746 MPa under unilateral loading. The steering wheel is made of aluminum alloy with a safety factor of [n] = 1.5–2; according to the allowable stress $[\sigma] = \sigma_b / n$, the allowable stress ranges from 150 to 312.5 MPa. The maximum stress of the model is less than the allowable stress, so the strength requirement is met. The maximum deformation under bilateral loading, 0.0025827 mm, is located in the middle of the left grip; under unilateral loading the maximum deformation, 0.0023996 mm, is located in the middle of the right grip. The degree of deformation is small.

4.3 Dynamic Analysis

Because of uneven road surfaces, the ring steering wheel mounted on the elderly vehicle bears dynamic loads from different directions. Based on the unilateral load condition, a prestressed modal analysis is carried out, and the natural frequencies (Hz) and mode shapes of the prestressed structure are obtained. Table 3 lists the 7th to 12th natural frequencies of the ring steering wheel, and Fig. 7 shows the corresponding mode shapes.

Table 3. The 7th to 12th modes of the ring steering wheel
Mode  Frequency (Hz)  Maximum deformation (mm)  Maximum deformation position
7     820.33          62.144                    The big end corner of left handle
8     1001.9          223.25                    The tip of the brake handle
9     1090.7          246.45                    The tip of the brake handle
10    1231.3          63.223                    The tip of the brake handle
11    1269.3          43.27                     Near the corner of the small end of the left handle
12    1545.8          44.114                    The corner of the small end of the right handle

The modal analysis shows that the grip is the region most prone to resonance deformation, and it is also the key contact point through which the elderly control the ring steering wheel; once resonance occurs, the steering wheel loses effectiveness. The average speed of the elderly vehicle is 20–30 km/h, so the natural frequency of the structure should avoid the band 17.36–26.04 Hz. Because the seventh mode frequency of the steering wheel is 820.33 Hz, excitation from road roughness is unlikely to cause resonance, so the wheel can operate normally and its stability is ensured.


Fig. 7. Modal diagrams of the 7th to 12th modes of the ring steering wheel: (a) 7th, (b) 8th, (c) 9th, (d) 10th, (e) 11th, (f) 12th

5 Conclusion With China’s entry into the aging society, the elderly bicycle plays an important role. In this paper, the steering wheel, one of the most important parts of the elderly bicycle, is selected as the research object, and a ring steering wheel of the elderly bicycle is designed. The structure is analyzed by ANSYS finite element analysis software. Static analysis and dynamic analysis are adopted to ensure that the design of steering wheel meets the requirements. Acknowledgements. This study was funded by the Provincial Entrepreneurship Training Project of University Students No. 41619011 of Xiamen University of Technology.


References
1. Liu, C.J.: Summary of expert seminar on prospects and countermeasures of population ageing in China. Soc. Res. (06) (2016)
2. Zhou, H., Zhang, Y.H., Lu, T.Q., Huang: Automotive steering wheel design and ergonomics. Autom. Sci. Technol. (5), 67–72 (2018)
3. Ma, G.T., Xu, Z.Y.: Analysis of ergonomic factors of elderly electric vehicles. Devise (22), 134–135 (2015)
4. Wang, K.: Research on Emotional Design of Multipurpose Substituted Car. Changchun University of Technology, Jilin (2016)
5. Yang, C.: Research on Senior Citizens' Travel and Relevant Product Design: take the elderly electric bicycle as an example. Jiangsu University, Jiangsu (2014)
6. Wu, J.J.: Design of Electric Vehicle for the Aged. Kunming University of Science and Technology (2017)

Rule-Based Graded Braking for Unsignalized Intersection Collision Avoidance via Vehicle-to-Vehicle Communication

Bijun Chen, Lyuchao Liao, Fumin Zou, and Yuxin Zheng

Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, China
Fujian Provincial Big Data Research Institute of Intelligent Transportation, Fujian University of Technology, Fuzhou 350118, China
[email protected], [email protected], [email protected], [email protected]

Abstract. The complicated traffic conditions and limited field of view at unsignalized intersections pose a major challenge for preventing potential collision scenarios. This paper proposes a novel rule-based graded braking mechanism for intersection collision avoidance (ICA), using simulator-based techniques that combine a two-vehicle collision model with an automatic braking system. In particular, vehicle-to-vehicle communication is employed to ensure real-time interaction between the driving states of the two vehicles, i.e., the object vehicle and the ego vehicle, the latter identified as the controlled agent. Additionally, an intersection collision warning (ICW) threshold based on time to collision (TTC) is presented to warn in advance when a crash is predicted. The collision scenario was constructed on the PreScan simulation platform; experimental results show the feasibility of the proposed method, and the two-stage braking strategy achieves stable and reliable performance for ICA.

Keywords: Intersection Collision Avoidance (ICA) · Graded braking · V2V · Time to collision · PreScan

1 Introduction

Unsignalized intersections, with complex properties such as the absence of stop signs and traffic signals, have continuously been identified as a key challenge for both researchers and managers of traffic control systems. One of the most crucial tasks in collision avoidance is to reduce the crash rate at unsignalized intersections, since reportedly nearly 25% of collisions occur at or near an intersection [1, 2]. In particular, collisions at intersections mainly result from the limited field of view [3]; failure to obtain other vehicles' information in advance or in real time often leads to collisions, which has become a major challenge for driving safety. Growing research efforts have been devoted to Intersection Collision Avoidance (ICA) and its strategy analysis [4]. Several studies focus mainly on algorithm-based approaches. For instance, Ahn et al. transformed future collision verification into a


job-shop scheduling problem and designed a supervisor that overrides drivers for collision avoidance [5]; Farahmand et al. proposed a cooperative collision avoidance system based on extended Kalman filters. The key to these works is to assure the cooperative capabilities of intervehicle communication based on Wireless Access in Vehicular Environments (WAVE) [6]. Real-time environment sensing is of well-known significance for reducing collisions at intersections; however, relatively few studies take it into account. On the other hand, most researchers have focused on vehicle-to-vehicle (V2V) communication technologies with simulation-based experiments [7, 8]; for example, Dedicated Short-Range Communication (DSRC) [9] has been widely employed to enable communication among vehicles at intersections. Research on optimized control [10], Bayesian models [11], and robust prediction models [12] for ICA has also emerged. In addition, collision pre-warning algorithms have been proposed to move from advance warnings to timely emergency braking [13, 14]. These works play a pivotal role in reducing drivers' reaction time, but limited attention has been paid to linking vehicle interaction, collision warning, autonomous braking, and graded braking into a closed loop, which prevents a further reduction of collisions. To bridge these gaps, we constructed an integral closed-loop system (see Fig. 1) that effectively obtains vehicular dynamic information in real time via V2V communication and enables autonomous braking. We first propose the ICA algorithm to prevent a collision. We then develop the intersection collision warning (ICW) based on V2V communication, which provides future conflict information to the autonomous emergency braking (AEB) system. Furthermore, we propose a rule-based graded braking controller that performs a two-stage brake if a conflict is predicted in the near future. PreScan is employed as the simulation platform to simulate the collision scenario and the vehicle dynamics configuration.

Fig. 1. The closed-loop strategy for ICA (ICW module and graded braking controller within the AEB system; brake action; intersection scenario with blind spot; vehicle-to-vehicle communication)


The rest of this paper is organized as follows. In Sect. 2, we develop the strategy for ICA, Sect. 3 introduces the simulation platform and presents the experiments. The simulation results are shown in Sect. 3.3, and Sect. 4 concludes this work.

2 Methodology

In this section, we present an overview of the unsignalized intersection collision avoidance problem and then propose the intersection collision warning system.

2.1 Scenario Description

A major challenge in the ICA problem lies in the complex traffic environment, where uncertain objects (pedestrians, bicycles, etc.) appear abruptly and move dynamically (rapid acceleration/deceleration or frequent lane changes). Considerable crashes are also attributed to the limited field of view at the intersection [15]. This paper focuses on two-vehicle collisions at an unsignalized intersection and models the crash scenario to propose a collision avoidance strategy. As shown in Fig. 2, since the trajectory of a vehicle contains directional information, the vector method is used to construct the intersection model [16]. Circles indicate the vehicles, $(x, y)$ is the location coordinate, $v$ is the travel speed, and $\varphi$ is the heading angle.

Fig. 2. Overview of the intersection collision scenario

Figure 2 illustrates a collision scenario at an intersection in which a collision may occur between the ego vehicle and the object vehicle. With V2V communication, the object vehicle serves as a transmitter; the ego vehicle, as a receiver, receives the object vehicle's real-time location and travel speed, and then calculates the relative distance and speed of the two vehicles.

2.2 Model Formulation

Definition 1. Time to Collision (TTC). Time to collision is the time remaining until a collision between two vehicles would occur [17]:

$$TTC = \Delta D / v_r \qquad (1)$$

where $\Delta D$ is the relative distance between the two vehicles, i.e. $\Delta D = D - R_o - R_e$. Here $R_o$ and $R_e$ are the radii of the circles enclosing the object vehicle and the ego vehicle, respectively, and $D$ is calculated as

$$D = \sqrt{(x_e - x_o)^2 + (y_e - y_o)^2} \qquad (2)$$

Definition 2. Relative Speed. Considering the ego vehicle as the controllable agent implementing the AEB and graded brake machine, $v_r$ is defined as the relative speed of the ego vehicle, whose projection onto the relative distance $\Delta D$ of the two vehicles is

$$v_r = v_x \cos\varphi + v_y \sin\varphi \qquad (3)$$

subject to

$$v_x = v_e, \quad v_y = v_o, \quad \tan\varphi = y/x \qquad (4)$$
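A minimal Python sketch of Eqs. (1)–(4) (an illustration, not the authors' PreScan/Simulink implementation; the geometry convention is taken literally from the equations, and the state values in the example call are assumptions):

```python
import math

def time_to_collision(xe, ye, ve, xo, yo, vo, Re=1.0, Ro=1.0):
    """TTC from Eqs. (1)-(4): positions [m], speeds [m/s], radii [m]."""
    dx, dy = xe - xo, ye - yo
    D = math.hypot(dx, dy)                          # Eq. (2)
    dD = D - Ro - Re                                # relative distance
    phi = math.atan2(dy, dx)                        # tan(phi) = y/x, Eq. (4)
    v_r = ve * math.cos(phi) + vo * math.sin(phi)   # Eq. (3)
    if v_r <= 0:                                    # diverging: no collision ahead
        return math.inf, v_r
    return dD / v_r, v_r                            # Eq. (1)

# Assumed example: ego at (30, 0) m doing 12.5 m/s, object at (0, 20) m
ttc, v_r = time_to_collision(30, 0, 12.5, 0, 20, 5.56)
print(f"TTC = {ttc:.2f} s, relative speed = {v_r:.2f} m/s")
```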

2.3 Intersection Warning System

In longitudinal driving analysis, forward collision warning (FCW) is a prominent safety indicator for timely driver decision making to avoid collisions [18]. Inspired by this idea, we propose the intersection collision warning (ICW) system: after the ICW warning elapses, the AEB system is triggered to brake the ego vehicle autonomously and avoid the two-vehicle collision at the intersection. The ICW of the ego vehicle is activated when the following conditions are satisfied.

• The relative speed of the ego vehicle is greater than 10 km/h (i.e. $v_r \geq 10$ km/h); if it is less than 10 km/h, the collision generally can be avoided [19].


• Considering the reaction time $t_r$ of a driver encountering an emergency, the ICW threshold is set to $t_{icw} = t_r + t_w$, where $t_w$ is the duration of the warning. Based on the statistics in Table 1 and expert knowledge and experience, $t_r$ is set to 1.8 s and $t_w$ is generally 1 s.

Table 1. Statistics for avoiding collisions under 3 types of braking force
             Time needed to avoid collision (percentile)
Brake force  5%      25%    75%     95%    Mean
0.5 g        0.1 s   0.6 s  1.5 s   1.8 s  1.15 s
0.675 g      0.15 s  0.5 s  1.1 s   1.5 s  0.8 s
0.85 g       0.2 s   0.4 s  0.85 s  1.2 s  0.6 s

In general, the ICW is activated when $v_r \geq 10$ km/h and $TTC \leq 2.8$ s. If the driver then takes no braking action, the AEB function is engaged after 1 s.
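The activation rule can be written as a small predicate; the following sketch uses the thresholds stated above, with the function and state names as illustrative assumptions:

```python
T_ICW = 2.8   # s, warning threshold: t_r (1.8 s) + t_w (1.0 s)
V_MIN = 10.0  # km/h, minimum relative speed for warning

def icw_state(ttc_s, v_r_kmh, warn_started_s=None, now_s=0.0):
    """Return 'idle', 'warn', or 'aeb' for the current control step."""
    if v_r_kmh < V_MIN or ttc_s > T_ICW:
        return "idle"                      # no conflict predicted
    if warn_started_s is None or now_s - warn_started_s < 1.0:
        return "warn"                      # ICW active for up to 1 s
    return "aeb"                           # driver did not brake: trigger AEB

print(icw_state(ttc_s=2.5, v_r_kmh=25))                                  # warn
print(icw_state(ttc_s=2.0, v_r_kmh=25, warn_started_s=0.0, now_s=1.2))   # aeb
```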

3 Experiments and Results

In this section, we first introduce the PreScan simulation platform briefly, then conduct the ICA experiments through software simulation, and finally evaluate the performance of the proposed method via the simulation results.

3.1 Simulation Platform

PreScan is a physics-based simulation platform with dynamic vehicle properties, flexible scenario definition, and advanced V2V communication; it is a powerful analysis tool for Advanced Driver Assistance Systems (ADAS) and active safety systems [20]. Developers can easily build experiments in PreScan thanks to its abundant interfaces, such as CarSim and Vissim, and designing V2V or V2I (vehicle-to-infrastructure) simulation systems as well as software-in-the-loop (SIL) and hardware-in-the-loop (HIL) tests is straightforward. In this work, both the active safety system and V2V communication were used [20], and the systematic framework consists of four main steps (see Fig. 3a): (1) build the scenario; (2) add the control system; (3) run the experiment; (4) model the sensor system. The intersection collision environment was built according to Fig. 2, and the V2V communication was constructed. In Fig. 3b, the two-vehicle collision scenario is constructed in PreScan, with the vehicle dynamics, V2V communication modules, and travel trajectories designed in detail. The V2V module, named V2XTransceiver, is applied to the vehicles for receiving or transmitting dynamic traveling information.

Fig. 3. Simulation platform construction: (a) systematic framework; (b) simulation environment with the ego vehicle (V2V receiver) and the object vehicle (V2V transmitter) near an office building

3.2 Simulation Setup

PreScan offers plenty of vehicle simulation interfaces for modeling realistic driving behaviors; this experiment employed Matlab/Simulink to interconnect them. The vehicle trajectories, V2V communication, and vehicle dynamics configuration were constructed in PreScan and then simulated in Simulink. A finite-state machine [21] implements the switching of operating states driven by the speed change of the vehicle entity, including the ICW and the automatic two-stage braking activation. The parametric configuration in PreScan and Simulink is shown in Table 2.

Table 2. Parametric configuration of the experiment
Attributes        Ego vehicle                     Object vehicle
Model of vehicle  Mazda_RX8                       Nissan_Cabstar
V2X_Transceiver   Receiver                        Transmitter
Initial speed     45 km/h (12.5 m/s)              20 km/h (5.56 m/s)
Braking function  Two-stage brake                 –
Warning module    Intersection collision warning  –

3.3 Results

Due to vehicle inertia, the automatic emergency braking system cannot stop the vehicle immediately; moreover, excessive braking force may introduce extra risks. This paper therefore employs a two-stage brake function to address these issues. As shown in Fig. 4, the two-stage brake uses two time thresholds, $T_{a1} \in [1.6\,\mathrm{s}, 2.0\,\mathrm{s}]$ and $T_{a2} \in [0.4\,\mathrm{s}, 1.6\,\mathrm{s}]$, which determine the braking stage according to the range in which the current TTC falls; the first-stage braking deceleration was set to 4 m/s² and the second to 8 m/s² via trial and error.

Fig. 4. The two-stage brake function: the thresholds $T_{a1}$ (first-stage brake, 1.6–2.0 s) and $T_{a2}$ (second-stage brake, 0.4–1.6 s) as time-to-collision values plotted against ego speed (10–70 km/h)
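A sketch of how such a graded braking rule could be realized (illustrative only; the speed-dependent threshold curves are approximated here as linear interpolations over the ranges read off Fig. 4, which is an assumption):

```python
import numpy as np

def thresholds(ego_speed_kmh):
    """Interpolate Ta1/Ta2 over ego speed (assumed linear over Fig. 4 ranges)."""
    s = np.clip(ego_speed_kmh, 10.0, 70.0)
    ta1 = np.interp(s, [10.0, 70.0], [1.6, 2.0])   # first-stage threshold [s]
    ta2 = np.interp(s, [10.0, 70.0], [0.4, 1.6])   # second-stage threshold [s]
    return ta1, ta2

def graded_brake_decel(ttc_s, ego_speed_kmh):
    """Return commanded deceleration [m/s^2]: 0, 4 (stage 1) or 8 (stage 2)."""
    ta1, ta2 = thresholds(ego_speed_kmh)
    if ttc_s <= ta2:
        return 8.0        # second stage: emergency braking
    if ttc_s <= ta1:
        return 4.0        # first stage: moderate braking
    return 0.0            # no braking needed yet

print(graded_brake_decel(ttc_s=1.7, ego_speed_kmh=45))  # -> 4.0
print(graded_brake_decel(ttc_s=0.8, ego_speed_kmh=45))  # -> 8.0
```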

Figure 5 shows the speed and warning-signal curves of the ego vehicle and the object vehicle from the PreScan and Simulink simulations. The x-axis depicts the time of the events occurring during the ICA maneuver, while the y-axis in Figs. 5a and c represents the speed of the ego and object vehicle in m/s.

Fig. 5. The simulation results for ICA: (a) ego vehicle speed; (b) warning signal; (c) object vehicle speed

Figure 5a shows the speed changes under autonomous graded braking; the curves demonstrate that the proposed ICA strategy performs well in avoiding a collision. The ICW module was triggered after 1.8 s and alarmed continuously for one second, as shown in Fig. 5b. Figure 5c illustrates that the object vehicle passed the intersection without a crash when its speed was below 20 km/h.


4 Conclusions

This study set out to address intersection collision avoidance and explored how to connect collision avoidance and automatic braking into a closed-loop system. Based on the PreScan simulation platform, both the two-vehicle collision model and the two-stage brake module were constructed for cooperative collision avoidance. In addition, the presented methods allowed us to evaluate ICA issues more effectively. Finally, the simulation results demonstrated that the proposed rule-based graded braking strategy is feasible and achieves satisfactory performance in preventing an intersection crash. As mentioned in the introduction, given the complex and dynamic traffic conditions at unsignalized intersections, considerably more work will be needed on scenario diversification and on multi-vehicle rather than two-vehicle conflicts.

Acknowledgment. This work was supported in part by Projects of the National Science Foundation of China (41971340, 41471333, 61304199), project 2017A13025 of Science and Technology Development Center, Ministry of Education, project 2018Y3001 of Fujian Provincial Department of Science and Technology, and projects of Fujian Provincial Department of Education (JA14209, JA15325, FBJG20180049).

References
1. Haleem, K., Abdel-Aty, M.: Examining traffic crash injury severity at unsignalized intersections. J. Saf. Res. 41(4), 347–357 (2010)
2. Killi, D.V., Vedagiri, P.: Proactive evaluation of traffic safety at an unsignalized intersection using micro-simulation. J. Traffic Logist. Eng. 2(2), 140–145 (2014)
3. Lopez, B.T., How, J.P.: Aggressive collision avoidance with limited field-of-view sensing. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE (2017)
4. Chen, L., Englund, C.: Cooperative intersection management: a survey. IEEE Trans. Intell. Transp. Syst. 17(2), 570–586 (2015)
5. Ahn, H., Del Vecchio, D.: Semi-autonomous intersection collision avoidance through job-shop scheduling. In: Proceedings of the 19th International Conference on Hybrid Systems: Computation and Control. ACM (2016)
6. Farahmand, A.S., Mili, L.: Cooperative decentralized intersection collision avoidance using extended Kalman filtering. In: 2009 IEEE Intelligent Vehicles Symposium. IEEE (2009)
7. Hafner, M., Cunningham, D., Caminiti, L., et al.: Automated vehicle-to-vehicle collision avoidance at intersections. In: Proceedings of World Congress on Intelligent Transport Systems (2011)
8. Hafner, M.R., Cunningham, D., Caminiti, L., et al.: Cooperative collision avoidance at intersections: algorithms and experiments. IEEE Trans. Intell. Transp. Syst. 14(3), 1162–1175 (2013)
9. Xu, Q., Mak, T., Ko, J., et al.: Vehicle-to-vehicle safety messaging in DSRC. In: Proceedings of the 1st ACM International Workshop on Vehicular Ad Hoc Networks. ACM (2004)
10. Lee, J., Park, B.: Development and evaluation of a cooperative vehicle intersection control algorithm under the connected vehicles environment. IEEE Trans. Intell. Transp. Syst. 13(1), 81–90 (2012)
11. Xu, P., Huang, H., Dong, N., et al.: Revisiting crash spatial heterogeneity: a Bayesian spatially varying coefficients approach. Accid. Anal. Prev. 98, 330–337 (2017)
12. Schildbach, G., Soppert, M., Borrelli, F.: A collision avoidance system at intersections using robust model predictive control. In: 2016 IEEE Intelligent Vehicles Symposium (IV). IEEE (2016)
13. Yan, X., Zhang, Y., Ma, L.: The influence of in-vehicle speech warning timing on drivers' collision avoidance performance at signalized intersections. Transp. Res. Part C: Emerg. Technol. 51, 231–242 (2015)
14. Fu, Y., Li, C., Luan, T.H., et al.: Infrastructure-cooperative algorithm for effective intersection collision avoidance. Transp. Res. Part C: Emerg. Technol. 89, 188–204 (2018)
15. Choi, E.-H.: Crash factors in intersection-related crashes: an on-scene perspective (2010)
16. Wang, J., Chi, R., Zhang, L., et al.: Study on forward collision warning-avoidance algorithm based on driver characteristics adaptation. J. Highw. Transp. Res. Dev. 26, 7–12 (2009)
17. Ward, J., Agamennoni, G., Worrall, S., et al.: Vehicle collision probability calculation for general traffic scenarios under uncertainty. In: 2014 IEEE Intelligent Vehicles Symposium Proceedings. IEEE (2014)
18. Wang, J., Yu, C., Li, S.E., et al.: A forward collision warning algorithm with adaptation to driver behaviors. IEEE Trans. Intell. Transp. Syst. 17(4), 1157–1167 (2015)
19. Colombo, A., Del Vecchio, D.: Efficient algorithms for collision avoidance at intersections. In: Proceedings of the 15th ACM International Conference on Hybrid Systems: Computation and Control. ACM (2012)
20. PreScan. https://www.tassinternational.com/prescan
21. Kim, A., Otani, T., Leung, V.: Model-based design for the development and system-level testing of ADAS. In: Energy Consumption and Autonomous Driving, pp. 39–48. Springer, Heidelberg (2016)

A Novel Measure for Trajectory Similarity

Sijie Luo1,2, Fumin Zou1,2(✉), Qiqin Cai1,2, Feng Guo1,2, Weihui Xu1,2, and Yong Li3

1 Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected], [email protected]
2 Fujian Provincial Big Data Research Institute of Intelligent Transportation, Fujian University of Technology, Fuzhou 350118, Fujian, China
3 Fujian Fortunetone Network Technology Co., Ltd., Fuzhou 350002, Fujian, China

Abstract. With the rapid growth of motor vehicles and the popularity of the Global Positioning System, massive traffic trajectory data has accumulated. Traffic trajectory data contains a large amount of structured knowledge, such as time, space, and the relationships between various traffic elements. Comparing owners' travel trajectories and mining their travel modes has become a hot topic in academic research. At present, commonly used similarity measures include dynamic time warping (DTW), Euclidean distance (ED), the longest common subsequence (LCSS), and edit distance (EDR). DTW is currently recognized as the similarity measure that is most robust to noise and deformation, but its high time and space complexity is an obvious disadvantage. This paper proposes a new trajectory similarity measure that solves the problem of noise-induced inaccurate matching without using the DTW algorithm. Its matching accuracy is similar or equal to that of DTW, it can be applied to traffic trajectory similarity evaluation, and its matching speed is about 2–5 times faster than DTW. Keywords: Traffic trajectory data · Travel mode · DTW · Trajectory similarity measuring

1 Introduction

In recent years, computer and information technology have developed rapidly and big data has entered various industries: user search records of browsers such as Baidu and 360, travel data from walking and transportation, and transaction data from WeChat and Alipay. Traffic trajectory data is a major part of big data; it mainly records the driving information of a vehicle, including latitude and longitude, traveling speed, sampling time, and the overall trajectory. From such data we can analyze public travel preferences, traffic congestion states, the driving habits of car owners, traffic flow (for analysis and prediction), and new or changed road networks [1]. Through the analysis of trajectory similarity, the government can improve urban traffic management, car owners can obtain travel recommendation and planning, insurance companies can offer usage-based insurance (UBI) services, and road network update services can be provided; the study of trajectory similarity thus has a wide range of applications [2, 3].

Traffic trajectories are spatiotemporal, high-dimensional, massive, and non-stationary in space and time; they exhibit translation and stretching in amplitude and across axes, and even nonlinear drift, as well as position offset errors and transmission loss, and the data are usually sparse. These inherent characteristics make similarity mining over spatiotemporal trajectory big data very challenging. DTW performs dynamic trajectory warping and is a very accurate method for processing traffic trajectory data: it can tolerate positional deviation and correctly perform point-to-point matching with very high precision. However, its time complexity seriously limits its matching efficiency, so an efficient and accurate method for processing traffic trajectory data is urgently needed. To solve this problem, we propose a combination of multiple lower-bound methods to find the most similar trajectories: it first discards the most dissimilar trajectory data, and then returns the trajectory data with the smallest difference values. Without using the high-time-complexity DTW, we can still find the most similar trajectory data quickly and accurately.

The rest of the paper is organized as follows: the second part introduces current work on trajectory similarity research, the third part models and introduces our proposed methods, the fourth part summarizes our experimental results, and the fifth part concludes the paper.

2 Related Work

The calculation of similarity or dissimilarity between two trajectories depends on the distance between them; the distance value is usually the main basis of similarity or dissimilarity [4]. At present, there are many methods for trajectory similarity measurement. The most primitive trajectory similarity measure is the Euclidean distance [5, 6], which strictly adopts point-to-point pairing and therefore cannot measure trajectories with different numbers of points. The Sum-of-Pairs Distance (SPD) [7] also sums point-to-point distances and likewise requires the two tracks to have the same number of points. The basic idea of these two methods is a one-to-one correspondence at each moment; consequently, for trajectory sequences with much noise, the matching accuracy decreases. To further enhance robustness and improve matching accuracy, dynamic warping has been proposed. LCSS [8], EDR [9], and DTW [10, 11] all match trajectory points based on dynamic warping and have become the mainstream methods of trajectory similarity matching, but their common shortcoming is high time complexity.

The idea behind LCSS is to count how many points of two trajectories can be matched: the longer the common subsequence, the more similar the trajectories. LCSS is relatively immune to noise points and works well when dealing with trajectories containing many noise points. EDR takes the opposite view: a matching pair of points costs 0 and a mismatching pair costs 1, so a total cost of 0 indicates an exact match; its trajectory matching accuracy is higher than that of LCSS. DTW uses a recursive calculation from the last point back to the first point to find the shortest warped distance: the smaller the distance, the more similar the trajectories. It is one of the most accurate methods, but its time complexity is very high and it takes a great deal of time to compute.
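For reference, the DTW recursion described above can be written in a few lines. This is the textbook O(n·m) dynamic program, shown only to make the cost the paper avoids concrete; it is not the authors' implementation:

# Textbook dynamic-time-warping distance between two 1-D sequences q and c.
# O(len(q) * len(c)) time and space, which is exactly the cost the paper avoids.
import math

def dtw_distance(q, c):
    n, m = len(q), len(c)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return math.sqrt(D[n][m])

print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0: the sequences warp onto each other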

3 Definition

Definition 1: The vehicle terminal collects longitude and latitude at a constant time interval, so these trajectory data form a time series. A trajectory sequence is composed of all trajectory points in a certain period of time, S = (P1, P2, ..., Pm), where Pi is a trajectory point. Within a period of time, multiple trajectory points form a trajectory subsequence S1 = (P1, P2, ..., Px), and all trajectory subsequences form the trajectory sequence S = (S1, S2, ..., Sn).

Definition 2: S_{i,n} denotes a subsequence of the trajectory sequence S of length n. S is a long trajectory sequence; we wish to use S_{i,n} for comparison, and we denote S_{i,n} as c.

Definition 3: The query sequence q is an ordered sequence consisting of n trajectory time points, q = (q1, q2, ..., qn). The function of the query sequence is to find the most similar c in the trajectory sequence S.

Definition 4: The Euclidean distance is the earliest measure of trajectory sequence similarity; Eq. (1) gives its calculation:

ED = \sqrt{ \sum_{i=1}^{n} (q_i - c_i)^2 }    (1)

4 Algorithm

4.1 Known Algorithms

LB_kimFL. LB_kimFL [12] is the sum of the Euclidean distances between the first and the last trajectory points of c and q, an algorithm for evaluating the dissimilarity of trajectory sequences. If the value calculated by LB_kimFL is greater than the best-so-far, the subsequence is discarded. Its time complexity is O(1). Its formula is as follows (Fig. 1):

Dist_kimFL = ED(q_1, c_1) + ED(q_n, c_n)    (2)


Fig. 1. The Euclidean distances of the first and last points of the query sequence and the trajectory subsequence are computed

LB_keoghEQ. LB_keoghEQ [13] is also a dissimilarity-evaluation algorithm: it builds an envelope around the query trajectory sequence q and sums the Euclidean distances of the points falling outside the envelope. Because this out-of-envelope distance is always much smaller than the true Euclidean distance, whenever it exceeds the current optimal value the two sequences must be dissimilar, and the subsequence can safely be discarded (Fig. 2).

Fig. 2. An envelope is built around the query sequence; the Euclidean distance between the envelope and the points of the subsequence is much smaller than the Euclidean distance between the query sequence and the subsequence.

q_{up,i} = \max(q_{i-m}, ..., q_{i+m}),   q_{up} = (q_{up,1}, ..., q_{up,n})    (3)

q_{low,i} = \min(q_{i-m}, ..., q_{i+m}),  q_{low} = (q_{low,1}, ..., q_{low,n})    (4)

Dist_keoghEQ(q_i, c_i) = ED(q_{up,i}, c_i) if c_i > q_{up,i};  = ED(q_{low,i}, c_i) if c_i < q_{low,i}    (5)

The upper envelope q_up takes the maximum over the m neighbors of each point of q, and the lower envelope q_low takes the minimum. When a trajectory sequence point c_i is larger than the upper envelope q_{up,i}, the Euclidean distance between c_i and q_{up,i} is used; when c_i is smaller than the lower envelope q_{low,i}, the Euclidean distance between c_i and q_{low,i} is used.

LB_keoghEC. LB_keoghEC [14] is also a dissimilarity-evaluation algorithm. It follows the same principle as LB_keoghEQ but changes the object of the envelope: the envelope is built around the trajectory subsequence c. The upper envelope takes the maximum over several neighbors, the lower envelope takes the minimum, and the sum of the Euclidean distances outside the envelope is computed (Fig. 3).


Fig. 3. An envelope is built around the trajectory subsequence; because q and c fluctuate differently, the envelope on c can sometimes prune better than the envelope on q.
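To make the three lower bounds concrete, the following Python sketch implements Eqs. (2)–(5) as we read them; the function names, the per-point absolute-difference distance, and the envelope boundary handling are our own choices, not the authors' code:

def ed(a, b):
    """Per-point Euclidean distance between two scalar trajectory values."""
    return abs(a - b)

def lb_kim_fl(q, c):
    """LB_kimFL (Eq. 2): distances of the first and last points only, O(1)."""
    return ed(q[0], c[0]) + ed(q[-1], c[-1])

def envelope(s, m):
    """Upper and lower envelopes of sequence s over m neighbors (Eqs. 3, 4)."""
    n = len(s)
    up = [max(s[max(0, i - m):min(n, i + m + 1)]) for i in range(n)]
    low = [min(s[max(0, i - m):min(n, i + m + 1)]) for i in range(n)]
    return up, low

def lb_keogh(env, c):
    """LB_keogh (Eq. 5): sum of the distances of c falling outside an envelope.

    Passing the envelope of q yields LB_keoghEQ; building the envelope on the
    subsequence c and passing q as the second argument yields LB_keoghEC.
    """
    up, low = env
    total = 0.0
    for i, ci in enumerate(c):
        if ci > up[i]:
            total += ed(up[i], ci)
        elif ci < low[i]:
            total += ed(low[i], ci)
    return total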

4.2 Novel Optimization

Using these three dissimilarity-search methods, we built a new serial discarding strategy with reference to the UCR suite [14]. We do not need the DTW algorithm: unpromising subsequences are discarded early and only the promising ones are fully computed. First, we use LB_kimFL; if its value is greater than the best-so-far, the subsequence is discarded. Otherwise, LB_keoghEQ is computed; if the subsequence cannot be discarded by LB_keoghEQ, LB_keoghEC is applied. If a subsequence cannot be discarded by any of the three algorithms, the values of LB_keoghEQ and LB_keoghEC are added to measure its dissimilarity: the smaller this value, the more similar the subsequence. Since the value of LB_kimFL is too small compared with the other two algorithms, only the values obtained by LB_keoghEQ and LB_keoghEC are added to form the dissimilarity score.

Algorithm: Our method (the flattened pseudo-code, rewritten as runnable Python; it calls the lower-bound sketches above):

def most_similar(q, d, m=6):
    """Input: query sequence q and dataset d of candidate subsequences c.
    Output: candidates ranked by the dissimilarity score LB_keoghEQ + LB_keoghEC."""
    best_so_far = float("inf")
    similarity_data = []
    q_env = envelope(q, m)                    # envelope of q, reused for every c
    for c in d:
        if lb_kim_fl(q, c) >= best_so_far:
            continue                          # discarded by LB_kimFL
        eq = lb_keogh(q_env, c)               # LB_keoghEQ
        if eq >= best_so_far:
            continue                          # discarded by LB_keoghEQ
        ec = lb_keogh(envelope(c, m), q)      # LB_keoghEC
        if ec >= best_so_far:
            continue                          # discarded by LB_keoghEC
        best_so_far = max(eq, ec)             # tighten the pruning threshold
        similarity_data.append((eq + ec, c))  # dissimilarity score = EQ + EC
    similarity_data.sort(key=lambda t: t[0])
    return similarity_data


The envelopes of LB_keoghEC and LB_keoghEQ take a neighborhood whose size is generally 0.05 or 0.025 times the length of the query sequence. The initial value of best-so-far is set to infinity; after the three dissimilarity algorithms are run, the sum of LB_keoghEC and LB_keoghEQ is taken to update the best-so-far.

5 Experiment

Our experimental platform is a Lenovo PC with 8 GB of memory and an Intel i5 CPU running at 2.6 GHz, and the programming language is Python 3.7. Our experimental data comes from the open-source data of "Speeding up similarity search under dynamic time warping by pruning unpromising alignments" [11]. The open-source data includes data sets from different fields, and each data set has five different query sequences. We selected the football dataset, which has a total of 1,998,606 trajectory sequence points and query sequences of length 1024. First, we selected two different query sequences and searched the dataset for their most similar subsequences, to demonstrate the effectiveness of our idea. Then we selected the same trajectory sequence with query lengths of 128, 256, 384, and 512 and compared the time costs with the DTW algorithm, to demonstrate the efficiency of our method.

5.1 Effectiveness Tests

In the experiment, we first selected two different query sequences to search for the most similar subsequences in the data set, whose length is 128. Figure 4 shows the five most similar subsequences obtained by our method; to better demonstrate the similarity, we shifted some of the sequences upward, and the five subsequences almost coincide with the query sequence. The most similar subsequence is also the one computed as most similar by the DTW algorithm, so our method is effective (Fig. 4).

5.2 Efficiency Tests

According to previous studies, the query sequence length is best chosen as a multiple of 128, so we selected query sequences with lengths of 128, 256, 384, and 512 to find the most similar subsequences in the data set. From Table 1 we can see that our method is several times faster than the DTW algorithm and saves a great deal of time, so our method is efficient.


Fig. 4. The query has 128 points; the LB_keoghEC and LB_keoghEQ envelopes take 6 and 12 neighbors: (a) and (b) are enveloped with 6 neighbors, (c) and (d) with 12 neighbors.

Table 1. The time cost of our method and DTW

Query | Length | DTW time (s) | Our method time (s)
1     | 128    | 74           | 14
2     | 256    | 163          | 28
3     | 384    | 326          | 72
4     | 512    | 1826         | 967


6 Conclusion

We propose a new trajectory similarity evaluation method that achieves the accuracy of DTW without its high time cost. Experiments show that our method can be applied to trajectory similarity evaluation and finds the most similar trajectory sequences in traffic trajectory data. Although our method is much faster than the DTW algorithm, long query sequences still take considerable time; in future work, we can optimize the internal structure of our method to further speed up the search.

References
1. Zou, F., Liao, L., Jiang, X., Lai, H.: An automatic recognition approach for traffic congestion states based on traffic video. J. Highw. Transp. Res. Dev. (Engl. Edn.) 8(2), 72–80 (2014)
2. Li, Y., Su, H., Demiryurek, U., et al.: PaRE: a system for personalized route guidance. In: Proceedings of the 26th International Conference on World Wide Web, pp. 637–646. International World Wide Web Conferences Steering Committee (2017)
3. Cai, Q., Liao, L., Zou, F., Song, S., Liu, J., Zhang, M.: Trajectory similarity measuring with grid-based DTW. In: The 2nd International Conference on Smart Vehicular Technology, Transportation, Communication and Application, pp. 63–72. Springer (2019)
4. Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining, pp. 60–65. Pearson Addison-Wesley, Boston (2006)
5. Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. Found. Data Organ. Algorithms 730, 69–84 (1993)
6. Atev, S., Miller, G., Papanikolopoulos, N.P.: Clustering of vehicle trajectories. IEEE Trans. Intell. Transp. Syst. 11(3), 647–657 (2010)
7. Agrawal, R., Faloutsos, C., Swami, A.N.: Efficient similarity search in sequence databases. In: FODO, pp. 69–84 (1993)
8. Kang, H.Y., Kim, J.S., Li, K.J.: Similarity measures for trajectory of moving objects in cellular space. In: Proceedings of the 2009 ACM Symposium on Applied Computing, pp. 1325–1330. ACM (2009)
9. Chen, L., Ozsu, M.T., Oria, V.: Robust and fast similarity search for moving object trajectories. In: SIGMOD (2005)
10. Itakura, F.: Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. 23(1), 67–72 (1975)
11. Silva, D.F., Giusti, R.: Speeding up similarity search under dynamic time warping by pruning unpromising alignments. Data Min. Knowl. Disc. 32, 988–1016 (2018)
12. Kim, S., Park, S., Chu, W.: An index-based approach for similarity search supporting time warping in large sequence databases. In: ICDE, pp. 607–614 (2001)
13. Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2005)
14. Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270 (2012)

TS-DBSCAN: To Detect Trajectory Anomaly for Transportation Vehicles

Xinke Wu1,2, Lyuchao Liao1,2(✉), Fumin Zou1,2, Jiurui Liu1,2, Bijun Chen1,2, and Yuxin Zheng1,2

1 Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, China
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
2 Fujian Provincial Big Data Research Institute of Intelligent Transportation, Fujian University of Technology, Fuzhou 350118, China

Abstract. Deep exploration of the latent characteristics of vehicle trajectory data benefits transportation safety management and improves transport efficiency; trajectory anomaly detection therefore plays a pivotal role for transport enterprises. In this paper, we propose a novel density clustering model named TS-DBSCAN to detect outliers in trajectory data, a DBSCAN-based method for clustering time-series data. We first analyze the time correlation of the trajectory data of transportation vehicles. Then the distance between each two adjacent timestamps is taken as the training data of the DBSCAN clustering algorithm, and the ε-neighborhood radius (Eps) and the minimum neighbor number (MinPts) are determined according to the distance density distribution. Finally, anomaly clusters are detected in the trajectory data. We conducted experiments on real trajectory data of transportation vehicles to evaluate the effectiveness. The experimental results show that the TS-DBSCAN algorithm can detect abnormal trajectory data both efficiently and accurately. Keywords: Transportation vehicles · Massive trajectory data · TS-DBSCAN · Time-correlation

1 Introduction

Telematics [1] is a method of monitoring an asset (car, truck, heavy equipment, or even a ship) using GPS and onboard diagnostics to record its movements on a computerized map. At present, transport industries use Telematics data to explore the latent characteristics of transportation vehicle trajectories [2, 3] and to further improve transportation safety management and transport efficiency. A vehicle equipped with a positioning device can automatically collect its current driving status information and upload it to the Telematics system, and analyzing the collected data through data mining is pivotal for transportation safety management. However, the trajectory data of transportation vehicles are mainly obtained from GPS


equipment [4], whose positioning failures and other factors degrade data quality. Therefore, anomaly detection is very important for the accuracy and usability of trajectory data [5].

There are many methods for detecting anomalous data, such as Z-Score, Isolation Forest, K-Nearest Neighbors (KNN), D-OPTICS, DBSCAN [6], and so on. Z-Score is an effective method when the values in the feature space follow a Gaussian distribution [5]. Isolation Forest, with few parameters to set, is widely used when value distributions cannot be assumed; it is fairly robust and easy to optimize [7]. The KNN algorithm introduces a distance-based definition of outliers that considers the sum of the distances from the k nearest neighbors of a point [8]. The D-OPTICS algorithm can effectively integrate the direction information of floating-vehicle trajectory data and realizes fast clustering through directed density estimation over spatial vector data [9]. The DBSCAN algorithm can cluster dense data sets of arbitrary shape and find anomaly points at the same time [10, 11]. However, with large sample sets its clustering converges very slowly, and selecting the optimal parameters Eps and MinPts can also be difficult, since the clustering results are very sensitive to the combination of the two parameters [12].

Inspired by the DBSCAN algorithm, this paper proposes a time-series-based DBSCAN (TS-DBSCAN) algorithm. Our method exploits the fact that the trajectory data of transportation vehicles is strictly ordered in time. Using TS-DBSCAN for abnormal-data detection requires two steps. First, the data of each timestamp is clustered in time order: when the distance between the current timestamp and the previous one is no greater than the ε-neighborhood radius Eps, the two points belong to the same cluster; otherwise, the current point opens a new cluster. Then, we count the number of trajectory points contained in each cluster; if a cluster contains fewer points than the minimum neighbor number MinPts, it is a noise cluster. TS-DBSCAN overcomes the time-complexity disadvantage of DBSCAN: DBSCAN takes O(N²) even in the fastest case [13], while TS-DBSCAN reduces the time complexity to O(N), greatly improving the clustering efficiency.

2 Replaying of Trajectory

This experiment uses the trajectory data of transportation vehicles, covering about two months of data from 450 vehicles; the main attribute indexes are listed in Table 1. To observe the trajectory data intuitively, we project the longitude and latitude collected by GPS onto the road network through map matching and preliminarily analyze the data quality. Benefiting from Google's real-time map updates, we import the trajectory data through Google Fusion Tables [14] and check against satellite images whether the trajectories match the roads. Figure 1 shows vehicle trajectories with obvious anomaly data: intuitively, the trajectory data is prone to jumps on missing road sections, and most of the anomalous data lie on straight lines.


Table 1. Trajectory data attribute index information

Num | Index name         | Indicator implication | Notes
1   | vehicleplatenumber | Vehicle plate number  | –
2   | device_num         | Device number         | –
3   | direction_angle    | Direction angle       | Range: 0–359 (0 is due north, clockwise)
4   | lng                | Longitude             | East longitude
5   | lat                | Latitude              | North latitude
6   | acc_state          | ACC state             | Ignition 1 / flameout 0
7   | location_time      | Location time         | –
8   | gps_speed          | GPS speed             | Unit: km/h
9   | mileage            | GPS mileage           | Unit: km

Fig. 1. Replaying of vehicle trajectories containing anomaly data (left: AA00002; middle: AB00006; right: AD000419)

3 Methodology

In this section, we analyze the proposed model and give some essential definitions [6] for the TS-DBSCAN algorithm.

3.1 Model Analysis

DBSCAN obtains good clustering results for non-uniformly distributed data, avoids the interference of noise, and finds clusters of arbitrary shape [15]. However, the traditional DBSCAN algorithm needs to calculate the distance between every pair of points during clustering, which sharply raises the computational cost on massive data. The trajectory data of transportation vehicles used in this experiment is sorted strictly by time, so under normal circumstances the timestamps within the same cluster must be continuous. Considering this characteristic, we propose the TS-DBSCAN algorithm.

On the other hand, the location information and timestamps in the trajectory data of transportation vehicles are mainly obtained by GPS. Due to failures of on-board terminal equipment and dense urban high-rise buildings, the received trajectory data may suffer from GPS drift, abnormal speed values, and chaotic receiving times, so we need to preprocess the trajectory data before clustering.

3.2 TS-DBSCAN Clustering

Some concepts and terms used to explain the TS-DBSCAN algorithm are defined as follows:

Definition 1 (trajectory data): A transport vehicle loaded with satellite positioning and communication devices automatically collects its current driving status information and uploads it to the Telematics system. The collected information, such as location, time, and ignition/flameout state, constitutes the transport vehicle trajectory data P:

P = {P_i, i = 1, 2, ..., N}    (1)

P_i = {⟨lng_i, lat_i⟩, t_i, acc_i}    (2)

where ⟨lng_i, lat_i⟩ is the spatial location, t_i is the timestamp of the trajectory point, and acc_i represents the ignition/flameout status of the vehicle.

Definition 2 (ε-neighborhood): For all p_j ∈ P, the ε-neighborhood of p_j contains the sample points of P whose distance from p_j is no greater than Eps:

N(p_j) = {p_i ∈ P | distance(p_i, p_j) ≤ Eps}    (3)

and |N(p_j)| is the number of points in this subsample set.

Definition 3 (Core object): For p_j ∈ P, if the corresponding ε-neighborhood N(p_j) contains at least MinPts samples, i.e. if |N(p_j)| ≥ MinPts, then p_j is a core object.

Definition 4 (Directly density-reachable): If p_i is in the ε-neighborhood of p_j, and p_j is a core object, then p_i is directly density-reachable from p_j wrt. Eps, MinPts.

Definition 5 (Density-reachable): For p_i and p_j, if a sample sequence e_1, e_2, ..., e_T satisfies e_1 = p_j, e_T = p_i, and e_{t+1} is directly density-reachable from e_t, then p_i is density-reachable from p_j wrt. Eps, MinPts.

Definition 6 (Density-connected): If there is a data point o such that both p_i and p_j are density-reachable from o wrt. Eps and MinPts, then p_i is density-connected to p_j wrt. Eps and MinPts.


Definition 7 (Cluster): A cluster C wrt. Eps and MinPts is a non-empty subset of P satisfying: (1) for all p_i, p_j ∈ P, if p_i ∈ C and p_j is density-reachable from p_i wrt. Eps, MinPts, then p_j ∈ C; (2) for all p_i, p_j ∈ C, p_i is density-connected to p_j wrt. Eps and MinPts.

Definition 8 (Noise cluster): Let C = {C_1, C_2, ..., C_k} be the clusters of the database P wrt. the parameters Eps, MinPts; the noise is the set of points of P not belonging to any cluster C_i, i = 1, ..., k, i.e. noise = {p_i ∈ P | ∀i: p_i ∉ C_i}.

Definition 9 (T-Distance): According to the longitude and latitude in the trajectory data, we calculate the spherical distance between two adjacent timestamps by the haversine formula. The distance (TD) between adjacent trajectory points is defined as follows:

hav(TD / R) = hav(lat_2 − lat_1) + cos(lat_1) cos(lat_2) hav(lng_2 − lng_1)    (4)

hav(θ) = sin²(θ/2) = (1 − cos θ) / 2    (5)

where lat_1 and lat_2 are the latitudes of the two trajectory points, lng_1 and lng_2 are the corresponding longitudes, R is the earth radius, and TD/R is the central angle θ between the two points. From Eqs. (4) and (5) we get:

TD = 2R arcsin( \sqrt{ sin²((lat_2 − lat_1)/2) + cos(lat_1) cos(lat_2) sin²((lng_2 − lng_1)/2) } )    (6)

The algorithm starts with the first point p_1 in the trajectory data set P and adds p_1 to C_1. If the distance between the current point p_i and the previous point p_{i−1} is no greater than the neighborhood radius Eps, the two points are assigned to the same cluster; otherwise, a new cluster is created. After clustering, we count the number of points in each cluster; if a cluster contains fewer points than the minimum neighbor number MinPts, we consider it an outlier (noise) cluster. The process is repeated until all points have been processed.


The pseudo-code for the TS-DBSCAN algorithm is shown below:
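The pseudo-code figure itself did not survive text extraction, so the following Python sketch reconstructs the procedure from Definitions 1–9 and the description in Sect. 3.2; it is a hedged reconstruction, not the authors' original code, and the unit of eps must match the unit in which the T-Distance is evaluated:

import math

EARTH_RADIUS_M = 6371000.0  # mean earth radius in metres

def t_distance(p1, p2):
    """T-Distance of Definition 9 (Eq. 6): haversine distance in metres.
    Points are (lng, lat) pairs in degrees."""
    lng1, lat1 = map(math.radians, p1)
    lng2, lat2 = map(math.radians, p2)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def ts_dbscan(points, eps, min_pts):
    """Single O(N) pass over a time-ordered trajectory of (lng, lat) points.

    A point joins the current cluster if its T-Distance to the previous point
    is no greater than eps; otherwise a new cluster is opened.  Clusters with
    fewer than min_pts points are labelled as noise clusters.
    """
    clusters = [[points[0]]]
    for prev, cur in zip(points, points[1:]):
        if t_distance(prev, cur) <= eps:
            clusters[-1].append(cur)   # continue the current cluster
        else:
            clusters.append([cur])     # distance jump: start a new cluster
    normal = [c for c in clusters if len(c) >= min_pts]
    noise = [c for c in clusters if len(c) < min_pts]
    return normal, noise

With the parameters chosen in Sect. 4.2 from the distance density distribution (Eps = 0.1, MinPts = 100), the noise clusters returned here correspond to the anomalous trajectory segments that are removed.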

4 Experiments and Results

In this section, we use experimental results to show the clustering performance of TS-DBSCAN. We experimented with the trajectory data of 450 transportation vehicles and randomly selected two vehicles (AA00002 and AB00006) to demonstrate the experimental process and the results of the TS-DBSCAN algorithm. The experiment has two main steps: first, we initially clean the trajectory data collected by GPS; second, the TS-DBSCAN algorithm is used to cluster the cleaned trajectory data and detect anomalies.

4.1 Trajectory Data Preprocessing

Trajectory data preprocessing is divided into three parts: we first segment the trajectory, then remove abnormal timestamps and abnormal location information by traditional methods, as illustrated below.

Trajectory Segmentation. According to the ignition/flameout status of the vehicle over the whole driving process, we segment the trajectory, which facilitates the following experiments.

Timestamp Anomaly Data Cleaning. After trajectory segmentation, considering that GPS collects data every second, we need to eliminate the repeated timestamps caused by chaotic receiving times.

Location Information Anomaly Data Cleaning. In this step, we need to eliminate points that lie at a very large distance from the previous point, but such a distance threshold is difficult to determine. Therefore, this experiment removes positionally abnormal data according to the speed implied between two adjacent timestamps: if it exceeds a certain threshold, the point is removed. In this paper, we filtered out data whose speed exceeded 120 km/h; a sketch of this rule is given below.
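As a hedged sketch of this speed rule (it reuses the t_distance function above; the names and the exact handling of duplicate timestamps are our own):

def clean_by_speed(points, times, max_kmh=120.0):
    """Drop points whose implied speed from the previously kept point exceeds max_kmh.
    points are (lng, lat) pairs; times are UNIX timestamps in seconds."""
    kept_pts, kept_ts = [points[0]], [times[0]]
    for p, t in zip(points[1:], times[1:]):
        dt = t - kept_ts[-1]
        if dt <= 0:
            continue                   # repeated/chaotic timestamp: discard
        speed_kmh = t_distance(kept_pts[-1], p) / dt * 3.6  # m/s -> km/h
        if speed_kmh <= max_kmh:
            kept_pts.append(p)
            kept_ts.append(t)
    return kept_pts, kept_ts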

Fig. 2. Trajectory after speed-based cleaning

As shown in Fig. 2, after preprocessing there still exist a few outliers and sparse segments in the vehicle trajectory data, and the attribute characteristics of the anomalous trajectories are very similar to those of the normal data. This means that the criterion based on timestamps and speed is not enough to filter out all invalid data; we therefore cluster the trajectory data with the TS-DBSCAN algorithm to eliminate most of the remaining invalid data.

4.2 Clustering Experiment and Results

Inspired by the DBSCAN algorithm, we propose the time-series-based DBSCAN (TS-DBSCAN) algorithm. The trajectory data, after preliminary cleaning and preprocessing, is taken as the input of TS-DBSCAN, and the algorithm only calculates the distance between each two adjacent timestamps. By observing the distance density distribution (shown in Fig. 3), the ε-neighborhood radius Eps is set to 0.1 and the minimum neighbor number MinPts is set to 100.

Fig. 3. Distance density distribution


The reachability-distance curve (shown in Fig. 4) reflects the distance information between adjacent timestamps. The many regions parallel to the horizontal axis indicate that the data points there are relatively dense, while the raised parts indicate large distances between data points, which can be used to segment the data into clusters.

Fig. 4. Illustration of the cluster-ordering

In Fig. 5, we can clearly see the effect of trajectory clustering. The purple trajectory points inside the red circles are noise after clustering and can be judged as noise trajectory clusters; we remove these clusters to ensure the authenticity of the data. Finally, this paper uses the Python interactive mapping module folium [16] to draw the transportation roadmaps of vehicles AA00002 and AB00006 (the two figures on the right of Fig. 5).

Fig. 5. Trajectory clustering and trajectory path of TS-DBSCAN algorithm


5 Conclusion

In this paper, we proposed an anomaly detection method for transportation vehicle trajectories based on the TS-DBSCAN algorithm. The main advantage of our approach is that we only need to calculate the distance between each two adjacent time points as the input of TS-DBSCAN, which effectively reduces the computational burden of the algorithm. Meanwhile, we determined the ε-neighborhood radius Eps and the minimum neighbor number MinPts from the distance density distribution between adjacent timestamps. We experimented on the trajectory data of transportation vehicles, and the analysis shows that TS-DBSCAN can adapt to the non-uniform distribution of trajectory data and detect abnormal trajectory data effectively and accurately, providing support for further mining and analysis of the trajectory data of transportation vehicles.

Acknowledgment. This work was supported in part by projects of the National Science Foundation of China (41971340, 41471333, 61304199), project 2017A13025 of Science and Technology Development Center, Ministry of Education, project 2018Y3001 of Fujian Provincial Department of Science and Technology, and projects of Fujian Provincial Department of Education (JA14209, JA15325, FBJG20180049).

References
1. Kaiwartya, O., Abdullah, A.H., Cao, Y., et al.: Internet of vehicles: motivation, layered architecture, network model, challenges and future aspects. IEEE Access 4, 5356–5373 (2017)
2. Liao, L., Jiang, X., Zou, F.: A spectral clustering method for big trajectory data mining with latent semantic correlation. Chin. J. Electron. 43(5), 956–964 (2015)
3. Wang, F., Chen, C.: On data processing required to derive mobility patterns from passively-generated mobile phone data. Transp. Res. Part C: Emerg. Technol. 87, 58–74 (2018)
4. Chang, C.-C., Lin, C.-J.: ACM Transactions on Intelligent Systems and Technology. ACM Trans. Intell. Syst. Technol. 2(3), 27 (2011)
5. Wang, Q., Lv, W., Du, B.: Spatio-temporal anomaly detection in traffic data. In: Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control. ACM (2018)
6. Ester, M., Kriegel, H.-P., Sander, J., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD (1996)
7. Liu, F.T., Ting, K.M., Zhou, Z.-H.: Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data (TKDD) 6(1), 3 (2012)
8. Angiulli, F., Pizzuti, C.: Fast outlier detection in high dimensional spaces. In: European Conference on Principles of Data Mining and Knowledge Discovery. Springer (2002)
9. Lvchao, L., Xinhua, J., Fumin, Z., et al.: A fast method of FCD trajectory data clustering based on the directed density. J. Geo-Inform. Sci. 17(10), 1152–1161 (2015)
10. Song, J., Guo, Y., Wang, B.: Research on parameter configuration method of DBSCAN clustering algorithm. Comput. Technol. Dev. 29(05), 44–48 (2019)
11. Gui, Z., Yu, H., Tang, Y.: Locating traffic hot routes from massive taxi tracks in clusters. J. Inf. Sci. Eng. 32(1), 113–131 (2016)
12. Sawant, K.: Adaptive methods for determining DBSCAN parameters. Int. J. Innov. Sci. Eng. Technol. 1(4), 329–334 (2014)
13. Shi-bo, Z., Wei-xiang, X.: A novel clustering algorithm based on relative density and decision graph. Control Decis. 33(11), 1921–1930 (2018)
14. Gonzalez, H., Halevy, A.Y., Jensen, C.S., et al.: Google Fusion Tables: web-centered data management and collaboration. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data. ACM (2010)
15. Ali, S.Z., Verma, V.K., Sharma, P.R.: An obscure method for clustering density and noise using DBSCAN and Chameleon algorithm. In: 2018 2nd International Conference on Inventive Systems and Control (ICISC). IEEE (2018)
16. Cuttone, A., Lehmann, S., Larsen, J.E.: geoplotlib: a Python toolbox for visualizing geographical data. arXiv preprint arXiv:1608.01933 (2016)

Abnormal Analysis of Electricity Data Acquisition in Electricity Information Acquisition System

GuanWei Xu1 and Rongjin Zheng2(✉)

1 Fujian University of Technology, Fuzhou 350118, Fujian, China
2 Smart Grid Simulation Analysis and Integrated Control Engineering Research Center, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected]

Abstract. With the progress of science and technology and the development of the national economy, users' demands on power supply quality and power consumption are gradually increasing. In order to effectively improve the efficiency of power management and planning and to obtain effective and reliable electricity data, the electricity information acquisition system plays a very important role. This paper focuses on the problem of abnormal electricity quantity in the data acquisition of this system, gives its causes, and, through the analysis of an example, puts forward corresponding improvement measures. Keywords: Electricity information acquisition system · Data acquisition · Electricity abnormality

1 Introduction

The electric power user information acquisition system mainly includes four parts: the master station, the transmission channel, the acquisition equipment, and the electronic watt-hour meter (i.e., the smart watt-hour meter). It collects, processes, and analyses the information and data of distribution transformers and power users in order to realize power monitoring, tiered pricing, load management, and line-loss analysis, ultimately achieving automatic meter reading, peak staggering, power consumption inspection (anti-electricity-theft), load forecasting, and cost saving. The acquisition-equipment part of the system, also known as the electricity information acquisition terminal, realizes data acquisition, processing, management, bidirectional transmission, and the forwarding or execution of control commands. The acquisition terminals are mainly divided into three types: dedicated-transformer acquisition terminals, centralized meter-reading terminals, and distributed energy monitoring terminals.

In practical applications, abnormal data acquisition often occurs, that is, individual or overall acquired data does not conform to the actual power consumption [1]. In order to ensure the normal and orderly operation of the whole


power system, avoid safety accidents, and reduce economic losses for power-related enterprises, it is very important to ensure the accuracy and reliability of the electricity data collected by the system during operation. Therefore, it is of great significance to prevent and discover abnormal electricity acquisition in the power information acquisition system in advance, to analyze the causes, and to give corresponding improvement measures. This paper is divided into three parts: first, based on practical acquisition operation and maintenance experience combined with relevant information, several causes of abnormal data acquisition are given; second, the paper presents the practical problems found in the electricity information acquisition system of Jinjing Substation in Jinjiang City, Fujian Province; finally, several methods are provided to effectively mitigate abnormal data acquisition.

2 Causes of Abnormal Electricity Data

There are many causes of abnormal data acquisition in the electricity information acquisition system in practice, mainly manifested in the following aspects.

2.1 Abnormal Electric Energy Metering Device

The quality of the electric energy metering device directly determines the accuracy and reliability of the electricity data acquisition system [2]. To ensure accurate and reliable electricity data, failures of the metering device must be discovered and prevented in advance. The possible failures of the device's four parts (energy meter, transformer, terminal, and junction box) are analyzed below.

Abnormal Electric Energy Meter. The electric energy meter is one of the most vulnerable parts of the metering device [3]. There are many causes of its failure, and the type and source of each failure must be analyzed to deal with it effectively. In practice, watt-hour meters exhibit several fault types: (1) battery failure, i.e., a short circuit of the battery in the watt-hour meter, or battery power that is too low or even zero; (2) display failure, which causes the staff to read the meter incorrectly; (3) software faults or program errors in the watt-hour meter, which cause a black screen or even paralysis of the device; (4) memory faults in the watt-hour meter, so that data cannot be stored normally; (5) aging or damage of electronic components due to long working hours.

Transformer Abnormality. Compared with the above-mentioned energy meter faults, transformer faults occur less frequently, but they are more complex and of many types. Such faults will also cause abnormal


data collection. Therefore, it is necessary to scientifically analyze the specific causes of transformer faults and to prevent or resolve them in advance or in time. According to practical operation experience, transformer failures have several causes: corona partial discharge or complete discharge in the transformer, which introduces errors into data acquisition; incorrect connection of the transformer, which prevents normal operation of the equipment; ferromagnetic resonance or an open secondary circuit, which affects the transmitted current and in turn causes abnormal acquisition [4]; a burnt-out or blown transformer fuse; and quality problems such as dampness of components caused by the surrounding environment.

Terminal Exception. Terminal faults occur with about the same frequency as transformer faults, but the possibility cannot be ruled out. Besides the terminal itself, they also involve its communication and power supply. For terminal anomalies, the various cases of data anomalies they cause must be fully considered and analyzed to ensure accurate data acquisition. The causes of terminal abnormalities are as follows: errors in terminal timing; errors in terminal software, resulting in data loss through white or black screens; faults of the terminal equipment and abnormal carrier modules; and interface failures.

Abnormal Junction Box. An abnormal junction box will also greatly affect the data acquisition and operation of the electricity information acquisition system. Firstly, because the number of electricity users has gradually increased in recent years, total power consumption has grown, leaving the junction boxes in substations overloaded. Long-term operation in this state leads to various adverse conditions and can even damage the junction box, causing abnormal data acquisition: for example, overheating of the junction box loosens joints and produces sparks, or the terminal plates oxidize, so that connections are no longer tight and contact quality declines. Secondly, installation problems occur: some screws of the junction box are not tightened, or terminals are left unconnected or poorly connected, so the junction box malfunctions and the acquisition process of the electricity information acquisition system cannot run normally, producing abnormal data.

2.2 Data Channel Exception

After data acquisition is completed at the acquisition terminal, the data must be transmitted, so the data transmission channel is indispensable in the system. If the data channel is abnormal, corresponding errors appear in the collected data, so analyzing and handling data channel anomalies is of great significance [5]. There are three main data channels in the power information acquisition system: the dedicated-line data channel, the dial-up data channel, and the network data channel.


Abnormal Dedicated-Line Data Channel. This channel suffers frequent bursts of error codes. When the channel is abnormal, the data collected by the system will also be abnormal, which affects the efficiency of the whole system.

Abnormal Dial-Up Data Channel. When this abnormality occurs, the probability of abnormal data collected by the system increases greatly.

Network Data Channel Anomaly. This anomaly is mainly manifested in faults of the router and of the network card of the data acquisition terminal. The quality of routers, network cards, and similar components strongly affects the operation of network acquisition terminals; if their quality is low, they are prone to malfunction, interrupting data transmission in this channel and producing data errors in the power information acquisition system.

2.3 Abnormalities Caused by Environmental Factors

The centralized meter-reading terminal in the electric power information acquisition system includes a collector and a concentrator. The installation location affects the concentrator to a certain extent: for example, if the concentrator antenna is installed inside the box transformer, the stability of the GPRS signal is degraded. The concentrator then cannot effectively receive information from the master station, meter registration at the concentrator cannot proceed smoothly, and the energy meters cannot accept commands from the concentrator, resulting in abnormal situations such as loss of the energy data collected by the meters.

2.4 Abnormalities Caused by Human Factors

Data anomalies caused by human factors fall into two categories.

Man-Made Destruction. With the development of intelligent metering, smart meters have gradually replaced traditional mechanical meters, but the habits some users formed with mechanical meters have not changed: when using smart meters, they tamper with the circuits inside the meter, which is one cause of abnormal electricity data in the system.

Data Acquisition Errors. These usually occur in the early stage of applying the electricity information acquisition system: the staff collecting the data are inexperienced, unfamiliar with the forms, or the forms are unreasonably designed, leading to irregular or wrong entries during acquisition and thus to errors in the collected electricity data.

3 Analysis of Abnormal Electricity Data

The second part of this paper introduced and briefly analyzed the causes of abnormal electricity data acquisition in terms of metering devices, data channels, environmental factors, and human factors. By sorting and summarizing all kinds of faults over a period of time, it can be found that abnormal electricity data acquisition caused by metering device faults accounts for 91% of the total. It follows that metering device faults are the key factor determining accurate data acquisition in the power information acquisition system. The specific causes of data anomalies are shown in Table 1.

Table 1. Distribution of causes of data anomalies

Cause                          | Number of failures | Times affecting electric energy meter | Percentage | Accumulated percentage
Abnormal junction box          | 32                 | 320                                   | 54.1%      | 54.1%
Communication failure          | 2                  | 128                                   | 21.7%      | 75.8%
Abnormal electric energy meter | 5                  | 46                                    | 7.8%       | 83.6%
Data acquisition error         | 37                 | 37                                    | 6.3%       | 89.8%
Terminal fault                 | 4                  | 34                                    | 5.8%       | 95.6%
Man-made destruction           | 2                  | 16                                    | 2.7%       | 98.3%
Transformer fault              | 10                 | 10                                    | 1.7%       | 100%

The rest of this section illustrates the practical problems found in the 10 kV bus electricity acquisition at Jinjing Substation in Jinjiang City, Fujian Province, and specifically analyzes the abnormal electricity data acquisition caused by a metering device fault.

3.1 Example Introduction

The electricity information acquisition system monitors the power consumption at each substation gateway in real time and processes and analyses the collected data, so that the metering staff can check whether the power readings at the gateway are wrong. According to the law of conservation of energy, Q0 = Q1 + P, where Q0 is the input power of the bus, Q1 is the output power of the bus, and P is the loss during power transmission. The ratio of the loss to the input power is called the bus unbalance rate, denoted u, so u = P/Q0. This unbalance rate can be used to judge whether the collected electricity data is accurate. Table 2 shows the 10 kV bus balance report of Jinjing Substation on a certain day.

166

G. Xu and R. Zheng Table 2. The 10 kV bus balance report of a date in Jinjing Substation. Line spacing Output active power/Mwh Input active power/Mwh 1# incoming line 0 834600 2# incoming line 0 590400 Outgoing line 1 0 0 Outgoing line 2 60120 0 Outgoing line 3 77520 0 Outgoing line 4 88560 0 Outgoing line 5 41040 0 Outgoing line 6 37680 0 Outgoing line 7 39520 0 Outgoing line 8 92600 0 Outgoing line 9 32200 0 Outgoing line 10 119700 0 Outgoing line 11 70920 0 Outgoing line 12 83920 0 Outgoing line 13 59640 0 Outgoing line 14 81840 0 Outgoing line 15 90240 0 Outgoing line 16 63480 0 Outgoing line 17 0 0 Outgoing line 18 23880 0 Outgoing line 19 0 0 Outgoing line 20 128000 0 Outgoing line 21 116880 0 Outgoing line 22 0 0 Total 1307740 1425000

From Table 2, the bus loss is the sum of the input active power minus the sum of the output active power, i.e., 117260 MWh. The unbalance rate can then be calculated by the bus unbalance rate formula as 8.23%. Because the unbalance rate of buses at 110 kV and below should be between −2% and 2% under normal conditions, it can be concluded that part of the electricity data in the balance report is abnormal.
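The arithmetic behind this judgement can be checked directly; a small Python sketch (variable names are ours):

# Bus unbalance rate u = P / Q0 from the balance report in Table 2.
q0 = 834600 + 590400   # input active power of 1# and 2# incoming lines, MWh
q1 = 1307740           # total output active power of all outgoing lines, MWh
p = q0 - q1            # loss P = Q0 - Q1 = 117260 MWh
u = p / q0             # 0.0823 -> 8.23%, far outside the normal +/-2% band
print(f"u = {u:.2%}")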

3.2 Cause Analysis

In order to further analyze which part of the power data is abnormal, the primary connection form of the 10 kV bus in the substation is given according to the balance report (outgoing lines 1–11 belong to section A and outgoing lines 12–22 to section B), as shown in Fig. 1.


Fig. 1. Primary wiring of 10 kV bus in Jinjing Substation.

From Fig. 1, it can be seen that the main wiring form of the electrical part of the substation is a single sectionalized bus, and the two sections of the 10 kV bus operate separately. Active power is input through incoming lines 1 and 2 into 10 kV buses A and B respectively. Therefore, the following analysis treats the two sections separately. Calculating with the values in Table 2 shows that the total input power of incoming line 2 is basically equal to the total output power of all outgoing lines in section B, while the difference between input and output power on the section A bus is large, so the abnormal power data must lie on the section A bus. Next, the output power of outgoing lines 1–11 of bus A is analyzed. After investigation, all lines in section A except outgoing line 7 are normal and meet the requirements of the system data. Therefore, outgoing line 7 has abnormal data. The load curve of line 7 on the same day is shown in Fig. 2.

Fig. 2. Outgoing line 7 load curve.

Figure 2 compares the daily hourly load curve of line 7 in the SCADA system with the daily increment curve of the line. From the curve changes, it can be seen that the increment of the line drops rapidly between 6:00 and 7:00 on that day, down to 1/3 of the actual load, so a two-phase voltage loss fault may have occurred at that time. Field verification found that the energy meter of this line had issued alarm information for U- and W-phase voltage loss. After checking with the staff, it was learned that a switching operation of the over-line was carried out between 6 and 7 o'clock on the same day, which burned out the voltage fuse of the metering device, resulting in two-phase voltage loss and abnormal acquisition of electricity data.

4 Improvement Measures

The analysis of the problems in Sect. 3 of this paper shows that the main causes of the abnormal electric quantity are that the switching operation of the line was not reset in time and the metering voltage became abnormal, which burned out the voltage fuse of the metering device and made the collection of electricity data abnormal. At the same time, monitoring of the voltage-loss alarm of the electric energy meter was insufficient, so the voltage-loss problem was not found in time.

4.1 Combining with the Actual Situation, Reduce Device Failures

In view of the problems in the example, the following improvement measures are put forward. When switching the circuit, the operation should be checked in time, observing whether the relay has picked up, to ensure that the metering voltage changes normally. The quality of the watt-hour meter should be improved and the meter checked regularly; watt-hour meters without a voltage-loss alarm module should be rectified, and monitoring of the voltage-loss alarm signal strengthened, so that voltage loss can be detected and repaired in time.

In addition to the damaged voltage fuse in this example, corresponding measures should be taken against possible faults of other metering devices in light of the specific work and actual situation:
- Dry the internal components of the device to prevent the transformer from getting damp in its environment.
- Strengthen monitoring of junction box installation and protection of the wiring, keeping the junction box under normal load operation as far as possible.
- Have managers make a scientific and reasonable inspection and maintenance plan according to the specific work and actual situation, which helps improve the transmission efficiency of the data channel.
- Control and monitor the acquisition terminals and improve the use efficiency of the communication terminal, power terminal, and internal terminal parameters, so as to reduce the number and frequency of abnormal electricity data.
- Since electricity data acquisition terminals work directly in the field and are vulnerable to various factors, debug each terminal in time after successful acquisition to ensure that the system is in a normal operating state before putting it into use.

4.2 Optimizing the External Environment and Reasonable Overall Planning

Section 2 of this paper discussed the influence of geographical location on the concentrator. To solve this problem, high-gain antennas can be used to reduce data anomalies and enhance the ability to transmit and receive information. To create an environment in which the electricity information acquisition system runs well and the whole system remains stable, the number of concentrators can be increased appropriately, or several harmonic blockers can be installed to reduce the adverse interference caused by environmental factors. In addition to the improvement measures for metering devices and environmental problems, coordinating early- and late-stage management is of great significance for the normal and stable operation of the system. Before the system is put into use, a detailed field survey is necessary, covering the users' concentration time periods, peak power consumption, and so on. Based on analysis of the survey results, reasonable planning can find the most effective time period for meter-reading collection, improve the success rate of power data collection, and reduce the frequency of abnormal collection. After the system is put into use, to ensure that data acquisition proceeds smoothly, technicians need to check other electrical equipment and find and eliminate interference sources.

5 Conclusion

Improving the collection of electricity information, increasing its accuracy and reliability, and reducing the probability of abnormal electric energy measurement data are of great significance for the operation and development of the power system. This paper gave several major causes of abnormal electric energy data acquisition: faults of the measuring device itself, abnormal data channels, environmental factors, and human interference. Taking the abnormal electric energy acquisition on the 10 kV bus of Jinjing Substation in Jinjiang City, Fujian Province, which was caused by an internal fault of the measuring device, as a breakthrough point, the abnormal example was analyzed and corresponding improvement measures were given. Meanwhile, for the influence of metering device faults, environmental factors, and human factors in other situations, several methods of improving data acquisition were briefly introduced, such as reducing device faults, optimizing the external environment, and reasonable coordinated planning.

Acknowledgment. This work is supported by Fuzhou Science and Technology Project (2018G-30).



A Method of Power Network Security Analysis Considering Cascading Trip

Hui-Qiong Deng1,2, Xing-Ying Lin1,2(&), Peng-Peng Wu1,2, Qin-Bin Li1,2, and Chao-Gang Li1,2

1 School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected], [email protected]
2 Fujian Provincial University Engineering Research Center of Smart Grid Simulation Analysis and Integrated Control, Fuzhou 350118, Fujian, China

Abstract. In this paper, a grid security index considering cascading tripping is proposed, and an algorithm for the index is given in combination with the operating behavior of relay protection, so as to address the cascading tripping problem of the power system. Firstly, according to the current behavior of branch cascading tripping and the action of line current-mode backup protection, the critical state of grid cascading trips is analyzed, and it is shown that the input power of grid nodes can be used to describe the security of the grid. Then, based on the relationship between the node input power and the security level of the power grid, a model is presented to represent the security level of a power system that suffers no cascading failures under the impact of initial faults, and the model is solved by particle swarm optimization. An example proves the validity of the algorithm.

Keywords: Power system · Security index · Cascade tripping · Particle swarm optimization

1 Introduction

Cascading failures of power grids can cause large blackouts, so the problem of grid cascading failure is receiving more and more attention. However, cascading failures are a stubborn problem that is extremely difficult to eliminate: every few years, some country in the world experiences cascading power failures and blackouts. In the past ten years, researchers have studied cascading failures from various perspectives and achieved many results. For example, literature [1] reveals the internal mechanism of large-scale cascading failures in power systems from the perspective of self-organizing criticality. Literature [2] studies the possible development paths of cascading failures in combination with seepage theory. Literature [3] proposes an algorithm for screening initial faults, interlocking disturbed branches, and partitioning the nodes associated with interlocking disturbed branches. Literature [4] studies an improved model for cascading failure analysis of substation automation systems. Literature [5] identifies the fragile lines in the system by the degree of nodes and the betweenness of edges. Literature [6] studies the impact of load redistribution strategies on grid security during cascading failures based on local real-time information.

The early stage of cascading failures in a power grid is the cascading trip. Its specific form is as follows: when the initially faulted branches are removed, some of the remaining branches may be removed by backup protection because the operation state of the power grid readjusts; this is the second stage of the failure. The operation state of the grid then readjusts again, new trips may occur, and so on. Therefore, to avoid cascading failures, cascading trips should be prevented as early as possible, which requires in-depth study of cascading trips. This paper combines the actual form of cascading trips and analyzes the situation in which the power grid is in the critical state of cascading trips. Combined with the node injection power of the power grid, a security index considering cascading trips is proposed, an optimization model for calculating the index is given, and a solution algorithm for the model is presented. Finally, the proposed algorithm is analyzed and validated on the IEEE 14-bus system.

2 Mathematical Representation of Branch Chain Trips

For convenience of analysis, this paper only considers the case in which the backup protection of a line is current protection. Assume that an initial fault occurs on a branch $L_i$ of the power grid at a certain time. When branch $L_i$ is removed and the power flow of the grid redistributes, whether any branch $L_j$ in the remaining part of the grid will undergo a cascading trip can be measured by formula (1):

$$I_{j.dist} = \lvert I_{j.set}\rvert - \lvert I_j\rvert \qquad (1)$$

In formula (1), $I_j$ is the current of branch $L_j$ after power flow redistribution, $I_{j.set}$ is the current setting value of the backup protection on branch $L_j$, and $I_{j.dist}$ measures the electrical distance between $I_{j.set}$ and $I_j$. From the concept of chain tripping and formula (1), when $I_{j.dist} > 0$ branch $L_j$ will not undergo a chain trip, and when $I_{j.dist} \le 0$ branch $L_j$ will be removed by backup protection, that is, a chain trip occurs on $L_j$. $I_{j.dist} = 0$ is a special case in which branch $L_j$ is just at the boundary state of chain tripping. Formula (1) thus gives a mathematical expression for the analysis of branch chain trips.
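As an illustration of how formula (1) can screen the remaining branches, the following Python sketch evaluates $I_{j.dist}$ for a few branches; the currents and setting values are hypothetical, not data from the paper:

# Screen remaining branches for cascading trips using formula (1):
# I_dist = |I_set| - |I|. I_dist <= 0 means the branch would be
# removed by backup protection; I_dist = 0 is the boundary state.
branches = {                    # hypothetical post-fault currents (kA)
    "L2": {"I_set": 3.0, "I": 2.4},
    "L3": {"I_set": 3.0, "I": 3.1},
    "L4": {"I_set": 3.0, "I": 3.0},
}

for name, b in branches.items():
    I_dist = abs(b["I_set"]) - abs(b["I"])
    if I_dist > 0:
        state = "secure"
    elif I_dist == 0:
        state = "boundary state of chain tripping"
    else:
        state = "chain trip (removed by backup protection)"
    print(f"{name}: I_dist = {I_dist:+.2f} kA -> {state}")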

3 Considering the Critical State of Cascading Trips in All Branches

Assume that after the initially faulted branch is removed, the number of remaining branches in the power grid is $l$. Taking all remaining branches into account according to formula (1), the matrix shown in formula (2) can be given.


$$K = \mathrm{diag}\big(I_{1.dist},\; \ldots,\; I_{j.dist},\; \ldots,\; I_{l.dist}\big) \qquad (2)$$

$I_{j.dist}$ in formula (2) has the same meaning as in formula (1), and $K$ is a diagonal matrix. According to the concrete manifestation of interlocking tripping, for a power grid operating in a given state, after the originally faulted branch is removed and the power flow is transferred, if the currents of all remaining branches satisfy formula (3), the power grid is in the critical state of cascading trips.

$$\begin{cases} \lvert K \rvert = 0 \\ I_{j.dist} \ge 0, & j = 1, 2, \ldots, l \end{cases} \qquad (3)$$

Further, from the knowledge of power flow calculation, after the initial failure of branch $L_i$ the power flow equation of the power system can be expressed by Eq. (4):

$$Y\dot{U} = \left(\frac{S}{\dot{U}}\right)^{*} \qquad (4)$$

In formula (4), $Y$ is the node admittance matrix of the grid after the outage of branch $L_i$, and $\dot U$ is the node voltage vector after the outage. For a particular power grid and a particular initial fault branch $L_i$, $Y$ is a matrix with fixed elements, so the $\dot U$ in formula (4) is mainly determined by the node injection power $S$. If the change of node injection power before and after the $L_i$ outage is neglected, $\tilde S$ remains unchanged before and after the outage, so $\dot U$ mainly depends on the node injection power $\tilde S$ of the grid before the outage. It is also known from power flow calculation that $I_j$ in formula (1) is mainly determined by the node voltages after the initial failure of branch $L_i$; therefore, combined with (4), $I_j$ is mainly determined by the node injection power. It follows that, for a given initial fault, the critical operation state of the grid after the fault is removed is mainly determined by the operation state before the fault, and that state is in turn mainly determined by the injected power of the grid nodes. Thus, the security of the power grid can be described by the node injection power.

4 Power Network Security Index Considering Cascading Trips

According to the above analysis, when cascading trips are considered, if the power grid runs in a state in which no cascading trip occurs after the impact of an initial fault, then the farther this operation state is from the operation boundary, the safer the grid is. Therefore, this paper uses the shortest distance between the node injection power in the current operation state and that in the critical state to measure the security level of the power grid.


If $\tilde S_0$ is the power vector injected into the nodes in the current operation state $s_1$, and $\tilde S$ is the power vector injected into the nodes in a certain boundary operation state $s_2$, the distance between $\tilde S_0$ and $\tilde S$ can be expressed by formula (5):

$$D(S) = \big\lVert \tilde S_0 - \tilde S \big\rVert \qquad (5)$$

In formula (5), $D(S)$ represents the norm of the difference between $\tilde S_0$ and $\tilde S$. Furthermore, the shortest distance between the node injection power in the current operation state and that in the critical state can be expressed as follows:

$$F = \min D(S) \qquad (6)$$

From the above analysis, when the power grid operates in the current state and is not subject to the impact of an initial fault, the greater the value of $F$, the safer the power grid. Therefore, considering cascading trips, $F$ can be used as an indicator of the security of the power grid. Before the initial fault occurs, the grid operating in state $s_2$ must satisfy the power flow constraint of steady-state operation, expressed in the form shown in formula (7):

$$h_0(x) = 0 \qquad (7)$$

In formula (7), $h_0$ is the power flow mapping relation before the initial fault occurs, and $x$ is the state variable of the power flow. Before the initial fault occurs, when the power grid operates in state $s_2$, it should also satisfy the inequality constraints of formula (8):

$$\begin{cases} P_{Gi,\min} \le P_{Gi} \le P_{Gi,\max}, & i = 1, \ldots, N_1 \\ Q_{Gi,\min} \le Q_{Gi} \le Q_{Gi,\max}, & i = 1, \ldots, N_1 \\ P_m \le P_{m,\max}, & m = 1, 2, \ldots, l \\ U_{k,\min} \le U_k \le U_{k,\max}, & k = 1, 2, \ldots, N_2 \end{cases} \qquad (8)$$

Among them, $P_{Gi}$ and $Q_{Gi}$ are the active and reactive power of the $i$-th generator in the system; $P_{Gi.\min}$ and $P_{Gi.\max}$ are the lower and upper limits of the active power of the $i$-th generator; $Q_{Gi.\min}$ and $Q_{Gi.\max}$ are the lower and upper limits of its reactive power; $N_1$ is the total number of generators in the grid; $P_m$ is the active power transmitted by branch $L_m$ and $P_{m.\max}$ is its upper limit; $U_k$ is the voltage of node $k$, with $U_{k.\min}$ and $U_{k.\max}$ the lower and upper voltage limits allowed at node $k$; and $N_2$ is the total number of grid nodes. Formula (8) can be abbreviated as formula (9).


$$g_0(x) \ge 0 \qquad (9)$$

After the initial fault occurs, the power flow constraint should still be satisfied under the node injection power corresponding to state $s_2$, which can be expressed as Eq. (10):

$$h_k(x) = 0 \qquad (10)$$

Among them, $h_k$ is the power flow mapping relation for the redistributed power flow after the initial fault occurs in state $s_2$. Combining formula (3) with formulas (6)–(10), the model for calculating $D(S)$ shown in formula (11) can be given:

$$\begin{aligned} \min\; & D(S) = \big\lVert \tilde S_0 - \tilde S \big\rVert \\ \text{s.t.}\; & h_0(x) = 0 \\ & h_k(x) = 0 \\ & \lvert K \rvert = 0 \\ & g_0(x) \ge 0 \\ & I_{j.dist} \ge 0, \quad j = 1, 2, \ldots, l \end{aligned} \qquad (11)$$

Formula (11) is an optimization model. The variable to be optimized is the node injection power corresponding to the operation state of the interlocking tripping boundary. Its goal is to find the nearest state on the boundary to the current operation state s1 and the node injection power in this state.

5 Calculation Algorithm of the Security Index

This paper solves the optimization model of formula (11) with particle swarm optimization, mainly because the algorithm is easy to program, simple, and fast, is suitable for complex optimization problems that are difficult for classical optimization algorithms, and has been used in many fields. The basic particle swarm optimization iterates in the form of formula (12):

$$\begin{cases} v_i^{k+1} = w\, v_i^{k} + c_1 r_1 \big(pbest_i - x_i^{k}\big) + c_2 r_2 \big(gbest - x_i^{k}\big) \\ x_i^{k+1} = x_i^{k} + v_i^{k+1} \end{cases} \qquad (12)$$

In formula (12), $x_i^k$ is the position of particle $i$ at the $k$-th iteration; $v_i^k$ is its velocity at the $k$-th iteration, generally required to satisfy $v_{\min} \le v_i^k \le v_{\max}$; $pbest_i$ is the best solution experienced by particle $i$ itself; $gbest$ is the best solution experienced by the whole swarm; $w$ is the inertia coefficient, which decreases linearly from 0.9 to 0.1; $c_1$ and $c_2$ are the acceleration constants, generally taken as 2; and $r_1$ and $r_2$ are random numbers uniformly distributed in the interval [0, 1].
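The following NumPy sketch shows the iteration of formula (12) with the parameter choices just described. It is a generic illustration only: a simple sphere function stands in for the paper's actual fitness, which is the penalized distance defined later in formula (14):

import numpy as np

def pso(fitness, dim=4, n_particles=30, iters=100, v_max=0.5):
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))    # positions x_i
    v = np.zeros((n_particles, dim))                  # velocities v_i
    pbest = x.copy()
    pbest_val = np.apply_along_axis(fitness, 1, x)
    gbest = pbest[pbest_val.argmin()].copy()
    c1 = c2 = 2.0                                     # acceleration constants
    for k in range(iters):
        w = 0.9 - 0.8 * k / (iters - 1)               # w: 0.9 -> 0.1 linearly
        r1 = rng.random((n_particles, 1))             # r1, r2 ~ U[0, 1]
        r2 = rng.random((n_particles, 1))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -v_max, v_max)                 # v_min <= v_i <= v_max
        x = x + v
        val = np.apply_along_axis(fitness, 1, x)
        better = val < pbest_val
        pbest[better], pbest_val[better] = x[better], val[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

best_x, best_f = pso(lambda z: float(np.sum(z ** 2)))  # stand-in fitness
print(best_x, best_f)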


To cooperate with particle swarm optimization, the model of formula (11) is simplified appropriately: (1) the power flow constraints in formula (11) are turned into power flow calculations, and if the power flow constraints are not satisfied during iteration, new particles are generated; (2) $\lvert K \rvert = 0$ in formula (11) is recorded as $f(x) = 0$, and $I_{j.dist} \ge 0$ in formula (11) is recorded as $g_1(x) \ge 0$. In this way, the constraints to be handled in formula (11) can be expressed as formula (13):

$$\begin{cases} f(x) = 0 \\ g_0(x) \ge 0 \\ g_1(x) \ge 0 \end{cases} \qquad (13)$$

Based on the above analysis, combining formula (11) and formula (13) gives the penalty function shown in formula (14):

$$D' = D + \sum_k \frac{1}{a_k}\big[\min\big(0,\, g_0(x)\big)\big]^2 + \sum_k \frac{1}{b_k}\big[\min\big(0,\, g_1(x)\big)\big]^2 + \frac{1}{c}\big[f(x)\big]^2 \qquad (14)$$

In formula (14), $a_k$, $b_k$, and $c$ are penalty factors. Using the penalty function of formula (14) together with particle swarm optimization, the specific algorithm flow is as follows:
1. Calculate whether the power flow converges after the initial fault.
2. Generate new ground-state data from the original ground-state data.
3. Test the convergence of the newly generated ground-state data; if the calculation fails, re-verify the convergence; otherwise, take the generated data as the new ground-state data.
4. Initialize the particle swarm.
5. Calculate the fitness of each particle.
6. Update the individual and global optimum values of the particles, and update the velocity and position of each particle.
7. When the optimal result is reached, stop the iteration and output the result; otherwise, return to step 4 and continue to search for the optimal solution.

The above process of analyzing security against chain tripping can be compared to the following driving behavior. In an open space, a car is running at a predetermined speed and direction. There are N traps around the car, all of which the car could have avoided. Then an unpredictable external force suddenly acts on the car, causing it to step into the nearest of the N traps and fall in. Here the running car corresponds to the operation state of the power grid before the fault, the uncontrollable external force corresponds to the initial fault, and the car stepping into a trap corresponds to the phenomenon of cascading trips in the grid. The main task of this paper is to find the trap nearest to the car and to adjust the car's speed and direction so that, even if it is subjected to the unpredictable external force, it will not fall into the trap.
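To make the fitness evaluation concrete, here is a small Python sketch of the penalized objective of formula (14). The constraint values are placeholders: in the paper, f(x) corresponds to |K|, g0 to the operating limits of formula (8), and g1 to the branch distances I_j.dist. The penalty factors follow the values used in the example below (0.5, 0.001, 0.001):

import numpy as np

a_k, b_k, c = 0.5, 0.001, 0.001        # penalty factors

def penalized_distance(S, S0, g0_vals, g1_vals, f_val):
    D = np.linalg.norm(S0 - S)         # distance of formula (5)
    pen_g0 = sum(min(0.0, g) ** 2 for g in g0_vals) / a_k
    pen_g1 = sum(min(0.0, g) ** 2 for g in g1_vals) / b_k
    pen_f = (f_val ** 2) / c
    return D + pen_g0 + pen_g1 + pen_f

# toy call with made-up node injection vectors and constraint values
S0 = np.array([1.0, 0.5])
S = np.array([0.9, 0.6])
print(penalized_distance(S, S0, g0_vals=[0.1, -0.02], g1_vals=[0.05], f_val=0.01))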

6 Example

To further explain the security margin analysis model and solution algorithm for cascading trips proposed in this paper, an example on the IEEE 14-bus system is given below. The connection of the IEEE 14-bus system is shown in Fig. 1. The component parameters and injected power data of the system can be found in reference [7].

Fig. 1. Diagram of IEEE14-bus system

In this example, the running state of the system corresponding to the basic data given in reference [7] is recorded as the current state of the system. According to the definition and algorithm above, what needs to be calculated is the minimum distance between the current state of the system and the corresponding operating boundary, that is, the security margin. Based on the idea of solving the problem with improved particle swarm optimization, an analysis program was written in the Matlab environment. In the iteration, the parameters of particle swarm optimization are set as follows: $c_1 = c_2 = 2$, and $w$ decreases linearly from 0.9 to 0.1. The initial fault branch is assumed to be the branch between node 4 and node 5, namely branch $L_{4,5}$. To simplify the example analysis, this paper assumes a backup protection configuration for each branch of the system shown in Fig. 1; because of the lack of relevant data, virtual data are used and the protection setting value is assumed to be 3 kA.


In the following example analysis, the values of $D'$ and the fitness are expressed in per unit. In the calculation, the reference capacity of the grid is taken as 100 MVA, and the reference voltage is consistent with that of the IEEE 14-bus data given in reference [7]. In formula (14), the values of $a_k$, $b_k$, and $c$ are 0.5, 0.001, and 0.001; $b_k$ and $c$ are smaller mainly because the constraints corresponding to these two penalty factors are more closely related to cascading trips. Figure 2 shows the calculation results for the $D'$ value of formula (14) using the basic particle swarm optimization algorithm. A total of 100 iterations were performed, and the abscissa in Fig. 2 is the number of iterations. As can be seen from the figure, when the program has run 100 times, the corresponding security index value reaches a maximum of 0.5521, and no better result appears.

Fig. 2. Trends in maximum fitness

Figure 3 shows the shortest distance, based on Eq. (5), between the injected power of the nodes in the current state of the grid and that in the critical state. A total of 100 iterations were carried out, with the abscissa again the number of iterations. As can be seen from the figure, when the system has run 100 times, the shortest distance found is 0.0052. Table 1 lists the relevant data calculated by particle swarm optimization, including the active and reactive power of the PQ nodes of the system in Fig. 1 (expressed as load: positive for output, negative for injection) and the active injection of the PV nodes (the data of node 7 in Fig. 1 are not listed). The unit of active power is MW and the unit of reactive power is Mvar.


Fig. 3. Shortest distance change data

Table 1. The relevant data

PQ node  Active power (MW)  Reactive power (Mvar)
4        47.756             −2.338
5        7.306              2.963
9        29.824             −1.913
10       9.518              7.696
11       4.047              3.242
12       6.067              2.963
13       13.324             7.696
14       15.175             6.826

PV node  Active power (MW)
2        21.7
3        94.2
6        11.2
8        0

From the above results, it can be seen that, using the algorithm of this paper and considering the action of relay protection, the critical point of the nearest cascading tripping accident can easily be found when an initial fault occurs in the power grid. As long as that critical point is avoided, chain tripping accidents can be prevented, which is very advantageous.

7 Conclusion

The cascading trip phenomenon of a power grid is closely related to the action of line relay protection. Combining the action of relay protection and considering that the backup protection of each line is current protection, this paper presents a security index for cascading trips. The method observes the shortest distance between the current operation state of the power system and the critical state of cascading trips under an initial fault, expressed through the node injection power in the current operation state and in the critical state. The effectiveness of the method is demonstrated by an example analysis, which can provide a reference for further research.


Acknowledgment. This research was financially supported by Doctoral Research Foundation of Fujian University of Technology under the grant GY-Z13104, and Scientific Research and Development Foundation of Fujian University of Technology under the grant GY-Z17149.

References
1. Cao, J., Zhang, Y., Lin, H., et al.: Self-organized criticality identification of power system based on homogeneity. Electric Power Autom. Equip. 33(7), 6–11, 18 (2013)
2. Zhang, J., Tong, X., Jiang, J.: Power system cascading failure analysis based on seepage and risk theory. Power Syst. Autom. 41(5), 46–52 (2017)
3. Deng, H., Li, P., Zheng, R.: Analysis of disturbed branches and associated nodes in cascading faults of power grid. J. Fujian Inst. Eng. 13(3), 223–228 (2015)
4. Yang, J., Luo, X., Qu, C., et al.: Cascading failure analysis model of substation automation system. Power Syst. Autom. 40(23), 36–41 (2016)
5. Fu, L., Huang, W., Xiao, S., et al.: Vulnerability assessment for power grid based on small-world topological model. In: Power and Energy Engineering Conference, pp. 1–4. IEEE (2010)
6. Song, B., Zhang, Z.: Preferential redistribution in cascading failure by considering local real-time information. Elsevier Ltd. 128, 5–15 (2019)
7. Zhang, B.: Analysis of Higher Power Network. Tsinghua University Press, Beijing (2007)

Construction Area Identification Method Based on Spatial-Temporal Trajectory of Slag Truck

Jinjuan Wen1,2, Fumin Zou1,2(&), Lyuchao Liao1,2, Rong Hu1,2, Zhiyuan Hu1,2, Zhihui Chen1,2, Qiqin Cai1,2, and Jierui Liu1,2

1 Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, 350118 Fuzhou, China
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
2 Fujian Provincial Big Data Institute for Intelligent Transportation, Fujian University of Technology, 350118 Fuzhou, China

Abstract. Enclosure construction easily causes traffic congestion, so timely identification of construction areas is crucial to alleviating congestion. At present, most existing methods are either susceptible to obstacles such as clouds and fog or only suitable for small-scale areas. To address these problems, this paper proposes an effective automatic identification method that mines the spatial-temporal trajectory data of common engineering vehicles such as slag trucks to determine construction areas automatically. To reduce memory consumption and the influence of the two parameters ε (field radius) and MinPts (domain density threshold) on the clustering results, we first divide the trajectory space, match the points to a grid, and generate cluster candidate sets by extracting high-density cells based on a preset density threshold. Then the DBSCAN algorithm is used to identify the construction areas, which greatly shortens the running time. The experimental results, verified with ArcGIS & Google Earth, show that the method is effective.

Keywords: Construction area · Slag truck · Spatial-temporal trajectory data · DBSCAN algorithm · ArcGIS & Google Earth

1 Introduction

Over the past 40 years of reform and opening up, with the rapid development of China's economy, the scale of infrastructure construction such as roads, bridges, and subways has continuously expanded. Because enclosure construction areas easily cause traffic congestion, timely judgment of them is significant for alleviating congestion. At present, construction area information mainly comes from the relevant announcements on government websites, but these cannot truly reflect whether a construction area is under construction or has been completed. Therefore, domestic and foreign scholars have conducted research on the problem of construction area identification. Literature [1] measured the construction progress of elevated railways using high-resolution satellite remote sensing images. References [2, 3] used remote sensing images to detect the dynamic changes of airport and farmland areas; however, such methods are highly susceptible to obstacles such as clouds and fog, which further affect the measurement results. References [4, 5] visualized the progress of infrastructure construction by UAV (unmanned aerial vehicle) and accurately extracted markers such as tower cranes in the images to identify construction areas, but this only meets the identification requirements of small-scale areas and has a high cost.

2 Related Definition The traffic trajectory data is the spatial information collected by the vehicle satellite positioning terminal [13]. The slag truck is a kind of common engineering vehicle. Due to its track points are dense in the construction area, result in obvious accumulation in time and space. 2.1

Spatial-Temporal Trajectory Sequence

Given a car, a sequence of spatial-temporal tracks consisting of space, time, and other information collected during driving is represented as follows:

Construction Area Identification Method Based on Spatial-Temporal

Di ¼ fP0 ðid0 ; t0 ; x0 ; y0 Þ; P1 ðid1 ; t1 ; x1 ; y1 Þ; . . .. . .; Pn ðidn ; tn ; xn ; yn Þg

183

ð1Þ

where Pn ðidn ; tn ; xn ; yn Þ represents the space-time data composed of vehicle id, timestamp, longitude and latitude information, and n ¼ LenðDi Þ is the length of the spatial-temporal track sequence, that is the number of points. 2.2

Grid Division

Assumed that geographical features are characterized by longitude and latitude range: Lonmin  Lonmax , Latmin  Latmax , We define D in M * N spatial grid: 

M ¼ ðLonmax  Lonmin Þ=l N ¼ ðLatmax  Latmin Þ=w

ð2Þ

For any vehicle in D, the trajectory point (Lon, Lat) can be mapped to the corresponding grid: 

m ¼ ðLon  Lonmin Þ=l n ¼ ðLat  Latmin Þ=w

ð3Þ

Where l and w represent the grid size. The spatial grid set ðG ¼ gi Þ is obtained by formula (2) for any i; jði 6¼ jÞ; gi \ gj ¼ ;. 2.3

Grid Density

The total number of spatial trajectory points of the grid element is regarded as the gi network density, recorded as Denðgi Þ. When Denðgi Þ ¼ 0, called an empty grid unit, When Denðgi Þ [ 0, called a non-empty grid unit. High Density Grid Units For cells gi , if its grid density satisfies: Denðgi Þ  r

ð4Þ

In the formula: r is the preset grid density threshold. If the grid unit gi satisfies formula (4), it is called high density network unit.

3 Methodology The density of the track point reflects the intensity of the track data in the effective region. In the density-based clustering algorithm, the density value calculation method will have an important influence on the clustering result [14]. Because the track data of the engineering vehicle shows the obvious distribution of point cluster and linear characteristics, the distance difference and variance between two points are large, which seriously affects the clustering effect.

184

J. Wen et al.

This paper uses the grid method to calculate the density of the track points, and the cluster candidate set D´ with high density is established, which is beneficial to the next two parameters of e and MinPts in the DBSCAN algorithm setting. The method comprises the following steps of: (1) divide track space into a grid cell, and preset density threshold r; (2) map track points to a grid unit; (3) generate a clustering D´ by extracting grid cells with high density based on r; (4) use the DBSCAN algorithm to cluster the D´ to realize the identification of the construction area. The pseudo-code of the DNSCAN algorithm based on the grid density is shown in Table 1 below: Table 1. Pseudo-code of DBSCAN algorithm based on grid method

Algorithm: A Grid-based DBSCAN Algorithm Input: D: data set Output: set C of clusters based on density Initialization: grid density threshold ; domain radius ε; domain density threshold MinPts; grid size * ; empty dictionary dict_G, dict_C; empty dataset D '; new cluster C 1) the S is split into a grid by the formula (2) / * grid-based partition * / 2) for each point in D: match each point to the corresponding g∈dict_G by the formula (3) if g_ID in dict_G: dict_G[g_ID]. Append (current point) else create a new dictionary item [ g_ID:current point] end if end for 3) for each g_ID in dict_G: dict_C [g_ID] = Len [ dict_G [ g_ID]] end for for each g_ID in dict_C: if dict_C [g_ID]> / * select high-density grid */ add the dict_G. Key, dict_G. values to D´ end if end for 4) for each point in D´: / * a grid-based DBSCAN algorithm */ if ε - neighborhood of p has Minpts objects Add p to C A set of objects in an e-neighborhood where N is p for each p´ point in N: if ε - neighborhood of p´ has Minpts objects, add this point to N; if p´ not yet a member of any cluster., add p´ to C end for end for output: C until traverses all D points

Construction Area Identification Method Based on Spatial-Temporal

185

The points contained in the new cluster C are put into ArcGIS, and the.csv file is converted into. kmz file by Arc Toolbox, and then visualization is carried out on Google Earth to prove whether it is a construction area.

4 Experiment The experimental data set of this paper is the real GPS spatial-temporal trajectory data of 48 slag vehicles in Xiapu County. The selection time is from September 1, 2018, to November 30, 2018. 4.1

Experimental Data Description

The trajectory data contains the vehicle ID, timestamp, longitude, latitude, speed and other information. The specific data structure format is shown in Table 2. Table 2. Track data structure of slag truck Serial number Field name 1 F_ID 2 F_TIME 3 F_LON 4 F_LAT 5 F_SPEED

Describe vehicle ID timestamp longitude latitude speed

Example minJ28788 2018-10-30 08:20:00 120.31608 26.877706 24

As shown in Fig. 1 below, it is the 3D trajectory Diagram of a Slag vehicle one day, where the X and Y axis represent the longitude and latitude of the trajectory data points, the Z axis is the data point timestamp.

Fig. 1. The 3D trajectory data of MinJ35266 truck

186

J. Wen et al.

From the diagram, the overlapping part of the trajectory points indicates that this is the area of dense activity of the slag truck, which is suspected to be the construction area. The sparse part of the track point, which is the track where the slag truck carries materials on the road. 4.2

Experimental Results and Analysis

The slag truck with F_ID of MinJ35266 has a total of 5192 trajectory data in the range of 2018-11-30 07:16:42 to 2048-11-30 21:11:58. After matching, the high-density network is selected. The clustering candidate set data formed by Gurley’s trajectory points is 1818, and then the coordinates of centroids 1–3 by DBSCAN clustering are (120.023066°, 26.871235°), (120.025322°, 26.87601°), (120.017225°, 26.874154°).

J35266traj cluster result

(120.017225∞, 26.874154∞) (120.025322∞, 26.87601∞) (120.025322∞, 26.87601∞)

Fig. 2. MinJ35266 truck 2018-11-30 trajectory data clustering result

The clustering of candidate sets by DBSCAN is shown in Fig. 2 above. On the way, the Y axis is the longitude and latitude of the trajectory data points, the green points correspond to the one-day trajectory, and the red  is the clustering center and shows the longitude and latitude coordinates, which are the area we’re going to verify next. The verification of experimental results uses ArcGIS & Google Earth, first make the clustering result.csv file type convert (. kmz) by the Arc Toolbox of ArcGIS, then visualize in Google Earth, finally confirm the centroid 1, 2 are construction area, the centroid 3 is the material area (mainly at the foot of mountain, mostly soil or sand, etc.). We will focus on the construction area identified in centroid 1, 2, as shown in Fig. 3 below.

Construction Area Identification Method Based on Spatial-Temporal

187

(120.025322∞, 26.87601∞)

(120.023066∞, 26.871235∞)

Fig. 3. The centroids 1, 2 are located construction area

Using the above experimental method, next randomly selects 10 slag truck to identify the construction area. The final results show 24 suspected construction areas, and then through the verification find 21 construction areas. Analyzing the three non-construction areas with misidentification, find that one of the three areas is a multi-forked intersection of a more complex mountain road, the trajectory is dense and prone to misjudgment; The remaining two areas are material areas, because the soil transport by slag truck have designated locations, it is easy to cause high density of track points, which further leads to misjudgment.

5 Conclusion Since the timely identification of the construction area is very important to alleviate the traffic congestion of the enclosure, this paper presents a construction area identification method based on the spatial-temporal trajectory data of the slag truck. For the original DBSCAN algorithm, e (field radius) and MinPts (domain density threshold) are flexible in setting two parameters. The spatial grid method is introduced to remove noise points and extracts the high density grid, and establishes the clustering candidate set. Then uses the DBSCAN algorithm to realize automatic recognition of the construction area. Finally, through the matching between the visualization of ArcGIS & Google Earth and the clustered trajectory shows that the method can effectively find the construction area. The construction area identification method proposed in this paper can provide decision-making basis for relevant government management departments, and can also provide electronic map manufacturers with the preparatory information for updating the base map to provide related services for urban management and public travel. Finally, it is necessary to distinguish the material area from the construction area through the Visualization of Google Earth. This process requires manual analysis and statistics. The future work will be devoted to solving this problem and establishing a suitable model to further realize the distinction between the two. Acknowledgment. This work was supported in part by projects of Provincial Economic and Trade Commission (Rong Finance Enterprise (refers) [2018] 41) Research and Application of Urban Traffic accident Express and violation report system based on vehicle networking Technology, project of Municipal Bureau of Science and Technology (2019-G-40) Research and

188

J. Wen et al.

Application of key Technology of’Beautiful Travel’ Fuzhou Traffic smooth Service system, project of Institute Research and Development Fund (GY-Z17151, GY-Z13125, GY-Z160064), project of General Project of Provincial Natural Science Found (2019I0019), project of General Project of Science and Technology at Education Department level (JAT170368).

References 1. Behnam, A., Wickramasinghe, D.C., Ghaffar, M.A., et al.: Automated progress monitoring system for linear infrastructure projects using satellite re-mote sensing. Autom. Constr. 68, 114–127 (2016) 2. Zhu, D., Wang, B., Zhang, L.: Target detection in remote sensing images: a new method based on two-way saliency. IEEE Geosci. Remote Sens. Lett. 12(5), 1096–1100 (2015) 3. Valero, S., Morin, D., Inglada, J., et al.: Production of a dynamic cropland mask by processing remote sensing image series at high temporal and spatial resolutions. Remote Sens. 8(1), 55 (2016) 4. Ham, Y., Han, K., Lin, J., et al.: Visual monitoring of civil infrastructure systems via camera-equipped Unmanned Aerial Vehicles (UAVs): a review of related works. Vis. Eng. 4(1), 1 (2016) 5. Yu, B., Niu, W., Wang, L., et al.: A tower crane extraction method based on multi-scale adaptive morphology. Remote Sens. Technol. Appl. 28(2), 240–244 (2013) 6. Chen, Y., Zhang, L., Hai, J.: Research progress and trend of big data-driven intelligent transportation system. Chin. J. Internet Things (1), 56–63 (2018) 7. Zhang, L., Xin, Z., Zhaohui, J., et al.: Mining urban attractive areas using taxi trajectory data. Comput. Appl. Softw. (1), 1–8 (2018) 8. Liu, S., Wang, S.: Trajectory community discovery and recommendation by multi-source dif-fusion modeling. IEEE Trans. Knowl. Data Eng. 29(4), 898–911 (2017) 9. Lvchao, L., Xinhua, J., Fumin, Z.: A spectral clustering method for big trajectory data mining with latent semantic correlation. Acta Electron. Sinica 43(5), 956–964 (2015) 10. Zhi, L., Huiping, L., Dapeng, Z., et al.: Business circle population mobility statistics based on mobile trajectory data. J. East China Normal Univ. (Nat. Sci.) (04), 97–113 + 138 (2017) 11. Hu, R., Chiu, Y.C., Hsieh, C.W., et al.: Mass rapid transit system passenger traffic forecast using a re-sample recurrent neural network. J. Adv. Transp. 2019 (2019) 12. Deren, L., Jun, M., Zhenfeng, S.: Discuss of space-time big data and its application. Satellite Application (09), 7–11 (2015) 13. Zou, F., Liao, L., Jiang, X., Lai, H.: An automatic recognition approach for traffic congestion states based on traffic video. Highw. Traffic Sci. Technol. Engl. Ed. 8(2), 72–80 (2014) 14. Meiwei, H., Hualin, D., Kun, H.: Optimization of density-based K-means algorithm in trajectory data clustering. J. Comput. Appl. 10, 2946–2951 (2017)

Monitoring Applications and Mobile Apps

Study on Hazardous Scenario Analysis of High-Tech Facilities and Emergency Response Mechanism of Science and Technology Parks Based on IoT

Kuo-Chi Chang1, Kai-Chun Chu2(&), Yuh-Chung Lin1, Trong-The Nguyen1, Tien-Wen Sung1, Yu-Wen Zhou1, and Jeng-Shyang Pan1

1 Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou, China
2 Department of Business Administration Group of Strategic Management, National Central University, Taoyuan, Taiwan
[email protected]

Abstract. A high-tech plant uses large amounts of electrical and mechanical energy, heat, radiant energy, and chemical energy, and its process requirements also introduce human-factor hazards. Improving the maintenance and rescue capacity of each unit is therefore essential, especially on the basis of big data and IoT. This study analyzes the hazard types and possible scenarios of high-tech plants, covering fire, explosion, and toxic gas leakage, uses risk assessment methods, and simulates disaster prevention operations. The sensing layer is mainly carried out in three parts: fire, explosion, and toxic gas leakage. Finally, the rationality of this study is verified.

Keywords: Hazards · Scenario analysis · Emergency Response Management Smart Mechanism · Big data · IoT

1 Introduction

High-tech processes, tools, and facility systems may cause injuries and are extremely prone to fire, explosion, poisoning, corrosion, suffocation, and similar hazards. In addition, the process tools in high-tech plants use large amounts of electrical and mechanical energy, heat, radiant energy, and chemical energy, and the process requirements introduce human-factor hazards. All activities may endanger the operators and cause stagnation or interruption of the production line [1, 2]. The diversity of high-tech industry hazards also makes disaster rescue difficult: high-tech plant processes mostly take place in clean rooms with wide floor areas and many machines, so when an accident unfortunately occurs it is not easy to discover the origin of the disaster. Rescue materials, rescue manpower, and on-site command and dispatch are therefore extremely difficult to arrange, and with large quantities of hazardous materials scattered in the plant, the lives of rescue personnel are also seriously threatened. The past cases of major high-tech plant fires in Taiwan confirm this [3]. For semiconductor and LCD plants in general high-tech parks, the main types of hazards are fire, explosion, toxic gas leakage, chemical leakage, high temperature, electric shock, mechanical crimping, mechanical entrapment, mechanical collision, and material collapse. To carry out emergency response and rescue, we must first understand how a hazard occurs, estimate its severity and frequency of occurrence, and further confirm the hazard risk value of the current operation. This risk value is used to confirm whether the plants have done their own risk prevention work; if not, the manufacturer must be required to improve operational safety. Improving the maintenance and rescue capacity of each unit, especially based on big data and IoT, ensures that when a disaster occurs it can be handled quickly and effectively. This is the most important contribution of the Emergency Response Management Smart Mechanism in this study [4–6].

2 Methodology

This study analyzes high-tech plant hazard types and possible scenarios. Whatever hazard occurs, we should be able to confirm the scenario of its occurrence and assess the extent of its impact, which is the severity analysis. This study chooses fire, explosion, and toxic gas leakage for analysis, as follows [7, 8].

(1) Fire and Explosion. Dow's Fire & Explosion Index is a very effective method for evaluating fires; Eq. (1) is the evaluation formula:

$$F = MF \times (1 + GPH_{tot}) \times (1 + SPH_{tot}) \qquad (1)$$

where MF is the material coefficient (0–40), GPHtot is the general process hazard point (0.25–1.5), and SPHtot is the special process hazard point (0.25–1.5). The principle of judgment is shown in Table 1.

Table 1. Judgment principle of Dow's Fire & Explosion Index

Hazard level  Fire explosion index
Level 1       F < 65
Level 2       65 ≤ F < 95
Level 3       F ≥ 95
※ Level 3 is the most harmful
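A small Python sketch of Eq. (1) together with the Table 1 grading; the sample inputs are illustrative, not values from the study:

def fire_explosion_index(mf, gph_tot, sph_tot):
    # Dow Fire & Explosion Index, Eq. (1)
    return mf * (1 + gph_tot) * (1 + sph_tot)

def hazard_level(f):
    # grading of Table 1
    if f < 65:
        return "Level 1"
    if f < 95:
        return "Level 2"
    return "Level 3"      # the most harmful

F = fire_explosion_index(mf=24, gph_tot=1.0, sph_tot=1.2)   # F = 105.6
print(F, hazard_level(F))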

Toxicity index T > = ð0Þ ð0Þ ð0Þ ð0Þ ð0Þ ð0Þ f 2 ðx1 þ Dx1 ; x2 þ Dx2 ;       xn þ Dxn Þ ¼ 0 >  > ; ð0Þ ð0Þ ð0Þ ð0Þ ð0Þ ð0Þ f n ðx1 þ Dx1 ; x2 þ Dx2 ;       xn ¼ Dxn Þ ¼ 0 Spread the n equations around the initial value with the Taylor series, and ignore the higher order terms of the quadratic term and above: 2

ð0Þ ð0Þ ð0Þ f1 ðx1 ; x2 ;       xn Þ ð0Þ ð0Þ ð0Þ f2 ðx1 ; x2 ;       xn Þ

3

2





@f1  @x1 0

@f1  @x2 0

0

0



6 7   6 7 @f2  6 @f2  7  6 @x2 0 7 ¼ 6 @x1 0 7 6 5 4                   @fn @fn ð0Þ ð0Þ ð0Þ  fn ðx1 ; x2 ;       xn Þ @x1  @x2 

6 6 6 6 6  4

 32

@f1  @xn 0

ð0Þ

Dx1

3

7 6  7 76 7 76 Dxð0Þ 7 2 7 76 76 7 67       7 5 5 4 @fn  @xn 0 Dxð0Þ n @f2  @xn 0

FðX ðkÞ Þ ¼ J ðkÞ DX ðkÞ X ðk þ 1Þ ¼ X ðkÞ þ DX ðkÞ The Newton-Raphson method can generally be used directly in solving the power flow power equation of a power system. The power system is composed of a generator, a transformer, a transmission line and a load, wherein the generator and the load are nonlinear components. However, when performing power flow calculation, it is generally indicated by a current injection amount connected to the corresponding node. Therefore, the power network used for power flow calculation consists of static linear components such as transformers, transmission lines, capacitors, and reactors, and is simulated by series and parallel equivalent branches represented by concentrated parameters. Combined with the characteristics of the power system, the analysis of such a linear network is generally carried out by the node method.

Power Flow Calculation Based Newton-Raphson Calculation 

Pi  jQi ¼ U i

Pi ¼ ei Qi ¼ fi

n  X

Yij U_ j

j¼1

9 n  X > Gij ej  Bij fj þ fi Gij fj þ Bij ej > > > = j¼1

j¼1 n  X j¼1

n X

203



n  X  > > Gij ej  Bij fj ei Gij fj þ Bij ej > > ; j¼1

For PQ nodes, given Pis and Qis , the power equations are written as follows: DPi ¼ Pis  Pi ¼ Pis  ei DQi ¼ Qis  Qi ¼ Qis  fi

n  X



Gij ej  Bij fj fi

j¼1 n  X

n  X j¼1



Gij ej  Bij fj þ ei

j¼1

n  X j¼1

9 > Gij fj þ Bij ej ¼ 0 > > > = 

>  > > Gij fj þ Bij ej ¼ 0 > ;

For PV nodes, given Pis and Uis , the power equation is written as follows: 9 n   P  Gij ej  Bij fj fi Gij fj þ Bij ej ¼ 0 = j¼1 j¼1   ; DUi2 ¼ Uis2  Ui2 ¼ Uis2  e2i þ fi2 ¼ 0

DPi ¼ Pis  Pi ¼ Pis  ei

n  P

Let n have all node in the system, where m are PQ nodes, n − (m + 1) are PV nodes, and 1 is a balanced node, then the total active equation is n − 1, and the reactive power equation is m The voltage equation is n − (m + 1) and the total number of equations is 2(n − 1): 2

@DP1 @e1 @DQ1 @e1

6 6 6 6 .. 6 . 6 @DPm 6 6 @e1 6 @DQm 6 @e1 6 J ¼ 6 @DPm þ 1 6 @e1 6 6 0 6 6 . 6 . 6 . 6 @DPn1 6 @e1 4 0

@DP1 @f1 @DQ1 @f1

.. .

@DPm @f1 @DQm @f1 @DPm þ 1 @e1

     

0 .. .



@DPn1 @f1

 

0



@DP1 @em @DQ1 @em

.. .

@DP1 @fm @DQ1 @fm

.. .

@DPm @em @DQm @em @DPm þ 1 @em

@DPm @fm @DQm @fm @DPm þ 1 @fm

0 .. .

0 .. .

@DP1 @em þ 1 @DQ1 @em þ 1

.. .

@DP1 @fm þ 1 @DQ1 @fm þ 1

.. .

@DPm @em þ 1 @DQm @em þ 1 @DPm þ 1 @em þ 1 @DUm2 þ 1 @em þ 1

@DPm @fm þ 1 @DQm @fm þ 1 @DPm þ 1 @fm þ 1 @DUm2 þ 1 @fm þ 1

.. .

.. .

@DPn1 @em

@DPn1 @fm

@DPn1 @em þ 1

@DPn1 @fm þ 1

1

0

0

0

         .. .

@DP1 @en1 @DQ1 @en1

@DP1 @fn¼1 @DQ1 @fn1

@DPm @en1 @DQm @en1 @DPm @en1

@DPm @fn¼1 @DQm @fn1 @DPm @fn1

0 .. .

0 .. .

.. .

.. .

@DPn1 @en1

@DPn1 @fn1

@DUm2 þ 1 @em þ 1

@DUm2 þ 1 @em þ 1

3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5

204

P. Sun et al.

2

DP1 DQ1 .. .

3

7 6 7 6 7 6 7 6 7 6 6 DPm 7 7 6 6 DQm 7 7 DF ¼ 6 6 DPm þ 1 7 7 6 6 DUm2 þ 1 7 7 6 7 6 .. 7 6 . 7 6 5 4 DP n1

2 DUn1

2

De1 Df1 .. .

3

7 6 7 6 7 6 7 6 7 6 6 Dem 7 7 6 6 Dfm 7 7 DX ¼ 6 6 Dem þ 1 7 7 6 6 Dfm þ 1 7 7 6 6 . 7 6 .. 7 7 6 5 4 De n1

Dfn1

Block diagram of Newton-Raphson power flow calculation:

1. Form the node admittance matrix [3] from the network parameters.
2. Assign initial values [2] to each node voltage.
3. Substitute the initial voltages into the correction equations to compute the power and voltage offsets $\Delta P_i^{(0)}, \Delta Q_i^{(0)}, (\Delta U_i^{(0)})^2$ of each node.
4. Compute the elements of the Jacobian matrix [3].
5. Solve the correction equation for the voltage corrections $\Delta e_i^{(0)}, \Delta f_i^{(0)}$.
6. Compute the new voltage values $e_i^{(1)} = e_i^{(0)} + \Delta e_i^{(0)}$ and $f_i^{(1)} = f_i^{(0)} + \Delta f_i^{(0)}$.
7. Compute $\Delta P_i^{(1)}, \Delta Q_i^{(1)}, (\Delta U_i^{(1)})^2$.
8. Check for convergence. As the voltages approach their true values, the power offsets tend to zero, so the test is $\max\left|f(x_i^{(k)})\right| = \max\left\{\left|\Delta P^{(k)}\right|, \left|\Delta Q^{(k)}\right|\right\} < \varepsilon$.
9. If the test fails, return to step 4 and iterate again; if it passes, proceed to the final calculation.
10. Calculate the power distribution in each line and the balance-node power, and output the results [4].

A minimal code sketch of the loop in steps 3-9 follows.
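As a hedged illustration only, the iteration above can be sketched in Python; the helpers mismatch() and jacobian(), which evaluate the offset and Jacobian formulas given earlier, are assumed and are not part of the original program.

```python
import numpy as np

def newton_raphson_power_flow(x0, mismatch, jacobian, eps=1e-6, max_iter=50):
    # x holds the rectangular voltage components [e1, f1, ..., e_{n-1}, f_{n-1}].
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        dF = mismatch(x)              # steps 3/7: the ΔP, ΔQ, ΔU² offsets
        if np.max(np.abs(dF)) < eps:  # step 8: convergence test
            return x, k
        J = jacobian(x)               # step 4: Jacobian elements
        dx = np.linalg.solve(J, dF)   # step 5: solve J ΔX = ΔF
        x = x + dx                    # step 6: update the voltages
    raise RuntimeError("power flow did not converge")
```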

3 Power Flow Calculation Simulation Results and Analysis

Taking the IEEE standard three-machine, nine-node power system [3] as an example, we programmed the power flow calculation with the Newton-Raphson algorithm. The figure shows the wiring diagram of the three-machine nine-node system. The parameters of each node are given in the table below; "?" marks the variables to be determined.


Node parameters (reference values: 100 MVA, 220 kV); P and Q are the control variables, V and θ the state variables.

Bus  Attribute     Type     P      Q      V      θ
1    Generator     Balance  ?      ?      1.04   0
2    Generator     PV       1.63   ?      1.025  ?
3    Generator     PV       0.85   ?      1.025  ?
4    Contact line  PQ       0      0      ?      ?
5    Load          PQ       −1.25  −0.5   ?      ?
6    Load          PQ       −0.9   −0.3   ?      ?
7    Contact line  PQ       0      0      ?      ?
8    Load          PQ       −1     −0.35  ?      ?
9    Contact line  PQ       0      0      ?      ?

The simulation results are given in the table below.

Node power flow distribution

Bus    V [p.u.]  Phase [rad]  P gen [p.u.]  Q gen [p.u.]  P load [p.u.]  Q load [p.u.]
Bus 1  1.04000    0.00000     0.71641        0.27046      0              0
Bus 2  1.02500    0.16197     1.63000        0.06654      0              0
Bus 3  1.02500    0.08142     0.85000       −0.10860      0              0
Bus 4  1.02579   −0.03869     (0.00000)      0.00000      0              0
Bus 5  0.99563   −0.06962     (0.00000)      0.00000      1.25           0.5
Bus 6  1.01265   −0.06436     (0.00000)      0.00000      0.9            0.3
Bus 7  1.02577    0.06492     0.00000        0.00000      0              0
Bus 8  1.01588    0.01270     (0.00000)      0.00000      1              0.35
Bus 9  1.03235    0.03433     0.00000        0.00000      0              0

We find the voltage amplitude and phase angle of each node and plot the node voltage distribution.

Analysis of the tabular data shows that the voltage amplitudes at the generator nodes are higher than those at the load nodes. The loads absorb active and reactive power, which causes the amplitude to drop, in line with objective physical laws. The branch power distribution is shown in the following tables.

Line flows

Line  From bus  To bus  P flow [p.u.]  Q flow [p.u.]  P loss [p.u.]  Q loss [p.u.]
1     Bus 9     Bus 8    0.24183        0.03120       0.00088        −0.21176
2     Bus 7     Bus 8    0.76380       −0.00797       0.00475        −0.11502
3     Bus 9     Bus 6    0.60817       −0.18075       0.01354        −0.31531
4     Bus 7     Bus 5    0.86620       −0.08381       0.02300        −0.19694
5     Bus 5     Bus 4   −0.40680       −0.38687       0.00258        −0.15794
6     Bus 6     Bus 4   −0.30537       −0.16543       0.00166        −0.15513
7     Bus 2     Bus 7    1.63000        0.06654       0.00000         0.15832
8     Bus 3     Bus 9    0.85000       −0.10860       0.00000         0.04096
9     Bus 1     Bus 4    0.71641        0.27046       0.00000         0.03123

Line flows (reverse direction)

Line  From bus  To bus  P flow [p.u.]  Q flow [p.u.]  P loss [p.u.]  Q loss [p.u.]
1     Bus 8     Bus 9   −0.24095       −0.24296       0.00088        −0.21176
2     Bus 8     Bus 7   −0.75905       −0.10704       0.00475        −0.11502
3     Bus 6     Bus 9   −0.59463       −0.13457       0.01354        −0.31531
4     Bus 5     Bus 7   −0.84320       −0.11313       0.02300        −0.19694
5     Bus 4     Bus 5    0.40937        0.22893       0.00258        −0.15794
6     Bus 4     Bus 6    0.30704        0.01030       0.00166        −0.15513
7     Bus 7     Bus 2   −1.63000        0.09178       0.00000         0.15832
8     Bus 9     Bus 3   −0.85000        0.14955       0.00000         0.04096
9     Bus 4     Bus 1   −0.71641       −0.23923       0.00000         0.03123

Based on the tabular data, we plot the power flow graphs, drawing the active and reactive power separately.


In a power system, active power always flows from a node with a higher voltage phase angle to a node with a lower one. The flow diagram confirms that the active power always flows from the generator side to the load side, and that the lines consume active power.

The flow of reactive power is more complex: it flows both to the load side and to the transformer side, which is well explained by the physical mechanism. In the power system, besides the loads absorbing reactive power, the transformers also need to absorb reactive power to establish the magnetic field that transmits electrical energy. The transmission lines have a capacitive effect to ground [4] and therefore inject reactive power into the system. Whether active or reactive, the power of each branch is balanced.

Global summary report

                  Real power [p.u.]  Reactive power [p.u.]
Total generation  3.196410215         0.228398751
Total load        3.15                1.15
Total losses      0.046410215        −0.921601249

4 Conclusion

The power system power flow equations are essentially a set of nonlinear algebraic equations, which can be solved mathematically with the Newton-Raphson method. In this paper, the three-machine nine-node system is taken as an example and simulated and analyzed through programming. The method converges quickly and reliably, the calculation results are consistent with the actual physical system, and a reasonable physical interpretation can be given. It is therefore feasible to use the Newton-Raphson method for power system power flow calculation.

References

1. Qin Yu, Qin Xiaojia, Wang Mengwei: Research on power system power flow calculation algorithm. Electron. World (21) (2018)
2. Zhong Tao, Zhang Zhen, Li Xinming: Talking about the application of power flow calculation in power system. Sci. Technol. Wind (30) (2018)
3. Sun Qiuye, Chen Huimin, Yang Jianong: Convergence analysis of power flow calculation methods. Chin. J. Electr. Eng. (13) (2014)
4. Lu Jiewei, Liu Jinhua, Li You: A new method for storage of node admittance matrix elements. J. Nanchang Univ. (Eng. Ed.) (02) (2017)
5. Hu Jian, Yang Xuan, Chen Fan: Improved algorithm based on power flow calculation of Newton-Raphson power system. Comput. Technol. Autom. (04) (2013)
6. Wan Chenyu: The development trend of electrical automation and its application in power systems. Sci. Technol. Innov. (01) (2018)

Research on the Application of Instance Segmentation Algorithm in the Counting of Metro Waiting Population

Yan Cang, Chan Chen(&), and Yulong Qiao

Harbin Engineering University, Harbin, China
[email protected]

Abstract. With the development of deep learning, intelligent security, measurement and control products emerge in endlessly. Real-time, accurate counting of the number of people queuing in front of subway doors is important for maintaining the operating order of subway stations and is a significant problem in the field of deep learning. To solve it, we propose a simple and flexible real-time counting method: images are collected by the camera in front of the subway door, an instance segmentation algorithm accurately segments the targets in each image, and the queue count is obtained by counting the segmented targets. We select the mainstream Mask R-CNN as the basic algorithm and, combining the characteristics of the live pictures, improve its feature pyramid network and non-maximum suppression process. The experimental results show that the algorithm achieves accurate, real-time counting of the number of people queuing in front of the subway, with a counting accuracy of 96%. Compared with traditional target detection algorithms it is more robust to occlusion, accurately segmenting targets whose mutual overlap rises from under 30% to under 60%, and its real-time performance is greatly improved: counting the targets in a single picture takes only about 0.2 s.

Keywords: Instance segmentation · Target counting · Deep learning · Feature pyramid network

1 Introduction

In recent years, with the advance of urbanization, the number of urban residents has continued to increase and urban traffic pressure keeps growing. As the fastest public transportation currently available, the subway faces particularly prominent operating pressure. In 2019 the daily passenger volume of the Beijing subway reached 10.101 million, so ensuring passengers' travel efficiency and safety has become an urgent problem. Relevant research shows that when more than 21 people are waiting in front of a single subway door, the station will take current-limiting measures and at the same time increase the number of trains. At present, the method of manually counting the number of people waiting in front of the


subway is difficult to apply, while pixel-level instance segmentation technology can effectively solve this problem: images are taken by cameras in the subway station, the instance segmentation algorithm segments the targets in each image to count them, the counting results are sent directly to the workstation, and the staff dispatch vehicles in time and take relevant measures. On this basis, this paper studies a smart counting method: by improving the Mask R-CNN algorithm, accurate real-time counting of the queue in front of the subway is realized. Section 2 of the paper introduces related research results, Sect. 3 details the instance segmentation algorithm used, Sect. 4 analyzes the experimental results, and Sect. 5 summarizes the full text.

2 Related Work

Instance segmentation combines target detection and semantic segmentation. In 2014, the work of Hariharan et al. [1] laid the foundation for instance segmentation: MCG (Multiscale Combinatorial Grouping) generated candidate regions of the image, AlexNet extracted features, and the extracted features trained an SVM for classification. In 2015, Hariharan et al. improved [1] with [2], introducing the concept of the hypercolumn in the classifier and combining the low-level and high-level features extracted by the CNN (Convolutional Neural Network) to obtain more accurate details. CFM (Convolutional Feature Masking) [3] replaced the rectangular frame and generated masks for irregular regions. In 2016, FAIR developed a new set of techniques for detecting and segmenting individual targets in an image: [4] generates the initial masks of the targets, [5] optimizes these masks, and [6] classifies them. In [7], target bounding-box regression, target pixel-level segmentation, and target classification are combined into a cascade structure, and the low-level convolution features are shared. Position-sensitive score maps were used in [8], inside/outside score maps for pixels inside or outside the target instance were added, and context information was introduced. In 2017, He et al. proposed Mask R-CNN [10], which uses FPN (Feature Pyramid Network) for feature fusion and hierarchical detection and adds a mask branch for semantic segmentation to [9]; combined with the classification and regression branches, this achieves target instance segmentation. In 2018, Liu et al. proposed [21] on the basis of [10], making fuller use of feature fusion; detection accuracy improved but real-time performance suffered. In 2019, a single-stage detection method [20] was proposed that predicts the mask directly in a dense sliding window; its detection accuracy is slightly lower than, but very close to, [10], paving the way for the study of single-stage instance segmentation algorithms.

Deep learning is the intelligent learning method closest to the human brain; its study is not only of great academic significance but also highly practical, supporting image recognition, target segmentation, and natural language understanding. In 2012, the team led by Andrew Ng and Jeff Dean began to build the Google Brain project, making breakthroughs in speech recognition and image recognition. In the same year, Microsoft Chief Research Officer Rick Rashid


demonstrated an automatic simultaneous interpretation system at the 21st Century Computing Conference, with deep learning as its key technology. In 2015, Baidu launched the blind-assistant "DuLight", relying on deep learning to help blind users perceive the real world. In 2017, Google's AlphaGo, based on a deep learning model, defeated the world's top player Ke Jie, finishing the game nearly one hour and twenty minutes faster than its opponent. This endless stream of deep learning products shows that intelligent products are an inevitable trend in the development of many industries.

3 Research on Pixel-Level Instance Segmentation Algorithm

This paper adopts the Mask R-CNN algorithm based on the TensorFlow framework. The algorithm uses ResNet-101 as the feature extraction network and FPN for feature fusion and hierarchical detection. An RPN (Region Proposal Network) then selects RoIs (Regions of Interest), and RoIAlign maps each RoI to a fixed size. Finally, classification, regression, and mask prediction are performed on each region of interest, so that each instance is predicted and the target instances are segmented. The overall block diagram of the algorithm is shown in Fig. 1.

Fig. 1. Overall block diagram of the algorithm (Image → ResNet-101 → FPN → RPN → RoIAlign → classification / regression / mask → loss)

3.1 Algorithm Application

In the application scenario, pictures are collected to produce the data set. After collecting the data, different types of images are selected and annotated, generating a .json file for each. After labeling is completed, data enhancement operations such as image zooming and sharpness adjustment are performed. Taking image zooming as an example: in the first step, a ratio is selected to scale the image, the scaled image is renamed to prevent a collision with the original image name, and the image is saved in the specified folder. In the second step, the content of the json file is converted into dictionary format and the x and y coordinates in it are scaled; since the scaled coordinates may have decimals, they are rounded, the zoomed x and y coordinates are re-saved at the specified position in the dictionary, and finally the new json file is generated and packaged. In the third step, the information of the new picture is read to obtain the picture name and size, the json generated in the second step is read in dictionary format, and the values of the dictionary are modified, including filename and size,


where the dictionary key consists of "filename + size"; after the modification is completed, the json file is re-saved, completing the data enhancement operation (a code sketch of this procedure is given below). For the algorithm implementation in the application scenario, the model was trained with the produced data set, parameters were continuously adjusted according to the test results, and the algorithm was improved to ensure its generalization ability and accuracy. Finally, the trained model is embedded into the workstation, and the application device transmits the captured images to the workstation, which segments the target instances and counts the targets. The algorithm application process is shown in Fig. 2.
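The following is a minimal sketch of the zoom augmentation described above. The annotation layout (a dict with lists of x and y point coordinates under a "regions" key, plus filename and size fields) is an assumption made for illustration; the actual field names in the produced data set may differ.

```python
import json
from PIL import Image

def zoom_sample(img_path, ann_path, out_img, out_ann, ratio=0.8):
    # Step 1: scale the image and save it under a new name.
    img = Image.open(img_path)
    w, h = img.size
    img.resize((int(w * ratio), int(h * ratio))).save(out_img)

    # Step 2: scale the labeled x/y coordinates and round the decimals.
    with open(ann_path) as f:
        ann = json.load(f)
    for region in ann.get("regions", []):   # assumed structure
        region["x"] = [round(x * ratio) for x in region["x"]]
        region["y"] = [round(y * ratio) for y in region["y"]]

    # Step 3: update filename and size, then re-save the new json file.
    ann["filename"] = out_img
    ann["size"] = int(w * ratio) * int(h * ratio)
    with open(out_ann, "w") as f:
        json.dump(ann, f)
```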

Fig. 2. Algorithm application process (collect images and produce the data set → model training → algorithm optimization → algorithm embedding and practical application)

3.2 Improved FPN Process

FPN is a top-down process that propagates high-level features downward; it enhances only the high-level semantic information and does not strengthen the low-level information. The low-level information of a neural network is mostly edge shape, and these features are important for pixel-level instance segmentation. To address this, a bottom-up enhancement path is added after the original FPN to transfer the low-level positioning features, improving the overall FPN architecture and thereby the accuracy of bounding-box positioning. The enhanced FPN is shown

Fig. 3. Path-enhanced FPN structure

Fig. 4. Feature map operation process

Fig. 5. Fast path enhanced FPN structure

Fig. 6. Fast path feature layer operation process

in Fig. 3; the enhanced path is inside the dashed box. The feature maps N2 and P2 have the same size; a 3 × 3 convolution with stride 2 is applied to Ni to obtain a feature map of the same size as Pi+1, which is added to Pi+1 after a 1 × 1 convolution, yielding Ni+1. The operation process is shown in Fig. 4, and a code sketch of one step is given below. After adding the enhanced FPN path, the accuracy of bounding-box positioning improves, but targets with indistinct edges are easily missed. To address this, we shorten the enhancement path of the feature pyramid network and directly fuse the low-level edge location features with the high-level features, improving the ability to recognize edge contours. Specifically, fast paths from the N2 layer to the N4 layer and from the N3 layer to the N5 layer are designed; the fast enhancement path of the FPN is shown in Fig. 5, with the path parameters set according to the output sizes of the different network layers: the N2 layer is connected to the N4 layer through a convolutional layer with a 5 × 5 kernel and stride 4, and the N3 layer is connected to the N5 layer through the same kind of convolutional layer. The operation process is shown in Fig. 6. Shortening the transmission path between feature layers makes full use of the low-level features and enhances the model's ability to recognize target edges.
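As a hedged sketch of one step of the bottom-up path described above (not the authors' exact code), the Ni → Ni+1 operation can be written in Keras; the filter count of 256 is the usual FPN width and is an assumption here.

```python
import tensorflow as tf

def enhance_step(n_i, p_next, filters=256):
    # Downsample Ni with a 3x3 stride-2 convolution to the size of Pi+1.
    down = tf.keras.layers.Conv2D(filters, 3, strides=2, padding="same")(n_i)
    # Pass Pi+1 through a 1x1 convolution, then add the two to get Ni+1.
    lateral = tf.keras.layers.Conv2D(filters, 1, padding="same")(p_next)
    return tf.keras.layers.Add()([down, lateral])
```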

3.3 Improved NMS Process

The traditional NMS process is simple and crude: when the IoU between a prediction box and the highest-scoring bounding box exceeds the set threshold, the prediction box is removed outright. When two targets occlude each other, the IoU between their bounding boxes is large, so the NMS process suppresses the bounding box of the adjacent target and causes missed detections. To address this, the NMS process is changed to a soft NMS process: when the IoU between a prediction box and the highest-scoring bounding box exceeds the threshold, the prediction box is not removed; instead, its score is modified. To minimize the number of redundant bounding boxes, the principle is that the larger the IoU, the lower the new score. The new score of a prediction box is set to its original score multiplied by an exponential decay, as in Eq. (1), where $s_i$ is the original score, $B$ is the highest-scoring bounding box, and $b_i$ is the prediction box currently being processed:

$$s_i' = s_i \cdot e^{-\frac{iou(B,\,b_i)}{a}} \qquad (1)$$
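A minimal sketch of this rescoring is given below, assuming an iou() helper that returns the IoU of two boxes; it follows Eq. (1) rather than any particular library implementation.

```python
import numpy as np

def soft_nms_scores(boxes, scores, iou, a=0.5):
    scores = np.asarray(scores, dtype=float)
    best = int(np.argmax(scores))          # B: the highest-scoring box
    for i in range(len(boxes)):
        if i != best:
            # Larger IoU with B -> stronger decay -> lower new score.
            scores[i] *= np.exp(-iou(boxes[best], boxes[i]) / a)
    return scores
```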

4 Experimental Results

At present, crowd counting usually adopts target detection algorithms. An instance segmentation algorithm adds a mask prediction branch on top of target detection, which segments target edges accurately, gives more precise detection results, and better handles occlusion between targets. The detection results of common target detection algorithms and of the instance segmentation algorithm on the MS COCO data set are shown in Table 1. The accuracy of the instance segmentation algorithm is about 15% higher than that of the target detection algorithms, so this paper chooses to study the instance segmentation algorithm.

Table 1. Comparison of detection results (%) between target detection algorithms and the instance segmentation algorithm on the MS COCO dataset

Method        mAP50  mAP[0.5, 0.95]
Faster R-CNN  42.7   21.9
SSD           46.5   26.8
Mask R-CNN    60     37.1

This article uses the data set produced from the application scene, comprising 800 training images, 80 validation images, and 120 test images. The targets in the pictures occlude each other to different degrees and the IoU between target bounding boxes is large, so the NMS threshold in the RPN was tested at 0.6, 0.7, and 0.8. The training and test results of the model under the different thresholds, obtained with the original Mask R-CNN, are shown in Table 2. Considering both training and test results, 0.8 is selected as RPN_NMS_THRESHOLD.

Table 2. Training and test results under different RPN_NMS_THRESHOLD

THRESHOLD  Loss    Val loss  AP50
0.6        0.1328  0.1807    0.9264
0.7        0.1315  0.1627    0.9326
0.8        0.1306  0.1581    0.9322


For the feature extraction network, ResNet-101, ResNet-152, ResNeXt-101 and the proposed path-enhanced ResNet-152 network were trained and tested. The training results of each network are shown in Table 3. Based on the training and test results, the path-enhanced ResNet-152 network was selected as the feature extraction network.

Table 3. Training results of the feature extraction networks

Network     Loss    Val loss
ResNet101   0.1252  0.1294
ResNet152   0.1125  0.1168
ResNeXt101  0.1265  0.1276
Proposed    0.0851  0.0825

After the FPN network was modified, the path-enhanced FPN was trained in combination with the ResNet-152 network, and the training-set and validation-set losses dropped below 0.1. The loss curve during training is shown in Fig. 7, and the edge detection accuracy of the trained model improved. After switching to the fast path-enhanced FPN, the trained model detected edge-blurred targets significantly better. The detection results of FPN, path-enhanced FPN, and fast path-enhanced FPN are shown in Table 4.

Fig. 7. Loss value curve

Table 4. Comparison of APs in FPN

Model               Backbone network    AP
Initial model       FPN                 0.67
Improved model      Improved-FPN        0.79
Optimization model  Fast-improved-FPN   0.82


After the NMS process was optimized, the recognition accuracy of the trained model for mutually occluding targets (those with large bounding-box IoU) improved. The comparison of the detection results is shown in Fig. 8. The proportion of accurately counted (recall = 100%) pictures increased from 93% to 96%.

Fig. 8. Comparison of test results

After the network structure optimization and parameter adjustment of the algorithm, the proportion of accurately counted test-set images (recall = 100%) increased from the initial 52.7% to 96.3%; that is, more than 96% of images are counted accurately, and the counting task can be completed (Table 5).

Table 5. Accurate counting picture comparison

Model           Backbone network             Accurately identified image ratio (%)
Initial model   ResNet101-FPN                52.7
Improved model  ResNet152-Fast-improved-FPN  95.3

5 Conclusion

In this paper, we studied the deep-learning-based pixel-level instance segmentation algorithm Mask R-CNN. In the application scenario, the queuing crowd in front of the subway occludes itself, and since the crowd is dynamic the targets in the pictures show edge blurring, so we improved the FPN and NMS processes of Mask R-CNN. The experimental results show that the improved Mask R-CNN algorithm raises the counting accuracy of queues in front of the subway to 96%, which meets the requirements of practical applications.


However, the current algorithm still has certain deficiencies: it mainly counts the targets in single-frame pictures. The shooting time of the queued passengers in front of the subway is not fixed, and the number of people differs greatly between the moment a train leaves and the moment one arrives, which causes statistical errors to a certain extent. In future research, real-time monitoring of the queue length in front of the subway will be gradually realized.

References

1. Hariharan, B., Arbeláez, P., Girshick, R., et al.: Simultaneous detection and segmentation (2014)
2. Hariharan, B., Arbeláez, P., Girshick, R., et al.: Hypercolumns for object segmentation and fine-grained localization. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2015)
3. Dai, J., He, K., Sun, J.: Convolutional feature masking for joint object and stuff segmentation (2014)
4. Pinheiro, P.O., Collobert, R., Dollár, P.: Learning to segment object candidates (2015)
5. Pinheiro, P.O., Lin, T.Y., Collobert, R., et al.: Learning to refine object segments (2016)
6. Zagoruyko, S., Lerer, A., Lin, T.Y., et al.: A MultiPath network for object detection (2016)
7. Dai, J., He, K., Sun, J.: Instance-aware semantic segmentation via multi-task network cascades (2015)
8. Li, Y., Qi, H., Dai, J., et al.: Fully convolutional instance-aware semantic segmentation (2016)
9. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2015)
10. He, K., Gkioxari, G., Dollár, P., et al.: Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. (2017)
11. Gidaris, S., Komodakis, N.: Object detection via a multi-region and semantic segmentation-aware CNN model. In: IEEE International Conference on Computer Vision, pp. 1134–1142. IEEE (2016)
12. Montserrat, D.M., Lin, Q., Allebach, J., et al.: Training object detection and recognition CNN models using data augmentation. Electron. Imaging 2017(10), 27–36 (2017)
13. Bodla, N., Singh, B., Chellappa, R., et al.: Soft-NMS: improving object detection with one line of code (2017)
14. He, Y., Zhu, C., Wang, J., et al.: Bounding box regression with uncertainty for accurate object detection (2018)
15. Singh, B., Davis, L.S.: An analysis of scale invariance in object detection: SNIP (2017)
16. Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection (2017)
17. Cholakkal, H., Sun, G., Khan, F.S., et al.: Object counting and instance segmentation with image-level supervision (2019)
18. Han, C., Murao, K., Satoh, S., et al.: Learning more with less: GAN-based medical image augmentation (2019)
19. Perez, L., Wang, J.: The effectiveness of data augmentation in image classification using deep learning (2017)
20. Chen, X., Girshick, R., He, K., et al.: TensorMask: a foundation for dense object segmentation (2019)
21. Liu, S., Qi, L., Qin, H., et al.: Path aggregation network for instance segmentation (2018)

Image and Video Processing

Implementation of Android Audio Equalizer Based on FFmpeg

Weida Zhuang¹, Xingli He¹,², and Jinyang Lin¹(&)

¹ School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected], [email protected], [email protected]
² Research Center for Microelectronics Technology, Fujian University of Technology, Fuzhou 350118, Fujian, China

Abstract. With the continuous development of multimedia technology, music has become an indispensable part of people's daily life, so the audio equalizer used to improve the presentation of music is also extremely important. This paper first introduces the development and use of the excellent FFmpeg audio and video framework, analyzes in depth the high-order audio parametric multiband equalizer algorithm, and examines the feasibility of using it on the Android platform. The FFmpeg high-order audio parametric multiband equalizer algorithm is then implemented on the Android platform for the corresponding sound processing. The experimental results show that the Android system can process the input audio files according to the preset gains.

Keywords: Multimedia technology · FFmpeg · JNI development · Android platform · Audio processing

1 Introduction

Nowadays, the Android system is relatively mature and holds a high market share. Correspondingly, there are many music player applications with equalizer adjustment for the Android platform, such as QQ Music. In the published literature, however, work on Android audio equalizers is relatively rare: most audio-related research focuses on audio or video codecs, and audio equalizers are lacking [1, 2]. Research on audio equalizers mostly studies individual algorithms [3, 4] or combines them with hardware development [5–7]; it is rarely combined with an operating system. This paper develops an audio equalizer for the currently popular Android system. The equalizer algorithm adopts the excellent open-source audio and video framework FFmpeg, which makes development easier. A review of the literature shows that research on the FFmpeg framework mostly concerns video functions and codec use [8, 9], with little work on FFmpeg audio development. As the published literature shows, the FFmpeg audio and video framework is extremely powerful: it supports multiple audio and video formats, can


filter and process audio and video, perform audio resampling, and so on. Because application development with the FFmpeg framework is feasible on the Android platform, this article uses the FFmpeg audio and video framework to implement the audio equalizer on Android.

2 FFmpeg Development and Use

2.1 FFmpeg Usage

There are two main ways to use FFmpeg: calling the API interfaces reserved by FFmpeg, or implementing functions through the command line. The former requires calling FFmpeg's internal functions and thus demands a certain understanding of them; implementing one feature often involves many functions, so the developer must master all of them, which consumes a lot of development time, and because many internal FFmpeg functions are written from scratch, the amount of code is large and understanding it takes considerable effort. The latter implements functions through the command line; compared with using the reserved API interfaces, development is relatively simple: one only needs to know the command-line format of the desired function. Compared with the complicated internal functions, this method greatly helps the user and shortens the development cycle. The FFmpeg command line uses the syntax:

ffmpeg [[options] [-i input_file]] ... [[options] output_file] ...

2.2 The Overall Design

First, the FFmpeg source code is compiled; the corresponding dynamic library files are obtained after compilation. Before compiling, the relevant function modules must be enabled; otherwise they are skipped during compilation, which affects later use. The FFmpeg high-order audio parametric multiband equalizer algorithm used in this paper requires FFmpeg's avfilter option to be enabled. Second, JNI development is performed: the JAVA class with the required native methods is written and its header file generated, then the C/C++ files implementing the methods are written and compiled into a new dynamic library containing the native methods. Finally, the new dynamic library is loaded in the project, after which the corresponding functions can be used. This article uses the command-line approach, so the main program's job is to pass the FFmpeg command to the main function inside FFmpeg for execution; once the command is passed and executed successfully, the corresponding function is realized.


3 Analysis of the FFmpeg High-Order Audio Parametric Multiband Equalizer Algorithm

3.1 Operational Process Analysis

Analysis of the source code shows that the FFmpeg high-order audio parametric multiband equalizer algorithm contains three types of filters. When audio data arrives, the system first matches the required filter type; it then calculates the bandwidth gain GB corresponding to the peak gain G set by the user (the compute_bw_gain_db part). Next it derives the filter design parameters from the transfer function model of the chosen filter type (the bp_filter part), from which the coefficients a₀, …, a₄ and b₀, …, b₄ of the digital low-pass prototype filter are obtained, and in turn the coefficients a₀, …, a₄ and b₀, …, b₄ of the band-pass equalizer's second-order and fourth-order sub-blocks (the fo_section part). Finally, the input audio data and the computed filter coefficients are run through the difference equation of the system to obtain the new output (the section_process part), making the user's gain setting take effect. Figure 1 shows the main part of the FFmpeg operation process, and a code sketch of the section_process stage follows Fig. 1.

Fig. 1. FFmpeg main part of the running process (equalizer → filter type {butterworth, chebyshevI, chebyshevII} → compute_bw_gain_db → bp_filter → fo_section → section_process)
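As a hedged illustration of the section_process stage, a second-order sub-block applies the standard difference equation once its coefficients are known. This is a generic biquad sketch, not FFmpeg's internal implementation.

```python
def biquad(x, b0, b1, b2, a1, a2):
    # y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn      # shift the input history
        y2, y1 = y1, yn      # shift the output history
        y.append(yn)
    return y
```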

3.2 Algorithm Simulation Analysis

After obtaining the source code of the algorithm, it must be simulated to verify its correctness. The simulation follows the operation flow of the FFmpeg high-order audio parametric multiband equalizer algorithm. First, the input is set to a sine wave signal, with 32 Hz, 64 Hz, 125 Hz, 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz, 8000 Hz, and 16000 Hz as the center frequencies of the equalizer. Since this paper aims to combine the algorithm with the Android system, the frequency response curves of the three filter types were first calculated against the characteristics of the Android audio system. It was found that the


combination of the Butterworth type with the Android system works best, so this article uses only the Butterworth type. After the type is selected, the bandwidth gain is calculated from the input signal, the design parameters are computed, and the resulting filter coefficients are substituted into the difference equation to obtain the input and output waveforms. Comparison shows that the equalizer function is realized. Due to limited space, only one frequency band's simulation result is shown, in Fig. 2.

Fig. 2. 500 Hz input signal processing result

4 Development Process and Results Analysis

4.1 FFmpeg Compilation

The FFmpeg source code in this article is compiled under Ubuntu 14.04. After the source code is downloaded, it is decompressed with tar -zxvf filename. The dynamic library format loaded by the Android system is filename.so, but without modifying the configuration FFmpeg produces libraries named filename.so.xx.xx.x, with the version number after .so, which the Android system cannot load; the FFmpeg configuration file therefore needs to be modified to suit Android [10]. Since the result will ultimately run on the Android platform, the NDK tool must be configured after the configuration file is modified. Finally, a compile script can be written that sets the environment variables and turns the relevant FFmpeg functions on or off. The high-order audio parametric multiband equalizer algorithm used here mainly needs the libavfilter module, so that module must be enabled at compile time with --enable-filter. After a successful compilation there is an android folder under the FFmpeg source path, containing the generated dynamic library files and the corresponding header files.

4.2 Android FFmpeg Command Line Use

Since the Android platform cannot use FFmpeg commands directly, the entry function of FFmpeg command execution must be exposed to the Android application layer through JNI. The main steps for implementing the FFmpeg command line with JNI development are as follows [11] (a desktop-side sketch of the command in step 8 follows the list):

1. Write a JAVA class that contains the native methods.
2. Generate the header file of the class: in the Android Studio terminal, switch to the java path of the project and execute javah -classpath . com.jni.filename.
3. Write the corresponding native implementation, which mainly converts the FFmpeg command passed from JAVA into the argc and argv parameters required by FFmpeg's internal main function.
4. Include the eight files ffmpeg.h, ffmpeg.c, ffmpeg_opt.c, ffmpeg_filter.c, cmdutils.c, cmdutils.h, cmdutils_common_opts.h, and config.h in the project. To prevent FFmpeg from destroying the process automatically after executing a command, the process handling in the source code must be modified.
5. Write the Android.mk and Application.mk files. Android.mk compiles all the project's code files, together with the dynamic libraries generated by the FFmpeg compilation, into a new dynamic library exposing the JNI methods for the Android application layer; Application.mk mainly specifies the NDK version, etc.
6. Execute the ndk-build command to compile the project. After a successful compilation, the new dynamic library is obtained and can be called from the Android application layer.
7. Copy the new dynamic library into the Android audio equalizer project path and load it in the project with System.loadLibrary("filename"), where filename is the library file name with the "lib" prefix removed, such as avfilter-6. After loading, a dynamic library path must be added to the android node in build.gradle.
8. Use the FFmpeg command line in the project. The command-line format used in this article is ffmpeg -i input -af anequalizer=channel f=freq w=width g=gain output, where input is the input file, channel is the selected channel, freq is the center frequency, width is the bandwidth, gain is the gain, and output is the output file. Filling the corresponding command into the project completes the use.
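For orientation only, the same command string that the app passes to FFmpeg's main() through JNI can be tried on a desktop ffmpeg binary. The channel, frequency, width, and gain values below are illustrative assumptions, and the filter string follows the format quoted in step 8.

```python
import subprocess

def equalize(inp, out, channel=0, freq=500, width=100, gain=5):
    # Build the anequalizer filter string from the step-8 format.
    af = f"anequalizer=c{channel} f={freq} w={width} g={gain}"
    subprocess.run(["ffmpeg", "-i", inp, "-af", af, out], check=True)

equalize("input.mp3", "output.mp3", channel=0, freq=500, width=100, gain=5)
```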

4.3 Analysis of Results

Following the implementation method above, the development was carried out under Android Studio, and a simple application interface was set up as shown in Fig. 3. In use, one first taps to get the music list and selects a piece of music, then selects the desired processing channel (the left or right channel separately, or the stereo pair simultaneously), and sets the gain of each band. Finally, tapping Generate Music produces the processed output file at the specified path and plays it. Loading the input and output files into the Adobe Audition software and comparing the frequency analysis curves before and after processing shows that the system indeed executed the input command.


Figure 3 shows the comparison of the left- and right-channel frequency analysis curves before and after processing. To verify both positive and negative gains, the left channel was set to −15 dB gain and the right channel to +15 dB gain. As the figure shows, both gains are applied correctly. It is therefore feasible to apply the audio equalizer algorithm of the FFmpeg audio and video framework to equalizer development on the Android platform.

Fig. 3. Application main interface and the frequency analysis curve

5 Conclusion

This paper analyzes in depth the high-order audio parametric multiband equalizer algorithm of the FFmpeg audio and video framework and its development and usage. Combining the characteristics of the Android platform, JNI development is carried out from the JAVA program, realizing the use of the FFmpeg command line on Android and successfully applying it to a concrete audio equalizer development example. The experimental results show that FFmpeg commands run normally on the Android platform. For FFmpeg use on Android, the command-line approach greatly reduces the difficulty and cycle of audio equalizer development and has definite application value.

Acknowledgement. In this paper, the research was supported by Fujian Provincial Natural Science Foundation Projects (2017J05097).


References

1. Hao, J.: Research on equalizer algorithm based on Android platform. Dalian University of Technology, Dalian (2011)
2. Feng, Q., Yang, F., Yang, J.: Design and implementation of sound system based on Android platform. Netw. New Media Technol. 5(4), 16–22 (2016)
3. Rämö, J., Välimäki, V.: Optimizing a high-order graphic equalizer for audio processing. IEEE Signal Process. Lett. 21(3), 301–305 (2014)
4. Wu, L.: Research and implementation of audio equalizer algorithm. Xi'an University of Electronic Science and Technology, Xi'an (2010)
5. Jiang, J.: Audio processing with channel filtering using DSP techniques. In: 2018 IEEE 8th Annual Computing and Communication Workshop and Conference, pp. 545–550 (2018)
6. Chen, S.: Design of digital sound system based on DSP. Xiangtan University, Hunan (2016)
7. Zhu, Z.: Research and design of integrated sound processor based on DSP. Xi'an Engineering University, Xi'an (2012)
8. Zhong, X., Luo, Z.: Design of video bitrate analyzer based on Swift. In: EITCE 2018, pp. 1–4 (2018)
9. Zeng, H., Shi, L., Zhang, Z.: Research and implementation of video codec based on FFmpeg. In: 2016 International Conference on Network and Information Systems for Computers, vol. 54, pp. 184–188 (2016)
10. Liu, H., Lu, C.: Research on FFmpeg open source project ported to Android. Value Eng. 4, 166–169 (2016)
11. CSDN: Android integration FFmpeg (Two): Call FFmpeg as a command [EB/OL]. https://blog.csdn.net/yhaolpz/article/details/77146156

Can Deep Neural Networks Learn Broad Semantic Concepts of Images?

Longzheng Cai¹(&), Shuyun Lim², Xuan Wang¹, and Longmei Tang¹

¹ College of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, China
[email protected]
² Faculty of Business and Technology, Unitar International University, 47301 Petaling Jaya, Selangor, Malaysia

Abstract. Much research uses DNNs to learn high-level semantic concepts of images, such as categories, from low-level visual properties. Images carry more semantic concepts than categories, such as whether two images complement each other, serve the same purpose, or occur in the same place or situation. In this work, we conduct an experimental study of whether DNNs can learn these broad semantic concepts of images, performing experiments on the POPORO image dataset. Our results show that, overall, DNNs have limited capability to learn the above broad semantic concepts from image visual features. Among the DNN models we tested, the Inception models and their variants learn broad semantic concepts of images better than the VGG, ResNet, and DenseNet models. We believe one of the main reasons for the modest performance in our experiments is that the POPORO dataset used in this work is too small for DNN models. Big image datasets with rich and broad semantic labels and measures are the key to successful research in this area.

Keywords: DNN · Image matching · Image similarity · Image visual property · Image semantic property

1 Introduction

Images have low-level visual features and high-level semantic properties. Image appearance and geometric information contains the low-level properties, mainly shape, color, texture, pattern, position, and dimension; these are also called image physical properties. Image semantics, on the other hand, belongs to the area of image understanding and relates to humans' high-level cognition of images. Image semantic relatedness is the core content of image understanding: it describes the meaning association between images, such as whether the objects in two images complement each other, serve the same purpose, occur in the same place or situation, or belong to the same basic category [1].

There are successful applications of Deep Neural Networks (DNNs) in image classification, search, recommendation, annotation, segmentation, object detection, and many other areas related to computer vision. Apart from image visual information,


the category and hierarchical relationships among images are the main semantic information used in these applications. A natural question is whether DNNs can learn broad image semantic concepts. In this work, we conduct an experimental study of this question. Our results show that, overall, DNNs have limited capability to learn broad image semantic concepts from image visual features alone. Among the DNN models we tested, the Inception models and their variants learn broad image semantic concepts better than the VGG, ResNet, and DenseNet models. We believe one of the main reasons for the modest performance in our experiments is that the POPORO dataset [1] used in this work is too small for DNN models. The rest of this paper is organized as follows: the POPORO image dataset [1] is introduced in Sect. 2, Sect. 3 describes the experiment setup and result analysis, related work is given in Sect. 4, and the paper is concluded in Sect. 5.

2 POPORO Image Dataset and Its Semantic Relatedness Rating

The POPORO image dataset was originally built for cognitive psychology research. It contains 1,200 images, all "resized to 400 × 400 pixels and each image contains a single object on a white background" [1]. The 1,200 images are partitioned into 400 triplets; each triplet contains an object image, an image semantically related to the object image, and an image semantically unrelated to it. Samples of triplets and their semantic relatedness ratings are shown in Table 1.

Table 1. Sample image triplets from the POPORO dataset. Each triplet contains an object image, a semantically related image, and a semantically unrelated image; the participants' semantic relatedness ratings for the object-related and object-unrelated pairs are shown (the images themselves are omitted here).

Triplet  Object and Related  Object and Unrelated
1        3.59                1.25
2        3.25                1.05
3        3.23                2.15
4        3.43                1.1


3 Experiments and Result Analysis

In this research, the right match ratio is used to evaluate how well a model learns image semantic relatedness. A right match is defined as follows: for the two pairs of images in a triplet of the POPORO dataset, if the similarity of the match pair is greater than or equal to that of the un-match pair, it is a right match; otherwise, it is a wrong match. Similarly, when a distance measure is used, if the distance of the match pair is less than or equal to that of the un-match pair, it is a right match; otherwise, it is a wrong match (a small code sketch of this measure follows). We conducted three types of experiments. In the first, we assess whether the structural similarity index (SSIM) [2] and mean squared error (MSE) [3] measures calculated from raw images are in accordance with image semantic relatedness. Similar to [4], the second experiment uses ImageNet pre-trained DNN models [5] to extract features from images; the distances between the two pairs of images in each POPORO triplet are then calculated from these features and used to evaluate how well ImageNet pre-trained models capture broad image semantic relatedness. Building on the second experiment, the third uses the features extracted by ImageNet pre-trained models as data and the semantic ratings made by the POPORO participants as targets to train DNN regression models, which are then used to predict the semantic relatedness of image pairs and to evaluate whether such regression models can learn image semantic relatedness.
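The right match ratio can be computed as below. This is a small illustrative helper written for this paper's definition, not code from the original study.

```python
def right_match_ratio(match_vals, unmatch_vals, higher_is_similar=True):
    # For similarities (e.g. SSIM) a right match is match >= unmatch;
    # for distances (e.g. MSE, feature distances) it is match <= unmatch.
    right = 0
    for m, u in zip(match_vals, unmatch_vals):
        if (m >= u) if higher_is_similar else (m <= u):
            right += 1
    return right / len(match_vals)
```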

3.1 Are Similarity Measures of Images in Accordance to Semantic Relatedness?

Our first experiment verifies whether some image similarity metrics can be used as measures of image semantic relatedness, in other words, whether these metrics are in accordance with the human-evaluated ratings of image semantic relatedness. If this were true, these metrics could represent image semantic relatedness. The two metrics we use are SSIM and MSE. The SSIM index is a method for measuring the similarity between two images; "it is a full reference metric; in other words, it measures image quality based on an initial uncompressed or distortion-free image as reference" [2]. "MSE represents the average of the squares of the errors between two images; the error is the amount by which the values of one image differ from another image" [3]. The experiment process is shown in Fig. 1. For the two pairs of images in each triplet, we calculate the SSIM index of the object and related images, denoted SSIM(O, R), and the SSIM index of the object and unrelated images, denoted SSIM(O, U), and compare the two. If SSIM(O, R) is greater than or equal to SSIM(O, U), we call it a right match, since this matching is in line with the participants' semantic relatedness ratings; otherwise, it is a wrong match. Similarly, when MSE is used, we calculate MSE(O, R) and MSE(O, U) for each triplet; if MSE(O, R) is less than or equal to MSE(O, U), it is a right match; otherwise, it is a wrong match. We calculate the right match ratio over all 400 image triplets, and the result is shown in Table 2 (a code sketch for one triplet follows Table 2).


Fig. 1. The SSIM indices of the two image pairs in each triplet are used to calculate the right match ratio, and this ratio is used to assess whether the SSIM index is in accordance with image semantic relatedness. The same method is applied to MSE.

Table 2. Image match accuracy using SSIM index and MSE measures calculated from raw images

Metrics  Accuracy (%)
SSIM     51.75
MSE      47.5
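A minimal sketch of the per-triplet decision is given below, assuming the three images are RGB NumPy arrays of equal shape (scikit-image 0.19+ uses channel_axis; older versions use multichannel=True instead).

```python
import numpy as np
from skimage.metrics import structural_similarity

def triplet_right_match(obj, rel, unrel):
    # SSIM: higher means more similar, so right match when SSIM(O,R) >= SSIM(O,U).
    ssim_r = structural_similarity(obj, rel, channel_axis=-1)
    ssim_u = structural_similarity(obj, unrel, channel_axis=-1)
    # MSE: lower means more similar, so right match when MSE(O,R) <= MSE(O,U).
    mse_r = np.mean((obj.astype(float) - rel.astype(float)) ** 2)
    mse_u = np.mean((obj.astype(float) - unrel.astype(float)) ** 2)
    return ssim_r >= ssim_u, mse_r <= mse_u
```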

Table 2 shows that the matching accuracies based on the SSIM index and MSE are 51.75% and 47.5% respectively, both close to the 50% random-guessing accuracy of binary classification problems. From this experiment we can see that SSIM and MSE measures calculated from raw images cannot be used to represent image semantic relatedness.

3.2 Can DNN Learn Broad Image Semantic Relatedness?

POPORO is a very small dataset, with 400 triplets and 1,200 images in total, far too small to train and evaluate DNN models, so in this experiment ImageNet pre-trained DNN models [5] are used. "ImageNet is a project which aims to provide a large image database for research purposes. It contains more than 14 million images which belong to more than 20,000 classes" [6]. The models used in this paper are trained with a subset of ImageNet that has 1.2 million images belonging to 1,000 classes. Similar to [4], the experiment process is shown in Fig. 2. The three images in each triplet are fed to an ImageNet pre-trained DNN model, yielding three feature sets: the object image features fO, the related image features fR, and the unrelated image features fU. Each feature set is a vector of 1,000 values, the probabilities that the given image belongs to each of the 1,000 ImageNet categories, so the values are small and sum to 1. We then calculate the distance between fO and fR, denoted d(fO, fR), and between fO and fU, denoted d(fO, fU). If d(fO, fR) is less than or equal to d(fO, fU), we call it a right match, since it is in line with the semantic relatedness ratings evaluated


Fig. 2. Using ImageNet pre-trained DNN models to extract features from images, then distances of image pairs are computed and used to calculate the right match ratio of images.

by the POPORO dataset participants; otherwise, it is a wrong match. Multiple DNN models and distance metrics are used in this experiment, and the test results are shown in Table 3; a code sketch of the procedure is given after this paragraph. From Table 3 we can see that for the VGG models (VGG16 and VGG19), across all distance metrics the highest match accuracy is 55.75%, but for the majority of distance metrics the right match ratio is close to the 50% random-guessing accuracy, indicating that VGG models cannot learn broad image semantic relatedness. For ResNet50, DenseNet121, and DenseNet201, all accuracies are below 50% for every distance metric, meaning that for these networks the unrelated images are closer to the object images than the related ones, the opposite of the semantic relatedness ratings evaluated by the POPORO participants. The Inception models and their variants, including InceptionV3, Xception, and InceptionResNetV2, have match accuracy above 50% across all distance metrics; the highest accuracy is 65%, from InceptionResNetV2 with cosine distance, and the lowest is 51.5%, from Xception with canberra distance.
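The procedure can be sketched as follows for one of the pre-trained models. The resizing of inputs to the model's expected 299 × 299 size is assumed here and not described in the paper.

```python
import numpy as np
import tensorflow as tf
from scipy.spatial.distance import cosine

# Pre-trained classifier whose 1,000-way softmax output is the feature set.
model = tf.keras.applications.InceptionResNetV2(weights="imagenet")
prep = tf.keras.applications.inception_resnet_v2.preprocess_input

def features(img):
    # img: a 299x299x3 array (resized beforehand, an assumption).
    return model.predict(prep(img[np.newaxis].astype(float)))[0]

def is_right_match(obj, rel, unrel):
    f_o, f_r, f_u = features(obj), features(rel), features(unrel)
    # Right match when the related pair is no farther than the unrelated pair.
    return cosine(f_o, f_r) <= cosine(f_o, f_u)
```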

VGG16 VGG19 ResNet 50

DenseNet 121

DenseNet 201

Inception v3

Xception Inception ResNetV2

Cosine Euclidean Braycurtis Canberra Chebyshev

49.25 50.25 48 49.5 48.5

48.25 46.25 47.75 47.75 48.5

44.25 44 44 45.5 43.75

62.5 52 59 52.25 56

59.5 54.5 58 51.5 60.5

46.75 48.25 48.5 49.75 55.75

43.25 46.25 45.25 44 48.5

65 55.25 61.5 54.5 60.5

Table 4. Right matching accuracy of regression DNNs using features extracted by ImageNet pre-trained models. Feature extraction models

VGG16 VGG19 ResNet DenseNet 50 121

DenseNet 201

Inceptionv3 Xception Inception ResNetV2

Match accuracy (%)

54

57.5

55

55

56.25

58.75

61.25

60

Can Deep Neural Networks Learn Broad Semantic Concepts of Images?

233

We can conclude that in some extend, Inception networks are capable of learning broad image semantic relatedness. Consine and chebyshev distances are the best performers in all distance metrics. This experiment also shows that learning image broad semantic relatedness is much harder compared with only learning image categories since DNNs can easily achieve above 90% accuracy in a binary image classification task. 3.3

Learn Broad Image Semantic Relatedness with Regression DNN

Another experiment we have done is using the features extracted by the ImageNet pretrained models to train a regression DNN model, as shown in Fig. 3. First, the three images in each triplet are used as input of ImageNet pre-trained models, and the feature sets extracted by these models and their corresponding semantic relatedness ratings from participants, as samples shown in Table 1, are used to train and evaluate the regression DNN model.

Fig. 3. ImageNet pre-trained DNN models are used to extract features from images, then these features are used to train a Regress DNN model. The prediction of this model is used to calculate image right match ratio.

We partition the features extracted by ImageNet pre-trained models to two parts: 80% for training and cross-validation, and 20% for testing. In the training phase, grid search [7] is used to determine the structure and select hyperparameters of the regression DNN model. The resulted model, shown in Fig. 4, contains three hidden dense layers. Each layer has 64 neurons, and a L2 weight decay coefficient value 1e-06 is used for regularization. The activation function of all three hidden layers is RELU. The model’s learning rate is 0.01, and RMSProp is the optimization algorithm. After the structure and hyperparameters of the regression model were determined, the 80% training data and their semantic relatedness ratings evaluated by POPORO dataset participants as targets are used to training the model. The trained model then is used to predict the semantic relatedness rating values for all test data. The image right match ratio is used to evaluate the model’s performance.


As shown in Fig. 3, for each triplet in the test images, if the predicted semantic relatedness rating of the object and related images, denoted as r(fO, fR), is greater than or equal to that of the object and unrelated images, denoted as r(fO, fU), it is a right match; otherwise, it is a wrong match. The experiment result is shown in Table 4.

Fig. 4. The regression DNN model has four dense layers. N stands for the number of neurons, LR is the learning rate, L2 is the weight decay coefficient, and RELU is the activation function.

We can see that the lowest and highest image match accuracies are 54% and 60%, respectively. Although the highest match accuracy of 60% is lower than the 65% in Table 3 of Sect. 3.2, with regression models the image match accuracies of all ImageNet pre-trained DNN models used for feature extraction are above 50%. This indicates that, compared with the distance measures used in Sect. 3.2, regression models have a better and more stable capability in learning broad image semantic relatedness.

4 Related Work

Learning image semantic concepts remains challenging because of the "large discrepancy of visual and semantic properties; pixel-level images usually lack of high-level semantic information" [8]. The hierarchical structures used to organize word datasets as well as image datasets contain semantic information and can be used in computer vision and other related applications. "WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations" [9].


Similar to WordNet, the ImageNet dataset is also organized in a hierarchical structure [10]. It "contains 12 top-level categories, e.g. animal, instrumentality. Every top category has subcategories. For example, category animal contains subcategories chordate, vertebrate, mammal, etc." [10]. This structure contains not only semantic categorical labels but also semantic relationships among images. Images in the same top category have higher semantic similarity (shorter distance) than images from different top categories. In the same way, images within the same subcategory have higher semantic similarity (shorter distance) than images from different subcategories. Images from social networks, like Flickr, Facebook, Instagram, etc., are not organized in as strict a hierarchical structure as ImageNet, but semantic information can be extracted from metadata associated with these shared images, "such as the groups to which each image belongs, the comment thread associated with the image, who uploaded it, their location, and their network of friends" [11]. There is also research in image semantic matching [12, 13] which exploits the accordance of image visual and semantic similarity.

5 Conclusion

We conduct experimental research to evaluate whether DNN models can learn broad semantic concepts of images, such as complementing each other, serving the same purpose, occurring in the same situation or place, or belonging to the same basic category [1]. We first assess whether SSIM index and MSE measures calculated from raw images can be used to represent the semantic relatedness of images; the experiment results show they cannot, since the image match accuracy based on these two metrics is close to the random guess accuracy of 50%. We also evaluate whether ImageNet pre-trained DNN models and distance measures can be used to learn image semantic relatedness. Experiments show that DNN models have limited capability in this area. Among the DNN models we tested, VGG and DenseNet cannot learn image relatedness, since their matching accuracy is close to the random guess accuracy of 50%. Inception networks and their variants can learn image relatedness to some extent, although their matching accuracy (65% at best) is much lower than the accuracy of binary image classification, which can easily reach 90%. When regression DNN models are used instead of distance metrics, features extracted by all ImageNet pre-trained models yield stable and better image semantic relatedness ratings; this indicates that regression DNN models can learn broad image semantic relatedness. We think the lack of a large training dataset is one of the main reasons the image matching accuracy based on broad semantic concepts is low in our work. Our next job is to augment the POPORO dataset; we predict this can improve performance to some extent. But the augmented dataset is still very small compared with the data requirements of deep learning models and the datasets available for image classification, object detection, etc. Large image datasets with broad semantic labels and ratings are of utmost importance for research in this area.


Compliance with Ethical Standards. This research does not involve human participants and/or animals.

References

1. Kovalenko, L.Y., Chaumon, M., Busch, N.A.: A pool of pairs of related objects (POPORO) for investigating visual semantic integration: behavioral and electrophysiological validation. Brain Topogr. 25(3), 272–284 (2012)
2. Wang, Z., Alan, C., Sheikh, H.R., Simoncelli, E.P.: The SSIM index for image quality assessment (2019). https://ece.uwaterloo.ca/~z70wang/research/ssim/. Accessed 25 July 2019
3. Peak signal-to-noise ratio as an image quality metric. http://www.ni.com/zh-cn/innovations/white-papers/11/peak-signal-to-noise-ratio-as-an-image-quality-metric.html. Accessed 25 July 2019
4. Lee, H.S., Jung, H., Agarwal, A.A., Kim, J.: Can deep neural networks match the related objects?: a survey on ImageNet-trained classification models (2017). https://arxiv.org/abs/1709.03806v1. Accessed 25 July 2019
5. Gupta, V.: Keras tutorial: using pre-trained Imagenet models (2019). https://www.learnopencv.com/keras-tutorial-using-pre-trained-imagenet-models/. Accessed 25 July 2019
6. Deng, J., Dong, W., Socher, R., Li, K.J., Li, K., Li, F.F.: ImageNet: a large-scale hierarchical image database. In: CVPR 2009, pp. 248–255 (2009)
7. Brownlee, J.: How to grid search hyperparameters for deep learning models in Python with Keras (2016). https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/. Accessed 25 July 2019
8. Deselaers, T., Ferrari, V.: Visual and semantic similarity in ImageNet. In: CVPR 2011, pp. 1777–1784 (2011)
9. Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
10. Deng, J., Berg, A.C., Li, F.F.: Hierarchical semantic indexing for large scale image retrieval. In: CVPR 2011 (2011)
11. McAuley, J., Leskovec, J.: Image labeling on a network: using social-network metadata for image classification. In: European Conference on Computer Vision, pp. 828–841 (2012)
12. Wang, Q., Zhou, X.W., Daniilidis, K.: Multi-image semantic matching by mining consistent features. In: CVPR 2017, pp. 685–694 (2017)
13. Huang, Y., Wu, Q., Song, C.F., Wang, L.: Learning semantic concepts and order for image and sentence matching. In: CVPR 2018, pp. 6163–6171 (2018)

A Fast Data Hiding Algorithm for H.264/AVC Video in Bitstream Domain

Kai An1, Zheming Lu1, Faxin Yu1,2, and Xuexue Luo2

1 Zhejiang University, No. 38 ZheDa Road, Hangzhou 310027, China
[email protected]
2 Hangzhou Kilbychain Technology, Hangzhou 310030, China

Abstract. There are many algorithms for data hiding in H.264/AVC video sequences, most of which focus on embedding in the decoded domain. However, these schemes always perform badly in embedding speed. Aiming at simplifying computational complexity and developing a fast algorithm, we propose in this paper a new data hiding scheme for H.264/AVC video with a high embedding speed. Instead of hiding data in the elements of the encoding process, we analyze the structure of the H.264/AVC video bitstream and look for appropriate bits to modify in order to minimize the change caused to the original stream. Our algorithm chooses the last byte of the less important NALUs (Network Abstraction Layer Units) to hide secret data. The experimental results show that the proposed algorithm achieves an impressive embedding speed while maintaining good visual quality and embedding capacity.

Keywords: Data hiding · H.264/AVC video · Bitstream domain

1 Introduction

Steganography is the technical art of embedding secret information into digital cover media; its primary target is to enhance imperceptibility and fidelity while achieving a large embedding capacity. Since the H.264/AVC video compression standard, described by Richardson [1], appeared in 2003, many steganographic algorithms have been explored and used with this widely adopted video standard. Most algorithms focus on hiding secret data in quantized discrete cosine transform (QDCT) coefficients and motion vectors (MVs) [2–8] and try to decrease or eliminate distortion drift while improving embedding capacity. Ma et al. [2] analyzed the relationship between the DCT coefficients and the distortion of the pixel values used in intra-frame prediction, and then proposed a new algorithm for data hiding in H.264/AVC based on several paired QDCT coefficients. Their scheme was the first QDCT-coefficient-based algorithm to handle the intra-frame distortion problem, which previous methods did not address. As an extension of Ma et al.'s work, Lin et al. [3] fully used the remaining luminance blocks to hide the secret data and improved the capacity, but the capacity is still limited to a low level. To achieve a high embedding capacity based on QDCT coefficients, Nguyen et al. [4] improved the algorithm by classifying all embeddable coefficient pairs in each luminance block into two different clusters, one group for embedding secret data and the other for avoiding distortion drift. Besides QDCT coefficients, motion vectors are


explored to hide secret bits. In 2006, Nguyen et al. [5] proposed a scheme to hide secret bits in motion vectors to limit error propagation. They chose the two least significant bits of the larger component of a marked motion vector to hide secret information, but there still exists distortion drift and unfriendly frame latency. In recently proposed algorithms, Niu et al. [6] efficiently use Histogram Shifting (HS) of motion vector values to hide data and manage to overcome distortion accumulation effects. Later, Li et al. [7] presented an algorithm based on two-dimensional histogram modification of MVs. They combine the vertical and horizontal components of an MV as embedding pairs and then classify their values into 17 non-intersecting sets. Besides, Li et al. also explore the ideal reference frame interval to obtain satisfying results both in decreasing drift and in enhancing embedding capacity. Hiding data in QDCT coefficients and MVs focuses on the encoding process of the H.264/AVC video standard; there are also algorithms based on the features of the undecoded H.264/AVC bitstream. In 2017, Liu et al. [8] analyzed the statistical characteristics of the bits for residual differences and motion vector differences, utilized their redundancy to hide data, and achieved a pleasant embedding speed. All the algorithms for data hiding in H.264/AVC video have different limitations in distortion drift, embedding capacity, and computational complexity. Focusing on simplifying computational complexity and developing a fast algorithm, we propose in this paper a new data hiding scheme for H.264/AVC video with a high embedding speed. Instead of hiding data in the elements of the encoding process, we analyze the structure of the H.264/AVC video bitstream and hide secret data directly in the last byte of less important NALUs (Network Abstraction Layer Units). The experimental results show that the proposed algorithm further improves the embedding speed while maintaining good visual quality and embedding capacity.

2 Proposed Scheme

In this section, all of the main processes in our fast data hiding scheme for H.264/AVC video are introduced in detail. Before the embedding phase, we analyze the structure and characteristics of the H.264/AVC bitstream and then introduce the target byte in the bitstream that we select to hide secret data. In the embedding phase, the original secret data is encrypted by a scrambling algorithm based on a chaotic sequence and then embedded by a specific rule that substitutes some bits in the target byte. The extracting phase includes detecting the target bytes in the embedded bitstream and extracting the secret data.

2.1 Embedding Position Selection

In our algorithm, secret data is hidden directly in the undecoded H.264/AVC bitstream instead of in elements of the coding process, which helps the algorithm achieve excellent efficiency with impressive simplicity. Before depicting how the secret data is hidden, the structure and characteristics of the H.264/AVC bitstream need to be analyzed, which inspired us to find an appropriate position to hide data. The binary stream in the H.264/AVC standard is structured and divided into packets called NALUs (Network Abstraction Layer Units). The first byte of a NALU is a header


that indicates the type of the current packet and some other information. The NALU type defines what data structure is carried by the current packet: it can be a slice, a parameter set, a filler, and so on. Figure 1 shows the NALU structure, which is composed of a one-byte header and a variable-length RBSP (Raw Byte Sequence Payload). The RBSP carries the SODB (String Of Data Bits) in a specified order, with a trail used for byte alignment. The trail begins with a "1" and ends with some "0"s, which mark the end of the SODB. Let's go deeper into the NALU header. This byte can be sequentially parsed into three parts: forbidden_zero_bit, nal_ref_idc, and nal_unit_type. The first bit, forbidden_zero_bit, is used to check whether any error occurred during transmission. The next two bits are nal_ref_idc, indicating whether this unit is a reference field, frame, or picture. If the current NALU is a non-reference unit, nal_ref_idc equals 0. For any non-zero value, the larger the value, the greater the importance of the NAL unit. The last 5 bits specify the type of the current NALU.

Fig. 1. The structure of H.264/AVC bitstream

Based on the analysis of the H.264/AVC bitstream structure and the meaning of some important bits, we can find that the last byte of a NALU whose nal_ref_idc bits in the header equal 00 or 01 is obviously of lower importance compared to other bytes. So that is the ideal byte our algorithm targets to hide secret data. It is necessary to explore which bit or bits in the target byte (the last byte of a NALU whose nal_ref_idc bits equal 00 or 01) should be selected to hide our secret bits, since this affects the quality of the video after embedding to different degrees. The reason we choose the last byte of a NALU is that it contains a tail for byte alignment, which means not all 8 bits in this byte are used for decoding the bitstream. So modifying this byte causes relatively less unpleasant change and effect. In this byte, the length of the tail varies from 2 to 8 almost equiprobably, so we rebuild the tail as "10000" and embed secret bits in front of it to decrease the overall degree of modification. Taking both embedding capacity and visual quality into account, we choose to embed two bits in one target byte. The final form of the modified target byte is thus "xb1b210000", where x stands for an original bit and b1b2 for the secret bits.
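The header check and target-byte rewrite described above can be sketched as follows, assuming the bit layout given earlier; the helper names are ours.

```python
def nal_ref_idc(header: int) -> int:
    # NALU header layout: forbidden_zero_bit (1) | nal_ref_idc (2) | nal_unit_type (5)
    return (header >> 5) & 0b11

def is_target_nalu(nalu: bytes) -> bool:
    # Low-importance units are those whose nal_ref_idc equals 00 or 01.
    return nal_ref_idc(nalu[0]) in (0, 1)

def embed_two_bits(last_byte: int, b1: int, b2: int) -> int:
    # Rebuild the last byte as x b1 b2 1 0 0 0 0, keeping the original MSB x.
    return (last_byte & 0b10000000) | (b1 << 6) | (b2 << 5) | 0b00010000
```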

2.2 Embedding Process

The embedding process is divided into five steps. Figure 2 shows the embedding procedure.

Fig. 2. Embedding procedure

Step 1: encrypt the secret bits in advance for security reasons. A scrambling algorithm based on a chaotic sequence is adopted here for encryption. Now that the target byte and the to-be-embedded bits are both determined, the following steps substitute specific bits with secret bits.

Step 2: lengthen the original secret bits B0 by 2 or 3 bits to obtain the even-length bits B1, and group the lengthened bits in units of two. The added bits Badd are determined by the rules in Eqs. (1) and (2):

B1 = B0 + Badd  (1)

Badd = "11", if l(B0) mod 2 = 0; Badd = "100", if l(B0) mod 2 = 1  (2)

where + denotes bit-string concatenation and l(B0) is the length of B0.

Step 3: use FFMPEG to decapsulate the video file into an H.264/AVC bitstream and locate the target bytes.

Step 4: use each 2-bit unit b1b2 to substitute the second and third bits in a target byte, and set the following 5 bits to "10000". The target byte changes into the form "xb1b210000". All the 2-bit units are sequentially embedded into target bytes. To mark the end of embedding, we set the last 4 bits of the following target byte to "1000".


Step 5: use FFMPEG to encapsulate the modified bitstream; we then obtain the embedded video file and finish the embedding process.
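A minimal sketch of the padding rule of Step 2, representing the secret bits as a '0'/'1' string (our representation choice):

```python
def pad_secret_bits(b0: str) -> str:
    # Eq. (2): append "11" if l(B0) is even, "100" if it is odd, so the
    # result B1 always splits into 2-bit embedding units.
    return b0 + ("11" if len(b0) % 2 == 0 else "100")
```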

2.3 Extracting Process

In this subsection, the extracting algorithm used to extract the secret data from the embedded H.264/AVC video sequences is introduced. The steps are described below; Figure 3 shows the extracting procedure.

Fig. 3. Extracting procedure

Step 1: use FFMPEG to decapsulate the video file into an H.264/AVC bitstream and locate the embedded target bytes. We obtain all NALUs and then check the value of nal_ref_idc (the second and third bits of the header). If nal_ref_idc is 00 or 01, the last byte of the current NALU is an embedded target byte.

Step 2: extract the second and third bits from the embedded target bytes until the tail "10000" turns into "1000", which marks the end of extraction.

Step 3: trim the extracted secret bits B2. We need to remove the last 2 or 3 bits that were added in the embedding process to obtain the rebuilt bits B3. The removed bits Brm are determined by the rules in Eqs. (3) and (4):

B3 = B2 − Brm  (3)

Brm = "11", if the last bit = 1; Brm = "100", if the last bit = 0  (4)

where − denotes removing Brm from the end of B2.

Step 4: decrypt the trimmed secret bits into the original bits. The secret information is then rebuilt without any difference from the one before embedding.
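Continuing the string representation from the embedding sketch, the inverse trimming rule of Eqs. (3)-(4) can be sketched as:

```python
def trim_secret_bits(b2: str) -> str:
    # Eq. (4): the last bit identifies the padding ("11" ends in 1,
    # "100" ends in 0), so strip 2 or 3 bits accordingly.
    return b2[:-2] if b2.endswith("1") else b2[:-3]

assert trim_secret_bits("1010" + "11") == "1010"  # even-length secret
assert trim_secret_bits("101" + "100") == "101"   # odd-length secret
```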


3 Experimental Results

In this section, we show the experimental evaluation of the performance of our scheme. Three MP4 video sequences of different sizes are used as test samples. Embedding speed is the aspect we care about most in our evaluation; the performance in visual quality and embedding capacity is also given in the following subsections.

3.1 Embedding and Extraction Speed

As mentioned before, our algorithm is proposed to provide a solution for embedding speed improvement. To verify this, we tested the time needed to embed a specific number of bits (the length is kept below the capacity, which is discussed in the next subsection) into video sequences of different sizes. The results are shown in Table 1.

Table 1. Embedding and extracting speed

Video size  Secret size  Embedding time  Extracting time
2.71 MB     100 bits     0.030 s         0.023 s
2.71 MB     200 bits     0.031 s         0.024 s
18.5 MB     1000 bits    0.056 s         0.034 s
18.5 MB     4000 bits    0.092 s         0.074 s
161 MB      10000 bits   1.687 s         0.328 s
161 MB      20000 bits   1.986 s         1.141 s

3.2 Embedding Capacity and Visual Quality

Though we did not spend much effort on embedding capacity and visual quality when designing our data hiding algorithm, the performance in these two aspects is acceptable, which ensures that our scheme is practical and can present its advantage in embedding speed without too much concern or limitation. In our experiments, to evaluate capacity, we test the maximum number of bits that can be embedded in the three video sequences and the ratio of NALUs that can be selected to embed secret bits in our scheme. The results are shown in Table 2.

Table 2. Embedding capacity

Video size  Maximum secret size  Ratio of target NALU
2.71 MB     294 bits             48.2%
18.5 MB     4348 bits            55.1%
161 MB      25002 bits           52.0%


Using an FFMPEG command to compare the original video and the embedded video, we obtain the visual quality evaluated by PSNR (Peak Signal to Noise Ratio) when embedding the maximum number of bits. The results are shown in Table 3.

Table 3. Visual quality

Video size  2.71 MB  18.5 MB  161 MB
PSNR-Y      83.04    45.90    45.88
PSNR-U      63.22    62.82    64.76
PSNR-V      72.63    63.05    63.76
PSNR-AVG    70.37    47.61    47.61

3.3 Analysis

The experimental results in both embedding and extracting speed have shown the excellent performance of our scheme, and the performance in embedding capacity and visual quality is acceptable. One thing that needs to be pointed out is that our fast algorithm for data hiding in the H.264 video bitstream is not robust against decoding and re-encoding. But in some application scenarios, for example content authentication, our fast scheme is definitely in high demand.

4 Conclusion

In this paper, a fast data hiding algorithm for H.264/AVC standard video is proposed. In the proposed scheme, we target the bitstream domain and make use of the last byte of some NALU packets that have lower importance in the bitstream. The algorithm achieves a high embedding speed while maintaining good visual quality and embedding capacity.

References

1. Richardson, I.E.: H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia. Wiley, Hoboken (2004)
2. Ma, X., Li, Z., Tu, H., et al.: A data hiding algorithm for H.264/AVC video streams without intra-frame distortion drift. IEEE Trans. Circ. Syst. Video Technol. 20(10), 1320–1330 (2010)
3. Lin, T.J., Chung, K.L., Chang, P.C., et al.: An improved DCT-based perturbation scheme for high capacity data hiding in H.264/AVC intra frames. J. Syst. Softw. 86(3), 604–614 (2013)
4. Nguyen, D.C., Nguyen, T.S., Chang, C.C., et al.: High embedding capacity data hiding algorithm for H.264/AVC video sequences without intraframe distortion drift. Secur. Commun. Netw. 2018 (2018)


5. Nguyen, C.V., Tay, D.B.H., Deng, G.: A fast watermarking system for H.264/AVC video. In: 2006 IEEE Asia Pacific Conference on Circuits and Systems, pp. 81–84. IEEE (2006)
6. Niu, K., Yang, X., Zhang, Y.: A novel video reversible data hiding algorithm using motion vector for H.264/AVC. Tsinghua Science and Technology 22(5), 489–498 (2017)
7. Li, D., Zhang, Y.N., Li, X.C., et al.: Two-dimensional histogram modification based reversible data hiding using motion vector for H.264. Multim. Tools Appl. 78(7), 8167–8181 (2019)
8. Liu, N., Pei, J., Ma, L.: A watermark scheme for videos in bitstream domain. Commun. Technol. 50(10), 2333–2339 (2019)

An Efficient Pattern Recognition Technology for Numerals of Lottery and Invoice

Yi-Nung Chung1, Ming-Sung Chiu1, Chien-Chih Lin1, Jhen-Yang Wang1, and Chao-Hsing Hsu2

1 Department of Electrical Engineering, National Changhua University of Education, Changhua 500, Taiwan
[email protected], [email protected], [email protected]
2 Department of Information and Network Communications, Chienkuo Technology University, Changhua 500, Taiwan
[email protected]

Abstract. An efficient pattern recognition technology for recognizing the numeral characters of lottery and invoice tickets is proposed in this paper. Some Arabic numerals on lottery or invoice tickets are easily confused because of unclear printing. In this algorithm, image processing and pattern recognition technology is applied. The advantages of this algorithm include that the region of interest in images can be captured automatically and that the recognition accuracy is remarkable. In order to compare the Arabic numerals of an invoice with templates in a database, optical character recognition technology is adopted. By comparing the normalized characters with the templates in the database, the algorithm can recognize the correct Arabic numerals of lottery or invoice tickets. The experimental results show that the proposed algorithm is efficient and highly accurate.

Keywords: Pattern recognition · Arabic numerals · Image processing · Optical character recognition technology

1 Introduction

In order to avoid tax evasion, the government in Taiwan encourages people to request unified invoices when they buy daily articles. There are many awards if their invoice numbers match the prize numbers, which the government announces every two months. People are therefore interested in checking their invoice numbers every two months, but they need to spend much time doing so. In this paper, an efficient algorithm that applies image processing and pattern recognition technology to recognize the Arabic numerals on unified invoices and lottery tickets is proposed. At first, image pre-processing of the invoice or lottery image is performed. In this pre-processing step, a gray-image converter is used to obtain the binary image, the region of interest (ROI) [1, 2] is set, and the interesting part is captured, which simplifies the image and reduces the processing time. In order to compare the character image with the templates, the system applies Optical Character Recognition (OCR) technology to recognize the characters of the object image.


In the image process, a series of images is taken by the camera, and image processing technology is then applied to acquire meaningful information for further processing. The images may be corrupted by random variations, including the intensity of the light, poor contrast, shadows, and noise. The system applies several methods to remove noise and enhance the image in order to improve the recognition results. It is better to intensify the natural patterns before the recognition work, so an image enhancement algorithm is applied as well. Moreover, in order to smoothly extract the characters, the object must be put in the correct position. To reduce the computation time, the unnecessary area of the pattern in the image needs to be removed; in this process, the system sets the region of interest (ROI). In general, the image is in the RGB color space, but an RGB color image may be affected by shadows. To avoid this influence, this study transfers the image from the RGB color space to the HSV color space [3]. Moreover, to reduce the computational burden, the system converts the color image to a grayscale image. The grayscale image converted from the original image may contain some background noise, which needs to be filtered out before applying edge detection. Edge positions are usually on the border between the object and the background, so this principle can be used to detect the edge of the object. Common edge detection methods include the Sobel, Prewitt, and Canny edge detection methods [4–7], etc. In this paper, the Canny edge detection method is used to detect the edge of the object, because the Canny method filters out noise in pre-processing and applies dual hysteresis thresholds, which give it the advantages of filtering and enhancing edges; edges detected by the Canny method are clearer. The rest of this paper is arranged as follows. The second section describes the system algorithm, which includes the image processing technology and the pattern recognition process; the detailed algorithm proposed in this paper is described there. The third section is the experimental test, in which a test algorithm is applied and several sets of data are tried in the experiment to obtain the recognition results. The conclusion is presented in the last section.

2 System Algorithm

In this study there are two parts, which include pre-image processing and pattern recognition. There are several tasks in the pre-image process: binarization, color space transform, enhancement, ROI setting, edge detection, and morphology processing [8, 9, 12]. Because a color image contains three components, which increases the computational burden, the system converts the color image to a grayscale image; the transform equation is shown in Eq. (1). Moreover, in order to reduce the influence of shadows, the system also transfers the RGB space to the HSV color space in the pre-image process. It then applies the binarization process to binarize the image. The first step of the binary image conversion process is to set a threshold as the judgment condition. If the input image grayscale value is less than or equal to the threshold, the pixel is set to 0; if the grayscale value is greater than the threshold, the pixel is set to 1. Therefore, the image has only two grayscale values left, which are


0 and 1, where 0 represents black and 1 represents white; the binarization formula is shown in Eq. (2). The grayscale conversion formula is:

Y = 0.299R + 0.587G + 0.114B  (1)

where R is the value of the red channel, G is the value of the green channel, and B is the value of the blue channel. After applying this equation, we obtain the gray-level value Y. The images after applying this process are shown in Fig. 1.

Fig. 1. (a) The original image (b) grayscale image (c) binary image

248

Y.-N. Chung et al.

F(i, j) = 1, if G(i, j) > threshold; F(i, j) = 0, if G(i, j) ≤ threshold  (2)

where F(i, j) is determined by the threshold value and G(i, j) is the pixel value in the grayscale image. The images after applying the pre-image process are shown in Fig. 1. In the whole image, the part that the user wants may be only one small part, and the system does not need to process the entire image. In order to enhance the overall efficiency, the region of interest (ROI) needs to be set; in this system, the ROI can be set automatically. In order to capture the object image, edge detection needs to be performed. The purpose of edge detection is to detect the points where the brightness changes in the image, which are usually on the boundary between the object and the background. In this paper, the Canny edge detection method is used to detect the edges of the unified invoice. The Canny edge detection method is an algorithm for extracting useful structural information from different vision objects and has been widely applied in many computer vision systems. Applying Canny edge detection yields a lower error rate and a more obvious effect. The edge detection results are shown in Fig. 2; finally, the object is captured, as shown in Fig. 2(d).
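As an illustration, the grayscale conversion of Eq. (1), the binarization of Eq. (2), and Canny edge detection map directly onto OpenCV calls; the file name and threshold values below are illustrative, not the paper's settings.

```python
import cv2

img = cv2.imread("invoice.jpg")                   # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # applies the Eq. (1) weights
_, binary = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)  # Eq. (2)
edges = cv2.Canny(gray, 100, 200)                 # dual hysteresis thresholds
```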

Fig. 2. (a) Invoice image (b) edge detection result by Canny method (c) rotation to right position (d) Capture the target


Morphology [9, 12] is an important field in image processing, applied here for binary image enhancement. Its purpose is to reduce fragmentary objects in images. The morphological processing method is based on the set theory of mathematics; the operation on the image is done by masking and performing displacement operations between pixels in the image. This mask is also called a structural element (SE); the result of a morphological operation is related to the shape of the structural element in addition to the image itself. The user can set structural elements of different sizes and shapes, and the morphological algorithm is then processed based on the preset element structure. The binarized image is filled or hollowed out to achieve image segmentation and recognition. However, the character area is black, so the process should be different: the first step is the dilation process, and then the erosion process is applied to enlarge the black characters. The invoice images after the morphology process are shown in Fig. 3.

Fig. 3. (a) The original image (b) image after dilation process (c) image after erosion process

For the pattern recognition process, the first step of character recognition is to obtain templates and then establish the comparison database. Usually, the more templates there are, the better the recognition accuracy; however, each additional template occupies more memory space in the database and needs more computation time for comparison and identification. Therefore, the system increases the number of templates for hard-to-identify characters such as 0, 3, 6, 8, and 9; after this, the identification accuracy is enhanced. The character recognition algorithm uses optical character recognition (OCR) [10, 11] to determine the shape of a character in a dark-and-bright manner and then compares it with the templates in the database. Equation (3) is the correlation coefficient formula, which is used to calculate the overall agreement of two variables:

r = corr2(A, B) = Σ_m Σ_n (A_mn − Ā)(B_mn − B̄) / sqrt[(Σ_m Σ_n (A_mn − Ā)²)(Σ_m Σ_n (B_mn − B̄)²)]  (3)

where A and B represent two images of the same size, Ā represents the average of all pixels in image A, and B̄ represents the average of all pixels in image B. r represents the correlation coefficient, which ranges from −1 to 1; values closer to −1 or 1 indicate stronger correlation, and 0 means completely uncorrelated.
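A direct NumPy transcription of Eq. (3) can be sketched as follows; the name mirrors the corr2 function referenced in the equation.

```python
import numpy as np

def corr2(A, B):
    # Correlation coefficient of two equally sized grayscale images.
    A = A.astype(float) - A.mean()
    B = B.astype(float) - B.mean()
    return (A * B).sum() / np.sqrt((A ** 2).sum() * (B ** 2).sum())
```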


3 Experimental Test

In this section, the image processing techniques are applied to conduct the experiments. In the experiment, we chose 50 invoices randomly and then placed these 50 invoices at different angles, such as parallel to the vertical axis, inclined clockwise, and inclined counterclockwise. Typical invoice images are shown in Fig. 4.

Fig. 4. Invoice image in different angles

Some identification examples are shown in Table 1 (the original and captured invoice images in the first two columns are omitted here).

Table 1. Identification examples

Real numerals  Identification results  Correct number
76399508       76399508                8
00182965       00182965                8
07620385       07620385                8


There are eight Arabic numerals on each invoice, so the total number of numerals is 400. After applying the pattern recognition algorithm proposed in this paper, only three numerals are wrong; the identification accuracy is over 99%.

4 Conclusion

In this paper, an efficient image recognition algorithm is applied to the identification of Arabic numerals on invoices. The identification process includes background removal, rotation correction, automatic capture of the region of interest, and optical character recognition (OCR). In particular, the Canny method is applied for edge detection; because the Canny method applies dual hysteresis thresholds, it has the advantages of filtering and enhancing edges and detects clearer edges. Based on the experimental results, the accuracy of character recognition is over 99%. The advantages of this algorithm include that the region of interest in images can be captured automatically and that the recognition accuracy is remarkable. By applying this method, users can save time and enhance efficiency as well.

Acknowledgments. This work was supported by the Ministry of Science and Technology under Grant MOST 106-2221-E-018-028-.

References

1. Vu, H., Le, T.L., Tran, T.H.: A vision-based method for automatizing tea shoots detection. In: 2013 IEEE International Conference on Image Processing, pp. 3775–3779 (2013)
2. Huseyin, O., Chen, T., Wu, H.R.: Performance evaluation of multiple regions-of-interest query for accessing image databases. In: Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech, pp. 300–303 (2001)
3. Liu, F., Liu, X., Chen, Y.: An efficient detection method for rare colored capsule based on RGB and HSV color space. In: 2014 IEEE International Conference on Granular Computing, pp. 175–178 (2014)
4. Yitzhaky, Y., Peli, E.: A method for objective edge detection evaluation and detector parameter selection. IEEE Trans. Pattern Anal. Mach. Intell. 25(8), 1027–1033 (2003)
5. Dezert, J., Liu, Z.G., Mercier, G.: Edge detection in color images based on DSmT. In: Proceedings of the 14th International Conference on Information Fusion (FUSION), pp. 343–350 (2011)
6. Qiu, T., Yan, Y., Gang, L.: An auto-adaptive edge-detection algorithm for flame and fire image processing. IEEE Trans. Instr. Meas. 61(5), 1486–1493 (2012)
7. Jiang, J.A., Chuang, C.L., Lu, Y.L., Fahn, C.S.: Mathematical-morphology-based edge detectors for detection of thin edges in low-contrast regions. IET Image Process. 1(3), 269–277 (2007)
8. Yeh, M.T., Chung, Y.N., Huang, Y.X., Lai, C.W., Juang, D.J.: Applying adaptive LS-PIV with dynamically adjusting detection region approach on the surface velocity measurement of river flow. In: Computers and Electrical Engineering, pp. 1–17, December 2017
9. Shih, H.-C., Liu, E.-R.: Automatic reference color selection for adaptive mathematical morphology and application in image segmentation. IEEE Trans. Image Process. 25(10), 4665–4676 (2016)


10. Zhai, X., Bensaali, F., Sotudeh, R.: Real-time optical character recognition on field programmable gate array for automatic number plate recognition system. IET Circ. Devices Syst. 7(6), 337–344 (2013)
11. Ramiah, S., Liong, T.Y., Jayabalan, M.: Detecting text based image with optical character recognition for English translation and speech using Android. In: IEEE Student Conference on Research and Development (SCOReD), pp. 272–277 (2015)
12. Chung, Y.-N., Yun-Jhong, H., Tsai, X.-Z., Hsu, C.-H., Lai, C.-W.: Applying image processing technology to region area estimation. Adv. Intell. Syst. Comput. 579, 77–83 (2017)

An Improved YOLOv3 Algorithm for Pedestrian Detection on UAV Imagery

Yulin Yang, Baolong Guo, Cheng Li, and Yunpeng Zhi

School of Aerospace Science and Technology, Xidian University, No. 2 South Taibai South Road, Xi'an, Shaanxi, China
[email protected], [email protected]

Abstract. Pedestrian detection algorithms based on deep learning often rely on high-performance image processing platforms, which does not suit the high portability requirements of practical engineering. In order to solve this problem, an improved approach is proposed that can be applied to embedded platforms. Firstly, dense connectivity is adopted to connect multi-layer features. Then the k-means++ algorithm is used to cluster the pedestrian targets, and the method of eliminating duplicate detections is optimized. Finally, the detection model is obtained through multi-scale training. Experimental results show that Dense-YOLOv3-tiny can detect pedestrians on UAV imagery in real time and exhibits higher precision and recall rate than the YOLOv3-tiny model on embedded platforms.

Keywords: Deep learning · Embedded device · Pedestrian detection · Dense connectivity

1 Introduction


Introduction

Pedestrian detection [1,2] refers to detecting objects and marking their coordinates from a image or video. In recent years, with the development of deep learning [3], convolutional neural networks (CNNs) have become one of the dominant approaches of pedestrian detection. However, in order to improve the performance of object detection, network structure tends to be complex. The majority of networks have high requirements on hardware and software resources and power consumption, which are not conducive to deployment in practical applications. Therefore, for the embedded platforms with fewer computing units and slower processing speed, designing tiny network is important prerequisite for implementing advanced applications such as smart driving, smart security and unmanned aerial vehicle (UAV). The pedestrian detection algorithm based on deep learning [4] can be generally divided into two categories: two stages method and one stage method. In a two stages algorithm, a large number of region proposals that may contain targets are generated, and then regression and classification are performed on these regions. Girshick and Donahue [5] propose the Regions-Convolution c Springer Nature Singapore Pte Ltd. 2020  J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 253–261, 2020. https://doi.org/10.1007/978-981-15-3308-2_29

254

Y. Yang et al.

Neural Network (R-CNN) algorithm, which segments and clusters images to obtain a large number of region proposals. Then these regions are cropped to a uniform size and processed with convolutional neural networks. The Fast Regions-Convolution Neural Network (Fast R-CNN) [6] use a pyramid pooling to process images of any size. Faster Regions-Convolution Neural Network (Faster R-CNN) [7] merges Region Proposal Network (RPN) and Fast R-CNN into a network by sharing their convolutional features. However, these algorithms are too computationally intensive for embedded device to meet real-time requirement. The one stage algorithm realizes the end-to-end pedestrian detection and improves the speed of detection. You Only Look Once version 3 (YOLOv3) [8] is one of the most representative algorithms. The tiny network model of the YOLOv3 algorithm can be transplanted into the embedded device for real-time detection, while the precision is not desirable. Therefore, how to optimize the neural network so that it can achieve the real-time and accurate pedestrian detection in the embedded device is a research hot topic in the field of computer vision. In this paper, a Dense-YOLOv3-tiny network structure is proposed, which combines Dense Convolutional Network (Densenet) [9] and YOLOv3-tiny. The proposed method adopts k-means++ algorithm [10] to cluster the pedestrian target, and optimizes the algorithm of eliminating duplicate bounding boxes to improve the detection accuracy and recall rate. Finally, the method is applied to the embedded device to achieve accurate and real-time pedestrian detection.

2

YOLOv3-tiny

YOLOv3-tiny is a regression-based target detection network with high real-time performance. The network model divides the input image into S×S cells and uses two scale feature maps for prediction, which effectively improves the accuracy of small target detection. Firstly, the six anchor boxes obtained by the K-means algorithm are equally divided into two scales. Then, the anchor boxes is predicted to obtain object information (tx , ty , th , tw ), to , and conditional probability of object category. According to the first five parameters, the position coordinates (x, y, w, h) and confidence Co of the bounding boxes relative to the feature map can be calculated. The calculation formula is as follows [11]: ⎧ x = σ(tx ) + cx ⎪ ⎪ ⎨ y = σ(ty ) + cy (1) w = pw etw ⎪ ⎪ ⎩ h = ph eth Co = Po ∗ Ipt = σ(to )

(2)

In Eq. (1), σ(·) represents the function of sigmoid(·). (cx , cy ) is the cell offset from the top left corner of the grid. (pw , ph ) is the width and height of the bounding box relative to the feature map. In Eq. (2), Po refers to the probability

An Improved YOLOv3 Algorithm for Pedestrian Detection on UAV Imagery

255

of including the target in the grid, and Ipt is the intersection over union (IOU) [12] between the bounding box p and the ground truth t. The category confidence of the bounding box indicates both the probability that the target of the frame prediction belongs to and the IOU which the bounding box matches the ground truth. Pci means that there is an object in the bounding box and the probability of class i, so the confidence Cc which the object contained in the bounding box belongs to a certain category is: Cc = Pci ∗ Po ∗ Ipt

(3)

After obtaining the Cc , the bounding boxes with less probability are filtered according to the set threshold, and then the remaining bounding boxes are screened by Non-Maximum Suppression (NMS). Finally, the location and category probability of the pedestrian can be obtained.

3 3.1

The Proposed Method Clustering of Pedestrian

YOLOv3-tiny introduces an anchor box of the Faster R-CNN algorithm, and obtains the predicted position by regressing the position and size of the selected anchor box during the training. The anchor box with the appropriate size is able to decrease the training time. Therefore, YOLOv3-tiny uses k-means algorithm clustering to determine the optimal size of anchor boxes, and applies the IOU between the cluster center and the ground truth as the distance metric. The formula is as follows: d(c, t) = 1 − Itc

(4)

where c is the cluster center and t is the ground truth. According to the characteristics of VOC20 and COCO80 dataset samples, YOLOv3-tiny uses the k-means algorithm to obtain anchor boxes which are not suitable for pedestrian detection. This paper adopts k-means++ algorithm to select the initial cluster centers, which could select a point farther from the existing cluster center as the next center. The k-means algorithm is completely affected by the initial value, while the k-means++ algorithm is not. Therefore, this paper clusters the pedestrian samples in dataset of UAV images, and obtains six anchor boxes suitable for pedestrian detection. The sizes of which are: (17, 43), (35, 64), (40, 95), (53, 132), (75, 148), (100, 232). 3.2

Dense-YOLOv3-tiny Network

The YOLOv3-tiny network structure consists of 13 convolutional layers and 6 maxpooling layers, which predicts bounding boxes at two different scales. However, the convolutional networks is shallow and each feature map cannot be fully utilized, which reduces the accuracy of target detection to some extent.

256

Y. Yang et al.

In order to address the problem, this paper applies the dense connectivity, as well as strengthens the propagation of features by connecting all convolutional layers with each other. Moreover, fewer convolutional layers are used to generate more features, which avoid the deepening of the network and increasing the number of model parameters. Each convolutional layer in a dense block has a dense connectivity with all previous convolutional layers. The dense block formula is as follows: Xi = H([X0 , X1 , ..., Xi−1 ])

(5)

H(·) = BN − Relu + Conv(1, 1) + BN + Relu + Conv(3, 3)

(6)

where X0 represents the input feature map in the dense block, and Xi is the output feature map of the dense block. [X0 , X1 , ..., Xi−1 ] refers to the concatenation of the feature maps produced in layers 0 to i − 1. In Eq. (6), H(·) is a non-linear transformation function, which is a combination of three operations, including Batch Normalization (BN ), Rectified linear unit (Relu) and Convolution (Conv). As the number of layers increases, the input of dense block will increase sharply. Therefore, the 1×1 convolutional layer is added before the 3×3 convolutional layer, which can reduce the number of feature maps in the block and improve the computational efficiency.

Fig. 1. Dense-YOLOv3-tiny network structure

An Improved YOLOv3 Algorithm for Pedestrian Detection on UAV Imagery

257

In this paper, the proposed network termed as Dense-YOLOv3-tiny which replaces the fifth convolutional layer and the seventh convolutional layer in the structure with Dense Block1 and Dense Block2. The improved network structure is shown in Fig. 1. The internal structure of the dense block is shown in Fig. 2. In the Dense Block1, the feature map X0 is channel-combined with the X1 of the first layer output to obtain a feature map of 160 channels which is used as the input of the second layer. After H(·) operation, the number of channels of the output feature map becomes 32, and then X2 , X1 and X0 are concatenated as the input of the third layer, etc. The structure of Dense Block2 is similar to the Dense Block1.

Fig. 2. Dense Block structure

3.3

Soft-NMS

YOLOv3 uses NMS to eliminating duplicate bounding boxes. The principle of the NMS algorithm is to select the bounding box with the highest confidence, as well as set the overlap threshold based on experience. When the IOU between the selected bounding box and the one of the remaining bounding boxes is larger than the set threshold, the confidence of bounding box is set to zero. Then repeat the above process for the remains. However, the NMS algorithm cannot completely detect the pedestrians that are occluded from each other and also increases the miss-rate in UAV images. Therefore, this paper adopts the Soft-NMS algorithm [13], which decays the confidence that the IOU exceed the set threshold. If the results has highly value of IOU, its confidence should be decreased more. The Soft-NMS functions as follows:  IiM < Nt Si , (7) S(n) = Si (1 − IiM ), IiM ≥ Nt where Si represents the confidence of the bounding box, IiM is the IOU between the bounding box M with the highest confidence and the remaining bounding box i, and Nt is the set overlap threshold. Thus, the higher overlap with M , the greater the attenuation of confidence of bounding box, and the bounding box far from M is not affected, which can reduce the miss-rate of detection.

4 Experiment and Analysis

4.1 Platform and Dataset

The experiment is performed on an aerial pedestrian dataset collected from the Internet and experimental project data, which has about 4800 images. Seventy percent of the images are randomly extracted from the dataset as the training set, and the remainder is used as the test set. The platform is a Jetson Xavier, which has an ARM 64-bit CPU, a 512-core GPU, and 16 GB of memory.

4.2 Training

The improved network only contains convolutional layers and maxpooling layers, so the input image size can be changed. In the experiment, the down-sampling parameter is 32, and the scale of the input picture is [320, 320+32×k], k ∈ [0, 9]. During the training process, the size of the input picture is randomly selected every 10 batches. The multi-scale training method can better predict pictures of different sizes, so that the network structure can perform detection tasks at different resolutions. The learning rate of training is 0.001 and drops to 0.1 times the current learning rate when the steps are 40000 and 45000, respectively. The loss curve of the algorithm in the training process is shown in Fig. 3. After training, the loss function stabilizes and approaches 0.09, which indicates that the network parameters of the algorithm are reasonable.

Fig. 3. Loss curve in Dense-YOLOv3-tiny

4.3 Results

In order to verify the performance of the algorithm, we use the same embedded experimental platform (Xavier) and the same aerial pedestrian dataset, and calculate the mean Average Precision (mAP), Recall, and Frames Per Second (FPS), as shown in Table 1. Table 1 shows that the SSD and YOLOv3 algorithms cost a lot of time on the embedded platform and cannot meet the real-time requirement. Compared with YOLOv3-tiny, the proposed method increases the mAP to 80.32% and the Recall to 81.96%. Because the dense connectivity blocks have little influence on the detection speed, the algorithm performs better while still meeting the real-time requirement. Figure 4 shows that the proposed method can accurately locate pedestrian positions in UAV images. Figure 5 is a comparison of YOLOv3-tiny and the proposed method.

Table 1. Experimental results

Method       mAP (%)  Recall (%)  FPS
SSD          89.53    86.72       3
YOLOv3       91.04    89.82       8
YOLOv3-tiny  75.30    76.40       26
Proposed     80.32    81.96       25

Fig. 4. The results of the proposed method for pedestrian detection on UAV imagery

Fig. 5. Visual comparison of pedestrian detection by YOLOv3-tiny and the proposed Dense-YOLOv3-tiny. The first row shows the detection results of YOLOv3-tiny, and the second row shows the detection results of the proposed method.


The results reveal that the false positive rate and miss-rate of the proposed method are significantly lower than those of YOLOv3-tiny. The algorithm is thus more suitable for pedestrian detection on UAV imagery.

5 Conclusion

This paper proposes a network structure with dense connectivity based on the YOLOv3-tiny detection model. The k-means++ algorithm is applied to cluster the pedestrian dataset so that the sizes of the anchor boxes more closely match pedestrian sizes, and an optimized method of eliminating duplicate bounding boxes is introduced to increase the recall rate. Experimental results demonstrate that the proposed method makes full use of the feature maps and improves the performance of pedestrian detection on UAV imagery while meeting the real-time requirement on embedded platforms.

Acknowledgments. This work is supported by the National Natural Science Foundation of China (61571346). The research is also supported by the Fundamental Research Funds for the Central Universities and the Innovation Fund of Xidian University.

References

1. Zhang, S.S., Benenson, R., Omran, M., Hosang, J.: How far are we from solving pedestrian detection? In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 1259–1267. IEEE Computer Society (2016)
2. Benenson, R., Omran, M., Hosang, J., Schiele, B.: Ten years of pedestrian detection, what have we learned? In: Proceedings European Conference on Computer Vision, Zurich, Switzerland, pp. 613–627 (2014)
3. Shrestha, A., Mahmood, A.: Review of deep learning algorithms and architectures. IEEE Access 7, 53040–53065 (2019)
4. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
5. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Ohio, USA, pp. 580–587. IEEE Computer Society (2014)
6. Girshick, R.: Fast R-CNN. In: 15th IEEE International Conference on Computer Vision, Santiago, Chile, pp. 1440–1448. IEEE (2015)
7. Ren, S.Q., He, K.M., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2017)
8. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
9. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 2261–2269. IEEE (2017)


10. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, pp. 1027–1035. ACM (2007)
11. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 6517–6525. IEEE (2017)
12. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 779–788. IEEE Computer Society (2016)
13. Bodla, N., Singh, B., Chellappa, R., Davis, L.S.: Soft-NMS: improving object detection with one line of code. In: Proceedings IEEE International Conference on Computer Vision, Venice, Italy, pp. 5562–5570. IEEE (2017)

Fast Ship Detection in Optical Remote Sensing Images Based on Sparse MobileNetV2 Network

Jinxiang Yu, Tong Yin, Shaoli Li, Shuo Hong, and Yu Peng

School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150080, China
[email protected]

Abstract. Ship detection in optical remote sensing images (ORSIs) has drawn lots of attention because of its extensive potential in maritime applications. Although many methods have been proposed in recent years, there are still great challenges in improving detection accuracy and detection speed. In this paper, a fast ship detection method for optical remote sensing images based on a sparse MobileNetV2 network is proposed, which has high accuracy and fast detection speed. The ship detection problem is turned into a sub-image classification one, which successfully avoids the massive computation caused by the region proposal stage in previous methods. The sparse MobileNetV2 network has high detection accuracy and less computation, benefiting from convolutional neural networks and depthwise separable convolution. Furthermore, a pruning method is used to compress the network to decrease model complexity and prevent overfitting. Several experiments are conducted on optical remote sensing images from Google Earth. The results demonstrate that the proposed method achieves over 5x speed enhancement compared with several mainstream ship detection methods, while the accuracy is competitive.

Keywords: Optical remote sensing image · Ship detection · Sparse MobileNetV2 network · Model compression

1 Introduction

Ship detection in optical remote sensing images (ORSIs) is to determine whether an image contains one or more target ships and to locate the position of each one [1]. In recent years, huge amounts of high-resolution ORSIs have become available, and the details of objects in the images are much clearer. Ships are important transportation carriers and military targets, and their detection is of vital importance in shipping scheduling, salvage, anti-piracy operations, war warning and so on. As a result, ship detection based on ORSIs has become a hot issue. However, optical remote sensing images usually have a complex background, so ship detection becomes much more difficult under the influence of illumination, brightness, likeness [2], etc. On the other hand, ship detection platforms, most of which are satellites, have limited computing capacity and storage [3]. As a result, onboard ship detection is hard to achieve. Currently, ship detection is mainly accomplished by ground processors, which have strong computing power, but the


image data occupies too much bandwidth of the space-ground link, and the duration from shooting an image to detecting target ships is too long to meet actual demand, especially in time-sensitive applications. In the past few years, numerous methods have been proposed for accurate and fast ship detection based on ORSIs. They can generally be divided into three main categories: characteristic-based methods [4], template matching-based methods and machine learning-based methods. Machine learning-based methods are the most widely used among the three, and their detection performance is relatively good. The main idea is to use machine learning methods to classify and regress the target according to features extracted manually or automatically. In particular, deep-learning methods have made great achievements because of their automatic feature extraction [5], which is robust and can adapt to different situations. At present, there are mainly two categories of deep-learning-based methods: region proposal and classification methods, and regression-based methods. The first generates candidate areas by various search algorithms or edge detection methods, and the areas potentially containing targets are then sent to a neural network for feature extraction. The features are classified and scored for each candidate area to obtain the final detection results. The mainstream methods are Fast R-CNN, Faster R-CNN [6] and so on. On the other hand, regression-based methods such as YOLO [7] and SSD [8] set default boxes to perform window interception on the original image. The target category and position offset are directly regressed while the neural network extracts features, which avoids the region proposal stage and gives faster speed. To further enhance detection speed, fusion methods such as MobileNet-SSD [9] have been proposed, which decrease computation by optimizing the computation structure. However, most of these methods aim at object detection in natural images, whose data volume is relatively low and whose targets are obvious, while ORSIs have an enormous data volume and the ship targets in them are very small compared with the background. Existing methods are not fast enough to meet the requirements of ship detection based on ORSIs. In this paper, a fast and accurate ship detection method is proposed, which aims at the application of onboard ship detection. The ship detection problem is turned into a sub-image classification one. A sparse MobileNetV2 network with competitive accuracy and little computation is applied to perform fast classification, and fast ship detection is finally achieved through a regression and localization stage. The rest of this paper is organized as follows. In Sect. 2, the MobileNetV2 network is introduced in detail. In Sect. 3, the framework of the proposed method is described. Section 4 presents comparative experiments and their results, and Sect. 5 concludes this paper.

2 Methodology

Deep-learning-based methods can extract high-level semantic information from the original image data layer by layer, which avoids the inefficient and non-universal handcrafted feature extraction stage and achieves better accuracy. Meanwhile, they have good robustness against the interference of illumination, brightness and likeness in


the ORSIs. However, because of the high computational complexity and large model size, the computation is usually large as well. MobileNetV2 is the second generation of the MobileNet series of CNN-based deep learning models, first introduced by Google [10]. Different from other CNN-based networks, the MobileNetV2 network uses depth separable convolution to decrease the convolution computation; on the other hand, it introduces the bottleneck residual module to further decrease the storage requirement. The complete structure of the MobileNetV2 network is shown in Fig. 1.

Fig. 1. The MobileNetV2 model structure (convolution layers and bottleneck residual modules (BRM); each BRM contains an expansion layer, a depthwise convolution and a pointwise convolution, each followed by BatchNorm and ReLU6, with average pooling before the final convolution)

2.1 Depth Separable Convolution

Depth separable convolution factorizes a standard convolution into a depth convolution and a pointwise convolution. In the depth convolution, each input channel is convolved with a separate filter. After that, the outputs are combined by a pointwise convolution with a 1 × 1 kernel. In this way, the standard convolution operation can be accomplished while significantly reducing the computational complexity.

Fig. 2. Depth separable convolution structure decomposition (a standard convolution with kernel (D_K, D_K, M, N) mapping an input F of size (D_F, D_F, M) to an output G of size (D_G, D_G, N), versus a depth convolution with kernel (D_K, D_K, M) followed by a 1 × 1 pointwise convolution (1, 1, N))

Fast Ship Detection in Optical Remote Sensing Images Based on Sparse

265

Figure 2 illustrates the process of decomposing a standard convolution into a depth convolution and a pointwise one. Assume that the input feature map $F$ has size $(D_F, D_F, M)$, where $D_F$ is the width and height and $M$ is the number of input channels, and that the output feature map $G$ has size $(D_G, D_G, N)$, where $D_G$ is the width and height and $N$ is the number of output channels. The standard convolution kernel $K$ has size $(D_K, D_K, M, N)$, where $D_K$ is the kernel size. When the stride is 1, a standard convolution is computed as in Eq. (1), where $i, j$ index the rows and columns of the kernel and $k, l$ index the rows and columns of the output feature map. The corresponding computation is $D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F$.

$$G_{k,l,N} = \sum_{i,j,M} K_{i,j,M,N} \cdot F_{k+i-1,\,l+j-1,\,M} \qquad (1)$$

For a depth separable convolution, the depth convolution kernel $\hat{K}$ has size $(D_K, D_K, M)$, using a single filter per channel, as shown in Eq. (2). After that, the pointwise convolution is used to create linear combinations of the depth convolution outputs. The total computation is $D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$.

$$\hat{G}_{k,l,M} = \sum_{i,j} \hat{K}_{i,j,M} \cdot F_{k+i-1,\,l+j-1,\,M} \qquad (2)$$

Comparing Eqs. (1) and (2), the reduction in computation achieved by the depth separable convolution is shown in Eq. (3). In one convolution operation, the computation sharply decreases by roughly a factor of $D_K^2$:

$$\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F} = \frac{1}{N} + \frac{1}{D_K^2} \qquad (3)$$
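To make the decomposition concrete, the following is a minimal NumPy sketch of depth separable convolution for stride 1 and no padding, with shapes named as in Eqs. (1) and (2). It is an illustration of the technique only, not the authors' implementation, and all array sizes are arbitrary examples.

```python
import numpy as np

def depth_separable_conv(F, K_dw, K_pw):
    """F: input feature map (D_F, D_F, M); K_dw: depth kernels (D_K, D_K, M),
    one filter per input channel; K_pw: pointwise kernels (M, N), i.e. the
    1 x 1 convolution. Returns G of shape (D_G, D_G, N), D_G = D_F - D_K + 1."""
    D_F, _, M = F.shape
    D_K = K_dw.shape[0]
    D_G = D_F - D_K + 1
    # Depth stage: each channel is convolved with its own filter (Eq. 2)
    G_hat = np.zeros((D_G, D_G, M))
    for k in range(D_G):
        for l in range(D_G):
            patch = F[k:k + D_K, l:l + D_K, :]          # (D_K, D_K, M) window
            G_hat[k, l, :] = np.sum(patch * K_dw, axis=(0, 1))
    # Pointwise stage: a 1 x 1 convolution linearly combines the channels
    return G_hat @ K_pw                                  # (D_G, D_G, N)

# Cost ratio of Eq. (3) for D_K = 3, N = 64: 1/64 + 1/9, roughly 0.127
F = np.random.rand(32, 32, 16)
G = depth_separable_conv(F, np.random.rand(3, 3, 16), np.random.rand(16, 64))
print(G.shape)  # (30, 30, 64)
```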

2.2 Bottleneck Residual Module

The bottleneck residual module is introduced to reduce storage requirements. The module takes a low-dimensional representation as input, expands it into higher dimensions, applies lightweight depthwise convolution for filtering, and then projects the features back to a low-dimensional representation with a linear convolution. Because the process needs few large tensors, the accesses to storage are significantly reduced. As shown in Fig. 1, the BRM consists of three convolutions. The expansion layer turns the input channel dimension d into nd, where n is the ratio of output channels to input channels. After that, the depthwise convolution and pointwise convolution are applied to obtain the output features.
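As an illustration of how such a module can be assembled, here is a minimal Keras sketch of one bottleneck residual module: a 1 × 1 expansion from d to nd channels, a depthwise convolution for filtering, and a linear 1 × 1 projection back to low dimension, following the layer pattern of Fig. 1 and MobileNetV2 [10]. The expansion factor, the residual-add condition and all names are assumptions for illustration, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def bottleneck_residual_module(x, d_out, expansion=6, stride=1):
    d_in = x.shape[-1]
    # Expansion layer: 1x1 convolution raises channels from d to n*d
    h = layers.Conv2D(d_in * expansion, 1, padding='same', use_bias=False)(x)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)            # ReLU6, as in Fig. 1
    # Depthwise convolution filters each channel independently
    h = layers.DepthwiseConv2D(3, strides=stride, padding='same', use_bias=False)(h)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)
    # Linear pointwise projection back to a low-dimensional representation
    h = layers.Conv2D(d_out, 1, padding='same', use_bias=False)(h)
    h = layers.BatchNormalization()(h)           # no activation: linear bottleneck
    if stride == 1 and d_in == d_out:            # residual shortcut when shapes match
        h = layers.Add()([x, h])
    return h

inp = tf.keras.Input((56, 56, 24))
out = bottleneck_residual_module(inp, d_out=24)
model = tf.keras.Model(inp, out)
```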

3 Proposed Method

The classification framework avoids the region proposal stage, which brings advantages in terms of model size and computation. The proposed framework is shown in Fig. 3. Firstly, instead of a region proposal stage, an ORSI is divided into several sub-images of the same size. Then the sparse MobileNetV2 network is applied to


determine whether each sub-image contains a ship target. Finally, the sub-images that contain ship targets are regressed back to the original image, and the ship targets are marked according to the location of each sub-image.

Fig. 3. Proposed ship detection framework based on sparse MobileNetV2 network (Stage 1: segmentation and decimation, i.e., overlapped segmentation and bilinear decimation of the original images; Stage 2: sub-image classification, i.e., MobileNetV2 training, verification and model compression; Stage 3: ship target location, i.e., sub-image regression and image marking)

3.1 Image Segmentation and Decimation

In this stage, the original image is cut into small overlapping pieces, in which a potential ship target occupies at least one third of the area, while the overlap guarantees the integrity of ship targets. This stage replaces the region proposal stage of previous detection frameworks to decrease computation. It may also contribute to classification accuracy, because the performance of a CNN-based classification network is greatly influenced by the size of the target in the image. Each piece is then subsampled with bilinear interpolation. By sacrificing part of the resolution, the size of each convolution layer in the forward inference process can be reduced, which effectively controls the memory requirements, reduces the time complexity and improves the efficiency and speed of the entire detection process.
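A minimal Python sketch of this stage is given below: the image is cut into equal-size overlapping sub-images and each is subsampled with bilinear interpolation. The patch size, overlap and decimation factor are illustrative assumptions; the paper does not state its exact values.

```python
import numpy as np
import cv2

def segment_and_decimate(image, patch=256, overlap=64, factor=2):
    """Yields (row, col, sub-image) with stride = patch - overlap; each
    sub-image is subsampled with bilinear interpolation."""
    stride = patch - overlap   # overlap preserves ships cut by patch borders
    h, w = image.shape[:2]
    for r in range(0, h - patch + 1, stride):
        for c in range(0, w - patch + 1, stride):
            sub = image[r:r + patch, c:c + patch]
            small = cv2.resize(sub, (patch // factor, patch // factor),
                               interpolation=cv2.INTER_LINEAR)  # bilinear decimation
            yield r, c, small

# Sub-images later classified as "ship" are mapped back to the original
# image using their (r, c) offsets.
image = np.zeros((1024, 1024, 3), dtype=np.uint8)
patches = list(segment_and_decimate(image))
```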

3.2 Model Training and Verification

The MobileNetV2 network is trained and verified as shown in Fig. 4. Some sub-images are labeled and divided into positive and negative samples to be used as the training dataset, while the others are used as the test dataset. The learning rate of the network is adjusted continually during training, and appropriate gradient descent and regularization methods are selected. Finally, a well-performing model is obtained.


Fig. 4. The process of model training and verification (training: positive and negative samples train the MobileNetV2 network to obtain trained models; verification: test images are classified to produce the test results)

3.3 Model Compression

Although the trained MobileNetV2 network has good accuracy, too many network layers combined with too little data easily cause overfitting. Moreover, the high storage and computing resource consumption makes it difficult to deploy on actual platforms. A pruning method is employed to compress the network, by which the complexity of the model is reduced, overfitting is effectively prevented and the generalization of the model is enhanced. Pruning means removing unimportant parameters, nodes or layers from the original model and reducing the channel numbers.
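As one concrete realization of this idea, the following sketch ranks the output channels of a convolution kernel by the L1 norm of their filters and keeps the strongest ones. The L1 criterion and the prune_ratio parameter are assumptions for illustration; the paper does not specify its pruning criterion.

```python
import numpy as np

def select_channels_to_keep(conv_weights, prune_ratio=0.3):
    """conv_weights: (k, k, in_ch, out_ch) kernel of one convolution layer.
    Ranks output channels by the L1 norm of their filters, keeps the largest."""
    out_ch = conv_weights.shape[-1]
    importance = np.sum(np.abs(conv_weights), axis=(0, 1, 2))  # L1 norm per channel
    n_keep = max(1, int(out_ch * (1.0 - prune_ratio)))
    keep = np.argsort(importance)[-n_keep:]       # indices of the strongest filters
    return np.sort(keep)

# Pruning a layer then requires slicing its kernel (and the next layer's
# input channels) to the kept indices, followed by fine-tuning.
w = np.random.randn(3, 3, 64, 128)
kept = select_channels_to_keep(w, prune_ratio=0.5)
w_pruned = w[:, :, :, kept]
print(w_pruned.shape)  # (3, 3, 64, 64)
```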

4 Experiments

4.1 Dataset, Evaluation Metrics and Experiment Environment

The proposed method is evaluated using ORSIs from the Google Earth platform. The imagery provided by this platform integrates satellite images and aerial photography data with various resolutions and rich categories. The dataset comes from images with spatial resolutions between 0.4 m and 0.58 m. Generally, four situations can occur in a ship detection task, as shown in Table 1. Accordingly, the main evaluation metrics of object detection, namely precision, recall and false alarm, are given in Eqs. (4) to (6).

Table 1. Four possible situations in ship detection

  Situations               | Ship target          | Background
  Detected as ship target  | True Positive (TP)   | False Positive (FP)
  Detected as background   | False Negative (FN)  | True Negative (TN)


$$\text{precision} = \frac{TP}{TP + FP} \qquad (4)$$

$$\text{recall} = \frac{TP}{TP + FN} \qquad (5)$$

$$P_F = \frac{FP}{TP + FP} \qquad (6)$$

In most ship detection tasks, the most important principle is to detect all existing ships, which means the recall should be 100%. In that case, precision can be used equivalently to demonstrate the accuracy of a detection method. Besides, the average detection time is used to evaluate the speed of each method. All the methods mentioned are implemented with TensorFlow and Python, and all the experiments run on a server with a 3.2 GHz main frequency and 8 GB of storage.

4.2 Experiments and Analysis

Several experiments covering two strategies are conducted for comparison, including mainstream ship detection methods and our method. The statistical results of the experiments are listed in Table 2.

Table 2. Experimental statistical results

  Strategy              | Method                           | Precision | Average detection time
  Regression-based      | SSD                              | 75.8%     | 16.361 s
                        | MobileNet-SSD                    | 74.6%     | 314 ms
  Classification-based  | AlexNet                          | 93.7%     | 121 ms
                        | InceptionV4                      | 98.2%     | 218 ms
                        | MobileNetV2                      | 97.3%     | 95 ms
                        | Sparse MobileNetV2 (our method)  | 93.7%     | 69 ms

SSD and MobileNet-SSD, which are regression-based methods, have much lower precision and longer average detection time than the classification-based methods. This is because prediction boxes of different sizes are placed on different convolution layers to match the default box of the target in order to predict its location more accurately. Among the classification networks, MobileNetV2 performs better in speed than AlexNet and InceptionV4, while suffering only a little accuracy loss. Moreover, our method achieves the fastest speed because the pruning operation compresses the model size and removes unimportant computation, while the accuracy is still competitive.


5 Conclusion

In this paper, a fast ship detection method for optical remote sensing images based on a sparse MobileNetV2 network is presented. The computation is significantly decreased by solving the ship detection problem with a classification strategy. Furthermore, the application of the sparse MobileNetV2 network, with depth separable convolution, the bottleneck residual module and pruning compression, gives the proposed method lower computational complexity while keeping competitive accuracy. Experimental results show that our method is much faster than mainstream methods, with only a tolerable decrease in accuracy, and it can fully meet the requirements of fast ship detection tasks. The method also has great potential for onboard ship detection. Future work will focus on fast ship detection that withstands cloud contamination, because clouds in ORSIs seriously affect detection accuracy.

References
1. Cheng, G., Han, J.: A survey on object detection in optical remote sensing images. ISPRS J. Photogramm. Remote Sens. 117, 11–28 (2016)
2. Demirel, H., Anbarjafari, G.: Satellite image resolution enhancement using complex wavelet transform. IEEE Geosci. Remote Sens. Lett. 7(1), 123–126 (2010)
3. Ji-yang, Y., Dan, H., Lu-yuan, W., et al.: A real-time on-board ship targets detection method for optical remote sensing satellite. In: 2016 IEEE 13th International Conference on Signal Processing (ICSP), pp. 204–208. IEEE (2016)
4. Leninisha, S., Hinz, S., Stilla, U.: Vehicle detection in very high resolution satellite images of city areas. IEEE Trans. Geosci. Remote Sens. 48, 2795–2806 (2010)
5. Zou, Z., Shi, Z.: Ship detection in spaceborne optical image with SVD networks. IEEE Trans. Geosci. Remote Sens. 54(10), 5832–5845 (2016)
6. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)
7. Redmon, J., Divvala, S., Girshick, R., et al.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
8. Liu, W., Anguelov, D., Erhan, D., et al.: SSD: single shot multibox detector. In: European Conference on Computer Vision, pp. 21–37 (2016)
9. Biswas, D., Su, H., Wang, C., et al.: An automatic traffic density estimation using Single Shot Detection (SSD) and MobileNet-SSD. Phys. Chem. Earth 110, 176–184 (2018)
10. Sandler, M., Howard, A., Zhu, M., et al.: MobileNetV2: inverted residuals and linear bottlenecks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)

Image Matching Using Phase Congruency and Log-Gabor Filters in the SAR Images and Visible Images

Xiaomin Liu¹,², Huaqi Zhao², Huibin Ma², and Jing Li²

¹ School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China
[email protected]
² Information and Electronic Technology Institute, Jiamusi University, Jiamusi 154002, China

Abstract. SAR and visible image matching has many applications in remote sensing, image fusion and image guidance, but it is a laborious problem because of the potential nonlinear intensity differences between the two images. This paper proposes an image matching approach that uses phase congruency (PC) to detect corners and Log-Gabor filters to obtain feature descriptors in SAR and visible images. PC can provide inherent and rich image textures for images with intricate grayscale changes or noise, and it is utilized to detect the corners. The moments of the PCs of the images are calculated to obtain the keypoints, and the Log-Gabor filters are employed to acquire the feature descriptors. Five evaluation methods are used to test the results of the algorithm on three pairs of images, and the results are compared with the SIFT algorithm. The experimental results show that the proposed algorithm is better than the SIFT algorithm.

Keywords: SAR image · Image matching · Phase congruency · Log-Gabor filters

1 Introduction

Image matching is the search for the corresponding relationships of image contents, features, structures, textures and gray intensities for the same scene, acquired under different conditions, including different fields of view, scales, times, sensors, etc. It is a precondition for applications [1–3] such as precision guidance, visual navigation, medical image analysis and image fusion. Synthetic aperture radar (SAR) is an active earth imaging system producing high-resolution images at all times and under all weather conditions [4,5]. Thus, it is useful for many applications, including remote sensing, image fusion, reconnaissance, target location, etc. A traditional image matching approach consists of four stages: (1) keypoint detection; (2) obtaining and matching keypoint descriptors;


(3) transformation function determination; and (4) image regeneration. In accordance with the above description, most multisensor image matching approaches fall into two classes: feature-based and area-based. Area-based approaches usually use a template window of a given size to obtain the keypoints between a pair of images. The template window is determined in the template image, and the corresponding window in the reference image is obtained by calculating the matching results on the basis of a certain similarity measurement, such as phase correlation [7], cross correlation [6] or mutual information [8]. The center pixel of the given window is considered the location of the feature point, by which the two images can be matched. In feature-based approaches, keypoints are first detected in both the template image and the reference image, and the corresponding keypoints are then obtained from their similarities to achieve matching. These keypoint features can be points [11], regions [9] or lines [10], which are considered keypoints with distinctiveness and stability for a fixed location, regardless of changes in the scanned scene, image geometry, nonlinear intensity, etc. Local invariant feature descriptors such as SIFT [12], SURF [13], BRISK [14] and ORB [15] have been widely employed for natural image matching because of their robustness to illumination and geometric changes [16]. Nevertheless, poor matching results have been reported for these approaches on multispectral images. SAR and visible image matching can be regarded as multispectral image analysis [17]. SAR and visible images are generated from spectral bands with large differences and usually come from two different sensors. These differences include: (1) The difference of imaging modes makes the SAR and visible images show the attributes of targets in different spectral bands. (2) The difference of imaging environments makes the image gray level and geometric distortion easily influenced by shooting time, light intensity, season, environment, etc., which makes the SAR image and visible image differ in some characteristics. The wavelet transform can be used to acquire regional frequency information for a spot in a signal [18]. In [19], many models of visual processing have been especially effective in accounting for comprehensive psychophysical observations. Originally, Daugman advocated the 2D Gabor filter family for representing the 2D receptive-field outlines of simple cells in the mammalian visual cortex, which is a proper selection for image texture [20]. Daugman then extracted the phase information of iris images to encode iris features, which advanced the accuracy of iris recognition [21]. Later, phase congruency using the Gabor filter family was advocated to find corners and edges in images [22]. The approaches of phase congruency and Gabor filters inspire us to propose an image matching method that uses phase congruency to detect keypoints and Log-Gabor filters to obtain descriptors for SAR and visible images. In this paper, the contributions comprise the following three points. (1) The moments at different scales of the phase congruency images are employed to detect keypoints for SAR and visible images, which are suitable for the feature descriptors based on Log-Gabor histograms.


Experimental results illustrate that corner detection with the moments of phase congruency greatly advances the performance of image matching. (2) A robust feature descriptor based on Log-Gabor phase histograms is proposed, which considers not the intensities of the images but their phase information. Phase histograms as feature descriptors possess more advantages for SAR and visible images than other feature descriptors. (3) Five evaluation measures are used to test the performance of the algorithm on three pairs of images. Experimental results indicate that the proposed approach has better performance than the other approaches.

Fig. 1. Overview of image matching based on phase congruency and log-gabor filters for the SAR and visible images.

2 Image Matching Based on Phase Congruency and Log-Gabor Filters for SAR and Visible Images

Figure 1 shows a flowchart of the proposed method. Firstly, Fig. 1(a) and (b) give the template and reference images, for which phase congruency (PC) is computed to acquire the PC images, Fig. 1(c) and (d), for six orientations. Then Fig. 1(e) and (f) show the minimum (m) and maximum (M) moments of the PC images, which are calculated to obtain the corner images, Fig. 1(g) and (h). Next, the complex images in the real domain, Fig. 1(k), are calculated by the Log-Gabor filters with six orientations and four scales. The phase properties with the maximum energy value are employed to generate the phase angle images at every scale, as shown in Fig. 1(l), and the image texture features are described by the phase histograms of the phase angle images, as shown in Fig. 1(m). Finally, Fig. 1(j) shows that the corresponding keypoints can be detected by the nearest neighbor distance metric and the RANSAC algorithm.

2.1 Corner Detection Based on Phase Congruency

Keypoints are key image feature points possessing significant intensity changes, for example corners, junctions and line endings, where the corresponding phase commonly possesses the maximum value over the Fourier components. The quality of the keypoints has a direct influence on the image matching, so keypoints in images with multispectral changes can be found by phase congruency. Phase congruency is a dimensionless quantity, which gives it advantages over gradient-based approaches. The regional energy function of feature detection indicates that keypoints can be detected at points with maximum phase congruency in an image. Morrone et al. proposed the phase congruency function via the Fourier series expansion of a signal at some location, determined as [23]:

$$PC(x) = \max_{\bar{\phi}(x) \in [0,2\pi]} \frac{\sum_n A_n \cos(\phi_n(x) - \bar{\phi}(x))}{\sum_n A_n} \qquad (1)$$

where $A_n$ denotes the amplitude of the $n$th Fourier component, and $\phi_n(x)$ denotes the regional phase of that Fourier component at location $x$. The value of $\bar{\phi}(x)$ is the regional phase angle given by the amplitude-weighted mean of all the Fourier coefficients at the point considered, which maximizes this equation. The phase congruency images for six orientations, Fig. 1(c) and (d), are obtained by Eq. (1). Then, in order to detect the edges or corners with relative stability, we calculate a covariance matrix whose minimum and maximum moments are computed to generate a better localized operator. The covariance matrix is obtained as follows:

$$Cov_x = PC(\theta)\cos(\theta) \qquad (2)$$

$$Cov_y = PC(\theta)\sin(\theta) \qquad (3)$$

$$a = \sum Cov_x^2 \qquad (4)$$

$$b = 2\sum Cov_x \cdot Cov_y \qquad (5)$$

$$c = \sum Cov_y^2 \qquad (6)$$

The phase congruency function computed at orientation $\theta$ is denoted $PC(\theta)$. The maximum moment $M$ and the minimum moment $m$ are calculated by:

$$M = \frac{1}{2}\left(c + a + \sqrt{b^2 + (a - c)^2}\right) \qquad (7)$$

$$m = \frac{1}{2}\left(c + a - \sqrt{b^2 + (a - c)^2}\right) \qquad (8)$$

Figure 1(e), (g) and (f), (h) show the images for the maximum and minimum moments, respectively. Once $M$ surpasses a given threshold value $M_T$, the edge image is marked, and once $m$ surpasses the other given threshold value $m_T$, the corners at the edges in the images are detected, as shown in Fig. 1(i) and (j). In our work, the frequency neighborhood information of a point in an image can be obtained by the Log-Gabor filters. Polar coordinates are used in the frequency domain for the Log-Gabor transfer function, which is defined as follows:

$$G(r, \theta, \psi, n) = \exp\left(-\frac{\left[\log\left(\frac{r}{\lambda k^n}\right)\right]^2}{2\sigma^2}\right) \cdot \exp\left(-\frac{(\theta - \psi)^2}{2\sigma^2}\right) \qquad (9)$$

In polar coordinates, $r$ represents the radius and $\theta$ denotes the angle, which determine the shapes of the filters; $n$ is the orientation angle index, $\sigma$ is the standard deviation of the filter, $\lambda$ denotes the smallest wavelength of the corresponding Gabor filter, and $k$ is the scaling parameter between filters, by which the final wavelength of the Log-Gabor filter is determined. The images filtered by Log-Gabor filters with distinct wavelengths are regarded as images possessing different spectral information.
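For illustration, a minimal NumPy sketch of one frequency-domain Log-Gabor filter following Eq. (9) is given below: a log-Gaussian radial term centered at radius λk^n multiplied by a Gaussian angular term around orientation ψ. All parameter values are assumptions; the paper uses six orientations and four scales.

```python
import numpy as np

def log_gabor(size, n, psi, lam=0.5, k=0.5, sigma=0.55, sigma_theta=np.pi / 6):
    y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2].astype(float)
    r = np.sqrt(x**2 + y**2) / (size / 2)        # normalized frequency radius
    r[size // 2, size // 2] = 1.0                # avoid log(0) at the DC sample
    theta = np.arctan2(y, x)
    f0 = lam * k**n                              # center radius lambda * k^n
    radial = np.exp(-np.log(r / f0)**2 / (2 * sigma**2))
    dtheta = np.arctan2(np.sin(theta - psi), np.cos(theta - psi))  # wrapped angle
    angular = np.exp(-dtheta**2 / (2 * sigma_theta**2))
    g = radial * angular
    g[size // 2, size // 2] = 0.0                # Log-Gabor has no DC response
    return g

# Bank of four scales x six orientations; filtering is a frequency-domain
# multiply: response = np.fft.ifft2(np.fft.fft2(img) * np.fft.ifftshift(g))
bank = [log_gabor(64, n, p * np.pi / 6) for n in range(4) for p in range(6)]
```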

2.2 Feature Descriptor Using Log-Gabor Filters and Corresponding Keypoint Detection with the RANSAC Algorithm

The Gabor filter family is constituted by Gaussian envelopes of plane waves. As shown in Fig. 1(l), we acquire twenty-four energy images over six orientations and four scales, and next the orientation with maximum energy is determined for each pixel to obtain the orientation image, Fig. 1(m). Finally, the orientation histogram, Fig. 1(n), computed from the orientation image, is used as the feature descriptor. As shown in Fig. 2, we decompose the 100 × 100 region into 16 subregions of size 25 × 25, which yields 4 × 6 × 4 × 4-dimensional vectors as the feature descriptors.


Fig. 2. Partition of the region.

In our experiments, we use a nearest-neighbor distance similarity metric to acquire corresponding keypoints. A pair of keypoints from the reference image and the model image is consistent once it satisfies the following equation:

$$D(d_i, d_j) < th \times D(d_i, d_k) \qquad (10)$$

where $D(\cdot,\cdot)$ denotes the Euclidean distance, $d_i$ is a keypoint in the reference image, and $d_j$ and $d_k$ are the first and second closest keypoints to $d_i$ in the model image. The threshold $th$ is a given distance threshold acquired empirically. Lastly, the RANSAC algorithm is utilized to remove the outliers, and the resulting image of corresponding keypoint pairs is shown in Fig. 1.
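A minimal OpenCV sketch of this matching stage is shown below: the nearest-neighbor ratio test of Eq. (10) followed by RANSAC outlier removal via a homography fit. The descriptor arrays, the threshold th = 0.8 and the homography model are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
import cv2

def match_keypoints(desc_ref, desc_mod, pts_ref, pts_mod, th=0.8):
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # For each reference descriptor, find its two closest model descriptors
    knn = matcher.knnMatch(desc_ref.astype(np.float32),
                           desc_mod.astype(np.float32), k=2)
    good = [m for m, n in knn if m.distance < th * n.distance]  # Eq. (10)
    if len(good) < 4:
        return []                       # a homography needs at least 4 pairs
    src = np.float32([pts_ref[m.queryIdx] for m in good]).reshape(-1, 1, 2)
    dst = np.float32([pts_mod[m.trainIdx] for m in good]).reshape(-1, 1, 2)
    # RANSAC estimates a homography and flags the outliers in the mask
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return [g for g, keep in zip(good, mask.ravel()) if keep]
```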

3 Experimental Results and Discussion

The proposed approach is executed and compared with SIFT on the three pairs of images shown in Fig. 3. The first group is a pair of ZY-3 and TerraSAR-X images. The second group is a pair of ZY-3 and COSMO-SkyMed images. The third group is a pair of Google Earth and TerraSAR-X images. On these images, the experimental results show strong performance in terms of the matching rate (MR), which for the whole image is defined as:

$$MR = \frac{\text{true matches}}{\text{total matches}} \qquad (11)$$


Fig. 3. Examples of the images.

Fig. 4. Matching results of the SAR and visible images.

Table 1 and Fig. 4 illustrate the matching results. Table 1 shows that the proposed approach has better performance than SIFT, and the result images are shown in Fig. 4. By experimental analysis, the proposed approach can satisfy the actual demand.

Table 1. The matching rate of different methods

  Image pairs             | Proposed approach | SIFT
  First pair of images    | 0.4231            | 0.003
  Second pair of images   | 0.2683            | 0.002
  Third pair of images    | 0.6755            | 0.009

4 Conclusion

This paper presents image matching with phase congruency and Log-Gabor filters for SAR and visible images. First, phase congruency can provide inherent and rich image features for images with noise or complex gray level changes, and it is utilized to detect the corners. The moments of the PCs of the images are calculated to detect keypoints, and Log-Gabor descriptors are used. Five evaluation methods are used to show the performance of the algorithm on three pairs of images, and the experimental results are compared with the SIFT algorithm. The experimental results illustrate that the proposed algorithm performs well. In future work, we will consider the parameter optimization of the Log-Gabor filters for SAR and visible image matching. Extensive experiments should be done on a very large image set, and comparisons with state-of-the-art image matching approaches are also of concern.


the images are calculated to detect keypoints and we use log-gabor descriptors. Five evaluation methods are used for showing the performance of the algorithm for three pairs of images and its experimental result is compared to the SIFT algorithm. The experiment results illustrate that the proposed algorithm has a perfect performance. In the future work, we will concern the parameter optimization of log-gabor filters for the SAR and visible image matching. The extensive experiment should be done with a very large number of images set and the comparisons with the state-of-the-art image matching approaches are also concerned. Acknowledgments. Heilongjiang Provincial Natural Science Foundation of China under Grant No. QC2015072, and Jiamusi University Young Innovative Talents Training Program No. 22Zq201506, Heilongjiang Prvincial Innovative Training Program for College Students No. 201610222066, Doctoral Program of Jiamusi University No. 22Zb201519, Excellent discipline team project of Jiamusi University (No. JDXKTD2019008).

References
1. Zheng, H., Li, S.Y., Shao, Y.Y., Yang, S.: Typical building of multi-sensor image feature extraction and recognition, pp. 259–272 (2017)
2. Son, J., Kim, S., Sohn, K.: A multi-vision sensor-based fast localization system with image matching for challenging outdoor environments. Expert Syst. Appl. 42(22), 8830–8839 (2015)
3. Xu, Y., Zhou, J., Zhuang, L.: Binary auto encoding feature for multi-sensor image matching. In: 2016 Fourth International Conference on Ubiquitous Positioning, Indoor Navigation and Location Based Services (UPINLBS), pp. 278–282 (2016)
4. Fan, J., Wu, Y., Wang, F., Zhang, P., Li, M.: New point matching algorithm using sparse representation of image patch feature for SAR image registration. IEEE Trans. Geosci. Remote Sens. 55(3), 1498–1510 (2017)
5. Fan, J., Wu, Y., Li, M., Liang, W., Cao, Y.: SAR and optical image registration using nonlinear diffusion and phase congruency structural descriptor. IEEE Trans. Geosci. Remote Sens. 56(9), 5368–5379 (2018)
6. Avants, B.B., Epstein, C., Grossman, M., Gee, J.: Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Med. Image Anal. 12, 26–41 (2008)
7. Yi, X., Wang, B., Fang, Y., Liu, S.: Registration of infrared and visible images based on the correlation of the edges, pp. 990–994 (2013)
8. Zhuang, Y., Gao, K., Miu, X., Han, L., Gong, X.: Infrared and visual image registration based on mutual information with a combined particle swarm optimization-Powell search algorithm. Optik - Int. J. Light Electron Opt. 127, 188–191 (2015)
9. Sinisa, T., Narendra, A.: Region-based hierarchical image matching. Int. J. Comput. Vis. 78(1), 47–66 (2008)
10. Bhat, K.K.S., Heikkilä, J.: Line matching and pose estimation for unconstrained model-to-image alignment. In: 2014 2nd International Conference on 3D Vision, vol. 1, pp. 155–162 (2014)
11. Senthilnath, J., Kalro, N.P.: Accurate point matching based on multi-objective genetic algorithm for multi-sensor satellite imagery. Appl. Math. Comput. 236(2), 546–564 (2014)
12. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
13. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: speeded-up robust features. Comput. Vis. Image Underst. 110(3), 346–359 (2007)
14. Leutenegger, S., Chli, M., Siegwart, R.Y.: BRISK: binary robust invariant scalable keypoints. In: 2011 International Conference on Computer Vision, pp. 2548–2555 (2011)
15. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571 (2011)
16. Li, Q., Wang, G., Liu, J., Chen, S.: Robust scale-invariant feature matching for remote sensing image registration. IEEE Geosci. Remote Sens. Lett. 6(2), 287–291 (2009)
17. Aguilera, C., Barrera, F., Lumbreras, F., et al.: Multispectral image feature points. Sensors 12(9), 12661–12672 (2012)
18. Kovesi, P.: Image features from phase congruency. Videre: J. Comput. Vis. Res. 1, 1–26 (1999)
19. Field, D.J.: Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A 4(12), 2379–2394 (1987)
20. Daugman, J.: Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. J. Opt. Soc. Am. A Opt. Image Sci. 2(7), 1160–1169 (1985)
21. Daugman, J.: Statistical richness of visual phase information: update on recognizing persons by iris patterns. Int. J. Comput. Vis. 45(1), 25–38 (2001)
22. Kovesi, P.: Phase congruency detects corners and edges (2003)
23. Morrone, M., Owens, R.: Feature detection from local energy. Pattern Recogn. Lett. 6(5), 303–313 (1987)

Pattern Recognition

WetlandNet: Semantic Segmentation for Remote Sensing Images of Coastal Wetlands via Improved UNet with Deconvolution

Binge Cui, Yonghui Zhang, Xinhui Li, Jing Wu, and Yan Lu

College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong Province, China
[email protected]

Abstract. In order to keep clear boundary contours for segmented objects in the semantic segmentation of remote sensing images of coastal wetlands, a deep learning semantic segmentation model called WetlandNet is proposed by improving UNet. The model takes the encoder-decoder as its basic structure, uses depthwise separable convolution instead of regular convolution to reduce the model parameters, uses deconvolution to extract boundary contour features of the objects, and connects these features to upsampled feature maps by jump connections. This paper takes the remote sensing image of the Yellow River Estuary wetland in Kenli District, Dongying City, Shandong Province, China, taken by the GF-2 satellite, as an example. The experimental results show that compared with the advanced deep learning semantic segmentation models UNet, PSPNet and DeepLabV3, the proposed model achieves more accurate segmentation results, improving OA by more than 5% and Kappa by more than 0.07, with the F1 scores of five of the six classes higher than those of the contrast models, while the number of parameters is only 1/36 of UNet, 1/42 of PSPNet and 1/51 of DeepLabV3.

Keywords: Coastal wetland · Remote sensing image · Deep learning · Semantic segmentation · UNet

1 Introduction

Coastal wetlands, located in the transition zone between land and marine systems, are one of the key protected wetland environments in China [1]. Coastal wetlands can reserve resources, regulate climate, control pollution, purify the air, and maintain biodiversity and regional ecological balance [2, 3]. In recent years, due to the influence of human activities and the natural environment, the area of coastal wetlands has been decreasing. Therefore, it is imperative to strengthen the protection of the wetland environment. However, the coastal wetland environment is complex, and it is difficult for humans to carry out field surveys, which brings great difficulties to the protection work. With the development of remote sensing imaging technology,


increasingly advanced imaging instruments have been developed, making it possible to carry out large-scale monitoring of wetland ground types. At present, much research has been carried out on coastal wetlands. Hu et al. proposed the DCNN model, a deep learning classification method that combines the spectral and texture features of the image; their experimental results show that the accuracy of the DCNN model is higher than that of the SVM method [4]. Afterwards, Hu et al. proposed a multi-objective convolutional neural network decision fusion classification method for coastal wetlands, which applies decision fusion based on fuzzy membership rules to single-objective CNN classification and obtains higher classification accuracy [1]. However, in the field of remote sensing images, although SVM and CNN can also achieve semantic segmentation directly or indirectly, they do so by pixel-by-pixel semantic labeling, whereas semantic segmentation methods in deep learning can label the whole image at one time. In 2014, Long et al. proposed the Fully Convolutional Network (FCN), which is mainly used for semantic segmentation tasks [5]. On the one hand, FCN uses convolution layers instead of the fully connected layers in a CNN, so that the model can accept image input of any size. On the other hand, FCN uses upsampling to restore the feature maps of the last convolution layer to the size of the input image, thus generating a prediction for each pixel while retaining the spatial information of the original input image. Subsequently, more and more semantic segmentation models were proposed, such as UNet [6], PSPNet [7] and DeepLabV3 [8]. In 2015, Ronneberger et al. proposed a U-shaped semantic segmentation model based on FCN and named it UNet. UNet uses upsampling to restore the size of the feature maps and uses jump connections to connect the results of the first four convolution layers with the corresponding upsampled results, which compensates for the loss of spatial information after the pooling operations [6]. UNet has achieved great success in the field of medical image segmentation. In 2017, Zhao et al. proposed PSPNet with a pyramid pooling module [7]. The pyramid pooling module uses four different sizes of pooling windows to obtain four groups of feature maps of different sizes, then uses convolution to extract features from these four groups, and finally uses upsampling to restore the four groups of feature maps to the size before pooling and concatenates them [7]. The pyramid pooling module can aggregate the context information of different regions in the image, thus improving the ability to obtain global information [7]. In the same year, Chen et al. proposed DeepLabV3, which uses atrous convolution to obtain a larger receptive field while introducing the idea of pyramid pooling [8]. In this paper, we propose a deep learning method for semantic segmentation in order to keep clear boundary contours for segmented objects in remote sensing images of coastal wetlands. Although PSPNet and DeepLabV3 can aggregate the context information of the image and improve accuracy, they cannot achieve accurate segmentation for many objects with obvious boundary contours in remote sensing images of coastal wetlands, such as the reed meadow, pond waters and Yellow River in the Yellow River Estuary wetland.
Although UNet is superior to PSPNet and DeepLabV3 in segmenting remote sensing images of coastal wetlands, it still produces large areas of segmentation error. In addition, UNet, PSPNet and DeepLabV3 have many parameters and high computational


overhead. To solve the above problems, WetlandNet is proposed by improving UNet, and experiments are carried out on an image of the Yellow River Estuary wetland taken by the GF-2 satellite. The experimental results show that compared with UNet, PSPNet and DeepLabV3, WetlandNet improves OA by more than 5% and Kappa by more than 0.07, while its parameters are only 1/36 of UNet, 1/42 of PSPNet and 1/51 of DeepLabV3. There are three main differences between WetlandNet and UNet:
1. WetlandNet uses depthwise separable convolution in place of regular convolution, which reduces the model parameters.
2. WetlandNet introduces deconvolution in the jump connections, which connects the extracted boundary contour features to the upsampled feature maps, enriching the feature information.
3. Semantic segmentation of remote sensing images of coastal wetlands is a multi-class task, so WetlandNet replaces the sigmoid function in UNet with the softmax function and calculates the final result using the argmax function.

2 Relevant Techniques

This section introduces the three key techniques used in WetlandNet: the encoder-decoder idea [9], deconvolution [10] and depthwise separable convolution [11].

2.1 Encoder-Decoder

The encoder-decoder structure is a commonly used semantic segmentation structure, adopted by both UNet and DeepLabV3. The encoder is usually a CNN model with the fully connected layers removed, such as VGG16 [12] or ResNet [13]. These classification models extract image features such as detail and texture features by alternately stacking convolution and pooling layers. In the field of semantic segmentation, the encoder should work in a similar way as in the original classification model, i.e., operate on smaller-resolution data and provide information processing and filtering [14]. The role of the decoder, instead, is to upsample the output of the encoder, fine-tuning only the details and restoring the image resolution [14].

2.2 Deconvolution

Deconvolution can be regarded as the inverse process of convolution: convolution maps multiple values into one value through a filter, whereas deconvolution maps one value to multiple values [15], as shown in Fig. 1. Deconvolution enlarges the size of the feature maps, so this paper crops the enlarged feature map to make it consistent with the size of the feature map it will be connected to. Moreover, the kernel parameters of deconvolution are obtained by training the model, which enables the model to autonomously learn information that people cannot provide through prior knowledge [10]. Many experiments show that deconvolution can extract the


boundary contour features of objects in the image very well [16], which is verified in this paper's semantic segmentation experiments on coastal wetland remote sensing images.
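A small Keras illustration of the operation follows: a trainable transposed-convolution kernel maps each input value to a k × k neighborhood of an enlarged output, and the enlarged map is then cropped back to the size of the feature map it will be connected to, as described above. The sizes used are arbitrary examples.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 64, 64, 32))                     # batch of feature maps
deconv = layers.Conv2DTranspose(filters=64, kernel_size=3,
                                strides=1, padding='valid')  # trainable kernel
y = deconv(x)                                             # (1, 66, 66, 64): enlarged
y_cropped = layers.Cropping2D(cropping=1)(y)              # crop back to 64 x 64
print(y.shape, y_cropped.shape)
```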

Fig. 1. Convolution and deconvolution operation diagram

2.3 Depthwise Separable Convolution

Many advanced network models have large numbers of parameters and high computational complexity, so they run slowly. Therefore, more and more lightweight network models have been proposed, such as MobileNet [17] and ShuffleNet [18]. In order to reduce the size and computation of the model without loss of precision, the proposed method uses depthwise separable convolution instead of the regular convolution operation [11], i.e., a spatial convolution performed independently over each channel of the input, followed by a pointwise convolution, i.e., a 1 × 1 convolution, projecting the channels output by the depthwise convolution onto a new channel space [19]. The experimental results show that the resulting model has fewer than 1/36 of the parameters of UNet while achieving higher precision.

3 The Proposed Method

In order to keep clear boundary contours for segmented objects in the semantic segmentation of remote sensing images of coastal wetlands, this paper proposes a deep learning semantic segmentation model called WetlandNet with the encoder-decoder as its basic structure. The model uses depthwise separable convolution instead of regular convolution to reduce the model parameters and uses deconvolution to extract boundary contour features of the objects.

3.1 WetlandNet Model Structure

Figure 2 shows the proposed WetlandNet model structure. WetlandNet uses an encoder-decoder structure. The encoder uses five sets of depthwise separable convolutions to extract features such as image detail and texture features, and uses two deconvolutions after each of the first three pooling layers to extract object boundary contour features. The main operation of the decoder is first to double the size of the feature maps by upsampling, then extract features with a depthwise separable convolution


with a kernel size of 2 × 2; the results are then concatenated in the channel direction with the corresponding convolution and deconvolution results from the encoder, so as to further enrich the features after upsampling. Two depthwise separable convolutions are then used to further extract features. Finally, the channel information is reconstructed by a 1 × 1 convolution, where n is the number of classes, and the final prediction results are obtained using the softmax activation function and the argmax function [20].

Fig. 2. WetlandNet model structure
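A minimal Keras sketch of one such decoder step is shown below: upsampling, a 2 × 2 depthwise separable convolution, concatenation with the matching encoder convolution and deconvolution outputs, and two further depthwise separable convolutions. Exact channel counts and layer arguments are assumptions for illustration and only loosely follow Fig. 2 and Table 1.

```python
import tensorflow as tf
from tensorflow.keras import layers

def decoder_step(x, enc_features, deconv_features, channels):
    x = layers.UpSampling2D(size=(2, 2))(x)                  # double spatial size
    x = layers.SeparableConv2D(channels, 2, padding='same',
                               activation='relu')(x)         # 2x2 DS-conv
    x = layers.Concatenate()([x, enc_features, deconv_features])  # jump connection
    # two further depthwise separable convolutions refine the merged features
    x = layers.SeparableConv2D(channels, 3, padding='same', activation='relu')(x)
    x = layers.SeparableConv2D(channels, 3, padding='same', activation='relu')(x)
    return x
```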

3.2 Specific Parameters of WetlandNet

For all the depthwise separable convolutions and deconvolutions used in this paper, the padding parameter is set to same, the stride is set to 1, and the ReLU activation function is used. In Fig. 2, taking 3 × 3-64 DS-conv as an example, 3 × 3 is the size of the convolution kernels, 64 is the number of convolution kernels, and DS-conv is a depthwise separable convolution. Taking 64 deconv as an example, deconv is a deconvolution and 64 is the number of deconvolution kernels. The specific parameters of the model are shown in Table 1. In order to prevent overfitting, this paper applies a Dropout operation after DS-conv4_2 and DS-conv5_2 with a ratio of 0.5 [21].

Table 1. Specific parameters of WetlandNet

Layer name

Encoder

DS-conv1_1 (Input) DS-conv1_2 MaxPooling1

Decoder

Kernel size 33 33 22

Kernel number 64 64 –

DS-conv2_1 DS-conv2_1 MaxPooling2

33 33 22

64 64 –

DS-conv3_1 DS-conv3_2 MaxPooling3

33 33 22

128 128 –

DS-conv4_1 DS-conv4_2 MaxPooling4 DS-conv5_1 DS-conv5_2 UpSampling6 DS-conv6 Concatenate7 DS-conv7_1 DS-conv7_2 UpSampling7 DS-conv7

3 3 2 3 3 2 2 – 3 3 2 2

      

3 3 2 3 3 2 2

   

3 3 2 2

128 128 – 256 256 – 128 – 128 128 – 128

Connect to DS-conv1_2 MaxPooling1 & Concatenate7 DS-conv2_1 & Deconvolution11_1 DS-conv2_1 MaxPooling2 DS-conv3_1 & Deconvolution12_1 DS-conv3_2 MaxPooling3 DS-conv4_1 & Deconvolution13_1 DS-conv4_2 MaxPooling4 DS-conv5_1 DS-conv5_2 UpSampling6 DS-conv6 Concatenate7 DS-conv7_1 DS-conv7_2 UpSampling7 DS-conv7 Concatenate8 (continued)

WetlandNet: Semantic Segmentation for Remote Sensing Images

287

Table 1. (continued) Module

Deconvolutions

Layer name

Kernel size

Kernel number

Connect to

Concatenate8 DS-conv8_1 DS-conv8_2 UpSampling8 DS-conv8 Concatenate9 DS-conv9_1 DS-conv9_2 UpSampling9 DS-conv9 Concatenate10 DS-conv10_1 DS-conv10_2 Deconvolution11_1 Deconvolution11_2 Deconvolution12_1 Deconvolution12_2 Deconvolution13_1 Deconvolution13_2 Conv10 Softmax Argmax

– 3 3 2 2 – 3 3 2 2 – 3 3 3 3 3 3 3 3 1 – –

– 128 128 – 64 – 64 64 – 64 – 64 64 64 64 64 64 128 128 n – –

DS-conv8_1 DS-conv8_2 UpSampling8 DS-conv8 Concatenate9 DS-conv9_1 DS-conv9_2 UpSampling9 DS-conv9 Concatenate9 DS-conv9_1 DS-conv9_2 Conv10 Deconvolution11_2 Concatenate9 Deconvolution12_2 Concatenate8 Deconvolution13_2 Concatenate7 Softmax Argmax (Output)

   

3 3 2 2

   

3 3 2 2

        

3 3 3 3 3 3 3 3 1

4 Experiments 4.1

Experimental Image

The experimental image used in this paper is the image of the Yellow River Estuary wetland in the Kenli District of Dongying City, Shandong Province, taken on August 11, 2016, and the image resolution is 0.8 m. The experimental image is shown in the right of Fig. 3, the image size is 2148  2148, and the number of channels is 4.

288

B. Cui et al.

Fig. 3. The experimental image of Yellow River Estuary wetland

4.2

Training Datasets

In this paper, six typical ground types of the Yellow River Estuary wetland were selected for experiments, including reed meadow, spartina, tamarix forest, tidal flat, waters, Yellow River. The marked images are shown in the right of Fig. 4. In this paper, nine small blocks of experimental images (yellow box in Fig. 4) are randomly selected as training datasets, accounting for about 7% of the experimental images. Train samples and labels are taken out according to the row and column coordinates of the yellow box in the experimental image and the marked image, and each small block image is cut into 64  64, and the cutting stride is 16, and 648 training samples and 648 training labels are respectively obtained. The information of training samples and test samples are shown in Table 2. The numbers in the Table 2 are the number of pixels of each class.

Fig. 4. Experiment image and market image

WetlandNet: Semantic Segmentation for Remote Sensing Images

289

Table 2. The information of training samples and testing samples Type Training samples Testing samples Reed meadow 83,040 1,521,491 Spartina 11,071 82,025 Tamarix forest 15,571 66,680 Tidal flat 122,956 1,770,351 Waters 82,277 924,396 Yellow River 19,579 248,961

4.3

Model Training

All models used in this experiment were implemented using the Keras framework. They were trained on the NVIDIA K80 for 100 epochs and used the same hyperparameters, with a learning rate of 0.0001, using Adam as the optimizer, the batch size was set to 2, and the 10 fold cross-validation was used to train, that is, 90% of the sample training, 10% of the sample verification. 4.4

Experimental Results and Analysis

This experiment was compared with UNet, PSPNet, DeepLabV3, and the final evaluation was performed using three evaluation indicators, F1 score, OA and Kappa. The F1 score, also known as the balanced F Score, is defined as the harmonic mean of the accuracy and recall rates. As shown in Eq. 1, the precision and recall are shown in Eqs. 2 and 3. F1 ¼ 2 

precision  recall precision þ recall

precision ¼ recall ¼

TP TP þ FP

TP TP þ FN

ð1Þ ð2Þ ð3Þ

Among them, TP means the sample is positive, the prediction result is positive. FP means the sample is negative, the prediction result is positive. FN means the sample is positive and the prediction result is negative. In this paper, the Kappa, OA and the F1 scores of the six ground types were calculated. The experimental results are shown in Table 3 and Fig. 5. It can be seen from Table 3 that except for the F1 score of the Yellow River, which is slightly lower than UNet, the F1 scores of the other five categories, Kappa and OA are higher than the three comparison methods, and WetlandNet is much smaller in parameters than other models.

290

B. Cui et al. Table 3. Comparison of F1, OA and Kappa Type Reed meadow Spartina Tamarix forest Tidal flat Waters Yellow River OA Kappa Parameters

DeepLabV3 0.86680 0.20393 0.38734 0.78122 0.80767 0.02334 0.74681 0.64615 43M

PSPNet 0.79145 0.01767 0.10865 0.78165 0.41308 0.30202 0.65213 0.51472 36M

UNet 0.89270 0.41274 0.57221 0.93699 0.85197 0.86459 0.88324 0.83613 31M

WetlandNet 0.98962 0.54019 0.59673 0.94168 0.93045 0.85144 0.93992 0.91414 0.8M

Fig. 5. Comparison of experimental results

It can be seen from Fig. 5 that on the boundary contours, such as the boundary of the reed meadow and the boundary of the pond water, WetlandNet has significant advantages over the other three models, mainly due to the deconvolution sensitive to the boundary contour feature. Unlike the upsampling method using interpolation [22], deconvolution is the inverse process of convolution, and its kernel parameters are obtained by model training. When deconvolution maps a value to multiple values through the deconvolution kernel, the boundary contour of the object can be recovered according to the detail features obtained by the convolution. To verify this, the deconvolution in WetlandNet is removed and the experiment is carried out again. The experimental results are shown in Fig. 6. From the area marked in Fig. 6, it can be seen that deconvolution plays a significant role in keeping boundary contour of the object.

WetlandNet: Semantic Segmentation for Remote Sensing Images

291

Fig. 6. Comparison of deconvolution effects in WetlandNet

5 Conclusion In this paper, a deep learning semantic segmentation model called WetlandNet for remote sensing images of coastal wetlands is proposed by improving UNet, which can effectively extract the object boundary contour. The model uses the depthwise separable convolution instead of regular convolution to reduce model parameters, uses deconvolution to extract the boundary contour features of the object, and connects these features to upsampled feature maps by jump connection, which enriches the feature information. The experimental results show that compared with UNet, PSPNet and DeepLabV3, WetlandNet improves OA by more than 5%, Kappa improves by more than 0.07, while the parameters are only 1/36 of UNet, 1/42 of PSPNet and 1/51 of DeepLabV3.

References 1. Hu, Y., et al.: Hyperspectral coastal wetland classification based on a multiobject convolutional neural network model and decision fusion. IEEE Geosci. Remote Sens. Lett. PP(99), 1–5 (2019) 2. Woodward, R.T., Wui, Y.S.: The economic value of wetland services: a meta-analysis. Ecol. Econ. 37(2), 257–270 (2001) 3. Yang, Y.: Main characteristics, progress and prospect of international wetland science research. Progr. Geogr. 21(2), 111–120 (2002) 4. Hu, Y., et al.: Deep learning classification of coastal wetland hyperspectral image combined spectra and texture features: a case study of Huanghe (Yellow) River Estuary wetland (2019) 5. Long, J., et al.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2014) 6. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation (2015) 7. Zhao, H., et al.: Pyramid scene parsing network (2016) 8. Chen, L.C., et al.: Rethinking atrous convolution for semantic image segmentation (2017) 9. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation (2014)


10. Krishnan, D., Tay, T., Fergus, R.: Blind deconvolution using a normalized sparsity measure. In: Proceedings of the Computer Vision and Pattern Recognition, CVPR (2011)
11. Kaiser, L., Gomez, A.N., Chollet, F.: Depthwise separable convolutions for neural machine translation (2017)
12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014)
13. He, K., et al.: Deep residual learning for image recognition (2015)
14. Paszke, A., et al.: ENet: a deep neural network architecture for real-time semantic segmentation (2016)
15. Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation. In: IEEE International Conference on Computer Vision (2015)
16. Wu, C., et al.: A compact DNN: approaching googlenet-level accuracy of classification and domain adaptation (2017)
17. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications (2017)
18. Zhang, X., et al.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices (2017)
19. Chollet, F.: Xception: deep learning with depthwise separable convolutions (2016)
20. Gould, S., et al.: On differentiating parameterized argmin and argmax problems with application to bi-level optimization (2016)
21. Srivastava, N., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
22. Kopf, J., et al.: Joint bilateral upsampling (2007)

Intelligent Multimedia Tools and Applications

Digital Multimeter Automatic Verification Device Design

Qingdan Huang, Liqiang Pei, Yuqing Chen, Rui Rao, and Huiyuan Lv

Electrical Power Test & Research Institute of Guangzhou Power Supply Bureau, Guangzhou 510000, China
[email protected]

Abstract. This device performs automatic verification of digital multimeters. A computer serves as the core of the verification process, and digital image processing is used to read the digits shown on the multimeter display. The verification data and results produced during the process are stored on the computer. The device automates the whole verification workflow of the digital multimeter, from instrument plug-in, range adjustment, and reading recognition to data storage, data processing, and result determination. The device belongs to the field of automatic verification of power meters.

Keywords: Automatic verification · Machine vision · Data processing

1 Introduction

In the power industry, the digital multimeters used by staff must be checked regularly to ensure that their accuracy meets the requirements of power operation [1–3]. The meter verification process involves cumbersome manual work [4–6]. At the same time, there are many instrument models and verification schemes, operation errors occur easily during long-term verification work, and verification efficiency is low [7, 8]. In summary, designing a digital multimeter verification device that reduces the workload of workers and achieves high verification accuracy has very important practical value [9].

2 Technical Solutions

2.1 The Hardware Structure of the Device

As shown in Fig. 1, the digital multimeter automatic verification system in this design consists of a computer, a display, a controller, a monocular camera, a plug-in device, a range adjustment device, an instrument transport device, a standard source, a mechanical device, and an optical platform.


Fig. 1. Digital multimeter automatic verification system composition diagram

A multimeter verification program installed on the computer controls the operation of the whole system, including the mechanical device, instrument wiring, range adjustment, instrument transport, standard source adjustment, display reading recognition, data storage, data processing, and result determination.

2.2 Device Program and Its Composition

This unit is controlled by a verification program installed on the computer. The verification software provides an operation interface through which staff can adjust the relevant parameters [10]. Data produced while the device is working are stored in a database; staff can print the verification data and results directly into a verification report and can read the database. In the operation interface, verification schemes for different types of digital multimeters can be modified, added, or deleted according to the actual verification requirements. As shown in Fig. 2, the verification program consists of database operations, monocular camera control, image processing, standard source control, and mechanical structure control.

Fig. 2. Verification program composition diagram


The database operation program handles the storage, reading, and modification of the verification data, verification plans, and verification results produced while the device is working. The monocular camera control program collects images of the display screen of the meter under verification and transmits them to the computer through a network cable. The image processing program processes the images acquired by the monocular camera, uses image processing technology to obtain the displayed reading of the instrument, writes the obtained data into the database, and performs the related data processing to obtain the verification result. The standard source control program inputs the corresponding voltage, current, and resistance values to the meter under verification according to the program. The mechanical structure control program controls actuators such as the electric cylinders and motors of the device to complete the wiring, range adjustment, and transport operations of the instrument.

3 Specific Implementation

3.1 Automatic Verification Device Workflow

The mechanical structure of the digital multimeter automatic verification device is shown in Fig. 3.

Fig. 3. The mechanical structure of the digital multimeter automatic verification device 1. Monocular camera detection device 2. Verification platform 3. Instrument to be inspected 4. Instrument clamping device 5. Plug-in device 6. Terminal block 7. Range adjustment device 8. Instrument display screen to be inspected 9. Conveyor belt


The specific workflow is as follows (a simplified control-loop sketch follows the list):

(1) Place the instruments on the conveyor belt as required.
(2) The program activates the instrument transport device and starts the conveyor belt; the belt stops when the instrument to be inspected arrives, and the instrument is transported to the verification platform.
(3) The camera detection device collects an image of the instrument, identifies its model, and calls the corresponding verification scheme.
(4) The instrument to be inspected is fixed in place.
(5) The wire insertion device is started.
(6) The range adjustment device is started to adjust the meter range.
(7) The terminals of the plug-in device are energized, and the program outputs a specific verification value. For each output value, the monocular camera collects an image of the instrument display screen, and the displayed reading is obtained after image processing.
(8) The previous step is repeated; the computer stores the verification data obtained during the process and performs arithmetic processing on the data to determine whether the meter is qualified.
(9) The range adjustment device is started to turn off the instrument and then returns to its initial position.
(10) The wire insertion device is activated; the terminals are pulled out of the instrument and the device returns to its initial position.
(11) The fixture on the verification platform is released, the checked instrument is sent back to the conveyor belt, the belt is activated, and the instrument is transported away.
(12) The instrument arrives at the designated area, completing one verification cycle.
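A simplified control-loop sketch of this workflow is shown below; every device interface (conveyor, camera, plugger, knob, source, db) is a hypothetical placeholder, since the paper does not specify its software interfaces:

```python
# Sketch of one verification cycle; the method names are illustrative only.
def verify_one_meter(conveyor, camera, plugger, knob, source, db):
    conveyor.move_to_platform()                 # steps (1)-(2)
    model = camera.identify_model()             # step (3)
    scheme = db.load_scheme(model)
    plugger.insert()                            # steps (4)-(5)
    readings = []
    for point in scheme.test_points:            # steps (6)-(8)
        knob.set_range(point.range)             # adjust meter range
        source.output(point.value)              # standard source output
        readings.append(camera.read_display())  # image-based reading
    verdict = scheme.judge(readings)            # pass/fail decision
    db.save(model, readings, verdict)
    knob.reset()                                # step (9)
    plugger.retract()                           # step (10)
    conveyor.return_meter()                     # steps (11)-(12)
    return verdict
```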

4 Experiments and Results Analysis

4.1 Automatic Handling Multimeter Function Test and Analysis

Table 1 shows the test results of the mechanical actions and related hardware involved in transporting the multimeter from the assembly line to the verification platform, fixing it, and transporting it from the verification platform back to the conveyor. During the fixing test, some of the five multimeters were not completely fixed; after adjusting the parameters of the electric actuator, the fixing and loosening of the multimeter met the design requirements.


Table 1. Automatic handling multimeter test results

No. | Test item                                                               | Test times | Test result
1   | Power-on reset                                                          | 30         | Qualified
2   | Jaw electric cylinder open, clamp                                       | 30         | Qualified
3   | Conveyor start and stop                                                 | 30         | Qualified
4   | Photogate 1 detection signal                                            | 30         | Qualified
5   | Photogate 1 detection signal                                            | 30         | Qualified
6   | Out of the pole, close the rod                                          | 30         | Qualified
7   | Gripper cylinder lifts and falls (conveyor surface - highest position)  | 30         | Qualified
8   | Gripper cylinder lifts and falls (sliding table actuator surface - highest position) | 30 | Qualified
9   | Slide electric actuator slides out and retracts (no load)               | 30         | Qualified after adjustment
10  | Slide electric actuator slides out and retracts (full load)             | 30         | Qualified after adjustment
11  | Tray leveling                                                           | 50         | Qualified
12  | Multimeter attitude adjustment                                          | 50         | Qualified
13  | Multimeter fixed, loosened                                              | 50         | Qualified after adjustment

4.2 Automatic Plug-In Function Test and Analysis

In the automatic plug-in function test, a power-on reset test is first performed for each actuator, and then the actuators involved in the plug-in function module are tested separately. After each set of mechanical action tests passes, an overall functional test of the plug-in module is performed. The test results of the automatic plug-in function are shown in Table 2.

Table 2. Automatic cable test results

No. | Test item                                            | Testing frequency | Testing result
1   | Power on reset                                       | 30                | Qualified
2   | VC890C+ wiring hole pin right and left alignment     | 30                | Qualified
3   | VC890C+ wiring hole pin before and after alignment   | 30                | Qualified
4   | VC890C+ insert and remove the pin template           | 30                | Qualified
5   | FLUKE15B wiring hole pin right and left alignment    | 15                | Qualified
6   | FLUKE15B wiring hole pin before and after alignment  | 15                | Qualified
7   | FLUKE15B insert and pull out the pin template        | 15                | Qualified
8   | FLUKE17B wiring hole pin right and left alignment    | 15                | Qualified
9   | FLUKE17B wiring hole pin before and after alignment  | 15                | Qualified
10  | FLUKE17B insert and pull out the pin template        | 15                | Qualified
11  | CD771 wiring hole pin right and left alignment       | 15                | Qualified
12  | CD771 wiring hole pin before and after alignment     | 15                | Qualified
13  | CD771 insert and remove the pin template             | 15                | Qualified
14  | KEW1012 wiring hole pin right and left alignment     | 15                | Qualified
15  | KEW1012 wiring hole pin before and after alignment   | 15                | Qualified
16  | KEW1012 insert and remove the pin template           | 15                | Qualified

As can be seen from Table 2, a problem occurred during the alignment test of the CD771 multimeter wiring pins and wiring holes; accurate alignment for the CD771 was achieved by adjusting the parameters of the left-right alignment electric actuator.

4.3 Automatic Range Adjustment Function Test and Analysis

Before the automatic range adjustment function test, the multimeter to be verified is fixed at its actual verification position on the verification platform. In the test, a power-on reset test is first performed for the mechanical actuators of the automatic range adjustment device, then each mechanical action involved in range adjustment is tested in a single step, and finally the whole range adjustment device is tested. The test results are shown in Table 3; the automatic range adjustment function meets the design requirements.

Table 3. Range adjustment function test

No. | Test item                        | Test times | Test result
1   | Power on reset                   | 10         | Qualified
2   | VC890C+ range knob alignment     | 15         | Qualified
3   | VC890C+ range knob rotation      | 15         | Qualified
4   | FLUKE15B range knob alignment    | 15         | Qualified
5   | FLUKE15B range knob rotation     | 15         | Qualified
6   | FLUKE15B AC and DC button press  | 15         | Qualified
7   | FLUKE17B range knob alignment    | 15         | Qualified
8   | FLUKE17B range knob rotation     | 15         | Qualified
9   | FLUKE17B AC and DC button press  | 15         | Qualified
10  | CD771 range knob alignment       | 15         | Qualified
11  | CD771 range knob rotation        | 15         | Qualified
12  | CD771 AC and DC button press     | 15         | Qualified
13  | KEW1012 range knob alignment     | 15         | Qualified
14  | KEW1012 range knob rotation      | 15         | Qualified
15  | KEW1012 AC and DC button press   | 15         | Qualified

5 Conclusion

The digital multimeter verification platform designed in this work meets the design requirements: its functions are normal and the equipment runs smoothly. During online operation, the intelligent storage system can be docked with the digital multimeter verification platform under the scheduling of the dispatching computer, so that two verification platforms share a single conveyor belt for verification.

References

1. Instrument Network: Power monitoring instruments and meters will welcome the market blue ocean: how enterprises can seize the highlands [EB/OL] (2017). http://shupeidian.bjx.com.cn/html/20170630/834343.shtml
2. Zhu, Z.: Technical analysis of automatic calibration of electrical measurement instruments. Enterp. Technol. Dev. 27–28 (2015)
3. Zhang, Y.: Research on the status quo of instrument verification system. Private Technol. 33–34 (2010)
4. Wang, Z., Yang, J., Yang, M.: Method and implementation of automatic verification technology for electric measuring instruments. Electr. Power Meas. 82–85 (2011)
5. Wang, Y.: Automatic calibration device for pointer instrument based on DSP core. Dev. Innov. Electromech. Prod. 124–126 (2014)
6. Li, Q., Fang, Y., He, Y.: Automatic reading system based on automatic alignment control for pointer meter. In: IECON 2014 - 40th Annual Conference of the IEEE Industrial Electronics Society, pp. 3414–3418 (2014)
7. Li, Y., Teng, Y., Cao, G.: Research on intelligent automatic detection pipeline system of single-phase electric meter. Power Electr. Eng. 88–89 (2014)
8. Qin, H.: Design and implementation of electric energy meter verification pipeline control system, pp. 8–9. Master's thesis, Hangzhou University of Electronic Science and Technology, Hangzhou (2017)
9. Yang, L.: Design of automatic calibration system for electric energy meter, pp. 20–22. Master's thesis, Tianjin University, Tianjin (2015)
10. Yue, X.F., Min, Z., Zhou, X.D., et al.: The research on auto-recognition method for analogy measuring instruments. In: International Conference on Computer, Mechatronics, Control and Electronic Engineering, pp. 207–210 (2010)

A Method to Prevent Cascading Trip in Power Network Based on Nodal Power

Hui-Qiong Deng, Peng-Peng Wu, Xing-Ying Lin, Qin-Bin Lin, and Chao-Gang Li

School of Information Science and Engineering, Fujian University of Technology, Fujian 350118, China
[email protected], [email protected]
Fujian Provincial University Engineering Research Center of Smart Grid Simulation Analysis and Integrated Control, Fujian 350118, China

Abstract. Aiming at cascading trips in power grids, this paper proposes a prevention method that is based on the action of relay protection and takes the generator outputs as the optimization variables. Firstly, the critical state of the cascading trip is discussed, and a safety index for cascading trips is derived from an analysis of cascading trip behaviour and relay protection action. Based on this index, an optimization model for cascading trip prevention is given that comprehensively considers the various constraints of the power grid. A complete algorithm flow for solving the model with particle swarm optimization is presented. Finally, an example on the IEEE 39-bus system is analyzed with a MATLAB program, which verifies the rationality of the proposed method.

Keywords: Power grid optimization · Cascading trip · Relay protection · Particle swarm

1 Introduction

Grid cascading trips are usually the early manifestation of complex cascading failures. Although cascading trips rarely occur in today's power grids, once they occur and cause complex cascading failures, the consequences are often very serious; many major blackouts worldwide over the past 50 or 60 years have proved this. For a long time, cascading trips and cascading failures of power grids have attracted the attention of many researchers, because the public pays close attention to preventing blackout accidents. Ref. [1] specifically analyzed the dominant modes and characteristics of cascading failures, summarized the research status of cascading failure models and control, and proposed directions for improving cascading failure research. In recent years, besides deepening the understanding of the formation mechanism of cascading failures, the accurate simulation and prediction of cascading failures, and their propagation patterns in complex networks, researchers have been constantly putting forward new research perspectives, which provide many


useful inspirations for further research. For example, in Ref. [2], according to the cyber-physical characteristics of power system networks, a dynamic optimization routing strategy based on SDN is proposed to reduce the risk of cascading failures; in Ref. [3], according to the characteristics of the cascading failure process, sequential pattern mining is introduced into the study of cascading failures in power systems to identify cascading failure modes and key lines; in Ref. [4], considering system brittleness, the composite brittleness degree of association of a branch is defined from the power flow state and network structure, and an improved Floyd algorithm is proposed to identify the propagation path of cascading faults; in Ref. [5], based on stochastic power flow and value-at-risk theory, a high-risk cascading failure chain model is proposed to analyze the risk of cascading failure events in systems with large-scale wind power. From the point of view of preventing cascading failures, it is obvious that the earlier the development of a cascading failure is blocked, the more beneficial it is to the safety of the grid; that is to say, the grid should avoid the occurrence of secondary failures as far as possible. Whatever the specific form of a cascading trip event, relay protection is always involved, so the action of protection should not be ignored in the study of cascading trips. Based on the consideration of current-type backup protection and the action of relay protection, this paper proposes a preventive model to avoid cascading trips as much as possible and to improve the safety of the power grid. Furthermore, taking the IEEE 39-bus system as an example, a solution algorithm is given, and the proposed method is analyzed and verified.

2 An Index to Measure the Safety of the Power Grid When a Cascading Trip Occurs

After the initial fault occurs and is removed, whether a cascading trip occurs depends mainly on whether the electrical quantities of the remaining branches enter the action area of the backup protection after the power flow is redistributed. If the backup protection of each branch is current-type protection, cascading trips after power flow transfer depend mainly on the specific current and backup protection configuration of each branch. Assume that at a certain time the branch $L_{ij}$ between node i and node j suffers an initial fault. When $L_{ij}$ is cut off and the power flow redistributes, whether any branch $L_{st}$ in the remaining part of the grid will undergo a cascading trip can be measured by formula (1):

$$I_{st}^{dist} = |I_{st}^{set}| - |I_{st}| \qquad (1)$$

In formula (1), $I_{st}$ is the current of branch $L_{st}$ after power flow redistribution, $I_{st}^{set}$ is the current setting value of the backup protection on branch $L_{st}$, and $I_{st}^{dist}$ measures the electrical distance between them.


According to formula (1) and the actual behaviour of grid cascading trips, when $I_{st}^{dist} > 0$ branch $L_{st}$ will not undergo a cascading trip, and when $I_{st}^{dist} \le 0$ branch $L_{st}$ will be removed by its backup protection, i.e. a cascading trip occurs on $L_{st}$. Assuming that, apart from the initial fault branch, there are l remaining branches in the power grid, the index shown in formula (2) can be obtained from formula (1) [6]:

$$m = \min\left(I_{st}^{dist}\right) \qquad (2)$$

According to formula (2), when m = 0 at least one of the remaining branches is at the boundary of a cascading trip, that is, the grid is in the boundary state of a cascading trip. This boundary state refers to the operation state S1 of the power grid after the redistribution of power flow, described by the operation parameters I of each branch after the initial fault occurs. Since these branch currents are determined by the injected nodal power, if the injected nodal power remains unchanged before and after the redistribution of the power flow, the boundary state is actually determined by the injected nodal power before the initial fault occurs, i.e. by the pre-fault state S0 of the grid. From this point of view, S0 can also be called the critical state of the cascading trip. According to the above analysis, m can be used as an index to measure the safety of the power grid. In operation, if the generator outputs can be optimized and adjusted so that, while satisfying the load demand, the m value in formula (2) is greater than zero and as large as possible, cascading trips can be effectively prevented. Writing this in mathematical form gives the objective function shown in formula (3):

$$F = \max(m) \qquad (3)$$
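A minimal sketch of formulas (1)-(2), assuming the post-fault branch currents have already been obtained from a power flow calculation (the function and array names are illustrative):

```python
# Safety index m: the smallest electrical distance between each branch
# current and its backup-protection setting after power flow redistribution.
import numpy as np

def safety_index(i_branch, i_set):
    """i_branch: post-fault current of each remaining branch;
    i_set: backup-protection current setting of each branch."""
    i_dist = np.abs(i_set) - np.abs(i_branch)   # formula (1)
    return i_dist.min()                          # formula (2)

# m > 0: no branch enters the backup-protection zone; m <= 0: cascading trip.
m = safety_index(np.array([1.8, 2.4, 0.9]), np.array([2.5, 2.5, 2.5]))
```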

3 Optimization Model for Preventing Grid Cascading Trip

For the objective function in formula (3), the variable to be optimized is the output of the generator units. A complete mathematical model is formed by adding to the objective function all the electrical constraints the power grid must satisfy. The electrical constraints include equality constraints and inequality constraints. The equality constraints mainly comprise the power flow constraints before and after the initial fault occurs. The power flow constraints before the initial fault can be abbreviated as formula (4):

$$h_0(x) = 0 \qquad (4)$$

In formula (4), $h_0$ is the mapping relation of the power flow before the initial fault occurs, and x is the grid state variable used when solving the power flow. The power flow constraints after the initial fault occurs can be abbreviated as formula (5):

$$h_1(x) = 0 \qquad (5)$$

In formula (5), $h_1$ is the mapping relation of the power flow after the initial fault occurs. In this optimization problem, the inequality constraints that need to be considered are mainly those shown in formula (6):

$$\begin{cases} P_{Gi,\min} \le P_{Gi} \le P_{Gi,\max}, & i = 1,\dots,N_1 \\ Q_{Gi,\min} \le Q_{Gi} \le Q_{Gi,\max}, & i = 1,\dots,N_1 \\ P_m \le P_{m,\max}, & m = 1,2,\dots,l \\ U_{k,\min} \le U_k \le U_{k,\max}, & k = 1,2,\dots,N_2 \end{cases} \qquad (6)$$

In formula (6), $P_{Gi}$ and $Q_{Gi}$ are the active and reactive power of generator i; $P_{Gi,\min}$ and $P_{Gi,\max}$ are the lower and upper limits of the active output of generator i; $Q_{Gi,\min}$ and $Q_{Gi,\max}$ are the lower and upper limits of its reactive power; $N_1$ is the total number of generators in the grid; $P_m$ is the active power transmitted by branch $L_m$ and $P_{m,\max}$ its limit; $U_k$ is the voltage of node k, with allowed lower and upper limits $U_{k,\min}$ and $U_{k,\max}$; $N_2$ is the total number of grid nodes. Formula (6) can be written in the abbreviated form of formula (7):

$$g_0(x) \le 0 \qquad (7)$$

In addition, according to the above analysis, m in formula (2) also needs to satisfy the constraint m > 0. Since m is also a function of the state variable x, it is written here as f(x), and the corresponding inequality constraint takes the form shown in formula (8):

$$f(x) > 0 \qquad (8)$$

Combining formulas (3)–(8) yields the complete model for optimizing the unit outputs of the grid and improving its safety against cascading trips, shown in formula (9):

$$\begin{aligned} F &= \max(m) \\ \text{s.t.}\quad h_0(x) &= 0 \\ h_1(x) &= 0 \\ g_0(x) &\le 0 \\ f(x) &> 0 \end{aligned} \qquad (9)$$


4 Solution of Optimization Model

The optimization model given by formula (9) is solved with particle swarm optimization, mainly because this algorithm is easy to program and implement. For the basic particle swarm algorithm, the iterative calculation follows formula (10):

$$\begin{cases} v_i^{k+1} = w v_i^k + c_1 r_1 \left(P_{best,i} - x_i^k\right) + c_2 r_2 \left(g_{best} - x_i^k\right) \\ x_i^{k+1} = x_i^k + v_i^{k+1} \end{cases} \qquad (10)$$

In formula (10), $x_i^k$ is the position of particle i at the k-th iteration; $v_i^k$ is its velocity at the k-th iteration, which must satisfy $v_{\min} \le v_i^k \le v_{\max}$; $P_{best,i}$ is the best solution experienced by particle i itself; $g_{best}$ is the best solution experienced by the entire swarm; w is the inertia coefficient, whose value generally decreases linearly from 0.9 to 0.1; $c_1$ and $c_2$ are acceleration constants, generally taken as 2; $r_1$ and $r_2$ are uniformly distributed random numbers in the interval [0, 1]. For the optimization problem in formula (9), the model is first simplified: (a) the power flow constraints in formula (9) are handled by the power flow calculation itself; during the iterations, if the power flow constraint is not satisfied, a new particle is generated. (b) To facilitate processing with particle swarm optimization, the objective function of formula (9) is changed to the penalty function form shown in formula (11):

$$F' = F + \sum_k \frac{1}{a_k}\left[\min\left(0, g_0(x)\right)\right]^2 + \frac{1}{c}\left[f(x)\right]^2 \qquad (11)$$

In formula (11), $a_k$ and c are penalty factors and k indexes the inequalities contained in formula (6). Using the penalty function given by formula (11) together with the particle swarm algorithm, the specific algorithm flow is as follows (a simplified sketch of the iteration follows the list):

Step 1: Initialize the particle swarm;
Step 2: Solve the initial fitness of each particle according to the fitness function of the model and select the optimal fitness value;
Step 3: Update $P_{best}$, $g_{best}$, and the particle positions and velocities according to the optimal fitness value;
Step 4: Calculate the next position and velocity of each particle using the iterative formula of the particle swarm;
Step 5: Check whether the grid power flow converges;
Step 6: Stop the iteration when the maximum number of iterations is reached.
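A minimal sketch of the particle swarm iteration of formula (10) under this flow; the fitness argument stands for the penalized objective F' of formula (11) evaluated through a power flow calculation, which is omitted here as an assumption:

```python
# Basic PSO for a maximization problem, following formula (10):
# c1 = c2 = 2, inertia decreasing linearly from 0.9 to 0.1, velocity clipped.
import numpy as np

def pso(fitness, dim, n_particles=30, iters=300, vmax=0.5):
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, (n_particles, dim))       # particle positions
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_f.argmax()].copy()           # best of the whole swarm
    for k in range(iters):
        w = 0.9 - 0.8 * k / iters                    # inertia: 0.9 -> 0.1
        r1, r2 = rng.random((2, n_particles, 1))
        v = np.clip(w * v + 2 * r1 * (pbest - x) + 2 * r2 * (gbest - x),
                    -vmax, vmax)                      # velocity update, formula (10)
        x = x + v                                     # position update
        f = np.array([fitness(p) for p in x])        # penalized objective (11)
        better = f > pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[pbest_f.argmax()].copy()
    return gbest, pbest_f.max()
```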


5 Example

An example is analyzed on the IEEE 39-bus system using the algorithm above (Fig. 1).

Fig. 1. Diagram of the example system

It can be seen from the figure that branches 37–46 are connected to generator units; lines connected to generator units are not selected as initial fault lines in this paper. The initial fault is therefore selected among branches 1–36, and the generator output of the balance node does not change. Assume the initial fault is on the 25th branch and the generator outputs have not yet been adjusted when the fault occurs. The safety index m of the system at this time is shown in Table 1.

Table 1. Generator unit output distribution plan before failure

m      | Node 30 | Node 32 | Node 33 | Node 34 | Node 35 | Node 36 | Node 37 | Node 38 | Node 39
0.9570 | 250 MW  | 650 MW  | 632 MW  | 508 MW  | 650 MW  | 560 MW  | 540 MW  | 830 MW  | 1000 MW

As shown in the table, when the initial fault is on the 25th branch the safety index m of the system is only 0.9570, less than 1, and the whole system is in an unsafe state. If the system is subjected to another external shock, the probability of a cascading trip is quite high.


To further prevent the occurrence of cascading trips and increase the value of m, the system should adjust the outputs of the generator units in time so that the safety index is as large as possible. The safety index m after adjusting the generator outputs is shown in Fig. 2. It can be seen from Fig. 2 that the m value stabilizes at more than twice its initial value when the PSO has iterated about 250 times.

Fig. 2. Safety index m after adjusting the output of generator units

At this point, the optimal outputs of the generator units are shown in Table 2.

Table 2. Generator unit output distribution plan after failure

m      | Node 30 | Node 32 | Node 33 | Node 34 | Node 35 | Node 36 | Node 37 | Node 38 | Node 39
2.1337 | 472 MW  | 442 MW  | 423 MW  | 300 MW  | 442 MW  | 352 MW  | 332 MW  | 622 MW  | 2247 MW

According to the analysis of the above example, changing the output distribution of the generator units increases the safety index of the system from 0.9570 to 2.1337, thus achieving the purpose of preventing cascading trips.

6 Conclusion

The cascading trip of a power grid is closely related to its operation state before the initial fault occurs. In this paper, a model and calculation method for the system safety index m are given from the point of view of optimizing the unit output variables. The example analysis shows that adjusting the unit outputs can effectively improve the safety level of the system against cascading trips after a fault occurs. The proposed safety index and the corresponding calculation method can provide a reference for the study of power grid operation.

Acknowledgment. This research was financially supported by the Doctoral Research Foundation of Fujian University of Technology under grant GY-Z13104, and the Scientific Research and Development Foundation of Fujian University of Technology under grant GY-Z17149.

References

1. Qian, Y., Chen, Z., Yang, W., Cao, Y., Wang, X.: Review of cascading failure in power system. Shandong Electric Power 45(253), 1–9 (2018)
2. Han, Y., He, Y., Lou, F., Wang, Y., Guo, C.: Analysis and application of SDN based dynamic optimal route strategy for cyber layer in cascading failures of cyber-physical power system. Power Syst. Technol. 42(8), 2620–2629 (2018)
3. Liu, Y., Huang, S., Mei, S., Zhang, X.: Analysis on pattern of power system cascading failure based on sequential pattern mining. Autom. Electr. Power Syst. 43(6), 34–46 (2019)
4. Zhu, T., Ding, J., Tian, S., Zhu, B.: Cascading failure forecast based on compound vulnerability relevance and improved Floyd algorithm. J. Electr. Power Sci. Technol. 33(4), 58–65 (2018)
5. Xu, D., Wang, H.: High risk cascading outage assessment in power systems with large-scale wind power based on stochastic power flow and value at risk. Power Syst. Technol. 43(2), 400–409 (2019)
6. Deng, H.Q.: Research on the safety margin of power network considering cascading tripping. J. Fujian Univ. Technol. 14(03), 255–261 (2016)

Insulation Faults Diagnosis of Power Transformer by Decision Tree with Fuzzy Logic

Cheng-Kuo Chang, Jie Shan, Kuo-Chi Chang, and Jeng-Shyang Pan

School of Information Science and Engineering, Fujian University of Technology, Fuzhou, Fujian, China
[email protected]

Abstract. A three-ratio method for power transformer fault diagnosis based on a decision tree with fuzzy logic is proposed in this study. The two major problems in applying the IEC ratio method to transformer fault diagnosis are missing codes and overly crisp ratio ranges. The decision tree algorithm is used to solve the missing-code problem, while fuzzy logic deals with the crisp ratio ranges by replacing them with membership functions. Simulation analysis of experimental data shows that, compared with the traditional IEC ratio method, the new method has higher diagnostic accuracy and is more convenient and operable.

Keywords: Dissolved gas analysis · Power transformer · Decision tree · Fuzzy logic · Fault diagnosis

1 Introduction

As one of the main monitoring means, analysis of the composition and content of dissolved gases (dissolved gas analysis, DGA) is one of the most effective measures to monitor the safe operation of oil-immersed transformers [1]. Gas chromatography of dissolved gas in oil has been used to monitor the safe operation of oil-immersed transformers for more than thirty years, and DGA has been widely used in industry as one of the most effective means to monitor oil-immersed transformer faults. Among traditional DGA methods, the three-ratio method is widely accepted because of its simple operation and high reliability [2]. However, long-term statistics from actual work and much practice show that the traditional three-ratio method has two major problems. First, internal faults of oil-immersed transformers are very complex: the codes were obtained by statistical analysis of common accident combinations, and in practice fault combinations often appear whose codes are not included in the table. Second, the boundaries of the coding ranges in the three-ratio method are too crisp, which leads to errors in judging the fault type. For these two problems, a decision tree algorithm [3] and fuzzy logic [4–7] are proposed respectively, and the two methods are then combined to form a new comprehensive ratio method.


The new ratio method proposed in this paper introduces the decision tree algorithm to remove the coding combinations of the traditional ratio method and directly classifies the ratio values, which not only removes the coding but also changes the original two-step diagnosis into a one-step diagnosis. The method takes the entropy value in the decision tree as the judgment condition: it calculates the information entropy of each ratio to determine the leaf nodes and finds the simplest data classification, finally dividing the disordered fault data into nine kinds of common faults. The problem that the ratio range classification (i.e. the coding ranges of the three-ratio method) is too crisp in the decision tree is then solved by fuzzy logic. Based on fuzzy membership functions attached to the decision tree, the classification of a ratio is determined by calculating the membership values, and the fault types of the oil-immersed transformer are identified according to the classification rules of the decision tree. This comprehensive new ratio method can overcome the limitations of the traditional ratio method to a certain extent and has higher diagnostic accuracy.

2 The Basic Principle of Three Ratios Method

The ratio method is based on the interdependent relationship between the fault temperature and the relative concentrations of the gases produced by the pyrolysis of the oil and insulating paper in an oil-immersed transformer under fault conditions. From five characteristic gases, three pairwise ratios are formed: $C_2H_2/C_2H_4$, $CH_4/H_2$ and $C_2H_4/C_2H_6$. The fault type of the transformer is determined from the codes of these ratios. This method eliminates the volume effect of the oil and has been one of the main methods of diagnosing transformer faults for many years. The fault coding rules and the fault type judgment rules of the three-ratio method are described in [8].

2.1 Definition of Decision Tree

A decision tree [9] is a kind of tree structure. Each leaf node represents a category, each non-leaf node represents a test on a feature attribute of the data, and each branch represents the output of the feature attribute over a range. The decision-making process starts from the root node: the corresponding feature attribute of the item to be classified is tested, the output branch is selected according to its value until a leaf node is reached, and the category stored at the leaf node is taken as the decision result. The key to decision tree construction lies in the selection of split points, which directly affects the classification performance of the tree. The best split point is selected by quantifying purity; the specific measure used in this paper is information entropy (information gain is not used because this paper only revises the classification rules). Suppose the records are divided into m categories; the proportion of each category is given by Eq. (1).

$$P(i) = \frac{\text{number of records in category } i}{\text{total number of records}} \qquad (1)$$

Entropy is an important measure of purity, shown in Eq. (2):

$$\text{Entropy} = -\sum_{i=1}^{n} P(i)\log_2 P(i) \qquad (2)$$

The formula corresponds to the purity of the classification: a larger value means less pure and a smaller value means purer. The process of decision tree construction is as follows (a small computational sketch follows the list):

(1) List all feature attributes.
(2) Calculate the information entropy of all feature attributes and find the best split attribute point (i.e. the first attribute point).
(3) After recursively processing all feature attributes, recalculate the unselected feature attributes and select the second attribute for the split.

The recursion terminates when all feature attributes are used up or all faults are classified.
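A small sketch of Eqs. (1)-(2) applied to attribute selection (illustrative, not the authors' code):

```python
# Class proportions (Eq. 1) and information entropy (Eq. 2); a lower
# entropy over an attribute's branches means a purer split.
import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()                 # Eq. (1): class proportions
    return -(p * np.log2(p)).sum()            # Eq. (2)

# Example: the C2H2/C2H4 codes in Table 2 split the nine fault types
# into groups of 5, 2 and 2 rows.
print(entropy([0]*5 + [1]*2 + [2]*2))         # ~1.435, matching E(x1)
```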

2.2 Establishment of Decision Tree

Step 1: Determine the characteristic attributes according to the coding rules of the IEC ratio method in Table 1 [10]. Three feature attributes can be extracted: the characteristic gas ratios $C_2H_2/C_2H_4$, $CH_4/H_2$ and $C_2H_4/C_2H_6$.

Table 1. Coding rules of IEC ratio method

Ratio range | C2H2/C2H4 | CH4/H2 | C2H4/C2H6
<0.1        | 0         | 1      | 0
0.1–1       | 1         | 0      | 0
1–3         | 1         | 2      | 1
>3          | 2         | 2      | 2

Table 2. IEC ratio method for determining fault properties

Fault type                           | C2H2/C2H4 | CH4/H2  | C2H4/C2H6
Ultra-low temperature superheating   | 0         | 0       | 1
Thermal fault of low temperature     | 0         | 2       | 0
Thermal fault of medium temperature  | 0         | 2       | 1
Thermal fault of high temperature    | 0         | 0, 1, 2 | 2
Partial discharges                   | 0         | 1       | 0
Discharges of low energy density     | 2         | 0, 1    | 0, 1, 2
Low energy discharge and overheating | 2         | 2       | 0, 1, 2
Discharges of high energy density    | 1         | 0, 1    | 0, 1, 2
Arc discharge and overheating        | 1         | 2       | 0, 1, 2


Step 2: Calculate the information entropy of the three characteristic attributes of the IEC ratio method, which judges fault properties according to Table 2. For example, $C_2H_2/C_2H_4$ is divided into three categories by the codes 0, 1 and 2, containing 5, 2 and 2 rows respectively. The entropies are calculated as $E(x_1) = 1.435$, $E(x_2) = 2.0588$ and $E(x_3) = 1.8366$, where $x_1 = C_2H_2/C_2H_4$, $x_2 = CH_4/H_2$ and $x_3 = C_2H_4/C_2H_6$. According to these results, $C_2H_2/C_2H_4$ is selected as the first characteristic attribute point (i.e. the root node).

Step 3: Continue to classify the coding rules of the three-ratio method by feature attribute. At this point the data coding of two of the branches only takes the values 0, 1, 2, indicating that the fault type there is only related to the characteristic attribute $CH_4/H_2$. Five fault types and two characteristic attributes remain. High temperature overheating is only related to the attribute $C_2H_4/C_2H_6$, so that classification is identified directly. The entropies are calculated as $E(x_5) = 1.5$ and $E(x_6) = 1$, where $x_5 = CH_4/H_2$ and $x_6 = C_2H_4/C_2H_6$; therefore the $C_2H_4/C_2H_6$ attribute is taken as the next node. The final transformer fault type decision tree is shown in Fig. 1.

Fig. 1. Transformer fault types by decision tree.

3 The Ratio Range Classification Improved

3.1 Definition of Fuzzy Logic

The membership grade breaks the absolute membership relation of classical sets, expresses membership with a relative degree, and leads to the concept of the fuzzy set, which defines the degree to which an element belongs to a set [11]. Fuzzy logic addresses the problem of fuzzy boundaries between things. In daily life and work, many things, including people's ways of thinking, have a certain fuzziness. For example, the adaptability of a new control method could be evaluated as strong, relatively strong, effective, not so effective, or ineffective. Such problems exhibit objectively fuzzy phenomena whose concepts are difficult to describe with classical mathematics; this limitation gave rise to fuzzy mathematics.


The application of fuzzy logic under the decision tree in this task proceeds as follows: (1) list the ratio boundaries of all feature attributes and select appropriate fuzzy intervals; (2) establish appropriate membership functions for the fuzzy intervals under each characteristic attribute; (3) give the classification flow chart of the fuzzy logic and verify it with data.

3.2 Solution of Fuzzy Logic Under Decision Tree

Selection of Fuzzy Interval Under Decision Tree. The traditional three-ratio method divides the gas ratio range into intervals in order to obtain the corresponding ratio range codes; for a given ratio, the code is fixed and unique. The three-ratio method was verified on seventy-two groups of data, of which fifty-five groups were diagnosed accurately, an accuracy rate of 76.39%, which indicates that the coding is basically reasonable. In the gas interval division of the three-ratio method, the dividing points are 0.1, 1.0 and 3.0, obtained by mathematical statistics over a large number of oil-immersed transformer fault examples and verified by inspection and analysis. Figure 2 takes $C_2H_2/C_2H_4$ as an example. Nevertheless, it is too rigid to classify by a single cut-off point of the coding range. In fact, the growth rate of the ratio near a cut-off point is very low; for example, around the cut-off point 0.1, actual diagnoses show that the interval from 0.02 to 0.18 does not strictly belong to code 0 or code 1, as shown in Fig. 3.

Fig. 2. Encoding range of C2H2/C2H4

Fig. 3. Coding range of uncertainty interval

Because of the uncertainty of the coding in this region, the traditional three-ratio method may make wrong judgments. Fuzzy logic is therefore used to establish membership functions between the regions, so that data falling between them can be classified more flexibly. According to many years of verification results on transformers with abnormal insulating oil chromatography and references [12], the reasonable fuzzy ranges of the characteristic attributes under the decision tree are obtained by blurring the boundary "0.1" to "0.08–0.12", the boundary "1.0" to "0.9–1.1", and the boundary "3" to "2.9–3.1".

Establishment of Membership Function (Taking the First Characteristic Attribute Point $C_2H_2/C_2H_4$ as an Example). The first fuzzy interval of the feature attribute is shown in Fig. 3 and comprises the two intervals "0.02–0.18" and "2.9–3.1". For the branches 0, 1 and 2 of this attribute point, the membership degrees are defined as $\mu_0(x)$, $\mu_1(x)$ and $\mu_2(x)$, where $\mu_1(x)$ comprises $\mu_{11}(x)$ and $\mu_{12}(x)$.


Descending semi-normal, ridge, and ascending semi-normal shapes are adopted for fitting. According to statistical knowledge and the fuzzy logic method, the parameters were adjusted repeatedly. The final membership functions are $\mu_0(x) = e^{-5(x-0.02)^2}$ and $\mu_{11}(x) = 0.5 + 0.5\sin[25\pi(x-0.1)]$, where $x \in (0.02, 0.18)$. When comparing the membership values, $\mu_i = \max\{\mu_0, \mu_1\}$ with i equal to 0 or 1; the data are classified to the corresponding attribute point, i.e. 0: $C_2H_4/C_2H_6 > 3$ and 1: $CH_4/H_2$. Then $\mu_{12}(x) = 0.5 - 0.5\sin[5\pi(x-3)]$ and $\mu_2(x) = 1 - e^{-15(x-2.9)^2}$, where $x \in (2.9, 3.1)$; comparing $\mu_k = \max\{\mu_1, \mu_2\}$ with k equal to 1 or 2 classifies the data to the corresponding attribute point, i.e. 1: $CH_4/H_2$ and 2: $CH_4/H_2$. Using the same method, the classification of the data at each feature attribute point can be obtained. The data processing flow at a single feature attribute point is shown in Fig. 4. The above is the whole process of using fuzzy logic to deal with the overly absolute classification in the decision tree.
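A sketch of the resulting fuzzy classification of the root attribute C2H2/C2H4, using the membership functions and crisp code ranges as recovered above (the parameter values are reconstructed from the text and should be treated as such):

```python
# Classify the ratio C2H2/C2H4 into code 0, 1 or 2; inside the two fuzzy
# bands the code is decided by comparing membership values.
import math

def classify_c2h2_c2h4(x):
    if 0.02 < x < 0.18:                      # fuzzy band around the point 0.1
        mu0 = math.exp(-5 * (x - 0.02) ** 2)
        mu11 = 0.5 + 0.5 * math.sin(25 * math.pi * (x - 0.1))
        return 0 if mu0 > mu11 else 1
    if 2.9 < x < 3.1:                        # fuzzy band around the point 3
        mu12 = 0.5 - 0.5 * math.sin(5 * math.pi * (x - 3))
        mu2 = 1 - math.exp(-15 * (x - 2.9) ** 2)
        return 1 if mu12 > mu2 else 2
    return 0 if x < 0.1 else (1 if x <= 3 else 2)  # crisp regions

print(classify_c2h2_c2h4(0.111))             # 0, as in the case study below
```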

4 Comprehensive Fault Diagnosis Results

4.1 Case Verification

Consider an existing dissolved gas sample from the transformer oil of a certain plant [13]: $H_2 = 18.3$, $CH_4 = 23$, $C_2H_6 = 10.7$, $C_2H_4 = 164$, $C_2H_2 = 18.2$. The actual fault type was high temperature overheating.

Fig. 4. The ratio C2H2/C2H4 processing flow chart


According to the new ratio method of this study, the fault diagnosis steps for this case are as follows. For the first feature attribute point, $C_2H_2/C_2H_4 = 0.111 \in (0.08, 0.12)$, $\mu_0(0.111) = 0.96$ and $\mu_{11} = 0.85$, so $\mu_0 > \mu_{11}$, which selects the branch $C_2H_4/C_2H_6 > 3$. For the second feature attribute point, $C_2H_4/C_2H_6 = 15.33 \notin (0.02, 0.18) \cup (0.9, 1.1) \cup (2.9, 3.1)$ and $15.33 > 3$, so the case is directly classified as high temperature overheating. The calculated result agrees with the actual fault of the transformer.
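The case arithmetic can be reproduced directly (a check, not the authors' code):

```python
# Gas concentrations from the case sample; the two ratios used by the tree.
h2, ch4, c2h6, c2h4, c2h2 = 18.3, 23, 10.7, 164, 18.2
print(c2h2 / c2h4)   # ~0.111 -> fuzzy band around 0.1, mu0 wins, branch 0
print(c2h4 / c2h6)   # ~15.33 -> > 3, high temperature overheating
```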

4.2 Result Discussions

In this paper, seventy-two groups of dissolved gas samples from transformer oil were used in the simulation experiment, covering nine fault types. The maximum number of test samples per fault type was thirteen and the minimum was five, ensuring the diversity of sample types and the accuracy of the experimental results. The final test results are shown in Table 3. Over the seventy-two groups of data, the accuracy of the IEC ratio method is 76.4%, while the accuracy of the new ratio method is 88.9%, an overall increase of 12.5%. However, for "ultra-low temperature overheating" and "medium-temperature overheating" the accuracy of the new method is not improved, which indicates that the method needs further refinement.

Table 3. Comparison of diagnostic results between IEC ratio method and the new method

Fault type                           | Test sample size | IEC ratio accuracy % | New method accuracy %
Ultra-low temperature superheating   | 7                | 57.1                 | 57.1
Thermal fault of low temperature     | 5                | 60                   | 80
Thermal fault of medium temperature  | 8                | 75                   | 75
Thermal fault of high temperature    | 6                | 100                  | 100
Discharges of low energy density     | 12               | 91.7                 | 100
Discharges of low energy density     | 6                | 100                  | 100
Low energy discharge and overheating | 8                | 25                   | 75
Discharges of high energy density    | 7                | 100                  | 100
Arc discharge and overheating        | 13               | 76.9                 | 100
Average                              | 72 (total)       | 76.4                 | 88.9

5 Conclusions

In this study, a new ratio method based on a decision tree with fuzzy logic is proposed to diagnose insulation faults of power transformers. The missing-code defect of the three-ratio method is solved by the decision tree, and the defect of overly crisp coding boundaries is solved by fuzzy logic. Compared with other methods, such as the Duval triangle, the Duval pentagon, kernel probability clustering, and the extreme learning machine improved by particle swarm optimization, the method proposed in this study is more operable and easier to understand. The example diagnosis results also show the reliability and effectiveness of the method, which provides a new way to solve the problem. Finally, actual calculation and comparison demonstrate the superiority of the new ratio method, whose diagnostic accuracy is greatly improved.

References

1. GB/T 7252-2001: Guide for analysis and determination of dissolved gas in transformer oil
2. Yang, Z.: Analysis of dissolved gas in transformer oil and discussion of judging transformer fault by guidance. Transformer 45(10), 24–26 (2008)
3. Safavian, S.R., Landgrebe, D.: A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 21(3), 660–674 (1991)
4. Fuzzy Mathematics and Its Application. Wuhan University Press (2007)
5. Chen, S.Y., Chang, Y.C., Pan, J.S.: Fuzzy rules interpolation for sparse fuzzy rule-based systems based on interval type-2 Gaussian fuzzy sets and genetic algorithms. IEEE Trans. Fuzzy Syst. 21(3), 412–425 (2013)
6. Chen, S.M., Ko, Y.K., Chang, Y.C., Pan, J.S.: Weighted fuzzy interpolative reasoning based on weighted increment transformations and weighted ratio transformation techniques. IEEE Trans. Fuzzy Syst. 17(6), 1412–1427 (2009)
7. Chen, S.M., Wang, N.Y., Pan, J.S.: Forecasting enrollments using automatic clustering techniques and fuzzy logical relationships. Expert Syst. Appl. 36(8), 11070–11076 (2009)
8. State Economic and Trade Commission of the People's Republic of China: DL/T 722-2000 Guidelines for dissolved gases and determination in transformer oil. China Electric Power Press, Beijing (2001)
9. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
10. Zhou, J., Huang, X., Zou, J., et al.: Improved fuzzy algorithm for fault diagnosis of dissolved gas in transformer oil. Guangdong Electr. Power 3, 86–91 (2015)
11. Liang, J., Qu, Y.: Intelligent Control Technology. Harbin Institute of Technology Press (2016)
12. Li, L., Wan, Z.: Study on insulation fault diagnosis of power transformer based on fuzzy three ratio method. Zhejiang Electr. Power 30(2), 12–14 (2011)
13. Zhang, W., Yuan, J.: Transformer three ratio fault diagnosis method improved by B-spline theory. Chin. J. Electr. Eng. 34(24), 4129–4135 (2014)

Smart City and Grids

Statistics Algorithm and Application of People Flow Detection Based on Wi-Fi Probe

Wenbin Zheng, Xiao Liu, Peng Li, Li Lin, and Hancong Wang

School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, China
[email protected]
National Demonstration Center for Experimental Electronic Information and Electrical Technology Education, Fujian University of Technology, Fuzhou 350118, China

Abstract. In the open laboratory management of a university, attendance data are inaccurate, and statistics are easily lost when traditional devices are used. Wi-Fi probes for people flow detection are often used for safety warnings in public places, commercial promotion, laboratory attendance, and so on. However, the existing technologies used for laboratory attendance only identify the presence of a target and neglect the fact that, when the coverage range of the Wi-Fi signal exceeds the attendance range, the attendance data obtained by the Wi-Fi probe will be inaccurate. So far, Wi-Fi probes have not been applied to managing the people flow of open laboratories. In this paper, we propose an algorithm based on the Wi-Fi probe that improves the quality of people flow detection in the open laboratory. We use RSSI positioning theory to distinguish people inside and outside the laboratory and obtain people flow statistics. The experimental results demonstrate that our system not only achieves higher accuracy than existing RFID attendance systems, but also provides a ranking of the time people spent in the laboratory, the number of people, and the total time all people spent in the lab over a given period. Managers and users can log in to the system to view the statistical data. The system improves the utilization of the laboratory and contributes to the development of open laboratory management practices.

Keywords: Wi-Fi probe · People flow detection · Laboratory management

1 Introduction

The university laboratory is an important place for experimental teaching, scientific research, technological development, and academic exchange. In recent years, in order to give full play to the advantages of laboratory resources and cultivate high-quality innovative talents, opening up laboratories has become a major trend. Compared with the traditional laboratory management mode, the management requirements of an open laboratory are much higher, so the management mode is much more complex and a series of supporting systems is required, which increases the human resource input. It


is important for the management and implementation of an open laboratory to establish a system platform that can provide real-time feedback on laboratory usage. Traditional laboratories usually use manual registration to record the information of people who use the laboratory. Manual registration is time-consuming and laborious, and the data are not convenient for further statistical analysis. In recent years, more and more attendance monitoring approaches have appeared. Wang et al. proposed using RFID to realize laboratory access attendance [1]; Lin proposed a fast algorithm applicable to a face recognition laboratory attendance system [2]; Wang proposed an attendance system based on fingerprint identification [3]. All of these methods require the person being checked to complete the check actively and deliberately; people often forget, and students often only check in and do not sign out, resulting in inaccurate or lost data and the inability to obtain real-time information on who is in the laboratory. These methods only record basic data; further statistical analysis requires considerable effort to export, summarize, and classify the data, which is inefficient. In 2011, Wi-Fi probe technology was first proposed by POEDING CYRIAC et al. to detect the presence of devices by capturing packets when a connection is established [4]. In 2016, Gao et al. proposed collecting the MAC address and SSID information of terminals in a Wi-Fi network through distributed collection, and summarizing and analyzing the data to provide early warning of large crowds [5]. In 2017, Ren et al. designed a public safety management system based on the Wi-Fi probe with a big data framework [6]. In 2018, Li et al. proposed a Wi-Fi probe-based automatic early-warning system for abnormal traffic flow, which determines whether the current traffic flow is abnormal based on the analysis of historical data and realizes timely early warning [7]. Zhu et al. proposed using Wi-Fi probe technology for people flow statistics in commercial promotion [8]. Tang et al. realized multi-mode intelligent check-in by combining Wi-Fi probe technology with network scanning and interactive modes, solving the network congestion caused by insufficient IP allocation of hotspots under large-scale access [9]. Shi proposed using Wi-Fi to realize routine laboratory course attendance [10]. The methods of [9, 10] can identify whether a target is present, but do not consider that when the actual attendance range is smaller than the coverage range of the Wi-Fi signal, the attendance result will be wrong; nor do they use the statistics to manage the laboratory. Therefore, in view of these problems and the characteristics of open laboratories in universities, this paper innovatively applies the Wi-Fi probe to crowd statistics in open laboratories, and uses RSSI positioning for data judgment to achieve accurate people flow statistics. The people flow data of the laboratory are classified and displayed on the basis of attendance, providing effective reference data for laboratory management and investment decisions. The following sections elaborate on the Wi-Fi probe working principle, the system architecture, the people flow statistics and judgment algorithm, and the test results and data analysis.


2 Wi-Fi Probe Working Principle

Wi-Fi is a wireless LAN technology based on IEEE 802.11 [11]. A Wi-Fi wireless network contains access points and stations: the access points provide network services, and the stations connect to access points and access the network through them. Interactions between access points and stations are encapsulated as communication frames according to the IEEE 802.11 standard [12], which defines three main types: management frames, control frames and data frames. The data frame carries upper-layer data between stations in its frame body [10] and contains the MAC address of the station's own wireless network card. Stations send out data frames whenever they transmit network data over Wi-Fi. A Wi-Fi probe works in sniffing mode: it captures all data frames within its receiving range and then extracts the MAC address from each frame, so that it can recognize the stations. Mobile phones and other stations have unique MAC addresses, and each can stand for a person. Therefore, Wi-Fi probes can obtain MAC addresses and realize people flow statistics.
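To make the frame-capture step concrete, here is a minimal sketch using the third-party scapy library and a monitor-mode interface named mon0. Both the library choice and the interface name are our assumptions (the paper does not name its capture tool), and whether RadioTap exposes the RSSI field depends on the wireless driver.

```python
# A sketch of the sniffing step described above (assumes scapy is installed
# and the Wi-Fi adapter is already in monitor mode on interface "mon0").
from scapy.all import sniff
from scapy.layers.dot11 import Dot11, RadioTap

def handle(pkt):
    # type == 2 selects 802.11 data frames; addr2 is the transmitter MAC.
    if pkt.haslayer(Dot11) and pkt[Dot11].type == 2:
        mac = pkt[Dot11].addr2
        # dBm_AntSignal is the received signal strength; availability
        # depends on the driver's RadioTap header.
        rssi = getattr(pkt[RadioTap], "dBm_AntSignal", None) if pkt.haslayer(RadioTap) else None
        print(mac, rssi)

sniff(iface="mon0", prn=handle, store=False)
```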

3 System Structure

The system mainly consists of Wi-Fi probes and a server; its architecture is shown in Fig. 1. When a Wi-Fi probe works in sniffing mode, it sniffs the wireless communication packets, parses out the MAC address and RSSI of each wireless terminal, and saves these results. At regular intervals it switches to station mode, connects to a wireless router, and accesses the Internet through the router. The Wi-Fi probe submits the saved probe results to the server through the POST method of the HTTP protocol, then switches back to sniffing mode and continues probing for devices.

Fig. 1. The system structure


The system uses a B/S architecture: users can access the server website, log in, and view or modify information through a browser on a computer or mobile phone. After receiving the data uploaded by the Wi-Fi probes, the server analyzes them using the user binding information in the database and saves the resulting people flow data back to the database. When a user requests the flow statistics, the server runs preset programs to filter and analyze the flow data in the database and returns the analysis results.

4 People Flow Statistics and Judgment Algorithm

The detection data collected by the Wi-Fi probes must be processed before they can be used for people flow statistics. To facilitate this processing, a producer-consumer model separates the probe data receiving and processing programs, as sketched below. The Wi-Fi probe uploads the probe data to the server's API interface, and the receiving program saves the probe data to the database. A single data processing program runs in the background, periodically fetching and deleting the probe data from the database.
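A minimal sketch of this producer-consumer separation follows. The real system uses a database table as the buffer between the HTTP receiving program and the single background processor; an in-process queue is used here only to illustrate the decoupling, and all names are ours.

```python
import queue
import threading
import time

probe_queue = queue.Queue()   # stand-in for the probe-data table

def receive(record):
    # Producer: called by the HTTP API for each POSTed probe record;
    # it only saves the record and returns immediately.
    probe_queue.put(record)

def process_loop():
    # Single background consumer, running periodically like the paper's
    # processor, which fetches and then deletes the rows it has read.
    while True:
        batch = []
        while not probe_queue.empty():
            batch.append(probe_queue.get())
        if batch:
            print(f"processing {len(batch)} probe records")
        time.sleep(10)

threading.Thread(target=process_loop, daemon=True).start()
```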

4.1 Determination of the Range

Laboratories are usually adjacent to one another, with a corridor outside. A Wi-Fi probe can cover a radius of more than 15 m without obstacles, which exceeds one laboratory area and may cover the corridor outside the laboratory or adjacent laboratories. Therefore, what a Wi-Fi probe detects is not necessarily a valid target. In order to count the personnel using a particular laboratory accurately, the personnel in the target laboratory must be distinguished from those outside. The communication packets the Wi-Fi probe captures between stations and the AP contain the MAC address and the RSSI, the received signal strength value. The RSSI reflects the distance between the station and the probe: Wi-Fi signal strength decreases gradually with distance and declines further when the signal encounters obstacles. The transmission loss is therefore computed from the RSSI at the receiving node and converted into a distance according to theoretical and empirical models. The most commonly used propagation path loss model is the shadowing model:

P(d) = P(d_0) + 10n · lg(d/d_0) + X_0    (1)

where d_0 represents the reference distance, generally set to 1 m; d is the actual distance; P(d) and P(d_0) are the path loss values at propagation distances d and d_0; n is the path loss index, which depends on the environment; and X_0 represents interference in the transmission process. A laboratory is usually a closed room, and the Wi-Fi signal decays rapidly after penetrating a wall. This paper therefore corrects the measured data based on the


shadowing model, setting an optimal threshold to determine whether a device is in the current laboratory. To determine this threshold, a Wi-Fi probe was placed 2 m away from the wall, as shown in Fig. 2. Table 1 records ten RSSI measurements of a wireless device at test points 1 and 2, both close to the wall.

Fig. 2. Device position diagram

Table 1. RSSI test data (dBm)

Test point 1: −47, −47, −49, −48, −49, −48, −48, −48, −48, −47
Test point 2: −68, −69, −69, −69, −69, −69, −69, −70, −67, −68

According to the test data, when the Wi-Fi probe is 2 m away from the wireless device, the detected RSSI is about −48 dBm, and about −69 dBm after the signal penetrates the wall. When the detected RSSI of a device is greater than −62 dBm, the device is judged to be in the laboratory where the Wi-Fi probe is located; otherwise, the detection record is ignored. In practical tests with −62 dBm as the threshold, a Wi-Fi probe can cover a range of more than 15 m under barrier-free conditions, which is enough to cover the whole laboratory.
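The following sketch illustrates this judgment rule. P0 and N are placeholder calibration values (our assumption, not figures from the paper); only the −62 dBm threshold and the Table 1 readings come from the text.

```python
# Assumed calibration constants: P0 is the path loss reference at d0 = 1 m
# and N the path loss index; both would be measured on site.
P0, N = -40.0, 2.5

def estimated_distance(rssi_dbm):
    # Invert Eq. 1, ignoring the noise term X0:
    # d = d0 * 10 ** ((P(d) - P(d0)) / (10 * n))
    return 10 ** ((P0 - rssi_dbm) / (10 * N))

def inside_laboratory(rssi_dbm, threshold_dbm=-62):
    # Per Table 1: about -48 dBm at 2 m line-of-sight, about -69 dBm
    # through the wall, so -62 dBm separates inside from outside.
    return rssi_dbm > threshold_dbm

print(inside_laboratory(-48), inside_laboratory(-69))  # -> True False
```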

4.2 Determine the Presence of Equipment

There are 14 channels in the 2.4 GHz Wi-Fi band, and only channels 1–13 are approved in China [13], so only the wireless communication data on channels 1–13 need to be captured. Since a device does not continuously send data frames after connecting to the wireless router, and the probe can only monitor one channel at a time, dwelling 250 ms on each channel, not all devices within the


detection range can be detected in every scan. In practice, about 80% of the equipment present is detected in the first round of scanning, and 95% or more after the second round. To ensure that equipment is not judged absent merely because it went undetected for a short time, a device is considered to have remained in the laboratory whenever the gap between two consecutive detections of the same device is less than 5 min; a sketch of this smoothing step follows.
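A minimal sketch of the 5-minute smoothing rule, with names of our own choosing:

```python
from datetime import timedelta

def merge_sightings(times, gap=timedelta(minutes=5)):
    """times: sorted datetimes at which one MAC was detected.
    Returns a list of (start, end) presence intervals: two sightings
    separated by less than `gap` count as continuous presence."""
    intervals = []
    for t in times:
        if intervals and t - intervals[-1][1] < gap:
            intervals[-1][1] = t          # extend the current interval
        else:
            intervals.append([t, t])      # start a new interval
    return [tuple(iv) for iv in intervals]
```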

Fig. 3. Data processing flow chart

4.3 Population Statistics Processing (Data Processing)

Among those who come to the lab, some do not actually use it, for example people looking for classmates or borrowing tools. They stay in the laboratory only for a while and contribute little to the statistics. Therefore, records whose single stay in the laboratory is shorter than 20 min are not saved to the database. At the same time, considering that people who use the laboratory may leave it briefly for various reasons, after a stay of 20 min, a device whose


absence between two consecutive detections is shorter than 20 min is considered to have been using the laboratory the whole time. This approach also reduces the fragmentation of the statistics. The specific process is shown in Fig. 3. Finally, each people flow record is saved to the database with four elements: user ID, lab ID, start time and end time; a sketch of these rules follows.
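A minimal sketch of the two Sect. 4.3 rules (bridge absences under 20 min, drop stays under 20 min), operating on the presence intervals produced by the smoothing step in Sect. 4.2; all names are ours:

```python
from datetime import timedelta

def build_records(user_id, lab_id, intervals, limit=timedelta(minutes=20)):
    """intervals: (start, end) pairs from merge_sightings().
    Returns the four-element records saved to the database."""
    merged = []
    for start, end in intervals:
        if merged and start - merged[-1][1] < limit:
            merged[-1][1] = end            # brief absence: still using the lab
        else:
            merged.append([start, end])
    # keep only stays of at least 20 min
    return [(user_id, lab_id, s, e) for s, e in merged if e - s >= limit]
```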

5 Test Results and Data Analysis

5.1 Establishing Test Environment

After implementing the detection data processing, the system was tested in the innovation practice and design laboratory of our university. The laboratory is 18 m long and 8 m wide, and the Wi-Fi probes are arranged as shown in Fig. 4. From the collected statistics we randomly selected one day's data and compared it with the RFID laboratory attendance system currently used in our school.

Fig. 4. Wi-Fi probe position diagram

5.2 Comparison Test Data

To test the advantage of the Wi-Fi probe, we randomly selected one day's data from the RFID laboratory attendance system currently used in our school, and one day's data from our Wi-Fi probe people flow system, and compared the two groups of data, as shown in Table 2. From the comparison we can see:

(1) The number of RFID attendance records is far smaller than the number of records in the Wi-Fi probe system. This indicates that students coming to the laboratory rarely take the initiative to swipe their cards, so the data recorded by the RFID system do not reflect reality.

Table 2. Innovative practice and design office's stream of people statistics

Wi-Fi probe attendance records
Students       Sign-in time          Sign-out time
Xinyu Chen     2019-5-12 20:36:27    2019-5-13 01:26:20
Jinzhao Yu     2019-5-12 20:36:36    2019-5-13 01:21:52
Peng Li        2019-5-12 20:47:41    2019-5-13 01:15:44
Xin Pan        2019-5-12 21:34:15    2019-5-13 01:31:14
Yifan Wang     2019-5-12 22:32:22    2019-5-13 01:47:16
Qiang Gao      2019-5-13 09:18:36    2019-5-13 10:43:12
Xinyu Chen     2019-5-13 10:03:47    2019-5-13 10:41:25
Zhenjian He    2019-5-13 13:06:40    2019-5-13 14:24:48
Peng Li        2019-5-13 13:31:01    2019-5-13 18:23:36
Yifan Wang     2019-5-13 14:03:30    2019-5-13 15:06:49
Qiang Gao      2019-5-13 14:46:20    2019-5-13 17:23:14
Peng Gao       2019-5-13 15:21:20    2019-5-13 17:13:22
Xin Pan        2019-5-13 15:24:55    2019-5-13 18:21:40
Xinyu Chen     2019-5-13 15:45:40    2019-5-13 18:18:02
Fenglin Li     2019-5-13 15:45:55    2019-5-13 17:02:24
Ruiqiang Li    2019-5-13 16:19:21    2019-5-13 17:11:58
Jinzhao Yu     2019-5-13 16:27:26    2019-5-13 18:23:24
Zhenjian He    2019-5-13 17:48:55    2019-5-13 18:24:24
Peng Li        2019-5-13 20:12:16    2019-5-13 20:33:13
Jingwei Liu    2019-5-13 21:21:46    2019-5-13 23:08:51
Yifan Wang     2019-5-13 21:21:55    2019-5-13 23:26:27
Xinyu Chen     2019-5-13 21:22:04    2019-5-13 22:43:50
Peng Li        2019-5-13 21:22:08    2019-5-14 00:10:44
Jinzhao Yu     2019-5-13 22:28:18    2019-5-13 23:21:35

RFID attendance records (student swipe times): 2019-5-13 09:11:33, 2019-5-13 13:04:48, 2019-5-13 17:47:20, 2019-5-13 18:23:24

(2) In the RFID system the sign-in/sign-out mechanism is not clear: each time a student swipes the card, it is unclear whether this is a sign-in or a sign-out, so the system cannot provide accurate data on the time students spend in the lab. The Wi-Fi probe system, in contrast, clearly records both sign-in and sign-out times.

(3) The RFID attendance record of "Qiang Gao" is consistent with that of "Qiang Gao" in the Wi-Fi probe system, which supports the feasibility of the people flow statistics system.

To test the performance of the positioning algorithm, we designed an experiment and invited five students into the corridor outside the laboratory and into the laboratory next door. The statistical results with and without the positioning algorithm are shown in Table 3. As Table 3 shows, when the positioning algorithm is not added, all targets covered by the Wi-Fi probes are


treated as statistical objects. Screening the data, the students with no entry in the "with positioning algorithm" columns of Table 3 (marked in green in the original table) are those in the corridor and the laboratory next door, and they were excluded after the positioning algorithm was added. This is consistent with our expectation, and shows that the RSSI-based positioning algorithm is effective and makes the statistics more accurate.

Table 3. Comparison of statistical data before and after adding the positioning algorithm

              Without positioning algorithm         With positioning algorithm
Students      Sign-in time      Sign-out time       Sign-in time      Sign-out time
Chao Li       2019-5-27 9:35    2019-5-27 12:09     2019-5-27 9:35    2019-5-27 12:09
Peng Li       2019-5-27 13:21   2019-5-27 14:02     2019-5-27 13:21   2019-5-27 14:02
Peng Li       2019-5-27 14:34   2019-5-27 17:03     2019-5-27 14:34   2019-5-27 17:03
QianGao       2019-5-27 15:27   2019-5-27 16:03     2019-5-27 15:27   2019-5-27 16:03
Ruiqiang      2019-5-27 15:28   2019-5-27 16:00     —                 —
Ming Li       2019-5-27 15:30   2019-5-27 15:55     2019-5-27 15:30   2019-5-27 15:55
Jian Li       2019-5-27 15:31   2019-5-27 15:57     —                 —
Chao Li       2019-5-27 15:37   2019-5-27 16:07     —                 —
Xiao He       2019-5-27 15:38   2019-5-27 16:03     2019-5-27 15:38   2019-5-27 16:03
ZhenJin       2019-5-27 15:39   2019-5-27 16:04     2019-5-27 15:39   2019-5-27 16:04
FengLin       2019-5-27 15:39   2019-5-27 16:02     —                 —
JianFeng      2019-5-27 15:40   2019-5-27 16:08     2019-5-27 15:40   2019-5-27 16:08
Xin Yu        2019-5-27 15:40   2019-5-27 16:19     —                 —
MingLu        2019-5-27 15:41   2019-5-27 16:53     2019-5-27 15:41   2019-5-27 16:53

5.3 Analysis of Data

The people flow data collected by the Wi-Fi probes can not only provide accurate attendance information but also, through further analysis and processing, yield the information that managers and users want: for example, a ranking of the hours each person spent in the lab, the number of people in the lab during the last 48 h, and the total time of all people


in the lab over the last 15 days. In the future, new analysis procedures can be added according to actual requirements.

Table 4. The ranking of total time

Ranking   Students          Total time
1         Peng Li           98 h 3 min
2         Xinyu Chen        76 h 9 min
3         Yifan Huang       73 h 13 min
4         Qiang Gao         64 h 29 min
5         Jinzhao Zhang     56 h 16 min
6         Fenglin Liao      53 h 54 min
7         Zhenjian Cheng    30 h 33 min
8         Xin Pan           28 h 11 min
9         Hongmin Tang      17 h 16 min
10        Peng Gao          14 h 30 min

(1) Time ranking. The ranking program computes the start and end of the query window from the selected time range option. It then fetches from the database all records (with their user IDs) whose end time is later than the window start. Traversing these records, if a record's start time is earlier than the preset window start it is clamped to the window start; the end time minus the start time then gives the single duration. Using the user ID as the index, the user ID and single durations are accumulated in a two-dimensional array, which is finally sorted by cumulative time to obtain the ranking result (Table 4).

(2) The number of people in the lab in the last 48 h. The program first computes the starting time to analyze, then fetches all records whose end time is later than that starting time, together with the user IDs. The last 48 h are divided into 24 two-hour periods, and a two-dimensional array of 24 elements stores the temporary data of the statistical process: element 0 represents the first 2 h after the starting time, element 1 the interval from 2 h to 4 h, and so on. Traversing the records, an entry keyed by user ID is set in the array element of each period the record covers. Finally, the number of entries in each element of the two-dimensional array gives the number of people in the corresponding period (Fig. 5). A sketch of both computations follows.
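The following sketch shows both computations on the four-element records of Sect. 4.3. A set per slot stands in for the paper's two-dimensional array keyed by user ID, and all names are ours:

```python
from collections import defaultdict
from datetime import timedelta

def rank_total_time(records, window_start):
    """records: (user_id, lab_id, start, end) tuples with end > window_start."""
    totals = defaultdict(timedelta)          # user ID -> accumulated time
    for user_id, _, start, end in records:
        start = max(start, window_start)     # clamp to the window, as above
        totals[user_id] += end - start
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

def occupancy_by_slot(records, window_start):
    """24 two-hour slots covering the last 48 h; returns people per slot."""
    slots = [set() for _ in range(24)]
    for user_id, _, start, end in records:
        for i in range(24):
            slot_start = window_start + timedelta(hours=2 * i)
            slot_end = slot_start + timedelta(hours=2)
            if start < slot_end and end > slot_start:
                slots[i].add(user_id)        # mark the user present in this slot
    return [len(s) for s in slots]
```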


Fig. 5. The number of people in the lab

(3) The total time of all people in the lab in the last 15 days. To count the total time over the last 15 days, the program first computes the starting time of the first day and then fetches all records whose end time is later than that starting time, together with the user IDs. An array of length 15 stores the cumulative sums of the 15 days. Traversing the records, a start time earlier than the preset start of the first day is clamped to the first day's start. The time is then computed in two cases. First, if the start and end times fall on the same day, the duration (end time minus start time) is simply added to that day's array element. Second, if they fall on different days, the interval from the start time to the following midnight is accumulated into the element for the start date, the interval from midnight of the end day to the end time is accumulated into the element for the end date, and every day strictly between the start and end dates contributes a full day directly to its element (Fig. 6); a sketch follows.
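A sketch of this day-splitting accumulation, with names of our own choosing:

```python
from datetime import datetime, timedelta

def accumulate_daily(records, day0, days=15):
    """day0: 0 o'clock of the first of the `days` days.
    Splits each stay at every midnight it crosses and accumulates each
    piece into the array slot of its own day."""
    daily = [timedelta(0)] * days
    for _, _, start, end in records:
        cur = max(start, day0)                      # clamp to the first day
        while cur < end:
            next_midnight = datetime(cur.year, cur.month, cur.day) + timedelta(days=1)
            piece_end = min(end, next_midnight)
            idx = (cur.date() - day0.date()).days
            if 0 <= idx < days:
                daily[idx] += piece_end - cur       # whole days accumulate in full
            cur = piece_end
    return daily
```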

Fig. 6. The total time in the lab


6 Conclusion

In this paper, we present an algorithm based on Wi-Fi probe devices to monitor the people flow in an open laboratory, solving the problem of inaccurate attendance data in laboratory management caused by users forgetting to swipe their cards. In addition, we present an RSSI-based location algorithm to solve the problem of inaccurate statistics when the Wi-Fi signal coverage is larger than the statistical range. A clear sign-in and sign-out mechanism makes effective statistical use of the time data possible. Furthermore, we analyze and display the statistical data according to requirements and provide them to users, which helps managers make decisions and helps teachers and students use the laboratories.

Acknowledgement. This research was supported by the Scientific Research Fund of Fujian Provincial Education Department (JAT160339, JT180340) and by the Research Project of Experimental Teaching Reform in Fujian University of Technology (SJ2017003).

References

1. Wang, Y.H., Liu, J., Yu, Z.: Application of RFID in laboratory access control attendance. J. Chongqing Univ. Arts Sci. 32(5), 132–135 (2013). (in Chinese)
2. Lin, S.: Fast face recognition algorithm for laboratory attendance system. Inform. Technol. 4, 16–22 (2019). (in Chinese)
3. Wang, Y.H.: Design and implementation of fingerprint attendance system based on student management in laboratory. Wirel. Internet Technol. 5, 63–64 (2017). (in Chinese)
4. Roeding, C., Emigh, A.T.: Method and system for detecting presence using a wifi network probe detector. US Patent US20110029359 (2011)
5. Gao, J., Yuan, D.Y.: Design and research of early warning system based on WIFI probe. J. People's Public Secur. Univ. China (Sci. Technol.) 3, 89–93 (2016). (in Chinese)
6. Ren, Z.H., Wang, Y.Q., Wang, L.: Design and research on the system of public safety management based on WIFI probe. J. Data Min. 7(3), 77–81 (2017). (in Chinese)
7. Li, K.L., Zhao, H.W., Wang, G.Z.H., Fan, T.: Design of automatic warning system for abnormal visitors flow rate based on WiFi probe. Electron. Meas. Technol. 41(17), 138–141 (2018). (in Chinese)
8. Zhu, Y.J., Hao, S.H.J., Pei, Zh.Y., Xu, Sh.J., Zhou, Zh.: Commercial big data analysis technology based on wifi probe. Times Agric. Mach. 45(2), 99–104 (2018). (in Chinese)
9. Tang, J., Ren, C., Pan, W.: Multi-mode intelligent check-in system based on wifi probe. Softw. Eng. Appl. 7(4), 224–233 (2018)
10. Shi, X.R.: Design and implementation of laboratory automatic attendance and management system based on WIFI. Hebei University, Hebei (2018). (in Chinese)
11. Li, X.Y.: WIFI technology and its application and development. Inform. Technol. 36(02), 196–198 (2012). (in Chinese)
12. Nithya, B., Mala, C., Vijay Kumar, B.: Simulation and performance analysis of various IEEE 802.11 backoff algorithms. Proc. Technol. 6, 840–847 (2012)
13. Yang, X., Wang, C.: Research on multi-channel communication algorithm based on aggregation tree protocol. Netw. Secur. Technol. Appl. 12, 40–43 (2018). (in Chinese)

Research on the Status Quo and Problems of Power Grid Production in China

Bo-Lin Xie, Yuh-Chung Lin, Jeng-Shyang Pan, and Kuo-Chi Chang

Fujian Provincial Key Laboratory of Big Data Mining and Applications, School of Information Science and Engineering, Fujian University of Technology, Fuzhou, China
[email protected], [email protected], [email protected]

Abstract. The intelligent production of the power grid is the core of the development of smart grids. Power grid production is currently in a transitional phase, and during this transition the application of new technologies exposes many problems in the power grid foundation. According to our investigation, the most prominent problems in the current power grid are big data storage and protection, the coexistence of devices of different generations, and the classification and processing of the scene images taken by drone inspection. This paper elaborates on these issues and proposes feasible solutions based on the development of modern technology: establishing a big data cloud platform inside the power grid, proposing a unified standard for power grid devices, and effectively utilizing two-dimensional codes for image classification. This will not only effectively solve the various problems faced by the power grid at the current stage, but also lay the foundation for the development of the smart grid.

Keywords: Smart grid · Big data cloud platform · Intelligent production of power grid

1 Introduction

Through an on-site investigation of the China Southern Power Grid, we found some common problems in the power grid. First, the power grid continuously produces large amounts of data, such as the data generated in the power generation, transmission and transformation processes, the power consumption status of end users' houses, and the monitoring and maintenance status data of all related equipment. In the era of big data, many technologies exist for dealing with the data generated in the power grid; however, from our observation, the data have not been well organized and preserved. Such valuable data are lost without careful treatment, which is a big loss for a company. Next, due to the advancement of technology, new and more efficient equipment is continuously installed or replaced in the power grid. Therefore, the coexistence of different


generation equipment causes coordination problems among these devices. So far, there has been no substantial improvement on the problems caused by the coexistence of different generations of equipment in the power grid company. Finally, in the inspection of power transmission lines, drones are used to take pictures of the towers, junctions and power lines. The inspection process produces a huge number of pictures, which are currently classified by human employees. The workload is too heavy for humans, so classification errors are inevitable, making retrieval of these pictures difficult. These problems will become a stumbling block on the road to smart grids. The purpose of this paper is to study the problems mentioned above and explore more reasonable and smarter solutions.

2 Overview of Cutting-Edge Technology for Smart Grid Development

2.1 Internet of Things

In the development of smart grids, how to utilize IoT technology to improve energy efficiency has recently become an important issue. The purpose of using IoT technology in smart grids is to reduce energy consumption and to better adapt to users' power needs. Electricity consumption in India's electricity production process exceeds 30% [1]. To meet the growing demand for electricity and reduce energy waste, a real-time tracker for the entire power system is necessary, and this can be built with IoT technology. In the future, the centralized system currently used by the grid will be replaced by distributed microgrids, and distributed microgrids based on IoT technology will realize real-time monitoring of the power grid [2]. We found that power companies have developed some IoT devices to monitor the state of the power grid, and smart meters are one of them. Smart meters measure power consumption and return data to the utility for further analysis. Different countries use different technologies for data transmission. In the United States, a low-power radio frequency (LPRF) communication technology for mesh networks is used, transmitting data in the frequency band below 1 GHz. France and Spain use wired narrowband orthogonal frequency division multiplexing (OFDM) power line communication technology. Wireless communication technologies are also often used for end users: in the United States, the IEEE 802.15.4 2.4 GHz ZigBee standard; the UK and Japan are considering RF solutions below 1 GHz. Hybrid RF and power line communication can effectively communicate energy information between power producers and consumers. That is to say, IoT technology can be applied to the main subsystems of smart grids such as power generation, substation, transmission and distribution [3–5]. In short, smart meters are one kind of IoT terminal in the power grid. IoT technology can collect data from all subsystems in the grid, and such a large amount of data can be effectively analyzed by big data analysis methods, thereby reducing energy loss.

2.2 Cloud Computing

Traditional storage technologies can no longer meet the development requirements of smart grids: applying IoT technologies to the power grid produces a large amount of data, and how to preserve it is an important issue in smart grid development. Cloud computing technology is therefore urgently needed. The general definition of cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. The key technologies of cloud computing include the availability of high-speed access networks, hardware virtualization, and service-oriented architecture. A power grid company has a lot of data that should be preserved, and cloud data storage technology will assist in this preservation. Considering the confidentiality of data collected from the power grid, how to avoid information disclosure or tampering and thereby protect information security is also an important topic. In addition to the requirement for huge data storage, a fast access strategy is also necessary for fast image recognition, which can speed up problem diagnosis. The structure of the smart grid under the cloud storage and cloud computing architecture is shown in Fig. 1.

Fig. 1. Smart grid structure under cloud architecture.

2.3 QR Code and Others

QR is a two-dimensional code designed for quick response. It can store more data than a one-dimensional barcode. Decoding the information in a QR code at high speed requires corresponding devices such as mobile phones. The QR code has the advantages of high data storage capacity, fast scanning and full readability. It also has error correction capabilities, so slight damage to a QR code does not affect information reading [6]. In power transmission lines,


equipment is usually numbered with a nameplate. If power line towers were numbered with QR codes instead, the work would become more convenient. The power system can also use artificial neural networks to predict the operating state of devices, such as the temperature rise of distribution transformers. By using an artificial neural network to predict the temperature rise of a distribution transformer, failures caused by excessive temperature rise can be effectively reduced; [7] shows that ANN-based inference achieves a good prediction rate. Reference [8] studies power cost reduction through the application of a supervisory control and data acquisition (SCADA) system; power companies can also use SCADA systems to collect data and monitor operations, which benefits the management of the power company. A small underground distribution system with a voltage level of 22.8 kV can also be used in distribution. In [9], the full and maximum power factors of each feeder segment are derived from the complex power of the feeder outlet and the feeder segment load, so that the compensation point for the best power factor can be found, effectively improving power grid efficiency. In [10], a voltage tracking method is proposed to detect the early faults of online batteries so that power supply interruptions can be prevented, solving the battery failure problem and ensuring the safe and smooth operation of the substation.

3 Storage and Protection of Big Data in the Power Grid

With the development of smart grids, smart meters have been installed in almost all homes of customers of the China Southern Power Grid, and a large amount of data is constantly produced by hundreds of millions of smart meters. Storing such a large amount of data is a challenge for grid companies; yet, according to our observation, the grid company has not handled this problem.

3.1 Data Storage Issues

We have found a problem in how the power companies store the data generated by smart meters. For example, suppose we want to retrieve one year of electricity consumption data for all users in a certain area from the database. This cannot be done directly in the power company's database: we must first retrieve the monthly electricity consumption data of the users in this zone, then integrate the data of 12 months, and only then obtain the data set we need. This appears to be caused by an unreasonable database structure design. We believe the grid may need to upgrade its database management: because the coverage of the power grid is very wide, the establishment of a distributed database management system should be seriously considered, given the maturity of cloud computing and big data technologies.

3.2 Data Preservation Issues

We found that the frequency at which grid companies collect smart meter data is extremely low: usually once every 24 h for ordinary home users, and once every 15 min for dedicated transformers. Even so, collected data are deleted because of storage limitations. For home users, only one of every 30 collected readings is retained, and the retained reading is used only for billing; the other 29 are deleted. This seriously damages the integrity of the data. Furthermore, the grid company preserves only two years of data; in other words, user data older than two years cannot be accessed, which seriously hinders the application of data analysis. To solve the data preservation problem, cloud storage technology is a possible candidate that can perform data storage efficiently.

4 Generation Gap Problem of Power Grid Equipment

With the development of power equipment, new types of automated and intelligent power equipment are used in large quantities in the power grid. The installation and commissioning of new equipment not only requires a lot of time but also requires power outages, which bring many potential risks and economic losses. Therefore, once power equipment is put into production, it usually remains in continuous use for several decades. As time goes by, the coexistence of large numbers of new and old devices in the power system may block the development of the power grid. Power companies are now vigorously promoting intelligent construction, so the problems caused by the coexistence of devices of different generations should be resolved.

4.1 Differences Between Knife Switches at Different Times

There are currently two types of knife switch commonly used in high voltage substations in power systems: single-column vertical telescopic scissors and horizontal opening "handshake" switches. Take the handshake switch as an example. Over time, a substation accumulates handshake switches produced by different manufacturers at different times, and because of differences in manufacturer and model, the manual opening and closing methods of the different switches vary greatly. Usually, the opening and closing operation is performed by controlling the motor in the terminal box; in some special cases, however, the switch must be operated manually and directly. In a substation, manual trips are required when power is lost and weather conditions are poor: for example, when rainwater seeps into the drive motor, the motor will not operate properly. The switches that need to be operated are not only numerous but also of many different models, so workers must carry many different types of wrenches for the different models, such as levers, handcarts and cross wrenches. These wrenches are often bulky and inconvenient to carry, so they are mostly stored in warehouses, and during an operation workers must go back and forth to the warehouse multiple times. This is not only


time-consuming but also extremely inefficient. Since power outage time must be kept short, this situation seriously affects the smooth progress of the power outage plan.

4.2 Differences Between Relay Protection Devices in Different Periods

A large number of secondary relay protection devices are used in high voltage substations. They lay the foundation for remote intelligent control while effectively protecting the safety of substation equipment. However, with the continuous development of relay protection devices, miniaturization is the trend and intelligent devices are constantly emerging, so the relay protection devices in substations also face the coexistence problem of different generations of equipment. We found that 4–5 types of contact switches playing the same role coexist in the same relay protection room, for example blade-type, button-type and knob-type contact switches. At this stage, since the opening and closing of the contact switches is performed manually, this causes no major problem for the time being. But in the future, intelligent robots will replace humans in completing these simple operations, and unifying the switch types of the secondary relay protection devices will then become a necessary step for the smart grid.

5 Image Classification and Storage Problem After Drone Inspection Operation

The inspection of high-voltage transmission lines has always been the main work of the power transmission department. In the past, to inspect a line, electric workers had to climb the towers on site, which may be located in wild fields or even on mountains. With the rapid advancement of domestic drone technology, civilian quadrotor UAVs have been widely used in the inspection of power lines. Using drones for power line inspection not only effectively reduces the risk of the work, but also greatly improves inspection efficiency. At the same time, many of the real-time pictures taken by drones need further processing. In the current procedure, electric workers first take pictures of the power lines and towers with drones; the pictures taken on site are then brought back to the office for classification; finally, fault and defect identification is performed. Drone inspection has thus brought an increasingly cumbersome indoor workload even as it greatly reduces the outdoor workload: a large number of pictures must not only be classified one by one, but also numbered and stored uniformly, and the fault and defect identification is also done manually by human workers. At present, the power company is carrying out research on automatic image recognition.

5.1 The Grid Needs to Establish a System Image Data Platform

The development goal of the power grid at this stage is to achieve intelligence, but the premise of intelligence is a database platform with a reasonable structure and reliable performance. Power networks differ from other industries, and grid


security is an important guarantee of national security, so its particularity is self-evident. The power grid system cannot take the risk of using a public database platform; the best choice is to establish a private cloud-based data platform. All the pictures generated by drone inspection would be stored in this cloud platform, and a series of intelligent operations could then be carried out, such as intelligent classification and intelligent fault diagnosis. This would greatly reduce the complexity of post-processing and improve the efficiency of the power grid. Intelligent classification can be implemented with QR codes. For example, a fixed QR code could be placed at each point where a picture of the transmission line needs to be taken. The QR code can be correctly recognized once the drone flies to the correct shooting angle. After the drone recognizes the QR code, it links to the cloud storage space recorded in the QR code; this particular storage space is used to store all images taken at this angle. The live image is then transferred directly to the cloud platform, intelligently classified, and stored in the correct location (a sketch of this workflow follows). Next, the fault identification system can access all images of the same shooting point in the cloud through the platform and perform image recognition. If an abnormality is found in the comparison, the device may be judged defective, and through further intelligent calculation an accurate fault diagnosis can be obtained.
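As an illustration of this QR-linked storage idea, the sketch below assumes the drone firmware already decodes the QR payload into a storage key, and that the private platform accepts plain HTTP uploads. The endpoint URL, key format and header name are hypothetical, not part of any real system described in the paper.

```python
import requests  # third-party HTTP client; any client would do

# Hypothetical private-cloud endpoint of the image data platform.
PLATFORM_URL = "https://grid-cloud.example.internal/images"

def upload_inspection_image(qr_payload: str, image_bytes: bytes, taken_at: str):
    """qr_payload: the decoded QR string naming the per-shooting-point
    storage space, e.g. "line12/tower034/angle2" (format assumed).
    Uploads the live image directly into that space so that later fault
    identification can fetch all images taken at the same angle."""
    resp = requests.post(
        f"{PLATFORM_URL}/{qr_payload}",
        data=image_bytes,
        headers={"X-Taken-At": taken_at, "Content-Type": "image/jpeg"},
    )
    resp.raise_for_status()
```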

5.2 Remote Patrol and Fault Identification Technology Based on 5G Technology

5G technology, with its fast transmission speed, large bandwidth and low latency, can help the power grid company return live HD video and perform real-time video inspections. China Southern Power Grid and Huawei established an ICT laboratory; at present, the world's first 5G application in the field of real-time grid control and acquisition has been established on Huawei's purely domestic 5G architecture. The ICT lab also used IoT and artificial intelligence technology to create a new generation of intelligent power line inspection systems. The system can simultaneously access different sensor data such as audio, video, images, temperature, humidity and wind deflection of the line, and potential problems can be identified even during on-site inspection. Although identifying faults on site can improve efficiency, the accuracy of fault identification only reaches 90%; in terms of power grid security, this accuracy is not enough to ensure safe operation of the power grid. Therefore, image recognition based on AI and a big data cloud platform is still an effective way to improve the accuracy of fault identification.

6 Conclusions

The 21st century is an era of big data and AI, and the development of smart grids is clearly not smart enough without the assistance of big data and AI technologies. How to build a practical smart grid is one of the important issues for future work. The storage, preprocessing, analysis and prediction of big data in the power grid are still in the initial stage. From our observation, the large amount of data


generated by the power grid has not been properly protected and utilized. Establishing a big data cloud platform is an effective way to protect data and improve data utilization efficiency. At the same time, the power grid should also pay attention to the coexistence problems among devices of different generations; it is important to establish equipment standards for the power grid industry. The purpose of using new technologies is to reduce workload and increase efficiency, but because of new ways of working and possible additional workload, operators may resist new technologies, so such negative effects should also be taken seriously. New techniques such as QR codes, image recognition and 5G can be used to reduce the workload and eliminate the negative effects mentioned in this paper. The technologies of intelligent image classification, intelligent fault identification, fault reporting and intelligent warning will make power grid production truly intelligent.

References

1. Thakurdesai, S.: Smarter Metering—Now & in Future, July 2018. http://electronicsmaker.com/em/admin/pdfs/free/Texas_Instruments.pdf
2. Hossain, E., Khan, I., Un-Noor, F., Sikander, S.S., Sunny, M.S.H.: Application of big data and machine learning in smart grid, and associated security concerns: a review. IEEE Access 7, 13960–13988 (2019)
3. Basit, A., Sidhu, G.A.S., Mahmood, A., Gao, F.: Efficient and autonomous energy management techniques for the future smart homes. IEEE Trans. Smart Grid 8(2), 917–926 (2017)
4. Shu-Wen, W.: Research on the key technologies of IOT applied on smart grid. In: Proceedings of International Conference on Electronics, Communications and Control (ICECC), pp. 2809–2812 (2011)
5. Wang, Y.F., Lin, W.M., Zhang, T., Ma, Y.Y.: Research on application and security protection of Internet of Things in smart grid. In: Proceedings of IET International Conference on Information Science Control Engineering (ICISCE), pp. 1–5 (2012)
6. Tiwari, S.: An introduction to QR Code Technology. In: 2016 International Conference on Information Technology (ICIT), Bhubaneswar, pp. 39–44 (2016). https://doi.org/10.1109/icit.2016.021
7. Zhang, W., Pan, J.-S., Tseng, Y.-M.: Research on temperature rising prediction of distribution transformer by artificial neural networks. In: ICGEC 2017, pp. 144–152 (2017)
8. Hwang, J.-C., Chen, J.-C., Pan, J.-S., Huang, Y.-C.: Automation power energy management strategy for mobile telecom industry. IEICE Trans. 93-B(9), 2232–2238 (2010)
9. Tang, J., Zhang, W., Pan, J.-S., Tseng, Y.-M., Deng, H.-Q., Yan, R.-W.: The optimal reactive power compensation of feeders by using fuzzy method. In: ECC 2017, pp. 272–281 (2017)
10. Hwang, J.-C., Chen, J.-C., Pan, J.-S., Huang, Y.-C.: Measurement method for online battery early faults precaution in uninterrupted power supply system. IET Electr. Power Appl. 5(3), 267–274 (2011)

Technologies for Next-Generation Network Environments

Small File Read Performance Optimization Based on Redis Cache in Distributed Storage System

Bizhong Wei (1,2,3), Liqiang Deng (1,2,3), Ya Fan (1,2,3), and Miao Ye (3)

1 School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
[email protected], [email protected], [email protected]
2 Guangxi Cooperative Innovation Center of Cloud Computing and Big Data, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
3 Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
[email protected]

Abstract. At present, small files account for a large proportion of the files stored in most distributed storage systems, and how to improve small file storage efficiency has long been a hot issue in the academic community. In existing methods, researchers usually merge small files into content-related large files for associated storage. The shortcoming is that to read some small files within a large file, the entire large file must be read into the cache space, which greatly reduces cache space utilization. To overcome this defect, this paper designs a scheme combining a cache replacement optimization algorithm with a multi-level cache dynamic elimination mechanism. The cache replacement optimization algorithm computes the content heat value of each cache object, and the multi-level cache dynamic elimination mechanism evicts objects with low content heat from the cache space to reserve enough space for newly arrived objects. The proposed method not only improves cache space utilization, but also increases cache data search efficiency and hit rate. A series of experiments shows that the cache hit rate obtained by our method is 14% higher than that of the LRU cache replacement algorithm and 52% higher than that of the LFU cache replacement algorithm, which optimizes the read performance of small files based on the Redis cache.

Keywords: Redis cache · Small file reading · Multi-level cache · Caching mechanism

1 Introduction

A cache is a memory that can exchange data at high speed, mainly used to improve the response speed of access requests. But cache space is limited: when it reaches its upper limit, a cache elimination algorithm must remove some objects from the cache space to reserve enough space for newly arrived objects. Commonly used cache elimination algorithms include the least recently used algorithm LRU [1],


the least frequently used algorithm LFU [2] and the first-in-first-out algorithm FIFO [3]. Generally speaking, the cache hit ratio of each algorithm differs across application scenarios. An efficient cache elimination algorithm should keep the most valuable objects in the cache and eliminate the low-value ones. Because the LRU replacement algorithm cannot handle reference streams that lack the usual temporal locality, Nagasako et al. [4] proposed a cache replacement algorithm called Weighted Count (WC) that considers negative temporal locality of reference. It judges the value of a cached object from its access history, which makes it better than the traditional LRU algorithm, but it only considers the time interval, without considering the influence of access frequency on the value of the cached object. Based on the Pareto Least Recently Used (PLRU) and LRU-K algorithms, Miao [5] proposed an improved multi-level PLRU (MLPLRU) cache replacement strategy for online video. By setting up a multi-level cache and assigning priorities to cache objects to dynamically adjust the contents of the cache list, the cache pollution problem can be effectively avoided; however, this strategy does not take the influence of video size on the replacement strategy into account. Li et al. [6] presented an NDN cache replacement algorithm based on content heat, which can more truly and effectively represent the value of each cache object in the cache space; but the algorithm only considers the number of accesses of a cache object, not the access time interval or the size of the object. On this basis, we design a cache replacement optimization algorithm and a multi-level cache dynamic elimination scheme for the caching mechanism of the Redis cache database, which effectively improves the cache hit ratio of small files. The rest of this paper is structured as follows: Sect. 2 reviews related work; Sect. 3 describes the problems of reading small files; Sect. 4 introduces the details of our approach; Sect. 5 gives the experimental results; and Sect. 6 summarizes the paper.

2 Related Work

When Redis is used as a cache system and the cache storage space is insufficient to store newly arrived data, objects with low cache value must be evicted from the cache space to reserve enough storage space for the new objects. Redis therefore provides eight cache replacement policies; among them, allkeys-lru is the most widely used, evicting cache objects with an approximate version of the LRU replacement algorithm. To provide a cache replacement strategy applicable to more access patterns, Redis introduced an approximate LFU cache replacement algorithm in version 4.0 [7]; it approximates the LFU algorithm, estimating the number of accesses to cache objects with little memory overhead. In contrast to the LRU cache replacement algorithm [1], the LFU algorithm is more flexible: it applies to more access models and provides a better cache hit ratio [8]. The traditional LRU and LFU cache replacement algorithms only consider a single cache impact factor; to represent the cache value of cache objects reasonably, a


common solution is to introduce a temperature factor: the higher the temperature value, the higher the cache value. The content-heat-based NDN (Named Data Networking) cache replacement algorithm presented in [6] shows that, with the passage of time, earlier accesses to a cached object have less and less impact on its current heat value. Although that algorithm can truly represent the value of the current cache object, it only considers the influence of the number of accesses on the cache object's value, ignoring factors such as the access time and the size of the cache object. Therefore, this paper proposes a cache replacement optimization algorithm to improve on it.
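For reference, the Redis eviction policies discussed at the start of this section are selected through Redis's maxmemory-policy setting. A minimal sketch using the redis-py client follows; the connection parameters are placeholders.

```python
import redis  # redis-py client

r = redis.Redis(host="localhost", port=6379)
r.config_set("maxmemory", "256mb")               # cap the cache space
# allkeys-lru is the classic approximate-LRU policy; allkeys-lfu is the
# approximate-LFU policy available since Redis 4.0, as noted above.
r.config_set("maxmemory-policy", "allkeys-lfu")
print(r.config_get("maxmemory-policy"))
```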

3 Problem Description for Small File Read

Without considering the limitation of cache space, small file pre-reading can be realized by writing the large file in which a small file resides into the cache. In practical applications, however, memory space is very limited, and the size of written files is restricted. When multiple cache objects containing too much data are stored in the cache space, the working efficiency of the caching system visibly declines, which greatly affects small file reading. To solve this problem, this paper controls the cache objects written in the pre-read stage, and designs a cache replacement optimization algorithm and a multi-level cache dynamic elimination scheme, so as to optimize Redis-cache-based small file read performance.

4 Cache Optimization Method/Scheme Based on Redis Cache Mechanism

4.1 The Designed Cache Replacement Optimization Algorithm in Redis

The NDN cache replacement algorithm introduced in [6] only considers the influence of a single cache impact factor on the cache elimination algorithm. We present a temperature value calculation algorithm based on multiple caching factors, BME (Based on Multiple Elements). On the basis of the original heat value formula, this algorithm introduces two factors: the interval since the most recent access and the size of the cache object. Both are cost factors among the cache influence factors: the higher their value, the smaller the cache object's value, so they are inversely proportional to the temperature value. The number of accesses is a benefit factor: the higher its value, the greater the cache object's value, so it is proportional to the temperature. The improved temperature value formula is shown in Eq. 1:

V[i+1] = S[i] + V[i] / w    (1)

where w is the temperature value weight and S[i] is the influence factor; their calculation formulas are shown in Eqs. 2 and 3 respectively:


w = 1 + c · T    (2)

S[i] = e^{-s_i} · e^{-t_i} · (1 − e^{-f_i})    (3)

In Eq. 2, c is the proportionality coefficient and T is the period. In Eq. 3, e^{-s_i}, e^{-t_i} and (1 − e^{-f_i}) are the influence values of the size of the cache object, the most recent access time and the number of accesses, respectively; their values range from 0 to 1. When a new cache object first arrives in the cache, its initial temperature value is only affected by the size of the cache object, so the initial temperature can be calculated by Eq. 4:

V[i] = e^{-s_i}    (4)

The expanded form of the improved temperature value formula is shown in Eq. 5:

V[i+1] = (S[i] · w + V[i]) / w = S[i] + S[i−1]/w + S[i−2]/w² + …    (5)

From Eq. 5 we can also deduce that, as time passes, the cache impact factors of early cache objects have less and less influence on the current temperature value.
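A direct transcription of Eqs. 1–4 into code, as a sketch; the function and parameter names are ours, and s_i, t_i and f_i are taken as already-normalized inputs:

```python
import math

def initial_temperature(s_i):
    # Eq. 4: when an object first enters the cache, only its size matters.
    return math.exp(-s_i)

def update_temperature(v_prev, s_i, t_i, f_i, c, T):
    """One BME update for access i.
    s_i: normalized object size; t_i: normalized time since last access;
    f_i: normalized access count; c, T: coefficient and period of Eq. 2."""
    w = 1 + c * T                                                # Eq. 2
    s = math.exp(-s_i) * math.exp(-t_i) * (1 - math.exp(-f_i))  # Eq. 3
    return s + v_prev / w                                        # Eq. 1
```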

4.2 The Designed Redis Three-Level Cache Dynamic Elimination Scheme

When the Redis database is used as a cache, the traditional elimination mechanism works in three steps: first, the user sends Redis a command to write data (either a request to write a new object or to modify an existing one); second, Redis checks whether it has enough space to store the new data, and if not, it uses the cache replacement strategy to evict low-value cache objects until there is enough space to execute the user's command; finally, it executes the command. This traditional mechanism has three shortcomings:

1. The maximum length of a string in Redis is 512 MB. Given a limited cache space, only a few objects can be stored when multiple big data objects are cached.
2. When the cache space is insufficient, the objects in the cache space must be traversed whenever a new object arrives, and the low-value cache objects eliminated. Frequent computing operations increase the CPU workload and reduce the working efficiency of the system.
3. Response time and CPU workload increase when there are a large number of keys in the cache space.

In view of these three shortcomings, we provide the following three solutions:

1. Set a size check threshold for written cache objects: when the data to be written is larger than the given threshold, it is not written.


2. Set maximum and minimum critical values for the cache space. When the cached data reaches the maximum critical value, the cache elimination mechanism is triggered to evict objects with low cache value until the cached data is below the minimum critical value.
3. Use a three-level cache structure to store the cache data hierarchically, thus reducing response time and CPU workload.

Three-Level Cache Structure Design. In order to improve the search efficiency and hit ratio of cached data, we design a temperature-based L3 (level 3) cache structure, which divides the cache into three levels: hot cache, normal temperature cache and cold cache. When the system receives a write request for new data, it writes the new data directly into the normal temperature cache database. When the space of the normal temperature cache database is insufficient, the data whose temperature values rank in the top 5% and exceed the lowest value in the hot cache are moved into the hot cache database; at the same time, the cache objects whose temperatures rank in the last 20% are moved into the cold cache database. When the hot cache database's space is insufficient, the cached objects whose temperature values rank in the last 10% of the hot cache are transferred into the normal temperature cache database. The cold cache database always reserves enough storage space for objects eliminated from the normal temperature cache database. After each write of evicted objects from the normal temperature cache into the cold cache, the cold cache first moves back into the normal temperature cache those objects whose temperature values rank in its top 5% and exceed the lowest value in the normal temperature cache, and then determines whether the data in the cold cache exceed the threshold H; if so, cold objects are evicted from the cold cache database until its minimum space reserve threshold L is reached. Let the cache capacity of each level be S(n); the overall structure is shown in Fig. 1. From Fig. 1 we can see that objects in the normal temperature cache database come from three places: newly written cache objects, objects eliminated from the hot cache, and objects recovered from the cold cache. The objects in the hot and cold caches come only from the normal temperature cache database, and the cache objects with the lowest temperature values are finally evicted from the cold cache database.

Fig. 1. Three-level cache structure (data flow: new data → normal_db; coldest 20% of normal_db → cold_db; hottest 5% of normal_db → hot_db; coldest 10% of hot_db → normal_db; hottest 5% of cold_db → normal_db; coldest objects evicted from cold_db)
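As an illustration of the structure above, the following Python sketch mimics the promotion and demotion flows among the three cache levels. The class and method names, the in-memory dictionaries and the temperature scores are illustrative assumptions for exposition, not the authors' implementation.

```python
# Illustrative sketch of the three-level cache layout (hot / normal / cold).
# All names and data structures here are assumptions for exposition.

class ThreeLevelCache:
    def __init__(self, capacity):
        self.capacity = capacity                  # S(n): per-level capacity
        self.hot, self.normal, self.cold = {}, {}, {}

    def write(self, key, obj, temp):
        # new data always enters the normal temperature cache first
        self.normal[key] = (obj, temp)
        if len(self.normal) > self.capacity:
            self.rebalance_normal()

    def hottest(self, db, frac):
        # keys of the top `frac` fraction of db, ranked by temperature
        ranked = sorted(db, key=lambda k: db[k][1], reverse=True)
        return ranked[:max(1, int(len(ranked) * frac))]

    def coldest(self, db, frac):
        ranked = sorted(db, key=lambda k: db[k][1])
        return ranked[:max(1, int(len(ranked) * frac))]

    def rebalance_normal(self):
        # demote the coldest 20% of normal_db to cold_db ...
        for k in self.coldest(self.normal, 0.20):
            self.cold[k] = self.normal.pop(k)
        # ... then promote the hottest 5% that beat the coolest hot_db entry
        floor = min((t for _, t in self.hot.values()), default=float("-inf"))
        for k in self.hottest(self.normal, 0.05):
            if self.normal[k][1] > floor:
                self.hot[k] = self.normal.pop(k)
```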


Three-Level Cache Dynamic Elimination Process
When the cache space is insufficient, data expulsion is always performed before data recovery, and the number of objects eliminated is larger than the number recovered; therefore, the normal temperature database always reserves enough space for recovering cache objects from the cold cache and for receiving cache objects eliminated from the hot cache. Objects recovered from the cold cache or eliminated from the hot cache never trigger the elimination mechanism; it is triggered only when newly arrived cache objects exhaust the space of the normal temperature cache database. The recovery and elimination of cache objects when the normal temperature database runs out of space proceed as follows: 1. When the newly arrived data triggers the elimination mechanism of the normal temperature cache database and the hot cache database still has enough space to hold the data written from the normal temperature cache, the elimination process is shown in Fig. 2: the normal temperature cache first moves the objects whose temperature values rank in the last 20% into the cold cache database, and then moves the objects whose temperature values rank in the top 5% and exceed the lowest temperature value in the hot cache into the hot cache database. 2. When the newly arrived data triggers the elimination mechanism of the normal temperature cache database and less than 5% of the hot cache database space is free, the elimination process is shown in Fig. 3: the normal temperature cache first moves the objects whose temperature values rank in the last 20% into the cold cache database; then the hot cache transfers the objects whose temperature values rank in its last 10% to the normal temperature cache database; after these two steps, the objects in the normal temperature cache whose temperature values exceed the lowest temperature value in the hot cache are moved into the hot cache database. 3. The cold cache database always reserves enough storage space for objects eliminated from the normal temperature cache; its workflow is shown in Fig. 4: after the normal temperature cache writes an eliminated object to the cold cache database, the cold cache first checks whether its cached data exceeds the fixed threshold; if so, the objects whose temperature values rank in the cold cache's top 5% and exceed the lowest temperature in the normal temperature cache are restored to the normal temperature cache database, and then the low-temperature objects in the cold cache are expelled until the cached objects fall below the given minimum threshold.


Fig. 2. Schematic diagram of cache elimination at normal temperature 1

Fig. 3. Schematic diagram of cache elimination at normal temperature 2

Fig. 4. Cold cache elimination schematic diagram
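To make the three elimination scenarios concrete, here is a sketch that extends the ThreeLevelCache class above; the reserve thresholds H and L keep the paper's names, but their values are assumed.

```python
# Sketch of the dynamic elimination scenarios described above (an extension of
# the ThreeLevelCache sketch). Threshold values are assumptions.

def rebalance_hot(cache):
    # scenario 2: hot_db is short of space, so it first demotes the coldest
    # 10% of its objects back to normal_db
    for k in cache.coldest(cache.hot, 0.10):
        cache.normal[k] = cache.hot.pop(k)

def rebalance_cold(cache, H=0.95, L=0.80):
    # scenario 3: after receiving objects expelled from normal_db, cold_db
    # first promotes its hottest 5% (if hotter than the coolest normal_db
    # entry), then expels its coldest objects until it is below threshold L
    if len(cache.cold) <= int(H * cache.capacity):
        return
    floor = min((t for _, t in cache.normal.values()), default=float("-inf"))
    for k in cache.hottest(cache.cold, 0.05):
        if cache.cold[k][1] > floor:
            cache.normal[k] = cache.cold.pop(k)
    while len(cache.cold) > int(L * cache.capacity):
        coldest = min(cache.cold, key=lambda k: cache.cold[k][1])
        del cache.cold[coldest]
```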

5 Experiment and Evaluation

5.1 Experimental Environment and Dataset

The experiment in this paper is divided into two parts: the first tests the cache hit ratio of the LRU, LFU and BME algorithms, and the second tests the cache hit ratio of the Redis database. The two experiments are conducted on the Windows system of a physical machine and the Ubuntu system of a virtual machine, respectively. The configuration information is shown in Table 1:

Table 1. Configuration information

OS 1           Ubuntu 14.04.5 LTS
OS 2           Windows 10.0 17134
OS 1 CPU       Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz
OS 2 CPU       Intel(R) Core(TM) i7-6500U CPU @ 2.50 GHz (4 CPUs), ~2.6 GHz
OS 1 memory    8 GB
OS 2 memory    12 GB
Redis version  5.0.3

5.2 Cache Hit Ratio

The cache hit ratios of three cache replacement algorithms, LRU, LFU and BME, were tested under two kinds of cache space restriction. First, the number of cache objects was taken as the upper limit of the cache space: a space of 500 cache objects was set for each algorithm. Since the impact of cache object size on the experimental results was not considered at this point, we adjusted the temperature formula of BME. The adjusted formula is shown in Eq. (6):

V[i+1] = e^(−t_i) · (1 − e^(−f_i)) + V[i]/w    (6)
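As a quick illustration of Eq. (6) — with t_i read as the elapsed-time term and f_i as the access-frequency term — a direct transcription into Python would be:

```python
import math

def bme_temperature(v_prev, t_i, f_i, w):
    """One temperature update step of Eq. (6) under the reading given above.

    v_prev -- previous temperature value V[i]
    t_i    -- elapsed-time term for object i (assumed: time since last access)
    f_i    -- access-frequency term for object i
    w      -- decay weight
    """
    return math.exp(-t_i) * (1 - math.exp(-f_i)) + v_prev / w
```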


Each time, 2,000 small files were randomly selected for reading from a set of 15,000 small files, and the hit counts of the three algorithms were tested under 20,000, 60,000, 100,000, 200,000 and 300,000 access requests, respectively; the average of the five results was taken as the final result. The experimental results are shown in Fig. 5. Figure 5 shows that the average cache hit count of the BME algorithm is always the best; as the number of visits increases, the gap between LFU and BME narrows, and the hit counts of both BME and LFU are higher than that of LRU.

Fig. 5. Average cache hits (per 10,000 hits) of LRU, LFU and BME versus the number of visits (ten thousand)

Fig. 6. Cache hit ratio of LRU, LFU and BME versus the ratio of cache capacity to data volume (5%–30%)

We also tested the cache hit ratios of the three algorithms under different ratios of cache capacity to total data volume. The cache hit ratio is calculated as shown in Eq. (7):

hit rate = hits / (hits + misses)    (7)
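On a live Redis instance, the quantities of Eq. (7) can be read directly from the server's keyspace counters; a minimal sketch with the redis-py client (connection parameters assumed):

```python
import redis

# INFO's "stats" section exposes the cumulative keyspace hit/miss counters
r = redis.Redis(host="localhost", port=6379)
stats = r.info("stats")
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
hit_rate = hits / (hits + misses) if (hits + misses) else 0.0
print(f"cache hit ratio: {hit_rate:.2%}")
```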

A total of 50,000 read requests were sent to the database, 1,000 requests at a time. The cache hit ratio was tested when the ratio of cache capacity to data volume was 5%, 10%, 15%, 20% and 30%, respectively; the results are shown in Fig. 6. Figure 6 shows that as the ratio of cache capacity to data volume increases, the cache hit ratios of the three cache replacement algorithms also increase. The cache hit ratio of the BME algorithm is always the highest, and the gap between LRU and the other two cache replacement algorithms also narrows. Figures 5 and 6 show that the BME


algorithm can effectively improve the cache hit ratio when the number of cache objects is fixed. Next, the size of the cache space was taken as the experimental condition, with the cache space set to 50 MB. The cache hit ratios of the three algorithms were measured under 20,000, 60,000, 80,000, 100,000, 120,000 and 150,000 access requests, respectively. Since the file size factor s_i is introduced at this point, the temperature value formula becomes Eq. (8):

V[i+1] = e^(−(s_i + t_i)) · (1 − e^(−f_i)) + V[i]/w    (8)

The cache hit ratio is shown in Fig. 7:

Fig. 7. Cache hit rate of LRU, LFU and BME versus the number of access requests

Fig. 8. Redis cache hit rate of LRU, LFU and BME versus the number of access requests

Figure 7 shows that the cache hit ratio of the BME algorithm is the highest, on average 14% higher than LFU and 52% higher than LRU. Compared with Fig. 6, when a limit is set on the amount of data stored in the cache space, the cache hit ratio of BME is significantly higher than that of LFU, because in the second scheme the cache object size factor is introduced into the BME algorithm. For the same cache space, the BME algorithm can store more cache objects, so its cache hit ratio is significantly better than LFU.

5.3 Redis Cache Hit Ratio

We set the cache space of Redis to 50 MB and tested the cache hit ratios of the three cache replacement algorithms, LRU, LFU and BME, under 20,000, 60,000, 80,000, 100,000, 120,000 and 150,000 access requests, respectively. The experimental results are shown in Fig. 8. Figure 8 shows that the cache hit ratio of the BME algorithm is significantly higher than that of the other two algorithms: on average 16% higher than LFU and 57% higher than LRU. Therefore, the BME algorithm can effectively improve the cache hit ratio of the data.


6 Conclusion

In this paper, we propose a cache replacement optimization algorithm and a three-level cache dynamic elimination scheme for the Redis caching mechanism. The design idea and implementation steps of the cache replacement optimization algorithm are explained in detail, and the three-level cache dynamic elimination scheme is described in terms of the cache structure, the elimination trigger conditions of each cache level and the elimination process of cache objects. Finally, experiments show that the BME algorithm with the three-level cache structure achieves a higher cache hit ratio than the LFU and LRU cache replacement algorithms in the Redis cache mechanism; it optimizes the read performance of small files based on the Redis cache and improves the working efficiency of the caching system.

Acknowledgement. This work is supported by the National Natural Science Foundation of China (Nos. 61861013, 61662018), the Scientific Research and Technology Development Project of Guangxi (No. 1598019-2), the Guangxi Natural Science Foundation of China (No. 2016GXNSFAA380153), the Doctoral Research Foundation of Guilin University of Electronic Science and Technology (No. UF19033Y), and the Guangxi Graduate Education Innovation Program (2019YCXS044).


Neural Network and Deep Learning

Lightweight Real-Time Object Detection Based on Deep Learning

Zhefei Wei, Zhe Huang, Baolong Guo(B), Cheng Li, and Geng Wang

School of Aerospace Science and Technology, Xidian University, Taibai South Road, Xi'an 710071, China
{weizhifei,huangz,licheng812}@stu.xidian.edu.cn, [email protected], [email protected]

Abstract. Real-time object detection plays a significant role in the field of computer vision. Advanced object detection networks exploit the distribution characteristics in the image, while detecting small targets remains a bottleneck. In this paper, a novel network, YOLO-light, is proposed that detects objects accurately and in real time on embedded systems or portable devices. First, to obtain better priori boxes, clustering analysis is applied in pre-processing. Second, inspired by the multi-scale connections of the Feature Pyramid Network (FPN) algorithm, YOLO-light observes multiple views and integrates features of various scales. With this design, YOLO-light supports end-to-end training with low latency and higher average precision. The experiments testify that the YOLO-light algorithm achieves satisfactory performance in both speed and accuracy.

Keywords: Object detection · Real-time · Multi-scale · Convolutional network

1 Introduction

In contemporary society, object detection is one of the hot research topics in computer vision, with applications such as face-scanning payment systems, vehicle monitoring, autonomous driving without specialized sensors, and so on. To match the human visual system, computer vision technologies must automatically detect a specific target in an image and predict the class, size and location of the detected targets. Deep learning techniques have greatly revolutionized the computer vision field, bringing high speed, high accuracy and increased robustness. Currently, the existing object detection methods based on deep learning can be divided into two groups: two-stage models and one-stage models. Classical two-stage object detection methods have high latency due to their heavy computation. Typical detectors include R-CNNs (region-based convolutional neural networks), Fast R-CNN [1], SPP-net (spatial pyramid pooling networks), Faster R-CNN [2] and R-FPN [3] (region-based fully


convolutional networks). R-CNNs rely on region proposal methods to hypothesize object locations: feature vectors are extracted from the proposed regions and fed into the following convolution layers, and finally the feature vectors are classified or regressed with different algorithms. Thanks to the advent of shared convolutions, the computational cost of R-CNNs has dramatically decreased. For instance, SPP-net and Fast R-CNN reduced the training time, while still spending a great deal of time on proposal computation. The Region Proposal Network (RPN) therefore shares the image convolutional features with the detection network, so that regions of diverse scales can be predicted efficiently; unifying RPNs with Fast R-CNN produced Faster R-CNN, which can be trained directly to generate the proposal boxes and almost enables real-time object detection. The emergence of one-stage detection methods makes real-time performance possible. Current one-stage object detection models are You Only Look Once (YOLO) and the Single Shot Multibox Detector (SSD). YOLO is strikingly simple: a single neural network predicts bounding box coordinates and the associated class probabilities simultaneously [4]. The SSD algorithm discretizes the output space of bounding boxes into a set of prior boxes with different aspect ratios and scales at each feature map position [5]; in the prediction process, a confidence is produced for each prior box and an adjustment of the box is regressed to better match the shape of the object of interest. Both SSD and YOLO treat target detection as regression, which markedly speeds up the process [6]. This paper adopts a lightweight network to achieve rapidity and a multi-scale target detector for dim targets. Using a clustering algorithm as the starting point, the proposed YOLO-light is inspired by the YOLOv3-tiny network. Taking accuracy and real-time effect into account, YOLO-light employs three scales to fuse and detect the extracted features. In this way, YOLO-light enables end-to-end training and real-time speed with a higher average precision. The experimental results show that the YOLO-light algorithm achieves satisfactory performance in both precision and recall.

2 Related Work

2.1 Brief Introduction of the YOLO Algorithm

Based on YOLOv1, Redmon et al. proposed You Only Look Once version 2 (YOLOv2), which made breakthroughs in accuracy and speed: on the VOC2007 dataset it detects at 67 fps with an accuracy of 76.8% in real time. Since YOLOv2 performs poorly on small-target detection, the authors proposed a new algorithm, You Only Look Once version 3 (YOLOv3), based on YOLOv2. Compared with the original algorithm, it adds multi-label classification and multi-scale prediction, and the detection effect on small targets is significantly improved [7]. In addition, drawing on the residual structure of the residual neural network (ResNet), it constructs a new structure, Darknet-53.


Fig. 1. The workflow diagram of the proposed YOLO-light


While keeping the speed and accuracy basically unchanged, YOLOv3 dramatically improves the detection of small targets. As the latest algorithm in the YOLO series, YOLOv3 both retains and improves upon the previous algorithms. Here is a brief introduction. Starting from YOLOv1, the algorithm performs detection by dividing the image into cells, though the number of divisions differs between versions. YOLOv2 uses batch normalization for regularization, accelerating convergence and avoiding overfitting. YOLOv3 replaces the original single-label multi-classification with multi-label multi-classification, employs a binary cross-entropy loss for class prediction, and adopts multiple scales to predict the boxes.

2.2 The Network of the Darknet19

The proposed YOLO-light adopts the advantages of the darknet19 network. YOLOv3 continues the practice of YOLOv2 in the way coordinates are predicted. For category prediction, multi-label multi-classification is adopted: a binary cross-entropy loss function is used in YOLOv3 rather than the multi-class loss function of YOLOv2 [8]. On well-equipped hardware, YOLOv3 can attain real-time requirements; however, on resource-limited equipment, the YOLOv3 algorithm cannot satisfy real-time requests. Accordingly, the YOLOv3-tiny algorithm framework is adopted here; in other words, the YOLOv3-tiny network can basically satisfy the real-time requirement on miniaturized embedded devices.

3 The Proposed YOLO-light Network

In order to detect objects faster and more accurately, YOLO-light applies multi-scale fusion, multi-scale detection and k-means++ clustering. In this way, our approach can be used in embedded systems. The detection effect of the YOLO-light network is more than satisfactory, especially in recognizing small-size targets.

3.1 Fusion of Features in Multiple Layers

To solve the issue of degradation, this paper draws inspiration from ResNet, proposed by Kaiming He in 2015, and from the DenseNet network proposed in 2017. The ResNet network constructs two building blocks, a residual block and a deep residual block, which transfer earlier features to subsequent layers; the number of network layers can therefore be continuously deepened. Moreover, the skip connection solves the degradation problem and also reduces the error rate. In a Dense block, the output of an earlier convolutional layer is transferred to the subsequent convolutional layers; put another way, the input is effectively connected directly to each layer, which dramatically alleviates the problem of vanishing gradients. By incorporating these ideas into


the hidden network of darknet19, the excellent features of the above networks can be exploited. Since the number of layers is not very large, the output layers of the same size in each earlier feature extraction layer can be extracted [9] and connected together, and the result is fed to the subsequent feature extraction layer. This paper adopts this standpoint and obtains satisfactory results, with substantial improvement. In order to make full use of the hidden features, the Feature Pyramid Network (FPN) has been introduced. The proposed YOLO-light network uses a multi-scale fusion method: the features extracted at each earlier scale are up-sampled, passed to the next layer and concatenated with the features of that scale; finally, the predicted results are output at the last layer. The features in the 15th, 16th and 17th layers of the network are fused and then fed into a convolutional layer and an up-sampling layer. Similarly, the feature maps in the 9th and 10th layers are processed together with the result, forming a new input that is fed into the next layer. Finally, the 6th and 7th layers and the tensor are connected to the next hidden layer to detect the large scale. This is the all-layer connection method of this paper.
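The up-sample-and-concatenate step described above can be sketched in a few lines of PyTorch; the channel counts below are illustrative assumptions, not the exact layer widths of YOLO-light.

```python
import torch
import torch.nn as nn

class ScaleFusion(nn.Module):
    """Fuse a deep, coarse feature map with a shallower one of twice the size."""
    def __init__(self, deep_ch=256, shallow_ch=128):
        super().__init__()
        self.reduce = nn.Conv2d(deep_ch, shallow_ch, kernel_size=1)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")

    def forward(self, deep, shallow):
        x = self.up(self.reduce(deep))           # e.g. 13x13 -> 26x26
        return torch.cat([x, shallow], dim=1)    # concatenate along channels

fused = ScaleFusion()(torch.randn(1, 256, 13, 13), torch.randn(1, 128, 26, 26))
print(fused.shape)  # torch.Size([1, 256, 26, 26])
```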

3.2 The Framework of Multi-scale Prediction

A similar top-down architecture with skip connections is popular in state-of-the-art object detection research. The proposed YOLO-light creates a feature pyramid that offers multiple views and strong semantics. If there are n kinds of objects in the dataset and each picture is an RGB picture, the detection tensor is M × M × [3 × (4 + 1 + n)]; in this formula, four parameters give the size and location of the bounding box, and 1 is the confidence [4]. Several convolutional layers are added after the base feature extractor. Obviously, at the places where the scales are merged, the number of network layers is not large. The YOLO-light network connects feature maps of the same scale from the earlier darknet19 network architecture [10]. Meanwhile, the network takes the feature map from two convolution layers and then up-samples the features, and the tensors above are concatenated together. In this way, both the characteristics of the hidden layers and the deep features can be obtained by YOLO-light. A tensor of size 13 × 13 × 18 is obtained at the first scale. Thereafter, the tensors are passed through two convolutional layers and an up-sampling layer, and the network obtains the second scale, where the size of the tensor is 26 × 26 × 18. Through the convolutional and up-sampling layers again, the third scale of 52 × 52 × 18 is obtained. These three scales are used to detect targets of different sizes: the network performs large-scale detection on the 13 × 13 tensor, detects mesoscale targets on the 26 × 26 tensor, and detects small-scale targets on the last 52 × 52 tensor. By fusing multiple features at the same scale, the YOLO-light network prominently lifts its ability to detect objects. Benefiting from the earlier network, the detection accuracy on small targets is signally improved by the third scale of the structure. At the same time, in order to predict the object better, this kind of


structure is applied to YOLO-light three times. The frame diagram of YOLO-light can be found in Fig. 1.
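The detection-head dimensions quoted above follow directly from the formula M × M × [3 × (4 + 1 + n)]; a quick check in Python (the single-class case n = 1 reproduces the 18-channel tensors in the text):

```python
def head_shape(M, n, anchors=3):
    # M x M grid, `anchors` boxes per cell, each with 4 coords + 1 confidence
    # + n class probabilities
    return (M, M, anchors * (4 + 1 + n))

for M in (13, 26, 52):
    print(head_shape(M, n=1))   # (13, 13, 18), (26, 26, 18), (52, 52, 18)
```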

3.3 Targets for Anchors Clustering

In this work, the k-means++ algorithm is used to cluster the box dimensions. First, a sample point is randomly selected as the first initial cluster center. Then, for each remaining sample point, the distance to the nearest existing cluster center is calculated, and points farther from the existing centers are more likely to be chosen as the next center [11]. Once the cluster centers are initialized, each sample point is assigned to the cluster center at the shortest distance [6]. The standard k-means algorithm measures the distance between two points with the Euclidean distance. The three target sizes here are defined by the target's proportion of the entire image, and the dataset contains targets at three scales: large-scale, mesoscale and small-scale. However, the Euclidean distance exposes more error for larger bounding boxes than for smaller ones. Since our goal is to obtain better Intersection-over-Union (IoU) through the anchor boxes, this paper applies the Jaccard distance instead.
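A sketch of dimension clustering with the 1 − IoU (Jaccard) distance, in the style of the YOLOv2 dimension clusters; boxes are (w, h) pairs, and the k-means++ seeding is simplified here to a plain random choice.

```python
import random

def iou_wh(a, b):
    # IoU of two boxes aligned at a common corner, compared by width/height only
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k=9, iters=100):
    centroids = random.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = min(range(k), key=lambda i: 1 - iou_wh(b, centroids[i]))
            clusters[best].append(b)
        centroids = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids
```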

4 Experimental Verification and Result Analysis

4.1 Comparison in mAP

The mAP (mean Average Precision) is one of the most significant indicators for evaluating test results. Each sample is judged by its true category and the learner's predicted class, so samples can be divided into four parts: TP (true positive), TN (true negative), FP (false positive) and FN (false negative). Precision is the proportion of samples currently assigned to the positive class that are correctly classified; recall is the proportion of all actual positive samples that are assigned to the positive class. Precision and recall are defined in Eqs. (1) and (2):

precision = TP / (TP + FP)    (1)

recall = TP / (TP + FN)    (2)
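From the precision/recall pairs of Eqs. (1) and (2), the AP of one class can be accumulated over the ranked detections, and mAP is simply the mean of the per-class APs; a compact sketch (using a plain rectangle rule rather than the VOC interpolation):

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    """AP for one class: detections ranked by confidence, PR curve integrated."""
    order = np.argsort(-np.asarray(scores))
    tp_flags = np.asarray(is_tp, dtype=float)[order]
    tp = np.cumsum(tp_flags)
    fp = np.cumsum(1.0 - tp_flags)
    recall = tp / num_gt          # Eq. (2): TP / (TP + FN)
    precision = tp / (tp + fp)    # Eq. (1): TP / (TP + FP)
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```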

It is quite difficult to find a trade-off between precision and recall; assuming the sample at a given position is irrelevant, the precision at that position equals 0. The NWPU VHR-10 dataset contains a total of 10 classes. In this experiment, the AP for each class of target is calculated, from which the mAP of each network is easily obtained; the values are displayed in Table 1. With the improved network structure and k-means clustering, YOLO-light performs better than YOLOv3-tiny, with mAP reaching 0.89693.

4.2 Comparison of Speeds

The 10 classes of the NWPU VHR-10 dataset are airplane, vehicle, harbor, ship, bridge, storage tank, baseball diamond, tennis court, basketball court and ground track field; the per-class AP values are listed in Table 1. For the speed comparison, we selected about 300 images containing small targets from the VOC 2007 dataset. Compared with SPP-net, Faster RCNN, YOLOv3 and YOLOv3-tiny in detection speed and accuracy, Table 2 demonstrates that the proposed YOLO-light finds a trade-off between accuracy and detection speed and exhibits the best performance on small objects among these methods.

Table 1. Comparison of precision between YOLOv3-tiny and YOLO-light on NWPU VHR-10 dataset

Precision  Classes             YOLOv3-tiny  YOLO-light
AP (%)     Airplane            0.81806      0.85874
AP (%)     Vehicle             0.68695      0.82016
AP (%)     Harbor              0.92718      0.99867
AP (%)     Ship                0.89593      0.99114
AP (%)     Bridge              0.76209      0.81266
AP (%)     Storage tank        0.71365      0.77361
AP (%)     Baseball diamond    0.73652      0.85413
AP (%)     Tennis court        0.87765      0.90083
AP (%)     Basketball court    0.93876      0.95934
AP (%)     Ground track field  1.0          1.0
mAP (%)                        0.83568      0.89693

Table 2. Comparison of precision and speed for detection methods on VOC 2007 dataset

Method              SPP-net  Faster RCNN  YOLOv3  YOLOv3-tiny  YOLO-light
mAP (%)             30.32    66.28        55.73   27.16        31.59
Run time (sec/img)  0.38     0.26         0.13    0.10         0.09


Fig. 2. Comparison of IoU curves of two networks

Fig. 3. Comparison of test effect for two networks: the left column is produced by YOLOv3-tiny and the right column by YOLO-light, respectively.

4.3 Comparisons of IoU Curves

Intersection-over-Union (IoU) is an evaluation parameter in the field of object detection. Simply speaking, IoU is the overlap rate between the target window generated by the YOLO-light network and the original markup window: the area of the intersection of DR (Detection Result) and GT (Ground Truth) divided by the area of their union. The ideal situation is complete overlap. Generally speaking, a score over 0.5 is considered a good result. The IoU curves of YOLOv3-tiny and YOLO-light are shown in Fig. 2. It is apparent that, compared with YOLOv3-tiny, the area under the curve of the YOLO-light network is larger, and its IoU curve converges faster. YOLO-light produces a higher overlap between the candidate bound and the ground truth bound, and the ratio of their intersection to their union is higher, showing that the candidate bound is highly correlated with the ground truth bound. In summary, the performance of the YOLO-light network has indeed improved considerably.
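The definition above amounts to a few lines of code; boxes are taken here as (x1, y1, x2, y2) corner coordinates:

```python
def iou(dr, gt):
    # intersection of DR and GT divided by their union
    ix1, iy1 = max(dr[0], gt[0]), max(dr[1], gt[1])
    ix2, iy2 = min(dr[2], gt[2]), min(dr[3], gt[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((dr[2] - dr[0]) * (dr[3] - dr[1])
             + (gt[2] - gt[0]) * (gt[3] - gt[1]) - inter)
    return inter / union

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```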

Fig. 4. The test effect in the proposed YOLO-light

4.4 Test Effect Charts

Figure 3 displays the comparison between YOLOv3-tiny and YOLO-light. It is easy to see that YOLOv3-tiny produces certain missed and false detections. Figure 4 shows the detection results of YOLO-light: whether for single or multiple targets, and regardless of large, moderate or small size, the effect of YOLO-light is overwhelmingly satisfactory. As mentioned earlier, indicators such as mAP also perform very well.

5 Conclusions

This paper introduces a lightweight and low-latency object detection method based on deep learning. The biggest success of YOLO-light is due to its structural optimization of YOLOv3-tiny; our approach can be applied to embedded systems or portable devices. In addition, by using the k-means++ algorithm to cluster the dataset and to select the number and specifications of the candidate boxes, the data obtain good a priori boxes. With the optimized algorithm, YOLO-light can carry out multi-scale prediction, and the detection accuracy on small targets has also been markedly improved. In future work, the background information of the overall image will be fully used to enhance the detector's performance.

Acknowledgment. This work is supported by the National Natural Science Foundation of China (61571346). The research is also supported by the Fundamental Research Funds for the Central Universities and the Innovation Fund of Xidian University. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

References

1. Ren, S.Q., He, K.M., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Proceedings of the IEEE Transactions on Pattern Analysis and Machine Intelligence, Cambridge, MA, USA, pp. 1137–1149 (2017)
2. Dai, J.F., Li, Y., He, K.M., Sun, J.: Object detection via region-based fully convolutional networks. In: Proceedings of the International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 379–387 (2016)
3. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. Trans. Pattern Anal. Mach. Intell. 32, 1627–1645 (2010)
4. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 779–788 (2016)
5. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., Berg, A.C.: SSD: single shot multibox detector. In: Proceedings of the European Conference on Computer Vision and Pattern Recognition, Amsterdam, The Netherlands, pp. 21–37 (2016)
6. Huang, G., Liu, Z., Van, D.: Densely connected convolutional networks. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, Hawaii, HI, USA, pp. 4700–4708 (2017)
7. Jeong, Y.N., Son, S.R., Jeong, E.H., Lee, B.K.: An integrated self-diagnosis system for an autonomous vehicle based on an IoT gateway and deep learning. Appl. Sci. 7, 1164 (2018)
8. Tang, C., Ling, Y., Yang, X., Jin, W., Zheng, C.: Multi-view object detection based on deep learning. Appl. Sci. 8, 1423 (2018)
9. Qu, H., Zhang, L., Wu, X., He, X., Hu, X., Wen, X.: Multiscale object detection in infrared streetscape images based on deep learning and instance level data augmentation. Appl. Sci. 9(3), 565 (2019)
10. Chen, J., Luo, X., Liu, Y., Wang, J., Ma, Y.: Selective learning confusion class for text-based CAPTCHA recognition. IEEE Access 7, 22246–22259 (2019)
11. He, W.P., Wang, G., Hu, J., Li, C., Guo, B.L., Li, F.P.: Simultaneous human health monitoring and time-frequency sparse representation using EEG and ECG signals. IEEE Access 7, 85986–85994 (2019)

Design of Intelligent Water Purification Control System for Small Waterworks Based on LSTM

Ying Ma, Zhigang He, Jianxing Li, Kan Luo, Zhengshan Chen, and Lisang Liu

Fujian University of Technology, Fuzhou, China
Research and Development Center for Industrial Automation Technology of Fujian Province, Fuzhou 350118, China
[email protected], [email protected]

Abstract. This paper studies the application of intelligently controlled automatic dosing and water purification to tap water supply in remote areas. On the basis of LSTM technology, solutions for the control system, the human-computer interaction system and the water quality management system of high-efficiency unmanned water purification equipment suited to small-scale discrete operation are put forward. A DTU communication mode is used to realize centralized monitoring of remote multi-point deployments. The scheme has been tested and implemented in some remote areas of Fujian Province that the water supply network cannot cover, and its feasibility has been verified. It represents a good attempt in the field of village-to-village water supply projects, filling the gap of centralized, safe and controllable water supply in remote areas.

Keywords: LSTM · DTU · Water purification system · Water quality management

1 Introduction

With the expansion of human activities, the problem of water pollution is becoming more and more serious, and the centralized purification of residents' drinking water has received growing attention from the state and society. At present, densely populated villages in our country have realized tap water supply, but the country is vast and diverse, and there are many remote villages with inconvenient transportation that are far from towns; large water plants cannot supply water to them. Domestic water in these areas still depends on traditional mountain springs and groundwater, whose quality changes with the weather and environment and therefore carries great instability and health risks. With roads now reaching every village, our country is vigorously promoting village-to-village connection of tap water. Remote villages, however, are sparsely populated and far from cities and towns, so building standard waterworks there is unrealistic; supplying them from plants in nearby towns suffers from high cost, low utilization and


difficult maintenance of water pipelines. Therefore, for remote villages, establishing small waterworks is a better way to obtain stable and reliable water sources. However, since technicians able to use and maintain a water plant are limited, and the water purification process changes with the weather, the season and the surrounding environment, it is necessary to design an automatic, artificially intelligent water quality management system for such small-scale water plants. To facilitate management and maintenance by technicians, it must also support multi-point remote monitoring.

2 Task Objectives

Tap water production is a multi-variable, multi-task, multi-equipment process; it is a complex nonlinear system with time-varying, complex and stochastic characteristics, which makes the automation design of waterworks difficult. Generally, a water production system can be divided into three parts: water intake, water purification and water delivery. The system automates the management and control of the different parts and eventually coordinates them. To save space and energy, these three parts of a small waterworks are usually integrated into one set of equipment; the automation of small waterworks is therefore mainly the automated management and control of the water purification equipment. In this paper, the GXZ high-efficiency water purifier is selected as the purification equipment. The GXZ series of high-efficiency water purifiers optimizes the selection and combination of water purification processes such as dosing, mixing, coagulation, sedimentation, backwashing and disinfection. It can purify high-turbidity raw water into drinking water that meets the National Drinking Water Sanitation Standard (GB5749-2006) and is very suitable for small rural water plants (Fig. 1). This paper carries out automation development on the basis of the GXZ high-efficiency water purifier, mainly completing the automatic control of the water intake, dosing, mixing, coagulation, sedimentation, filtration, disinfection and

Fig. 1. GXZ high-efficiency water purification equipment


sludge backwashing processes. The system also realizes remote monitoring, and it can calculate the amount of reagent to dose from real-time data through an intelligent algorithm, saving chemical and energy consumption [1].

3 Water Purification Process

The process flow of the water purification equipment is shown in Fig. 2.

Fig. 2. Process flow of GXZ-30 water purification equipment

(1) Coagulant dosing. The common coagulants in tap water production are alum, Al2(SO4)3 and Al2Cl(OH)5, and iron salts such as FeCl3 and FeSO4. In the past, alum was often used as the coagulant in waterworks. Its reaction principle is that alum ionizes in water, and the aluminium ions form Al(OH)3 colloid. The reaction equations are:

KAl(SO4)2 → K+ + Al3+ + 2SO4^(2−)
Al3+ + 3H2O → Al(OH)3 + 3H+

Al(OH)3 colloid has a strong adsorption capacity; it adsorbs impurity particles in the water and produces a precipitate, thereby purifying the water. Recent studies have shown that alum contains harmful aluminium, which can contribute to Alzheimer's disease, so Al2Cl(OH)5, polyaluminium chloride (PAC), is now widely used as the coagulant. PAC has a stronger adsorptive capacity: it adsorbs not only suspended particles in the water but also, through its hydrolysis products, selectively adsorbs water-soluble substances, so a small amount brings high impurity-removal efficiency at low cost. The coagulant dosage is based on the turbidity of the raw water and is controlled by the dosing equipment, so as to optimize the purification effect and ensure the turbidity of the effluent. The dosing equipment can be an electric metering pump dosing system, a high negative pressure dosing device or a water wheel metering mixing dosing device.
(2) Mixing. The mixer mixes the raw water with the coagulant quickly so that the coagulant diffuses fully into the water, which benefits the follow-up reaction. The improved tubular static mixer fitted to the equipment has high efficiency and small resistance, and can save 20%–30% of the coagulant.
(3) Coagulation reaction. After mixing, the coagulant colloid changes from a highly stable state to an unstable one and completes coagulation with the help of Brownian motion and water flow. In the coagulation reaction, suspended impurities and colloidal particles in the water react to form uniform, coarse alum flocs, which benefits the subsequent solid-liquid separation. The inner and outer spiral grid coil reactor has high flocculation efficiency and strong shock resistance: when the value of delta exceeds 3.8, only 5–6 min of reaction time is needed to form uniform, dense flocs that precipitate easily, and the water yield can be roughly doubled compared with a conventional reactor.
(4) Precipitation. After the coagulation reaction, the water rises to the precipitation zone, where the alum flocs settle in the precipitator to achieve solid-liquid separation.
(5) Filtration. Filtration removes suspended particulate impurities in the water by means of a porous filling layer, removing particles larger than 2–5 μm. After mixing, coagulation and precipitation, most impurities and microorganisms have been removed from the raw water, but it still does not meet the drinking water standard; the residual impurities must be further removed by filtration. The filtered effluent is required to have a turbidity of less than 1 NTU, and no higher than 3 NTU in special cases.
(6) Backwash. During long-term use, retained material accumulates in the filter element, leading to declining filtration capacity and decreasing water flow. The filter element therefore needs to be cleaned by an autonomous backwashing process that removes the impurities remaining in it, in order to restore filter performance and prolong the service life of the filter element.
(7) Disinfection. After coagulation and filtration, the water is treated with disinfectant; chlorine dioxide is chosen here. Chlorine dioxide has a better germicidal effect than chlorine: its residual stays longer in the pipe network, penetrates the cell walls of microorganisms more readily and inhibits protein synthesis, so that the microorganisms are inactivated, and it requires only about 40% of the dosage of chlorine gas. In addition, the chloroform produced in the water is only 10% of that produced by chlorine gas, and it has a strong effect on deodorization, decolorization and iron and manganese removal. However, if too much chlorine dioxide is added, the turbidity of the water increases. Chlorine dioxide is less harmful to humans and animals: microbial enzymes are distributed on the surface of the cell membrane and are easily inactivated by chlorine dioxide attack, whereas in humans and animals the enzymes are in the cytoplasm, where chlorine dioxide cannot easily reach [2].

4 Control System Composition

The control system mainly covers the influent, coagulation, coagulant-aid and pH dosing, reaction precipitation and sludge discharge, filtration and backwashing, chlorine dioxide dosing and effluent detection. The system needs to complete the monitoring of the control


Fig. 3. Composition of water purification equipment GXZ30


equipment and instrumentation of each process and realize remote monitoring from different places. The composition of the water purification equipment is shown in Fig. 3. According to the control requirements, the system uses a Mitsubishi FX2N series PLC as the main controller, and Advantech's WebAccess is selected as the configuration software. WebAccess is Web-based configuration software that supports remote configuration: when the project needs maintenance or modification, the configuration can be handled over the network, and the capability and effect of on-site and remote monitoring are exactly the same, realizing zero distance between the site and the remote operator in a real sense. WebAccess can also stream the real-time scene picture from a camera to the monitoring screen, displayed together with the monitoring view to show the real situation on site. Therefore, WebAccess meets the needs of this system. The system controls the GXZ-30 water purification equipment through the PLC and completes the automatic control of each process link. The WebAccess configuration interface is developed on an industrial computer for on-site monitoring, and the data are uploaded to the cloud platform, where the WebAccess client software realizes remote monitoring. The system network topology is shown in Fig. 4.

Fig. 4. System network topology diagram


5 Calculation of Dosage Based on LSTM

For this topic, we conducted a range of visits and surveys in the relevant industries. We learned that there is currently no clear parameter index for the chemical dosage, and that besides water quality, the dosage is also affected by elevation, temperature, environmental humidity and season. Considering these factors, we selected an LSTM to calculate the dosage effectively in this study. Using available computing power, the dosage model and its calculation and analysis were built from roughly one year of sampled data from the relevant enterprises [3–5]. The Caffe neural network framework offers fast start-up, high speed and modularization; it is developed in C/C++, has excellent portability, and also provides a Python programming interface that is easy to develop with. The network model built with the Caffe framework is shown in Fig. 5:

Fig. 5. LSTM neural network model diagram

The monitoring data of 10 different water plants over one year and the experimental data of related enterprise laboratories were collected as reliable data sources and used to train the neural network model in this research. Because of the large volume of data, it is impossible to list all of it; Table 1 shows the dosing and monitoring data of one water plant over three days.


Table 1. Drug addition monitoring data (columns: time continuity marker, date, initial turbidity, precipitation time, PH, ambient temperature, water temperature, air humidity, reactive turbidity, dosage; the time continuity marker is 0 for the first sample of a day and 1 afterwards)

Date 121 — initial turbidity 330, PH 7.5, ambient temperature 10, water temperature 14, air humidity 32.7, dosage 500:
precipitation time (min): 0, 10, 20, 30, 40, 50, 60, 70, 80, 90
reactive turbidity: 350, 10.5, 9.06, 4.17, 3.11, 2.9, 2.53, 2.21, 1.7, 1.09

Date 122 — initial turbidity 200, PH 7.5, ambient temperature 11.7, water temperature 14.6, air humidity 33.1, dosage 450:
precipitation time (min): 0, 10, 20, 30, 40, 50, 60, 70, 80, 90
reactive turbidity: 210, 18.3, 6.68, 3.98, 3.48, 3.05, 2.32, 1.91, 1.28, 0.89

Date 123 — initial turbidity 350, PH 7.5, ambient temperature 10.5, water temperature 12, air humidity 30.9, dosage 550:
precipitation time (min): 0, 10, 20, 30, 40, 50, 60, 70, 80, 90
reactive turbidity: 356, 16.5, 11.7, 7.25, 6.93, 6.08, 5.37, 2.77, 1.07, 0.71

Introduction of the training parameters:
(A) Time continuity marker: mainly used to describe the time line in the recurrent neural network; it acts on the cont data of the source code.
(B) Date: since the reaction of the water body is also affected by season, the month-and-day date is converted into a numerical value for network training.
(C) Initial turbidity: the turbidity of the untreated water before dosing.
(D) Sedimentation time: the time after the reaction needed to obtain ideal water.
(E) PH value: has a certain impact on the dosage.


(F) Environmental temperature: one of the key factors affecting the dosage.
(G) Water temperature: another factor affecting the dosage.
(H) Air humidity: combined with seasonal variation, it also has a certain impact on the purification result.
(I) Turbidity after precipitation: the most critical factor, directly related to the dosage.
(J) Dosage: used as the data label, i.e. the value we need to obtain.
After completing training on the effective monitoring data, the trained neural network was tested on operating data; the testing process is shown in Fig. 6. To minimize the long training times that floating-point data would cause, all data values except the date were scaled up by a factor of 100 when the training data were written, so that the input training data contain no floating-point numbers.
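The paper builds its model with Caffe; purely as an illustration of the input/output arrangement of the parameters (A)–(J), here is a minimal PyTorch sketch of an LSTM regressor that maps a sequence of the eight per-step features (B)–(I) to the dosage label (J). Layer sizes and the feature count are assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

class DosageLSTM(nn.Module):
    def __init__(self, n_features=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, time steps, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # dosage read from the last time step

# Mirror the x100 integer scaling described above before feeding the network
model = DosageLSTM()
x = torch.randint(0, 1000, (4, 10, 8)).float()  # 4 sequences of 10 samples
print(model(x).shape)                            # torch.Size([4, 1])
```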

Fig. 6. Training results test

The experimental data come from actual operating data on April 19. The turbidity of the liquid is 329.5, the expected settling time is 70 min, the pH of the water body is 7.1, the ambient temperature is 22.3 °C, the water temperature is 21.4 °C, and the air humidity is 42.3%. The turbidity of the treated liquid is expected to be 1.0 (national standard ≤ 1.0). The calculated result is a dosage of 599 ml per 100 L of water. The result was then tested in the laboratory environment: the corresponding water body was actually inspected, and after 70 min of precipitation its turbidity was 0.98. The final result is basically consistent with the target, with an error of 2%. Because test data are difficult to obtain in this experiment, 20 further dosage calculation experiments were carried out in the same way to verify the availability of the model; the test data are listed in Table 2:


Table 2. Records of test results (all runs: date 508, ambient temperature 28.3, water temperature 27.6, air humidity 55.6, target turbidity 0.9)

Initial turbidity  Precipitation time  PH   Dosage  Actual turbidity  Turbidity error
500                30                  7.5  1500    0.94              4.44%
500                90                  6.5  932     0.91              1.11%
450                30                  7.5  1350    0.9               0%
450                90                  6.5  895     0.9               0%
400                30                  7.5  1213    0.89              1.11%
400                90                  6.5  770     0.9               0%
350                30                  7.5  1170    0.87              3.33%
350                90                  6.5  585     0.92              2.22%
300                30                  7.5  1050    0.9               0%
300                90                  6.5  557     0.92              2.22%
250                30                  7.5  975     0.86              4.44%
250                90                  6.5  528     0.93              3.33%
200                30                  7.5  764     0.9               0%
200                90                  6.5  437     0.9               0%
150                30                  7.5  638     0.9               0%
150                90                  6.5  356     0.89              1.11%
100                30                  7.5  467     0.88              2.22%
100                90                  6.5  278     0.87              3.33%
50                 30                  7.5  334     0.86              4.44%
50                 90                  6.5  257     0.9               0%

For test water with turbidity from 500 down to 50, weakly alkaline and acidic water with pH values of 7.5 and 6.5 was prepared, and twenty groups of data were tested at different settling times. To ensure that the effluent would conform to the national standard, a slightly over-strict target turbidity value of 0.9 was used in the actual tests. According to the final test results, the overall error is less than 5%, so the desired experimental results are achieved.

6 System Programming

This system uses a Mitsubishi FX2N-32MR PLC as the controller. The following is the program design for each process flow in the water purification process. The control of coagulant dosing includes both the coagulant and the coagulant-aid dosing. The coagulant is mainly PAC, whose solution must be stirred to prevent precipitation, so the coagulation dosing control mainly covers the metering pump and the mixer; the coagulant dosage itself is synthesized by the aforementioned algorithm after multiple parameters are measured. In addition to the PAM coagulant aid, a pH regulator is added when the pH value is too low. Similar to the coagulation part, this part also controls a metering pump and a mixer. The control flow is shown in Fig. 7.


Fig. 7. Coagulation aid and PH regulation control flow
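The actual controller is a Mitsubishi FX2N PLC program, but the shape of the automatic dosing loop can be sketched in Python; the sensor and pump interfaces below are hypothetical placeholders, not a real PLC API.

```python
import time

def dosing_loop(read_sensors, set_pump_rate, predict_dosage, period_s=10):
    # one simplified control cycle of the coagulant dosing subsystem
    while True:
        s = read_sensors()            # turbidity, pH, temperatures, humidity, mode
        if s["mode"] == "auto":
            rate = predict_dosage(s)  # e.g. the trained dosage model of Sect. 5
            set_pump_rate(rate)       # drive the metering pump
        time.sleep(period_s)          # wait for the next control cycle
```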

Because the system requires a remote monitoring function, the WebAccess configuration software is selected to develop the host computer monitoring system. WebAccess uses a Web browser to create and run the whole project, which places few demands on the operating system environment, and it offers a free client, which is very friendly for remote monitoring users. The management system can call the database through HTTP hyperlinks to link seamlessly with the upper management and scheduling system. According to the control requirements of the system, the


corresponding configuration monitoring interface is designed. The monitoring interface mainly includes the main interface, the management interface, the alarm interface and the report interface. The main interface shows the process flow and intuitively reflects the current state of the system; administrators can use it for simple control. The top of the main interface shows the four most important parameters of the water intake equipment: flow rate, turbidity of the inflowing and outflowing water, and residual chlorine, which together reflect the purification effect. The process flow chart of water purification sits in the middle of the interface; through it, the purification process and the operating status and parameters of each part can be clearly understood. At the bottom of the main interface are the concise parameters of each subsystem: the set points and real-time monitoring parameters of the subsystems, i.e. water intake, coagulant dosing, coagulant-aid dosing and sludge backwashing. On the left side of the interface are the interface switch buttons, which switch to the management interface of each subsystem after the administrator's rights are obtained. The main interface design is shown in Fig. 8.

Fig. 8. Main interface

The management interfaces mainly cover the configuration of the influent, coagulation, coagulant-aid, disinfection, sludge backwashing and filtration processes. After obtaining administrator rights, the corresponding values can be set and the system can be adjusted manually. The coagulant dosing management interface mainly sets the dosing mode: in manual mode, manual/automatic switching of the metering pump and the mixer can be selected; the automatic mode is the intelligent dosing mode, which controls the switching of the mixer according to the switching times set on the interface, while the real-time dosage is adjusted by the intelligent algorithm according to the influent temperature, turbidity and pH value. Real-time inlet and outlet water temperature and turbidity can also be seen on this interface. Figure 9 shows the coagulant dosing management interface:


Fig. 9. Management interface of coagulant feeding control system

7 Conclusion

Through the design and implementation of intelligently controlled automatic dosing and water purification, tap water supply in remote areas is realized. On the basis of LSTM technology, the control system, human-computer interaction system and water quality management system of high-efficiency unmanned water purification equipment for small-scale discrete operation are proposed, and a DTU communication mode is used to realize centralized monitoring of remote multi-point deployments. The scheme has been tested and implemented in some remote areas of Fujian Province without tap water. In the implementation of the design, experience in design, adjustment and operation has been accumulated, laying a good foundation for safe water supply projects in remote areas and building a reserve of experience and means for dealing with emergencies. This design makes a good attempt in the field of village-to-village tap water projects, filling the gap of small and micro-scale centralized, safe and controllable water supply in remote areas.

Acknowledgement. This research was funded by the Key Scientific and Technological Projects in Fujian Province (Grant GY-Z160011) and the Key Project of Science and Technology of the Fujian Education Department (Grant JA13212).

References
1. Xu, S., Yan, X., Liu, B., Li, B.: Application of intelligent control in coagulant drug feeding system for purified water. Water Supply Drain. China (13), 70–73 (2017). (in Chinese)
2. Huang, Z., Liu, J.: Case study of water plant renovation and expansion based on integrated water purification device. Water Supply Drain. China (22), 82–85 (2018). (in Chinese)
3. Zhang, J., Zhu, Y., Zhang, X., Ye, M., Yang, J.: Developing a long short-term memory (LSTM) based model for predicting water table depth in agricultural areas. J. Hydrol. 561, 918–929 (2018)


4. Huang, C.J., Kuo, P.H.: A deep CNN-LSTM model for particulate matter (PM2.5) forecasting in smart cities. Sensors 18(7), 2220 (2018)
5. Chemali, E., Kollmeyer, P., Preindl, M., Ahmed, R., Emadi, A.: Long short-term memory networks for accurate state of charge estimation of Li-ion batteries. IEEE Trans. Ind. Electron. 65, 6730–6739 (2017)

An Efficient and Flexible Diagnostic Method for Machinery Fault Detection Based on Convolutional Neural Network

Geng Wang, Baolong Guo(B), Cheng Li, Zhe Huang, and Jie Hu

School of Aerospace Science and Technology, Xidian University, Taibai South Road, Xi'an 710071, China
[email protected], [email protected], {licheng812,huangz}@stu.xidian.edu.cn, [email protected]

Abstract. In the field of feature extraction and machinery fault detection, intelligent fault diagnosis of rotating machinery has drawn much attention. This paper proposes an efficient and flexible diagnostic method based on convolutional neural network (CNN). The method directly feeds the original one-dimensional signals into the formulated network and adopts one-dimensional convolution kernels to extract representative features, which reduces complexity and time consumption. In the training process, the stochastic gradient descent (SGD) method with momentum is adopted to minimize the loss function of the formulated learning network, so that it can escape local minimum points and saddle points and speed up optimization. The experimental results demonstrate that the proposed method effectively identifies rolling bearing faults under different conditions.

Keywords: Rolling bearing · Fault diagnosis · Feature extraction · Convolutional neural network · Deep learning

1 Introduction

Since machinery equipment always works in severe environments, its core components can suffer great damage. Once the machinery equipment breaks down, it results in irreparable economic losses and even casualties [1]. Therefore, it is of great significance to monitor the health condition of machinery equipment and detect faults in their early stages [2]. Generally speaking, conventional fault diagnosis methods are mainly based on signal processing technology [3,4]. These methods work on extracting fault features of the machinery running condition, which requires considerable machinery expertise and a deep mathematical foundation; hence, the fault identification accuracy of these conventional methods is often unsatisfactory [5]. In recent years, artificial intelligence (AI) techniques, which can efficiently extract features and improve the accuracy of fault diagnosis, have attracted growing interest in machinery fault diagnosis [6,7]. As one of the representative algorithms in deep learning, CNN is able to adaptively capture useful information from the raw input signal through multiple non-linear transformations and approximate complex non-linear functions [8,9]. In recent years, CNN has been adopted in machinery fault diagnosis, and various related fault diagnostic methods have been proposed. In [10], the machinery vibration signal was converted into a frequency-domain signal as the input of a CNN, which achieved good fault identification performance. Liu et al. [11] proposed a dislocation layer for CNN and converted the one-dimensional signal into a two-dimensional signal, which identified the type of motor faults more accurately. Xia et al. [12] proposed a method that incorporates sensor fusion by taking advantage of the CNN structure to achieve higher diagnosis accuracy. This paper proposes an efficient and flexible method based on CNN for machinery fault diagnosis. In the training process, the SGD method with momentum is used to minimize the loss function of the formulated network, thus escaping local minimum points and saddle points. The method has excellent feature extraction ability and can automatically extract essential features from raw vibration data. While most machinery fault diagnostic methods convert vibration signals into two-dimensional signals as the input of CNN, the method proposed in this paper directly feeds the original one-dimensional signals into our formulated network and adopts one-dimensional convolution kernels, which facilitates the convolution operation and reduces complexity.

2 The Proposed Machinery Fault Diagnostic Method Based on CNN

In this section, an efficient and flexible diagnostic method for machinery fault detection based on CNN is proposed, which extracts features directly from the raw vibration data. Owing to the capability of automatic feature extraction, it is unnecessary for this method to manually extract features in different conditions. The overall framework of the proposed fault diagnostic method is demonstrated in Fig. 1. The main steps are summarized as follows:

Fig. 1. Framework of the proposed fault diagnostic method


Step 1: Collect raw vibration data from the rolling bearings of the experimental setup.
Step 2: Segment the raw data and construct the training and testing datasets.
Step 3: Design the network model based on CNN and initialize the weights, biases, and learning rates.
Step 4: Feed the training datasets into the formulated network.
Step 5: Calculate the error between the output of the last layer and the target values. The cross-entropy function is chosen as the loss function.
Step 6: Transmit the error backward and update the weights along with the biases layer by layer. In order to prevent falling into local minimum points and saddle points, the SGD method with momentum is applied to optimize the loss function of the proposed network in the training process.
Step 7: Iterate the forward and backward passes until the output error meets the predefined termination criterion.
Step 8: Feed the testing datasets into the well-trained network to classify the fault condition.

In a neural network, parameter optimization amounts to minimizing the loss function, and the primary way to optimize parameters is SGD. Nevertheless, plain SGD is not stable when updating parameters, so momentum is introduced into the SGD method in this paper. The SGD method with momentum alleviates the ill-conditioned Hessian problem, in which the gradient is highly sensitive to some directions of the parameter space, and it helps escape local minimum points and saddle points. The algorithm is described as

$$V' = \alpha V - \eta \frac{\partial E}{\partial \theta} \qquad (1)$$

$$\theta' = \theta + V' \qquad (2)$$

where E denotes the loss function, θ is the weight and bias before updating, θ' represents the weight and bias after updating, η refers to the learning rate, V denotes the momentum before updating, V' is the momentum after updating, and α represents the momentum decay parameter. Figure 2 depicts the comparison between the SGD method with momentum and the plain SGD method; the black line and the red line show the optimizing paths of SGD and SGD with momentum, respectively. The momentum term accelerates learning when the current gradient direction is consistent with the previous one; when they are inconsistent, the oscillation is suppressed. Therefore, the SGD method with momentum has a faster convergence rate than the plain SGD method.
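For concreteness, the update of Eqs. (1)–(2) can be written in a few lines of NumPy. This is a minimal sketch; the learning rate and momentum values here are placeholders, not values taken from the paper.

import numpy as np

def sgd_momentum_step(theta, v, grad, lr=0.01, alpha=0.9):
    """One update of Eqs. (1)-(2): V' = alpha*V - lr*dE/dtheta, theta' = theta + V'."""
    v_new = alpha * v - lr * grad        # Eq. (1): accumulate a decaying velocity
    theta_new = theta + v_new            # Eq. (2): move the parameters along the velocity
    return theta_new, v_new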


Fig. 2. The comparison of the SGD method with momentum and the SGD method

3 Experimental Demonstrations

In this section, the effectiveness of the proposed intelligent machinery fault diagnosis method based on CNN is verified on experimental datasets. The program of the formulated model is developed in Python 3.6.7 with the TensorFlow deep learning library and runs on Windows 10 with an RTX 2080 Ti GPU.

3.1 Data Preparation

The vibration data for this experiment come from the public datasets of the Case Western Reserve University (CWRU) Bearing Data Center website. In this experiment, the rolling bearing on the motor drive end is used as the diagnostic object. Three kinds of faults are simulated by using electrical discharge machining to add damage to the inner race, outer race, and ball of the testing bearings; the fault diameter is 0.014 in. The signals are collected from the acceleration sensor on the upper side of the motor drive end, with the sampling frequency set to 48 kHz. The time-domain plots of the normal condition, the inner race fault, the outer race fault, and the ball fault are illustrated in Fig. 3. However, the collected vibration records are too long for this experiment: fed into the proposed network directly, they would incur too much computational cost, so the data are segmented before the experiment. As a result, there are 168 samples in the normal condition, 140 samples in the inner race fault condition, 168 samples in the outer race fault condition, and 168 samples in the ball fault condition; each sample consists of 10000 data points after segmenting. These 644 samples are divided into training and testing datasets. In order to ensure the fairness of the experiment, the samples were shuffled and 500 of them were randomly selected as training samples; the remaining 144 samples were used as testing samples. Table 1 shows the sample sizes and labels for the different fault conditions. The labels are compared with the output of the formulated network to calculate the error and optimize the parameters.
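The segmentation step can be sketched as follows. This sketch assumes non-overlapping windows, since the paper does not state whether segments overlap.

import numpy as np

def segment_signal(signal, length=10000):
    """Cut a raw vibration record into consecutive samples of `length` points,
    discarding the incomplete tail."""
    n = len(signal) // length
    return np.stack([signal[k * length:(k + 1) * length] for k in range(n)])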


Fig. 3. The time domain figures for rolling bearings vibration signals: (a) Normal condition; (b) Inner race fault; (c) Outer race fault; (d) Ball fault

Table 1. Rolling bearing fault samples distribution

Rolling bearing condition | Training sample | Testing sample | Label
Normal condition          | 130             | 38             | [0.99, 0.01, 0.01, 0.01]
Inner race fault          | 102             | 38             | [0.01, 0.99, 0.01, 0.01]
Outer race fault          | 133             | 35             | [0.01, 0.01, 0.99, 0.01]
Ball fault                | 135             | 33             | [0.01, 0.01, 0.01, 0.99]
Total                     | 500             | 144            |

3.2 Model Construction

The proposed CNN-based network is depicted in Fig. 4; it has two convolutional layers and two pooling layers, followed by two fully connected layers and a softmax layer at the end. The convolutional and pooling layers extract representative features from the input signals, while the fully connected layers and the softmax layer identify the machinery condition. In the proposed model, the input one-dimensional signal is 10000×1. After two convolutional layers with multiple 100×1 convolution kernels and two pooling layers, the output is sixteen feature maps of size 2500×1. The feature maps are then flattened and connected to a fully connected layer; after two fully connected layers, a 100×1 vector is passed on. Finally, the data are classified into four patterns by the softmax layer. The ReLU function performs activation in the convolutional layers, and the activation function of the fully connected layers is the sigmoid function. The "dropout" strategy is adopted in the fully connected layers to overcome overfitting. The loss function used in the model is the cross-entropy function. Backpropagation with the SGD method with momentum is employed to update the model parameters. The learning rate is set to 0.0001 and the batch size is set to 1.
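A minimal Keras sketch of the described architecture is given below. The filter count of the first convolutional layer, the width of the first fully connected layer, the dropout rate, and the momentum value are assumptions, since the paper does not state them.

from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(10000, 1)),
    layers.Conv1D(16, 100, padding='same', activation='relu'),  # 100x1 kernels; filter count assumed
    layers.MaxPooling1D(2),
    layers.Conv1D(16, 100, padding='same', activation='relu'),
    layers.MaxPooling1D(2),                                     # sixteen 2500x1 feature maps
    layers.Flatten(),
    layers.Dense(500, activation='sigmoid'),                    # first FC width assumed
    layers.Dropout(0.5),                                        # dropout rate assumed
    layers.Dense(100, activation='sigmoid'),                    # 100x1 output of the second FC layer
    layers.Dense(4, activation='softmax'),                      # four fault patterns
])
model.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),  # momentum value assumed
              loss='categorical_crossentropy')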

Fig. 4. The detailed architecture of the proposed network based on CNN

Fig. 5. The multi-classification confusion matrix of the proposed method

3.3 Experimental Result

As mentioned in Sect. 3.1, there are 144 testing samples in total: 38, 38, 35, and 33 samples in the normal, inner race fault, outer race fault, and ball fault conditions, respectively. The testing datasets are used to verify the classification accuracy of the proposed method; after training, the classification results for the testing datasets are output by the last layer. Figure 5 demonstrates the classification accuracy of the proposed method with a confusion matrix, in which the rows and the columns represent the predicted labels and the true labels, respectively.

Fig. 6. Comparison of classification methods

Fig. 7. Comparison of the classification accuracy of different methods


Among the 144 testing samples, 143 are properly classified and one is misjudged: an inner race fault sample is incorrectly identified as a ball fault. Hence, the proposed method achieves a classification accuracy of 99.3%. The classification accuracy for the normal condition, the outer race fault condition, and the ball fault condition reaches 100%, and the diagnostic accuracy for the inner race fault condition reaches 97.4%. It can be seen that the proposed method obtains a desirable result. The classification results of the testing datasets are presented in Fig. 6 for better visualization of the performance. The distributions of the normal, outer race fault, and ball fault conditions are concentrated and almost gathered at single points, which indicates that the proposed method classifies these three conditions well. However, the distribution of the inner race fault condition is scattered, and one inner race fault sample falls into the ball fault area, which indicates that the classification of the inner race fault condition still needs improvement. To demonstrate the superiority of the proposed machinery fault diagnostic method, three popular existing methods, the K-nearest neighbor algorithm (KNN), artificial neural network (ANN), and support vector machine (SVM), are introduced for comparison. The results are illustrated in Fig. 7: the classification accuracies of KNN, ANN, and SVM are 31.2%, 53.5%, and 69.4%, respectively, while the classification accuracy of the method proposed in this paper is 99.3%.

4 Conclusions

In this paper, an efficient and flexible diagnostic method based on CNN is proposed for feature extraction and fault identification in machinery fault diagnosis. The proposed method extracts features directly from the raw vibration data for learning. In order to handle the problem of local minimum points and saddle points, the SGD method with momentum is applied to optimize the loss function in the training process. The results of experiments with rolling bearing data demonstrate the superiority of the proposed method. The method achieves a satisfactory accuracy and offers an automatic feature extraction approach, which is practical and convenient in machine fault diagnosis. Compared with conventional intelligent methods, this method is more effective and more robust in identifying fault categories of rolling bearing. Acknowledgment. This work is supported by the National Natural Science Foundation of China (61571346). The research is also supported by the Fundamental Research Funds for the Central Universities and the Innovation Fund of Xidian University. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.


References
1. Cui, L.L., Gong, X.Y., Zhang, J.Y., Wang, H.Q.: Double-dictionary matching pursuit for fault extent evaluation of rolling bearing based on the Lempel-Ziv complexity. J. Sound Vib. 385, 372–388 (2016)
2. He, Q.B., Wu, E.H., Pan, Y.Y.: Multi-scale stochastic resonance spectrogram for fault diagnosis of rolling element bearings. J. Sound Vib. 420, 174–184 (2018)
3. Tu, X.T., Hu, Y., Li, F., Abbas, S., Liu, Z., Bao, W.J.: Demodulated high-order synchrosqueezing transform with application to machine fault diagnosis. IEEE Trans. Ind. Electron. 66(4), 3071–3081 (2018)
4. Cui, L.L., Huang, J.F., Zhang, F.B.: Quantitative and localization diagnosis of a defective ball bearing based on vertical-horizontal synchronization signal analysis. IEEE Trans. Ind. Electron. 64(11), 8695–8706 (2017)
5. Xiang, J.W., Zhong, Y.T.: A novel personalized diagnosis methodology using numerical simulation and an intelligent method to detect faults in a shaft. Appl. Sci. 6(12), 414 (2016)
6. He, W.P., Wang, G., Hu, J., Li, C., Guo, B.L., Li, F.P.: Simultaneous human health monitoring and time-frequency sparse representation using EEG and ECG signals. IEEE Access 7, 85986–85994 (2019)
7. Liu, R.N., Yang, B.Y., Zio, E., Chen, X.F.: Artificial intelligence for fault diagnosis of rotating machinery: a review. Mech. Syst. Signal Process. 108, 33–47 (2018)
8. He, W.P., Huang, Z., Wei, Z.F., Li, C., Guo, B.L.: TF-YOLO: an improved incremental network for real-time object detection. Appl. Sci. 9(16), 3225 (2019)
9. Zhao, R., Yan, R.Q., Chen, Z.H., Mao, K.Z., Wang, P., Gao, R.: Deep learning and its applications to machine health monitoring. Mech. Syst. Signal Process. 115, 213–237 (2019)
10. Jing, L.Y., Zhao, M., Li, P., Xu, X.Q.: A convolutional neural network based feature learning and fault diagnosis method for the condition monitoring of gearbox. Measurement 111, 1–10 (2017)
11. Liu, R.N., Meng, G.T., Yang, B.Y., Sun, C., Chen, X.F.: Dislocated time series convolutional neural architecture: an intelligent fault diagnosis approach for electric machine. IEEE Trans. Ind. Inform. 13(3), 1310–1320 (2016)
12. Xia, M., Li, T., Lin, X., Liu, L.Z., De Silva, C.: Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks. IEEE/ASME Trans. Mechatron. 23(1), 101–110 (2017)

Image Super-Resolution Reconstruction Based on Multi-scale Convolutional Neural Network

Jianqiao Song and Feng Wang(&)

College of Information and Computer Science, Taiyuan University of Technology, Taiyuan, Shanxi, China
[email protected], [email protected]

Abstract. Image super-resolution methods based on convolutional neural networks suffer from problems such as heavy computation, large numbers of parameters, and blurred reconstructed images. This paper proposes an image super-resolution reconstruction algorithm based on a multi-scale convolutional neural network. The multi-scale convolution kernel method is introduced into the convolutional neural network: convolutional layers of different kernel sizes realize multi-scale feature extraction while the number of network parameters is reduced. Maxout is used as the activation function to introduce competing units. In addition, the skip connection of residual networks is added to the network model to accelerate the training of the deep neural network. Experiments show that both the subjective visual quality and the objective evaluation of this algorithm are improved to a certain extent: while the number of network parameters is reduced, the reconstructed high-resolution images have clearer edges and recover more detailed image information.

Keywords: Convolutional neural network · Deep learning · Super-resolution reconstruction · Multi-scale features

1 Introduction

Image super-resolution reconstruction refers to the restoration of high-resolution images from low-resolution images or image sequences [1]. The recovered details have important value in fields such as medical imaging, aerospace, and high-definition television. Super-resolution reconstruction technology overcomes some limitations of hardware, and the development of image super-resolution technology is receiving more and more attention; with its development, the application field is constantly expanding and its prospects are increasingly broad.

(Research project supported by the Shanxi Scholarship Council of China (NO. 2017-049) and by the State Key Laboratory of Air Traffic Management System and Technology (NO. SKLATM201803).)

In the 1960s, the concept of "image super-resolution" was first proposed [2]. Harris and Goodman [3] pointed out that single image reconstruction amounts to linear interpolation and spline interpolation of a single image. The basic idea of image reconstruction is to require that the high-resolution image generated by the reconstruction technique be as consistent as possible with its original low-resolution image [4]. Subsequently, Tsai and Huang [5] innovatively proposed a super-resolution reconstruction technique that combines multiple low-resolution images to reconstruct a high-resolution image, together with an image super-resolution reconstruction technique based on the frequency domain. In recent years, with the development of image processing technology, more and more super-resolution reconstruction methods have been proposed. In the frequency domain, Kim [6] and Wen [7] improved the algorithms of Tsai and Huang, adding processing for local motion and noise. Due to the limitations of frequency-domain algorithms, spatial-domain approaches subsequently became a hot topic in image research. In 1989, Stark [8] proposed Projection onto Convex Sets (POCS) based on set projection theory; thanks to its fast convergence, this method significantly improves speed and became one of the most important methods in the field of image super-resolution reconstruction, with subsequent literature [9–11] improving on it. In 1991, Iterative Back-Projection (IBP) was proposed by Irani [12] and others. Rasti et al. [13] proposed adding bicubic down-sampling and bicubic interpolation to each IBP iteration, which reduces the mean square error of each iteration. Dai et al. [14] proposed an IBP method based on bilateral filtering, which adds information from the feature domain in order to preserve edges. Katsaggelos and Tom proposed Maximum Likelihood (ML) estimation [15], which estimates high-resolution images by estimating the noise variance and sub-pixel displacement of the low-resolution images and then applying expectation maximization. In 1995, Stevenson and Schultz [16] proposed the application of Maximum a Posteriori (MAP) estimation, which improved image quality, and subsequent literature [17–20] improved and optimized MAP-based image super-resolution reconstruction. Feuer and Elad [21] proposed a hybrid POCS and MAP method that combines the advantages of both: it makes full use of prior knowledge and also enables the reconstruction process to converge stably. Learning-based image super-resolution technology has become one of the hot research directions in recent years. Its main idea is to establish a mapping model by training on high- and low-resolution image pairs to learn the intrinsic relationship between the two; the high-resolution image corresponding to a low-resolution image to be reconstructed is then obtained through this model. Three typical methods are the neighborhood embedding (NE), sparse coding (SC), and anchored neighborhood regression (ANR) algorithms. However, such methods often suffer from excessive computation and overfitting. In 2014, Dong et al. [23] used a convolutional neural network to implement the Super-Resolution Convolutional Neural Network (SRCNN), which simulates and reproduces the mapping process between LR and HR images learned from sparse representations.
The pre-processed low-resolution images are input into an end-to-end deep convolutional neural network, and the mapping relationship between low-resolution images and high-resolution images is gradually learned. Owing to end-to-end training, this deep learning method greatly improves the reconstruction effect compared with traditional methods. Kim et al. then proposed DRCN [24], which uses a recursive neural network and increases the number of convolution layers; compared with SRCNN, its reconstruction effect is improved. In this paper, aiming at the problems of heavy computation, large numbers of parameters, and blurred image texture in super-resolution image reconstruction, a multi-scale super-resolution convolutional neural network (MSSRCNN) is proposed. The architecture of the model is elaborated, and experimental comparisons of different image super-resolution algorithms are carried out.

2 Image Super-Resolution Reconstruction Based on Multi-scale Convolutional Neural Network

2.1 Model Architecture

Firstly, the network flow chart of the single image super-resolution reconstruction algorithm based on the multi-scale convolutional neural network designed in this paper is introduced, as shown in Fig. 1.

Fig. 1. Block diagram of the algorithm

First assume that the output of the previous layer is $Y_{l-1}$, which contains $n_{l-1}$ feature maps. Feeding this input into the multi-scale convolution kernels produces a set of feature maps $v_l^k$:

$$v_l^k = W_l^k * Y_{l-1} + B_l^k \qquad (1)$$

where $W_l^k$ denotes the $n_l$ convolution kernels of the $k$-th scale, each of size $n_{l-1} \times f_l^k \times f_l^k$, $B_l^k$ is the bias, and $*$ denotes convolution. The multi-scale convolution kernel model generates $K \times n_l$ feature maps for each convolutional layer; in Fig. 2, $K = 4$. All the feature maps obtained above are then divided into $n_l$ groups, where the $i$-th group contains the $K$ feature maps $v_l^{i1}, v_l^{i2}, \ldots, v_l^{iK}$, on which the Maxout function performs the maximization. At location $(x, y)$, the output of the $i$-th group through the Maxout function can be expressed as:

$$Y_l^i(x, y) = \kappa\left(v_l^{i1}(x, y), v_l^{i2}(x, y), \ldots, v_l^{iK}(x, y)\right) \qquad (2)$$

where $\kappa(\cdot)$ represents the Maxout activation function.

Fig. 2. Multiscale convolution competition model

As shown in Fig. 2, $K$ is taken as 4 in the multi-scale convolutional layer, which contains convolution kernels of sizes $f_l^1 = 3 \times 3$, $f_l^2 = 5 \times 5$, $f_l^3 = 9 \times 9$, and $f_l^4 = 13 \times 13$. In the figure, the number of kernels at each scale is assumed to be $n_l^1 = n_l^2 = n_l^3 = n_l^4 = 4$. The 16 feature maps output after convolution are divided into 4 groups, and the Maxout activation function maximizes over each group of 4 feature-map subspaces [25]: the largest response in each subspace is used to represent the information of that subspace, so the Maxout function outputs only the maximum value of the subspace. At each iteration of the training process, the convolutional layer maps the features, the activation function passes the most active units, and the layer finally outputs 4 feature maps. When $f_l^1 = 3$, $f_l^2 = 5$, $f_l^3 = 9$, $f_l^4 = 13$, we denote the multi-scale convolution layer as $\{3, 5, 9, 13\}$. The model proposed in this paper uses Maxout as the activation function and introduces competing units. This makes the convolution kernels compete with each other while reducing the number of parameters the network needs to compute; competition between units effectively prevents mutual adaptation between convolution kernels and prevents overfitting. Secondly, after the input image passes through the multi-scale convolution kernels, the generated feature maps cover different receptive-field ranges for super-resolution. Finally, we connect the skip connection of the residual network [26] to the network model of this paper, adding the input image directly to the output of the last layer, which accelerates the training of the deep neural network. Our network removes Batch Normalization (BN), which saves a lot of GPU memory and enables the network to be built under limited hardware conditions.
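A possible TensorFlow sketch of the multi-scale competitive layer follows: one convolution per scale produces $n_l$ maps each, and the k-th maps of all scales form a group over which the element-wise maximum is taken. The filter count uses the illustrative $n_l = 4$ from Fig. 2.

import tensorflow as tf
from tensorflow.keras import layers

def multi_scale_maxout(x, n_l=4, scales=(3, 5, 9, 13)):
    """Multi-scale convolution followed by cross-scale Maxout, per Eqs. (1)-(2)."""
    # One convolution per scale; each branch outputs n_l feature maps.
    branches = [layers.Conv2D(n_l, k, padding='same')(x) for k in scales]
    # Stack the K branches and take the element-wise maximum per group,
    # so only the strongest scale response survives for each of the n_l maps.
    stacked = tf.stack(branches, axis=-1)   # shape: (batch, H, W, n_l, K)
    return tf.reduce_max(stacked, axis=-1)  # shape: (batch, H, W, n_l)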


The number of parameters of the multi-scale convolutional neural network proposed in this paper is calculated as follows. First, assume that the network is composed of convolutional layers and a fully connected layer. The number of convolution kernel parameters is

$$p_{num} = \sum_{l=1}^{N} f_l \cdot f_l \cdot n_l \cdot n_{l-1} \qquad (3)$$

In the formula, $n_0 = 1$. As shown in Table 1, we compare the number of convolution kernel parameters of SRCNN with that of the multi-scale convolutional neural network proposed in this paper. In the SRCNN configurations 9-5-5 and 13-7-5, $n_1 = 192$, $n_2 = 64$, $n_3 = 1$; in the multi-scale convolutional neural networks, $n_1^1 = n_1^2 = n_1^3 = n_1^4 = 48$, $n_2 = 64$, $n_3 = 1$. The multi-scale networks require far fewer convolution kernel parameters than SRCNN, which greatly improves the training speed of the network.
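Equation (3) can be checked numerically. The sketch below reproduces the counts in Table 1, on the assumption that after the Maxout competition the multi-scale first layer contributes 48 effective output channels to the next layer.

def conv_params(kernels, channels, n0=1):
    """p_num = sum_l f_l * f_l * n_l * n_(l-1), Eq. (3); biases are not counted."""
    total, n_prev = 0, n0
    for f, n in zip(kernels, channels):
        total += f * f * n * n_prev
        n_prev = n
    return total

# SRCNN 9-5-5 with n1=192, n2=64, n3=1 -> 324352, matching Table 1
print(conv_params([9, 5, 5], [192, 64, 1]))
# {3-5-9-13}-5-5: 48 kernels per scale in layer 1, then a 5-5 tail on 48 channels -> 92032
first_layer = sum(f * f * 48 * 1 for f in (3, 5, 9, 13))
print(first_layer + conv_params([5, 5], [64, 1], n0=48))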

Table 1. Number of parameters of different networks

Network        | Number of parameters
9-5-5          | 324352
13-7-5         | 636160
{3-5-9-13}-5-5 | 92032
{3-5-9-13}-7-5 | 165760

2.2 Network Training and Loss Function

First we introduce the loss function used in this paper. Let X be the original high-resolution image, Y the input low-resolution image, and $F(Y; \Theta)$ the network mapping before the final summation, where $\Theta$ denotes the network parameters. We use the loss function of residual learning [26]; with $\hat{X} = Y + F(Y; \Theta)$ as the final output of the network and $R = X - Y$ the residual, the loss is defined as

$$L = \frac{1}{2}\left\|X - \hat{X}\right\|^2 = \frac{1}{2}\left\|X - \left(Y + F(Y; \Theta)\right)\right\|^2 = \frac{1}{2}\left\|R - F(Y; \Theta)\right\|^2 \qquad (4)$$

For given pairs of low-resolution images $Y_i$ and their corresponding high-resolution images $X_i$, with n the total number of training samples,

$$L(\Theta) = \frac{1}{n} \sum_{i=1}^{n} \left\|R_i - F(Y_i; \Theta)\right\|^2 \qquad (5)$$

This paper minimizes the loss function using the standard backpropagation stochastic gradient descent algorithm [4, 27]. The gradient calculation and weight update formulas for layer l at the i-th iteration are as follows:

$$\Delta_{i+1} = \gamma \Delta_i - \eta \frac{\partial L}{\partial W_i^l} \qquad (6)$$

$$W_{i+1}^l = W_i^l + \Delta_{i+1} \qquad (7)$$

The weights are initialized with a standard deviation of 0.01 and the biases to 0; η represents the learning rate, set to $10^{-3}$; the batch size is 32; and the momentum γ is 0.9.

3 Experimental Results and Analysis

3.1 Data Set

(1) Experiment platform: The algorithm in this paper is trained on a lab server with the TensorFlow platform installed. The hardware configuration of the server is 8 GB of CPU memory and an NVIDIA GTX 1070 graphics card with 8 GB of GDDR5 memory.
(2) Data set: For the fairness of the experiment, we use the 91-image data set of Yang et al. [27] as the training data; as SRCNN verified that deep learning models usually benefit from large data sets for training, we rotate each of the 91 images clockwise by 0°, 90°, 180°, and 270° and then mirror them, expanding the data set to 728 images. Each sub-block size is 36; patches are extracted from the extended training set with a stride of 10, and 200,000 sub-blocks are then randomly selected. Set5 and Set14 are selected as the test sets. A sketch of this augmentation and patch extraction is given below.
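This sketch assumes single-channel NumPy images and non-padded windows; it illustrates the rotations, mirroring, and 36×36 sub-block extraction with stride 10 just described.

import numpy as np

def augment(images):
    """0/90/180/270-degree rotations plus mirrors: 91 images -> 91*4*2 = 728."""
    out = []
    for img in images:
        for k in range(4):
            rot = np.rot90(img, k)
            out.extend([rot, np.fliplr(rot)])
    return out

def extract_patches(img, size=36, stride=10):
    """Slide a size x size window with the given stride over the image."""
    h, w = img.shape[:2]
    return [img[y:y + size, x:x + size]
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]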

3.2 Experimental Results and Analysis

(3) Experimental results: The image quality is evaluated in two parts. The first part uses two objective image quality measures, structural similarity (SSIM) and peak signal-to-noise ratio (PSNR); the second part is a subjective judgment of image quality. We show the reconstruction effect of different super-resolution reconstruction algorithms on the test sets, so that the reconstruction performance of each algorithm can be judged by observing the visual effect after image reconstruction.

Results and Comparisons of Different Magnifications
This part compares the performance of different image super-resolution algorithms on the test sets under different magnifications, using two objective image quality evaluation criteria: structural similarity (SSIM) and peak signal-to-noise ratio (PSNR). The higher the PSNR, the smaller the distortion.


The closer the SSIM is to 1, the better the reconstruction. The comparison results are shown in Tables 2 and 3.
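For reference, PSNR can be computed as below; this is a common definition assuming 8-bit images, while SSIM is more involved and is usually taken from an image-processing library.

import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB: higher PSNR means smaller distortion."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)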

Table 2. Average PSNR (dB) of datasets Set5, Set14, Urban100, and BSD100 at 2, 3, and 4 magnification

Dataset  | Scale | Bicubic | A+    | SRCNN [23] | SelfExSR | Method of this paper
Set5     | 2     | 33.65   | 36.54 | 36.66      | 36.49    | 37.25
Set5     | 3     | 30.39   | 32.58 | 32.75      | 32.58    | 33.46
Set5     | 4     | 28.42   | 30.28 | 30.48      | 30.31    | 31.12
Set14    | 2     | 30.34   | 32.28 | 32.29      | 32.22    | 32.79
Set14    | 3     | 27.55   | 29.13 | 29.31      | 29.16    | 29.61
Set14    | 4     | 26.09   | 27.32 | 27.59      | 27.40    | 27.82
Urban100 | 2     | 26.88   | 29.20 | 29.51      | 29.54    | 29.91
Urban100 | 3     | 24.46   | 26.03 | 26.24      | 26.44    | 26.89
Urban100 | 4     | 23.16   | 24.32 | 24.54      | 24.79    | 25.01
BSD100   | 2     | 29.56   | 31.21 | 31.36      | 31.18    | 31.54
BSD100   | 3     | 27.21   | 28.29 | 28.41      | 28.29    | 28.63
BSD100   | 4     | 25.96   | 26.82 | 26.91      | 26.84    | 27.09

Table 3. Average SSIM for datasets Set5, Set14, Urban100, and BSD100 at 2, 3, and 4 magnification

Dataset | Scale | Bicubic | A+     | SRCNN  | SelfExSR | Method of this paper
Set5    | 2     | 0.9299  | 0.9544 | 0.9542 | 0.9537   | 0.9576
Set5    | 3     | 0.8681  | 0.9088 | 0.9090 | 0.9092   | 0.9189

From the test results in Tables 2 and 3, the structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) indexes of the proposed algorithm are higher than those of the bicubic interpolation, SRCNN, and SelfExSR algorithms; the underlined values in the tables indicate the best objective indicators among the compared image super-resolution algorithms. On the Set14 test set at 3x magnification, the peak signal-to-noise ratio (PSNR) of the proposed algorithm is improved by about 2.06 dB, 0.48 dB, 0.3 dB, and 0.45 dB compared with the bicubic interpolation, A+, SRCNN, and SelfExSR algorithms, respectively, and the structural similarity (SSIM) increases by 5.41, 1.05, 0.68, and 0.97 percentage points, respectively.

Display and Compare Results with Magnification
We select images from the Set5 data set for the comparison of test results against the bicubic interpolation, SRCNN, and SelfExSR algorithms, choosing 3 images from Set5 to display the reconstruction effect of the various image super-resolution algorithms.


Fig. 3. Baby super-resolution image reconstruction results

Fig. 4. Butterfly super-resolution image reconstruction results

Fig. 5. Woman super-resolution image reconstruction results

In Figs. 3, 4 and 5, we partially enlarge the eye of the baby, the color change of the butterfly wings, the eye of the bird, and the mesh hat of the person. The results of the bicubic interpolation algorithm are very unclear and the edges are blurred.


For the edge details of the mesh hat in Fig. 5, the results of the SRCNN and SelfExSR algorithms show the edges slightly, but a gap with the original image remains. The results of the proposed algorithm extract more high-frequency image details than the first three algorithms, and the edges are clearer and more complete.

3.3 Conclusion and Outlook

In this paper, the architecture of the image super-resolution reconstruction model based on a multi-scale convolutional neural network is described in detail and then compared with the bicubic interpolation, SRCNN, and SelfExSR algorithms. Experiments show that the proposed algorithm achieves a certain degree of improvement in both subjective visual quality and objective evaluation compared with the previous algorithms, and it reduces the number of network parameters while the reconstructed high-resolution images have clearer edges and recover more detailed image information. For convolutional neural networks, deeper network models tend to capture higher-dimensional image characteristics during training, so the depth of the network deserves continued study.

References
1. Maheta, N.D.: A comparative study of image super resolution approaches. In: International Conference on Electronics Computer Technology, pp. 129–133. IEEE (2011)
2. Harris, J.L.: Diffraction and resolving power. J. Opt. Soc. Am. (1917–1983) 54, 931–933 (1964)
3. Goodman, J.W., Cox, M.E.: Introduction to Fourier Optics. McGraw-Hill, New York (1968)
4. Shen, H.F., Li, P.X., Zhang, L.P., et al.: Overview on super resolution image reconstruction. Opt. Tech. 35(2), 194–199+203 (2009)
5. Huang, T.S.: Advances in Computer Vision and Image Processing. Jai Press (1986)
6. Kim, S.P., Bose, N.K., Valenzuela, H.M.: Recursive reconstruction of high resolution image from noisy undersampled multiframes. IEEE Trans. Acoust. Speech Signal Process. 38(6), 1013–1027 (1990)
7. Wen, Y.S., Kim, S.P.: High-resolution restoration of dynamic image sequences. Int. J. Imaging Syst. Technol. 5(4), 330–339 (2010)
8. Stark, H., Oskoui, P.: High-resolution image recovery from image-plane arrays, using convex projections. J. Opt. Soc. Am. A Opt. Image Sci. 6(11), 1715 (1989)
9. Wernick, M.N., Chen, C.T.: Method of recovering tomographic signal elements in a projection profile or image by solving linear equations: US, US 5323007 A (1994)
10. Stark, H., Olsen, E.T.: Projection-based image restoration. J. Opt. Soc. Am. A 9(9), 1914–1919 (1992)
11. Xie, T.: Super-resolution image restoration using improved POCS algorithm. Electron. Des. Eng. 21(18), 142–144 (2013)
12. Irani, M., Peleg, S.: Improving resolution by image registration. CVGIP: Graph. Models Image Process. 53(3), 231–239 (1991)
13. Rasti, P., Demirel, H., Anbarjafari, G.: Improved iterative back projection for video super-resolution. In: Signal Processing and Communications Applications Conference, pp. 552–555. IEEE (2014)


14. Dai, S., Han, M., Wu, Y., et al.: Bilateral back-projection for single image super resolution. In: IEEE International Conference on Multimedia and Expo, pp. 1039–1042. IEEE (2007)
15. Tom, B.C., Katsaggelos, A.K.: Reconstruction of a high-resolution image by simultaneous registration, restoration, and interpolation of low-resolution images. In: 1995 Proceedings of the International Conference on Image Processing, p. 2539. IEEE (1995)
16. Schultz, R.R., Stevenson, R.L.: Extraction of high-resolution frames from video sequences. IEEE Trans. Image Process. 5(6), 996–1011 (1996)
17. Schultz, R.R., Stevenson, R.L.: Motion-compensated scan conversion of interlaced video sequences. In: Proceedings of SPIE - The International Society for Optical Engineering (1996)
18. Hardie, R.C., Barnard, K.J., Armstrong, E.E.: Joint MAP registration and high-resolution image estimation using a sequence of undersampled images. IEEE Trans. Image Process. 6(12), 1621–1633 (2002)
19. Zhang, L., Yang, J., Xuebin: Improved maximum posterior probability estimation method for single image super-resolution reconstruction. Prog. Laser Optoelectron. 48(1), 78–85 (2011)
20. Xujin: Research on Image Super Resolution Reconstruction Based on MAP Technology. Ocean University of China (2007)
21. Elad, M., Feuer, A.: Restoration of single super-resolution image from several blurred. IEEE Trans. Image Process. 6, 1646–1658 (1997)
22. Ouwerkerk, J.D.V.: Image super-resolution survey. Image Vis. Comput. 24(10), 1039–1052 (2006)
23. Dong, C., Chen, C.L., He, K., et al.: Learning a deep convolutional network for image super-resolution, vol. 8692, pp. 184–199 (2014)
24. Kim, J., Lee, J.K., Lee, K.M.: Deeply-recursive convolutional network for image super-resolution. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1637–1645. IEEE Computer Society (2016)
25. Zeiler, M.D., Fergus, R.: Stochastic pooling for regularization of deep convolutional neural networks. arXiv preprint arXiv:1301.3557 (2013)
26. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition, pp. 770–778 (2015)
27. Lecun, Y., Bottou, L., Bengio, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

PLSTM: Long Short-Term Memory Neural Networks for Propagatable Traffic Congested States Prediction

Yuxin Zheng1,2, Lyuchao Liao1,2(&), Fumin Zou1,2, Ming Xu3, and Zhihui Chen1,2

1 Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected], [email protected]
2 Fujian Provincial Big Data Institute for Intelligent Transportation, Fujian University of Technology, Fuzhou 350118, Fujian, China
3 College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300222, China

Abstract. The accurate prediction of traffic congested states in major cities is indispensable for urban traffic management and public travel route planning. However, the understanding of traffic congestion propagation has not received much attention. Traffic congestion propagation reflects how currently congested roads will affect their connected roads, which is vital for improving the prediction accuracy of traffic conditions. In this paper, we propose a novel method named PLSTM, a long short-term memory (LSTM) neural network for modeling traffic propagation, to further explore the characteristics of traffic congestion propagation and predict short-term traffic congested states. Firstly, we consider the local spatial-temporal correlation of congestion and integrate the data into input series. Secondly, the PLSTM component, which comprises multiple LSTM layers, is trained with the input series. Finally, we conduct various contrast experiments with state-of-the-art predictors to evaluate the performance of PLSTM. The experimental results validate the rationality of the input series for improving prediction accuracy and the effectiveness of PLSTM.

Keywords: Traffic states prediction · Congestion propagation · LSTM neural network · Confusion matrix

1 Introduction

The sharp increase of vehicles in recent years has led to severe traffic congestion in metropolitan areas, which reduces traffic efficiency and aggravates environmental problems [1]. Road construction projects that enlarge road capacity cannot solely address the problem, due to land constraints as well as the high cost of construction and maintenance. Recent years have witnessed the boost of computation and big data storage [2], which makes prediction algorithms based on big data prevalent in the field of traffic condition forecasting. The accurate prediction of real-time traffic conditions assists governments in designing traffic management strategies and benefits travelers' route pre-planning, thus relieving congestion economically.


Most existing works predict traffic data that can be obtained directly from sensors, such as traffic flow, traffic velocity, and travel time, rather than congested states [3]. Congested states, however, are more understandable, because people can hardly judge whether a specific road is congested from the other traffic information [4]. In addition, when congestion emerges on some roads, their connected roads are likely to be affected and then become congested, which is called traffic congestion propagation. Understanding congestion propagation identifies failures in road networks. At present, many studies have tried to discover traffic congestion patterns, using spanning trees to construct sequences of congestion propagation graphs that describe the propagation over continuous periods [5, 6]. However, traffic congested states prediction based on congestion propagation provides real-time traffic conditions and dynamically benefits the decisions of people and governments, which is more valuable than studying propagation patterns alone. Over the past decades, much research has been carried out on the prediction of short-term traffic states with good effect, including parametric approaches represented by ARIMA [7] and linear regression [8], as well as non-parametric approaches such as Bayesian approaches [9], support vector machines [10], and artificial neural networks [11, 12]. Parametric methods perform well only when the traffic states change regularly, which seldom occurs in real life. The non-parametric methods, by contrast, are more flexible in describing high-dimensional and non-linear relationships. Among them, LSTM-based models perform quite well when dealing with time sequences and prevent the vanishing gradient problem with a gating mechanism, and thus have been widely used recently. Zhao et al. [11] proposed an LSTM with two-dimensional input that outperformed a convolutional neural network. Wang et al. [12] validated the robustness of LSTM in exploring the longer time dependency of traffic condition data. In this paper, an LSTM-based traffic congested states prediction model named PLSTM is introduced. We first extract the unique features of traffic congestion propagation and construct the input series. The input series are converted into one-hot vectors and fed into the PLSTM component for model training. After learning the complicated congestion propagation regularities, PLSTM provides accurate short-term traffic congested states. Compared with several advanced baseline models, PLSTM is more effective at improving prediction accuracy and robustness.

2 Preliminaries

In this section, we give some essential definitions and the problem to be settled.

Definition 1. Congested State $S_{i,j}^R$: A road R is defined as "congested" in time interval j of day i ($T_{i,j}$) when the average travel time $t_{i,j}$ exceeds the 90th percentile of its travel time distribution ($t_{90}$) [6]:

$$S_{i,j}^R = \begin{cases} 1, & t_{i,j} \geq t_{90} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
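Definition 1 translates directly into code; a minimal sketch over one road's travel time samples:

import numpy as np

def congested_states(travel_times):
    """Eq. (1): label an interval 1 when its average travel time reaches the
    90th percentile (t_90) of the road's travel time distribution, else 0."""
    t90 = np.percentile(travel_times, 90)
    return (np.asarray(travel_times) >= t90).astype(int)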


Definition 2. Propagation Causer: Road A (with origin node $A_o$ and destination node $A_d$) is a propagation causer of road R (with origin node $R_o$ and destination node $R_d$) if and only if $A_o = R_d$, namely, vehicles traverse from R to A.

Problem 1. Short-Term Traffic Congestion Propagation Prediction: For a certain road R and its propagation causers, roads A and B, given the congested states $\left(S_{i,j}^R, S_{i,j}^A, S_{i,j}^B\right)$, the problem is to predict $\hat{S}_{i,j+1}^R$, namely, the congested state of road R in the next time interval.

3 Methodology

This paper proposes a novel prediction model named PLSTM to learn the law of congestion propagation from historical data and make short-term predictions for the studied road. Figure 1 illustrates the architecture of PLSTM. Firstly, we integrate the raw data into one-dimensional series and transform them into one-hot vectors. Then the input vectors are fed into the PLSTM component to train the prediction model. Finally, we obtain the predicted congested states and evaluate the performance of the proposed model against the actual values.

Fig. 1. The architecture of PLSTM

3.1 Input Series Construction

The traffic congested states of a specific road R are locally coherent in time and directly propagatable. For instance, once a traffic jam forms on road R, it rarely disappears within a very short period such as 5 minutes; therefore, the current congested state is strongly related to the traffic state in the last interval. Besides, when the propagation causers of R are congested, vehicles on R need to wait and may form another congestion in the following intervals, so the current congested state can also be affected by the history traffic states of the propagation causers. Taking a fork in a road as an example, we construct the input series as follows:

$$X_{i,j} = \left[T_{i,j},\; S_{i,j}^R,\; S_{i,j}^A,\; S_{i,j}^B\right], \quad i = 1, 2, \ldots, m;\; j = 1, 2, \ldots, n \qquad (2)$$


where $T_{i,j}$ represents time interval j in day i, $S_{i,j}^R$ is the traffic congested state of the studied road at $T_{i,j}$, $S_{i,j}^A$ and $S_{i,j}^B$ are the traffic congested states of the propagation causers, m is the number of days, and n is the number of time intervals in a day. These input series are then transformed into one-hot vectors.
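A sketch of the construction of Eq. (2) and the subsequent one-hot transformation follows. The exact encoding of $T_{i,j}$ is not spelled out in the paper, so one-hot encoding over the n daily intervals is used here as one plausible choice.

import numpy as np

def build_input_series(S_R, S_A, S_B):
    """X[i,j] = [T(i,j), S_R(i,j), S_A(i,j), S_B(i,j)] per Eq. (2).
    S_R, S_A, S_B are (m, n) binary arrays of congested states."""
    m, n = S_R.shape
    eye = np.eye(n)  # one-hot time-of-day codes
    X = [np.concatenate([eye[j], [S_R[i, j], S_A[i, j], S_B[i, j]]])
         for i in range(m) for j in range(n)]
    return np.asarray(X)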

3.2 PLSTM Predicting Component

LSTM neural networks are extremely suitable for time series processing and are much easier to train without the vanishing gradient problem [13]. Hence, we use LSTM to capture the propagation law of traffic congested states. LSTM has an input gate I, a forget gate F, and an output gate O to control how information flows in and out of the internal states of the network. Taking a one-hot vector $x_{i,j}$ as the input, the information transmission and state updates in the LSTM units at $T_{i,j+1}$ can be described as follows:

$$I_{i,j+1} = \sigma\left(W_I\left[h_{i,j}, x_{i,j}\right] + b_I\right)$$
$$F_{i,j+1} = \sigma\left(W_F\left[h_{i,j}, x_{i,j}\right] + b_F\right)$$
$$\tilde{C}_{i,j+1} = \tanh\left(W_C\left[h_{i,j}, x_{i,j}\right] + b_C\right)$$
$$C_{i,j+1} = F_{i,j+1} \odot C_{i,j} + I_{i,j+1} \odot \tilde{C}_{i,j+1} \qquad (3)$$
$$O_{i,j+1} = \sigma\left(W_O\left[h_{i,j}, x_{i,j}\right] + b_O\right)$$
$$h_{i,j+1} = O_{i,j+1} \odot \tanh\left(C_{i,j+1}\right)$$

where σ is the sigmoid activation function, tanh is the hyperbolic tangent activation function, and W and b with different subscripts are the parameters of the corresponding gates. $I_{i,j+1}$ and $F_{i,j+1}$ determine the degree to which information is retained or eliminated, respectively. $\tilde{C}_{i,j+1}$ denotes the new information at the next time interval. $C_{i,j+1}$ passes the main message and determines what information is stored and overwritten across each step. $O_{i,j+1}$ denotes the output information, and the final output of the LSTM units is acquired through further transformation of $h_{i,j+1}$. As can be seen from Fig. 1, the input one-hot vectors $x_{i,j}$ are fed into the PLSTM predicting component, which consists of 3 LSTM layers and a fully connected layer, and the predicted traffic state of the studied road in the next interval ($\hat{S}_{i,j+1}^R$) is obtained. The

extra LSTM layers increase the depth of the original LSTM neural network, thus improving the efficiency and accuracy of the training process and reducing the number of neurons and iterations of each hidden layer [13]. However, too many layers prolong the training time; taking the data scale into consideration, this paper uses 3 LSTM layers to extract the complex features of propagation patterns. After each LSTM layer, we randomly disable a certain proportion of neurons by ignoring their effects on the processing results, which avoids overfitting during training and improves the generalization ability of the prediction model. The number of hidden neurons and the proportion of disabled neurons in each LSTM layer are not fixed, in order to increase the flexibility and diversity of the prediction model and achieve a better training effect. Finally, the output of the 3 LSTM layers is transmitted to a fully connected layer, which integrates all representations into a predicted traffic congested state.
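A minimal Keras sketch of the PLSTM component is given below. The unit counts of the second and third LSTM layers and the dropout rates are assumptions within the ranges stated in Sect. 4.1 (20–220 units, dropout 0–0.5), and the input dimension follows the one-hot scheme sketched in Sect. 3.1.

from tensorflow.keras import layers, models

input_dim = 288 + 3  # 288 five-minute intervals one-hot + three binary states (assumed encoding)

model = models.Sequential([
    layers.Input(shape=(1, input_dim)),        # one input vector per prediction step
    layers.LSTM(200, return_sequences=True),   # first LSTM layer: 200 hidden neurons
    layers.Dropout(0.3),                       # dropout proportion assumed
    layers.LSTM(120, return_sequences=True),   # unit count assumed within 20-220
    layers.Dropout(0.2),
    layers.LSTM(60),                           # unit count assumed within 20-220
    layers.Dropout(0.1),
    layers.Dense(1, activation='sigmoid'),     # predicted congested state for the next interval
])
model.compile(optimizer='adam', loss='binary_crossentropy')
# model.fit(X_train, y_train, batch_size=48, epochs=30)  # settings from Sect. 4.1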

3.3 Loss Function

The traffic congested states take only two values, "0" (non-congested) and "1" (congested), so the prediction of traffic congested states is essentially a binary classification problem. We use binary cross entropy (Eq. 4) to evaluate the difference between the values predicted by the PLSTM model and the actual traffic congested states:

$$\mathrm{loss} = \min \sum_{i=1}^{m} \sum_{j=1}^{n} \left[\max\left(P\left(\hat{S}_{i,j}^R\right),\, 0\right) - P\left(\hat{S}_{i,j}^R\right) \cdot S_{i,j}^R + \log\left(1 + e^{-\left|P\left(\hat{S}_{i,j}^R\right)\right|}\right)\right] \qquad (4)$$

where m and n respectively represent the number of days and the number of time intervals of a day in the train set, $\hat{S}_{i,j}^R$ is the predicted traffic state of the studied road, $P(\hat{S}_{i,j}^R)$ denotes its score, and $S_{i,j}^R$ is the actual traffic state of the studied road.
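Equation (4) is the numerically stable form of binary cross entropy over raw scores (the form used, for example, by TensorFlow's sigmoid cross entropy with logits). A NumPy sketch:

import numpy as np

def binary_cross_entropy_with_logits(p, s):
    """Per-sample term of Eq. (4): max(p, 0) - p*s + log(1 + exp(-|p|)),
    where p is the raw network score and s the 0/1 congested label."""
    p, s = np.asarray(p, dtype=float), np.asarray(s, dtype=float)
    return np.maximum(p, 0.0) - p * s + np.log1p(np.exp(-np.abs(p)))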

4 Experiments

In this section, we conduct various comparison experiments to evaluate the performance of PLSTM. We first introduce the traffic congestion data, the parameter settings of PLSTM, and the evaluation metrics. Then we compare the experimental results with an LSTM that does not consider congestion propagation, to prove the rationality of our input series. Finally, we compare PLSTM with several effective predictors.

4.1 Data Set and Settings

The experimental data come from a fork in a road network (roads No. 5, No. 19, and No. 21, where No. 19 and No. 21 are propagation causers of No. 5) of a major city in Australia, covering a period of 22 days (from 17/06/2013 to 8/07/2013) with an average sampling interval of 5 min. We compute the traffic congested states in each interval based on Definition 1, then transform them into 6336 one-hot vectors according to Sect. 3.1. The congested rate of road No. 5 is about 25.5% over all time intervals. We take the vectors from the first 3 weeks as the train set and the last day as the test set. We use the Keras package in Python to construct the proposed prediction model, with the batch size set to 48 and the number of epochs set to 30. The first LSTM layer has 200 hidden neurons; to increase the robustness of PLSTM, the hidden neurons in the extra LSTM layers vary from 20 to 220 and the dropout proportions range from 0 to 0.5. We select the multiplicative error model (MEM) as the training loss function of each LSTM layer and Adam [14] as the optimizer. To validate the effectiveness of PLSTM, 3 evaluation metrics are used in this paper: the confusion matrix (including precision, recall, and F1-score), receiver operating characteristic (ROC) curves, and the area under the curve (AUC).

4.2 Rationality of Input Series Construction

To evaluate the effect of congestion propagation patterns on the predicting accuracy of traffic congested states, we compare PLSTM with traditional LSTM neural networks.


To ensure a fair comparison, the multilayer structure and parameter settings of the LSTM are the same as those of PLSTM. The input series of the LSTM is $X_{i,j} = \left[T_{i,j}, S_{i,j}^R\right]$, namely, the spatial structure and congestion propagation are not considered: the traditional LSTM uses only the history congested states of the studied road to predict the next congested state. The experimental results are demonstrated in Table 1, where "0" and "1" represent "non-congested" and "congested" respectively, and Avg is the average value of each evaluation metric. Precision (Pr) is the correct rate among all results predicted as "0" (or "1"), Recall (Re) is the correct rate among all actual values of "0" (or "1"), and Pr and Re are negatively correlated. F1-score (F1) is a comprehensive evaluation of Pr and Re:

$$F1 = \frac{2 \cdot Pr \cdot Re}{Pr + Re}$$

The higher the values of these indicators, the better the performance of the model.

Table 1. Comparisons of PLSTM and LSTM

Method | Pr (0) | Pr (1) | Pr (Avg) | Re (0) | Re (1) | Re (Avg) | F1 (0) | F1 (1) | F1 (Avg)
LSTM   | 0.995  | 0.813  | 0.953    | 0.932  | 0.985  | 0.944    | 0.963  | 0.890  | 0.946
PLSTM  | 0.973  | 0.909  | 0.958    | 0.973  | 0.909  | 0.958    | 0.973  | 0.909  | 0.958

As seen from Table 1, most evaluation metric values of PLSTM are higher than those of LSTM. The precision on "1" is the more important, because the congested rate is much lower than the non-congested rate and people care more about traffic jams. Despite the extremely high precision on "0" of LSTM, its precision on "1" is much lower, which indicates that the performance of LSTM is inferior to that of PLSTM. Therefore, the input series proposed in this paper is reasonable, and consideration of congestion propagation truly raises the prediction accuracy of traffic congested states.

4.3 Comparison with Other Predictors

To validate the performance of PLSTM, we conduct contrast experiments between PLSTM and several effective predictors. The baseline models consist of a Gaussian naive Bayes (GNB) classifier, support vector machines (SVM), k-nearest neighbors (KNN), and multilayer perceptrons (MLP). The input series proposed in this paper serve as their inputs. The comparison results on the testing data are shown in Table 2.

Table 2. Comparisons of PLSTM and several predictors

Method | Pr (0) | Pr (1) | Pr (Avg) | Re (0) | Re (1) | Re (Avg) | F1 (0) | F1 (1) | F1 (Avg)
GNB    | 0.953  | 0.853  | 0.927    | 0.949  | 0.865  | 0.927    | 0.951  | 0.859  | 0.927
SVM    | 0.955  | 0.875  | 0.931    | 0.946  | 0.895  | 0.931    | 0.950  | 0.885  | 0.931
KNN    | 0.977  | 0.795  | 0.937    | 0.933  | 0.921  | 0.931    | 0.955  | 0.853  | 0.932
MLP    | 0.967  | 0.797  | 0.928    | 0.932  | 0.894  | 0.924    | 0.950  | 0.843  | 0.925
PLSTM  | 0.973  | 0.909  | 0.958    | 0.973  | 0.909  | 0.958    | 0.973  | 0.909  | 0.958



Obviously, the former 4 predictors achieve quite good Pr when predicting "0" but do badly when predicting "1". They focus on the non-congested states, which occur most of the time, and thus generalize poorly on the testing data. According to the contrast experimental results, PLSTM is clearly superior to these predictors. Taking KNN, which has the highest average scores among the baselines, as an example: although the Pr of "0" and the Re of "1" in PLSTM are a little smaller than those in KNN, the Pr of "1" and the Re of "0" are raised by 14.1% and 4.4%, respectively. Furthermore, PLSTM achieves the highest F1 among all the above predictors. To further examine the performance of PLSTM, we then use ROC curves and AUC (defined in [15]) as evaluation metrics (Fig. 2). The closer the ROC curve is to (0,1) and the closer the AUC is to 1, the better the performance of the model.
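A minimal scikit-learn sketch of how such ROC curves and AUC values can be computed is shown below; the score vector is a hypothetical probabilistic output for class "1" (an assumption: each compared model must expose such a score).

```python
from sklearn.metrics import roc_curve, auc

y_true = [0, 1, 1, 0, 1]            # hypothetical ground-truth states
y_score = [0.1, 0.8, 0.6, 0.3, 0.9]  # hypothetical predicted P(class = 1)

fpr, tpr, _ = roc_curve(y_true, y_score)
print("AUC =", auc(fpr, tpr))        # closer to 1 is better, as noted above
```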

Fig. 2. ROC curves and AUC of PLSTM and several predictors

As illustrated in Fig. 2, the AUC of PLSTM is the highest among all predictors. The ROC curves of GNB and KNN are completely covered by that of PLSTM. As for SVM and MLP, although their true positive rates are higher at the beginning, the ROC curve of PLSTM is closer to (0,1). Therefore, PLSTM is superior to the above prediction models in terms of generalization ability and the comprehensive prediction accuracy of traffic congested states.

5 Conclusion

The accurate prediction of traffic congested states is beneficial to the government and the public when making traffic management strategies or dynamically changing travel routes to relieve traffic jams. Further exploration of the traffic congestion propagation patterns on road networks contributes to better forecasts of real-time traffic conditions and serves as a precondition for research on topics such as path planning and traffic signal control. In this paper, we propose an effective prediction method (PLSTM)



to forecast the short-term traffic congested states of a fork in Australia. Initially, we observe that congestions on the road network are locally time dependent and directly propagatable, and thus integrate these data into the input series. Secondly, the PLSTM component is used to process those input series. Eventually, we conduct various experiments to evaluate the effectiveness of PLSTM compared with several typical predictors. The experimental results indicate that the novel construction of the input series greatly improves the prediction accuracy of traffic congested states and that the proposed PLSTM prediction model outperforms the advanced predicting methods.

Acknowledgment. This work was supported in part by projects of the National Science Foundation of China (41971340, 41471333, 61304199), project 2017A13025 of the Science and Technology Development Center, Ministry of Education, project 2018Y3001 of the Fujian Provincial Department of Science and Technology, and projects of the Fujian Provincial Department of Education (JA14209, JA15325, FBJG20180049).

References

1. Liao, L., Jiang, X., Zou, F., et al.: A spectral clustering method for big trajectory data mining with latent semantic correlation. Chin. J. Electron. 43(5), 956–964 (2015)
2. Liao, L., Wu, J., Zou, F., et al.: Trajectory topic modelling to characterize driving behaviors with GPS-based trajectory data. J. Internet Technol. 19(3), 815–824 (2018)
3. Do, L.N.N., Taherifar, N., Vu, H.L.: Survey of neural network-based models for short-term traffic state prediction. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 9(1), e1285 (2018)
4. Chen, M., Yu, X., Liu, Y.: PCNN: deep convolutional networks for short-term traffic congestion prediction. IEEE Trans. Intell. Transp. Syst. 19(11), 3550–3559 (2018)
5. Chen, Z., Yang, Y., Huang, L., et al.: Discovering urban traffic congestion propagation patterns with taxi trajectory data. IEEE Access 6, 69481–69491 (2018)
6. Nguyen, H., Liu, W., Chen, F.: Discovering congestion propagation patterns in spatio-temporal traffic data. IEEE Trans. Big Data 3(2), 169–180 (2017)
7. Chen, C., Hu, J., Meng, Q., et al. (eds.): Short-time traffic flow prediction with ARIMA-GARCH model. In: 2011 IEEE Intelligent Vehicles Symposium (IV) (2011)
8. Yang, X., Kastner, R., Sarrafzadeh, M., et al.: Congestion estimation during top-down placement. IEEE Trans. Comput. Aided Des. Integr. Circuits 21(1), 72–80 (2002)
9. Kim, J., Wang, G.: Diagnosis and prediction of traffic congestion on urban road networks using Bayesian networks. Transp. Res. Rec. 2595(1), 108–118 (2016)
10. Nguyen, H.N., Krishnakumari, P., Vu, H.L., et al. (eds.): Traffic congestion pattern classification using multi-class SVM. In: 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC) (2016)
11. Zhao, Z., Chen, W., Wu, X., et al.: LSTM network: a deep learning approach for short-term traffic forecast. IET Intell. Transp. Syst. 11(2), 68–75 (2017)
12. Wang, J., Hu, F., Li, L. (eds.): Deep bi-directional long short-term memory model for short-term traffic flow prediction. In: International Conference on Neural Information Processing (2017)
13. Hermans, M., Schrauwen, B. (eds.): Training and analysing deep recurrent neural networks. In: Advances in Neural Information Processing Systems (2013)
14. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint (2014)
15. Fawcett, T.: An introduction to ROC analysis. Pattern Recogn. Lett. 27(8), 861–874 (2006)

Artificial Intelligence Applications

Analysis of Dynamic Movement of Elevator Doors Based on Semantic Segmentation

Chih-Yu Hsu¹, Joe-Yu¹, and Jeng-Shyang Pan¹,²

¹ School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected]
² College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266510, China

Abstract. An analysis method for the movement state of an elevator door is proposed in this paper. Firstly, we load the monitoring videos that record the movement of the elevator door. Then we label the position of the elevator door in the video and use the labels as a data set for training a semantic segmentation network. Next, we initialize the image input layer, downsampling network, upsampling network, and pixel classification layer of the semantic segmentation network, and stack all layers to complete its creation. Finally, after identifying the elevator door position in the video by semantic segmentation, we process the identified images using image erosion and an edge detection operator and estimate the distance between the elevator doors. The distance estimation method proposed in this paper has strong adaptability and low cost. Experiments show that the accuracy of the proposed method reaches 97.7%.

Keywords: Semantic segmentation · Image erosion · Edge detection · Distance estimation

1 Introduction

With the development of human society, the demand for distance measurement is increasing. Distance measurement technology is widely used in electric power, water conservancy, communication, construction, etc. Traditional distance estimation technology relies mainly on laser ranging and ultrasonic ranging. Although these two technologies are highly accurate and are not affected by environmental factors such as illumination, the equipment is expensive and the measurement effect is affected by the size of the object. By contrast, video-based distance estimation does not need to actively transmit signals and has the characteristics of high concealment, simple structure, and convenient data acquisition [1]. In recent years, research on video-based distance estimation has deepened. There are two main types of current machine vision ranging methods: the first is based on camera calibration, and the second is based on the camera imaging model [2]. Besides, automated video analysis has attracted a great deal of attention and is widely used in many fields. For example, audiences can better understand a game through the analysis of sports video [3].



Two graph-cut-based segmentation methods have been studied previously. One employs graph cut to segment objects automatically, and the other combines graph cut with the C-V model (a simplified Mumford–Shah model) to segment objects [4]. The deep-learning-based video distance estimation research proposed in this paper applies a semantic segmentation network to identify elevator doors while they move, estimates and records the distance between the doors at each moment, and calculates the accuracy of the method.

2 Distance Estimation

This chapter analyzes elevator door surveillance video and detects the door spacing of the elevator door in the running state. First, the position of the elevator door at each moment is marked frame by frame, as shown in the following figures: Fig. 1 shows the position of the elevator door in the open state, and Fig. 2 shows its position in the closed state. Section 2.1 introduces the semantic segmentation network. Section 2.2 introduces the distance estimation method used to process the elevator door image identified by the semantic segmentation network and estimate the door spacing.

Fig. 1. The elevator door is opening.

Fig. 2. The elevator door is closing.

2.1 Semantic Segmentation

Segmentation is essential for image analysis tasks. Semantic segmentation describes the process of associating each pixel of an image with a class label (such as flower, person, road, sky, ocean, or car). Figure 3 shows the principle of semantic segmentation.



Fig. 3. The principle of semantic segmentation.

The steps for training a semantic segmentation network are as follows:
1. Analyze the training data for semantic segmentation.
2. Create a semantic segmentation network.
3. Train the semantic segmentation network.
4. Evaluate and inspect the results of semantic segmentation.
5. Import the pixel-labeled dataset for semantic segmentation.

A sketch of such a network is given below.
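The following is a minimal Keras encoder-decoder sketch of step 2: an image input layer, a downsampling network, an upsampling network, and a per-pixel classification layer. Layer sizes and the two-class setup (door / background) are illustrative assumptions, not the authors' exact network.

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(240, 320, 3))               # image input layer
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D()(x)                             # downsampling network
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.UpSampling2D()(x)                             # upsampling network
outputs = layers.Conv2D(2, 1, activation="softmax")(x)   # pixel classification

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```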

2.2 Image Erosion

The four basic operations of mathematical morphology are dilation, erosion, opening, and closing. By selecting structuring elements of different sizes and shapes, different processing effects can be achieved [5]. Since morphology stems from the concept of filling, and the object of gray-value morphological processing is the topological property of image waveforms, we can directly define gray-value operations using the filling concept [6]. Erosion of an image f by a structuring element b can be written as

(f ⊖ b)(x) = min_{z ∈ D_b} { f(x + z) − b(z) }   (1)

From a geometric point of view, to find the effect of eroding the image by the structuring element at point x, we slide the structuring element in space so that its origin coincides with x, and then push the structuring element up as far as possible while it remains below the image; this maximum value is the erosion result at that point.
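A minimal SciPy sketch of grayscale erosion as in Eq. (1) follows; the 3×3 flat structuring element and the placeholder image are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import grey_erosion

img = np.random.randint(0, 256, (240, 320)).astype(np.uint8)  # placeholder image
eroded = grey_erosion(img, size=(3, 3))  # min of f(x + z) over the element
```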

2.3 Prewitt Edge Detection Operator

The Prewitt edge operator is an edge template operator. The template operator consists of ideal edge sub-images. The edge templates are used to scan the image in turn; the template most similar to the detected area gives the maximum value, and this maximum value is used as the output of the operator.

P_x = {f(x+1, y−1) + f(x+1, y) + f(x+1, y+1)} − {f(x−1, y−1) + f(x−1, y) + f(x−1, y+1)}   (2)

P_y = {f(x−1, y+1) + f(x, y+1) + f(x+1, y+1)} − {f(x−1, y−1) + f(x, y−1) + f(x+1, y−1)}   (3)

412

C.-Y. Hsu et al.

As shown in Fig. 4, two convolution kernels form the Prewitt edge operator. Each pixel in the image is convolved with the two kernels, the maximum value is taken as the output, and the result of the operation is an edge detection image.

Fig. 4. Convolution operators of Prewitt
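A minimal NumPy/SciPy sketch of Eqs. (2)–(3) and the kernels of Fig. 4 is shown below: convolve the image with the two Prewitt kernels and take the larger magnitude at each pixel as the edge output.

```python
import numpy as np
from scipy.ndimage import convolve

kx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])   # P_x kernel
ky = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]])   # P_y kernel

def prewitt_edges(img):
    px = convolve(img.astype(float), kx)
    py = convolve(img.astype(float), ky)
    return np.maximum(np.abs(px), np.abs(py))  # maximum response as output
```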

3 Results and Discussion

This chapter presents the experimental results of the methods described in Chapter 2.

3.1 Semantic Segmentation

To train a semantic segmentation network, there should be a collection of images and a corresponding collection of pixel-labeled images. Figure 5 shows the movement state of the elevator at a certain moment. Figure 6 shows the labeling of Fig. 5: the blue area represents the elevator door and the red area represents the background.

Fig. 5. Elevator door at a certain moment

Fig. 6. Labeled Fig. 5.

We label each frame of the elevator door video to build a data set for training the semantic segmentation network, and then use the trained network to analyze the elevator door video in real time. The experimental result is shown in Fig. 7. The distance between the elevator doors is shown in Fig. 8.


Fig. 7. The experimental result

Fig. 8. The distance between doors

3.2 Image Erosion

In this section, we convert Fig. 5 to the grayscale image shown in Fig. 9. Figure 10 shows the result of applying image erosion to Fig. 9.

Fig. 9. The grayscale image

Fig. 10. The result of image erosion

3.3 Prewitt Edge Detection Operator

To better identify the edge of the elevator door, this study performs an erosion operation on the image identified by the semantic segmentation network. Edge extraction of the eroded elevator door image shown in Fig. 10 is then performed using the Prewitt edge operator. Figure 11 shows the result of the Prewitt detection.



Fig. 11. The result of Prewitt edge detection

3.4 Result

Using the methods above, the distance between the left and right elevator doors in the running state is estimated. By analyzing videos of 5 complete elevator door opening and closing operations, the results in Table 1 are obtained. As shown in Table 1, the accuracy of the method proposed in this paper reaches 97.7%. Figure 12 shows the elevator door operation curve of the fourth experiment.

Table 1. The accuracy of edge detection.

Index  Number of video frames  Accuracy rate
1      296                     96.3%
2      294                     95.5%
3      300                     97.2%
4      299                     97.7%
5      303                     96.1%

Fig. 12. The movement of elevator doors



4 Conclusion

This paper proposes a distance estimation method. By analyzing the running state of the elevator door, the door spacing of the elevator is estimated and the running-state curve of the elevator door is drawn. The experimental results show that the accuracy reaches 97.7%. In future work, high-precision door spacing estimation will be used to draw the real-time elevator door operation curve and to detect the operation status of the elevator door in real time. After recording the fault curves of the elevator door, machine learning technology will be used to predict elevator door faults, so that the manager can be promptly alerted and the elevator door can be inspected in time to avoid safety accidents.

References

1. Roberts, L.G.: Machine perception of three-dimensional solids, vol. 20, pp. 31–39 (1963)
2. Vashitz, G., Shinar, D., Blum, Y.: Vehicle information system to improve traffic safety in tunnels. Transp. Res. Part F: Traffic Psychol. Behav. 11(1), 61–74 (2008)
3. Hung, M.-H., Hsieh, C.-H., Kuo, C.-M., Pan, J.-S.: Generalized playfield segmentation of sport videos using color features. Pattern Recogn. Lett. 32(7), 987–1000 (2011)
4. Hou, Y., Gao, B.-L., Pan, J.-S.: The application and study of graph cut in motion segmentation. In: IAS2009, pp. 265–268 (2009)
5. Bouzerdoum, A., Pattison, T.R.: Neural network for quadratic optimization with bound constraints. IEEE Trans. Neural Netw. 4(2), 293–304 (1993)
6. Cui, Y.: Image Process and Analysis: Mathematical Morphology and Its Application. Science Press, Beijing (2000)

A Multi-component Chiller Status Prediction Method Using E-LSTM

Chenrui Xu¹,²,³, Kebin Jia¹,²,³, Zhuozheng Wang¹,²,³, and Ye Yuan¹,²,³

¹ Faculty of Information Technology, Beijing University of Technology, Beijing, China
[email protected]
² Beijing Laboratory of Advanced Information Networks, Beijing, China
³ Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, China

Abstract. With the development of intelligent information technology, chiller systems composed of different interrelated components have been widely used in industry to cool products and machinery. Predicting the status of a chiller system can effectively monitor energy consumption and reduce the accident rate. In this paper, we propose an improved LSTM (E-LSTM) method to predict multi-component chiller status. Firstly, a mean filter method is used to preprocess the original multi-component time series data. Secondly, we adopt E-LSTM to extract hidden features from seven component-wise inputs, consisting of outdoor temperature, wet bulb temperature, outdoor enthalpy, L1 & L2 differential pressures, total power, and IT load. Finally, the learned hidden features are fed into a regression layer to predict three future chiller statuses: PUE, cold source power, and refrigeration secondary pump power. Experimental results show that the proposed method outperforms baselines such as linear regression, SVR, RNN, GRU, and LSTM, demonstrating the effectiveness of our method in the task of chiller status prediction.

1 Introduction

A chiller system is classified as a refrigeration system that removes heat from a liquid via a vapor-compression or absorption refrigeration cycle. The liquid can then be circulated through a heat exchanger to cool equipment or another process stream (such as air or process water) [1]. A chiller is mainly composed of a compressor, condenser, throttle valve, and evaporator [2]. There exist hidden connections among all the components, which result in a complex working flow. According to statistics [3], the energy loss caused by chiller malfunctions accounts for 10–40% of total energy consumption. Such malfunctions cause different degrees of economic loss and serious consequences. Therefore, predicting the status of the multi-component chiller system can improve the capability of risk analysis and enhance the stability of the refrigeration system. With the advancement of computer technology and data mining methods, numerous studies have been conducted on status prediction. In particular, Ming et al. [4]



proposed a prediction model for non-linear autoregressive time series to predict the horizontal convergence of surrounding rock and surface deformation during tunnel construction. It achieves good prediction performance only for data with a stationary trend; for non-stationary sequences with trend, seasonal periodicity, and randomness, this method has difficulty achieving high prediction accuracy. Jie et al. [2] adopted a regression prediction model to conduct in-depth research and analysis on real-world power forecasting problems. However, the authors use handcrafted features, which generalize poorly. Towards this end, Binnan et al. [5] predicted and analyzed soil moisture characteristics using a combined forecasting model based on a grey-BP neural network. Although this method performs better than handcrafted feature extraction, the temporal relationships among timestamps are not taken into account. Nanyi et al. [6] proposed a feature extraction and prediction method based on RNN-RBM to predict outpatient volume data in hospitals. Although the RNN-based model is able to extract temporal features from time series, it easily forgets historical information in long sequences, which reduces prediction accuracy. In view of the fact that the data collected from refrigeration equipment in a data center are time series, we apply a deep learning method (LSTM) to predict the state of the refrigeration equipment. However, the working principle of refrigeration equipment is complex and the data types are extensive, so it is difficult to extract the features of refrigeration equipment effectively using LSTM alone. Therefore, we propose an improved LSTM (E-LSTM) method to predict chiller status. In this paper, we firstly preprocess the original time series data using a mean filter method. Then, hidden features are extracted by E-LSTM from seven component-wise inputs, including outdoor temperature, wet bulb temperature, outdoor enthalpy, L1 & L2 differential pressures, total power, and IT load. Finally, three future statuses of the chiller system are predicted: PUE, cold source power, and refrigeration secondary pump power. Experimental results show that the proposed method not only improves the prediction accuracy using multi-component time series input, but also enhances robustness in chiller status prediction.

2 Methodology

2.1 LSTM Neural Network

LSTM networks are well suited to classifying, processing, and making predictions based on time series data. The LSTM is an RNN variant that introduces a "memory cell" structure in the hidden layer to calculate the hidden state. A common LSTM cell is mainly composed of an input gate, an output gate, and a forget gate [7]. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell. The structure of the LSTM "memory cell" is shown in Fig. 1.



In Fig. 1, x_t represents the input vector of the LSTM cell, and i_t is the input gate, which is used to control the input of information;

Fig. 1. Structure of LSTM “memory cell”

f_t is the forget gate, which is used to control the retention of the historical state of the cell; o_t is the output gate, which is used to control the output of information. The formulas for i_t, f_t, and o_t are as follows:

i_t = σ(W_i x_t + U_i h_{t−1} + b_i)   (1)

f_t = σ(W_f x_t + U_f h_{t−1} + b_f)   (2)

o_t = σ(W_o x_t + U_o h_{t−1} + b_o)   (3)

where W is the weight matrix of the corresponding layer. The weights of these connections, which need to be learned during training, determine how the gates operate. b is the bias vector of the output in each layer. The activation function σ is the sigmoid function, which rescales the resulting value to [0,1]. c_t is used to update the cell state, and h_t is the output of the hidden layer. The formulas for c̃_t, c_t, and h_t are as follows:

c̃_t = tanh(W_c x_t + U_c h_{t−1} + b_c)   (4)

c_t = i_t ⊙ c̃_t + f_t ⊙ c_{t−1}   (5)

h_t = o_t ⊙ tanh(c_t)   (6)

where ⊙ represents the Hadamard product and tanh is the activation function. When the output value of the forget gate is 0, the information of the previous state is discarded; when it is 1, the information of the previous state is preserved.
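A minimal NumPy sketch of one LSTM cell step implementing Eqs. (1)–(6) follows; the weight matrices W, U and bias vectors b are assumed to be pre-initialized.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U: dicts of weight matrices; b: dict of bias vectors (assumptions)
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])      # input gate, Eq. (1)
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])      # forget gate, Eq. (2)
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])      # output gate, Eq. (3)
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # candidate, Eq. (4)
    c_t = i_t * c_tilde + f_t * c_prev                          # cell state, Eq. (5)
    h_t = o_t * np.tanh(c_t)                                    # hidden output, Eq. (6)
    return h_t, c_t
```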

2.2 Improvement of LSTM Model

Introduction of the E-LSTM Model. To address the limited capacity of the LSTM memory module and the restriction that input sequences cannot be too long, we propose an improved LSTM network (Enhanced-LSTM) model to improve the prediction accuracy for refrigeration equipment. The concrete structure is shown in Fig. 2. A fully connected dense layer is added before the original LSTM model, which performs feature pre-extraction and dimensionality reduction with a neural network; the pre-extracted features are then fed into the LSTM network. To prevent the over-fitting caused by deep network layers or few data dimensions, we embed a dropout layer in the LSTM network layer, so that randomly selected neurons do not participate in training. Finally, a dense layer is also used as the output layer of the improved model. The improved E-LSTM model effectively handles the high data complexity of refrigeration equipment. It not only pre-extracts features from long-sequence information, but also increases the depth of the whole model, effectively preventing the over-fitting problem caused by overly deep networks and improving the prediction accuracy of the whole network model.

Fig. 2. Structure of E-LSTM
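A minimal Keras sketch of the E-LSTM structure in Fig. 2 is given below: a dense layer for feature pre-extraction, an LSTM layer with an embedded dropout layer, and a dense output layer. The layer sizes are illustrative assumptions, apart from the 128 hidden neurons and the (5, 8) input shape stated in Sect. 3.2.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout

model = Sequential([
    Dense(32, activation="relu", input_shape=(5, 8)),  # feature pre-extraction
    LSTM(128),                                         # recurrent layer
    Dropout(0.2),                                      # randomly drop neurons
    Dense(1),                                          # regression output
])
model.compile(optimizer="adadelta", loss="mse")
```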



2.3 Loss Function

The loss function is used to estimate the degree of inconsistency between the predicted value f(x) and the real value Y, expressed as L(Y, f(x)). The smaller the loss function, the better the performance of the model. For our regression problem, we use a fully-connected layer to output the predicted value based on the learned hidden features. Then, we employ the mean squared error (MSE) to measure the loss, defined as follows:

MSE(y, y′) = (1/n) Σ_{i=1}^{n} (y_i − y′_i)²   (7)
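As a one-line illustration, Eq. (7) in NumPy:

```python
import numpy as np

def mse(y, y_pred):
    # Eq. (7): mean of squared differences over the batch
    return np.mean((np.asarray(y) - np.asarray(y_pred)) ** 2)
```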

where y_i is the ground-truth value of the i-th batch and y′_i is the predictive value given by the neural network. The training process of our proposed method can be divided into four main steps:

(1) Calculate the output value of the LSTM cells by Eqs. (1)–(6);
(2) Calculate the error of each LSTM cell using Eq. (7);
(3) Calculate the gradient of each weight according to the error;
(4) Update the weights using a gradient-based optimization algorithm.

3 Experiments and Analysis

In this section, we experimentally evaluate the performance of our proposed method on a chiller system dataset, compare its performance with six baselines, and conduct a case study to show the benefit of the prediction task using multi-component input.

3.1 Dataset Description

The chiller system dataset is collected from a real-world data center, and the data are recorded every 35 s from September 28, 2018 to July 1, 2018. The multi-component input contains: outdoor temperature, wet bulb temperature, outdoor enthalpy value, loop 1/2 differential pressures, data center total power, and IT load. The single-component outputs include: PUE, cold source power, and refrigeration secondary pump power. Note that the historical information of the output is also included as input. The total number of data samples is 3,608. Figure 3 shows a sample of the PUE time series data. In general, the closer the PUE value is to 1, the greener the data center. In practice, the PUE value of advanced data center machine rooms abroad is usually less than 2, while the PUE value of most data centers in China is between 2 and 3. Cooling in the machine room is mainly handled by air conditioning, so reducing the power consumption of air conditioning in the machine room can effectively reduce the PUE value [8]. Regarding the abnormal PUE values in Fig. 3, a large amount of equipment is sometimes taken out of operation, so the value fluctuates and shows abnormalities, which makes the prediction task more difficult.



Fig. 3. Sample of PUE time series data

Data Preprocessing. The original time series data may contain null values, which cannot be predicted directly. To improve the quality of data analysis and prediction, we adopt a mean filter method as preprocessing to delete and fill the abnormal data. Specifically, we use the pandas library in Python to clean the data: delete empty lines, filter abnormal values, and fill vacant values. The mean filter flowchart is shown in Fig. 4. Moreover, we use MinMaxScaler() from the scikit-learn toolkit [9] to normalize the data and scale it to the range 0–1.

Fig. 4. Flowchart of the mean filter method in data preprocessing (read data → check for exception values → delete exception values → compute column means → replace NaN values with the column mean → save data)
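A minimal pandas sketch of the preprocessing in Fig. 4 follows: delete empty rows, replace remaining NaN values with the column mean, then scale to 0–1 with MinMaxScaler. The file name is hypothetical, and all columns are assumed numeric.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("chiller.csv")              # hypothetical data file
df = df.dropna(how="all")                    # delete empty lines
df = df.fillna(df.mean(numeric_only=True))   # fill NaN with column means
scaled = MinMaxScaler().fit_transform(df)    # scale each column to [0, 1]
```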



Correlation Analysis. To examine the correlation among components, we calculate the correlation coefficients. The correlations can be visualized with a heatmap [10], which allows a more intuitive understanding of the linear relationships among components. The heatmap is shown in Fig. 5. The heatmap uses different colors to indicate the magnitude of the correlation: the lighter the color, the stronger the correlation; the darker the color, the weaker the correlation. From Fig. 5, we find that cold source power is positively correlated with data center total power (0.82), wet bulb temperature (0.94), and outdoor enthalpy (0.95), while PUE and IT load are negatively correlated (−0.63). According to the correlation analysis, all the components in the chiller system are correlated with each other, which justifies the necessity of using multi-component time series to predict chiller status.

Fig. 5. Heat map of component correlations
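A minimal sketch of producing such a correlation heatmap, assuming the preprocessed data sits in a pandas DataFrame `df` as in the preprocessing sketch above:

```python
import seaborn as sns
import matplotlib.pyplot as plt

corr = df.corr()                          # pairwise correlation coefficients
sns.heatmap(corr, annot=True, cmap="YlGnBu")
plt.show()
```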

3.2 Experiment Setup

In this subsection, we first introduce the baselines selected in our experiments, then outline the evaluation measurements, and finally describe the implementation details.

Baseline Approaches. To validate the performance of our proposed method, we select six existing methods as baselines:

• Linear regression. Using the LinearRegression() function in the scikit-learn toolkit, the linear regression model is trained with the fit() function after data preprocessing; the predict() function is then used to predict chiller status.
• Support vector regression (SVR). We use the SVR() function in the scikit-learn toolkit and set the kernel function to rbf, linear, and poly, respectively. In addition, we set the gamma and C parameters in the svm_timeseries_prediction() function to predict status.



• Neural network (NN). NN is a traditional artificial neural network whose output is related only to the current moment, ignoring temporal relationships. We use the sigmoid function as the activation function.
• RNN. RNN is a kind of neural network used to process sequential data. We use the same parameters as LSTM to train the network model.
• Gated recurrent unit (GRU). GRU is a variant of LSTM that preserves the long-term memory ability of the LSTM; it replaces the input gate, forget gate, and output gate of the LSTM cell with an update gate and a reset gate, and combines the cell state and output into one vector. We use the same parameters as LSTM to train the network model.
• Long short-term memory (LSTM). The LSTM network has been described in detail in Sect. 2.

Evaluation Criteria. To validate the proposed model, the mean absolute error (MAE) and root mean square error (RMSE) are employed as evaluation measurements. MAE and RMSE can be expressed as:

MAE = (1/T) Σ_{t=1}^{T} |y_t − f_t|   (8)

RMSE = √( (1/T) Σ_{t=1}^{T} (y_t − f_t)² )   (9)

where f_t and y_t represent the real value at the t-th timestamp and the output value of the model, respectively, and T is the total number of timestamps. The lower the MAE and RMSE values, the more accurate the results.

Implementation Details. We implement all the approaches with tensorflow [11]. To conduct the evaluations, we randomly split the dataset into training and testing sets with a 2:1 ratio (i.e., 2,500 training samples and the rest for testing). We set the number of hidden neurons to 128 and the input feature size at each timestamp to 8. We use Adadelta [12] with mini-batches to minimize the cost function. Furthermore, we train for 100 epochs with a batch size of 200, and we set the sequence length to 5 for all the recurrent networks.
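A minimal sketch of this setup is shown below: sliding windows of sequence length 5 over the 8-feature series, then a 2:1 split with 2,500 training samples. The variable names and the choice of target column are illustrative assumptions.

```python
import numpy as np

def make_windows(data, seq_len=5):
    # data: (n_samples, 8) array; returns (n_windows, seq_len, 8) inputs and,
    # as targets, the value at the step following each window
    x = np.stack([data[i:i + seq_len] for i in range(len(data) - seq_len)])
    y = data[seq_len:, 0]   # e.g. predict the first column (such as PUE)
    return x, y

series = np.random.rand(3608, 8)     # placeholder for the preprocessed data
x, y = make_windows(series)
x_train, y_train = x[:2500], y[:2500]
x_test, y_test = x[2500:], y[2500:]
```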

3.3 Experimental Results

In this subsection, we first report the comparison results between the proposed E-LSTM model and the baselines. Then we conduct a sensitivity analysis to show the RMSE and MAE errors of our model under different sequence lengths. Finally, we compare the performance of multi-component prediction with single-component prediction through a case study.



Prediction Performance. To verify the advantages of E-LSTM, the experiment uses the same hyper-parameters to train all the recurrent models. The comparison results for PUE, cold source power, and refrigeration secondary pump power are listed in Tables 1, 2, and 3, respectively.

Table 1. Comparison results of PUE status prediction

Method             RMSE   MAE
Linear regression  0.524  0.472
SVR (rbf)          0.056  0.053
SVR (linear)       0.073  0.072
SVR (poly)         0.062  0.059
NN                 0.083  0.081
RNN                0.013  0.014
GRU                0.010  0.013
LSTM               0.008  0.011
E-LSTM             0.007  0.009

Table 2. Comparison results of cold source power status prediction

Method             RMSE   MAE
Linear regression  0.259  0.258
SVR (rbf)          0.141  0.126
SVR (linear)       0.124  0.118
SVR (poly)         0.135  0.121
NN                 0.093  0.095
RNN                0.077  0.100
GRU                0.069  0.094
LSTM               0.068  0.090
E-LSTM             0.064  0.083

Table 3. Comparison results of refrigeration secondary pump power status prediction

Method             RMSE   MAE
Linear regression  0.274  0.217
SVR (rbf)          0.127  0.115
SVR (linear)       0.103  0.096
SVR (poly)         0.104  0.098
NN                 0.088  0.078
RNN                0.088  0.075
GRU                0.086  0.088
LSTM               0.082  0.080
E-LSTM             0.080  0.070



From the tables we can see that the RMSE and MAE of the proposed model are lower than those of the LSTM model, and that the proposed model also performs better than the RNN and GRU models. Compared with the remaining baselines, the RMSE and MAE of the proposed model are significantly lower than those of NN, SVR, and linear regression, demonstrating the effectiveness of our proposed method in the task of chiller status prediction.

Sensitivity Analysis. To further evaluate the performance of our model, we conduct a sensitivity analysis to study the impact of different sequence lengths. In particular, 1, 3, 5, 10, 30, 50, 100, and 150 are selected as sequence lengths to train the E-LSTM model. The prediction results for PUE, cold source power, and refrigeration secondary pump power are shown in Figs. 6, 7, and 8.

Fig. 6. RMSE and MAE variations of E-LSTM model under different sequence lengths (PUE)

Fig. 7. RMSE and MAE variations of E-LSTM model under different sequence lengths (Cold source power)



Fig. 8. RMSE and MAE variations of E-LSTM model under different sequence lengths (Refrigeration secondary pump power)

As can be seen from the above figures, the RMSE and MAE of our model for PUE, cold source power, and refrigeration secondary pump power rise in a fluctuating manner as the sequence length grows, and a sequence length of 5 yields the lowest RMSE and MAE for all three statuses. We therefore set the sequence length to 5 to predict the chiller status, and conclude that using the LSTM network with multi-component input and this sequence length is the best choice for predicting the chiller status of the next 3 h.

Case Study. To demonstrate the necessity of multi-component chiller status prediction, we compare it with single-component prediction. We keep the other parameter settings unchanged and only change the input. Figure 9 shows the performance comparisons for PUE, cold source power, and refrigeration secondary pump power between multi-component prediction and single-component prediction; the blue line represents the real trend and the red line the predicted trend. From Fig. 9, we can see that the performance of single-component prediction is far worse than that of multi-component prediction: its RMSE is higher than that of multi-component prediction, which justifies the effectiveness of our model in multi-component status prediction for chillers.


Fig. 9. Comparison of single- and multi-component predictive effects: (a) PUE, single-component; (b) PUE, multi-component; (c) cold source power, single-component; (d) cold source power, multi-component; (e) refrigeration secondary pump, single-component; (f) refrigeration secondary pump, multi-component

4 Conclusion

In this paper, we propose a multi-component chiller status prediction method using E-LSTM. We extract hidden features from seven component-wise inputs, including outdoor temperature, wet bulb temperature, outdoor enthalpy, L1 & L2 differential pressures, total power, and IT load, and predict three future statuses of the chiller system: PUE, cold source power, and refrigeration secondary pump power. The experimental results show that the performance of our model fluctuates with the sequence length in chiller status prediction, and that choosing



a sequence length of 5 achieves the best performance in predicting the chiller status. Compared with the baseline methods, our model achieves better performance in three different status prediction tasks. In general, this paper improves the LSTM network and verifies the applicability of E-LSTM in the field of multi-component chiller status prediction.

Acknowledgement. This paper is supported by the Project for the National Natural Science Foundation of China under Grant No. 61672064 and the Advanced Information Network Beijing Laboratory (040000546618017).

References

1. The Science Behind Refrigeration. Berg Chilling Systems Inc. Accessed 2016
2. Jie, Z.: Research on regression analysis model and its dynamic optimization mechanism for data prediction (2018)
3. Shunlai, R., Caiwu, L.: Optimal scheduling method of global energy consumption for virtualized data center. Comput. Eng. 39(12) (2013)
4. Ming, W., Dingli, Z., Qian, F., Jun, Q.I., Huangcheng, F., Wenbo, C.: Non-linear autoregressive time series prediction method for tunnel surrounding rock deformation. J. Beijing Jiaotong Univ. 41(4) (2017)
5. Binnan, L., Guishing, K.: Prediction model of soil moisture characteristic curve based on grey theory-BP neural network method. Resour. Environ. Arid Areas (7) (2018)
6. Nanyi, Z.: Study on the extraction and prediction of data features based on deep learning (2017)
7. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
8. TechTarget Data Center: Power usage effectiveness. https://searchdatacenter.techtarget.com.cn/whatis/9-21989/
9. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
10. Chao, D., Jinwei, S.: Visual analysis system of cigarette market data based on heatmap. Tobacco Technol. (12) (2016)
11. Metz, C.: Google just open sourced TensorFlow, its artificial intelligence engine. Wired. Accessed 10 Nov 2015
12. Zeiler, M.D.: Adadelta: an adaptive learning rate method (2012). arXiv preprint arXiv:1212.5701

Improvement of Chromatographic Peaks Qualitative Analysis for Power Transformer Based on Decision Tree

Jie Shan, Cheng-Kuo Chang, Hao-Min Chen, and Jeng-Shyang Pan

School of Information Science and Engineering, Fujian University of Technology, Fuzhou, Fujian, China
[email protected]

Abstract. A decision tree is applied to the chromatographic peak characterization of power transformers in this work. The following parameters are taken as the characteristic attributes for chromatographic peak identification: peak height, peak width, peak area, and peak position. A bi-partition method is adopted to discretize the continuous attributes when the decision tree selects nodes. The algorithm not only obtains adaptive thresholds for the characteristic attributes and achieves the correct classification of single effective peaks, but also avoids the errors caused by deciding all seven component peaks simultaneously. The test results show that the improved algorithm has the advantages of a simple principle, good resistance to peak drift, and elimination of false peaks.

Keywords: Power transformer · Gas chromatography · Peak determination · Decision tree

1 Introduction

Dissolved gas analysis (DGA) is an important method for diagnosing faults in oil-immersed high-voltage power equipment. Online chromatography is currently one of the most effective methods for detecting dissolved gas in transformer oil, and it is the key technology for realizing online real-time monitoring of power transformers [1]. On-line chromatographic technology for power transformers involves no manual operation, so the system must collect chromatographic data automatically. The chromatographic peaks of the transformer are first identified, then the composition of each chromatographic peak is qualitatively analyzed, and finally the concentration of the chromatographic components is calculated from the peak height, peak area, and baseline data. Power transformer chromatographs are generally used in large substations, power plants, and other places where the environment is relatively harsh, so the chromatograph must have strong anti-interference ability. At present, there are many methods for chromatographic peak identification, including the time window method [2], the derivative method [3], the pattern matching method [4], the grey relational degree analysis method [5], and the BP neural network method [6].



The model in [4] used pattern matching technology to identify the chromatographic peaks of transformers. It leaves two questions unsolved: unreasonable negative correlations can occur, and different parameter selections lead to different results. The method proposed in [5] improved on this, but whether the Gaussian template is optimal remains unsolved and the setting of the correlation-degree threshold is too rigid. The BP neural network method [6] ultimately comes down to determining a slope threshold and a window interval threshold, which makes it difficult to identify chromatographic peak drift. Therefore, the mainstream methods remain [2, 3], but the qualitative techniques adopted by the time window method and the derivative method set a window interval for each component according to the retention time to conduct component qualitative analysis [7]. The shortcomings of this approach in practical application are a small identification range and poor resistance to false peaks. The most serious problem is that as the chromatograph runs for a long time, the chromatographic peaks inevitably drift; if the drift exceeds the window range, peaks are missed or detected incorrectly. Fuzzy logic has been used to address this problem [8], but the membership function is generally selected empirically, which is very subjective. This approach only enlarges the tolerated drift range of the peak position and cannot solve the problem fundamentally. Other methods could also be considered for this topic [9, 10]. When the decision tree is introduced to the field of chromatographic peak characterization in power transformers, seven effective component peaks should be obtained from the chromatographic peaks processed by the decision tree. However, decision errors can occur in this process, so that the seven active component peaks are not accurately obtained (e.g., 5 or 8 peaks may be returned instead). In this work, the application of the decision tree algorithm improves the chromatographic peak characterization.

2 Introducing the Decision Tree for Chromatographic Peaks

A decision tree is a common machine learning method, and the decision tree algorithm is one of the classical algorithms for data classification. The results obtained by the decision tree algorithm are relatively accurate and easy to understand [11]. The algorithm is a kind of supervised learning: first, a set of samples is given, each with a set of characteristic attributes and a predetermined category; then a classifier (decision tree model) is obtained through supervised learning. This classifier can correctly classify new objects according to their characteristic attributes. The chromatographic data in this paper come from the online monitoring device for dissolved gas in transformer oil, device NS801B of Guodian Nanjing Automation Co. Ltd., which is based on gas chromatographic detection technology. It can monitor the content and growth rate of H2, CO, CO2, CH4, C2H2, C2H4, and C2H6 dissolved in transformer oil in real time. Latent faults and fault types are analyzed by a fault-diagnosis expert system, facilitating real-time understanding of the transformer's operating status. A group of chromatographic curves collected from an experimental device of the company is shown in Fig. 1.



Fig. 1. Chromatographic curves

According to the derivative-based peak identification algorithm of device NS801B, hundreds of peak positions are typically obtained: only the positions of the peaks are known, while their compositions are unknown. The traditional method sets a window interval according to the retention time to determine the qualitative components. The goal of this study is to identify the seven component peaks directly from the hundreds of peaks obtained, thereby removing the problems caused by characterizing chromatographic peaks according to the peak generation sequence. The decision tree algorithm is adopted to conduct a qualitative analysis of each peak, and the bi-partition method is adopted to discretize continuous attributes when selecting nodes with the decision tree. Adaptive thresholds of the characteristic attributes are then obtained, and decision tree classification with the characteristic attributes as nodes yields the component peaks.

3 The Process of the Improved Algorithm

3.1 Data Preparation and Feature Selection

This task adopts device NS801B of Guodian Nanjing Automation Co. Ltd. For data comprehensibility and universality, seven groups of data with different gas concentrations under the same main transformer were collected, numbered No. 20190605152550, No. 20190605162730, No. 20190605172741, No. 20190605182731, No. 20190605187462, No. 201906051871772, and No. 20190605162435. The number of peaks identified by the device in each set of data is 363, 183, 172, 221, 145, 179, and 156, respectively. In this paper, the first set of data is taken as the training sample set D, consisting of 363 peaks, and the remaining groups of data are taken as the test sample sets Q1, Q2, Q3, Q4, Q5, and Q6. The peak properties measured by the



company’s device include starting point, peak midpoint, end point, peak height, peak width, peak area, aspect ratio ðpeak height=peak widthÞ, peak spacing and peak type. This paper divides its attributes into two categories. The first category is decision feature attributes, that is data classification for decision tree algorithm. The second category is invalid attributes: non-first category attributes. At this point, attribute set of peak qualitative under decision tree algorithm defined as U ¼ fpeak height; peak width; peak area; peak positiong:

3.2 Model Construction of the H2 Component Peak Under the Decision Tree

After data preparation and feature selection, the training sample set D, the test sample sets Q1 to Q6, and the feature attribute set U are obtained. We make full use of the feature attribute set, select the optimal feature attributes for combination, establish classification rules, and extract the component peak of H2 from the data.

Discretization of Feature Attributes. Since the four characteristic attributes in the set U — peak height, peak width, peak area, and peak position — all take continuous values, the data first needs to be preprocessed by discretization. For this reason, the classical C4.5 algorithm, adopted by many decision tree implementations, uses a bi-partition method to process continuous attributes [12]. Let h, w, s, and p denote the continuous attributes peak height, peak width, peak area, and peak position on the training sample set D, respectively. Each has V possible values on D (V ≤ 363), sorted in ascending order and denoted as peak height = {h_1, h_2, h_3, …, h_V}, peak width = {w_1, w_2, w_3, …, w_V}, peak area = {s_1, s_2, s_3, …, s_V}, and peak position = {p_1, p_2, p_3, …, p_V}, where V varies according to the actual data. Taking the peak height h as an example, the training sample set D can be divided into D_t^− and D_t^+ based on a partition point t, where D_t^− contains the samples whose peak height is not greater than t and D_t^+ contains the samples whose peak height is greater than t. Obviously, for adjacent values h_i and h_{i+1}, any t in the interval [h_i, h_{i+1}) produces the same partition. Therefore, a candidate point set containing (V − 1) elements is investigated for the continuous attribute peak height h:

T_a = { (h_i + h_{i+1}) / 2 : 1 ≤ i ≤ V − 1 }   (1)
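In code, Eq. (1) amounts to taking the midpoints of adjacent sorted attribute values — a minimal sketch:

```python
def candidate_points(values):
    # Eq. (1): midpoints of adjacent sorted distinct attribute values
    v = sorted(set(values))
    return [(v[i] + v[i + 1]) / 2 for i in range(len(v) - 1)]

# e.g. candidate_points(peak_widths) yields (V - 1) midpoints for V distinct
# values; `peak_widths` is a hypothetical list of the attribute values.
```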

These candidate partition points can be examined like discrete attribute values, and the optimal partition point can then be selected to divide the training sample set.

Node Selection Based on Information Gain Ratio. The decision tree algorithm uses a top-down greedy search to traverse the space of possible decision trees. The construction process starts from the question: "Which feature in the feature attribute set U will be tested at the



root node of the tree?" The attribute with the best classification ability is chosen for the root node; each possible value of the root-node feature then produces a branch, and the training sample set D is arranged below the appropriate branch (i.e., the branch corresponding to the sample's attribute value). The whole process is then repeated, using the training samples associated with each branch node to select the best feature to test there [13]. There are four characteristic parameters (h, w, s, p) in the set U, and the gain ratio of the C4.5 decision tree algorithm is used to select the best partition attribute. Take the peak height h as an example.

Step 1: Calculate the information entropy. Information entropy is the most commonly used index for measuring the purity of a sample set. Let p_k (k = 1, 2) be the proportion of each peak class in the current training sample set D; then the information entropy of D is defined as

Ent(D) = − Σ_{k=1}^{2} p_k log2 p_k   (2)

The smaller the value of Ent(D), the higher the purity of D. The training sample set D contains two types of peaks: the effective peak is the component peak of H2, and the invalid peaks are all others.

Step 2: After the discretization described above, calculate the information gain obtained by dividing the training sample set D with the characteristic attribute peak height h:

Gain(D, h) = Ent(D) − Σ_{k ∈ {−, +}} (|D_t^k| / |D|) · Ent(D_t^k)   (3)

Using the characteristic attribute peak height h to divide the training sample set D generates branch nodes as in Eq. (3), where D_t^k is the branch containing the samples on side k of the partition point t. The value |D_t^k|/|D| represents the weight of a node: branch nodes with more samples have greater influence.

Step 3: Compare Gain(D, h), Gain(D, w), Gain(D, s), and Gain(D, p), and select the attribute with the largest value as the optimal partition attribute, i.e., the root node. The above process is looped over each branch node, ending when all characteristic attributes have been traversed or the classification of all component peaks is complete.

Decision Tree Formation. For the first group of data, No. 20190605152550, the repeated peaks are removed from the 363 peaks to obtain nineteen real peaks, i.e., nineteen training samples. Therefore, Ent(D) = 0.2974 is obtained from Eq. (2). According to Eq. (1), the candidate partition point set of the attribute peak width contains eighteen candidate values:

T_peak width = {12.0, 25.5, 32.5, 37.5, 41, 50.0, 69.5, 82.5, 85.5, 92.5, 122.5, 165.0, 197.5, 217.0, 245.0, 273.5, 288.0, 308.5}



From Eq. (3), the partition point with the highest information gain for the characteristic attribute peak width is 217.0, with a corresponding information gain of 0.1075. Similarly, the partition points and information gains of the remaining characteristic attributes are obtained: 10.6 with Gain(D, h) = 0.0922, 474.02 with Gain(D, s) = 0.0922, and 819.0 with Gain(D, p) = 0.2860. Peak width is therefore selected as the root-node partition attribute, and the nodes are then partitioned recursively. After pruning (during which, as found in this paper, no rule can be trimmed without reducing classification accuracy; the pruning method follows [12]), the resulting decision tree is shown in Fig. 2.
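For concreteness, a minimal sketch of how Ent(D) in Eq. (2) and Gain(D, ·) in Eq. (3) can be computed is given below; the function and variable names are illustrative, not from the original implementation.

```python
import math

def entropy(labels):
    # Eq. (2): labels is a non-empty list of class marks,
    # e.g. "effective" / "invalid"
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def gain(values, labels, t):
    # Eq. (3): information gain of splitting at partition point t; at the
    # midpoints from Eq. (1), both sides of the split are non-empty
    left = [l for v, l in zip(values, labels) if v <= t]
    right = [l for v, l in zip(values, labels) if v > t]
    weighted = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted
```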

Fig. 2. Decision tree for peak characterization



4 Test Results and Discussions

The test sample sets are evaluated on the decision tree shown in Fig. 2, and the test results are shown in Table 1.

Table 1. Classification accuracy results of the improved decision tree algorithm

Test sample set  Number of chromatographic peaks  Number of component peaks from H2  Successful
Q1               183                              1                                  Yes
Q2               172                              1                                  Yes
Q3               221                              1                                  Yes
Q4               145                              1                                  Yes
Q5               179                              1                                  Yes
Q6               156                              1                                  Yes

The testing results show that the improved algorithm is completely effective for the H2 peak, and the judgment accuracy of the component peak is high. During decision tree formation, the classification is completed once the third leaf node is found; this motivates the stated end condition of the decision tree, namely that classification ends when all feature attributes have been traversed or all component peaks have been classified. Only the qualitative decision tree for the H2 component peak is given in this paper. The component peak of H2 lies at the front of all the peaks on the axes — it is the first of the seven component peaks — so, compared with the other six component peaks, it is easier to judge and less difficult to decide. Further experiments are needed to determine the results for the other six gas component peaks. The results show that the improved algorithm has higher reliability and accuracy than the original one.

5 Conclusions

In this work, we improved chromatographic peak determination by introducing the decision tree algorithm. The principle of decision-tree-based peak determination was constructed, and data from device NS801B of Guodian Nanjing Automation Co. Ltd. were tested and analyzed. Compared with the original method, the new algorithm improves accuracy, improves resistance to peak drift, and completely eliminates false peaks.



References

1. Dornerburg, E., Strittmatter, W.: Monitoring oil cooling transformers by gas analysis. Brown Boveri Rev. 61, 238–247 (1974)
2. Liu, Z., Zou, H., Ye, M., et al.: Effect of temperature and buffer solution on migration time window of micellar electrokinetic capillary chromatography. Chromatography 17(2), 147–152 (1999)
3. Colmsjo, A.L.: Chromatographia. Int. J. Sep. Sci. 23(4), 25 (1987)
4. Hu, J., Bao, J., Zhou, F., Luo, H.: An algorithm for transformer chromatographic peak identification based on pattern matching. Power System Autom. 21, 89–91 (2005)
5. Cao, J., Fan, J., An, C.: Application of grey correlation analysis in the identification of transformer oil chromatographic peak. Power Grid Technol. 34(07), 206–210 (2010)
6. Wang, W.: A new optimization method for chromatographic peak identification and characterization of transformer. Meas. Control Technol. 33(3), 94–97 (2014)
7. Ni, Z., Jia, R., Li, L.: Case reasoning based on neural network. Microcomput. Dev. 115, 3–7 (2001)
8. Hu, J., Zhou, F., Bao, J., Luo, Z.: Qualitative algorithm of transformer chromatographic peak based on fuzzy membership function. Autom. Power Syst. 18(70–72), 84 (2005)
9. Chen, S.M., Manalu, G.M.T., Pan, J.S., Liu, H.C.: Fuzzy forecasting based on two-factors second-order fuzzy-trend logical relationship groups and particle swarm optimization techniques. IEEE Trans. Cybern. 43(3), 1102–1117 (2013)
10. Chen, S.M., Wang, N.Y., Pan, J.S.: Forecasting enrollments using automatic clustering techniques and fuzzy logical relationships. Exp. Syst. Appl. 36(8), 11070–11076 (2009)
11. Gehrke, J., Ganti, V., Ramakrishnan, R., Loh, W.Y.: BOAT—optimistic decision tree construction. ACM SIGMOD Rec. 28(2), 169–180 (1999)
12. Quinlan, J.R.: C4.5: Programs for Machine Learning (1992)
13. Bian, J., Xu, J.: Video vehicle classification algorithm based on C4.5 decision tree. Microelectron. Comput. 34(5), 104–109 (2017)

A Review of the Development and Application of Natural Language Processing

Wei-Wen Guo 1,2, Li-Li Huang 1, and Jeng-Shyang Pan 1,2

1 Fujian University of Technology, Fuzhou 350118, China
  [email protected]
2 Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou, China

Abstract. With the development of convolutional neural networks and deep learning, and a series of significant breakthroughs in computer speech, many new models and methods have become available to the field of natural language processing. Natural language processing is an important branch of artificial intelligence, and its application requirements and relevant fields are becoming ever wider. This paper first summarizes the related concepts of natural language processing; it then introduces the development process of natural language processing in detail; it next elaborates on the research progress in its application fields, including lexical analysis, syntactic analysis, machine translation and others; finally, semantic understanding, the low-resource problem and development directions in other fields are summarized and forecast.

Keywords: NLP · Artificial intelligence · Lexical analysis · Machine translation

1 Introduction

Natural language processing (NLP) is an important direction in the fields of computer science and artificial intelligence, aiming to study theories and methods that enable effective communication between humans and computers in natural language [1]. Artificial intelligence is a branch of computer science whose purpose is to understand the essence of intelligence and to produce intelligent machines that can respond in a way similar to human intelligence [2]. NLP is the most critical part of artificial intelligence; its core goal is to convert human language into commands that a computer can execute, so that the computer or machine can process and understand natural language as a human being does. NLP is a bridge between human and computer communication, which is of great practical and theoretical significance. The research contents of NLP are very extensive, mainly including lexical analysis, machine translation, sentiment analysis, text classification and so on. This paper elaborates on the concept, development history and research applications of NLP, in particular the research progress in its application fields, the problems encountered in NLP, and future applications in other fields.


2 Development of NLP

From 2008 to the present, the application of neural networks and deep learning models has pushed NLP to new levels of enthusiasm and achievement. Encouraged by image recognition and speech recognition, researchers gradually began to use deep learning for NLP. Building on earlier word vectors, Tomas Mikolov et al. created word2vec in 2013 [3]. The combination of deep learning and NLP then reached a climax, achieving notable success in machine translation, question answering, reading comprehension and other fields. Deep learning is a multi-layered neural network that maps the input layer to the output through layer-by-layer nonlinear transformations, that is, end-to-end training from input to output. The RNN model proposed by Schuster et al. has become one of the commonly used methods in NLP [4]. The GRU model proposed by Cho et al. and the LSTM model proposed by Hochreiter et al. triggered successive rounds of enthusiasm [5, 6].
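For illustration, a minimal word-vector training sketch; the use of the gensim toolkit and the toy corpus are assumptions of this example, not of the papers cited above:

```python
# Minimal word2vec sketch (assumed toolkit: gensim >= 4.0).
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
corpus = [
    ["natural", "language", "processing", "is", "fun"],
    ["deep", "learning", "advances", "natural", "language", "processing"],
]

# Train skip-gram word vectors (sg=1); vector_size is the embedding dimension.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

# Look up a learned vector and its nearest neighbours in the embedding space.
vec = model.wv["language"]
print(model.wv.most_similar("language", topn=2))
```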

3 Application of NLP

3.1 Lexical Analysis

Words are the basic unit of a sentence. Lexical analysis first turns the string that constitutes a sentence into a sequence of words, and then marks each word with a syntactic category. It includes part-of-speech tagging, named entity recognition and word sense disambiguation, and is the most important part of a Chinese NLP pipeline [7]. Part-of-speech tagging determines the grammatical category of each word in a given sentence and labels it accordingly. Named entity recognition identifies entities such as person names, place names and institution names in a sentence. Word sense annotation focuses on determining the meaning of a polysemous word in a specific context. Current lexical analysis methods are rule-based, statistics-based, or machine learning-based.
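As a small hedged illustration of part-of-speech tagging, a sketch using the NLTK toolkit (an assumption of this example; the survey does not prescribe a toolkit):

```python
# Part-of-speech tagging sketch with NLTK (assumed toolkit).
import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "John visited the Great Wall in Beijing"
tokens = nltk.word_tokenize(sentence)   # string -> list of word tokens
tags = nltk.pos_tag(tokens)             # each token labelled with a syntactic category
print(tags)  # e.g. [('John', 'NNP'), ('visited', 'VBD'), ...]
```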

3.2 Syntax Analysis

The basic task of syntactic analysis is to determine the syntactic structure of a sentence or the dependencies between its words, covering both syntactic structure parsing and dependency parsing. In 1997, Giguet et al. established a rule-based grammar knowledge base; through checking and conditional constraints, they eliminated ambiguity in syntactic structure and constructed a Tesnière-style dependency parser [8]. This parser reduced the complexity of dependency relations and improved efficiency. In recent years, two families of neural network models have emerged for syntactic analysis: transition-based methods and graph-based methods. The best current transition-based model is the Stack LSTM, which models the stack state, the input sequence, and the sequence of actions with three LSTMs. Among graph-based methods, the classic current model is the Biaffine model.

3.3 Automatic Question Answering

Automatic question answering uses a computer to automatically answer questions raised by users so as to meet their knowledge needs, and includes retrieval-based, knowledge-base and community question answering. Its implementation first correctly understands the user's question, then extracts the key information, searches and matches in the corpus and knowledge base, and finally returns the obtained answer to the user. Mature question answering systems developed abroad include the START system of the MIT Artificial Intelligence Laboratory [9], the University of Michigan's AnswerBus system, and IBM's statistical question answering system [10, 11]. Among them, START is the first web-based question answering system, adopting a hybrid model based on a knowledge base and information retrieval. One Chinese Academy of Sciences project produced the NKI knowledge question answering system, called HKI [12]; based on the NKI knowledge base, HKI provides users with knowledge services in various fields. Li Dong et al. used convolutional neural networks for question answering tasks, obtaining vector representations of questions through training [13].

3.4 Text Summarization

Text summarization uses a computer to automatically extract important information from a text and form a summary that expresses the original. According to the match between summary and source, it can be divided into extractive and abstractive summarization; according to the number of documents, into single-document and multi-document summarization; and according to the generation method, into statistics-based, graph-model-based, algebraic, machine learning-based and deep learning-based methods [14]. The earliest text summaries date back to the 1950s and 1960s, with early research focused on single documents [15, 16]. Since 1995, with the rise of machine learning, traditional machine learning methods such as Bayesian classifiers and Markov models have been widely used in text summarization [17, 18]. More recently, with the adoption of deep learning, Liu Yan et al. proposed in 2012 a method for extracting summaries from multiple documents using unsupervised learning [19], and in 2016 Google open-sourced an automatic text summarization method in TensorFlow that uses deep learning to generate headlines for news [20].

3.5 Machine Translation

Machine translation is the process of automatically converting between two languages using a computer. Many online services provide automatic translation across multiple languages, such as Google translation, Baidu translation, and Tao translation. In 2013, Kalchbrenner and Blunsom proposed a neural network-based translation method, which attracted great attention in academia [21]. In 2014, after neural networks were applied to machine translation, end-to-end neural machine translation (NMT) developed rapidly; the approach directly uses a neural network to map source-language text to target-language text [22]. In 2016, He et al. combined statistical machine translation (SMT) features with the NMT model under a log-linear framework to improve the performance of the phrase generation process [23].

3.6 Sentiment Analysis

Sentiment analysis, also known as opinion mining or emotion mining, processes, analyzes, summarizes and reasons over subjective, emotion-bearing texts, mainly studying the attitudes and emotional tendencies people express towards popular events or topics. In 2013, Tang Xiaobo et al. proposed a regression SVM sentiment classification model combined with AdaBoost ensemble technology, realizing the visualization of text sentiment strength [24]. In 2017, Manek et al. proposed a Gini-index-based feature selection method for support vector machine classifiers [25]. With the development of deep learning, researchers now apply deep learning to sentiment classification to optimize classification performance.
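A minimal sketch of an SVM sentiment classifier, assuming the scikit-learn toolkit; the Gini-index feature selection of Manek et al. is not reproduced here, and the toy data are invented for illustration:

```python
# SVM sentiment classification sketch (assumed toolkit: scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

train_texts = ["great movie, loved it", "terrible plot and bad acting",
               "wonderful performance", "boring and disappointing"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# TF-IDF features feed a linear support vector classifier.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(train_texts, train_labels)
print(clf.predict(["a wonderful and moving film"]))  # expected: [1]
```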

4 Problems and Prospects

Researchers in NLP have achieved quite good results in part-of-speech analysis, syntactic analysis, and word similarity, but deeper and broader breakthroughs are still missing in some respects.

4.1 Semantic Understanding Problems and Prospects

Today, deep learning models human language from textual data, but it cannot yet reach a deeper understanding of semantics. The problem of semantic understanding spans the study of linguistic knowledge, common-sense knowledge, world knowledge and domain knowledge. Beyond the linguistic knowledge inherent in large-scale unlabeled texts, various types of human knowledge are also involved. For example, common sense is difficult to teach machines to learn and train on, because a great deal of human knowledge is hidden in our deep consciousness. Future research should therefore address semantic understanding directly: designers of AI systems can collect as complete a human knowledge base as possible, summarize human knowledge into a corresponding database, construct a knowledge graph, and inject it into the system.

4.2 Low Resource Issues and Prospects

It is well known that unsupervised learning, zero-shot learning, few-shot learning, meta-learning and transfer learning essentially aim to solve the problem of low resources. Faced with a lack of labeled data, as in machine translation for low-resource languages, domain-specific dialogue systems, customer service systems and multi-round question answering, we call these low-resource NLP problems. At present, NLP has no good general solution here. Besides introducing domain knowledge, such as dictionaries and rules, to enhance the data, NLP techniques can achieve broader breakthroughs on such problems by: using active learning to add more manually annotated data; exploiting unlabeled data with unsupervised and semi-supervised methods; using multi-task learning to draw on other tasks, or even other languages; and applying transfer learning to reuse other models.

4.3 Other Areas of Future Applications

NLP technology has already penetrated our daily life, and we enjoy its convenience every day. As NLP achieves deeper and wider breakthroughs, the financial, legal, and medical and health fields will use it more and more widely to support their progress. In finance, NLP can provide analytical data for securities investment, such as hotspot mining and public opinion analysis, as well as financial risk analysis and fraud identification. In the legal field, NLP can help with case search, judgment prediction, automatic generation of legal documents, and translation of legal texts. In medicine and health, NLP has broad application prospects, such as assisted entry of medical records, retrieval and analysis of medical literature, and assisted diagnosis. The modern medical literature is vast and new medical methods develop rapidly; no doctor or expert can follow all developments in medicine. NLP can help doctors quickly and accurately find the latest research progress on difficult diseases, letting patients benefit from medical advances as quickly as possible.

5 Conclusion

The rapid development of computing and artificial intelligence has brought a huge upsurge to the application of NLP. At present, neural network and deep learning models have become indispensable NLP methods. Likewise, the open problems discussed above give researchers important future directions: to find better models or learning methods to solve them. All in all, as research on neural networks and deep learning for NLP matures, more breakthrough results will be achieved in more fields.


Acknowledgements. This work was funded by the Education and Research Projects of Fujian University of Technology (nos. JXKA18015, GB-M-17-11, and GY-Z15101) and by the Foundation for Scientific Research of Fujian Education Committee (JAT170371).

References
1. Zong, C.: Statistical NLP, pp. 2–3. Tsinghua University Press, Beijing (2013)
2. Kaplan, J.: Artificial Intelligence Era, pp. 11–45. Zhejiang People's Publishing House, Hangzhou (2016)
3. Mikolov, T., Chen, K., Corrado, G., et al.: Efficient estimation of word representations in vector space. Comput. Sci. (2013)
4. Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45(11), 2673–2681 (1997)
5. Cho, K., van Merrienboer, B., Gulcehre, C., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation (2014). arXiv preprint arXiv:1406.1078
6. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
7. Xing, L.: Design of automatic translation generation system in English-Chinese machine translation. Mod. Electron. Technol. (2018)
8. Giguet, E., Vergne, J.: Syntactic analysis of unrestricted French. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-97), pp. 276–281 (1997)
9. Katz, B., Marton, G., Borchardt, G., et al.: The START natural language question answering system (2006)
10. Zheng, Z.: AnswerBus question answering system (2006)
11. Ittycheriah, A., Roukos, S.: IBM's statistical question answering system. In: Proceedings of the TREC-11 Conference, pp. 394–401. NIST Special Publication, Gaithersburg (2002)
12. Cao, C.-G.: NKI-21 century technology hotspot. Comput. World 5(2), 1–3 (1998)
13. Dong, L., Wei, F., Zhou, M., et al.: Question answering over Freebase with multi-column convolutional neural networks. In: Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, pp. 260–269 (2015)
14. Qin, B., Liu, T., Li, S.: Multi-document automatic summarization review. Chin. J. Inf. 19(6), 13–20 (2005)
15. Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2), 159–165 (1958)
16. Edmundson, H.P.: New methods in automatic extracting. J. ACM 16(2), 264–285 (1969)
17. Conroy, J., O'Leary, D.P.: Text summarization via hidden Markov models and pivoted QR matrix decomposition. Technical report, SIGIR (2001)
18. Osborne, M.: Using maximum entropy for sentence extraction. In: Proceedings of the ACL Workshop on Automatic Summarization, vol. 4, AS 2002, pp. 1–8 (2002)
19. Liu, Y., Zhong, S.H., Li, W.: Query-oriented multi-document summarization via unsupervised deep learning. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp. 1699–1705. AAAI Press, Palo Alto (2012)
20. Abadi, M., Agarwal, A., Barham, P., et al.: TensorFlow: large-scale machine learning on heterogeneous distributed systems (2016). arXiv preprint arXiv:1603.04467
21. Kalchbrenner, N., Blunsom, P.: Recurrent continuous translation models. In: Proceedings of the 2013 Conference on Empirical Methods in NLP, pp. 1700–1709 (2013)


22. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
23. He, W., He, Z., Wu, H., et al.: Improved neural machine translation with SMT features. In: AAAI, pp. 151–157 (2016)
24. Tang, X.-B., Yan, C.-X.: A sentimental classification model based on the SPIPR principle and support vector machine. Inf. Stud.: Theory Appl. 36(1), 98–103 (2013)
25. Manek, A.S., Shenoy, P.D., Mohan, M.C., et al.: Aspect term extraction for sentiment analysis in large movie reviews using Gini index feature selection method and SVM classifier. World Wide Web 20(2), 135–154 (2017)

Decision Support Systems and Data Security

Rapid Consortium Blockchain for Digital Right Management

Yue Wu 1,2, Zheming Lu 1,2, Faxin Yu 1,2, and Xuexue Luo 2

1 Zhejiang University, Hangzhou 310027, China
  [email protected]
2 Hangzhou Kilbychain Technology, Hangzhou 310030, China

Abstract. With the rapid advance of internet technologies and digital multimedia, strengthening intellectual property rights remains an enormous challenge in digital right management (DRM). Traditional DRM systems, which rely on digital encryption and digital watermarking, emphasize the encryption and decryption of digital property far more than the tracking of digital property rights transactions and authorization, which is just as important. To improve this situation, and building on the broad adoption of blockchain and distributed ledger technology (DLT) across many applications and domains, this paper proposes a new DRM system named RCB-DRM (Rapid Consortium Blockchain DRM), which can effectively trace intellectual property transactions and manage digital copyright with decentralized, tamper-proof, transparent digital data stored in a master-slave rapid consortium blockchain. In the proposed master-slave rapid consortium blockchain, we adopt a new intelligent consensus mechanism (ICM) named PBS (Proof By System Nodes), improving consensus efficiency, shortening transaction confirmation time and improving the fault tolerance of the system; PBS ensures the final consistency of each block and guarantees stable block generation without blockchain forks.

Keywords: Consortium blockchain · Consensus · DRM

1 Introduction

With the rapid advance of internet technologies and digital multimedia, more and more people view digital resources through web browsers. Nevertheless, plenty of websites provide unauthorized digital data, such as videos and images, which may damage the interests of the rights holder. Given the growing value of intellectual property rights, it is necessary and significant for a DRM system [1] to use technology to prevent data leakage and unauthorized redistribution of copyrighted multimedia data, and the system should improve its capacity to manage original content. In recent years, many DRM technologies [2, 3] have been proposed by scholars and institutions, but they emphasize encryption and authorization management so heavily that the DRM system lacks the capacity to trace intellectual property rights authorization and transactions. Meanwhile, the systems implemented with these technologies are centralized, which may allow the digital-rights data to be tampered with.

Given these problems, a new DRM system should provide efficient and reliable technologies that make the system decentralized, traceable and tamper-proof. Blockchain and DLT meet this requirement: they allow participants to track digital transactions in a decentralized system. Each node automatically maintains a copy of the blockchain data, and a record's authenticity can be verified by the entire community using the blockchain instead of a single centralized system [4]. In recent blockchain research, Wright and Filippi demonstrated that decentralized blockchain technology makes it possible to construct decentralized data management systems [5]. Zyskind et al. proposed a decentralized privacy protection method for personal data based on blockchain [6]. Dorri et al. proposed a blockchain-based framework to protect user privacy and enhance the security of the vehicular ecosystem [7]. This research shows that blockchain applies well to data management, so we apply blockchain technology to digital right protection.

To implement a decentralized DRM system, this paper proposes a new blockchain-based system for digital rights management that provides highly credible property rights protection and traceability of intellectual property authorization and transactions. The system has the following advantages and novelty:
(1) We propose a rapid consortium blockchain based on a new ICM named PBS for digital rights management.
(2) The system can trace digital right transactions and authorization, with transparent, decentralized, tamper-proof data stored in the rapid consortium blockchain.
(3) The system uses master-slave blockchains to store privacy data and transactions separately, providing privacy protection for the digital rights owner.

2 DRM System Requirement Analysis and Resolving Scheme

2.1 DRM System Requirement

In traditional solutions, although the DRM system pays attention to image encryption and authorization management, it can only claim compensation from the infringer after intellectual property rights have been infringed. In this paper, we propose a new DRM solution for rights protection on a transparent platform, which can additionally record resource consumption and transfers of property rights. The new system requirements include: (1) digital content protection, (2) authorization check, (3) privacy protection, and (4) digital transaction tracking. Analyses and solutions follow.

Digital Content Protection. A credible DRM system should verify whether digital content is infringing before it is viewed. If it is not infringing, it should be encrypted and watermarked to prevent illegal publication and transfer; otherwise, the DRM system forbids any user from accessing the resource and records the infringement.


Authorization Check. The DRM system checks authorization when a user tries to view protected content. Since the blockchain's storage structure is a chain, we put the authorization information on the block, and because the blockchain is decentralized, the authorization cannot be tampered with.

Privacy Protection. Users are always concerned about privacy. To ensure personal privacy, we use a consortium blockchain to store private information, which can be accessed only by authorized nodes.

Digital Right Transaction Tracking. Tracking digital right transactions remains a longstanding challenge in the digital field. To address this problem, our DRM system stores information about resources and transfers in the block.

2.2 Blockchain Selection

Blockchains are divided into public, consortium and private blockchains [8]. Table 1 contrasts the three.

Table 1. The contrast of each blockchain.

                   | Public             | Consortium          | Private
Typical chain      | Bitcoin, Ethereum  | Hyperledger, Corda  | JPM Coin (planned)
Account authority  | All nodes          | Authorized nodes    | System nodes
TPS                | 7–20               | About 1000          | More than 2000

Given the DRM system's requirements, privacy information should be managed by authorized and credible institutions in a tamper-resistant system that supports frequent recording of the transfer and consumption of digital rights. The suitable blockchain for the DRM system is therefore the consortium blockchain. Traditional consortium blockchains usually adopt Practical Byzantine Fault Tolerance (PBFT) consensus, Paxos consensus, or Raft consensus.

Paxos consensus: although nodes generally join under an admission mechanism, this consensus fails when malicious nodes are allowed in the election process. It is not Byzantine fault-tolerant, and it is difficult to construct a blockchain with it.

PBFT consensus: the three consensus stages may repeat, because the system cannot provide service once more than one third of the nodes are faulty, which costs this consensus its credibility.

Raft consensus: the Raft algorithm is a simplified implementation of the Paxos algorithm and cannot support dynamically joining consensus nodes. A Raft-based blockchain has the following drawbacks: electing a single consensus node that keeps the ledger leads to poor fault tolerance, and supervision of consensus nodes is insufficient.


Because of these shortcomings of the existing ICMs for consortium blockchains, the consensus algorithms mentioned above cannot meet the requirements of DRM, so we propose a new ICM named PBS. This consensus borrows from public blockchain consensus and achieves a comprehensive optimum across consistency, fault tolerance and availability, with an effective supervision mechanism and high efficiency and credibility.

3 Digital Right Management System

3.1 Rapid Consortium Blockchain

Components of the Rapid Consortium Blockchain. This paper proposes a rapid consortium blockchain on the basis of a new intelligent consensus mechanism built on a value token and a negotiable token, involving users, communities, nodes, consensus nodes, smart contracts, certificates, transactions and blocks. A node is a computer accessing the rapid consortium blockchain network, and may be a system node, a full node, or a normal node. The system node is a trusted node recognized by the consortium blockchain; the full node is an authorized, trusted, honest node. System nodes and authorized full nodes are both consensus nodes, with the rights of bookkeeping and block generation. In contrast, a normal node can only read blockchain data. A smart contract is a set of commitments in digital form, including the agreement under which the contract participants execute those commitments; it is a piece of code running on the blockchain, executed when the blockchain detects certain trigger conditions. A transaction is a transfer or authorization of digital rights. A batch of transactions and authorizations is packaged according to the consensus mechanism to generate a new block, which is linked to the blockchain and announced to the entire network. The value token motivates users to participate in community activities: it rewards digital right uploaders, pays for initiating and executing digital right transactions, and is consumed by smart contract deployment. The negotiable token is linked to the true value of the digital rights and reflects the total value of digital rights. The platform generates value tokens to reward active users, and any user can also purchase value tokens through the platform; by contrast, the negotiable token supply cannot increase unless the value of the owned digital property increases. Illustrative data structures for these entities are sketched below.

Participation and Equity Weight Calculation. To stimulate and activate the intellectual property trading community, the calculation of participation and equity weights combines the Proof-of-Stake (PoS) algorithm with the Proof-of-Participation (PoP) algorithm. The purpose of calculating the weight is to obtain the probability that a user wins the bookkeeping reward, determined by the PoS network weight and the PoP network weight.
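Before turning to the weight formulas, a compact sketch of the entities just described; the field names and types are our own assumptions for illustration, not taken from the RCB-DRM specification:

```python
# Illustrative data structures for the RCB-DRM entities (assumed field names).
from dataclasses import dataclass, field
from enum import Enum
from typing import List

class NodeRole(Enum):
    SYSTEM = "system"   # trusted node recognized by the consortium
    FULL = "full"       # authorized, trusted, honest node (consensus node)
    NORMAL = "normal"   # may only read blockchain data

@dataclass
class Transaction:
    kind: str           # "transfer" or "authorization" of a digital right
    payload: dict

@dataclass
class Block:
    prev_hash: str
    transactions: List[Transaction] = field(default_factory=list)
    nonce: int = 0      # set by the PoW search in the PBS consensus

@dataclass
class Node:
    role: NodeRole
    value_token: float = 0.0       # rewards community participation
    negotiable_token: float = 0.0  # tied to the value of the digital rights
```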

Rapid Consortium Blockchain for Digital Right Management

451

On the one hand, following the idea of PoS, a user competes for the bookkeeping reward according to the number of negotiable tokens in the user's account and the accumulated transaction amount: the higher the equity ratio, the greater the probability of winning the bookkeeping reward. On the other hand, drawing on the idea of PoP, users obtain the bookkeeping reward according to their degree of participation, namely the number of value tokens in the user's account and the frequency of accumulated participation activities: the higher the participation, the greater the probability of winning the reward. The two weights are combined to reflect the probability that a user obtains the bookkeeping reward; in other words, if the PoS weights of two nodes are close, the node with the larger PoP weight is more likely to win. This step effectively circumvents the problem of "the rich get richer and more powerful". The weights in the PoS network and the PoP network are computed as follows:

W_{PoS} = \frac{A_{amount} \cdot T_{circulate}}{\sum_{i=1}^{n} A_{amount} \cdot T_{circulate}}    (1)

W_{PoP} = \frac{F_{activity} \cdot T_{rights}}{\sum_{i=1}^{n} F_{activity} \cdot T_{rights}}    (2)

where T_{circulate} is the negotiable token balance, A_{amount} is the cumulative transaction amount, T_{rights} is the value token balance, F_{activity} is the frequency of accumulated participation activities, and n is the number of full nodes.

The Calculus of Consent. This paper proposes a new intelligent consensus mechanism named PBS for the consortium blockchain, depicted in Fig. 1.

Step 1: According to the requirements of block generation, the blockchain sets a suitable difficulty factor for the PoW network.

Step 2: Using formulas (1) and (2), the system node computes the participation and equity network weights of each full node.

Step 3: Giving W_{PoS} and W_{PoP} equal influence, the system node combines them as:

W_{PBS} = \frac{W_{PoS} + W_{PoP}}{\sum_{i=1}^{n} (W_{PoS} + W_{PoP})}    (3)

where W_{PoS} and W_{PoP} are calculated via Eqs. (1) and (2), respectively, and n is the number of full nodes.

Step 4: The system node and the authorized full nodes, as consensus nodes, run the PoW algorithm in turn until one consensus node wins the block generation right. The PoW algorithm solves a SHA256 puzzle that is complex to solve but easy to verify: given the current difficulty factor, search for a suitable random number (nonce) such that the double SHA256 hash of the block header metadata is less than or equal to the target hash value. A compact sketch of the weight computation and the nonce search is given at the end of this subsection.


Fig. 1. The consensus process

Step 5: If a full node solves the puzzle, it directly obtains the block generation right and the bookkeeping reward, constructs the block with its own address and Coinbase transaction, and the system node broadcasts the result to the whole blockchain network; the consensus ends. If instead the system node solves it, the system node broadcasts the winning node of Step 2 to the entire network and constructs a Coinbase transaction and block based on the winning node's address; the winning node obtains the block generation right and the bookkeeping reward, and the consensus ends.

Under the above consortium blockchain consensus, since PBS performs PoW mining in turn, the consensus calculation of the system nodes ensures the final consistency of each block and guarantees stable block generation without forks. In addition, the bookkeeping reward won by the system node is given to the full node with more stake and higher participation, which improves community activity. Meanwhile, the credibility and internal transparency of the consortium blockchain are improved by trusted full nodes participating in the consensus computation. In view of these advantages, we use PBS to implement a master-slave rapid consortium blockchain for DRM.
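The sketch below shows the weight computation of Eqs. (1)–(3) and the double-SHA256 nonce search of Step 4; the node statistics, the header encoding and the target value are illustrative assumptions:

```python
import hashlib

# Full-node statistics; field names follow the symbols of Eqs. (1)-(2).
nodes = [
    {"A_amount": 120.0, "T_circulate": 30.0, "F_activity": 8.0,  "T_rights": 5.0},
    {"A_amount": 400.0, "T_circulate": 10.0, "F_activity": 2.0,  "T_rights": 9.0},
    {"A_amount": 50.0,  "T_circulate": 60.0, "F_activity": 20.0, "T_rights": 1.0},
]

def pbs_weights(nodes):
    """Return the W_PBS weight of every full node (Eqs. (1)-(3))."""
    pos_raw = [n["A_amount"] * n["T_circulate"] for n in nodes]  # Eq. (1) numerators
    pop_raw = [n["F_activity"] * n["T_rights"] for n in nodes]   # Eq. (2) numerators
    w_pos = [x / sum(pos_raw) for x in pos_raw]
    w_pop = [x / sum(pop_raw) for x in pop_raw]
    combined = [p + q for p, q in zip(w_pos, w_pop)]             # Eq. (3) numerators
    return [c / sum(combined) for c in combined]

def pow_search(header: bytes, target: int) -> int:
    """Step 4: search for a nonce whose double SHA256 digest is <= target."""
    nonce = 0
    while True:
        data = header + nonce.to_bytes(8, "big")
        digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
        if int.from_bytes(digest, "big") <= target:
            return nonce
        nonce += 1

print(pbs_weights(nodes))                            # normalized winning probabilities
print(pow_search(b"block-header", target=2 ** 248))  # easy target for the demo
```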

3.2 Master-Slave Rapid Consortium Blockchain

To improve access efficiency and keep the main chain's storage structure clear, the DRM system adopts the master-slave blockchain structure shown in Fig. 2, which contains the consumption blockchain as the master blockchain, and the resource blockchain and credit blockchain as slave blockchains. The master-slave blockchains store the various types of data in blocks by category. The consumption blockchain, the master blockchain of the DRM system, stores transactions and authorization data related to digital property rights; these data can be accessed by any user. Since the consumption blockchain stores the digital property transaction and authorization records, the transaction information is transparent and cannot be tampered with; this storage method realizes the tracking of digital property rights transactions and authorization. To protect user privacy, the resource blockchain stores the digital right resource files, which can be obtained only by the relevant digital property management departments. Once a digital right infringement is detected, the infringement information is permanently recorded in the credit blockchain. All users can join the credit blockchain as normal nodes, and the credit management institutions, as authorized nodes, put the corresponding infringement records into blocks.

Fig. 2. Master-slave rapid consortium blockchain

4 Analysis of RCB-DRM

Traditional DRM uses a centralized framework, so its data can be tampered with and is opaque; by contrast, RCB-DRM stores data with blockchain technology, making the data irreversible. We store the authorization data and the transaction data in the consumption blockchain, which all users can join, making the data transparent. Traditional DRM also pays more attention to encryption and decryption and ignores transaction tracking, which makes transactions difficult to trace; our system uses blockchain technology to make digital right transactions easy to obtain. The differences between traditional DRM and RCB-DRM are listed in Table 2.

Table 2. The contrast between traditional DRM and RCB-DRM

                           | RCB-DRM                   | Traditional DRM
Modifiability              | Tamper-proof              | Adaptable
Authorization data         | Transparent and open      | Opaque
Transaction data           | Effective and accessible  | Inaccessible
Data warehouse structure   | Decentralized             | Centralized

5 Conclusion

DRM remains a longstanding challenge on the network. In this paper, we proposed a new blockchain-based system named RCB-DRM for digital right management, which provides highly trusted and stable content services as well as privacy protection and transaction and authorization tracking. In the proposed system, we present a new consensus named PBS to construct the master-slave rapid consortium blockchain, improving consensus efficiency, shortening transaction confirmation time and enhancing the fault tolerance of the system. The master-slave rapid consortium blockchain ensures privacy protection and a clear data structure. The proposed system supports tracking of digital right transactions and consumption, with decentralized, tamper-proof, transparent digital data stored in the master-slave rapid consortium blockchain.

References
1. Lindsay, D., Ricketson, S.: Copyright, privacy, and digital rights management (DRM). In: New Dimensions in Privacy Law: International and Comparative Perspectives, pp. 121–153. Cambridge University Press, New York (2006)
2. Zhao, J.D., Wang, R., Lu, Z.: Inter-frame passive-blind forgery detection for video shot based on similarity analysis. Multimedia Tools Appl. 77(19), 25389–25408 (2018)
3. Fang, H., Zhang, W., et al.: Screen-shooting resilient watermarking. IEEE Trans. Inf. Forensics Secur. 14(6), 1403–1418 (2019)
4. Bag, S., Ruj, S., Sakurai, K.: Bitcoin block withholding attack: analysis and mitigation. IEEE Trans. Inf. Forensics Secur. 12(8), 1967–1978 (2017)
5. Wright, A., Filippi, P.D.: Decentralized Blockchain Technology and the Rise of Lex Cryptographia. Social Science Electronic Publishing, New York (2015)
6. Zyskind, G., Nathan, O., Pentland, A.: Decentralizing privacy: using blockchain to protect personal data. In: IEEE Symposium on Security and Privacy, pp. 180–184 (2015)
7. Dorri, A., Steger, M., et al.: BlockChain: a distributed solution to automotive security and privacy. IEEE Commun. Mag. 55(12), 119–125 (2017)
8. Tschorsch, F., Scheuermann, B.: Bitcoin and beyond: a technical survey on decentralized digital currencies. IEEE Commun. Surv. Tutor. 18(3), 2084–2123 (2016)

Cryptanalysis of an Anonymous Message Authentication Scheme for Smart Grid

Xiao-Cong Liang 1,2, Tsu-Yang Wu 1,2,3, Yu-Qi Lee 1,2, Tao Wang 1,2, Chien-Ming Chen 4, and Yeh-Cheng Chen 5

1 School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, China
  [email protected], [email protected], [email protected], [email protected]
2 Fujian Provincial Key Lab of Big Data Mining and Applications, Fujian University of Technology, Fuzhou 350118, China
3 National Demonstration Center for Experimental Electronic Information and Electrical Technology Education, Fujian University of Technology, Fuzhou 350118, China
4 College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
  [email protected]
5 Department of Computer Science, University of California, Davis, CA 95616, USA
  [email protected]

Abstract. Authentication schemes are used to authenticate the identities of communicating parties and are widely applied in several environments. Recently, Wu et al. proposed an anonymous message authentication scheme for smart grid. In this paper, we show that their scheme has several security weaknesses under the CK adversarial model.

Keywords: Authentication · Key agreement · Smart grid · Anonymity · Cryptanalysis

1 Introduction

Compared with the one-way communication of the traditional power grid, the two-way communication of the smart grid [9–11, 13] provides a greater guarantee of the reliability, security and efficiency of the power system. In the home area network (HAN), smart meters collect the power consumption data of the user's electrical equipment and send the data to the neighborhood area network gateway (NAN gateway for short) through the HAN gateway. The NAN gateway then sends the collected data to the utility control center. In the other direction, the utility control center sends power prices, control commands and other information to power users. Figure 1 depicts a sketch of this communication architecture. It is inevitable, however, that the communication network of the smart grid will be subject to a series of cyber attacks, so smart grid information network security faces real challenges.

Fig. 1. A sketch of communication architecture

Authenticated key exchange is a cryptographic primitive that guarantees the security of transmitted messages [4–8, 15]: it authenticates the identities of the communicating parties and establishes a session key between them. The technique is now widely used across different communication settings and applications, and several schemes have been proposed for smart grid. In 2017, Sha et al. [14] observed that most authentication schemes apply only to networked smart grid devices, and proposed a secure and efficient smart grid data reading framework based on two-level authentication. The framework uses a smart reader to realize the communication of devices, and its security analysis demonstrated that it is effective and secure. However, in 2018, Abbasinezhad-Mood and Nikooghadam [1] found that Sha et al.'s scheme is vulnerable to a desynchronization attack and violates perfect forward secrecy. To overcome these security defects, they proposed an anonymous password-authenticated key exchange scheme using an extended Chebyshev chaotic map, and claimed that the scheme is secure.


In 2019, Abbasinezhad-Mood et al. [2] analyzed Sha et al.'s and Abbasinezhad-Mood and Nikooghadam's schemes. In both, smart readers need to connect to the power service provider via the Internet, which brings a large cost. They therefore proposed a novel anonymous key agreement scheme for isolated smart meters, in which the online key agreement process does not require the participation of the power service provider and can significantly reduce cost. Zhang et al. [17] proposed a lightweight authentication scheme for privacy protection. Considering the storage and computing limitations of smart meters, the scheme uses a large number of XOR operations to reduce computational overhead and realize the anonymity of smart meters; their hardware experiments showed that, compared with other schemes, it reduces the computational cost. Recently, Li et al. [12] proposed an anonymous authentication scheme between the HAN gateway and the NAN gateway in smart grid, mainly using lightweight cryptographic techniques, and claimed that their scheme is secure and reliable. However, Wu et al. [16] found that Li et al.'s scheme is vulnerable to a denial of service (DoS) attack and does not provide secure two-way authentication, and therefore proposed a new anonymous message authentication scheme based on Li et al.'s. In this paper, we find that the improved scheme proposed by Wu et al. also has a security weakness under the Canetti-Krawczyk (CK) adversarial model [3]. Concretely, their scheme is insecure against a known session-specific temporary information attack: when the adversary knows a temporary secret, anonymity is lost and the session key, as well as the long-term private key of the home area network gateway, is leaked. This has a huge impact on the security and privacy of the smart grid.

2 Review of Wu et al.'s Scheme [16]

First, we describe the network model of their scheme.

2.1 Network Model

The model focuses on the communication between the NAN gateway and HAN gateways. As shown in Fig. 2, a NAN covers a certain range of HANs, and the smart meter acts as the gateway of a HAN.
1. Registration Center (RC): the RC is assumed trusted and is responsible for generating all system parameters and the keys of the communicating parties.
2. Neighborhood Area Network gateway (NAN-GW): the NAN-GW receives the power consumption data of the HAN-GWs and sends the data to the utility control center.
3. Home Area Network gateway (HAN-GW): smart meters installed in the HAN-GW collect household power consumption data and transmit them to the NAN-GW.


Fig. 2. A communication network model

2.2 Detailed Scheme

Wu et al.'s scheme consists of the following three phases. The notations are summarized in Table 1.

Initialization
1. The RC chooses a multiplicative cyclic group G of order q, where g is a generator of G. Then, the RC chooses two hash functions h : {0, 1}* → Z_q and h′ : {0, 1}* → {0, 1}^{log2 q + |ID|}.
2. For HANGW_i, the RC selects a random value x_i as the private key and computes P_i = g^{x_i} as the public key. Then, the RC sends the key pair (x_i, P_i) to HANGW_i via a secure channel.
3. For NANGW_j, the RC selects a random value y_j as the private key and computes Q_j = g^{y_j} as the public key. Then, the RC sends the key pair (y_j, Q_j) to NANGW_j via a secure channel.
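To make the initialization concrete, a toy sketch over a small multiplicative group; the tiny parameters are purely illustrative assumptions, and a real deployment would use cryptographically sized ones:

```python
import secrets

# Toy group parameters: g = 4 generates a subgroup of prime order q = 11 modulo p = 23.
p, q, g = 23, 11, 4

def keygen():
    """RC picks a random private key x and publishes the public key P = g^x mod p."""
    x = secrets.randbelow(q - 1) + 1
    return x, pow(g, x, p)

x_i, P_i = keygen()   # key pair (x_i, P_i) for HANGW_i
y_j, Q_j = keygen()   # key pair (y_j, Q_j) for NANGW_j
print(P_i, Q_j)
```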


Table 1. Notation

Notation   | Meaning
HANGW_i    | The i-th home area network gateway
NANGW_j    | The j-th neighborhood area network gateway
G          | A cyclic multiplicative group
q          | The order of G
g          | The generator of G
h          | Hash function
h′         | Hash function
ID_i       | The identity of HANGW_i, ID_i ∈ {0, 1}*
ID_j       | The identity of NANGW_j, ID_j ∈ {0, 1}*

Authentication and Key Agreement. This phase between HANGW_i and NANGW_j is described as follows.
1. HANGW_i → NANGW_j: M1 = {A, C, t_i}. HANGW_i chooses a random number a ∈ Z_q^* and computes A = g^a, S = a − x_i · h(ID_i ‖ t_i ‖ A), and C = Q_j^a ⊕ (ID_i ‖ S), where t_i is the current timestamp.
2. NANGW_j → HANGW_i: M2 = {B, D, t_j}. Upon receiving M1, NANGW_j takes the current timestamp t_i′ and computes (ID_i ‖ S) = C ⊕ A^{y_j}. It then checks |t_i′ − t_i| ≤ ΔT and A ?= g^S · P_i^{h(ID_i ‖ t_i ‖ A)}. If both verifications hold, NANGW_j chooses a random number b ∈ Z_q^* and computes B = g^b, sk_{N−H} = h(A ‖ B ‖ A^b ‖ ID_i ‖ ID_j), v = h(ID_i ‖ ID_j ‖ sk_{N−H}), and D = h(A^{y_j} ‖ ID_i ‖ t_j) ⊕ (ID_j ‖ v), where t_j is the current timestamp.
3. Upon receiving M2, HANGW_i takes the current timestamp t_j′ and checks |t_j′ − t_j| ≤ ΔT. If it holds, HANGW_i computes (ID_j ‖ v) = h(Q_j^a ‖ ID_i ‖ t_j) ⊕ D and sk_{H−N} = h(A ‖ B ‖ B^a ‖ ID_i ‖ ID_j), and checks v ?= h(ID_i ‖ ID_j ‖ sk_{H−N}).

After all communications are finished and all verifications hold, the NAN-GW and the HAN-GW have authenticated each other and computed a session key sk collaboratively.

Message Transmission
1. HANGW_i → NANGW_j: {C_i}. HANGW_i collects the power consumption data M_i and computes H_i = h(M_i ‖ T_i), where T_i is the current timestamp. Then, HANGW_i uses the session key sk_{H−N} to encrypt M_i as C_i = Enc_{sk_{H−N}}(M_i ‖ T_i ‖ H_i).
2. Upon receiving C_i, NANGW_j decrypts it using sk_{H−N} to obtain (M_i ‖ T_i ‖ H_i). Then, NANGW_j checks |T_i′ − T_i| ≤ Δt and h(M_i ‖ T_i) ?= H_i.
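As a quick sanity check (immediate from the definitions above, though not spelled out in the paper), the verification in step 2 succeeds for an honest HANGW_i because S = a − x_i · h(ID_i ‖ t_i ‖ A) and P_i = g^{x_i}:

```latex
g^{S}\cdot P_i^{\,h(ID_i \| t_i \| A)}
  = g^{\,a - x_i h(ID_i \| t_i \| A)}\cdot g^{\,x_i h(ID_i \| t_i \| A)}
  = g^{a} = A .
```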

3 Cryptanalysis of Wu et al.'s Scheme

In this section, we show that Wu et al.'s scheme has security limitations in the CK-adversarial model [3].

CK-adversary Model. In this model, an adversary A is allowed to eavesdrop on, intercept, modify, and insert transmitted messages. Moreover, A can obtain temporary secrets such as the random values chosen by an entity.

Security Weaknesses. Under the CK-adversarial model, we assume that the ephemeral secret a is revealed to A. Then Wu et al.'s anonymous message authentication scheme violates anonymity and session key security; moreover, the long-term private key x_i of HANGW_i can be recovered. The details are as follows.
1. We assume that A intercepts the two messages M1 = {A, C, t_i} and M2 = {B, D, t_j}.
2. The identity ID_i of HANGW_i can be recovered by computing (ID_i ‖ S) = C ⊕ Q_j^a, where Q_j is the public key of NANGW_j.
3. Similarly, the identity ID_j of NANGW_j can be recovered by computing (ID_j ‖ v) = D ⊕ h(Q_j^a ‖ ID_i ‖ t_j).
4. The session key sk_{H−N} can be recovered by computing sk_{H−N} = h(A ‖ B ‖ B^a ‖ ID_i ‖ ID_j).
5. The long-term private key x_i of HANGW_i can be recovered by computing x_i = (a − S) / h(ID_i ‖ t_i ‖ A).
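The last recovery step is just the definition of S rearranged: once A knows the ephemeral a, and has recovered S and ID_i from C, the long-term key follows by

```latex
S = a - x_i\, h(ID_i \| t_i \| A)
\quad\Longrightarrow\quad
x_i = (a - S)\cdot h(ID_i \| t_i \| A)^{-1} \bmod q .
```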

4 Conclusion

In this paper, we found that Wu et al.'s scheme has several security weaknesses under an ephemeral secret reveal attack (CK-adversarial model). In the future, we will propose an improvement of their scheme to fix these security weaknesses.

Acknowledgments. The work was supported in part by the Natural Science Foundation of Fujian Province under Grant no. 2018J01636 and the Science and Technology Development Center, Ministry of Education, China under Grant no. 2017A13025.

References
1. Abbasinezhad-Mood, D., Nikooghadam, M.: Efficient anonymous password-authenticated key exchange protocol to read isolated smart meters by utilization of extended Chebyshev chaotic maps. IEEE Trans. Ind. Inf. 14(11), 4815–4828 (2018)
2. Abbasinezhad-Mood, D., Ostad-Sharif, A., Nikooghadam, M.: Novel anonymous key establishment protocol for isolated smart meters. IEEE Trans. Ind. Electron. 67, 2844–2851 (2019)
3. Canetti, R., Krawczyk, H.: Analysis of key-exchange protocols and their use for building secure channels. In: Pfitzmann, B. (ed.) Advances in Cryptology – EUROCRYPT 2001, pp. 453–474. Springer, Heidelberg (2001)
4. Chen, C.M., Huang, Y., Wang, E.K., Wu, T.Y.: Improvement of a mutual authentication protocol with anonymity for roaming service in wireless communications. Data Sci. Pattern Recognit. 2(1), 15–24 (2018)
5. Chen, C.M., Wang, K.H., Fang, W., Wu, T.Y., Wang, E.K.: Reconsidering a lightweight anonymous authentication protocol. J. Chin. Inst. Eng. 42(1), 9–14 (2019)
6. Chen, C.M., Wang, K.H., Wu, T.Y., Wang, E.K.: On the security of a three-party authenticated key agreement protocol based on chaotic maps. Data Sci. Pattern Recognit. 1(2), 1–10 (2017)
7. Chen, C.M., Wang, K.H., Yeh, K.H., Xiang, B., Wu, T.Y.: Attacks and solutions on a three-party password-based authenticated key exchange protocol for wireless communications. J. Ambient Intell. Humaniz. Comput. 10(8), 3133–3142 (2019)
8. Chen, C.M., Xiang, B., Wang, K.H., Zhang, Y., Wu, T.Y.: An efficient and secure smart card based authentication scheme. J. Internet Technol. 20(4), 1113–1123 (2019)
9. De Dutta, S., Prasad, R.: Security for smart grid in 5G and beyond networks. Wirel. Pers. Commun. 106(1), 261–273 (2019)
10. Ghosal, A., Conti, M.: Key management systems for smart grid advanced metering infrastructure: a survey. IEEE Commun. Surv. Tutor. 21, 2831–2848 (2019)
11. Gungor, V.C., Sahin, D., Kocak, T., Ergut, S., Buccella, C., Cecati, C., Hancke, G.P.: Smart grid technologies: communication technologies and standards. IEEE Trans. Ind. Inf. 7(4), 529–539 (2011)
12. Li, X., Wu, F., Kumari, S., Xu, L., Sangaiah, A.K., Choo, K.K.R.: A provably secure and anonymous message authentication scheme for smart grids. J. Parallel Distrib. Comput. 132, 242–249 (2019)
13. Liang, X.C., Wu, T.Y., Lee, Y.Q., Chen, C.M., Yeh, J.H.: Cryptanalysis of a pairing-based anonymous key agreement scheme for smart grid. In: Advances in Intelligent Information Hiding and Multimedia Signal Processing, pp. 125–131. Springer (2020)
14. Sha, K., Alatrash, N., Wang, Z.: A secure and efficient framework to read isolated smart grid devices. IEEE Trans. Smart Grid 8(6), 2519–2531 (2017)
15. Wang, K.H., Chen, C.M., Fang, W., Wu, T.Y.: A secure authentication scheme for internet of things. Pervasive Mobile Comput. 42, 15–26 (2017)
16. Wu, L., Wang, J., Zeadally, S., He, D.: Anonymous and efficient message authentication scheme for smart grid. In: Security and Communication Networks (2019)
17. Zhang, L., Zhao, L., Yin, S., Chi, C.H., Liu, R., Zhang, Y.: A lightweight authentication scheme with privacy protection for smart grid communications. Future Gener. Comput. Syst. 100, 770–778 (2019)

Cryptanalysis of an Anonymous Mutual Authentication Scheme in Mobile Networks

Lei Yang 1, Tsu-Yang Wu 1, Zhiyuan Lee 1, Chien-Ming Chen 1, King-Hang Wang 2, Jeng-Shyang Pan 1, Shu-Chuan Chu 1, and Mu-En Wu 3

1 College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
  [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
2 Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
  [email protected]
3 Department of Information and Finance Management, National Taipei University of Technology, Taipei 10608, Taiwan, R.O.C.
  [email protected]

Abstract. With the rapid development of mobile networks, secure communication technologies for mobile users have received much attention from researchers. Recently, Chung et al. proposed an anonymous mutual authentication scheme for inter-device communication in mobile networks. Previous work has shown that their scheme has some security weaknesses. In this paper, we further point out that their scheme violates perfect forward secrecy and is insecure against a replay attack.

1 Introduction

With the development of science and technology [15–17], mobile devices are widely used and applied in several environments. Mobile network environments are particularly important for mobile device applications, because they involve the security of user communications [19]. A typical client-server architecture in the mobile network environment is shown in Fig. 1. During communication, users expect their messages to be protected; to this end, several mutual authentication schemes have been proposed [1–3, 9–11, 14]. In the mobile network environment, a trusted server is responsible for completing the anonymous authentication of both users [6, 7, 13]; after successful authentication, a session key can be established for communication. In 2015, Saravanan et al. [8] proposed an anonymous security authentication scheme for users of global mobile network roaming services [12]. In 2016, Chung et al. [4] proposed an authentication scheme with anonymity. In 2017, Feng et al. [5] proposed a smart card based authentication scheme for multi-server environments.


Fig. 1. A typical client-server architecture

In 2018, Wu et al. [18] found that Chung et al.'s scheme is insecure against a denial of service attack and a user simulation attack. In this paper, we also show that Chung et al.'s scheme violates perfect forward secrecy and is insecure against a replay attack. In our proposed replay attack, a malicious user can impersonate some user to construct a session key with another user; meanwhile, the server believes the malicious user is legal in the session.

2 Review of Chung et al.'s Scheme [4]

Chung et al.'s scheme involves two types of entities: mobile devices and a trusted server.

2.1 Registration

When a mobile device M with identity ID_M wants to register with the server S, S selects a random number x_M and computes

VID_M = h(ID_M ||x_M ||x),   (1)

where x is S's secret key. Then, S sends (VID_M, x_M, h(x)) to M via a secure channel and stores it in its database.

2.2 Authentication and Session Key Establishment

In this phase, M1 and M2 execute the following steps to achieve mutual authentication and session key establishment.
1. M1 generates n_{M1}, r_{M1} and computes

SID_{M1} = VID_{M1} ⊕ h(h(x)||n_{M1})   (2)
V1 = r_{M1} ⊕ h(x_{M1}||n_{M1})   (3)
H1 = h(x_{M1}||SID_{M1}||V1||n_{M1}).   (4)

Then, M1 sends m1 = {SID_{M1}, V1, H1, n_{M1}, ID_S} to M2 (a concrete sketch of this step is given after the list).


2. Upon receiving m1 , M2 first checks the validity the IDS . Then, M2 generates a nonce nM2 and computes SIDM2 = V IDM2 ⊕ h(h(x)||nM2 )

(5)

H2 = h(xM2 ||SIDM2 ||nM2 )

(6)

Then, M2 sends m2 = {SIDM1 , V1 , H1 , nM1 , SIDM2 , H2 , nM2 } to S. 3. Upon receiving m2 , S computes V IDM1 = SIDM1 ⊕ h((h(x)||nM1 )

(7)

and V IDM2 = SIDM2 ⊕ h((h(x)||nM2 ). (8) Then, S retrieves the records (V IDM1 , IDM1 , xM1 ), (V IDM2 , IDM2 , xM2 ) and verifies H1 , H2 . If two verifications hold, S generates a nonce nS and computes (9) rM1 = V1 ⊕ h(xM1 ||nM1 ) V2 = rM1 ⊕ h(xM2 ||nM2 )

(10)

V3 = h(xM2 ||V IDM1 ||nM2 )

(11)

V4 = h(xM1 ||V IDM2 ||nM1 )

(12)

H3 = h(xM2 ||V2 ||V3 ||V4 ||nS ).

(13)

Finally, S sends m3 = {V2 , V3 , V4 , H3 , nS } to M2 . 4. Upon receiving m3 , M2 first checks H3 . If the verifications hold, M2 computes V IDM1 = SIDM1 ⊕ h((h(x)||nM1 )

(14)

and checks V3 , computes rM1 = V2 ⊕ h(xM2 ||nM2 ).

(15)

Then, M2 generates a nonce rM2 and computes SK = h(h(x)||rM1 ||rM2 ||V IDM1 ||V IDM2 )

(16)

V5 = rM2 ⊕ h(h(x)||rM1 )

(17)

V6 = h(SK||nM1 )

(18)

H4 = h(rM1 ||SIDM2 ||V4 ||V5 ||V6 ||nM2 )

(19)

and sends m4 = {SIDM2 , V4 , V5 , V6 , H4 , nM2 } to M1 . 5. Upon receiving m4 , M1 first checks the validity of H4 . Then, M1 computes V IDM2 = SIDM2 ⊕ h((h(x)||nM2 )

(20)

rM2 = V5 ⊕ h(h(x)||rM1 )

(21)

SK = h(h(x)||rM1 ||rM2 ||V IDM1 ||V IDM2 )

(22)

Then, M1 checks the validity of V4 and V6 , and computes V7 = h(SK||nM2 )

(23)

and sends m5 = {V7 } to M2 . 6. After receiving m5 , M2 checks the validity of V7 . If it is valid, SK is the session key for M1 and M2 .

Cryptanalysis of an Anonymous Mutual Authentication Scheme

3

465

Cryptanalysis of Chung et al.’s Scheme

In this section, we show that Chung et al.’s scheme violates perfect forward secrecy and suffered from a replay attack. In our proposed attacks, we assume that attacker ME is an legal user, which has registered to server. 3.1

Violating Perfect Forward Secrecy

In this attack, we assume that ME has obtained M1 ’s secret key xM1 and captured m1 , m4 . Then, ME can recover session key as follows. 1. To recover V IDM1 = SIDM1 ⊕ h(h(x)||nM1 )

(24)

rM1 = V1 ⊕ h(xM1 ||nM1 )

(25)

V IDM2 = SIDM2 ⊕ h(h(x)||nM2 )

(26)

rM2 = V5 ⊕ h(h(x)||rM1 )

(27)

and from m1 . 2. To recover

and from m4 . 3. Session key SK can recovered by SK = h(h(x)||rM1 ||rM2 ||V IDM1 ||V IDM2 ). 3.2

(28)

Replay Attack

In this attack, ME can impersonate M2 to establish a session key with M1 . Meanwhile, the server S will think ME is M2 . Our attack contains following two parts. Part 1. ME launches a request to M2 and then gathers some information for ME needs. 1. To perform the step 1 of the authentication and session key establishment phase. ME generates nME , rME and computes SIDME , VE , HE . Then, ME sends m1 = {SIDME , VE , HE , nME , IDS } to M2 . 2. When M2 sends m2 to S, ME eavesdrops m2 and obtains {SIDM2 , H2 , nM2 }. 3. After S authenticates ME and M2 , S sends m3 to M2 . At the same time, ME eavesdrops m3 to M2 . Note that ME can further compute h(xM2 ||nM2 ) = V2 ⊕ rME .

(29)

466

L. Yang et al.

Part 2. If M1 launches a request to M2 , then ME can impersonate M2 with the information obtained in Part 1. 1. Upon receiving m1 from M1 , ME sends m2 to S, where m2 = {SIDM1 , V1 , H1 , nM1 , SIDM1 , H2 , nM2 }.

(30)

Note that ME has obtained {SIDM2 , H2 , nM2 } in Part 1. 2. Upon receiving m2 , S first checks the validity of m2 . It is easy to see that it can pass all verifications. In other words, S believes m2 is sent by M2 . In fact, it is sent by ME . 3. Upon receiving m3 from S, ME can compute rM1 = V2 ⊕ h(xM2 ||nM2 )

(31)

and then sends m4 to M1 . 4. Finally, both M1 and ME can establish a session SK.

4

Conclusion

In this paper, we have reviewed Chung et al.’s scheme and found that their scheme violates perfect forward secrecy. Meanwhile, we further proposed a replay attack in their scheme.

References 1. Chen, C.M., Huang, Y., Wang, E.K., Wu, T.Y.: Improvement of a mutual authentication protocol with anonymity for roaming service in wireless communications. Data Sci. Pattern Recogn. 2(1), 15–24 (2018) 2. Chen, C.M., Wang, K.H., Fang, W., Wu, T.Y., Wang, E.K.: Reconsidering a lightweight anonymous authentication protocol. J. Chin. Inst. Eng. 42(1), 9–14 (2019) 3. Chen, C.M., Wang, K.H., Wu, T.Y., Wang, E.K.: On the security of a three-party authenticated key agreement protocol based on chaotic maps. Data Sci. Pattern Recogn. 1(2), 1–10 (2017) 4. Chung, Y., Choi, S., Won, D.: Anonymous mutual authentication scheme for secure inter-device communication in mobile networks. In: International Conference on Computational Science and Its Applications, pp. 289–301. Springer, Heidelberg (2016) 5. Feng, Q., He, D., Zeadally, S., Wang, H.: Anonymous biometrics-based authentication scheme with key distribution for mobile multi-server environment. Future Gen. Comput. Syst. 84, 239–251 (2018) 6. Jegadeesan, S., Azees, M., Kumar, P.M., Manogaran, G., Chilamkurti, N., Varatharajan, R., Hsu, C.H.: An efficient anonymous mutual authentication technique for providing secure communication in mobile cloud computing for smart city applications. Sustain. Cities Soc. 49, 101522 (2019)

Cryptanalysis of an Anonymous Mutual Authentication Scheme

467

7. Jia, X., He, D., Kumar, N., Choo, K.K.R.: A provably secure and efficient identitybased anonymous authentication scheme for mobile edge computing. IEEE Syst. J. (2019) 8. Karuppiah, M., Saravanan, R.: A secure authentication scheme with user anonymity for roaming service in global mobility networks. Wirel. Pers. Commun. 84(3), 2055–2078 (2015) 9. Kora´c, D., Simi´c, D.: Fishbone model and universal authentication framework for evaluation of multifactor authentication in mobile environment. Comput. Secur. 85, 313–332 (2019) 10. Li, C.T., Wu, T.Y., Chen, C.M.: A provably secure group key agreement scheme with privacy preservation for online social networks using extended chaotic maps. IEEE Access 6, 66742–66753 (2018) 11. Liang, X.C., Wu, T.Y., Lee, Y.Q., Chen, C.M., Yeh, J.H.: Cryptanalysis of a pairing-based anonymous key agreement scheme for smart grid. In: Advances in Intelligent Information Hiding and Multimedia Signal Processing, pp. 125–131. Springer, Heidelberg (2020) 12. Madhusudhan, R., et al.: A secure and lightweight authentication scheme for roaming service in global mobile networks. J. Inf. Secur. Appl. 38, 96–110 (2018) 13. Mo, J., Hu, Z., Lin, Y.: Remote user authentication and key agreement for mobile client-server environments on elliptic curve cryptography. J. Supercomput. 74(11), 5927–5943 (2018) 14. Wang, K.H., Chen, C.M., Fang, W., Wu, T.Y.: A secure authentication scheme for internet of things. Pervasive Mob. Comput. 42, 15–26 (2017) 15. Wu, T.Y., Chen, C.M., Wang, K.H., Meng, C., Wang, E.K.: A provably secure certificateless public key encryption with keyword search. J. Chinese Inst. Eng. 42(1), 20–28 (2019) 16. Wu, T.Y., Chen, C.M., Wang, K.H., Pan, J.S., Zheng, W., Chu, S.C., Roddick, J.F.: Security analysis of Rhee et al.’s public encryption with keyword search schemes: a review. J. Netw. Intell. 3(1), 16–25 (2018) 17. Wu, T.Y., Chen, C.M., Wang, K.H., Wu, J.M.T.: Security analysis and enhancement of a certificateless searchable public key encryption scheme for IIoT environments. IEEE Access 7, 49232–49239 (2019) 18. Wu, T.Y., Fang, W., Chen, C.M., Wang, G.: Cryptanalysis of an anonymous mutual authentication scheme for secure inter-device communication in mobile networks. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 206–213. Springer, Heidelberg (2017) 19. Xie, Q., Tang, Z., Chen, K.: Cryptanalysis and improvement on anonymous threefactor authentication scheme for mobile networks. Comput. Electr. Eng. 59, 218– 230 (2017)

A Lightweight Anonymous Mutual Authentication Scheme in Mobile Networks Zhiyuan Lee1 , Tsu-Yang Wu1(B) , Lei Yang1 , Chien-Ming Chen1 , King-Hang Wang2 , Jeng-Shyang Pan1 , Shu-Chuan Chu1 , and Yeh-Cheng Chen3 1

3

College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China [email protected], [email protected], [email protected], [email protected], [email protected], [email protected] 2 Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong [email protected] Department of Computer Science, University of California, Davis, CA 95616, USA [email protected]

Abstract. Today, many security risks are exposed due to the rapid development of mobile communications. Anonymous mutual authentication allows entities to authenticate without revealing any identity information. Recently, numerous the kinds of anonymous mutual authentication schemes were proposed. However, most of them have some security weaknesses. In this paper, we refer the previous works to propose a lightweight anonymous mutual authentication scheme in mobile networks. Informal security analysis is shown that our scheme can resist several attacks. Finally, the performance analysis and comparisons is shown that our scheme is efficient. Keywords: Authentication · Key agreement Cryptanalysis · Mobile networks

1

· Anonymity ·

Introduction

Authentication [2] is a cryptographic primitive which provides entities to authenticate each other. Numerous authentication schemes [1,6,7,9,10,13, 17] are designed in smart grid or wireless sensor networks for the secure communications. With the rapid growth of information age [14], the usage of mobile devices (smart phones, personal digital assistants, etc.) become more widely and popularly. However, users are exposed to various security risks in an open network such as mobile networks. Recently, several authentication schemes are designed for mobile networks have been proposed. In 2015, Shin et al. [12] proposed an c Springer Nature Singapore Pte Ltd. 2020  J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 468–473, 2020. https://doi.org/10.1007/978-981-15-3308-2_51

A Lightweight Anonymous Mutual Authentication Scheme

469

efficient authentication protocol with only hash functions and exclusive-or operations. However, Farash et al. [4] showed that Shin et al.’s protocol violated untraceablility and then proposed an improved protocol. In 2018, Qiu et al. [11] proposed an improved scheme for session initiation protocol in communication networks, and considered their scheme to be very secure. However, in 2019, Zhang et al. [16] analyzed Qiu et al.’s scheme and found a serious error, in which the attacker could affect the user password update and even access the important information of legitimate users. Then, they proposed a more secure mutual authentication protocol. In order to make the entire authentication process more secure, bioinformatics are usually added during the protocol. Kumari et al. [8] proposed a cloud server authentication scheme based on biometrics. Although their scheme claims to provide anonymity to users and satisfy a number of security features, it was later discovered by Feng et al. [5] that Kumari et al.’s scheme did not actually provide anonymity to users. In order to achieve the communication securely in mobile networks, Chung et al. [3] proposed a new anonymous authentication scheme. However, Wu et al. [15] found that Chung et al.’s scheme violated untraceability. In this paper, we based on Chung et al.’s protocol to propose an anonymous mutual authentication scheme in mobile networks which overcomes the security weaknesses mentioned in [15]. Informal security analysis is shown that our scheme is secure against replay, impersonate, and knowing session key attacks as well as provides anonymity. Finally, performance analysis and comparisons are made to show the efficiency of our scheme.

2

The Proposed Scheme

Our scheme consists of two phases: the registration phase and the authentication phase. Note that, in the registration phase, the communication is under on a secure channel. 2.1

Registration Phase

In this phase, a mobile device Mi selects his identity IDi to server S. After checking the validity of IDi , S selects secret key x, random value xi to computes V IDi = h(IDi xi x)

(1)

Then, S stores {V IDi , IDi , xi } in its database and sends back {V IDi , xi , h(x)} to Mi . Finally, Mi stores these values in his memory securely. 2.2

Authentication Phase

Mi and Mj can perform the following steps to authenticate each other and establishing a session key.

470

Z. Lee et al.

1. Mi generates two nonces ni , ri and retrieves {V IDi , xi , h(x)}. Then, it computes (2) V1 = ri ⊕h(xi ni ) H1 = h(xi V IDi V1 ni )

(3)

and sends m1 = {V IDi , V1 , Hi , ni } to Mj . 2. Upon receiving m1 from Mi , Mj generates a random number nj , retrieves {V IDj , xj , h(x)} and computes H2 = h(xj V IDj nj )

(4)

Then, Mj sends m2 = {V IDi , V IDj , V1 , H1 , H2 , ni , nj } to S. from Mj , S retrieves {V IDi , IDi , xi } 3. Upon receiving m2 and {V IDj , IDj , xj } from the database and then checks the validity of H1 and H2 . If both verifications are valid, it means that S authenticates both Mi and Mj . 4. S generates x∗i , x∗j , nS and computes new virtual identities V IDi∗ for Mi and V IDj∗ for Mj where V IDi∗ = h(IDi x∗i x) and V IDj∗ = h(IDj x∗j x). Then, S updates {V IDi , IDi , xi } and {V IDj , IDj , xj } with {V IDi∗ , IDi , x∗i } and {V IDj∗ , IDj , x∗j } respectively. Finally, S computes Vsi = x∗i ⊕h(xi nS )

(5)

Vsj = x∗j ⊕h(xj nS )

(6)

V2 = ri ⊕h(x∗j nj )

(7)

H3 = h(xj V

IDj∗ V

IDi∗ Vsi Vsj V2 nS )

(8)

and sends m3 = {V IDi∗ , V IDj∗ , Vsi , Vsj , V2 , nS } to Mj . 5. Mj computes the required values to check H3 with xj . Also, it obtains it’s new secret key x∗j where x∗j = Vsj ⊕h(xj nS ). Now Mj can update it’s record {V IDj , xj , h(x)} with {V IDj∗ , x∗j , h(x)}. After that, Mj computes ri = V2 ⊕h(x∗j nj )

(9)

SK = h(h(x)ri rj V IDi∗ V IDj∗ )

(10)

V3 = rj ⊕h(h(x)ri )

(11)

V4 = h(SKni )

(12)

H4 = h(ri V3 V4 nj ),

(13)

where rj is a random number generated by Mj . Finally, it sends m4 = {V IDi∗ , Vsi , V3 , V4 , nj , nS } to Mi .

A Lightweight Anonymous Mutual Authentication Scheme

471

6. Upon receiving m4 , Mi also computes the required values to check H4 with xi . Then, it obtains it’s new secret key x∗i = Vsi ⊕h(xi nS ). Now, Mi can update it’s record {V IDi , xi , h(x)} with {V IDi∗ , x∗i , h(x)}. Finally, Mi computes rj = V3 ⊕h(h(x)ni )

(14)

SK = h(h(x)ri rj V IDi∗ V IDj∗ )

(15)

V5 = h(SKnj )

(16)

and sends m5 = {V5 } to Mj . 7. Upon receiving m5 , Mj first verifies the validity of V5 . If it is true, the established session key between Mi and Mj is defined by SK = h(h(x)ri rj V IDi∗ V IDj∗ ).

3 3.1

(17)

Informal Security Analysis of the Proposed Scheme Replay Attack

In our scheme, the server updates {V IDi , IDi , xi } in each session. It can effectively prevent an adversary A to launch replay attacks. Although can compute the value of h(xi ni ), he cannot obtain ri with the value of h(xi ni ). It means that A cannot compute the session key and further to impersonate Mi . 3.2

Denial of Service Attack

Fortunately, A can launch denial of service attack because the renewal request is integrated to the authentication phase. However, A has no ability to change the virtual identity of registered mobile device in our design. 3.3

User Impersonation Attack

The adversary A may try to impersonate a registered mobile device by eavesdropping communication messages in the authentication phase. In our design, A has no change to calculate the precise session key with the messages he has eavesdropped. Therefore, the proposed protocol is secure against the user impersonation attack. 3.4

Knowing Session Key Attack

Knowing session key attack means that if A realizes the session key which was previously used, he cannot obtain the current session key. In our scheme, the calculation of session key SK = (h(h(x)ri rj V IDi∗ V IDj∗ ))

(18)

is linked to the random number ri and rj . Even A gets the previous SK, it still cannot deduce the current session key which is generated by the random number of current session. For the reason, the proposed protocol is secure against knowing session key attack.

472

Z. Lee et al.

3.5

Anonymity

A can obtain V IDi of any mobile device Mi . This attacker may try to guesses the real identity IDi . However, in our scheme V IDi is calculated as h(IDi xi x) and A does not know the value of xi and x. It means that A has no ability to check the correctness of IDi . For this reason, the proposed protocol guarantees the anonymity.

4

Performance Analysis

In order to analyze the performance of our proposed scheme, we use the symbols th denotes a hash operation, txor denotes a bitwise XOR operation, tsym denotes a symmetric encryption/decryption operation, and tm denotes a point multiplication operation in elliptic curve. In Table 1, we demonstrate the comparison of the computational cost between our scheme and previous works [3,5,8,12]. The multiplication of elliptic curves takes much longer than the hash operation. It is obvious that our scheme is efficient. Table 1. Performance comparisons

5

Schemes

Mi

Mj

S

Total

Shin et al. [12]

8th + 7txor

2tsym + 1th

2tsym + 3th + 4txor

4tsym + 12th

Chung et al. [3]

10th + 4txor

11th + 4txor

9th + 4txor

30th

Kumari et al. [8] 3tm + 5th + 3txor

3tm + 5th + 2txor

2tm + 6th + 2txor

8tm + 16th

Feng et al. [5]

3tm + 7th + 3txor

3tm + 7th + 2txor

2tm + 10th + 4txor

8tm + 24th

Ours

8th + 3txor

9th + 3txor

9th + 4txor

26th

Conclusion

In this paper, we have reviewed the recent authentication schemes in mobile networks. Then, an anonymous authentication scheme based on mobile network environment is proposed. Informal security analysis shows that our scheme provide anonymity is secure against replay, impersonation, and knowing session key attacks.

References 1. Chen, C.M., Huang, Y., Wang, E.K., Wu, T.Y.: Improvement of a mutual authentication protocol with anonymity for roaming service in wireless communications. Data Sci. Pattern Recognit. 2(1), 15–24 (2018) 2. Chen, C.M., Wang, K.H., Wu, T.Y., Wang, E.K.: On the security of a three-party authenticated key agreement protocol based on chaotic maps. Data Sci. Pattern Recognit. 1(2), 1–10 (2017)

A Lightweight Anonymous Mutual Authentication Scheme

473

3. Chung, Y., Choi, S., Won, D.: Anonymous mutual authentication scheme for secure inter-device communication in mobile networks. In: International Conference on Computational Science and Its Applications, pp. 289–301. Springer (2016) 4. Farash, M.S., Chaudhry, S.A., Heydari, M., Sajad Sadough, S.M., Kumari, S., Khan, M.K.: A lightweight anonymous authentication scheme for consumer roaming in ubiquitous networks with provable security. Int. J. Commun. Syst. 30(4), e3019 (2017) 5. Feng, Q., He, D., Zeadally, S., Wang, H.: Anonymous biometrics-based authentication scheme with key distribution for mobile multi-server environment. Future Gener. Comput. Syst. 84, 239–251 (2018) 6. He, D., Kumar, N., Chen, J., Lee, C.C., Chilamkurti, N., Yeo, S.S.: Robust anonymous authentication protocol for health-care applications using wireless medical sensor networks. Multimed. Syst. 21(1), 49–60 (2015) 7. Jabbari, A., Mohasefi, J.: Improvement in new three-party-authenticated key agreement scheme based on chaotic maps without password table. Nonlinear Dyn. 95(4), 3177–3191 (2019) 8. Kumari, S., Li, X., Wu, F., Das, A.K., Choo, K.K.R., Shen, J.: Design of a provably secure biometrics-based multi-cloud-server authentication scheme. Future Gener. Comput. Syst. 68, 320–330 (2017) 9. Liang, X.C., Wu, T.Y., Lee, Y.Q., Chen, C.M., Yeh, J.H.: Cryptanalysis of a pairing-based anonymous key agreement scheme for smart grid. In: Advances in Intelligent Information Hiding and Multimedia Signal Processing, pp. 125–131. Springer (2020) 10. Mahmood, K., Li, X., Chaudhry, S.A., Naqvi, H., Kumari, S., Sangaiah, A.K., Rodrigues, J.J.: Pairing based anonymous and secure key agreement protocol for smart grid edge computing infrastructure. Future Gener. Comput. Syst. 88, 491– 500 (2018) 11. Qiu, S., Xu, G., Ahmad, H., Guo, Y.: An enhanced password authentication scheme for session initiation protocol with perfect forward secrecy. PLoS ONE 13(3), e0194072 (2018) 12. Shin, S., Yeh, H., Kim, K.: An efficient secure authentication scheme with user anonymity for roaming user in ubiquitous networks. Peer-to-Peer Netw. Appl. 8(4), 674–683 (2015) 13. Wang, F., Xu, Y., Zhang, H., Zhang, Y., Zhu, L.: 2FLIP: a two-factor lightweight privacy-preserving authentication scheme for VANET. IEEE Trans. Veh. Technol. 65(2), 896–911 (2015) 14. Wu, T.Y., Chen, C.M., Wang, K.H., Wu, J.M.T.: Security analysis and enhancement of a certificateless searchable public key encryption scheme for IIoT environments. IEEE Access 7, 49232–49239 (2019) 15. Wu, T.Y., Fang, W., Chen, C.M., Wang, G.: Cryptanalysis of an anonymous mutual authentication scheme for secure inter-device communication in mobile networks. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 206–213. Springer (2017) 16. Zhang, Y., Xie, K., Ruan, O.: An improved and efficient mutual authentication scheme for session initiation protocol. PLoS ONE 14(3), e0213688 (2019) 17. Zhou, Y., Liu, T., Tang, F., Tinashe, M.: An unlinkable authentication scheme for distributed IoT application. IEEE Access 7, 14757–14766 (2019)

Forward Privacy Analysis of a Dynamic Searchable Encryption Scheme Zhuoyu Tie1, Eric Ke Wang1, Jyh-Haw Yeh2, and Chien-Ming Chen3(&) 1

3

Harbin Institute of Technology (Shenzhen), Shenzhen, China 2 Boise State University, Boise, ID, USA Shandong University of Science and Technology, Qingdao, Shandong, China [email protected]

Abstract. Dynamic searchable encryption is one branch of searchable encryption. Compared with the static searchable encryption, a dynamic searchable encryption scheme can support update (add or delete) of outsourced data. However, this kind of operation may cause data leakage. Forward privacy is an important character for dynamic scheme to limit leakage of inserted document. It requests that a previous search token cannot be linked to later inserted document. In this paper, we demonstrate that a dynamic searchable symmetric encryption scheme proposed recently does not satisfy the forward privacy. It means that the cloud server can realize whether or not a newly added document contains any of the keywords used in previous searches. Keywords: Dynamic searchable encryption  Forward privacy  Cloud storage

1 Introduction Searchable encryption is a special type of encryption scheme, first proposed by Song et al. [1] in 2000. In a typical searchable encryption scheme, a data owner performs searchable encryption on the outsourced data, and a user generates a search token by using the search keyword and sends it to a server. Then, the server searches for the encrypted data according to the token, and finally returns the search results to the user. During the whole process, data and query that server can achieve are encrypted; thus, the privacy of the outsourced data is protected. Now, searchable encryption schemes have developed several branches, such as searchable symmetric encryption [2–5], public key encryption with keyword search [6–10], multi-keywords ranked searchable encryption, fuzzy searchable encryption [8, 11, 14], verifiable searchable encryption [4, 12–15], and dynamic searchable encryption [3, 4, 16–20], etc. In this paper, we put emphasis on dynamic searchable encryption schemes. A dynamic searchable encryption scheme can support the update (upload or delete) of the outsourced data. However, this kind of operation may cause data leakage or other security problems. In 2014, Stefanov et al. [20] firstly proposed the concept of forward privacy for searchable encryption. It refers to the fact that the server cannot distinguish whether the search token generated in the past can match the newly added document. Also, in a © Springer Nature Singapore Pte Ltd. 2020 J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 474–480, 2020. https://doi.org/10.1007/978-981-15-3308-2_52

Forward Privacy Analysis of a Dynamic Searchable Encryption Scheme

475

dynamic searchable encryption, forward privacy is extremely important. In 2016, Zhang et al. [21] proposed a new attack, named file injection attack, which is a strong threat to dynamic searchable encryption schemes. In this attack, the cloud server can easily attack those dynamic schemes that do not satisfy the forward privacy. The cloud server now can recover the content of search tokens generated by the data user before. For these reason, various dynamic searchable encryption schemes satisfy the forward privacy [4, 20, 22–24] are proposed. In this paper, we point out that a dynamic searchable symmetric encryption scheme named DSSE [25] fails to achieve forward privacy. We also provide a simple way to fix its failure.

2 DSSE Scheme In this section, we briefly describe the DSSE [25] scheme. There are three different entities in DSSE, data owner, data user, and cloud server, as shown in Fig. 1.

Fig. 1. A figure illustrates how different entities work in a dynamic searchable encryption scheme.

Data Owner. Adata owner builds secure index tree c for plaintext documents f ¼ ff1 ; f2 ; . . .; fn g, and generate encrypted documents c ¼ fc1 ; c2 ; c3 ; . . .; cn g. Then, data owner outsources ðc; cÞ to cloud server to storage. Data Users. Data users are under data owner’s control. When an authorized data user desires to search on collection f ¼ ff1 ; f2 ; . . .; fn g with keywords fwi g, he queries to data owner. This authorized user can generate a search token ss according to search control mechanisms to fetch encrypted documents from cloud server. The data user finally decrypts the documents with shared secret key. Cloud Server. A cloud server stores the encrypted documents c ¼ fc1 ; c2 ; c3 ; . . .; cn g and secure index tree c. After receiving a search token ss from a data user, cloud server uses ss to search on c. In searchable encryption scene, cloud

476

Z. Tie et al.

server is usually regard as a semi-trusted entity, meaning that the cloud server honestly and correctly executes instructions in the designated protocol. Meanwhile, it is curious to infer and analyze received data, which helps it acquire additional information. The following is the detailed steps of DSSE  • K Gen 1k : The data owner generates k-bit K1 , K2 , and r. lets  e:Gen 1k ; r , where e is a CPA-secure encryption scheme to encrypt docuK3 ments. K ¼ fK1 ; K2 ; K3 g. • d buildindexðf Þ: The data owner builds a binary tree d for the collection of documents f ¼ ff1 ; f2 ; . . .; fn g. The leaf nodes point to documents, and an inner nodes u store a m-bit vector datau account for keyword wi , for i ¼ 1; 2; . . .; m. If datau ½i ¼ 1, it means that some documents pointed by leaf nodes of u contain keyword wi . datau ¼ datal þ datar , where nodes l and r are the children of node u, and “+” is the bitwise Boolean OR operation. • ðc; cÞ EncðK; d; f Þ: (a) For each keyword wi , i ¼ 1; 2; . ..; m, the data owner gives a corresponding secrete key SKi R:Gen 1k ; GK2 ðwi Þ , where R is another CPA-secure encryption scheme, and G is a pseudo-random function with k-bit  e:Enc K3 ; fij . output. (b) For each document fij , j ¼ 1; 2; . . .; n, generates cij (c) Initialize two k  m hash table k0v and k1v . (d) For a node v of index tree d, i ¼ 1; 2; . . .; m, sets kbv ½PK1 ðwi Þ R:EncðSKi ; datav ½iÞ, where b is a bit calculate by a random oracle H, and b ¼ HðPK1 ðwi Þ; idðvÞÞ. (e) Store a random string at kð1bÞv ½PK1 ðwi Þ. (f) Creates a new node v0 to stores k0v and k1v . Constructs encrypted index c with v0 , according to d.  • ss SrchTokenðK; wi Þ: Calculates SKi R:Gen 1k ; GK2 ðwi Þ , and generates search token ss ¼ ðPK1 ðwi Þ; SKi Þ. • cw Searchðc; c; ss Þ: Given ss ¼ ðPK1 ðwi Þ; SKi Þ, the cloud server firstly calculates b ¼ HðPK1 ðwi Þ; idðvÞÞ; and start search with root node. The cloud server calculates a ¼ R:DecðSKi ; kbv ½PK1 ðwi ÞÞ. If a ¼ 0, return; If a 6¼ 0 and the node u is a leaf, set cw ¼ cw [ cv . If a 6¼ 0 and u is an inner node, recursively search its children nodes. • infoi;u UpdHelperði; u; c; cÞ: The data owner generates update information containing a subtree dðuÞ corresponding the part of index tree d, which contain all the changed nodes because of the update operation. i is identifier, and u is ‘insertion’ or ‘deletion’.  UpdToken K; fi ; infoi;u : The data owner encrypts the subtree of infoi;u in the • su same way of step ðc; cÞ EncðK; d; f Þ and get cðuÞ. Encrypts the document fi and get ci e:EncðK3 ; ci Þ at the same time. Then sends su ¼ ðcðuÞ; ci Þ to the cloud server. • ð c0 ; c0 Þ Updateðc0 ; c; su Þ: The cloud server replaces the corresponding subtree of c with cðuÞ, and get new encrypted index tree c0 . If su is ‘insertion’ token, adds ci into c. Else if su is ‘deletion’ token, finds and removes ci from c. Get new ciphertext collection c0 . • fi DecðK; ci Þ: decrypts ciphertext fi e:EncðK3 ; ci Þ.

Forward Privacy Analysis of a Dynamic Searchable Encryption Scheme

477

3 Forward Privacy In this section, we demonstrate that DSSE fails to achieve forward privacy. This is because a previous search token can be linked to a new added document. It will leakage the information of new added document, and also make the scheme fragile to file injection attack (leakage the query information). The details are listed as follows. Assuming that the data owner keeps secrete key K ¼ fK1 ; K2 ; K3 g. He has already built encrypted index tree c and documents c for the collection of documents f ¼ ff1 ; f2 ; . . .; fn g. Both c and c have sent to the cloud server. (1) The data owner generates  a search token ss for keyword wi . He calculates SKi R:Gen 1k ; GK2 ðwi Þ , and ss ¼ ðPK1 ðwi Þ; SKi Þ. (2) The data owner chooses a new document fn þ 1 containing keyword wi and prepares to insert it into c and c stored by the cloud server. Executes UpdHelperðn þ 1; u; c; cÞ, where u is ‘insertion’, and gets update information infoi;u that contains a subtree dðuÞ.  (3) The data owner calls UpdToken K; fi ; infoi;u to generate update token su ¼ ðcðuÞ; cn þ 1 Þ, where ðcðuÞ; cn þ 1 Þ EncðK; dðuÞ; ffn þ 1 gÞ. Then, he sends su to the cloud server. (4) Given update token su ¼ ðcðuÞ; cn þ 1 Þ, the cloud server replaces the corresponding subtree of c and adds cn þ 1 into c. Renames the latest encrypted index tree as c0 , and ciphertext documents as c0 : (5) The data owner resends previous search token ss ¼ ðPK1 ðwi Þ; SKi Þ to the cloud server. The cloud server does Searchðc; c; ss Þ process. For the leaf node v of subtree cðuÞ (pointing to fn þ 1 ), calculates a ¼ R:DecðSKi ; kbv ½PK1 ðwi ÞÞ ¼ 1. The cloud server adds corresponding cv (also is cn þ 1 ) into result cw ¼ cw [ cv . (6) The cloud server sends cw to the cloud server. It is obvious that the previous search token ss is linked to the new inserted document fn þ 1 . The main reason is that the search token ss ¼ ðPK1 ðwi Þ; SKi Þ is always valid for hash table kbv ½PK1 ðwi Þ in the CPA-secure encrypt scheme R, if both of them are generated by the same secrete key K1 and K2 .

4 Discussion The reason that DSSE fails to achieve forward privacy is that DSSE uses the same secrete key. It leads to the validity of previous search token to any later inserted document. To avoid this situation, we describe a simple and straightforward method. We try to replace a fresh secrete key to generate a update token. We can make some change for ‘insertion’ update as follow: • infoi;u UpdHelperði; u; c; cÞ: The data owner generates update information containing a subtree dðuÞ. Then the data owner pick fresh secret key k-bit K10 , K20 , and K10 , K2 K20 . lets K1 Thus, all of the previous search token are disable to link to later inserted document.

478

Z. Tie et al.

It should be noted that all new search token is also invalid to previous document because of fresh secrete key, meaning disability to do search on previous document. We need to use fresh secrete key to re-encrypt previous documents:  UpdToken K; fi ; infoi;u : Different than before, the data owner firstly replaces • su the corresponding subtree of d with dðuÞ, and encrypt whole index tree and added document with secrete key (already changed). Then sends su ¼ ðc0 ; ci Þ to the cloud server, where c0 is latest encrypted index tree. • ð c0 ; c0 Þ Updateðc; su Þ: The cloud server receives the latest encrypted index tree c0 and ciphertext of added document ci . c0 ¼ c [ ci . Proceed as above, each time the data owner tries to insert a new document, he must re-encrypt whole index tree. This modification is easy to realize, while the cost of reencryption of index tree is high.

5 Conclusion Various searchable encryption schemes have been shown insecure [26–33]. In this paper, we point out that Kamara’s scheme DSSE fail to achieve forward privacy. We also propose a simple and straightforward way to fix it. We believe that there is room for reducing cost brought by re-encryption of index tree, and we will focus on it in our future work.

References 1. Song, D.X., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: Proceeding 2000 IEEE Symposium on Security and Privacy, S&P 2000, pp. 44–55. IEEE, Berkeley (2000) 2. Curtmola, R., Garay, J., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. J. Comput. Secur. 19(5), 895–934 (2011) 3. Kamara, S., Papamanthou, C., Roeder, T.: Dynamic searchable symmetric encryption. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, pp. 965–976. ACM, Raleigh, October 2012 4. Bost, R., Fouque, P.A., Pointcheval, D.: Verifiable dynamic symmetric searchable encryption: optimality and forward security. IACR Cryptology ePrint Archive 2016, 62 (2016) 5. Kamara, S., Moataz, T.: Boolean searchable symmetric encryption with worst-case sublinear complexity. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques, pp. 94–124. Springer, Cham (2017) 6. Wu, T.Y., Chen, C.M., Wang, K.H., Wu, M.T.: Security analysis and enhancement of a certificateless searchable public key encryption scheme for IIoT envionments. IEEE Access 7, 49232–49239 (2019) 7. Wu, T.Y., Chen, C.M., Wang, K.H., Meng, C., Wang, E.K.: A provably secure certificateless public key encryption with keyword search. J. Chin. Inst. Eng. 42, 20–28 (2019)

Forward Privacy Analysis of a Dynamic Searchable Encryption Scheme

479

8. Xu, P., Jin, H., Wu, Q., Wang, W.: Public-key encryption with fuzzy keyword search: a provably secure scheme under keyword guessing attack. IEEE Trans. Comput. 62(11), 2266–2277 (2012) 9. Zhang, B., Zhang, F.: An efficient public key encryption with conjunctive-subset keywords search. J. Netw. Comput. Appl. 34(1), 262–267 (2011) 10. Fang, L., Susilo, W., Ge, C., Wang, J.: Public key encryption with keyword search secure against keyword guessing attacks without random oracle. Inf. Sci. 238, 221–241 (2013) 11. Chen, C.M., Zhang, W., Wu, T.Y., Wang, K.H., Wu, J.M.T., Pan, J.S.: Hierarchical semantic approximate multi-keyword ranked search over encrypted data. In: 2017 International Conference on Smart Vehicular Technology, Transportation, Communication and Applications (2018) 12. Chai, Q., Gong, G.: Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers. In: 2012 IEEE International Conference on Communications (ICC), pp. 917– 922. IEEE, Ottawa (2012) 13. Cheng, R., Yan, J., Guan, C., Zhang, F., Ren, K.: Verifiable searchable symmetric encryption from indistinguishability obfuscation. In: Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security, pp. 621–626. ACM, Washington (2015) 14. Wang, J., Ma, H., Tang, Q., Li, J., Zhu, H., Ma, S., Chen, X.: Efficient verifiable fuzzy keyword search over encrypted data in cloud computing. Comput. Sci. Inf. Syst. 10(2), 667– 684 (2013) 15. Miao, Y., Ma, J., Jiang, Q., Li, X., Sangaiah, A.K.: Verifiable keyword search over encrypted cloud data in smart city. Comput. Electr. Eng. 65, 90–101 (2018) 16. Cash, D., Jarecki, S., Jutla, C., Krawczyk, H., Roşu, M.C., Steiner, M.: Highly-scalable searchable symmetric encryption with support for Boolean queries. In: Annual Cryptology Conference, pp. 353–373. Springer, Heidelberg (2013) 17. Xia, Z., Wang, X., Sun, X., Wang, Q.: A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 27(2), 340–352 (2015) 18. Guo, C., Chen, X., Jie, Y., Zhangjie, F., Li, M., Feng, B.: Dynamic multi-phrase ranked search over encrypted data with symmetric searchable encryption. IEEE Trans. Serv. Comput. (2017) 19. Ocansey, S.K., Ametepe, W., Li, X.W., Wang, C.: Dynamic searchable encryption with privacy protection for cloud computing. Int. J. Commun Syst 31(1), e3403 (2018) 20. Stefanov, E., Papamanthou, C., Shi, E.: Practical dynamic searchable encryption with small leakage. In: NDSS, vol. 71, pp. 72–75. Internet Society, San Diego (2014) 21. Zhang, Y., Katz, J., Papamanthou, C.: All your queries are belong to us: the power of fileinjection attacks on searchable encryption. In: 25th USENIX Security Symposium, pp. 707– 720. USENIX, Austin (2016) 22. Bost, R.: ∑ oφoς:: forward secure searchable encryption. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1143–1154. ACM, Vienna (2016) 23. Kim, K.S., Kim, M., Lee, D., Park, J.H., Kim, W.H.: Forward secure dynamic searchable symmetric encryption with efficient updates. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1449–1463. ACM, Dallas (2017) 24. Song, X., Dong, C., Yuan, D., Xu, Q., Zhao, M.: Forward private searchable symmetric encryption with optimized I/O efficiency. IEEE Trans. Dependable Secur. Comput. (2018) 25. Kamara, S., Papamanthou, C.: Parallel and dynamic searchable symmetric encryption. 
In: International Conference on Financial Cryptography and Data Security, pp. 258–274. Springer, Heidelberg (2013)

480

Z. Tie et al.

26. Wu, T.Y., Chen, C.M., Wang, K.H., Wu, J.M.T., Pan, J.S.: Security analysis of a public key authenticated encryption with keyword search scheme. In: International Conference on Intelligent Information Hiding and Multimedia Processing (2018) 27. Chen, C.M., Wang, K.H., Wu, T.Y., Wang, E.K.: On the security of a three-party authenticated key agreement protocol based on chaotic maps. Data Sci. Pattern Recognit. 1 (2), 1–10 (2017) 28. Wu, T.Y., Meng, C., Wang, K.H., Chen, C.M., Pan, J.S.: Comments on Islam et al.’s certificateless designated server based public key encryption with keyward search scheme. In: International Conference on Genetic and Evolutionary Computing (2017) 29. Li, C.T., Lee, C.C., Weng, C.Y., Wu, T.Y., Chen, C.M.: Cryptanalysis of an efficient searchable encryption against keyword guessing attacks for shareable electronic medical records in cloud-based system. In: International Conference on Information Science and Applications (2017) 30. Wu, T.Y., Meng, F., Chen, C.M., Liu, S., Pan, J.S.: On the security of a certificateless searchable public key encryption scheme. In: International Conference on Genetic and Evolutionary Computing (2016) 31. Chen, C.M., Xiang, B., Liu, Y., Wang, K.H.: A secure authentication protocol for internet of vehicles. IEEE Access 7, 12047–12057 (2019) 32. Wu, T.Y., Chen, C.M., Wang, K.H., Pan, J.S., Zheng, W., Chu, S.C., Roddick, J.F.: Security analysis of Rhee et al.’s public encryption with keyword search schemes: a review. J. Netw. Intell. 3, 16–25 (2018) 33. Wang, K.H., Chen, C.M., Fang, W., Wu, T.Y.: On the security of a new ultra-lightweight authentication protocol in IoT environment for RFID tags. J. Supercomput. 74(1), 65–70 (2018)

Data Classification and Clustering

A Novel Topic Number Selecting Algorithm for Topic Model Linlin Tang(&) and Liang Zhao Jiaying University, Meizhou, China [email protected]

Abstract. A novel algorithm named the MTN (Multiple-Topic-Number) algorithm is introduced to deal with the problem of topic number selecting in topic model issue. The purpose of our algorithm is to build the LDA (Latent Dirichlet Allocation) matrices of different topic numbers to make the LDA matrices and machine learning algorithm combined better. So it can be used to solve the traditional problem of selecting topic number: under-size or over-size. The method here is to use different levels of machine learning tree structure to complete the combination. Experimental results show the efficiency of our proposed algorithm. Keywords: LDA

 Topic model  Xgboost

1 Introduction Text clustering algorithms transformed from the traditional clustering algorithms in data mining mainly focus on which classes the text should be grouped into. However, there is not much more research on the set of suitable class C itself in the first goal of text clustering. Although such text clustering results can be used for solving some practical problems, such as clustering retrieval, search results clustering and so on, in traditional text clustering algorithms, classes C represented by a simple set of tags can only tell us which text belongs to one class, It cannot help people to understand the semantic commonality of this group. That is, why they are classified into one category. When text clustering results are used to support decision-making, decision makers will not be able to make full use of such text clustering results because they cannot understand the semantics of various text classes. At the beginning of this century, David M. Blei and others proposed the topic modeling method represented by Latent Dirichlet Allocation (LDA) algorithm, which can be seen as a new beginning of text clustering research. The basic to steps shown as the following Fig. 1.

© Springer Nature Singapore Pte Ltd. 2020 J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 483–490, 2020. https://doi.org/10.1007/978-981-15-3308-2_53

484

L. Tang and L. Zhao

Information

Method

Rule 1. Model Design

Entertain ment

Politics

Spots

Unknown

topic 4

Unknown

topic 3

Unknown

topic 2

Unknown

topic 1

Text

Text

2. Model Reasoning Fig. 1. Process of topic modeling

As shown in the above Fig. 1, the basic process of one topic model can be described as below: (a) Model Design. In the model design step, the probabilistic generation model is used to describe a set of unknown topic and the rules for generating text from that set of topics. We call this set of unknown topics described by probabilistic generation model and the rules for generating text from the topic model. (b) Model Reasoning. In the process of model reasoning, based on a given set of text data and a topic model designed in the previous step, probabilistic reasoning method is used to mine the hidden topics behind text and to search for the association between each text and topic. The topic model describes rules for generating text from a set of interrelated unknown topics exist in the human brain. It has three key parts: text, topic, and rules for generating texts based on topics. Our proposed algorithm is focused on design of LDA matrices based on Xgboost algorithm. Some related work will be introduced in Sect. 2 and our proposed algorithm will be given in Sect. 3. Experiments will be shown in Sect. 4. Conclusion and future work will be introduced in Sect. 5.

2 Related Works Generally speaking, academic community work in the LDA topic model research area is mainly in the following aspects.

A Novel Topic Number Selecting Algorithm for Topic Model

485

The first one is the analysis of LDA model algorithm and the adjustment of the probability structure. For example, Anandkumar, Ge, Hsu analyzed the hidden variable assignment model, transform the parameter estimation into tensor decomposition problem, and improve the computing performance of the model [1]. Wang, Bai, Stanton constructed the PLDA model structure, the PLDA smooths out storage and computation bottleneck and provides fault recovery for lengthy distributed computations [2]. Teh, Newman, Welling proposed the collapsed variational Bayesian inference algorithm for LDA. This algorithm is efficient and easy to implement and it is significantly more accurate than standard variational Bayesian inference for LDA [3]. Foulds et al. proposed a collapsed variational Bayesian inference algorithm for LDA, and its computation efficiency has been shown through experiments. It is more accurate than the standard variational Bayesian inference of the LDA [4]. Secondly, some researchers made further analysis on the application of LDA. In fact, LDA model is far extended beyond the document classification field. For example, Feuerriegel et al. analyzed the effects of topics found in such corporate press releases on stock market returns in German market and then determined topic of adhoc announcements by using the LDA algorithm [5]. Philbin et al. gave a large-scale collection of images to efficiently associate with the images which contain the same entity: a building or object, and LDA model can also be used to discover the significant entities [6]. Do et al. did some work for classifying different cancer for supporting personalized medicine by using Latent Dirichlet Allocation model. The LDA model was used to catalog for every patient, it can find the specific genomic and downstream events that give rise to cancer cells [7]. Through these researches, we can see that LDA model can be applied in many academic fields not only in document classification. Thirdly, LDA matrix always be considered to be combined with machine learning algorithms for better optimization. For example, Al-Salemi et al. combined the LDA model with adboost.mh algorithm and shew the effect of combined the topic model with clustering algorithm in reference [8]. Xie et al. proposed the Multi-Grain Clustering Topic Model (MGCTM) which integrates document clustering purpose and topic modeling purpose into one frame. Then, an overall best performance has been achieved by applying these two tasks together [9]. Lei et al. used LDA to make the feature representation method and combined topic model with Naive Baves learner to learn model [10]. Experiments show that combination of machine learning algorithms and LDA can effectively improves accuracy of documents classification.

3 MTN (Multiple-Topic-Number) Algorithm The main idea of our proposed algorithm is to select different topic numbers to build the LDA matrices. Different levels of Xgboost decision trees are used to match the LDA matrices of different topic numbers. This method can solve traditional LDA matrix problems: if the topic number is undersize, it cannot pay enough attention to some details of the theme; if the topic number is oversize, it will cause excessive separation of topic information. The key point of this algorithm is to use LDA matrices of different topic number to combine different levels of trees. The LDA matrix of exiguous topic number can be as

486

L. Tang and L. Zhao

the upstream tree attribute option, and the other scale matrices are also matched with other level trees. Then we can achieve a scientific and rational method to make the multiple LDA topic number model extraction for th document classification. 3.1

Basic Theory

Different topic numbers are used in LDA algorithm for getting the LDA matrices. The combination of topic number is obtained by presetting for specific problems. In order to establish a serial tree structure based on Xgboost algorithm, tree structure will be combined with these LDA matrices as follows: Firstly, upstream trees mainly accept the LDA matrices with lower topic numbers. Topic number to be corresponded to the matrix can be obtained by formula (1). The value of Oi is a threshold value chosen for the LDA model. The t in the formula indicates the level of residual tree, it is relatively small in the upstream trees. Meanwhile, the value of Ki chosen from the dimension of matrix will be small. 9 8 K1 t  O1 > > > > = < K2 t [ O1 and t  O2 ð1Þ f ðmod ðldaÞ; tÞ ¼ K t [ O2 and t  O3 > > > > ; : 3 ... Further, midstream trees mainly accept LDA matrices with middle topic numbers, the value of t subject to both upper and lower threshold limits. And downstream trees mainly accept the LDA matrices with higher topic number. The responsibility of the downstream trees is to calculate and transmit the detail information. The residual tree formula is also as above. Then, through training and adjusting model constantly to optimize the parameters of the convergence precision. And then, change the depth of trees and the number of the trees to ensure the accuracy of the final results. 3.2

Process of MTN

Different trees of different levels chose their feature from the LDA matrices of different level scale. The MTN algorithm will improve the performance of the LDA topic model. The number and the scale of different LDA matrices we chose according to a conceived way. The number of LDA matrices is chosen from 3 to 8. And the scale of LDA matrices can be chosen will be the original size, double size or triple size, and so on. The whole research process includes a main process and a sub process, dominant process is an algorithm that combine the machine learning classification with LDA matrix of multiple dimensions, and this sub process is for combining the different information granularity on document topics with hierarchy tree. The MTN algorithm is mainly embodied in the sub process, which is automatically generated by using composite matrix for ensemble learning. The flow chart of our proposed algorithm can be shown as the following Fig. 2.

A Novel Topic Number Selecting Algorithm for Topic Model

487

Begin

The documents

input

The level of tree

If upstream,then use matrix_A

If midstream,then use matrix_B

If downstream,then use matrix_C

Make the LDA topic model of number A

Make the LDA topic model of number B

Make the LDA topic model of number C

The LDA matrix of A

The LDA matrix of B

The LDA matrix of C

The IntegraƟon of results The mulƟple LDA matrix

output

The MTN algorithm

Use the MTN algorithm

End

Fig. 2. Flow chart of MTN algorithm

4 Experimental Results Both the Chinese and English experiments have been tested here. 4.1

Experiment of Chinese

Experiment in Chinese news has been done, data set comes from Dr. Yong Hui Cong. The comparison results with the SA-CWTM algorithm, the AA-CWTW algorithm, the AS-CWTW algorithm and the SS-CWTW algorithm have been given here, and these algorithms can be found in reference [11]. Comparison results are shown in the following Fig. 3.

488

L. Tang and L. Zhao

0.95 0.9 0.85 0.8 0.75 0.7 0

50

100

MTN

SA

150

200

SS

250 AA

300 AS

Fig. 3. Comparison result of Chinese news

As we can see, the MTN algorithm shows a good effect in the high dimensions of topic number. And in the Chinese document set, the global optimal solution has also achieved a satisfactory result through this algorithm. 4.2

Experiment of English

Firstly, Newsgroups-18828 is used in this experiment of English. Newsgroups contains 20000 articles about the usenet document, almost 20 different newsgroups. It is usually treated document sets for classification. Jason and Rennie made necessary process to Newsgroups, make each document belongs to only one newsgroup (Table 1). Table 1. Comparison result of Chinese newsgroups-18828 Name

Num 50 LDA+GBDT 0.75 LDA+Xgboost 0.78 LDA+MTN 0.78

100 0.78 0.80 0.81

150 0.78 0.79 0.83

200 0.80 0.81 0.83

250 0.79 0.80 0.84

The lifting of classification accuracy is very stable in the LDA+MTN method, meanwhile the classification accuracy of other methods will be limited to a certain dimension. The MTN algorithm can obviously improve the classification accuracy of English set. Secondly, 10 sets from reuters 21578 are chosen to make the set R10 to do the other experiment of English. This experiment in order to prove the result with MTN is better than results of sample LDA+Xgboost method, and to find the best combination of the topic numbers (Table 2).

A Novel Topic Number Selecting Algorithm for Topic Model

489

Table 2. Contrast result of R10 Num Num_c 0+ 10+ 20 0.802 0.788 30 0.797 0.815 50 0.790 0.811 100 0.780 0.799

20+ 0.802 0.815 0.812 0.812

30+ Null 0.797 0.812 0.811

The vertical axis of the table as above represents the sum of all the topic numbers, and the horizontal axis is combination mode. For example, value of vertical is 10+ and value of horizontal is 30 means the sum of topic numbers is 30. Meanwhile, the combination method is a LDA matrix of 10 topics combinates a LDA matrix of 20 topics number. The 0+ column responses the normal result without MTN algorithm, and the else reflect the improvement through the MTN algorithm. These experiments show that the MTN algorithm performs extremely well when the number of subjects is large. A steady rise can be got in a broader range. And these experiments indicate that the MTN algorithm is effective on both English and Chinese documents.

5 Conclusion In this paper, the combination of LDA matrices with machine learning algorithm is studied in this paper. Deciding the number of topic is a difficult problem for using the topic model to make documents classification, so as to select the multiple numbers of topic and combined with an excellent algorithm to study for solve it. Firstly, we analyze the capacity of the MTN algorithm by theory. Then Chinese and English document datasets both be selected for experiment, and get the final conclusion by contrast method. Results show that the LDA matrices combined with the MTN algorithm will be more effective by using the different levels of topic information. The MTN algorithm will improve the accuracy of documents classification and optimize the final result. Acknowledgement. This work was supported by Shenzhen Science and Technology Plan under grant number JCYJ20180306171938767 and the Shenzhen Foundational Research Funding JCYJ20180507183527919.

References 1. Anandkumar, A., Ge, R., Hsu, D., et al.: Tensor decompositions for learning latent variable models. J. Mach. Learn. Res. 15(1), 2773–2832 (2012) 2. Wang, Y., Bai, H., Stanton, M., et al.: PLDA: parallel latent Dirichlet allocation for largescale applications. In: Algorithmic Aspects in Information and Management, pp. 301–314. Springer, Heidelberg (2009)

490

L. Tang and L. Zhao

3. Teh, Y.W., Newman, D., Welling, M.: A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation. In: Advances in Neural Information Processing Systems, pp. 1353–1360 (2007) 4. Foulds, J., et al.: Stochastic collapsed variational Bayesian inference for latent Dirichlet allocation. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 446–454. ACM (2013) 5. Feuerriegel, S., Ratku, A., Neumann, D.: Analysis of how underlying topics in financial news affect stock prices using latent Dirichlet allocation. In: 2016 49th Hawaii International Conference on System Sciences (HICSS), pp. 1072–1081. IEEE (2016) 6. Philbin, J., Sivic, J., Zisserman, A.: Geometric latent Dirichlet allocation on a matching graph for large-scale image datasets. Int. J. Comput. Vis. 95(2), 138–153 (2011) 7. Do, K.-A., Qin, Z.S., Vannucci, M.: Predicting cancer subtypes using survival-supervised latent Dirichlet allocation models. In: Advances in Statistical Bioinformatics, pp. 366–381 (2013) 8. Al-Salemi, B., Ab Aziz, M.J., Noah, S.A.: LDA-AdaBoost. MH: accelerated AdaBoost. MH based on latent Dirichlet allocation for text categorization. J. Inf. Sci. 41(1), 27–40 (2015) 9. Xie, P., Xing, E.P.: Integrating document clustering and topic modeling. In: Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 694–703. AUAI Press (2013) 10. Lei, L., Qiao, G., Qimin, C., Qitao, L.: LDA boost classification: boosting by topics. EURASIP J. Adv. Signal Process. 2012, 1–14 (2012) 11. Qin, Z., Cong, Y., Wan, T.: Topic modeling of Chinese language beyond a bag-of-words. Comput. Speech Lang. 40, 60–78 (2016)

A Multivariate Time Series Classification Method Based on Self-attention Huiwei Lin1 , Yunming Ye1(B) , Ka-Cheong Leung1 , and Bowen Zhang2

2

1 School of Computer Science and Technology, Harbin Institute of Technology, ShenZhen, ShenZhen 518055, China [email protected] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China

Abstract. Multivariate Time Series Classification (MTSC) is a crucial task in dynamic process recognition and has been widely studied. In recent years, end-to-end MTSC with Convolutional Neural Networks (CNN) has gained increasing attention thanks to their ability to integrate local features. However, handling global information and long-range dependencies of time series remains a significant challenge for CNN. In this paper, we present a simple and feasible architecture for MTSC to address these problems. Our model benefits from self-attention, which helps the CNN directly capture the relationships of a time series between any two time steps or variables. Experimental results of the proposed model on thirty-five complex MTSC tasks show its effectiveness and universality, outperforming the existing state-of-the-art (SOTA) models overall. Besides, our model is computationally efficient: it processes all datasets about six hours faster than the current model.

Keywords: Multivariate time series classification · Temporal Convolutional Network · Self-attention

1 Introduction

A time series is a sequence of data points indexed in time order; it is called a multivariate time series if it has multiple variables rather than a single one. Due to the abundance and wide application of time series in the medical, biological and industrial fields, there has been genuine interest in research on Time Series Classification (TSC) within the data mining community. The difficulty and complexity of TSC grow when multivariate time series are considered, and researchers have proposed many methods to address this problem.

Traditional methods are mainly feature-driven approaches. On the one hand, distance-based approaches use a nearest neighbor classifier coupled with distance measures such as Euclidean Distance or Dynamic Time Warping [1] to classify raw time series. On the other hand, transform-based methods transform time series into a new space and extract more characteristic features to be used by machine learning classifiers, such as statistical measurements, shapelets, Bag-of-SFA-Symbols, and the Fast Fourier Transform (FFT) [2–5]. However, heavy handcrafting and complicated domain knowledge have made these methods fall out of favor.

Data-driven approaches have recently been widely used to overcome these shortcomings. The deep learning (DL) approach substitutes hand-crafted features with hierarchical features learned automatically from raw time series data. CNN [6] focuses on mining local region features of the raw data, while Long Short-Term Memory (LSTM) [7] is good at mining features over long time intervals. At the same time, attention mechanisms help these models pay more attention to the most discriminative features [8]. Furthermore, CNN can not only perform massively parallel processing but also achieve the same results as LSTM in dealing with sequences [9]. Thus, CNN is the first choice for handling the TSC problem. Under the power of its filters and deep layers, CNN integrates local features. Nevertheless, it also tends to attend to local information while ignoring global information, especially edge information, which is vital for characterization. In other words, CNN still needs to improve in mining the correlation between each time step and each variable of a time series.

This paper proposes a universal and empirically studied model, taking the Temporal Convolutional Network (TCN) with 1-D kernels of different sizes as the base layer and combining hundreds of filters with self-attention to classify multivariate time series, which addresses the problem of CNN mentioned above. Our model, the Self-Attention Temporal Convolutional Network (SA-TCN), aims to model the inherent relevance of a multivariate time series to itself in a two-perspective manner. First, we select TCN to convolve the data along the time axis, smoothing the data in the time dimension and extracting part of the association features in the variable dimension. Next, we connect several self-attention layers behind each convolution layer. With the help of self-attention, our model settles the long-time-step problem as well as self-reliance, capturing relevant information in different subspaces in parallel. Finally, with minimal preprocessing that consistently improves classification performance, we conduct experiments on 35 freely available multivariate time series datasets to demonstrate its effectiveness and universality.

(This work was supported in part by NSFC under Grant No. U1836107 and the Shenzhen Science and Technology Program under Grant No. JCYJ20180507183823045.)

2 Related Work

2.1 Convolutional Neural Network for Multivariate Time Series Classification

CNN has shown great power in hierarchically extracting regional features in many TSC tasks. Zheng et al. [6] developed an effective model named Multi-Channels Deep Convolutional Neural Networks (MC-DCNN). With one channel per dimension, MC-DCNN extracts features individually from each univariate time series and combines them into a final information representation. Based on CNN, the model of [10] added a fully connected layer to unify the feature maps


output by temporal convolution and pooling. Similarly taking CNN as a base layer, Ronao et al. [2] took advantage of FFT to enrich the model input and achieved the best result with the additional information. Lee et al. [11] transformed multivariate data into univariate data and used several filters of different sizes to extract robust features. Devineau et al. [12] extended the MC-DCNN framework to a new architecture relying on intra-parallel and inter-parallel processing of time series. Karim et al. [13] put forward a joint model combining LSTM and CNN with a squeeze-and-excitation block to further improve accuracy. Fawaz et al. [14] made a detailed summary of DL for the TSC task and described the black-box effect of deep models. Most results on both multivariate and univariate time series show that CNN still has immense heuristic value.

2.2 Self-attention

Recently, as a particular case of the attention mechanism for computing a representation of a sequence, self-attention [15] has achieved SOTA quality in Natural Language Processing (NLP). In particular, the proposed multi-head attention module maps the input sequence to various subspaces and uses scaled dot-product attention to represent it in each subspace. Due to its flexibility in parallel computation and in modeling time dependencies, multi-head attention is applied more and more widely. In [16], a CNN layer is combined with a self-attention layer in every sub-module. In [17], the proposed model replaces the fully-connected layer of the transformer module with a CNN, connecting multi-head attention with CNN. This suggests that self-attention can be used as an independent layer in conjunction with CNN. Although this technology is currently applied mainly in NLP, self-attention also has broad application prospects in the MTSC task, considering the similarity between language sequences and time series. Therefore, this module can significantly enhance the performance of our model.

3 Self-attention Temporal Convolutional Networks

We design our model to capture the relationships between any two time steps or variables, and we describe SA-TCN in detail in this section. The main component of our network consists of four identical feature modules, each containing a nonlinear layer followed by several self-attention layers.

3.1 Network Structure

Adding a self-attention layer to each TCN layer, we build the deep attentional neural network shown in Fig. 1 to learn the variable and sequential information of the given time series. For convenience, all layers of the SA-TCN can be grouped into two parts, as detailed below.


For the first part, we define a module that consists of TCN, self-attention, addition, batch normalization, and the ReLU activation function, in that order. On this basis, we connect four such modules in bottom-up order. Based on past experience, their filter number, filter size and attention layer number are set to {128, 8, 4}, {64, 5, 2}, {64, 5, 2}, {32, 3, 1}, respectively; the number of attention layers may certainly be flexible. We take the output of the last module as the input of a pooling layer to reduce the length of the time dimension to 1. For the second part, we use a fully-connected layer with a softmax function as the classifier, mapping the latent features into the output classes. The softmax function provides the posterior probability of the classification results.

Fig. 1. The self-attention temporal convolutional networks structure
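As an illustration of the module stack just described, the following is a minimal Keras sketch (not the authors' released code; the paper only states that Keras was used). We substitute Keras's built-in dot-product Attention layer for the multi-head self-attention of Sect. 3.3, so it should be read as a structural outline under that assumption:

```python
# Minimal structural sketch of SA-TCN in Keras (illustrative, not the authors' code).
from tensorflow import keras
from tensorflow.keras import layers

def sa_tcn_module(x, filters, kernel, n_attention):
    """One feature module: TCN (Conv1D) -> self-attention -> add -> BN -> ReLU."""
    h = layers.Conv1D(filters, kernel, padding="same")(x)
    a = h
    for _ in range(n_attention):
        # Built-in dot-product attention with query = key = value (self-attention);
        # the paper's scaled multi-head variant is given in Sect. 3.3.
        a = layers.Attention(use_scale=True)([a, a])
    h = layers.Add()([h, a])                        # addition step of the module
    h = layers.BatchNormalization()(h)
    return layers.ReLU()(h)

def build_sa_tcn(time_steps, n_variables, n_classes):
    inp = layers.Input(shape=(time_steps, n_variables))
    h = inp
    # Filter number, filter size, attention layers per module, as quoted above.
    for filters, kernel, n_att in [(128, 8, 4), (64, 5, 2), (64, 5, 2), (32, 3, 1)]:
        h = sa_tcn_module(h, filters, kernel, n_att)
    h = layers.GlobalAveragePooling1D()(h)          # reduce the time dimension to 1
    out = layers.Dense(n_classes, activation="softmax")(h)
    return keras.Model(inp, out)
```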

3.2 Temporal Convolutional Networks

The input to our TCN can be a multivariate time series or the latent encoding of the spatial self-attention applied in each module. In the $l$-th TCN layer, we denote the number of time steps as $T_l$, the number of variables as $M_l$, the time duration of the filters as $D_l$, the input as $X^{(l)} \in \mathbb{R}^{M_l \times T_l}$, the full set of filters as $W^{(l)} \in \mathbb{R}^{M_{l-1} \times D_l \times M_l}$, and the bias as $B^{(l)} \in \mathbb{R}^{M_l \times T_l}$. Thus, for each of the $L$ TCN layers in the feature encoder, we apply many 1-D filters that capture how the input signal changes over time. Let $X_{m,t}^{(l)}$ be the value of the $m$-th ($0 < m \le M$) variable at the $t$-th ($0 < t \le T$) time step of the input series; the length $T$ may vary between sequences. With the activation function $f(\cdot)$, the Rectified Linear Unit


(ReLU), we can get the value at each position, for all $m \in \{1, 2, \dots, M_l\}$, $t \in \{1, 2, \dots, T_l\}$, $l \in \{1, 2, \dots, L\}$:

$$X_{m,t}^{(l)} = f\Big(B_{m,t}^{(l)} + \sum_{j=0}^{D_l-1} \sum_{i=0}^{M_{l-1}-1} W_{i,j,m}^{(l)} \, X_{i,t+j}^{(l-1)}\Big). \qquad (1)$$

3.3 Self-attention

Self-attention is a special case of attention in which the keys $K$ equal the values $V$. The important layer in each of our four modules is an attention mechanism like that of [15]. To form the layer output, the results of each self-attention head are concatenated and transformed by a parameterized linear map. Given a multivariate time series $X \in \mathbb{R}^{T \times M}$ as the input sequence, the mechanism first transforms it into queries $Q \in \mathbb{R}^{T \times M}$, keys $K \in \mathbb{R}^{T \times M}$ and values $V \in \mathbb{R}^{T \times M}$:

$$(Q, K, V) = (X W^Q,\ X W^K,\ X W^V), \qquad (2)$$

where $\{W^Q, W^K, W^V\} \in \mathbb{R}^{M \times M}$ are trainable parameters and $M$ indicates the number of variables. Then we apply scaled dot-product attention to calculate the weight of every position:

$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\Big(\frac{QK^T}{\sqrt{d}}\Big) V, \qquad (3)$$

where $d$ is the scaling factor, used to keep the softmax function out of regions where it has extremely small gradients.
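Equations (2)–(3) translate directly into code. The following NumPy sketch (ours, for illustration; random weights stand in for the trained projections) computes one self-attention pass over a toy series:

```python
# NumPy transcription of Eqs. (2)-(3) (a sketch, not the authors' code).
import numpy as np

def self_attention(X, W_q, W_k, W_v, d=64):
    """X: (T, M) time series; W_q, W_k, W_v: (M, M) projection matrices."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v                 # Eq. (2)
    scores = Q @ K.T / np.sqrt(d)                       # scaled dot product
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # Eq. (3)

T, M = 128, 8                                           # illustrative sizes
rng = np.random.default_rng(0)
X = rng.normal(size=(T, M))
W = [rng.normal(size=(M, M)) for _ in range(3)]
out = self_attention(X, *W)                             # (T, M) output
```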

4 Performance Evaluation

4.1 Datasets

We validate SA-TCN on 35 datasets collected by Karim et al. on GitHub [18]. According to the description in [13], these datasets cover a wide range of application domains such as medical care, speech recognition and motion recognition. The diversity of the datasets can better reflect the robustness and generality of the model. Among them, there are at most 95 categories and 570 variables, while the maximum length is 5396. Each dataset is normalized to zero mean and unit standard deviation. Furthermore, the datasets are padded with zeros so that each time series is as long as the longest one in the training dataset.

4.2 Model Setup

All experiments were conducted in Python with Keras and run on an Nvidia GTX 1080Ti. We initialize the weights of all layers with the He Gaussian initializer, which samples each element with zero mean and a variance of $2/\#\mathrm{units}$. Then, considering the imbalanced class problem of the datasets, we weight the contribution of each class $C_i$ ($1 \le i \le C$) to the loss by the factor $w_i = \frac{N}{C \times N_{C_i}}$, where $w_i$ is the loss weight for the $i$-th class, $N$ is the number of samples, $C$ is the number of classes, and $N_{C_i}$ is the number of samples belonging to class $C_i$. Furthermore, we set the scaling factor $d$ to 64, as in [15].

Parameter optimization is performed with stochastic gradient descent. We adopt Adam as the optimizer and set the learning rate $lr = 10^{-3}$, leaving the other parameters at their defaults. To avoid the exploding-gradient problem, the learning rate is reduced by a factor of $1/\sqrt[3]{2}$ after every 100 epochs with no improvement in the validation score, until it reaches the final learning rate $lr = 10^{-4}$.
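For illustration, the class-weight rule and the plateau schedule described above can be written as follows (a sketch under our reading of the text, not the authors' training script):

```python
# Sketch of the class-weight rule w_i = N / (C * N_Ci) and the plateau LR schedule
# described above (illustrative; the authors' exact training code is not given).
import numpy as np

def class_weights(labels):
    classes, counts = np.unique(labels, return_counts=True)
    N, C = len(labels), len(classes)
    return {c: N / (C * n) for c, n in zip(classes, counts)}   # w_i = N / (C * N_Ci)

def next_lr(lr, lr_final=1e-4):
    """On a 100-epoch plateau, multiply lr by 1/2**(1/3), never below lr_final."""
    return max(lr / 2 ** (1 / 3), lr_final)
```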

4.3 Evaluation Metrics

In this paper, we evaluate the proposed model and the existing SOTA models by classification accuracy. Based on the accuracies and the ranks of the models on all datasets, we calculate the Arithmetic Mean Rank (AMR), Geometric Mean Rank (GMR), Mean Error Rate (MER) and Mean Per-Class Error (MPCE). AMR is the sum of a collection of ranks divided by their count, while GMR indicates the central tendency of a set of ranks using the product of their values instead of their sum. MER is the mean of the error rates of a model over all datasets. Furthermore, we analyze the generality of the model by MPCE, the average per-class error over all datasets:

$$PCE_n = \frac{1 - \mathrm{accuracy}}{\#\mathrm{classes}}, \qquad (4)$$

$$MPCE = \frac{1}{N} \sum_{n=1}^{N} PCE_n. \qquad (5)$$
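A small worked example of these two metrics, with hypothetical accuracies (not values from the paper):

```python
# Worked example of Eqs. (4)-(5) with illustrative numbers.
import numpy as np

accuracies = np.array([0.90, 0.75, 0.99])   # hypothetical accuracies on 3 datasets
n_classes  = np.array([10, 5, 2])           # number of classes in each dataset
pce = (1 - accuracies) / n_classes          # Eq. (4), one value per dataset
mpce = pce.mean()                           # Eq. (5), mean over the N datasets
```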

4.4 Results

The classification accuracies of the proposed model on the above datasets are recorded in Table 1. We compare our results with the existing reported SOTA models in [13]. Like that work, we choose the model with the minimum training loss within the specified number of iterations as the final model for testing, and take the test data as our validation data. Considering the randomness of model initialization, we ran at least three experiments on each dataset and selected the maximum value as the final result. In addition, we also record how long our model runs on each dataset. The evaluation metrics of the proposed model and of the other models for comparison are recorded in Table 2. From the simplest statistical results, our proposed model SA-TCN not only defeats the TCN model but also outperforms the SOTA models on 22 out of 35 datasets in this experiment. Comprehensively


Table 1. Performance comparison of the proposed model with others on benchmark datasets. We used bold fonts for the best results.

No | Dataset | #Training instances | #Class | G | MLSTM-FCN | MALSTM-FCN | TCN | SA-TCN | Time (s)
1 | Action3d | 270 | 20 | 13.5 | 75.42 | 74.74 | 75.08 | 86 | 138
2 | Activity | 160 | 16 | 10 | 61.88 | 58.75 | 58.12 | 63 | 478
3 | ArabicDigits | 6600 | 10 | 660 | 100 | 99 | 99 | 99 | 1559
4 | ArabicVoice | 6600 | 88 | 75 | 98 | 98.27 | 97.5 | 98.1 | 1022
5 | AREM | 43 | 7 | 6.1 | 84.62 | 84.62 | 82.05 | 100 | 42
6 | AUSLAN | 1140 | 95 | 12 | 97 | 96 | 95.44 | 98.52 | 378
7 | CharacterTraject | 300 | 20 | 15 | 100 | 100 | 98.5 | 99.45 | 447
8 | CK+ | 299 | 7 | 42.7 | 96.43 | 97.5 | 96.43 | 97.8 | 290
9 | CMUsubject16 | 29 | 2 | 14.5 | 100 | 100 | 100 | 100 | 20
10 | DailySport | 4560 | 19 | 240 | 99.65 | 99.72 | 99.42 | 99.72 | 827
11 | DigitShapes | 24 | 4 | 6 | 100 | 100 | 100 | 100 | 20
12 | ECG | 100 | 2 | 50 | 86 | 86 | 86 | 89 | 94
13 | EEG | 64 | 2 | 32 | 65.64 | 64.07 | 56.25 | 64.06 | 69
14 | EEG2 | 600 | 2 | 300 | 91 | 91.33 | 96 | 88.66 | 444
15 | GesturePhase | 198 | 5 | 39.6 | 53.53 | 53.05 | 55.05 | 57.5 | 68
16 | HAR | 7352 | 6 | 1225 | 96.71 | 96.71 | 97.08 | 96.57 | 1200
17 | HTSensor | 50 | 3 | 16.7 | 78 | 80 | 82 | 90 | 2184
18 | JapaneseVowels | 270 | 9 | 30 | 100 | 99 | 99.72 | 99.72 | 126
19 | KickvsPunch | 16 | 2 | 8 | 100 | 100 | 90 | 100 | 123
20 | Libras | 360 | 15 | 24 | 97 | 97 | 98.2 | 96.58 | 103
21 | LP1 | 38 | 4 | 9.5 | 86 | 82 | 82 | 82 | 84
22 | LP2 | 17 | 5 | 3.4 | 83 | 77 | 80 | 83 | 83
23 | LP3 | 17 | 4 | 4.25 | 80 | 73 | 73 | 77 | 83
24 | LP4 | 42 | 3 | 14 | 92 | 93 | 92 | 96 | 46
25 | LP5 | 64 | 5 | 12.8 | 66 | 67 | 70 | 85 | 67
26 | MovementAAL | 157 | 2 | 78.5 | 79.63 | 78.34 | 78.98 | 77.7 | 79
27 | NetFlow | 803 | 2 | 401.5 | 95 | 95 | 98.12 | 97.3 | 4236
28 | Occupancy | 41 | 2 | 20.5 | 76.31 | 72.37 | 73.68 | 77.63 | 2411
29 | OHC | 2580 | 20 | 129 | 99.96 | 99.96 | 100 | 100 | 5460
30 | Ozone | 173 | 2 | 86.5 | 81.50 | 79.78 | 84.97 | 83.2 | 117
31 | PenDigits | 300 | 10 | 30 | 97 | 97 | 96.5 | 96 | 360
32 | Shapes | 18 | 3 | 6 | 100 | 100 | 100 | 100 | 60
33 | UWave | 896 | 8 | 112 | 98 | 98 | 98 | 93 | 794
34 | Wafer | 298 | 2 | 149 | 99 | 99 | 99 | 99 | 239
35 | WalkvsRun | 28 | 2 | 14 | 100 | 100 | 100 | 100 | 20

considering the accuracy ranking of every dataset, SA-TCN achieves the best average arithmetic rank, average geometric rank, and MPCE. The classification accuracy of this model on most data sets is relatively high, and the classification error rate for most categories is relatively low. In other words, this model performs well in most data sets and most categories.


Further, let

$$G = \frac{\#\mathrm{training\ instances}}{\#\mathrm{labels}}. \qquad (6)$$

We group the datasets by $G$ and divide them into four parts according to the ranked quartiles. The first part contains datasets with $G \le 10$ (25%), the second part datasets with $10 < G \le 20.5$ (25%), the third part datasets with $20.5 < G \le 78.5$ (25%), and the last part datasets with $G > 78.5$ (25%). We then calculate the average classification error rates on the datasets grouped by $G$. The results show that our model performs better when $G$ is smaller, while in general it does not outperform the SOTA models when $G$ is larger. This indicates that it can effectively overcome the difficulty of having few data per category. Finally, enhancing the classification effect of TCN through the self-attention mechanism is not only better but also faster than using LSTM. According to the results of [13], its SOTA models required 13 hours to process all the datasets, while SA-TCN takes less than 7 hours.

Table 2. Evaluation metrics of the proposed model and others.

Index | Metrics | MLSTM-FCN | MALSTM-FCN | SA-TCN | TCN
1 | #Win/pcs | 16 | 11 | 20 | 13
2 | Arith Mean/i | 1.89 | 2.14 | 1.80 | 2.34
3 | Geo Mean/i | 1.66 | 1.91 | 1.53 | 2.00
4 | G(0–25%) Error Rate/% | 13.06 | 15.58 | 11.88 | 16.85
5 | G(25–50%) Error Rate/% | 12.81 | 12.99 | 9.49 | 12.59
6 | G(50–75%) Error Rate/% | 14.09 | 14.42 | 13.73 | 15.18
7 | G(75–100%) Error Rate/% | 4.35 | 4.61 | 4.83 | 3.16
8 | MPCE/% | 3.10 | 3.31 | 2.88 | 3.31
9 | Time/h | 13 | / | 6.61 | /

5 Conclusion

In this work, we propose a deep attentional neural network named SA-TCN for the MTSC task. We train SA-TCN models with a depth of four and evaluate them on 35 datasets to verify their effectiveness and universality. Since global features and local features are complementary, the combination of self-attention and TCN not only enhances the original representations and improves classification accuracy but also saves time. At the same time, the model can, to a certain extent, solve the performance problems caused by small amounts of data per category.


References

1. Bagnall, A., Lines, J., Bostrom, A., Large, J., Keogh, E.: The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min. Knowl. Discov. 31(3), 606–660 (2017)
2. Ronao, C.A., Cho, S.B.: Human activity recognition with smartphone sensors using deep learning neural networks. Expert Syst. Appl. 59, 235–244 (2016)
3. Ye, L., Keogh, E.: Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Min. Knowl. Discov. 22(1–2), 149–182 (2011)
4. Schäfer, P.: The BOSS is concerned with time series classification in the presence of noise. Data Min. Knowl. Discov. 29(6), 1505–1530 (2015)
5. Failure prediction of concrete piston for concrete pump vehicles. https://www.datafountain.cn/competitions/336
6. Zheng, Y., Liu, Q., Chen, E., Ge, Y., Zhao, J.L.: Time series classification using multi-channels deep convolutional neural networks. In: International Conference on Web-Age Information Management, pp. 298–310. Springer (2014)
7. Lipton, Z.C., Kale, D.C., Elkan, C., Wetzel, R.: Learning to diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677 (2015)
8. Wang, K., He, J., Zhang, L.: Attention-based convolutional neural network for weakly labeled human activities recognition with wearable sensors. IEEE Sens. J. 19, 7598–7604 (2019)
9. Bai, S., Kolter, J.Z., Koltun, V.: An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271 (2018)
10. Yang, J., Nguyen, M.N., San, P.P., Li, X.L., Krishnaswamy, S.: Deep convolutional neural networks on multichannel time series for human activity recognition. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)
11. Lee, S.M., Yoon, S.M., Cho, H.: Human activity recognition from accelerometer data using convolutional neural network. In: 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 131–134. IEEE (2017)
12. Devineau, G., Moutarde, F., Xi, W., Yang, J.: Deep learning for hand gesture recognition on skeletal data. In: 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018), pp. 106–113. IEEE (2018)
13. Karim, F., Majumdar, S., Darabi, H., Harford, S.: Multivariate LSTM-FCNs for time series classification. Neural Netw. 116, 237–245 (2019)
14. Fawaz, H.I., Forestier, G., Weber, J., Idoumghar, L., Muller, P.A.: Deep learning for time series classification: a review. Data Min. Knowl. Discov. 33(4), 917–963 (2019)
15. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
16. Tan, Z., Wang, M., Xie, J., Chen, Y., Shi, X.: Deep semantic role labeling with self-attention. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
17. Verga, P., Strubell, E., McCallum, A.: Simultaneously self-attending to all mentions for full-abstract biological relation extraction. arXiv preprint arXiv:1802.10569 (2018)
18. Multivariate time series dataset archive for LSTM-FCNs. https://github.com/titu1994/MLSTM-FCN/releases

Butterfly Detection and Classification Based on Integrated YOLO Algorithm

Bohan Liang, Shangxi Wu, Kaiyuan Xu, and Jingyu Hao

School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100089, China. [email protected]

Abstract. Insects are among the most abundant species on earth, and the task of detecting and identifying them is complex and arduous. How to apply artificial intelligence technology and digital image processing methods to the automatic identification of insect species is a hot issue in current research. In this paper, the problem of automatic detection and classification of butterfly photographs is studied, and an annotation method suitable for butterfly classification is proposed. On the basis of the YOLO algorithm [1], by combining the results of YOLO models with different training mechanisms, an automatic butterfly detection and classification algorithm is proposed. It greatly improves the generalization ability of the YOLO algorithm and gives it a better ability to solve small-sample problems. The experimental results show that the proposed annotation method and the integrated YOLO algorithm achieve high accuracy and classification rates in automatic butterfly detection and classification.

Keywords: YOLO · Butterfly detection · Butterfly classification · Ensemble learning · Object detection

1 Introduction

As one of the important environmental indicator insects, butterflies comprise complex and diverse species, and the identification of butterfly species is directly related to the crops eaten by humans and animals. At present, the reliable butterfly identification methods in wide use are not effective: manual identification of butterfly species is not only a huge workload but also requires long-term experience and knowledge accumulation. How to recognize insect species automatically is therefore one of the hot topics in the field of computer vision.

Automatic recognition of insect species requires detection and classification on digital images, and the effect of image classification is closely related to the quality of texture feature extraction. In 2004, Gaston et al. [2] introduced the application of artificial intelligence technology and digital image processing methods to digital image recognition. Since then, many experts and scholars have done a lot of work in this area [3,4]. In recent years, with the development of machine learning, researchers have proposed many algorithms for butterfly detection and classification. In 2012, Wang et al. [5] used content-based image retrieval (CBIR) to extract image features of butterflies, such as color, shape and texture, compared different features, feature weights and similarity matching algorithms, and proposed corresponding classification methods. In 2014, Kaya et al. [6] applied Local Binary Patterns (LBP) [7] and the Grey-Level Co-occurrence Matrix (GLCM) [8] to extract the texture features of butterfly images, then classified them with a single-hidden-layer neural network, proposing an automatic butterfly species recognition method based on the extreme learning machine. In the same year, Kang et al. [9] proposed an effective recognition scheme based on the branch length similarity (BLS) entropy profile, using butterfly images observed from different angles as training data for a neural network. In 2015, based on the morphological characteristics and texture distribution of butterflies, Li [10] proposed corresponding feature extraction and classification decision-making methods using the GLCM features of image blocks and the K-Nearest Neighbor classification algorithm. In 2016, Zhou et al. [11] proved that deep learning models are feasible and have strong generalization ability in the automatic recognition of butterfly specimens. In 2018, Xie et al. [12] used butterfly detection and species recognition based on Faster-RCNN [13] and expanded the data set with butterfly photos taken in the natural ecological environment; their algorithm has high positioning accuracy and classification accuracy for such photos.

Traditional butterfly recognition algorithms have the following problems:
1. In natural ecological photos, butterflies often appear as small targets (the area of the butterfly in the image is too small), and traditional butterfly recognition algorithms are often powerless.
2. The amount of data needed for training is huge, but no high-quality annotated public data sets are available.
3. There are too few pictures of some rare butterflies in the natural state to be used directly as training sets.

The YOLO model, proposed by Redmon [1], is a well-known end-to-end learning model in the field of target detection. Compared with the two-step models of the RCNN [13] series, the YOLO model executes much faster and avoids background errors, but its positioning accuracy is poorer and the false positive rate (FPR) of some classifications of a single model is high. In order to improve the efficiency of butterfly recognition, this paper makes full use of the data provided by the China Data Mining Competition and Baidu Encyclopedia, establishes a butterfly data set containing a large number of butterfly ecological photos, trains the model on ecological photos taken in the natural environment, and, based on the YOLO V3 algorithm, proposes an integrated algorithm that can be used to locate and identify butterflies in ecological photos.

The main structure of the paper is as follows:
1. Data set, data annotation and data preprocessing. A butterfly dataset with 2342 precisely labeled natural-environment photos is established by self-labeling the butterfly ecological photos provided in [14] and pictures from Baidu and other Internet photo databases. A set of image labeling methods suitable for butterflies is worked out through experiments, and the labeling methods are generalized.
2. Integrated YOLO algorithm. This algorithm inherits the speed of the YOLO algorithm, optimizes it, and improves its detection and classification performance as well as its learning and generalization ability on small samples. It provides a good idea for subsequent atlas-based target detection.
3. Experiments and analysis. The performance of the YOLO algorithm under different annotation and processing modes and of the integrated YOLO algorithm on the butterfly test set is presented and analyzed.
4. Summary and prospect.


with the butterfly ecological photos provided in document [14] and the pictures in Baidu and other Internet photo databases. A set of image labeling methods suitable for labeling butterflies is sorted out through experiments, and the labeling methods are generalized. 2. Integrated YOLO algorithm. This algorithm inherits the speed of YOLO algorithm, optimizes YOLO algorithm, improves the performance of YOLO algorithm in detection and classification, and the learning ability and generalization ability of YOLO algorithm for small sample. It provides a good idea for subsequent target detection based on atlas. 3. Experiments and analysis. The performance of YOLO algorithm in different annotation and processing modes and integrated YOLO algorithm on butterfly test set are presented and analyzed. 4. Summary and Prospect.

2 2.1

Data Set, Data Annotation and Data Preprocessing Data Set

The butterfly data sets used in this paper are all photos of butterflies in the natural ecological environment, hereinafter referred to as ecological photos. One part is from the data set provided in document [14], the other part is from the images in search engines such as Baidu and image libraries, including 94 species and 11 genera of butterflies. Figure 1 shows some samples of butterfly ecology.

Fig. 1. Butterfly Eco-photograph

A total of 5695 pictures were taken from [14], including two kinds of butterfly photographs: specimen photographs and ecological photographs. According to [12], because the shooting angle and background environment of specimens differ greatly from ecological photographs, the training effect of using only ecological photographs in the training set is clearly better than using both together for butterfly detection and classification tasks. Since the purpose of this study is to locate butterflies in the natural environment and determine their species, only the 1048 photos of butterflies in the natural ecological environment are selected here. Most photos in the data set contain only a single butterfly, and the maximum number is no more than 20. Each butterfly species has at least four samples, with a typical heavy-tailed distribution. The test set is the standard test set provided in [14], which contains 678 ecological photos; the rest serve as the training set.

2.2 Data Annotations

Because the postures of butterflies in ecological photographs are complex, several butterflies may even overlap, and the annotations in the data set provided in [14] are inconsistent with no uniform labeling standard, we formulated a set of uniform labeling standards and manually labeled the positions and species of all butterfly samples in all photos according to this standard. In the data set provided in [14], there are two ways to label the area where a butterfly is located: one uses the antennae and legs of the butterfly as the border, as shown in Fig. 2(a); the other uses the trunk and wings of the butterfly as the border, as shown in Fig. 2(b). We use these two annotation methods to unify the data sets.

Fig. 2. Two different methods to annotate a single butterfly

Because some butterfly species have social attributes, many butterflies often overlap in photos. The data set provided in [14] labels multiple butterflies in overlapping areas as a single sample, as shown in Fig. 3(a). We have instead developed a standard for labeling this situation: each butterfly in the overlapping area is labeled independently and the occluded part is ignored, as shown in Fig. 3(b). This method not only increases the number of training samples but also improves the recognition effect of the model in complex scenes.


Fig. 3. Two different methods to annotate two or more overlapping butterflies

2.3 Data Preprocessing

Target detection algorithms based on deep learning often require a large amount of data for training. In this paper, we expand the training set by nine transformation methods, such as rotation, mirroring, blurring, and contrast increase and decrease, and combine different preprocessing methods and their parameters (such as rotation angle and exposure) to find the optimal preprocessing scheme. The results are shown in Sect. 4.

Through the above process, automatic butterfly detection and classification in the natural ecological environment has been transformed into a multi-target detection and classification problem. Unlike common target detection problems, automatic butterfly detection and classification has three difficulties: (1) there are many classes (94 classes); (2) the distribution of samples is not uniform, and some rare butterfly species have significantly fewer samples than others; (3) it is necessary to distinguish different small classes (butterfly species) within the same big class (butterflies), that is, fine-grained classification is needed. Therefore, the automatic butterfly detection and classification studied in this paper is comparatively difficult.

3 Butterfly Detection and Classification Method

3.1 YOLO Model

The YOLO model [1] proposed by Joseph Redmon is a well-known end-to-end learning model in the field of target detection. Its characteristic is that, compared with the two-step models of the RCNN [8] series, the YOLO model executes much faster and performs well in fine-grained detection. We choose the third-generation model, YOLO v3, for our task.

The structure of the YOLO V3 model is shown in Fig. 4. In order to detect butterflies of different sizes (different proportions of the image area) in natural photographs, YOLO V3 uses multi-scale feature maps to detect objects of different sizes after the feature extraction network (darknet-53). YOLO V3 outputs feature maps at three different scales. After 79 layers of the convolutional network, the detection result at the first scale is obtained through a three-layer feature extraction network. The feature map used for detection here is a 32-fold down-sampling of the input image. Because of the high down-sampling ratio, the receptive field of this feature map is relatively large, so it is suitable for detecting objects occupying a large area of the image. Up-sampling convolution is applied from layer 79 onward: the 81st-layer feature map is concatenated with the 61st-layer feature map, and after a three-layer feature extraction network, the finer-grained feature map of the 91st layer is obtained, i.e. a 16-fold down-sampled map relative to the input image. It has a medium-scale receptive field and is suitable for detecting objects of medium area in the image. Finally, the 91st-layer feature map is up-sampled again and concatenated with the 36th-layer feature map. After a three-layer feature extraction network, a feature map 8-fold down-sampled relative to the input image is obtained. It has the smallest receptive field and is suitable for detecting objects occupying a small area of the image.

Each output contains $3D^2$ separate $(5+N)$-dimensional vectors, where $D$ is the edge length of the output feature map at this scale, 3 is the number of prior boxes in each grid cell, and $N$ is the number of classes. The first four dimensions of each vector represent the position of the prediction box, the 5th dimension represents the probability that a target is present in the candidate box, and the $(5+i)$-th dimension represents the probability that the target in the candidate box belongs to category $i$.

YOLO uses the mean squared error as its loss function. It consists of four parts: the prediction box center error ($ERR_{center}$), the box width and height error ($ERR_{wh}$), the classification error ($ERR_{class}$) and the prediction confidence error ($ERR_{conf}$):

$$Loss = \lambda_{coord}(ERR_{center} + ERR_{wh}) + ERR_{class} + ERR_{conf}, \qquad (1)$$

$$ERR_{center} = \sum_{i=0}^{D^2} \sum_{j=0}^{n} \iota_{ij}^{obj}\big[(x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2\big], \qquad (2)$$

$$ERR_{wh} = \sum_{i=0}^{D^2} \sum_{j=0}^{n} \iota_{ij}^{obj}\big[(\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2\big]. \qquad (3)$$

Here, $n$ is the number of prediction boxes in a grid cell, $(x, y)$ is the center coordinate of a prediction box, and $w$ and $h$ are its width and height, respectively. Because the error caused by a large prediction box is clearly larger than that caused by a small one, YOLO predicts the square roots of the width and height instead of predicting them directly. If the $j$-th prediction box in the $i$-th grid cell is responsible for the object, then $\iota_{ij}^{obj} = 1$; otherwise $\iota_{ij}^{obj} = 0$.

$$ERR_{class} = \sum_{i=0}^{D^2} \iota_{ij}^{obj} \sum_{j=0}^{n} \big[p_i(c) - \hat{p}_i(c)\big]^2. \qquad (4)$$

YOLO assumes that each grid cell contains only one classified object. If $c$ is the correct category, then $\hat{p}_i(c) = 1$; otherwise $\hat{p}_i(c) = 0$.

$$ERR_{conf} = \sum_{i=0}^{D^2} \sum_{j=0}^{n} \iota_{ij}^{obj}(c_i - \hat{c}_i)^2 + \lambda_{noobj} \sum_{i=0}^{D^2} \sum_{j=0}^{n} (1 - \iota_{ij}^{obj})(c_i - \hat{c}_i)^2, \qquad (5)$$

where $c_i$ denotes the confidence that the prediction box contains an object. If there is an object in the ground-truth bounding box, $\hat{c}_i$ is the IoU value of the ground-truth box and the prediction box; otherwise $\hat{c}_i = 0$. The parameter $\lambda$ weights the different parts of the loss function to improve the robustness of the model. In this paper, we use $\lambda_{coord} = 5$ and $\lambda_{noobj} = 0.5$.

Fig. 4. Structure diagram of YOLO V3

3.2 Integrated YOLO Algorithm

In order to obtain more accurate classification and detection results and improve the generalization ability of the model, this paper further processes the results of multiple YOLO models to obtain the integrated YOLO algorithm. Its pseudo-code is shown in Algorithm 1. The core idea is to use several well-trained models to predict each picture separately and then cluster the prediction boxes. The clustering process is shown in Fig. 5.


Each positioning box can be described by a four-dimensional vector $b_i = (x_{1i}, x_{2i}, y_{1i}, y_{2i})$, an integer $c_i$ and a real number $p_i$, meaning: in the rectangle with upper-left corner $(x_{1i}, y_{1i})$ and lower-right corner $(x_{2i}, y_{2i})$, the probability of a butterfly of species $c_i$ is $p_i$. Let the cluster sets be $S_1, S_2, \dots, S_k$, where each $S_i$ ($i = 1, 2, \dots, k$) satisfies: for all $i, j \in S$, $c_i = c_j$. Define the "summary" of a set as follows:

$$B(S) = \frac{1}{\sum_{i \in S} p_i} \sum_{i \in S} p_i b_i, \quad P(S) = \frac{1}{|S|} \max_{i \in S} p_i, \quad C(S) = c_i \ (i \in S), \qquad (6)$$

where $B(S)$ is the "aggregate" positioning box of set $S$, $P(S)$ is the "aggregate" probability of set $S$, and $C(S)$ is the "aggregate" classification of set $S$. Each time a single predicted bounding box is categorized, the set $S$ with the highest probability $P(S)$ is selected from all sets with the same classification as the detection box and with $IoU(B(S), b) \ge 0.5$. If no $S$ meets the criteria, the box is placed in a new set $S_{k+1}$.

Fig. 5. Structural diagram of integrated YOLO algorithm

4 Experimental Results and Analysis

4.1 Evaluation Index

In this paper, the intersection over union (IoU) is used as the evaluation index for the butterfly positioning task; it is defined as the ratio of the overlapping area of two regions to the area of their union. The optimal ratio is 1, i.e. complete overlap. In the experiments, IoU = 0.5 is taken as the threshold: a prediction box with IoU > 0.5 against the original labeled box counts as a correct localization, and IoU ≤ 0.5 as an incorrect one. The mean average precision (mAP) is used as the evaluation index for the butterfly classification task.


Algorithm 1: Merge Boxes
Require: n: number of boxes; b: boxes; c: classes; p: probabilities
Ensure: S: merged boxes, sorted by P(S)
1: S ← ∅; k ← 0
2: for i = 1, 2, ..., n do
3:    for j = 1, 2, ..., k do
4:       if C(Sj) = ci and IoU(bi, B(Sj)) > 0.5 then insert bi into Sj
5:    if ∄ Sj (j = 1, ..., k) s.t. i ∈ Sj then k ← k + 1; Sk ← {bi}
6: sort S by P(Si)
7: return B(Si), P(Si), C(Si) (i = 1, 2, ..., k)
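For illustration, Algorithm 1 together with the definitions in Eq. (6) can be transcribed into Python roughly as follows (a sketch; the function and variable names are ours, not the authors'):

```python
# Python sketch of Algorithm 1 (illustrative, not the authors' code).
# A detection is a tuple (box, prob, cls) with box = (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def summarize(S):
    """Eq. (6): probability-weighted average box B(S), probability P(S), class C(S)."""
    total = sum(p for _, p, _ in S)
    box = tuple(sum(p * b[i] for b, p, _ in S) / total for i in range(4))
    prob = max(p for _, p, _ in S) / len(S)
    return box, prob, S[0][2]

def merge_boxes(detections):
    clusters = []                       # each cluster is a list of detections
    for box, prob, cls in detections:
        same = [S for S in clusters
                if S[0][2] == cls and iou(summarize(S)[0], box) >= 0.5]
        if same:                        # join the matching set with the highest P(S)
            max(same, key=lambda S: summarize(S)[1]).append((box, prob, cls))
        else:                           # otherwise open a new set S_{k+1}
            clusters.append([(box, prob, cls)])
    return sorted((summarize(S) for S in clusters), key=lambda s: -s[1])
```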

mAP is derived from precision (pre) and recall, which are computed as follows:

$$pre = \frac{TP}{TP + FP}, \qquad (7)$$

$$recall = \frac{TP}{TP + FN}, \qquad (8)$$

where TP (true-positive number), FP (false-positive number) and FN (false-negative number) denote the numbers of positive samples predicted as positive, negative samples predicted as positive, and positive samples predicted as negative, respectively. For different confidence levels we obtain several (pre, recall) points and draw the precision-recall curve, with the recall rate on the horizontal axis and the precision rate on the vertical axis. The average precision AP is the area between the precision-recall curve and the recall axis, i.e. the integral of the curve:

$$AP = \int_0^1 p(r)\, dr. \qquad (9)$$

In practical applications, a sum of rectangular areas is generally used to approximate the integral. In this paper, we use the PASCAL VOC Challenge's post-2010 calculation method [15]: the recall range is divided into $n$ blocks $[0, \frac{1}{n}, \dots, \frac{n-1}{n}, 1]$, and the average precision (AP) is then

$$AP = \frac{1}{n} \sum_{i=1}^{n} \max_{r \in [\frac{i-1}{n}, \frac{i}{n}]} p(r). \qquad (10)$$
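For instance, Eq. (10) can be computed as in the following sketch (ours; the precision/recall inputs are hypothetical):

```python
# Sketch of the block-interpolated AP of Eq. (10) (illustrative, not the authors' code).
import numpy as np

def average_precision(recall, precision, n=10):
    """AP = (1/n) * sum over i of the max precision in recall block [(i-1)/n, i/n]."""
    recall, precision = np.asarray(recall), np.asarray(precision)
    ap = 0.0
    for i in range(1, n + 1):
        in_block = (recall >= (i - 1) / n) & (recall <= i / n)
        ap += precision[in_block].max() if in_block.any() else 0.0
    return ap / n

# Hypothetical PR points for one class, ordered by confidence threshold:
r = [0.1, 0.4, 0.6, 0.8, 1.0]
p = [1.0, 0.9, 0.8, 0.7, 0.5]
print(average_precision(r, p))
```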


Table 1. Under different annotations, the best mAP for a single YOLO v3 model on the butterfly test set.

Annotation method | mAP
The full-scale labeling method | 0.734
The non-full-scale labeling method | 0.777

Table 2. The mAP of the YOLO V3 model on the butterfly test set when different sizes of images are used as input.

Image size (px × px) | mAP
416 × 416 | 0.475
608 × 608 | 0.553
736 × 736 | 0.447

The mean average precision (mAP) over all classes can be expressed as:

$$mAP = \frac{1}{N} \sum_{n=1}^{N} AP(n). \qquad (11)$$

4.2 YOLO V3 Model Effect Experiment

In this paper, two different butterfly labeling methods are tested: one uses the antennae of the butterfly as the boundary of the frame, called the full-scale labeling method; the other uses the trunk and wings as the boundary, called the non-full-scale labeling method. Over the different preprocessing methods, the best results of the two annotation methods are shown in Table 1. The results obtained with the non-full-scale method are clearly better than those with the full-scale method: because the full-scale method includes the region around the antennae, the butterfly itself occupies a smaller proportion of the labeled area, so the background environment has a larger influence on classification. The non-full-scale labeling method is therefore more suitable for automatic butterfly detection and classification.

Because different input image sizes in the YOLO model lead to different numbers of grid cells at the output scales, we also test the performance of a single YOLO V3 model on the automatic butterfly localization task under different input sizes. Using the non-full-scale method without any other preprocessing, the best results for the three input sizes are shown in Table 2. The 608px × 608px input resolution achieves the best accuracy.


Table 3. mAP of the YOLO v3 model on the butterfly test set under different preprocessing modes.

Rotations | Saturation | mAP
NULL | NULL | 0.553
(0°, 45°, 90°, 180°) | NULL | 0.691
(0°, 45°, 90°, 135°, 180°, 255°, 270°) | NULL | 0.681
(0°, 45°, 90°, 180°) | (1.0, 1.5) | 0.0691
(0°, 45°, 90°, 180°) | (1.0, 1.2, 1.5, 1.8) | 0.777
(0°, 45°, 90°, 180°) | (1.0, 1.3, 1.5, 1.7) | 0.753
(0°, 45°, 90°, 180°) | (1.0, 1.2, 1.3, 1.5, 1.7, 1.8, 2.0) | 0.723

Table 4. mAP results on the butterfly test set under different models.

Model | mAP
Faster-RCNN + ZF | 0.733
Faster-RCNN + VGG_CNN_M_1024 | 0.726
Faster-RCNN + VGG16 | 0.761
YOLO v3 | 0.777
Integrated YOLO | 0.798

We also test the performance of a single YOLO V3 model on the automatic butterfly positioning and classification tasks under different preprocessing methods, using the non-full-scale labeling method and 608px × 608px input. Table 3 shows the effects of the two parameters with the greatest impact on classification. The best classification results are obtained by rotating the original images by (0°, 45°, 90°, 180°) and exposing them at 1.0, 1.2, 1.5 and 1.8 times, respectively; the mAP reaches 0.7766. In particular, the mAP of butterflies with dark and protective colors is significantly higher than in the first group.

4.3 Integrated Model Effectiveness Experiments

We chose the three best-performing single YOLO models (models 3, 4, 5 in Table 3) for integration. In the detection task, 98.35% accuracy is obtained. In the classification task, 0.798 mAP is obtained on the species classification of the test set (94 classes) and 0.850 mAP on the classification of families (11 classes). Table 4 compares the integrated YOLO model with mainstream target detection models such as Faster-RCNN and YOLO v3 on the automatic butterfly detection and classification tasks. The above results show that the data annotation and preprocessing methods presented in this paper are suitable for automatic butterfly detection and classification tasks, and that the integrated YOLO algorithm we propose is effective for butterfly detection and species identification in natural ecological photos.

5 Summary and Prospect

At present, the field of target detection is mainly divided into two schools: end-to-end detection and multi-stage (distributed) detection. End-to-end detection is very fast, but there is a considerable accuracy gap relative to multi-stage detection schemes. Taking advantage of the speed of the end-to-end model and the high similarity between butterfly species, we propose an integrated model based on YOLO models with different focuses, which maintains the detection speed of the end-to-end model while improving detection accuracy and positioning accuracy. The essence of the integrated model is to find a better solution based on the optimal solutions of one model under various conditions. In this way, the performance of the model on a specific training set can be improved, as well as its overall generalization ability. On the test set provided in [14], the model achieves 98.35% accuracy in locating butterflies in ecological photos, 0.7978 mAP in locating and identifying species, and 0.8501 mAP in locating and identifying families.

The fact that butterflies have high inter-species similarity indicates a strong relationship among butterfly classes in a knowledge graph. For a wider range of target detection tasks, species could be accurately classified and located by means of knowledge graphs, and recognition could be further optimized by combining models with different capabilities and different recognition focuses.

Acknowledgement. We thank the organizers of the Baidu Encyclopedia butterfly pages, the collectors of open university biological data sets, and the collectors of the data provided by the China Data Mining Conference, who provided valuable data for our experiments.

References

1. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement (2018)
2. Gaston, K.J., O'Neill, M.A.: Automated species identification: why not? Philos. Trans. R. Soc. B: Biol. Sci. 359(1444), 655–667 (2004)
3. Liu, F., Shen, Z.R., Zhang, J.W., et al.: Automatic insect identification based on color characters. Chin. Bull. Entomol. 45, 150–153 (2008)
4. Wen, C., Guyer, D.E., Li, W.: Local feature-based identification and classification for orchard insects. Biosyst. Eng. 104(3), 299–307 (2009)
5. Wang, J., Ji, L., Liang, A., et al.: The identification of butterfly families using content-based image retrieval. Biosyst. Eng. 111(1), 24–32 (2012)
6. Kaya, Y., Kayci, L., Tekin, R., et al.: Evaluation of texture features for automatic detecting butterfly species using extreme learning machine. J. Exp. Theor. Artif. Intell. 26(2), 267–281 (2014)
7. Ojala, T., Pietikäinen, M., Mäenpää, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
8. Marceau, D.J., Howarth, P.J., Dubois, J.M., et al.: Evaluation of the grey-level co-occurrence matrix method for land-cover classification using SPOT imagery. IEEE Trans. Geosci. Remote Sens. 28(4), 513–519 (1990)
9. Kang, S.H., Cho, J.H., Lee, S.H.: Identification of butterfly based on their shapes when viewed from different angles using an artificial neural network. J. Asia-Pac. Entomol. 17(2), 143–149 (2014)
10. Li, F.: The research on automatic identification of butterfly species based on the digital image. Beijing Forestry University (2015)
11. Zhou, A.-M., Ma, P.-P., Xi, T.-Y., Wang, J.-N., Feng, J., Shao, Z.-Z., Tao, Y.-L., Yao, Q.: Automatic identification of butterfly specimen images at the family level based on deep learning method. Acta Entomol. Sin. 60(11), 1339–1348 (2017)
12. Xie, J., Hou, Q., Shi, Y., Lv, P., Jing, L., Zhuang, F., Zhang, J., Tan, X., Xu, S.: The automatic identification of butterfly species. J. Comput. Res. Dev. 55(08), 1609–1618 (2018)
13. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2015)
14. 3rd China Data Mining Competition (1st International Butterfly Recognition Competition). http://ccdm2018.sdufe.edu.cn/sjwjjs.htm
15. Everingham, M., Eslami, S.M.A., Van Gool, L., et al.: The pascal visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)

Text Classification Based on Topic Modeling and Chi-square

Yujia Sun¹,² and Jan Platoš¹

¹ Technical University of Ostrava, 17. listopadu 2172/15, 70800 Ostrava-Poruba, Czech Republic. [email protected]
² Hebei GEO University, No. 136 East Huai'an Road, Shijiazhuang, Hebei, China

Abstract. This paper compares two topic modeling algorithms, Latent Dirichlet Allocation (LDA) and Latent Semantic Indexing (LSI), and a feature selection algorithm, chi-square, for extracting news feature words. After feature extraction, three classifiers (Logistic Regression, Naive Bayes and SVM) are compared on news classification. Based on the test results, the combination of LSI and Logistic Regression gives the highest result of all the algorithms, with a precision of 96% and a recall of 95%.

1 Introduction

Today, the need to process large data collections is increasing. Such data can have very high dimensionality and hidden relationships, so analyzing them requires dimensionality reduction techniques [1]. Dimensionality reduction technology is based on mapping a high-dimensional space to a low-dimensional space (3D, 2D) to better visualize the data and to support various data analysis tasks [2]. In natural language processing, Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) are two well-known methods for reducing the dimension of text feature vectors [3–5].

This work focuses on the LSI, LDA and chi-square [6] methods, comparing the performance of these three methods for extracting feature words on a publicly available data set of BBC news. To properly contrast LSI, LDA and chi-square, t-SNE [7] is used to illuminate the success of the clustering process. After feature word extraction, Logistic Regression, Naive Bayes and SVM classifiers are used in the comparison, and precision, recall and F1-score are compared on the news classification.

The paper is organized as follows. Section 2 discusses the proposed approach. The third part explains the experimental results. Finally, the fourth part summarizes the conclusions of this paper and looks forward to future research.


2 Methodology

This section demonstrates the overall flow of the text classification. It includes four major modules: text preprocessing, feature extraction, dimension reduction, and text classification. Among the four, feature extraction and dimension reduction are the foremost segments. Figure 1 summarizes the proposed methodology.

Fig. 1. The steps that make up the proposed methodology: text document → text preprocessing → feature extraction → dimension reduction/feature selection → training the classification model → performance evaluation.

2.1 Text Preprocessing

Text preprocessing is used to reduce the complexity of the documents and make the classification process easier. Normally, it includes processes such as tokenization and stop-word removal. The purpose of tokenization is to obtain the basic form of each word; this well-defined procedure is applied to all documents, after which the stop words and other less meaningful words, such as prepositions and conjunctions, are removed. After stop-word removal, all words in the documents are converted to lowercase to maintain consistency across the entire document set.
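A minimal preprocessing sketch of the steps just described (tokenization, stop-word removal, lowercasing), using NLTK as an assumed toolkit (the paper does not name its implementation):

```python
# Tokenize, lowercase, and drop stop words; requires the NLTK 'punkt' and
# 'stopwords' data packages to be downloaded.
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def preprocess(text):
    stops = set(stopwords.words("english"))
    return [t for t in word_tokenize(text.lower()) if t.isalpha() and t not in stops]
```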

2.2 Feature Extraction

A term-document vector matrix is formed in the feature extraction module using term frequency, document frequency and inverse document frequency. In particular, the term frequency specifies how often a word occurs in the corresponding document. Document frequency is how often a given term or query appears across the documents of a larger search index. The inverse document frequency reflects the degree to which a word is related to the entire document set. Finally, each document is normalized to unit length, and the result is a matrix representing the term frequencies of the entire document set.
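The following scikit-learn sketch (ours; the toy corpus is illustrative) builds such a TF-IDF term-document matrix with unit-length normalization:

```python
# TF-IDF term-document matrix with L2 (unit-length) normalization.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["stocks rise on strong growth", "the team won the cup final"]  # toy corpus
vectorizer = TfidfVectorizer(norm="l2")
X = vectorizer.fit_transform(docs)       # rows: documents, columns: terms
```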

2.3 Dimension Reduction

The main purpose of the dimension reduction module is to reduce the dimension of the features that will be used in the classification model, as not every word can be taken as a feature. Topic modeling techniques are among the most well-known techniques for reducing the dimensionality of document representations. Two of the most famous and effective scalable dimensionality reduction techniques are:


(1) Latent Semantic Indexing (LSI) is the application of Singular Value Decomposition (SVD) to the text feature matrix. SVD is a matrix decomposition technique: the original data matrix A is decomposed into three matrices U, Σ and V [8]:

$$A_{m \times n} = U_{m \times m} \Sigma_{m \times n} V_{n \times n}^T. \qquad (1)$$

In this decomposition, U is a unitary matrix and V is the conjugate of a unitary matrix; both are square. The matrix Σ is a rectangular diagonal matrix with non-negative real numbers on the diagonal; these diagonal elements are the singular values of A. If A has r non-zero singular values and the other singular values are zero, then there are only r important features in the data set, and the rest is noise or redundant features:

$$A_{m \times n} \approx U_{m \times r} \Sigma_{r \times r} V_{n \times r}^T, \qquad (2)$$

where U relates the m documents to the r concepts, V relates the n terms to the r concepts, U and V hold the left-singular and right-singular vectors of A, and Σ is the r × r diagonal matrix of the singular values of A.
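Continuing the TF-IDF sketch above, LSI can be computed as a rank-r truncated SVD (a scikit-learn sketch; on the full BBC corpus r = 5 would mirror the five topics, while the toy corpus here only supports a small r):

```python
# LSI as truncated SVD of the TF-IDF matrix X from the previous sketch.
from sklearn.decomposition import TruncatedSVD

svd = TruncatedSVD(n_components=2, random_state=0)   # use r = 5 on the full corpus
X_lsi = svd.fit_transform(X)                         # documents over r latent concepts
```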

(2) Latent Dirichlet Allocation (LDA) is a three-layer Bayesian model of words, documents, and corpora [5]. LDA is a bag-of-words method: each document is regarded as a word frequency vector, and each document in the document set is assigned a probability distribution over topics, thereby transforming the text into an easily modeled form. LDA processes text as follows:

• For each document, extract a topic from the topic distribution; traverse all the words in all the documents, randomly assigning a topic to each, where K is the number of topics:

$$P(w_i) = \sum_{j=1}^{K} P(w_i \mid z_i = j)\, P(z_i = j). \qquad (3)$$

• Extract a feature word with the specified probability from this topic.
• Repeat the first two steps until there are enough feature words to represent the document.
• Thus, the probability of each feature word in a document is given by formula (4):

$$P(w \mid d) = \sum_{z=1}^{K} P(w \mid z)\, P(z \mid d). \qquad (4)$$
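A scikit-learn sketch of LDA topic modeling mirroring Eqs. (3)–(4) (ours; note that LDA models raw term counts rather than TF-IDF, and K = 5 would match the five BBC topics):

```python
# LDA over raw term counts; doc_topic approximates P(z|d) per document.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["stocks rise on strong growth", "the team won the cup final"]
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)  # K topics
doc_topic = lda.fit_transform(counts)
# lda.components_ holds the (unnormalized) topic-word weights, i.e. P(w|z) up to scaling.
```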

2.4 Chi-square

Chi-square is a feature selection method [6]. Feature selection chooses the subset of features from the set of text features that best represents the content of the text topic. It improves the training and application of the classifier by reducing the effective vocabulary, and improves classification accuracy by eliminating noise features. This study uses the chi-square statistic, which in mathematical statistics measures the independence of two variables. Given a category $c_k$ and a feature $t_i$, a higher value of $Chi(t_i, c_k)$ means the feature carries more significant category information. The value for $t_i$ is calculated as follows:

$$Chi(t_i) = \max_{c_k}\{Chi(t_i, c_k)\}, \quad Chi(t_i, c_k) = \frac{N\,(a_{ik} d_{ik} - b_{ik} e_{ik})^2}{(a_{ik}+b_{ik})(a_{ik}+e_{ik})(b_{ik}+d_{ik})(e_{ik}+d_{ik})}, \qquad (5)$$

Where N is the total number of messages, aik is the frequency at which ti and ck occur simultaneously, bik is the frequency at which ti does not occur in ck, eik is the frequency at which ck occurs and does not contain feature ti, dik means neither ck nor ti occurs frequency. 2.5
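As a hypothetical illustration, scikit-learn's chi2 scorer implements this kind of test over a term-count matrix; the data and the value of k below are made up.

```python
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.feature_extraction.text import CountVectorizer

docs = ["bank profits rise", "film star wins award", "league match ends in draw"]
labels = ["business", "entertainment", "sport"]

X = CountVectorizer().fit_transform(docs)

# Keep the k terms with the highest chi-square score against the labels.
selector = SelectKBest(chi2, k=5).fit(X, labels)
X_reduced = selector.transform(X)
```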

2.5 Classification

The classification module provides a categorization tool using three different classifiers: Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machines (SVM).

2.6 t-SNE

T-distributed Stochastic Neighbour Embedding (t-SNE) is a machine learning algorithm for visualization [7] and a nonlinear dimensionality reduction technique. t-SNE maps multidimensional data to two or three dimensions suitable for human vision, which clearly shows the clustering of the input. t-SNE expresses the pairwise similarity of high-dimensional points as conditional probabilities based on their distances, and does the same for the low-dimensional points; if the two sets of conditional probabilities are very close, the high-dimensional points have been mapped faithfully to the low-dimensional space.
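A minimal sketch of such a projection with scikit-learn, assuming a topic-weight matrix like the `doc_topics` produced above; the placeholder data and perplexity value are arbitrary choices.

```python
import numpy as np
from sklearn.manifold import TSNE

# Assume doc_topics holds one topic-weight vector per document,
# e.g. the P(z|d) matrix from the LDA sketch above.
doc_topics = np.random.rand(100, 5)  # placeholder data

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(doc_topics)
# embedding has shape (100, 2) and can be scatter-plotted per class label.
```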

3 Experimental Results and Analysis

The data used here is the original text files of the British Broadcasting Corporation (BBC) news data set. A total of 2225 text files were provided by the BBC news website, corresponding to reports from five topical areas from 2004–2005. The text documents are arranged into folders for five class labels, each of which


contains news articles related to the class label. The five labels are business, entertainment, politics, sports, and technology. After text preprocessing and feature extraction, the top 10 feature words of the five topics in the news text are extracted by LSI, LDA, and Chi-square respectively. Table 1 shows the top 10 feature words, in descending order of weight, in the topic model extracted by LSI; Table 2 shows the same for LDA; Table 3 shows the top 10 feature words by weight extracted by Chi-square.

Table 1. Top 10 feature words by LSI.
Technology: people, company, time, firm, government, work, UK, world, plan, report
Sports: win, play, game, match, England, injuries, champion, cup, Ireland, team
Politics: labor, elect, part, Lib Dem, tax, Dem, lib, conserve, chancellor, leader
Business: growth, economic, rate, price, quarter, economy, rise, fall, interest rate, bank
Entertainment: film, award, star, Oscar, actor, actress, director, comedy, movie, festival

Table 2. Top 10 feature words by LDA.
Technology: people, game, mobile, technology, phone, service, user, computer, music, time
Sports: game, play, win, time, player, England, world, match, team, second
Politics: government, labor, elect, part, people, plan, public, tax, lord, law
Business: company, market, firm, bank, sale, growth, price, economic, month, government
Entertainment: film, award, music, star, show, actor, band, song, director, Oscar

To properly compare LSI, LDA, and Chi-square, we used t-SNE, which also serves to illustrate the success of the clustering process. Since the vectors have been reduced to two-dimensional representations, the clusters can be drawn directly. The t-SNE clustering of the LSI topics is shown in Fig. 2, of the LDA topics in Fig. 3, and of the Chi-square topics in Fig. 4. The feature words extracted by the three methods and the t-SNE clustering results show that the LDA model extracts the most effective topics and yields the best clustering.

Table 3. Top 10 feature words by Chi-square.
Technology: phone, device, online, Microsoft, digit, mobile, software, technology, computer, user
Sports: season, championship, team, Chelsea, rugby, injury, champion, coach, match, cup
Politics: Dem, liber, prime, lib, chancellor, secretary, conserve, part, elect, labor
Business: stock, economy, price, market, analyst, economic, profit, oil, growth, bank
Entertainment: chart, actress, band, album, singer, Oscar, actor, award, star, film

Fig. 2. t-SNE clustering of LSI topics.

Fig. 3. t-SNE clustering of LDA topics.


Fig. 4. t-SNE clustering of Chi-square topics.

Table 4. Classification effect of each method.
Method          Precision  Recall   F1
LSI+LR          96.22%     95.97%   95.93%
LSI+NB          95.87%     95.73%   95.72%
LSI+SVM         96.06%     95.96%   95.91%
LDA+LR          72.54%     79.08%   73.28%
LDA+NB          83.09%     82.46%   80.92%
LDA+SVM         82.17%     81.34%   78.71%
Chi-square+LR   70.3%      54.16%   48.86%

After applying the LSI, LDA, and Chi-square algorithms for feature dimension reduction and feature selection, LR, SVM, and NB are applied respectively for text classification and compared. In this experiment, precision, recall, and F1 were used to evaluate system performance. Table 4 shows the classification effect of each method.
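A sketch of one such classification-and-evaluation step; the synthetic features stand in for the reduced document vectors, and the averaging mode is an assumption rather than the paper's stated choice.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support

# Stand-in features: in the paper these would be LSI/LDA/Chi-square-reduced
# document vectors with the five BBC topic labels.
X, y = make_classification(n_samples=500, n_features=10, n_classes=5,
                           n_informative=6, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)

# Precision, recall and F1 as in Table 4 (averaging choice assumed).
p, r, f1, _ = precision_recall_fscore_support(y_test, pred, average="macro")
print(f"precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
```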

4 Conclusion

This paper uses Latent Dirichlet Allocation, Latent Semantic Indexing, and Chi-square to extract feature words for different topics in BBC News, and visualizes the clustering results using the t-SNE technique. Then three machine learning classifiers (Logistic Regression, Naive Bayes, and SVM) were applied for classification testing. Through comparative analysis, the combination of Latent Semantic Indexing and Logistic Regression provides the best result for news classification. For future work, we will extend the topic modeling to other applications.


References
1. Platos, J., Gajdos, P., Kromer, P., Snasel, V.: Non-negative matrix factorization on GPU. In: Second International Conference 2010, vol. 87, pp. 21–30. Springer, Heidelberg (2010)
2. Snasel, V., Nowakova, J., Xhafa, F., Barolli, L.: Geometrical and topological approaches to Big Data. J. Future Gener. Comput. Syst. 67, 286–296 (2017)
3. Berry, M., Browne, M.: Understanding Search Engines: Mathematical Modeling and Text Retrieval. SIAM, Philadelphia (1999)
4. Snasel, V., Gajdos, P., Abdulla, H.M.D., Polovincak, M.: Concept lattice reduction by matrix decompositions. DCCA (2007)
5. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
6. Chi2 Feature selection Homepage. https://nlp.standford.edu/IR-book/html/htmledition/feature-selectionchi2-feature-selection-1.html
7. Van der Maaten, L., Hinton, G.E.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)
8. Platos, J., Kromer, P.: Prediction of multi-class industrial data. In: International Conference on Intelligent Networking and Collaborative Systems 2013, pp. 64–68 (2013)
9. Mantyla, M.V., Claes, M., Farooq, U.: Measuring LDA topic stability from clusters of replicated runs. In: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, p. 49. ACM (2018)
10. Linderman, G.C., Steinerberger, S.: Clustering with t-SNE, provably. arXiv preprint arXiv:1706.02582 (2017)
11. Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42(1–2), 177–196 (2001)
12. McAuley, J., Leskovec, J.: Hidden factors and hidden topics: understanding rating dimensions with review text. In: RecSys, pp. 165–172. ACM (2013)
13. Yang, X., Macdonald, C., Ounis, I.: Using word embeddings in twitter election classification. In: The SIGIR 2016 Workshop on Neural Information Retrieval (2016)
14. Sun, Y., Platoš, J.: CAPTCHA recognition based on Kohonen maps. In: International Conference on Intelligent Networking and Collaborative Systems 2019, pp. 296–305. Springer, Cham (2019)
15. Pan, J.S., Liu, J.L., Liu, E.J.: Improved whale optimization algorithm and its application to UCAV path planning problem. In: International Conference on Genetic and Evolutionary Computing 2018, vol. 834, pp. 37–47. Springer, Singapore (2018)
16. Chang, K.C., Pan, J.S., Chu, K.C., Horng, D.J., Jing, H.: Study on information and integrated of MES big data and semiconductor process furnace automation. In: International Conference on Genetic and Evolutionary Computing 2018, vol. 834, pp. 669–678. Springer, Singapore (2018)

Big Data Analysis and Ontology System

Application Research of Ripley's K-function in Point Cloud Data Shape Analysis

Linlin Tang(&), Xupeng Tong, and Jingyong Su

Harbin Institute of Technology, Shenzhen, China
[email protected]

Abstract. Classification and recognition of objects in images is an important issue in many scientific fields such as computer vision, biometrics, and medical image analysis. An important feature of many objects is shape, so shape analysis has become an important part of classification. One method of shape analysis is to estimate boundaries and analyze their shape to determine the characteristics of the original object. However, much of the literature on point cloud shape analysis assumes that a shape is already known to be present. This paper applies Ripley's K-function from spatial point analysis: by judging the spatial randomness of 2D and 3D point cloud data, it determines whether a shape exists in the data. Relevant experimental verification and analysis show that the K-function is quite effective for judging the existence of shape in point cloud data.

Keywords: Point cloud data · Ripley's K-function · Spatial randomness

1 Introduction

With the development of computer graphics and virtual reality technology, estimating shape from massive point cloud data has been continuously developed and studied. There has been much interesting research on the analysis of point cloud data, such as registration, 3D surface reconstruction, point cloud reduction, and shape classification. In 2017, Ji et al. [1] proposed a new hybrid least squares point cloud registration method. The registration process is done in two steps: coarse registration and accurate registration. In the coarse registration process, a genetic algorithm is used to transform the point cloud into a three-dimensional shape, and the iterative closest point algorithm is used to improve the accuracy of point cloud registration in the precise registration stage. Experimental results show that both the accuracy and the convergence speed of the registration are improved. Leal et al. [2] proposed a point cloud data reduction method based on local density estimation, which is robust to noise and outliers. The method consists of three phases. The first stage uses the expectation maximization algorithm to cluster the point cloud according to the local distribution of the points. The second stage identifies points with high curvature that must not be deleted. Finally, a linear programming model is used to simplify the cloud. Each cluster is a graph in which the cost of a node is defined by the


reciprocal of its distance from the centroid. The results show that the simplified cloud is a good approximation of the original cloud while achieving a good reduction rate. In 2013, Kazhdan et al. [3] improved the Poisson surface reconstruction algorithm: adding a binary filter to the Poisson surface reconstruction framework significantly improves its geometric fidelity, while still allowing efficient multi-grid solvers that speed up the reconstruction. In the shape analysis of point clouds, Srivastava et al. [4] proposed a Bayesian algorithm for detecting a given shape in two-dimensional point cloud data, using Monte Carlo computation of unknown variables such as pose, scale, and point labels to estimate the posterior probability of a given shape. Su et al. [5] used a hypothesis testing method to divide the points in given point cloud data into two main types — points on the shape of the object under study, and background clutter — in order to detect, estimate, and analyze the shape in point cloud data. The above research and analysis are all based on known shapes. De Oliveira Martins et al. [6] used the K-function to classify breast tumors in mammogram images. This paper introduces the K-function [8] from spatial point analysis and uses it to judge, from the spatial distribution of the points, whether a shape exists in point cloud data. Experimental analysis on 2D and 3D point clouds has achieved satisfactory results.

2 Ripley’s K-function Application Analysis 2.1

Ripley’s K-function in 2D

Among the various analytical methods that have been proposed for studying spatial point pattern distribution, Ripley’s K-function [8] is the most commonly used, and its definition is as follows: K ðtÞ ¼ ðE ðthe number of points in the distance tÞÞ=k

ð1Þ

For any given point pattern S_n = (s_i : i = 1, …, n) and point pair (s_i, s_j) ∈ S_n, let d(s_i, s_j) denote the Euclidean distance between s_i and s_j. For any scale t, the following indicator function is defined:

I_t(d_ij) = I_t(d(s_i, s_j)) = { 1, d_ij ≤ t;  0, d_ij > t }    (2)

Then Eq. (1) can be rewritten as:

K_i(t) = (1/λ) E( Σ_{j≠i} I_t(d_ij) )    (3)

Here λ represents the density of the points in space, and λK(t) represents the expected number of points within the scale t.


In addition, it should be noted that for a stationary point process, the value of K(t) must be independent of the particular point event selected. Therefore, summing over all point events i = 1, …, n in the region R and multiplying by λ gives:

E( Σ_{j≠i} I_t(d_ij) ) = λK(t), i = 1, …, n  ⟹  Σ_{i=1}^{n} E( Σ_{j≠i} I_t(d_ij) ) = nλK(t)  ⟹  K(t) = (1/(λn)) Σ_{i=1}^{n} E( Σ_{j≠i} I_t(d_ij) )    (4)

In statistical spatial point analysis, the observed point pattern is regarded as a realization of the point process in the finite area E. Through the above analysis, the estimator of the K-function is:

K̂(t) = (1/(λ̂ n)) Σ_{i=1}^{n} Σ_{j≠i} I_t(d_ij)    (5)

with λ̂ = n/a(E) in formula (5). However, the above estimator of the K-function is a naive estimate that is only valid in an infinite spatial region. In practice, the spatial region E is usually finite, and points outside the boundary of E are unobservable. This is the "edge effect", which biases the estimator described above. That is, when d_ij reaches outside the study area, the formula needs to be corrected to eliminate edge effects, often in the following form:

K̂(t) = (1/(λ̂ n)) Σ_{i=1}^{n} Σ_{j≠i} I_t(d_ij)/w_ij = (a/n²) Σ_{i=1}^{n} Σ_{j≠i} I_t(d_ij)/w_ij    (6)

where w_ij in Eq. (6) is the correction factor. The expectation of the unbiased K-function estimator under complete spatial randomness is:

K̂(r) = (1/λ̂) · λ̂πr² = πr²    (7)

If K̂(r) > πr² for a given point cloud, the data is not randomly distributed in space and belongs to a non-random phenomenon (refer to Fig. 1c); this indicates that some shape exists in the given point cloud data. If K̂(r) < πr², the given point cloud data is randomly distributed in space, and in this case it is almost impossible for the point cloud to contain a shape.
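The naive estimator (5) and the CSR comparison above translate directly into a few lines of NumPy; this sketch ignores the edge correction w_ij of Eq. (6) and uses made-up data.

```python
import numpy as np

def ripley_k(points, t, area):
    """Naive Ripley's K estimate (Eq. 5) for 2D points in a window of given area."""
    n = len(points)
    lam = n / area                                 # intensity estimate: n / a(E)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    within = (d <= t) & ~np.eye(n, dtype=bool)     # I_t(d_ij), excluding j == i
    return within.sum() / (lam * n)

rng = np.random.default_rng(0)
pts = rng.random((100, 2))           # CSR points on the unit square
r = 0.1
print(ripley_k(pts, r, area=1.0), np.pi * r**2)  # roughly equal under CSR
```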


Fig. 1. Schematic diagram of K-function application to a 2D point cloud ((a)–(d)).

2.2 Ripley's K-function in 3D

Similar to the principle in 2D, the expected value of the K-function in 3D under complete spatial randomness can be obtained by:

K̂(t) = (1/λ̂) · λ̂(4/3)πr³ = (4/3)πr³    (8)

Therefore, if K̂(r) > (4/3)πr³ for a given point cloud, the data is not randomly distributed in space and belongs to a non-random phenomenon, indicating that some shape exists in the given point cloud data. If K̂(r) < (4/3)πr³, the given point cloud data is randomly distributed in space, and in this case it is almost impossible for the point cloud to contain a shape.

3 Experimental Results and Analysis

3.1 2D Point Cloud

The K-function is used to analyze given point cloud data because there are three types of point distribution: a completely spatially random (CSR) distribution, a uniform distribution (which in practice conforms to a uniform random distribution pattern), and an aggregated distribution. We call the uniform and random distributions random phenomena, and the aggregated distribution a non-random phenomenon. Using the K-function on random and non-random 2D point cloud data gives the results shown in Fig. 1. In a given area (0, 1) × (0, 1), Fig. 1(a) shows 100 random points with the envelope obtained from 10000 Monte Carlo simulations: an observation curve lying below the upper limit of the Monte Carlo envelope indicates a random phenomenon, while exceeding the limit indicates a non-random phenomenon, and the degree of non-randomness is measured by the distance between the observation curve and the limit (using sumK_value). Figure 1(b) shows point cloud data with a dog shape containing 100 points in the region (0, 1) × (0, 1), which has a clearly non-random degree.

sumK_value = Σ_{r∈D} k(r)    (9)

Here sumK_value represents the degree of non-randomness, D is the range of r, and k(r) is the estimate of the K-function at scale r. In order to verify the effectiveness of Ripley's K-function, research is conducted on simulated pictures and real pictures respectively. The Kimia-99 shape data set is adopted for the simulated pictures; for real pictures, sufficiently many real photographs are obtained from the Internet or taken directly. The general process for acquiring point cloud data from a simulated picture is edge detection in image processing, after which the obtained contour is sampled to produce the point cloud data (Fig. 2).
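Under the same assumptions as the earlier K-function sketch, the non-randomness measure (9) accumulates the K estimate over a range of scales (the paper relates it to the distance between the observed curve and the envelope limit); the scale grid here is arbitrary, and `ripley_k` is reused from the sketch above.

```python
import numpy as np

# Literal form of Eq. (9): accumulate the K-function estimate over the
# scale range D (reuses ripley_k from the earlier sketch).
def sum_k_value(points, scales, area):
    return sum(ripley_k(points, r, area) for r in scales)

D = np.linspace(0.01, 0.2, 20)        # assumed scale range
# score = sum_k_value(pts, D, area=1.0)   # larger => more non-random
```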

Fig. 2. Obtaining a 2D point cloud process

Points are randomly sampled from dog2 (for example, between 50 and 600 points) while keeping the shape of the dog as far as possible. We first focus on point cloud data of 100 points, to which Gaussian noise with mean 0 and standard deviation 0, 0.01, 0.015, 0.02, … is added, as shown in Fig. 3.
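A sketch of this perturbation step; the circular stand-in contour replaces the actual dog2 samples, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 100)
pts = 0.5 + 0.3 * np.column_stack([np.cos(t), np.sin(t)])  # stand-in closed contour

for sigma in (0.0, 0.01, 0.015, 0.02):
    noisy = pts + rng.normal(0.0, sigma, size=pts.shape)
    # each noisy copy can then be scored with the K-function / sumK_value
```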

Fig. 3. Schematic diagram of dog2 Gaussian noise results


For 2D point clouds, the experimental verification and analysis of dog2 and other shapes in the Kimia-99 shape database is presented here (Fig. 4).

Fig. 4. Variation of the non-randomness of dog2 with the standard deviation of the Gaussian noise (0–0.06).

It can be seen that, as the standard deviation of the Gaussian noise increases, the shape of dog2's point cloud data deviates more and more from the original shape, and the non-randomness of the dog2 point cloud data decreases accordingly (Fig. 5).

Fig. 5. Change of dog2 plus background noise

In shape extraction, background noise points are also a factor that must be considered. Here we add random noise points to the point cloud data obtained by sampling 600 points from dog2, to simulate background noise; the number of random noise points increases starting from 50 (Fig. 6).


Fig. 6. Change in the degree of non-randomness calculated by dog2 plus background noise

3.2 3D Point Cloud

For 3D point clouds, we mainly study point clouds containing an irregular shape. We selected the public and authoritative BR database and the Georgia Institute of Technology geometric model database.

Fig. 7. Schematic diagram of K-function application to a 3D point cloud ((a)–(d)).

Similarly, the following results can be obtained by using the K-function for random and non-random phenomena of 3D point cloud data, as shown in the following Fig. 7.


Panel (a) in the figure above shows points randomly distributed in space, and the corresponding panel (b) shows the K-function envelope diagram; it can be clearly seen that the observation curve lies below the upper limit of the envelope, which is a random phenomenon. Panel (c) is 3D data with a horse shape, and the corresponding panel (d) is the K-function envelope result, in which most of the observation curve lies above the envelope. The non-random degree value is calculated in the same way as in the previous section. Figure 8 shows the 3D point cloud data containing the horse under Gaussian noise with different standard deviations.

Fig. 8. Schematic diagram of horse plus Gaussian noise

As the noise grows, the shape of the contained horse becomes increasingly blurred. The resulting changes in non-randomness are shown in Fig. 9.

Fig. 9. Change in the non-randomness of horse with the standard deviation of the Gaussian noise (0–0.3).

The result diagram shows that, as the standard deviation of the Gaussian noise increases, the contained shape becomes more and more blurred and the value of non-randomness keeps decreasing (Fig. 10).


With background noise points:

Fig. 10. Schematic diagram of horse plus background noise.

As the number of background noise points increases, it becomes less and less easy to recognize the horse in the point cloud data containing the horse shape; the corresponding trend of the calculated non-randomness value for the 3D point cloud data is shown in Fig. 11.

Fig. 11. Change in the degree of non-randomness calculated for horse with an increasing number of background noise points (0–600).

According to the experimental results, as the number of background noise points increases, the calculated non-randomness value shows a decreasing trend, which supports the observation that a growing number of background noise points makes it increasingly difficult to judge the shape in 3D point cloud data. To further illustrate the effectiveness of the K-function, we used the ModelNet40 data set [7] together with randomly generated shapeless point cloud data for verification. It can be seen that, as the amount of shapeless point cloud data increases, the accuracy of the K-function for shape recognition shows no obvious downward trend (Fig. 12).


Fig. 12. Shape recognition accuracy (%) versus the number of shapeless points (0–8000).

Even with an increasing amount of shapeless point cloud data, the shape recognition accuracy is maintained at a relatively good level, indicating that the K-function is indeed effective for determining the existence of shape.

4 Conclusion

By applying the K-function to the shape analysis of 2D and 3D point clouds, it is verified that the K-function is effective for judging whether a 2D or 3D point cloud contains a shape. In the future, we can further study the shape classification of point cloud data containing shapes of interest.

Acknowledgement. This work was supported by Shenzhen Science and Technology Plan Fundamental Research Funding JCYJ20180306171938767 and Shenzhen Foundational Research Funding JCYJ20180507183527919.

References
1. Ji, S., Ren, Y., Ji, Z., et al.: An improved method for registration of point cloud. Opt.-Int. J. Light Electron Opt. 140, 451–458 (2017)
2. Leal, N., Leal, E., German, S.T.: A linear programming approach for 3D point cloud simplification. IAENG Int. J. Comput. Sci. 44(1), 60–67 (2017)
3. Kazhdan, M., Hoppe, H.: Screened poisson surface reconstruction. ACM Trans. Graph. (ToG) 32(3), 61–70 (2013)
4. Srivastava, A., Jermyn, I.H.: Looking for shapes in two-dimensional cluttered point clouds. IEEE Trans. Pattern Anal. Mach. Intell. 31(9), 1616–1629 (2009)
5. Su, J., Srivastava, A., Huffer, F.W.: Detection, classification and estimation of individual shapes in 2D and 3D point clouds. Comput. Stat. Data Anal. 58, 227–241 (2013)


6. de Oliveira Martins, L., da Silva, E.C., Silva, A.C., et al.: Classification of breast masses in mammogram images using Ripley's K-function and support vector machine. In: International Workshop on Machine Learning and Data Mining in Pattern Recognition, pp. 784–794. Springer, Heidelberg (2007)
7. Qi, C.R., Su, H., Mo, K., et al.: Pointnet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)
8. Dixon, P.M.: Ripley's K-function. Wiley StatsRef: Statistics Reference Online (2014)
9. Qi, C.R., Yi, L., Su, H., et al.: Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, pp. 5099–5108 (2017)
10. Arbia, G., Espa, G., Giuliani, D., et al.: Effects of missing data and locational errors on spatial concentration measures based on Ripley's K-function. Spat. Econ. Anal. 12(2–3), 326–346 (2017)

InSAR Phase Unwrapping with the Constrained Nonlinear Least Squares Method

Xiaoqing Zhang1, Weike Liu2(&), Yongguo Zheng1, and Zhiyong Wang3

1 College of Computer Science and Engineering, Shandong University of Science and Technology, 579 Qianwangang Road, Qingdao, China
2 Center of Information and Network, Shandong University of Science and Technology, 579 Qianwangang Road, Qingdao, China
[email protected]
3 College of Geomatics, Shandong University of Science and Technology, 579 Qianwangang Road, Qingdao, China

Abstract. Interferogram phase unwrapping is one of the important data processing steps in InSAR applications; however, it is affected by high-slope terrain factors. In this paper, an improved constrained nonlinear least squares InSAR phase unwrapping method is proposed. First, we estimate the phase instantaneous frequency of the interferogram by the least squares method. Then, through the transformation between phase instantaneous frequency and phase gradient, we pre-estimate a phase gradient model that takes terrain factors into account. Finally, using the phase instantaneous frequency estimation (PIFE) model as the constraint of the nonlinear least squares phase problem, we propose the improved constrained nonlinear least squares (CNLS) phase unwrapping method. Compared with other algorithms in interferometric phase unwrapping experiments, the improved method is shown to be the most robust to noise caused by terrain factors and the most suitable for phase unwrapping of InSAR data in rugged and varied terrain regions.

Keywords: InSAR · Phase unwrapping · Constrained nonlinear least squares method · High slope of terrain factors · Phase instantaneous frequency

1 Introduction

InSAR phase unwrapping is one of the most important steps in obtaining DEMs and differential interferometry, but it is affected by terrain factors such as slope and slope angle, which are the main source of errors in InSAR products [1, 2]. Currently, typical phase unwrapping methods are based on optimal estimation, such as the minimum cost flow method [3], the instantaneous frequency estimation method [4], the nonlinear Kalman filter method [5], the adaptive unscented Kalman filter (UKF) method [6], and so on. But for highly sloped terrain the interferometric phase values are discontinuous, and the growth of the ground range sampling rate and baseline decorrelation lead to


actual phase unwrapping errors [7, 8]. This paper proposes a nonlinear least squares phase unwrapping algorithm that uses the phase instantaneous frequency estimation (PIFE) model as the constraint condition. The algorithm effectively takes the terrain factors into account and restrains error propagation. The unwrapping experiments show that this algorithm is accurate and robust.

2 Terrain Factors Estimation by LS

The relationship between the SAR interferometric phase and the terrain deformation geometry is shown in Fig. 1:

Fig. 1. SAR interferometric phase and terrain deformation geometric relationship

Points P_0(x_0, y_0, z_0) and P_0''(x, y, z) are the same point in two different SAR interferograms. Suppose the slope angle of oP_0 is s, and s_y and s_x are the average slope angles in the vertical and azimuth directions. The slant distance between the two points is given by (1):

R_0 − R_1 = √((H − z_0 + B_v)² + (y − B_h)²) − √((H − z_0)² + y_0²)    (1)

ð1Þ

decomposed in

azimuth and vertical direction: ¼ ðz  DzÞ þ ðx  DxÞ. Equation (2) shows the phase difference could respectively decompose in three vector directions: line of sight (LOS), azimuth and vertical direction: 4p  ðR0  R1 Þ k   4p 1 þ tan h tan sy B? tan sx cos sy B? cos sy 0 ðx  x0 Þ þ h ¼ ðR1  R0 Þ þ k 2 tan h R0 sinðh  sy Þ R0 sinðh  sy Þ

D/ ¼ 

ð2Þ


where θ is the incidence angle, θ = (θ_1 + θ_2)/2 ≈ θ_1 ≈ θ_2, θ' = θ_1 − θ_2 ≈ B_⊥/R_0, cos s_y = (y − y_0)/S_0, ΔR_0 = R_0' − R_0, ΔR_1 = R_1 − R_1', S_0 = ΔR_1/sin(θ − s_y), (z − z_0) = (y − y_0) tan s_y sin s_x + h = (x − x_0) tan s_x + h, and sinθ ≈ ΔR_1/(y − y_0) = sin(θ − s_y)/cos s_y. Projecting Eq. (2) along the x and y directions, dividing by the slant range R and the vertical direction Z, and then differentiating gives:

∂φ/∂y = k_B/tan(θ − s_y) = k_B (1 + tanθ tan s_y)/(tanθ − tan s_y)
∂φ/∂x = k_B tan s_x cos s_y/sin(θ − s_y) = k_B tan s_x/(sinθ − cosθ tan s_y)    (3)

where k_B = 4πB_⊥/(λR_0).

Let the vector Δf = [Δf_x, Δf_y]^T represent the 2-D instantaneous frequency:

[Δf_x, Δf_y]^T = (1/2π)[∂φ/∂x, ∂φ/∂y]^T ≈ (1/2π)[Δφ/Δx, Δφ/Δy]^T    (4)

Suppose the terrain variation vector is d = [tan s_y, tan s_x]^T = [d_y, d_x]^T; it is a function of the instantaneous frequency:

d_y(Δf_y) = (Δf_y tanθ − k_B)/(Δf_y + k_B tanθ),  d_x(Δf_x, W) = Δf_x W,  where W = cosθ(tanθ − d_y)/k_B    (5)

Formulas (3), (4), and (5) show the relationship between the instantaneous frequency and the terrain slope. Suppose (i, j) is a known point, which can be obtained from a DEM; its instantaneous frequency observation equations can be established from the nearby points:

T(f^{i,j} + Δf^{m,n}) = T(f^{i,j}) + Σ_{k=1}^{l} (1/k!)(Δf_y^{m,n} ∂/∂y + Δf_x^{m,n} ∂/∂x)^k T(f^{i,j}) + r^l(Δf^{m,n}),  m, n = (1,1), (2,2), …, (M,N)    (6)

where Δf^{m,n} is a known point near the point (i, j), m, n ≠ i, j, (M, N) is the size of the estimation window, l is the order of the Taylor series expansion, r^l(Δf^{m,n}) is the residual term of the Taylor series expansion, and k counts the virtual observation equations that can be constructed in the estimation window. The simplified form is (7):

L = Az + r    (7)

where L = [T(f^{i,j}) − T(f^{1,1}), T(f^{i,j}) − T(f^{2,2}), …, T(f^{i,j}) − T(f^{M,N})]^T and A = [D^1 D^2 … D^k], whose first-order block D^1 has rows Δf^{m,n} = [Δf_y^{m,n}, Δf_x^{m,n}] ∈ R^{(M,N)×2}, whose second-order block D^2 = (1/2)[(Δf_y^{m,n})², (Δf_x^{m,n})², Δf_y^{m,n} Δf_x^{m,n}] ∈ R^{(M,N)×3}, and whose third-order block D^3 = (1/6)[(Δf_y^{m,n})³, (Δf_x^{m,n})³, (Δf_y^{m,n})² Δf_x^{m,n}, Δf_y^{m,n}(Δf_x^{m,n})²] ∈ R^{(M,N)×4}. The unknown vector is z = [∇¹T(f^{i,j}), ∇²T(f^{i,j}), …, ∇^l T(f^{i,j})]^T and the residual vector is r = [r^k(Δf^{1,1}), r^k(Δf^{2,2}), …, r^k(Δf^{m,n})]^T, with ∇¹T(f^{i,j}) = [∂φ/∂y, ∂φ/∂x] ∈ R^{2×1}, ∇²T(f^{i,j}) = [∂²φ/∂y², ∂²φ/∂x², ∂²φ/∂y∂x] ∈ R^{3×1}, and ∇³T(f^{i,j}) = [∂³φ/∂y³, ∂³φ/∂x³, ∂³φ/∂y²∂x, ∂³φ/∂y∂x²] ∈ R^{4×1}.

If the residual r in (7) is regarded as a random error, the LS solution of formula (7) is Û(f) = (A^T A)^{−1} A^T L.

The LS and WLS [9] phase unwrapping methods use the discrete phase gradient to obtain the global unwrapped phase, which inevitably causes phase errors to be transmitted. In this paper, we use the LS estimate of the instantaneous frequency instead of the phase gradient. With the observation equations constructed in the estimation window, Eq. (7) can be solved, and the higher-order Taylor expansion terms capture the detailed information of the terrain.

Suppose Δ²f = ∂²f/∂x² + ∂²f/∂y² is the Laplace operator. Using the PIFE model to pre-estimate the phase gradient, which is likewise a function of the instantaneous frequency, the second partial derivatives at the discrete phase point (i, j) can be expressed as:

∂²f/∂x² = Δf_x(i + 1, j) − Δf_x(i, j)
∂²f/∂y² = Δf_y(i, j + 1) − Δf_y(i, j)    (8)

We therefore propose the constrained nonlinear least squares (CNLS) phase unwrapping formulation based on the PIFE model as (9):

min J(φ) = Σ_{i=0}^{M−2} Σ_{j=0}^{N−1} [∂ψ_{i+1,j} − ∂ψ_{i,j} − exp(j2π f_x^{i,j})]² + Σ_{i=0}^{M−1} Σ_{j=0}^{N−2} [∂ψ_{i,j+1} − ∂ψ_{i,j} − exp(j2π f_y^{i,j})]²
s.t. |Δf_{x,y}^{t}(i, j) − Δf_{x,y}^{t−1}(i, j)| < ξ    (9)

h  i 2 M1  2 P P N2 @wi þ 1;j  @wi;j  exp j2p fxi;j þ @wi;j þ 1  @wi;j  exp j2p fyi;j i¼0 j¼0

i t t1 Dfx;y ði; jÞ  Dfx;y ði; jÞ \n

ð9Þ Where w is the unwrapped phase, / is the same point of the wrapped one. n is the minimum. The constraint ensures that the twice iteration difference of the phase gradient values is tending to zero. The terrain slope derivation formula (3) shows that the trend of terrain variation is nonlinear [10]. The formula (9) is also the non-linear constraint formula. This unwrapping model represents the minimizing deviation of unwrapped phase frequency with the wrapped one.

3 Experiments and Analysis 3.1

PIFE Model Experiment

The experiment interferogram is one of the coal-mining regions of JINING, CHINA (see Fig. 2a). The basic parameters of the real data experiments are shown in the Table 1: Table 1. The basic parameters of ALOS satellite data NO Satellite 1 2

Date

Orbit

Path/frame Polarization Track Central latitude

PALSAR 20071222 10175 448/700 PALSAR 20080206 10846 448/700

HH HH

A A

35.654 35.639

Central Observation longitude Mode 117.061 117.074

FBS FBS

Figure 2 is the original interferogram and deviation comparison, which shows the distribution of the corresponding phase by PIFE model:

Fig. 2. Original interferogram and deviation comparison: (a) original interferogram (b) PIFE distribution (c) LS-PG distribution (d) pre-estimation of nonlinear phase result

InSAR Phase Unwrapping with the Constrained Nonlinear Least Squares Method

539

Figure 2(b) and (c) compare the average distribution of PIFE and LS phase gradient (LS-PG) estimated in the same terrain conditions. The red oblique line reflects the real terrain slope information (Fig. 2). When the slope is equal to 0 means the estimated phases are the original real phases. The smaller of the slope means the pre-estimated PG is closer to the real one. The blue dotted line reflects the ability to anti noise. The smoother of the line means the easier to distinguish the residues from the real phases. The Fig. 2(d) is the pre-estimated result of the nonlinear phase.

Fig. 3. Profile of the 20th line of interferogram and differential diagram: (a) profile of 20th line (b) azimuth direction (c) range direction

Figure 3 is the profile of the 20th line of interferogram and differential diagram. Figure 3(a) is the sectional drawing of the 20th rows; Fig. 3(b) and (c) are respectively in azimuth and range direction. A and B regions are two noise regions generated by the interference of terrain. It is can be seen in the 20th rows and 0 to 50 vertical pixels interval of the differential profile be influenced by seriously noise, the phase changed obviously in the two directions of the profile interferogram.

Fig. 4. PIFE-PG and LS-PG transformation of the 20th line profile: azimuth direction: (a1) LSPG, (a2) PIFE-PG, estimation; range direction: (b1) LS-PG, (b2) PIFE-PG, estimation

As is shown in Fig. 4a1 and b1, the profile is the LS-PG estimation, which is appeared only one peak. It obviously reflects the change of the A region but it does not reflect the changes in the B region. Figure 4a2 and b2 are the PIFE-PG estimation, in the azimuth direction the change is like frequency. But in the range direction, PIFE-PG profile appeared two troughs, it can reflect the change of phase gradient, which is as shown as formula (3). 3.2

Unwrapping Experiment and Analysis

The experiment uses a 30 m ENVISAT ASAR interferogram as the original data. The subsidence terrain caused the change of phase stripes in the diagram. The parameters of original complex phase diagram as shown in Table 2:

540

X. Zhang et al. Table 2. Seleted parameters of ASAR image experiment data

Serial NO 022403 022404

Sensor ENVISAT ENVISAT

Imaging date 20050306 20050130

Image type SLC SLC

Latitude imaging Center/(°) 116.314 6, 35.305 1 116.314 4, 35.305 1

Radar wavelength 0.056 666 0 0.056 666 0

Fig. 5. Shandong Jining area the contrast with interferogram after filtering and unwrapping results: (a) interferogram after filtering, (b) LS unwrapping, (c) Snaphu unwrapping, (d) UKF unwrapping (e) CNLS unwrapping results

The experiment segments the interest interferogram corresponding image block size is 1024  1024 pixels, Fig. 5 is a two-dimensional unwrapping result of each algorithm. The 2-D unwrapped results (Fig. 5) reflect that the residual point of LS phase unwrapping only have 168 points, but it also caused the global transmission of errors, the unwrapped phase values range (−15, 15). Snaphu [12, 13] phase unwrapping is using minimum cost flow optimization algorithm, but it could not distinguish different kinds of phase in formation, the unwrapped phase values range (−20, 15). The UKF method has the unwrapped phase values range (−15, 5), and it reflects more details information of the phases. The CNLS algorithm reflects the more detailed than the optimization algorithms, which proves the method can reduce the effect of phase residuals caused by the terrain factors. The unwrapped phase values range (−20, 10), it takes into account the terrain factors of geometric distortion effects on unwrapping. Compared with the optimization algorithm, it has the highest accuracy. Table 3 compares the runtimes of three methods, root mean square error (RMSE) of unwrapped and rewrap around the original interference, e values [11], and the residual data points: Table 3. Unwrapped time, RMSE, e values and residuals Algorithm WLS Snaphu UKF CNLS

RMSE (rad) 1.7594 0.6272 0.5393 0.3860

e value (rad) Residues (Num) Runtime (s) 0.6214 168 165.940 0.5601 495 1277.625 0.4228 246 936.948 0.3617 134 672.663


The experiments were run in Matlab 2018 on the Ubuntu operating system (Intel(R) Pentium(R) 3.20 GHz CPU, 2 GB memory). It can be seen that the WLS algorithm has no better result apart from its runtime. The Snaphu algorithm has the longest running time and the largest number of phase residues. The running time of the CNLS unwrapping method is less than those of Snaphu and UKF but longer than that of LS; solving the frequency conversion takes 147.002 s, which is included in the phase unwrapping time. However, CNLS achieves the best RMSE, e value, and residue count. The residue numbers show that the CNLS algorithm, by using the terrain slope factors as the constraint condition, reduces the phase gradient estimation biases. This shows that the CNLS algorithm is stable and adaptable and can obtain relatively reliable unwrapping results. The experiment also shows that, at the cost of a limited increase in computation to take the terrain factors into account, this approach increases the ability to resist phase distortion.

4 Conclusion

A single SAR interferogram has one baseline, but different terrain causes different slant ranges and phase instantaneous frequencies. From the formulas derived above, the phase instantaneous frequency is consistent with the variation of the slope angle and the phase gradient. The approach in this paper replaces the phase gradient with the instantaneous frequency, which solves the error-transmission problem of traditional LS phase unwrapping. Based on the principle of Woodward [14, 15], by separately calculating the azimuth and range phase instantaneous frequencies, the PG can be estimated under different terrain conditions, so the PIFE model can take the changes of terrain factors into account and be applied to phase unwrapping as the constraint condition.

Acknowledgment. This work was supported by the National Natural Science Fund (41876202, 41774002) and the Natural Science Foundation of Shandong Province (ZR2017MD020).

References
1. Liu, G.L., Hao, H.D., Yan, M.: Phase unwrapping algorithm by using Kalman filter based on topographic factors. Acta Geod. Cartogr. Sin. 40(3), 283–288 (2011)
2. Yu, H.W., Lan, Y., Yuan, Z.H.: Phase unwrapping in InSAR: a review. IEEE Geosci. Remote Sens. Mag. 7(1), 40–58 (2019). https://doi.org/10.1109/MGRS.2018.2873644
3. Dudczyk, J., Kawalec, A.: Optimizing the minimum cost flow algorithm for the phase unwrapping process in SAR radar. Bull. Pol. Acad. Sci. Tech. Sci. 62(3), 511–516 (2014)
4. Man, L., Zibang, Z., Jingang, Z.: Phase unwrapping guided by instantaneous frequency for wavelet transform profilometry. J. Optoelectron. Laser 27(8), 853–862 (2016)
5. Xie, X.M., Zeng, Q.N.: Efficient and robust phase unwrapping algorithm based on unscented Kalman filter, the strategy of quantizing paths-guided map and pixel classification strategy. Appl. Opt. 54(31), 92–94 (2015)


6. Yandong, G., Shubi, Z., Tao, L.: Adaptive unscented Kalman filter phase unwrapping method and its application on Gaofen-3 interferometric SAR data. Sensors 1793(18), 853–862 (2018)
7. Haifeng, H., Qingsong, W.: A method of filtering and unwrapping SAR interferometric phase based on nonlinear phase model. Prog. Electromagn. Res. 144(1), 67–78 (2014)
8. Zhong, H., Tian, Z., Pan, H.: A combined phase unwrapping algorithm for InSAR interferogram in shared memory environment. In: International Congress on Image & Signal Processing, pp. 1504–1509 (2015). https://doi.org/10.1109/CISP.2015.7408122
9. Tao, Z., Liu, T., Liu, Z.D.: A novel DEM reconstruction strategy based on multi-frequency InSAR in highly sloped terrain. Sci. China Inf. Sci. 60(1), 1–3 (2017)
10. Zhiyong, W., Jixian, Z., Guoman, H.: Precise monitoring and analysis of the land subsidence in Jining coal mining area based on InSAR technique. J. China Univ. Min. Technol. 43(1), 169–174 (2014)
11. Weike, L., Guolin, L., Qiuxiang, T.: Nonlinear least squares phase unwrapping based on homotopy method. Sci. Surv. Mapp. 37(4), 126–128 (2012)
12. Syakrani, N., Baskoro, E.T., Mengko, T.L.: New weighting alternatives to MCFN phase unwrapping. Int. J. Tomogr. Simul. 28(3), 39–52 (2015)
13. Boyd, J.P.: Convergence and error theorems for Hermite function pseudo-RBFs: interpolation on a finite interval by Gaussian-localized polynomials. Appl. Numer. Math. 87, 125–144 (2015)
14. Chen, C.W., Zebker, H.A.: Phase unwrapping for large SAR interferograms: statistical segmentation and generalized network models. IEEE Trans. Geosci. Remote Sens. 40(8), 1709–1719 (2002)

Research and Implementation of Distributed Intelligent Processing Architecture

Chunyu Chen1, Yong Zhang1(&), Yulong Qiao1, Hengxiang He1, and Xingfu Zhang2

1 Harbin Engineering University, Harbin, China
[email protected]
2 Beijing Focused Loong Technology Co., Ltd., Beijing, China

Abstract. In order to meet the needs of modern development, a distributed intelligent processing architecture that is easy to manage is designed and implemented for production environments with limited resources or high security requirements that are otherwise inconvenient to manage. The distributed architecture has three characteristics: intelligence, security, and high cost performance. It mainly consists of a cloud server platform, collection devices in the production environment, and an information processing workstation; the combination of software and hardware is used to allocate the limited resources reasonably. This distributed intelligent processing architecture is also applicable to other scenarios that require a large amount of data-assisted guidance.

Keywords: Distributed · Cloud server · Hardware and software combination · Resource allocation · Big data

1 Introduction In today’s production practice activities, in terms of safety considerations, many of the scenes used for production activities are no longer as negligent as management or cost a lot of manpower and resources, but adopt formal safety protection measures, such as data and intelligence processing. In this regard, in the important production environment, the staff’s access is minimized, especially for non-workers; at the same time, because the environment for production is mostly in a relatively remote location away from urban and rural areas, the resources used are limited. To sum up the above problems, modern production practices urgently require an automated modern information processing system to rationally allocate limited available resources. In order to solve this practical problem, a distributed intelligent processing architecture [1] was designed and implemented. It has been proved that this system can improve the production efficiency of production activities and has higher security.


2 Introduction of the Whole Idea

A modern information processing system [2] applied to actual production activities must have several characteristics:

1. Intelligence [3]. Based on production environment safety standards, most production environments limit the entry and exit of non-workers, and environments that emphasize safety even more may use a sterile production environment. An intelligent system must therefore be able to process basic information by itself and minimize the security risks caused by personnel entering and leaving. At the same time, to reduce the consumption of unnecessary resources and the waste of manpower and materials, the intelligent system must perform some operations instead of the staff; for example, environmental monitoring and data collection can be handed over to the intelligent processing system.

2. Security. People should rarely have to worry about the stability of the information processing system. For an intelligent distributed information processing architecture, spending resources on maintaining the system is itself a waste of resources. The system should meet accepted usage security standards and have a certain ability to respond to emergencies [4].

3. High cost performance. For production activities, the production cost must be controlled within a certain range. Therefore, the intelligent system must control the amount of equipment as much as possible while increasing production efficiency and output.

Based on the above characteristics and actual needs, the distributed intelligent processing architecture designed and used in this paper consists of the following basic parts:

1. Collection equipment installed in the production environment;
2. A server platform set up in the cloud;
3. A local information processing workstation.

The working mode of the entire distributed system architecture is illustrated in Fig. 1 below:

Fig. 1. Distributed system architecture job description

Research and Implementation of Distributed Intelligent Processing Architecture

545

For a distributed processing architecture, the first thing to do is to clarify the job function of each component. Taking the working mode shown in the figure above as an example: the server platform installed in the cloud is the resource allocation control center of the whole distributed system, used to orchestrate the allocation of all resources; the collection devices in the production environment collect real-time data; and the local information processing workstation is responsible for analyzing the environmental data and formulating a reasonable production plan. The specific working steps are as follows. When a collection device is set up in the production environment, it first sends a message containing information such as its device ID to the cloud server platform. The local information processing workstation also sends a message to the server platform when it starts up; the difference is that the information sent by the workstation describes the tasks the workstation is performing, the collection devices currently connected to it, and its remaining available processing resources. After receiving this information, the cloud server platform can process it intelligently, allocate resources reasonably, and return a message to the collecting device with the address of a workstation the device may connect to. Having received the workstation address, the collection device establishes a communication connection with the information processing workstation and forwards the production environment data to it for processing and analysis. Since the number of devices a local workstation can accept differs with its available resources, within a reasonable range the workstation can process the environmental information sent by the connected devices in real time. Based on the information obtained, the workstation can monitor the production environment in real time; once abnormal data is found, it can issue a warning and inform the production manager. These are the basic functions of a distributed intelligent processing architecture and the most basic security guarantee for modern production work. On this basis, the distributed system can be further adjusted to add other functions. First, the collection equipment installed in the production environment is composed of various information collection devices: in addition to the necessary environmental monitoring equipment, such as detectors with various functions, it can also be equipped with domain-specific devices to collect the particular data needed. Taking the actual production environment described in this article as an example, pig information can be collected in real time by erecting cameras and other equipment in the pig houses of a pig-farming company; the collection devices send this information to the workstation, realizing real-time collection of pig images, and the workstation can analyze and process these images to establish a complete information processing system.
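As a purely illustrative sketch of this registration-and-assignment step (all message fields, function names, and the in-memory registry are invented for the example, not part of the described system):

```python
# Hypothetical cloud-side allocator: devices register with an ID,
# workstations register their free capacity, and the platform pairs them.
workstations = {}   # workstation address -> remaining capacity

def register_workstation(address, capacity):
    workstations[address] = capacity

def register_device(device_id):
    """Return the address of a workstation with free capacity, or None."""
    for address, capacity in workstations.items():
        if capacity > 0:
            workstations[address] = capacity - 1
            return address          # the device then connects to this workstation
    return None

register_workstation("10.0.0.5:9000", capacity=8)
print(register_device("collector-01"))  # -> "10.0.0.5:9000"
```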
As long as these functions are realized, places with high production environment safety requirements can reduce the movement of unrelated personnel in and out, lowering security risks, while intelligent data collection is realized, thereby


ensuring the timeliness and accuracy of the data. Secondly, for scenarios where the production space is large, installing and maintaining fixed data acquisition devices is no simple matter; using portable acquisition devices reduces installation and maintenance expenditure. In this way, not only is the number of collection devices reduced, but more accurate environmental information can also be obtained through the mobile devices. The basic distributed system architecture working method described above can be optimized correspondingly; the updated working description is shown in Fig. 2.

Fig. 2. Flow chart of distributed system architecture in complex situations

The main difference between the distributed architecture in this complex situation and the architecture used in smaller spaces is that the mobile acquisition devices can be controlled in different ways. One control terminal is a remotely located control system: by communicating with the cloud server platform, it obtains the numbers of the currently running mobile collection devices, and when one of them needs to be controlled manually, the control command is sent to the cloud server platform, which forwards it to the specified collection device; after receiving the command, the device executes the operation. Likewise, when the collection device needs to report the result of executing a command, its feedback is first sent to the server platform, which forwards it to the control system, and the control system updates and displays the device information. The other control terminal is a simple control device: any device connected to the network can control the collection device after installing the corresponding software APP. First, this control device must use the same network environment as the acquisition device, so it can still communicate within the local area network when there is a network


interruption in the production environment. Secondly, this method enables real-time control of the device with precise displacement operations, achieving an essentially zero-latency control standard; however, this control capability is deliberately limited to avoid unnecessary losses caused by misoperation.
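The command-forwarding round trip described above can be timed with a sketch like the following (the endpoint callables and message format are invented for illustration):

```python
import time

def measure_round_trip(send_command, wait_for_status):
    """Time one command -> cloud platform -> robot -> status feedback loop."""
    start = time.monotonic()
    send_command({"device": "robot-01", "action": "forward"})  # via cloud platform
    wait_for_status("robot-01")                                # blocks until feedback arrives
    return time.monotonic() - start

# round_trip = measure_round_trip(send_command, wait_for_status)
# print(f"round trip: {round_trip:.3f} s")
```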

3 Practical Application Results

The distributed architecture described above has been tested in an actual production environment. This work relies on the experimental site of Beijing Focused Loong Technology Co., Ltd. to carry out a complete data collection deployment. The specific components are as follows:

1. Cloud server control platform built on the cloud server foundation provided by Alibaba Cloud. Alibaba Cloud is committed to providing secure, reliable computing and data processing capabilities as an online public service that offers reliable computing support for businesses and developers. The cloud server ECS it provides has the advantages of high availability, security, and flexibility, so it is well suited for building the cloud server platform. The server platform built on the cloud can receive two kinds of information: one is the environmental data from the collection devices in the production environment, which mainly comprises environmental measurements and the specific data transmitted on demand; the other is the available-resource information from the local information processing workstations, which mainly contains the resources available to the current workstation and the devices to which it is connected.

2. Watcher track patrol robots set up in the experimental pig farm. In recent years, due to the impact of African swine fever, small-scale pig farms are on the verge of elimination because of poor safety awareness, while large-scale farms are becoming more aware of environmental safety hazards. To solve the safety problems of the farm and reduce the spread of viruses caused by the flow of people, a mobile acquisition device — the watcher track patrol robot — was designed and installed. The portable device is mainly used to collect image data of pigs in real time and to monitor environmental data. The customized track patrol robot can greatly reduce the consumption of human resources and is very suitable for large production spaces. Apart from the necessary charging time, each track patrol robot performs data collection and environmental monitoring in its defined area.

3. Data analysis processing workstation. Workstations have different information processing capabilities depending on the amount of resources they can use. The workstation can monitor changes in the production environment in real time according to the environmental data transmitted from the collection devices, and can issue warnings to alert management personnel when necessary. Most importantly, the workstation can establish an


environmental-data and pig-image information-processing system for predicting pig information from the large amount of collected data, which plays an important role in the data-driven, scientific management of the farming industry.

4. Equipment display and control system. For the mobile data acquisition device, video images are transmitted to the control system in real time through the installed camera to realize monitoring. When necessary, the control system can steer the mobile device to focus on key areas.

5. Portable communication control program. In a modern production environment, the acquisition device can be controlled at close range with the mobile app, so control remains possible through the local area network even when the external network is interrupted. The portable communication control program also sets corresponding management permissions to avoid losses.

Based on the equipment listed above, the test environment and related test results are described as follows:

1. Intelligence. In earlier farm operation, all feeding and cleaning work had to be done by workers, and handling various emergencies consumed considerable manpower and resources. The distributed intelligent processing architecture addresses several hidden dangers: (1) data monitoring of the production environment, with timely detection of abnormal changes in temperature and humidity; (2) monitoring of the pigs, where inspection by the track robot can promptly detect abnormal states such as illness or fighting so they can be treated in time; (3) analysis of each pig's body-condition information to determine appropriate feed amounts and manage the pigs' condition reasonably; (4) an overall estimate of total pig weight and related information, reminding the staff to sell when the pigs' weight is most appropriate to obtain the best return.

2. Real-time network transmission. For overall performance, the information-transmission path of the test environment was measured. Timing starts when the remotely located control system issues a command (for example, a forward or backward movement instruction); the command transits through the cloud server control platform and is then sent to the watcher track patrol robot, which executes the movement and sends its status information back to the cloud server for return to the remote display and control system. In this transmission-and-feedback loop, the experimental data obtained from actual testing are as follows: under a good network environment at the experimental farm, command information transmitted by the control system is received by the patrol robot after an average of 0.15 s, and status information sent by the robot reaches the control system in about 0.19 s. Long-running tests were also carried out in the overall practice


environment. Within the entire distributed architecture, this information-transmission process is sufficient to meet the real-time requirements of the experimental scenarios. The portable communication control program runs on a mobile phone in the same network environment as the robot; its real-time behavior is therefore restricted only by the local network, where there is almost no network delay, making it very suitable for simple device control.

3. Cost performance. In building the infrastructure, only the workstations that process large amounts of data have high resource requirements; the remaining components are not very demanding, and appropriate changes can be made depending on job requirements and available resources. Taking the experimental farm as an example: once the infrastructure is built, adding a new environment only requires installing one set of watcher track patrol robots and the corresponding tracks in the new pig house. The remaining equipment and facilities can allocate resources intelligently to the newly added data-acquisition robot, without installing a separate equipment control system and information-processing workstation for each pig house. This intelligent production method follows modern trends: for large and medium-sized enterprises, scientific farming information, intelligent resource allocation, and higher safety standards can be obtained at a relatively low production cost, so the cost performance is very high.
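To make the latency test above concrete, the following is a minimal sketch of how such a round-trip measurement could be scripted. The transport callables (send_command, wait_for_event) and the event names are hypothetical stand-ins for the platform's actual cloud-relay API, which is not specified here.

```python
import time

def measure_round_trip(send_command, wait_for_event):
    """Time the two legs of the command/feedback loop described above.

    send_command forwards a control command via the cloud platform;
    wait_for_event blocks until the named event arrives. Both callables
    are assumptions, not part of the system's published interface.
    """
    t0 = time.monotonic()
    send_command("MOVE_FORWARD")        # control system -> cloud -> robot
    wait_for_event("COMMAND_RECEIVED")  # robot confirms receipt (~0.15 s observed)
    t1 = time.monotonic()
    wait_for_event("STATUS_RETURNED")   # robot status -> cloud -> control system (~0.19 s)
    t2 = time.monotonic()
    return t1 - t0, t2 - t1

# Example with dummy stand-ins (a real transport would be the cloud API):
lat_cmd, lat_status = measure_round_trip(
    send_command=lambda cmd: None,
    wait_for_event=lambda name: time.sleep(0.01))
print(f"command leg: {lat_cmd:.3f} s, status leg: {lat_status:.3f} s")
```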

4 Conclusions

Given the security requirements of the production environment, adopting a distributed architecture in practical applications is very reasonable, but it also makes the requirements on the architecture strict, whether for the data acquisition devices installed at the production site, the server system deployed in the cloud, or the local information-processing workstations. When one of the devices must reduce its count or resources for some reason, the resource information can be sent to the cloud server platform, which makes a decision, redefines a new data-transmission path, or reallocates resources according to requirements. For the cloud server platform, after the initial resource-allocation task is completed at device startup, it only needs to maintain communication with the devices to ensure that their states are normal, and to mediate between the control system and the collection devices, passing control and feedback information in both directions when manual control is required; this resource consumption is necessary. Even if the cloud service platform is temporarily disconnected from the devices because of network fluctuations, the connection between the collection devices and the information-processing workstations ensures that production-environment monitoring data will not be lost in the short term. A disconnection-and-reconnection mechanism is also adopted: after the cloud server returns to normal, the information of the collection devices and


the workstations will be re-received, so such interruptions have no major impact on the entire system, let alone cause irreparable damage. The distributed architecture also improves security: components performing different tasks are installed in different places, which not only reduces the risk of common-mode loss in special situations but also meets the need for intelligent processing of production work, so that management staff can fully understand the on-site environment without leaving home; in case of danger, they can also take control of the situation remotely and determine a reasonable solution. An intelligent distributed processing architecture can be simplified by adopting modular components organized by function; when new devices are added or the system is installed in a new usage scenario, only the relevant part needs to be configured, and the system can be upgraded without affecting the original architecture. Such an architecture suits most current production environments, and it is especially suitable for industries that need to adjust and upgrade their industrial structure, being easy to manage, reducing manpower consumption, and easy to upgrade.

Acknowledgments. This study was conducted under the projects "Dynamic Texture Analysis and Applied Research Based on Signal Processing on Graphs" (National Natural Science Foundation of China: 61871142) and "Research and Implementation of Multi-sensor Information Fusion and Decision-making System Based on Artificial Intelligence Architecture" (KY10800180032). Thanks to the practice environment and financial support provided by Beijing Focused Loong Technology Co., Ltd., to the technical guidance and support of Teacher Chunyu Chen and Teacher Yulong Qiao, and to the selfless help provided by our partners.


Sentiment Analysis System for Myanmar News Using Support Vector Machine and Naïve Bayes

Thein Yu¹ and Khin Thandar Nwet²

¹ University of Computer Studies, Yangon, Myanmar
[email protected]
² University of Information Technology, Yangon, Myanmar
[email protected]

Abstract. With the growth of web technology, a huge amount of information is available on the web for Internet users. Users not only consume this information but also contribute opinions that feed decision-making processes. Sentiment analysis, or opinion mining, is a text-categorization technique that extracts the opinion expressed in a piece of text. For this system, a sentiment-annotated corpus was created. Feature extraction and selection are needed in sentiment analysis to obtain high performance: n-grams are used for feature selection and TF-IDF is used for feature extraction. Machine learning creates computer programs that improve their performance with experience, combining techniques and foundations from both statistics and computer science. There are generally three types of machine learning: supervised learning, unsupervised learning, and hybrid learning. Supervised machine learning, also known as classification, models the relationship between the feature set and the label set. In this system, a Myanmar sentiment analysis system is implemented using supervised machine learning, and the results of the support vector machine (SVM) and Naïve Bayes algorithms are compared.

Keywords: Sentiment analysis · SVM · Naïve Bayes · N-gram · TF-IDF

1 Introduction

Opinion mining research is the computational analysis of subjective information contained in text. Sentiment is usually considered a categorical variable with three values: positive, negative, and neutral. With the growth of Web 2.0 technology, social media has become enormously popular thanks to the huge and rapid advances in information technology. Sentiment analysis has been applied in almost every possible domain, from consumer products, services, healthcare, and financial services to social events and political elections. News is available from many sources and provides valuable information. Sentiment analysis for the Myanmar language faces many challenges due to the scarcity of resources such as automatic feature-extraction tools, stemming, anaphora resolution, named-entity recognition, and so on. In this paper, an automatic sentiment analysis system for Myanmar news is proposed. N-grams are used as the feature-selection method.


TF-IDF is used in this system for feature weighting. Support vector machine and Naïve Bayes algorithms are applied to implement the sentiment analysis system. The remainder of this paper is structured as follows: related work is reviewed in Sect. 2; the proposed system is described in Sect. 3; experimental results are presented in Sect. 4; and the conclusion and future work are given in Sect. 5.

2 Related Works

There are many sentiment analysis systems for English and other languages; different systems use different resources and methods at different levels. Godbole, Srinivasaiah, and Skiena proposed a sentiment analysis system for news that assigns positive and negative scores to the subjects expressed in a news corpus; they identify the sentiment of each entity and score entities relative to others of the same class [1]. In [3], the authors implemented a tool able to find the polarity of opinions in reviews extracted from e-commerce magazines and blogs in Arabic. They developed a small symbol-to-word converter for emoticons and a checker for elongated words, and represented the reviews with six different feature models (unigrams, bigrams, trigrams, unigrams + bigrams, bigrams + trigrams, unigrams + bigrams + trigrams). The best results were obtained by combining unigrams, unigrams + bigrams, and unigrams + bigrams + trigrams on a standard corpus with Arabic light stemming, classified by Naïve Bayes, and by the combinations unigrams + bigrams and unigrams + bigrams + trigrams on a preprocessed, stemmed corpus classified by support vector machine (SVM). Swati, Pranali, and Pragati proposed a system that crawls news from web pages, extracts the desired data with an HTML parser, extracts features using the Weka tool, and classifies sentiment with the Naïve Bayes algorithm [4]. Ayutthaya and Pasupa developed Thai sentiment analysis for Thai children's stories, incorporating part-of-speech and sentic features and implementing the system with a combination of bidirectional long short-term memory and convolutional neural network models; on a dataset of 40 Thai stories, their best F1 score was 78.79 [5]. Okada, Yanagimoto, and Hashimoto developed a sentiment analysis system for customer reviews using the Amazon product-review dataset and a Japanese TripAdvisor review dataset; they used a gated convolutional neural network model and showed that the gated CNN is faster than an RNN across the different language datasets [7].

3 Proposed System

3.1 Training Dataset

Myanmar news data are collected from websites and the ALT Treebank for training and testing. The task is to identify whether positive or negative sentiment is expressed. The features of the dataset are as follows:

• Originally procured 3,000 Myanmar news items.
• 2,000 news items annotated as positive in the dataset.
• 1,000 news items annotated as negative in the dataset.
• Total number of words in the dataset: 50,395.
• Average number of words per news item: about 28.
• Total number of words in positive news: 40,635.
• Average number of words per positive news item: about 26.
• Total number of words in negative news: 9,760.
• Average number of words per negative news item: about 32.

We aim to identify whether a news item evokes a positive or a negative emotion.

3.2 Preprocessing

Preprocessing is the first step in natural language processing. There are three preprocessing steps in the proposed system: word segmentation, tokenization, and stop word removal.

• Word segmentation is the basic natural language processing task of determining word boundaries. Myanmar word segmentation is the process of inserting spaces into textual data without any other replacing or rewriting operations. This system uses the word segmentation tool from the NLP Lab, University of Computer Studies, Yangon, Myanmar.

• Tokenization is the process of breaking a sequence of strings into words, phrases, keywords, and other elements. Tokens are separated and identified by white space, punctuation marks, or line breaks.
• Stop words are commonly used words that are programmed to be ignored in searching, retrieval, and other natural language processing tasks. Stop word removal is a basic but important preprocessing step for obtaining better performance; common Myanmar function words are removed at this stage.

3.3 Feature Extraction and Transformation

N-gram. An n-gram model is a language model that assigns probabilities to sequences of words; it is based on the bag-of-words model. An n-gram is a sequence of words of length n, and n-grams are among the most important tools in speech and language processing. An n-gram of


length n = 1 is called a unigram, length n = 2 a bigram, and length n = 3 a trigram. Text classification depends on text representation [2], and using n-grams raises the system's accuracy. Examples of n-gram features for a sample sentence are shown in Table 1.

Table 1. N-gram feature example

TF-IDF. TF-IDF calculates a weight that shows the importance of a term in a document when representing textual data; it compares the frequency of a word in an individual document against the entire corpus. TF-IDF is based on the bag-of-words (BoW) model, so it does not account for position in the text, semantics, or co-occurrences across documents. TF is the term frequency, which gives the frequency of a word within each document of the corpus and rises as the number of occurrences of the word in the document rises [6]. TF is computed as in Eq. (1):

TF(t) = (number of times term t appears in a document) / (total number of terms in the document)   (1)

IDF is the inverse document frequency, which weights rare words across all documents in the corpus; words that occur scarcely in the corpus have a high IDF score. IDF is computed as in Eq. (2):

IDF(t) = log_e(total number of documents / number of documents containing term t)   (2)


Then, TF-IDF is calculated as in Eq. (3) [6]:

TF-IDF(t) = TF(t) × IDF(t)   (3)

As an example, the TF-IDF calculation for two news sentences is shown in Table 2.

Table 2. TF-IDF values
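As an illustration of the n-gram and TF-IDF pipeline described above, the following is a minimal sketch using scikit-learn; the two toy English documents merely stand in for the word-segmented Myanmar sentences, which are not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Two toy, already-segmented documents (placeholders for Myanmar news).
docs = ["the market rises today", "the market falls today"]

# ngram_range=(1, 2) extracts unigrams + bigrams, the feature set that
# performs best in Sect. 4; each n-gram is weighted per Eqs. (1)-(3).
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(docs)

for term, idx in sorted(vectorizer.vocabulary_.items()):
    print(f"{term:20s} tf-idf in doc 0: {X[0, idx]:.3f}")
```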

3.4 Classification

Support Vector Machine. The support vector machine (SVM) is one of the most widely applied supervised machine learning algorithms and can be used for both classification and regression. SVM is often regarded as the classifier that yields the highest accuracy in text classification problems. Each data item is plotted as a point in n-dimensional space (n being the number of features), with the value of each feature being the value of a particular coordinate. Classification is performed by searching for the hyperplane that best separates the two classes; SVM selects the hyperplane with the maximum margin between the nearest data points of either class [8]. The decision function is given in Eq. (4):

f(x) = w · x + b   (4)


The hyperplane satisfies Eq. (5):

w · x + b = 0   (5)

For the positive class, Eq. (6) holds:

w · x + b > 0   (6)

For the negative class, Eq. (7) holds:

w · x + b < 0   (7)

Here x is the input feature vector, w the weight vector, and b the bias.

Naïve Bayes. Bayesian classification is a probabilistic learning method for classifying text. The Naïve Bayes classifier assumes independent features: the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature. Naïve Bayes classifiers can be trained very efficiently given the precise nature of the probability model. The probability of a sentiment class given a document is given in Eq. (8):

P(c|d) = P(c) P(d|c) / P(d)   (8)

where c is the sentiment class and d the document. P(d) has the same value for every class, so it can be dropped. Since a document has many features, the decision function becomes Eq. (9):

P(c|d) = argmax_c P(c) ∏_{f ∈ F} P(f|c)   (9)

where F is the feature vector [9, 10].
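The following is a minimal sketch of training and comparing the two classifiers with scikit-learn, using LinearSVC as the SVM and MultinomialNB as the Naïve Bayes model; the four toy documents and their labels are placeholders for the annotated Myanmar corpus, not the paper's actual data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Placeholder corpus; in the paper these are 3,000 segmented Myanmar
# news items labelled positive (1) or negative (0).
texts = ["good economic growth", "terrible flood damage",
         "successful trade deal", "fatal road accident"]
labels = [1, 0, 1, 0]

X = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(texts)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25,
                                          random_state=42)

for name, clf in [("SVM", LinearSVC()), ("Naive Bayes", MultinomialNB())]:
    clf.fit(X_tr, y_tr)                       # train on TF-IDF features
    print(name, "accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```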

4 Experimental Result

The most effective feature set in this system is the combination of unigrams and bigrams: the highest accuracy (81.50%) is obtained using unigram + bigram features with the support vector machine. Table 3 shows the accuracy scores of the proposed system.

Table 3. Accuracy score values

Algorithm   | Unigram | Bigram | Trigram | Unigram + Bigram | Bigram + Trigram | Unigram + Bigram + Trigram
SVM         | 81.33%  | 72.33% | 65.33%  | 81.50%           | 68.50%           | 80.33%
Naïve Bayes | 73.33%  | 65.17% | 64.67%  | 68.00%           | 65.00%           | 66.00%


5 Conclusion and Future Work

This system classifies the sentiment of news in the Myanmar language. N-grams are used for feature extraction and TF-IDF for feature weighting; using the system, a user can easily gauge the emotion of a news item. Support vector machine and Naïve Bayes algorithms are used, and their accuracies are compared across many feature combinations. In the future, we intend to experiment with further algorithms such as logistic regression, convolutional neural networks, maximum entropy, and random forests.

References
1. Godbole, N., Srinivasaiah, M., Skiena, S.: Large-scale sentiment analysis for news and blogs (2007)
2. Jurafsky, D., Martin, J.H.: N-gram language models. In: Speech and Language Processing, 23 September 2018
3. Sghaier, M.A., Zrigui, M.: Sentiment analysis for Arabic e-commerce websites, June 2015
4. Swati, U., Pranali, C., Pragati, S.: Sentiment analysis of news articles using machine learning approach. Int. J. Adv. Electron. Comput. Sci. 2(4), 114–116 (2015)
5. Ayutthaya, T.S.N., Pasupa, K.: Thai sentiment analysis via bidirectional LSTM-CNN model with embedding vectors and sentic features (2018)
6. Aizawa, A.: An information-theoretic perspective of TF-IDF measures. Inf. Process. Manag. 39, 45–65 (2003)
7. Okada, M., Yanagimoto, H., Essand, K.H.: Sentiment classification with gated CNN for customer reviews (2018)
8. Fletcher, T.: Support vector machines explained, 23 December 2008
9. www.machinelearningblog.com/tutorial
10. http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html

State Evaluation of Electric Bus Battery Capacity Based on Big Data

Yifan Li¹,², Weidong Fang¹,², Fumin Zou¹,², Sheng Wang¹,², Chaoda Xu¹,², and Weisong Dong¹,²

¹ Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology, Fuzhou 350118, Fujian, China
[email protected], [email protected]
² Fujian Provincial Big Data Research Institute of Intelligent Transportation, Fujian University of Technology, Fuzhou 350118, Fujian, China

Abstract. Compared with traditional fuel vehicles, electric vehicles have the advantages of low carbon emissions, low pollution, and low noise. Major auto manufacturers are actively exploring the field of electric vehicles, which will become the leading force in the automotive field in the future. However, electric vehicles have been criticized for short battery life, insufficient cruising range, and slow charging. Battery capacity is an important indicator that directly reflects battery health. This paper proposes a big-data-based approach to assess the capacity of electric bus batteries. First, from the vehicle's historical data, the relation between the vehicle's SOC change and the battery's discharged energy is analyzed; with a certain amount of data as the training set, a regression algorithm is used to construct an electric-vehicle battery-capacity prediction model. The model is used to determine the average capacity of the vehicle battery in stages, and comparing the average capacities of different stages yields an assessment of the health status of the electric bus battery.

Keywords: Electric vehicle · Battery capacity · Big data · Regression · State of health · State of charge

1 Introduction

According to the relevant regulations, the discharge capacity of an electric vehicle battery should not fall below 80% of its initial capacity after 1,000 charge cycles. In practice, as usage accumulates, a series of chemical reactions inside the battery leads to a decline in capacity and lifetime, seriously affecting the safety and cruising range of the vehicle. Timely and accurate detection of battery health status is therefore particularly important. A series of methods for detecting the health status of electric vehicle batteries has been proposed at home and abroad. Building electrochemical models [1] and equivalent circuit models [2] are common approaches. In addition, Kim et al. [3] proposed a new method for estimating the health status of lithium batteries based on a dual sliding-mode


observer. Remmlinger et al. [4] proposed detecting the degree of battery degradation from the increase in battery resistance. Jiao et al. [5] proposed a method for diagnosing the health status of electric vehicle batteries based on the discrete Fréchet distance. Li et al. [6] proposed an integrated method hybridizing a mixture-of-Gaussian-process (MGP) model with a particle filter (PF) for SOH estimation of lithium-ion batteries under uncertain conditions. During the use of an electric vehicle, not only do complicated chemical reactions occur, but external factors also exert influence, so it is difficult to establish a precise model. This paper proposes a method for detecting the state of electric vehicle battery capacity based on big driving data. It aims to establish a battery-capacity estimation model by mining data, skipping the complex internal mechanisms and studying battery health detection purely from the data point of view.

2 Data Preprocessing

Battery health status can be regarded as the ratio of the current maximum discharge capacity to the rated capacity. It is an important parameter reflecting battery performance and life, and it describes the irreversible performance-degradation process of the battery. Many factors affect battery health, such as the charge and discharge rate, the depth of charge and discharge, and the ambient temperature; these factors directly affect the chemical reactions inside the battery and thus its internal resistance, capacity, and other properties. Comparing the current capacity with the rated capacity directly reflects the health of the battery. The current capacity can be measured with a charge-discharge test, but such a test usually requires taking the battery offline, which interferes with its normal use. A data-driven capacity-estimation method that requires neither taking the battery offline nor interrupting normal vehicle operation compensates well for this disadvantage.

2.1 Data Background

In accordance with the "Regulations on the Administration of New Energy Vehicle Manufacturing Enterprises and Products" issued by the Ministry of Industry and Information Technology and the "Notice of the Ministry of Industry and Information Technology on Further Improving the Safety Supervision and Application of New Energy Vehicles," information technology is used to establish and improve local inspection platforms for the promotion and application of new energy vehicles in the public-service field. These platforms receive, in real time and forwarded by the vehicle manufacturers, the safety status, mileage, charge, and critical-system fault information (for example, of batteries and drive motors) of new energy vehicles in the local jurisdictions' public-service areas. The experimental data are taken from such a new-energy-vehicle detection platform. The data are read by the vehicle terminal from the vehicle CAN bus and uploaded over a GPRS wireless network through a proprietary protocol, and include a series of driving attributes such as data time, latitude,


longitude, vehicle speed, SOC, accumulated odometer mileage, battery voltage, lithium-battery total output energy, and lithium-battery total input energy.

2.2 Data Processing

The experimental data are the historical driving data of the CY8413 electric bus in Quanzhou, Fujian Province from November 2018 to May 2019. The data include multiple standard attributes, of which the experiment needs only a few; after filtering, only the data time, the SOC of the battery module (%), and the lithium-battery total output energy (kWh) are retained. The quantity of interest is the capacity when the battery is fully charged, that is, the capacity at 100% SOC. Battery capacity is usually expressed in Ah, the length of time for which a certain rated current can be discharged; however, since the lithium-battery output in the driving data is recorded in kWh, this paper uses kWh as the unit of battery capacity. Data visualization (partial results shown in Fig. 1) reveals that the bus is recharged at night after completing its daily operation, so that it leaves fully charged the next day, and buses are also given a chance to recharge in the late afternoon. Each day therefore contains two SOC-decline stages, which are the battery-discharge stages, and the data required in this paper are taken from these two stages. Taking the driving data of March 2 and 3, 2019 as an example, the SOC change is shown in Fig. 1.

Fig. 1. Bus battery SOC change

Data may be affected by many factors during acquisition and transmission, so data loss and noise are prone to occur. Across the experimental data as a whole, only a small amount of data is missing, so only missing values need to be handled. In data mining, missing values are generally treated by padding or by deletion; deletion is applicable when missing values account for a small proportion of the total data, so that removing them has little influence on the result. The proportion of missing values in this experimental dataset is extremely small and will not affect the regression model, so the samples with missing data are simply eliminated.


The "lithium battery total output energy (kWh)" attribute records the battery's cumulative output energy, while the experiment needs the energy discharged within each individual discharge session to build the model. Therefore, the "lithium battery total output energy (kWh)" value at the start of each session is taken as a baseline and subtracted from all subsequent values in that session, so that the output-energy feature is zero every time the bus starts. In this way, the change in the lithium battery's output energy during each discharge session is reflected directly. Part of the preprocessed data is shown in Table 1.

Table 1. Partial data after preprocessing

Data time        | SOC (%) | Output total energy (kWh)
03-03-2019 06:51 | 100     | 0
03-03-2019 06:51 | 100     | 0
03-03-2019 06:51 | 100     | 0.01
03-03-2019 06:52 | 100     | 0.01
03-03-2019 06:52 | 100     | 0.01
03-03-2019 06:52 | 100     | 0.02
03-03-2019 06:52 | 100     | 0.02
03-03-2019 06:52 | 100     | 0.02
03-03-2019 06:52 | 100     | 0.02
03-03-2019 06:52 | 100     | 0.03
03-03-2019 06:53 | 100     | 0.05
03-03-2019 06:53 | 100     | 0.06
03-03-2019 06:54 | 100     | 0.08
03-03-2019 06:54 | 100     | 0.09
03-03-2019 06:54 | 100     | 0.1
03-03-2019 06:54 | 100     | 0.11
03-03-2019 06:54 | 100     | 0.14
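A minimal pandas sketch of this baseline-zeroing step follows. The CSV file and column names are assumptions for illustration, and the session boundary is inferred here from SOC increases, which is one plausible reading of the per-start reset described above.

```python
import pandas as pd

# Hypothetical file and column names; the platform's raw export uses
# its own schema.
df = pd.read_csv("cy8413_driving_data.csv",
                 usecols=["data_time", "soc", "output_total_energy_kwh"])
df = df.dropna()                 # missing samples are simply removed
df = df.sort_values("data_time")

# A new discharge session starts whenever SOC rises (overnight or
# noon recharging), so count upward SOC jumps to label sessions.
session_id = (df["soc"].diff() > 0).cumsum()

# Subtract each session's first cumulative-energy reading, so every
# discharge session starts from zero as described in the text.
df["session_energy_kwh"] = (
    df["output_total_energy_kwh"]
    - df.groupby(session_id)["output_total_energy_kwh"].transform("first"))
```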

3 Establishment of Regression Model

3.1 Support Vector Regression

With the help of Python's scikit-learn machine learning library, the data are fitted using the linear kernel of support vector regression (SVR). The training set of a regression problem is similar to that of a classification problem and can be expressed as:

T = {(x_1, y_1), (x_2, y_2), …, (x_l, y_l)} ∈ (R^n × Y)^l   (1)


where x_i ∈ R^n is the input vector, whose components are called attributes or features, and y_i ∈ Y = R (i = 1, 2, …, l) is the output. The goal is to find a linear function g(x) = (w · x) + b on R^n, so that y = g(x) can infer the output value y corresponding to any input x. The objective function of the SVR problem can be expressed as:

min_{w,b} (1/2) ||w||²
s.t. (w · x_i) + b − y_i ≤ ε, i = 1, …, l
     y_i − (w · x_i) − b ≤ ε, i = 1, …, l   (2)

The constraints mean that all training sample points should lie within the ε-band of the regression line, and the objective function means that among the lines satisfying these conditions, the one with the smallest slope is sought. The training samples were taken from the preprocessed data, with 70% of the samples used as the training set to construct the regression model and 30% as the test set to evaluate it. In addition, since the attenuation of battery health is irreversible, the prediction model must be updated continuously as new driving data arrive, and the battery capacity is estimated from the latest discharge data.
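A sketch of this fitting step with scikit-learn is given below. The file and column names are assumptions carried over from the preprocessing sketch above; only the linear-kernel SVR and the 70/30 split come from the text.

```python
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

df = pd.read_csv("preprocessed_discharge.csv")  # columns assumed as in Table 1
X = df[["soc"]].values                          # feature: SOC (%)
y = df["session_energy_kwh"].values             # target: discharged energy

# 70% training / 30% test split, as described above.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = SVR(kernel="linear").fit(X_tr, y_tr)
pred = model.predict(X_te)
mse = mean_squared_error(y_te, pred)
print("MSE:", mse, "RMSE:", mse ** 0.5)         # metrics of Eqs. (4)-(5)
```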

3.2 Model Evaluation

The test-set samples are fed to the established model, and the predictions are compared with the real values. Data from March 2 and March 3, 2019 are selected, giving the test results shown in Figs. 2 and 3, which compare true and predicted values and show the error distribution.

Fig. 2. Data evaluation results on March 2, 2019


Fig. 3. Data evaluation results on March 3, 2019

The predicted value in the left graph rises stepwise because the SOC attribute in the data changes in steps of 0.4%. The ordinate label "error" in the right image is the absolute value of the difference between the predicted and true values:

error = |y_pre − y_test|   (3)

where y_pre is the predicted battery capacity and y_test the true battery capacity. MSE (mean square error) and RMSE (root mean square error), which are often used to evaluate the performance of regression models, are adopted here:

MSE = (1/m) Σ_{i=1}^{m} (y_pre − y_test)²   (4)

RMSE = √[ (1/m) Σ_{i=1}^{m} (y_pre − y_test)² ]   (5)

Table 2. Model prediction error

Data time  | MSE   | RMSE
2018/11/15 | 0.673 | 0.821
2018/12/15 | 0.420 | 0.648
2019/01/22 | 0.485 | 0.697
2019/02/22 | 0.159 | 0.398

Taking four days, November 15 and December 15, 2018, and January 22 and February 22, 2019, as examples, the mean square error and root mean square error of the regression model are calculated as shown in Table 2.


Both the mean square error and the root mean square error are below 1, indicating high overall prediction accuracy; the regression model is reliable.

3.3 Battery Capacity Estimation

If the SOC is 100% when the vehicle starts, the trained regression model can be used directly to predict the total energy discharged by the lithium battery when the SOC reaches 0; this value is the total battery capacity. However, during the experiment it was found that the SOC is often below 100% at start-up, especially after the short charging window at noon, when the bus frequently cannot charge enough for the SOC to reach 100%. In the preprocessing stage, the "lithium battery total output energy" attribute is zeroed at the beginning of each SOC-decline stage to represent the energy discharged during that session, so if a discharge stage begins with an SOC below 100%, the final prediction will be too small. To solve this problem, the total battery capacity is calculated with Eq. (6):

S = b + (−w) × (100 − c₀)   (6)

where S is the current total capacity of the vehicle battery, b is the constant term of the regression formula, w is the coefficient of its linear term, and c₀ is the starting SOC of the discharge stage. Because of various influencing factors, capacity differs even for the same battery at similar times. Since the attenuation of battery health is a slow process, the daily battery capacity is estimated with the prediction model combined with the algorithm above, and the monthly average is then computed. Using each month's average battery capacity as the representative value over time reduces the impact of other factors on the prediction results. In addition, because the discharge process of the battery is continuous, temperature affects the battery capacity as a continuous variable, so the average temperature over the driving period can be taken as an influencing factor. The statistics of the results are given in Table 3.

Table 3. Statistics of battery capacity estimation results

Date    | Average temperature (°C) | Average battery capacity (kWh)
2018/11 | 19.28 | 129.43
2018/12 | 17.17 | 126.10
2019/01 | 13.50 | 118.24
2019/02 | 15.72 | 117.37
2019/03 | 16.80 | 118.57
2019/04 | 19.81 | 121.72
2019/05 | 22.33 | 121.27
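Continuing the SVR sketch above, Eq. (6) can be applied as shown below. Reading w and b from the fitted model's coef_ and intercept_ attributes is standard scikit-learn; the sign convention assumes the regression slope of discharged energy against SOC is negative, which is our reading of Eq. (6) rather than an explicitly stated detail.

```python
# w and b are the slope and intercept of the fitted linear SVR
# (attribute shapes per scikit-learn's linear-kernel SVR).
w = model.coef_[0][0]      # coefficient of the SOC term (assumed negative)
b = model.intercept_[0]    # predicted energy at SOC = 0

def total_capacity(c0):
    """Estimated full-battery capacity for a session starting at SOC c0 (%)."""
    return b + (-w) * (100 - c0)

print(total_capacity(100))   # session that started fully charged
print(total_capacity(92.5))  # a partially charged start is compensated
```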


Table 3 shows that battery capacity is significantly affected by temperature. Although capacity declines with use, comparing the values of November 2018 with those of April and May 2019 reveals that the battery was in a low-capacity state in January, February, and March 2019; this low-capacity state reflects not only the decline of the battery's own health but also the impact of the temperature drop. Therefore, the true attenuation of battery capacity, and hence the change in battery health, can only be judged from the difference in average capacity between months with similar temperatures.

4 Conclusions

In this paper, driving data of new-energy buses from a new-energy-vehicle monitoring platform are used as the research data. Through data visualization and preprocessing, the daily operation pattern of the buses is understood, and the data are used to build a regression model that predicts the bus's battery capacity each day. Summarizing all results statistically gives the staged capacity of the electric bus battery, represented by month. Comparing the battery capacity of the same bus across stages reflects both the change in battery health and the change in capacity caused by ambient temperature. The data show that, in addition to the decay of the battery's own life, when the temperature is low (below 25 °C) the battery capacity decreases as the temperature decreases. Comparing the average battery capacity between months with similar average temperatures yields the degree of attenuation of battery health. Because the experimental data are limited, their time span is not large enough to compare several years, which may make the results insufficiently conclusive. With the popularity of electric vehicles and the growth of data volume, given a longer time distribution, this method of estimating the health status of electric vehicle batteries will lead to more objective results.

References
1. Klein, R., Chaturvedi, N.A., Christensen, J., et al.: State estimation of a reduced electrochemical model of a lithium-ion battery. In: American Control Conference. IEEE (2010)
2. Vasebi, A., Partovibakhsh, M., Bathaee, S.M.T.: A novel combined battery model for state-of-charge estimation in lead-acid batteries based on extended Kalman filter for hybrid electric vehicle applications. J. Power Sources 174(1), 30–40 (2007)
3. Kim, I.S.: A technique for estimating the state of health of lithium batteries through a dual-sliding-mode observer. IEEE Trans. Power Electron. 25(4), 1013–1022 (2010)
4. Remmlinger, J., Buchholz, M., Meiler, M., et al.: State-of-health monitoring of lithium-ion batteries in electric vehicles by on-board internal resistance estimation. J. Power Sources 196(12), 5357–5363 (2011)
5. Jiao, D., Wang, H., Zhu, J., et al.: EV battery SOH diagnosis method based on discrete Fréchet distance. Power Syst. Prot. Control 44(12), 68–74 (2016)
6. Li, F., Xu, J.: A new prognostics method for state of health estimation of lithium-ion batteries based on a mixture of Gaussian process models and particle filter. Microelectron. Reliab. 55(7), 1035–1045 (2015)
7. Park, S., You, G.W., Oh, D.J.: Data-driven state-of-health estimation of EV batteries using fatigue features. In: IEEE International Conference on Consumer Electronics. IEEE (2016)
8. You, G.W., Park, S., Oh, D.: Real-time state-of-health estimation for electric vehicle batteries: a data-driven approach. Appl. Energy 176, 92–103 (2016)
9. Hametner, C., Jakubek, S., Prochazka, W.: Data-driven design of a cascaded observer for battery state of health estimation. In: IEEE International Conference on Sustainable Energy Technologies. IEEE (2017)
10. Bhangu, B.S., Bentley, P., Stone, D.A., et al.: Nonlinear observers for predicting state-of-charge and state-of-health of lead-acid batteries for hybrid-electric vehicles. IEEE Trans. Veh. Technol. 54(3), 783–794 (2005)
11. Chen, Z., Mi, C.C., Fu, Y., et al.: Online battery state of health estimation based on genetic algorithm for electric and hybrid vehicle applications. J. Power Sources 240(31), 184–192 (2013)
12. Li, Z.: Research and application of support vector regression machine. Dalian University of Technology (2006)
13. Patil, M.A., Tagade, P., Hariharan, K.S., et al.: A novel multistage support vector machine based approach for Li ion battery remaining useful life estimation. Appl. Energy 159, 285–297 (2015)

High-Utility Itemset Mining in Big Dataset

Jimmy Ming-Tai Wu¹, Min Wei¹, Jerry Chun-Wei Lin², and Chien-Ming Chen¹

¹ Shandong University of Science and Technology, Qingdao, China
[email protected], [email protected], [email protected]
² Western Norway University of Applied Sciences, Bergen, Norway
[email protected]

Abstract. High-utility itemset mining (HUIM) is a concept extended from frequent itemset mining (FIM). It emphasizes factors that matter more in commercial applications, such as the profit or weight of an itemset. In this paper, we assume a dataset too big to be loaded into memory, propose a MapReduce framework to handle this situation, and try to reduce the number of dataset scans as far as possible while maximizing the parallelization of the process.

Keywords: HUIM · Data mining · Big data framework · MapReduce

1 Introduction

The high-utility itemset, a concept extended from the high-frequency itemset, was proposed to reveal more valuable information in transaction databases [2–4, 6, 7, 9, 11–14]. Fournier-Viger et al. proposed the state-of-the-art EFIM method with an effective pruning strategy for mining HUIs [15]; EFIM has an excellent ability to reveal all HUIs in a short time. Lin et al. [12] considered the average-utility measure and presented an algorithm for mining high average-utility patterns with multiple thresholds. Evolutionary algorithms [8, 10, 11] have also been applied to the "exponential problem" of HUIM. There is a principal problem with the methods above: EFIM is not specifically designed for big-data databases. For example, some previous algorithms modify the original dataset to reduce its size, and they generally must load the entire dataset into memory to perform the designed operators. In this paper, we introduce a novel MapReduce framework for HUIM, modified from the EFIM method and benefiting from its effective pruning strategy. The proposed framework tries to reduce the number of scans of the original dataset as far as possible and applies no modifying operator to the original dataset.

2 Related Work

High-utility itemset mining (HUIM) was proposed to reveal more important information in a dataset. It considers both the quantities and the unit profits of itemsets to find the set of high-utility itemsets (HUIs). The transaction-weighted utilization (TWU) model was designed to reduce the candidate itemsets for revealing HUIs, thanks to its downward closure (DC) property [4]. Under this property, an itemset whose transaction-weighted utility is at least the threshold is called a high transaction-weighted utilization itemset (HTWUI); if an itemset is not an HTWUI, all of its supersets are definitely neither HTWUIs nor HUIs. Thus, a method applying the TWU model can ignore every superset of an itemset known not to be an HTWUI. Li et al. then proposed the isolated items discarding strategy (IIDS) to further reduce the number of candidate patterns under the TWU model [5].

MapReduce, proposed by Google, is a programming framework for handling big datasets [1]. It is a parallel, distributed algorithm running on a cluster and contains two major procedures, Map and Reduce. The input data of a MapReduce job are formatted as a key-value list. Each node performing the Map procedure is assigned a part of the input list and applies a programmer-defined Map operator that transforms the input key-value list into another key-value list. A shuffle operator then distributes the records generated by the Map procedure to the nodes performing the Reduce procedure; the shuffle operator tries to balance the load across nodes, and records with the same key are delivered to the same node. Finally, the Reduce procedure performs a summary operation on the input records sharing the same key and outputs the results, generally again as a key-value list. Map and Reduce procedures are independent operators; there is no interaction between cluster nodes while Map or Reduce is running. MapReduce therefore achieves distribution and reliability by parceling the operators out to the nodes of a cluster: users can dynamically add or remove computation units as load changes, and the system can reassign a failed job from a shut-down node to another node. Overall, MapReduce provides a reliable, dynamic, and parallel programming framework for big-data environments.
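To make the Map/shuffle/Reduce flow concrete, here is a minimal single-process Python sketch that computes TWU values in this style; the three toy transactions are invented for illustration, and a real deployment would run the same logic on a cluster framework such as Hadoop.

```python
from collections import defaultdict

# Toy transactions: (items, transaction utility).
transactions = [({"a", "b"}, 10), ({"b", "c"}, 7), ({"a", "c"}, 5)]

def map_phase(tx):
    items, tu = tx
    # Emit (item, transaction utility) key-value pairs for TWU.
    return [(item, tu) for item in items]

def shuffle(pairs):
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)   # records with the same key meet here
    return grouped

def reduce_phase(key, values):
    return key, sum(values)          # TWU(item) = sum of its transactions' TU

pairs = [p for tx in transactions for p in map_phase(tx)]
twu = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(twu)   # {'a': 15, 'b': 17, 'c': 12}
```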

3 Proposed MapReduce Framework

The proposed MapReduce framework for EFIM is shown in Fig. 1; the definitions and the algorithm of EFIM were introduced in [15]. The details of the proposed framework are described in the following subsections.

3.1 Reveal the Set of 1-HTWUIs

The proposed algorithm follows the original EFIM algorithm and first reveals the 1-HTWUIs (high transaction-weighted utilization itemsets of size 1). After the 1-HTWUIs are revealed, both EFIM and the proposed algorithm sort them and build the search graph used to generate the candidate itemsets and reveal the HUIs.

[Figure: transaction database → MapReduce 1 (reveal 1-HTWUIs) → generate task → MapReduce 2 (reveal HUIs) → output HUIs.]

Fig. 1. The proposed MapReduce framework.

3.2 Generate the Task Files

In the proposed MapReduce framework, each pass over the dataset reveals all HUIs containing the same number of items, so a task file must be generated to list the candidate itemsets for the next MapReduce stage. In other words, in the k-th round of the MapReduce process, all k-itemset HUIs are found. The task file is a list of key-value records: the key is an itemset evaluated in the previous round, and the value is the list of items that can extend this itemset in the search graph; combining the key with each value item yields the new candidate itemsets for the current iteration. Accordingly, the first task file contains a single record whose key is NULL and whose value is the list of 1-HTWUIs.
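The following small sketch illustrates this key-value task-record structure and the candidate generation it implies; the record layout and item names are illustrative assumptions, not the framework's actual file format.

```python
# One illustrative task record: the key is an itemset already judged
# promising in the previous round, the value lists items that may
# extend it in the search graph.
task_record = {"key": ("a",), "extensions": ["b", "c"]}

def candidates(record):
    """Combine key and extensions into this round's candidate itemsets."""
    return [record["key"] + (item,) for item in record["extensions"]]

print(candidates(task_record))   # [('a', 'b'), ('a', 'c')]

# The very first task file holds a single record whose key is empty and
# whose value is the list of 1-HTWUIs, as described above.
first_round = {"key": (), "extensions": ["a", "b", "c"]}
print(candidates(first_round))   # [('a',), ('b',), ('c',)]
```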

Execute MapReduce Framework

In this section, the proposed framework loads the candidate HUIs from the task file and transaction information from the transaction database. Then, the proposed frameworks will be applied to calculate the utility, sub-tree utility and local utility for all candidate itemsets, and further reveal the HUIs in this dataset. The process will calculate all the above value by parallel computation in the MapReduce environment. If the utility of a candidate itemset is larger than the pre-defined threshold, the candidate itemset will be output to the set of HUIs. And the value of sub-tree utility and local utility will be used to generate the task file (the candidate itemsets) for the next round of the proposed MapReduce process. Finally, if the process cannot generate any new candidate itemset of HUIs, the proposed algorithm will quit the process and output the set of HUIs.

4

Conclusion

In this paper, we proposed a MapReduce framework for the previous EFIM algorithm. EFIM is a state-of-the-art technic to reveal HUIs in a transaction database. However, EFIM did not consider the situation with a big transaction

570

J. M.-T. Wu et al.

database. The proposed framework can resolve this issue and obtain all of HUIs effectively. We will discuss this proposed MapReduce framework and do the experiments to show the performance in our future works.

References 1. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008) 2. Gan, W., Lin, J.C.W., Fournier-Viger, P., Chao, H.C., Yu, P.S.: HUOPM: highutility occupancy pattern mining. IEEE Trans. Cybern. 50(3), 1195–1208 (2020) 3. Krishnamoorthy, S.: Pruning strategies for mining high utility itemsets. Expert Syst. Appl. 42(5), 2371–2381 (2015) 4. Lin, C.W., Hong, T.P., Lu, W.H.: An effective tree structure for mining high utility itemsets. Expert Syst. Appl. 38(6), 7419–7424 (2011) 5. Liu, M., Qu, J.: Mining high utility itemsets without candidate generation. In: International Conference on Information and Knowledge Management, pp. 55–64 (2012) 6. Lan, G.C., Hong, T.P., Tseng, V.S.: An efficient projection-based indexing approach for mining high utility itemsets. Knowl. Inf. Syst. 38(1), 85–107 (2014) 7. Lin, J.C.W., Gan, W., Hong, T.P., Zhang, B.: An incremental high-utility mining algorithm with transaction insertion. Sci. World J. 2015, Article ID 161564 (2015) 8. Lin, J.C.W., Yang, L., Fournier-Viger, P., Wu, M.T., Hong, T.P., Wang, L.S.L., Zhan, J.: Mining high-utility itemsets based on particle swarm optimization. Eng. Appl. Artif. Intell. 55, 320–330 (2016) 9. Lin, J.C.W., Gan, W., Fournier-Viger, P., Hong, T.P., Chao, H.C.: FDHUP: fast algorithm for mining discriminative high utility patterns. Knowl. Inf. Syst. 51(3), 873–909 (2017) 10. Lin, J.C.W., Yang, L., Fournier-Viger, P., Hong, T.P., Voznak, M.: A binary PSO approach to mine high-utility itemsets. Soft. Comput. 21(17), 5103–5121 (2017) 11. Lin, J.C.W., Yang, L., Fournier-Viger, P., Hong, Y.P.: Mining of skyline patterns by considering both frequent and utility constraints. Eng. Appl. Artif. Intell. 77, 229–238 (2019) 12. Lin, J.C.W., Li, T., Fournier-Viger, P., Zhang, J., Guo, X.: Mining of high averageutility patterns with item-level thresholds. J. Internet Technol. 20(1), 187–194 (2019) 13. Tseng, V.S., Wu, C.W., Fournier-Viger, P., Philip, S.Y.: Efficient algorithms for mining top-k high utility itemsets. IEEE Trans. Knowl. Data Eng. 28(1), 54–67 (2016) 14. Wu, J.M.T., Zhan, J., Lin, J.C.W.: An ACO-based approach to mine high-utility itemsets. Knowl.-Based Syst. 116, 102–113 (2017) 15. Zida, S., Fournier-Viger, P., Lin, J.C.W., Wu, C.W., Tseng, V.S.: EFIM: a highly efficient algorithm for high-utility itemset mining. In: Advances in Artificial Intelligence and Soft Computing, pp. 530–546. Springer (2015)

Study on Health Protection Behavior Based on the Big Data of High-Tech Factory Production Line Kuo-Chi Chang1, Kai-Chun Chu2(&), Yuh-Chung Lin1, Trong-The Nguyen1, Tien-Wen Sung1, Yu-Wen Zhou1, and Jeng-Shyang Pan1 1

2

Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou, China Department of Business Administration Group of Strategic Management, National Central University, Taoyuan, Taiwan [email protected]

Abstract. The concept of occupational health becomes more and more popular in 2019. This study choose workers on the PCB production lines of A Factory and B Factory in Taoyuan are taken as the target population in 2014 first generation of health promotion management database system, and a total of 340 active workers aged between 20 and 65, and with more than a year of work experience, are selected as the research subjects to facilitate the effectiveness of this study. However, in the electronic heath database test project of this study, when the various departments use the health promotion management database system as a management decision-making action, the time is greatly shortened, which is also because the employees of various departments change from the behaviors and cognition, attitude, and self-efficacy, which is also very important positive results. Keywords: Health protection behavior Attitude  Self-efficacy  Big data

 Occupational hazard  Cognition 

1 Introduction In recent years, the number of workers suffering from occupational injuries and diseases has been gradually rising. According to the statistics of the Ministry of Labor, among the 8.18 million workers in Taiwan, includes 346 died from occupational injuries and disease in 2014. The 150,000 cases of occupational injury insurance payouts related to working environments, which resulted in a total of NT$7 billion in insurance reimbursements. According to the statistics of the National Institute for Occupational Safety and Health (NIOSH), the rate of occupational injuries has seen a significant increase in the past decade, among which occupational injuries related to manufacturing industries in Taiwan constitute a non-negligible proportion [1–3]. As the concept of occupational health becomes more and more popular, people begin to pay greater attention to occupational environments. As well as on the current situations of the five major occupational hazards are chemical, physical, biological, © Springer Nature Singapore Pte Ltd. 2020 J.-S. Pan et al. (Eds.): ICGEC 2019, AISC 1107, pp. 571–578, 2020. https://doi.org/10.1007/978-981-15-3308-2_63

572

K.-C. Chang et al.

human factors, and socio-psychological hazards, which are gradually changing from hazards to possible occupational injuries and disease. Most relevant studies at home and abroad regarding workers’ exposure to workplaces with occupational hazards were conducted from two perspectives includes (1) the demographical factors, which are influential on workers’ acceptance of implemented health protective behaviors; (2) single health protective behavior factors, which focus on their acceptance, and are manifested as their cognition, attitude, and self-efficacy toward these behaviors [4, 5]. Among these two kinds of studies, literature on the latter, namely those concerning the single health protective behaviors for certain industries (engineering hazard control, administrative management of hazard exposure time, wearing protective equipment, safety and health education training, health examination), constitute the dominant part. This means studies on how different health protective behaviors will influence workers’ cognition, attitude, and self-efficacy toward occupational hazards, as well as occupational health management grading after they are included in the studies. However, manufacturing industries must overcome the urgent problem of a lack of effective health protective behaviors that are superior to those stipulated in current rules and regulations [6]. The above literature review statistically confirms that, the implementation of health protective behaviors is significantly correlated to workers’ cognition, attitude, and selfefficacy for occupational hazards. The more health protective behaviors a worker adopts, the better their cognition, attitude, and self-efficacy for occupational hazards, which could contribute to their health and reduce the rate of occupational injuries and diseases. However, as most previous studies were conducted to investigate only one or two health protective behaviors, this study conducts further investigation and analysis of the PCB manufacturing industry to include four additional occupational hazards, which aims to understand how these behaviors could influence workers in terms of several demographic variables, such as their cognition, attitude, and self-efficacy for occupational hazards and strategic performance grading. It is intended that the results of this study could serve as references for future employers to adjust their health protective behaviors to effectively meet the needs of workers when laborers, costs, and resources are limited, thus, preventing workers from experiencing occupational injuries and diseases, and working together to create a safe and healthy workplace for workers [7–9].

2 Methodology

This study takes the workers of PCB manufacturing factories as questionnaire respondents from 2014 to 2018. The workers on the PCB production lines of A Factory and B Factory in Taoyuan are taken as the target population of the 2014 first-generation health promotion management database system, and a total of 340 active workers aged between 20 and 65 (excluding temporary workers, dispatched agency workers, short-term working students, and those who have never engaged in the production process), each with more than a year of work experience, are selected as the research subjects. This study distributed 340 questionnaires and collected 340, for a collection rate of 100% in the health promotion management database system. After the database system automatically removed 4 questionnaires that were not completely filled in, 336 valid questionnaires remained, for a valid collection rate of 96.5%; all of these questionnaires are built into the health promotion management database system automatically.

The variables relevant to this study consist of three parts in the health promotion management database system, defined as follows: the target population is "workers of PCB manufacturing factories", with a total number of 336 (including workers engaged in dusty work (chemical hazards), noisy work (physical hazards), and storage and handling work areas (human-factor hazards), as well as those suffering from hypertension, high cholesterol, high blood sugar, and cardiovascular disease (psychosocial hazards)); the independent variable is "health protective behaviors"; the dependent variables are "cognition, attitude, and self-efficacy for occupational hazards"; and the control variables are the demographic variables, including gender, age, educational background, marital status, total seniority, seniority in the workplace, workplace, title, work shifts, hazard exposure, medical experience, examination time, examination results, safety support, and workplace environment. According to the aforementioned health management requirements and scope, the health promotion management database system of this study is built from data table construction, input form design, operation form design, display report design, query function establishment, module programming, database completion, and other components (as shown in Fig. 1).

Fig. 1. Research procedure and flow
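As a rough illustration of the data table construction step above, the following sketch shows how the core tables of such a health promotion management database could be defined with Python's built-in sqlite3 module. It is a minimal sketch under stated assumptions: the table names, column names, and validity filter are hypothetical, not the authors' actual schema.

```python
# Hypothetical schema sketch for a health promotion management database;
# not the authors' actual system.
import sqlite3

conn = sqlite3.connect("health_promotion.db")
cur = conn.cursor()

# Basic worker information form (demographic control variables).
cur.execute("""
CREATE TABLE IF NOT EXISTS worker (
    worker_id       INTEGER PRIMARY KEY,
    factory         TEXT CHECK (factory IN ('A', 'B')),
    gender          TEXT,
    age             INTEGER,
    education       TEXT,
    marital_status  TEXT,
    total_seniority REAL,
    work_shift      TEXT,
    hazard_exposure TEXT
)""")

# One questionnaire record per worker per year (2014-2018).
cur.execute("""
CREATE TABLE IF NOT EXISTS questionnaire (
    record_id     INTEGER PRIMARY KEY,
    worker_id     INTEGER REFERENCES worker(worker_id),
    year          INTEGER,
    complete      INTEGER,   -- incompletely filled forms are filtered out
    cognition     REAL,      -- dependent variables
    attitude      REAL,
    self_efficacy REAL
)""")
conn.commit()

# The automatic validity filter: keep only completely filled questionnaires.
valid = cur.execute(
    "SELECT COUNT(*) FROM questionnaire WHERE complete = 1"
).fetchone()[0]
print("valid questionnaires:", valid)
```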

This study uses the Waterfall Model to develop the health promotion management database, implemented following the Systems Development Life Cycle (SDLC) (as shown in Fig. 2). Its design procedure consists of requirements definition, system analysis, system design, system implementation, system testing, system construction, and finally system operation and maintenance. The output of each stage of this design pattern becomes the input of the next stage, so each stage must be completed before the next one begins, and each design stage must have clear goals [10–12].

Fig. 2. The health promotion management database system establishment and testing process

The input interface of the health promotion management database system established in this study provides workers with convenient and fast operation functions. It includes the system login screen, the basic worker information page, and the musculoskeletal symptom assessment scale (as shown in Fig. 3).

Fig. 3. The login screen of musculoskeletal symptom assessment


3 Statistical Analysis and Results of the First Generation

This section investigates the relationship between health protective behaviors and cognition, attitude, and self-efficacy for occupational hazards based on the first generation of the health promotion management database system. Two-tailed Pearson product-moment correlation analysis is applied to understand the distribution of and correlation between the various constructs of the independent and dependent variables; this correlation is represented by a correlation coefficient, and the closer this coefficient is to 1, the higher the correlation. Table 1 presents the analysis results: the various constructs of the independent variable (health protective behaviors) and the various constructs of the dependent variables (cognition, attitude, and self-efficacy for occupational hazards) are positively correlated.
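As an illustration of this analysis step, the following sketch computes a two-tailed Pearson product-moment correlation with SciPy. The variable names and the synthetic data are illustrative assumptions, not the study's actual measurements.

```python
# Illustrative Pearson correlation, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 336                                    # sample size of this study
protective_behavior = rng.normal(size=n)   # e.g. an engineering hazard control score
self_efficacy = 0.6 * protective_behavior + rng.normal(size=n)

# pearsonr performs a two-tailed test by default.
r, p = stats.pearsonr(protective_behavior, self_efficacy)
print(f"r = {r:.2f}, p = {p:.4f}")         # the closer |r| is to 1, the stronger
```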

Table 1. Correlation analysis of health protective behaviors and cognition, attitude, and self-efficacy for occupational hazards (first generation, N = 336)

A Factory
Variable name                            1      2      3      4      5      6        7      8
1. Engineering hazard control            1.00
2. Hazard exposure time control          .60**  1.00
3. Wearing protective equipment          .52**  .60**  1.00
4. Safety and health education training  .50**  .46**  .76**  1.00
5. Special health examination            .64**  .57**  .69**  .69**  1.00
6. Cognition                             .03    .08    .36**  .26**  .06    1.00
7. Attitude                              .59**  .61**  .69**  .68**  .69**  1.99***  1.00
8. Self-efficacy                         .67**  .55**  .49**  .59**  .73**  .03      .69**  1.00

B Factory
Variable name                            1      2      3      4      5      6        7      8
1. Engineering hazard control            1.00
2. Hazard exposure time control          .54**  1.00
3. Wearing protective equipment          .44**  .53**  1.00
4. Safety and health education training  .34**  .34**  .46**  1.00
5. Special health examination            .35**  .30**  .26**  .24**  1.00
6. Cognition                             .16*   .15*   .20**  .08    .27**  1.00
7. Attitude                              .18*   .32**  .17*   .06    .03    .02      1.00
8. Self-efficacy                         .39**  .43**  .35**  .22**  .27**  .18*     .23**  1.00

Note. Coefficients *p < 0.05; **p < 0.01, two-tailed test

In order to understand the demographic features of the respondents, as well as the factors important to the correlation analysis of health protective behaviors, cognition, attitude, and self-efficacy for occupational hazards, it is vital to know whether any collinearity exists between the independent variable and dependent variables before the actual regression analysis begins. Thus, this study applies collinearity testing between the variables, computing the variance inflation factor (VIF) and tolerance. Tolerance lies between 0 and 1, and a VIF greater than 10 indicates a collinear relationship between variables; as seen in Table 2, there is no collinearity between the variables in this case.

Table 2. Collinearity diagnostics table of the independent variable and dependent variables of regression analysis (first generation, N = 336)

                                       A Factory              B Factory
Independent variable                   Tolerance  VIF         Tolerance  VIF
Gender                                 0.89       1.13        0.81       1.23
Age                                    0.33       3.05        0.29       3.43
Educational background                 0.67       1.50        0.47       2.13
Marital status                         0.71       1.41        0.57       1.77
Seniority                              0.25       4.06        0.31       3.24
Total seniority                        0.26       3.81        0.24       4.16
Working unit                           0.49       2.04        0.44       2.30
Title                                  0.40       2.49        0.28       3.58
Work shifts                            0.81       1.23        0.78       1.29
Hazard exposure                        0.61       1.65        0.49       2.03
Medical experience                     0.44       2.26        0.38       2.61
Examination time                       0.48       2.09        0.47       2.12
Examination results                    0.53       1.88        0.60       1.68
Safety support                         0.68       1.48        0.70       1.44
Workplace environment                  0.74       1.35        0.57       1.75
Engineering hazard control             0.43       2.35        0.56       1.80
Hazard exposure time control           0.45       2.24        0.53       1.91
Wearing protective equipment           0.20       4.91        0.56       1.80
Safety and health education training   0.27       3.73        0.65       1.55
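For illustration, the sketch below computes the VIF and tolerance diagnostics of the kind reported in Table 2, using statsmodels. The DataFrame columns and the synthetic data are placeholders, not the study's variables.

```python
# Illustrative collinearity diagnostics (VIF and tolerance).
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "age": rng.normal(40, 10, 336),          # placeholder predictors
    "seniority": rng.normal(8, 4, 336),
    "hazard_exposure": rng.normal(0, 1, 336),
})
Xc = add_constant(X)                          # VIF needs an intercept column

for i, name in enumerate(Xc.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(Xc.values, i)
    # Rule of thumb used in the text: VIF > 10 (tolerance < 0.1) signals collinearity.
    print(f"{name}: VIF = {vif:.2f}, tolerance = {1 / vif:.2f}")
```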

In the electronic health database test project, the results (Table 3) show that when the various departments use the health promotion management database system to support management decision-making, the time required is greatly shortened. This is also because the employees of the various departments changed their behaviors as well as their cognition, attitude, and self-efficacy, which is a very important positive result.


Table 3. Time spent by each department on the electronic health database test project (unit: minutes). The tested items are chemical hazard health information, physical hazard health information, biological hazard health information, human risk health information, maternal workplace protection health information, and annual health checkup information; the tested staff come from the manufacturing, engineering, QC, business, and R&D departments. Each single item takes between 0.34 and 0.49 min, and the total time of the electronic health database test is 2.56 min.

4 Conclusions

The concept of occupational health has become increasingly popular as of 2019. According to statistics from the Ministry of Labor, of the 8.18 million workers in Taiwan, 346 died from occupational injuries and diseases in 2014, and some 150,000 occupational injury insurance payouts related to working environments resulted in a total of NT$7 billion in insurance reimbursements. This study took the workers of PCB manufacturing factories as questionnaire respondents from 2014 to 2018. The workers on the PCB production lines of A Factory and B Factory in Taoyuan were taken as the target population of the 2014 first-generation health promotion management database system, and a total of 340 active workers aged between 20 and 65 (excluding temporary workers, dispatched agency workers, short-term working students, and those who have never engaged in the production process), each with more than a year of work experience, were selected as the research subjects. In the electronic health database test project of this study, when the various departments used the health promotion management database system to support management decision-making, the time required was greatly shortened; this is also because the employees of the various departments changed their behaviors as well as their cognition, attitude, and self-efficacy, which is a very important positive result.

References
1. Andersen, L.L., Burdorf, A., Fallentin, N., Persson, R., Jakobsen, M.D., Mortensen, O.S., Holtermann, A.: Patient transfers and assistive devices: prospective cohort study on the risk for occupational back injury among healthcare workers. Scand. J. Work Environ. Health 40(1), 74–81 (2014)
2. Chen, C.Y., Chang, K.C., Wang, G.B.: Study of high-tech process furnace using inherently safer design strategies (I) temperature distribution model and process effect. J. Loss Prev. Process Ind. 26, 1198–1211 (2013)
3. Lu, C.C., Chang, K.C., Chen, C.Y.: Study of high-tech process furnace using inherently safer design strategies (IV) the advanced thin film manufacturing process design and adjustment. J. Loss Prev. Process Ind. 40, 378–395 (2016)
4. Chu, K.C.: Establishment of the health promotion management database and its effect evaluation - with PCB manufacturing factories as the example. Ind. Saf. Health 316(6), 68–80 (2015)
5. Lu, C.C., Chang, K.C., Chen, C.Y.: Study of high-tech process furnace using inherently safer design strategies (III) advanced thin film process and reduction of power consumption control. J. Loss Prev. Process Ind. 43, 280–291 (2016)
6. Fujishiro, K., Geer, G.C., de Castro, A.B.: Associations of workplace aggression with work-related well-being among nurses in the Philippines. Am. J. Public Health 101(5), 861–867 (2011)
7. Jay, O., Kenny, G.P.: Heat exposure in the Canadian workplace. Am. J. Ind. Med. 53(8), 842–853 (2010)
8. Chu, K.C., Horng, D.J., Chang, K.C.: Numerical optimization of the energy consumption for wireless sensor networks based on an improved ant colony algorithm. IEEE Access 7, 105562–105571 (2019)
9. Chen, C.Y., Chang, K.C., Lu, C.C., Wang, G.B.: Study of high-tech process furnace using inherently safer design strategies (II) deposited film thickness model. J. Loss Prev. Process Ind. 26, 225–235 (2013)
10. Labriola, M., Lund, T., Christensen, K.B., Albertsen, K., Bültmann, U., Jensen, J.N., Villadsen, E.: Does self-efficacy predict return-to-work after sickness absence? A prospective study among 930 employees with sickness absence for three weeks or more. Work 29(3), 233–238 (2007)
11. Njagi, A.N., Oloo, A.M., Kithinji, J., Kithinji, J.M.: Knowledge, attitude, and practices of health-care waste management and associated health risks in the two teaching and referral hospitals in Kenya. J. Community Health 37(6), 1172–1177 (2012)
12. Santos, M.B.G., Carvalho, L.S., Carvalho, R.S., Cunha, J.C.M., Araujo, I.F.: Risk of accidents at work in cachaca manufacturing process. In: SHO2015: International Symposium on Occupational Safety and Hygiene, pp. 335–337 (2015)

Social Network and Stock Analytics

A New Convolution Neural Network Model for Stock Price Prediction

Jimmy Ming-Tai Wu1(B), Zhongcui Li1, Jerry Chun-Wei Lin2, and Matin Pirouz3
1 Department of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
[email protected], [email protected]
2 Western Norway University of Applied Sciences, Bergen, Norway
[email protected]
3 California State University, Fresno, USA
[email protected]

Abstract. The stock market is a highly nonlinear dynamic system: stock prices not only show certain trends, but are also influenced by many factors, such as political, economic, and psychological ones. With the flourishing development of deep learning techniques, a well-designed neural network can accomplish feature learning tasks more effectively. For the task of feature extraction and price movement prediction in financial time series, a novel convolutional neural network framework is proposed to enhance prediction accuracy. The proposed method, named stock sequence array convolutional neural network (SSACNN), constructs a sequence array from the historical data and applies this array as an input image for the proposed CNN framework. Five Taiwanese stocks are used as the testing benchmark in the experiments. Compared with previous algorithms, SSACNN clearly improves the movement prediction performance.

Keywords: Stock history · Convolutional neural network

1 Introduction

There are many kinds of financial time series forecasting [8] in financial markets, and for stock price forecasting in particular, various methods and data sources are used. For example, genetic algorithms are tools and techniques for extracting features from original financial data in order to make predictions based on a set of variables [1,2].

In recent years, with the rapid development of the deep learning field, CNNs have become especially prominent in current deep learning research. A previous CNN work uses stock candlestick charts as the input images and feeds them into the input layer directly [5]. In this paper, a new CNN framework is proposed that simulates the principle of image input by integrating the existing data into the form of an image. The proposed method inputs the data piece by piece as images, then trains the network weights on these images in order to extract useful features and to identify extreme market values from the extracted features, generating appropriate signals. The history data of the Taiwanese stock market is used as input data and transferred by a pre-process into input vectors for the proposed network.

2 Related Work

Financial time series refers to the arrangement of the values of financial random variables in order of time within a certain period. Considering stock price sequences, an improved method was proposed by Chen et al. to provide more actionable stock portfolio decisions for investors; moreover, a sequence-based GSP (group stock portfolio) was derived to provide investment advice [2]. They also proposed a domain-driven stock portfolio optimization algorithm that meets the needs of investors in mining actionable stock portfolios by using an evolutionary algorithm [1].

Technical analysis is the most basic and direct method in stock forecasting. In technical analysis, some historical patterns are considered to be correlated with future behaviors [9]; therefore, many technical indicators are defined to describe these patterns so that they can be applied in investment expert systems. With the development of deep learning, it has gradually been applied to financial time series [6]. In deep learning, LSTM (Long Short-Term Memory) has a unique memory function and a concept of time sequence; therefore, financial time series are well suited to LSTM. Thus, Pang et al. proposed an LSTM method to predict the immature stock market and obtain useful information from stock time series [4,7].

A previous CNN work, based on the traditional CNN framework, used stock candlestick charts as the input images. However, those input images contain too much noise and useless information and are prone to overfitting [5]. Thus, the proposed method, named stock sequence array convolutional neural network (SSACNN), was designed to resolve these issues and improve performance. SSACNN focuses on the meaningful information and uses it to generate a data array as the input image for the CNN framework. The experimental results show that the proposed model clearly improves on the previous CNN algorithm. The detailed algorithm for the proposed SSACNN is given in the following section.

3 Stock Prediction Model Based on Convolutional Neural Network

The detailed definitions and process of the proposed SSACNN are presented in this section. First, the five stock indexes used in SSACNN to generate the input image are described. Second, a normalization function is introduced to modify the input data; it makes the proposed method focus on the trend of the prices rather than their actual values. Finally, a flow chart is used to explain the process of SSACNN.

3.1 Stock Indexes and the Input Image

In traditional technical analysis of stock prices, five indexes are usually the most important features for analyzing the price trend. They are, respectively, open (the opening price of a stock on a certain date), high (the highest price of a stock on a certain date), low (the lowest price of a stock on a certain date), close (the closing price of a stock on a certain date), and volume (the total trading volume of a stock on a certain date). In this work, an input image is generated by collecting the index information of a stock over 30 days. The x-axis of an input image represents the dates of a continuous period, and the y-axis represents the five indexes of the stock on these dates. An example is shown in Fig. 1.

Fig. 1. Example of the input image.

In the experiments, 30 days is the predefined width of a sliding window over the sequences of stock indexes. Each window generates one input image, and the pre-process obtains the next image by shifting the current window by one date. Finally, the proposed method obtains a sequence of input images, denoted as $y_1, y_2, \dots, y_m$; the sliding windows of two neighboring images differ by one day. Each image is labeled by the price change percentage of the next date. The labeling function is described in Eq. (1):

$$Z_t = \begin{cases} +1 & l_t > 0.01 \\ 0 & \text{others} \\ -1 & l_t < -0.01 \end{cases} \qquad (1)$$

where $Z_t$ denotes the label of the sample $y_t$ and $l_t$ is the change percentage of the price on the next date for the current stock. When $l_t$ is greater than 0.01, the sample is labeled as +1 (price increasing); if $l_t$ is less than $-0.01$, it is labeled as $-1$ (price decreasing); otherwise, it is labeled as 0. Note that each value is normalized; the normalization function is defined in the following section.
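A minimal sketch of this windowing and labeling step is shown below. It is an assumption of how the step could be implemented, not the authors' released code; the function name and array layout are illustrative.

```python
# Hypothetical sliding-window and labeling pre-process for SSACNN-style input.
import numpy as np

WINDOW = 30  # predefined window width in days

def make_samples(indexes: np.ndarray, close: np.ndarray):
    """indexes: shape (days, 5) with open/high/low/close/volume per day."""
    samples, labels = [], []
    for t in range(len(indexes) - WINDOW):
        window = indexes[t:t + WINDOW].T                       # (5, 30) "image"
        l_t = close[t + WINDOW] / close[t + WINDOW - 1] - 1.0  # next-day change
        if l_t > 0.01:
            z_t = 1       # price increasing
        elif l_t < -0.01:
            z_t = -1      # price decreasing
        else:
            z_t = 0       # others
        samples.append(window)
        labels.append(z_t)
    return np.stack(samples), np.array(labels)
```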

3.2 Normalization Function

In order to train a deep learning network for general situations, SSACNN normalizes all of the input values. The range of the prices in the training data might be very different from the range in the testing data; normalizing the input values captures the concept of the price trend and avoids a network that is only valid for a certain range of prices. The proposed normalization function is given in Eq. (2):

$$\dot{X}_t = \frac{X_t - mean}{max - min} \qquad (2)$$

where $X_t$ is the index vector at time $t$ (open, high, low, close) and $\dot{X}_t$ is the index vector after the normalization process. $mean$, $max$, and $min$ are the average, maximal, and minimal value vectors of the index vectors over a certain period. In the experiments, the length of this period is set to 120 days; the influence of the period length will be discussed in future work.
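The following sketch implements Eq. (2) over a 120-day period with NumPy; the function name and array layout are illustrative assumptions.

```python
# Illustrative implementation of the normalization in Eq. (2).
import numpy as np

def normalize(prices: np.ndarray) -> np.ndarray:
    """prices: shape (120, 4) with open/high/low/close over the period."""
    mean = prices.mean(axis=0)                        # per-index average
    span = prices.max(axis=0) - prices.min(axis=0)    # per-index max - min
    return (prices - mean) / span                     # centered, range-scaled trend
```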

3.3 The Flow Chart of SSACNN

The convolutional neural network (CNN) is a famous deep learning framework proposed by LeCun et al. [3]. It includes several convolution layers, pooling layers, and fully connected layers, and it has a proven ability to recognize images. In this paper, the proposed method transfers a period of stock index values into a sequence of images by the method above; these images become the input images for the CNN framework. Finally, the probability of each output is computed with the sigmoid function, and a label is set for the input image. Figure 2 shows the flow chart of the proposed algorithm.

Fig. 2. The proposed SSACNN framework: an input of size 30 x the number of variables passes through three blocks of convolution, pooling, dropout, and normalization, then three fully connected layers, to an output of three nodes.
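A minimal Keras sketch of the structure in Fig. 2 is given below. The filter counts, kernel sizes, dropout rate, and hidden-layer widths are assumptions, since the paper does not specify them; only the block structure (three conv-pool-dropout-norm blocks, three fully connected layers, three output nodes) follows the figure.

```python
# Hypothetical SSACNN-style network; hyperparameters are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

NUM_VARIABLES = 5  # open, high, low, close, volume

model = keras.Sequential([keras.Input(shape=(30, NUM_VARIABLES))])
for filters in (32, 64, 128):            # three conv-pool-dropout-norm blocks
    model.add(layers.Conv1D(filters, kernel_size=3, padding="same",
                            activation="relu"))
    model.add(layers.MaxPooling1D(pool_size=2))
    model.add(layers.Dropout(0.25))
    model.add(layers.BatchNormalization())
model.add(layers.Flatten())
for units in (128, 64):                  # first two fully connected layers
    model.add(layers.Dense(units, activation="relu"))
model.add(layers.Dense(3, activation="sigmoid"))   # three output nodes (-1/0/+1)

# Labels would be one-hot encoded for this loss.
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```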

4 Experimental Results

In this section, in order to evaluate the effectiveness and performance of the proposed SSACNN, the previous CNN model [5] and an SVM are compared with SSACNN on five stocks. Table 1 shows the accuracy of SSACNN, the previous CNN model, and the SVM; the proposed SSACNN clearly performs better than the previous two algorithms.

Table 1. The accuracy of the three different algorithms on the five different stocks.

si  SSACNN    CNN      SVM
s1  0.703364  0.5048   0.467882
s2  0.710791  0.61522  0.414352
s3  0.71625   0.54617  0.45978
s4  0.612931  0.53782  0.519087
s5  0.721689  0.5913   0.382812

5 Conclusion

In this paper, SSACNN, an algorithm based on the convolutional neural network, was proposed. The proposed method obviously performs better than the previous method. SSACNN does not turn the data into pictures; instead, the data is directly integrated into a matrix, which avoids excessive dispersion of the data and reduces useless information. Overall, the effectiveness of stock price prediction is improved within this framework.

References
1. Chen, C.H., Hsieh, C.Y.: Actionable stock portfolio mining by using genetic algorithms. J. Inf. Sci. Eng. 32(6), 1657–1678 (2016)
2. Chen, C.H., Yu, C.H.: A series-based group stock portfolio optimization approach using the grouping genetic algorithm with symbolic aggregate approximations. Knowl.-Based Syst. 125, 146–163 (2017)
3. Gardner, M.W., Dorling, S.: Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmos. Environ. 32(14–15), 2627–2636 (1998)
4. Graves, A., Mohamed, A.R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649. IEEE (2013)
5. Hoseinzade, E., Haratizadeh, S.: CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst. Appl. 129, 273–285 (2019)
6. Long, W., Lu, Z., Cui, L.: Deep learning-based feature engineering for stock price movement prediction. Knowl.-Based Syst. 164, 163–173 (2019)
7. Pang, X., Zhou, Y., Wang, P., Lin, W., Chang, V.: An innovative neural network approach for stock market prediction. J. Supercomput., 1–21 (2018)
8. Rani, S., Sikka, G.: Recent techniques of clustering of time series data: a survey. Int. J. Comput. Appl. 52(15), 1–9 (2012)
9. Taylor, M.P., Allen, H.: The use of technical analysis in the foreign exchange market. J. Int. Money Finance 11(3), 304–314 (1992)

Author Index

A An, Kai, 237 Aye, Nilar, 30 B Bu, Guannan, 69 C Cai, Longzheng, 228 Cai, Qiqin, 143, 181 Cang, Yan, 210 Cao, Pengcheng, 200 Chai, Qing-Wei, 61 Chang, Cheng-Kuo, 310, 429 Chang, Kuo-Chi, 191, 310, 333, 571 Chao, Han-Chieh, 11 Chen, Bijun, 134, 151 Chen, Chan, 210 Chen, Chien-Ming, 455, 462, 468, 474, 567 Chen, Chun-Hao, 87 Chen, Chunyu, 543 Chen, Hanlin, 3 Chen, Hao-Min, 429 Chen, Siyu, 78 Chen, Yeh-Cheng, 455, 468 Chen, Yuqing, 295 Chen, Zhengshan, 366 Chen, Zhihui, 3, 181, 399 Chiu, Ming-Sung, 245 Chiu, Yi-Jui, 103, 111, 117, 125 Chu, Kai-Chun, 191, 571 Chu, Shu-Chuan, 50, 61, 462, 468 Chung, Yi-Nung, 245

Cui, Binge, 281 Cui, Dongli, 69 D Dao, Thi-Kien, 50 Deng, Hui-Qiong, 171, 302 Deng, Huiqiong, 78 Deng, Liqiang, 343 Dong, Weisong, 558 F Fan, Ya, 343 Fang, Weidong, 558 G Gankhuyag, Munkhjargal, 87 Gu, Shimin, 20 Guo, Baolong, 253, 355, 380 Guo, Feng, 143 Guo, Wei-Wen, 437 H Hao, Jingyu, 500 He, Hengxiang, 543 He, Xingli, 221 He, Zhigang, 366 Hong, Shuo, 262 Hong, Tzung-pei, 87 Hsu, Chao-Hsing, 245 Hsu, Chih-Yu, 94, 409 Hu, Jie, 380 Hu, Renyuan, 69 Hu, Rong, 3, 181


Hu, Zhiyuan, 181 Hua, Jielin, 78 Huang, Hao-Da, 117 Huang, Li-Li, 437 Huang, Qingdan, 295 Huang, Zhe, 355, 380 J Jia, Kebin, 416 Jiang, Lei, 69 Joe-Yu, 94, 409 K Kao, Fan-Yi, 11 L Lee, Yu-Qi, 455 Lee, Zhiyuan, 462, 468 Leung, Ka-Cheong, 491 Li, Chao-Gang, 171, 302 Li, Cheng, 253, 355, 380 Li, Jianxing, 366 Li, Jing, 270 Li, Peiqiang, 200 Li, Peng, 321 Li, Qi-Chao, 103, 117 Li, Qi-chao, 125 Li, Qin-Bin, 171 Li, Shaoli, 262 Li, Xiao-Yun, 111 Li, Xinhui, 281 Li, Yifan, 558 Li, Yong, 143 Li, Zhongcui, 581 Liang, Bohan, 500 Liang, Tsung-Ta, 11 Liang, Xiao-Cong, 455 Liao, Lyuchao, 134, 151, 181, 399 Lim, Shuyun, 228 Lin, Chien-Chih, 245 Lin, Huiwei, 491 Lin, Jerry Chun-Wei, 567, 581 Lin, Jinyang, 221 Lin, Li, 321 Lin, Qin-Bin, 302 Lin, Xing-Ying, 171, 302 Lin, Yuh-Chung, 191, 333, 571 Liu, Jianhua, 69 Liu, Jierui, 181 Liu, Jiurui, 151 Liu, Lisang, 366 Liu, Weike, 534 Liu, Wen-jun, 125 Liu, Xiao, 321

Liu, Xiaomin, 270 Lu, Yan, 281 Lu, Zheming, 237, 447 Luo, Kan, 366 Luo, Sijie, 3, 143 Luo, Xuexue, 237, 447 Lv, Huiyuan, 295 M Ma, Huibin, 270 Ma, Ying, 366 Meng, Yao, 41 N Naing, Thinn Thu, 30 Ngo, Truong-Giang, 50 Nguyen, Trinh-Dong, 50 Nguyen, Trong-The, 50, 191, 571 Nwet, Khin Thandar, 551 P Pan, Jeng-Shyang, 50, 61, 94, 191, 310, 333, 409, 429, 437, 462, 468, 571 Pei, Liqiang, 295 Peng, Peng-Fei, 111 Peng, Yu, 262 Pirouz, Matin, 581 Platoš, Jan, 513 Q Qiao, Yulong, 210, 543 R Rao, Rui, 295 S Shan, Jie, 310, 429 Song, Jianqiao, 389 Su, Jingyong, 523 Sun, Peidong, 200 Sun, Yujia, 513 Sung, Tien-Wen, 191, 571 T Tang, Linlin, 41, 483, 523 Tang, Longmei, 228 Tie, Zhuoyu, 474 Tong, Xupeng, 523 Tran, Huu-Trung, 50 Tseng, Fan-Hsun, 11 W Wang, Eric Ke, 474 Wang, Feng, 389

Wang, Geng, 355, 380 Wang, Hancong, 321 Wang, Jhen-Yang, 245 Wang, King-Hang, 462, 468 Wang, Sheng, 558 Wang, Tao, 455 Wang, Xuan, 228 Wang, Zhiyong, 534 Wang, Zhuozheng, 416 Wei, Bizhong, 343 Wei, Min, 567 Wei, Zhefei, 355 Wen, Jinjuan, 3, 181 Weng, Guo-Wei, 117, 125 Wu, Jimmy Ming-Tai, 87, 567, 581 Wu, Jing, 281 Wu, Mu-En, 87, 462 Wu, Peng-Peng, 171, 302 Wu, Shangxi, 500 Wu, Tsu-Yang, 455, 462, 468 Wu, Xinke, 151 Wu, Yue, 447

X Xie, Bo-Lin, 333 Xu, Chaoda, 558 Xu, Chenrui, 416 Xu, GuanWei, 161 Xu, Kaiyuan, 500 Xu, Ming, 399 Xu, Weihui, 3, 143

Y Yang, Haiyan, 3 Yang, Lei, 462, 468 Yang, Yulin, 253 Ye, Miao, 343 Ye, Yunming, 491 Yeh, Jyh-Haw, 474 Yin, Tong, 262 Yu, Faxin, 237, 447 Yu, Jinxiang, 262 Yu, Thein, 551 Yuan, Ye, 416 Z Zhang, Bowen, 491 Zhang, Dongyang, 69 Zhang, Xiaoqing, 534 Zhang, Xingfu, 543 Zhang, Yong, 543 Zhang, Yonghui, 281 Zhao, Huaqi, 270 Zhao, Liang, 483 Zheng, Rongjin, 78, 161 Zheng, Shiguang, 94 Zheng, Wei-Min, 61 Zheng, Wenbin, 321 Zheng, Yongguo, 534 Zheng, Yuxin, 134, 151, 399 Zhi, Yunpeng, 253 Zhou, Yu-Wen, 191, 571 Zhuang, Weida, 221 Zou, Fumin, 134, 143, 151, 181, 399, 558