3D Imaging―Multidimensional Signal Processing and Deep Learning: Images, Augmented Reality and Information Technologies, Volume 1
ISBN 9819912296, 9789819912292

This book presents high-quality research in the field of 3D imaging technology. It collects papers from the fourth edition of the International Conference on 3D Imaging Technologies—Multidimensional Signal Processing and Deep Learning (3D IT-MSP&DL).


English Pages 296 [297] Year 2023


Table of contents :
Preface
Contents
About the Editors
1 A Review of Temporal Network Analysis and Applications
1.1 Introduction
1.2 Modeling and Representations of Temporal Networks
1.3 Structural Properties and Statistical Characteristics
1.3.1 Temporal Motifs and Community Structures
1.3.2 Paths and Distances
1.3.3 Centrality Metrics
1.4 Application Analysis
1.4.1 Important Node Analysis
1.4.2 Link Prediction
1.4.3 Community Segmentation
1.4.4 Influence Maximization
1.4.5 Analysis Tools
1.5 The Future Outlook
References
2 A Knowledge Representation Method for Hierarchical Diagnosis Decision-Making Under Uncertain Conditions
2.1 Introduction
2.2 Hierarchical Diagnosis Decision-Making Knowledge Classification
2.3 Hierarchical Diagnosis Decision-Making Knowledge Ontology Modeling
2.3.1 Hierarchical Diagnosis Decision-Making Knowledge Representation Model
2.3.2 Hierarchical Diagnosis Decision-Making Knowledge Representation Model
2.4 The Semantic Level of Hierarchical Diagnosis Decision-Making and Its Knowledge Relevance
2.4.1 Device Domain Ontology Semantics and Their Relationships
2.4.2 Process Domain Ontology Semantics and Their Relationships
2.4.3 Diagnostic Domain Ontology Semantics and Their Relationships
2.4.4 Maintenance Decision Domain Ontology Semantics and Their Relationships
2.5 Conclusions
References
3 Research and Implementation of Hybrid Storage Method for Multi-source Heterogeneous Data of Electric Distribution Network
3.1 Introduction
3.2 Related Work
3.3 Research on the Method of Hybrid Storage Method for Multi-source Heterogeneous Data of Electric Distribution Network
3.3.1 Characteristics of Electric Distribution Network Data
3.3.2 Hybrid Storage Method for Multi-source Heterogeneous Data of Electric Distribution Network
3.4 Implementation of the Method of Hybrid Storage for Multi-source Heterogeneous Data of Electric Distribution Network
3.4.1 Design of Hybrid Storage Framework for Power Distribution Data
3.4.2 Implementation of Hybrid Storage Management System for Power Distribution Data
3.5 Application Results
References
4 Research on Measuring Method of Ball Indentation Based on MATLAB Image Edge Detection
4.1 Introduction
4.2 Indentation Measurement
4.3 Edge Detection
4.4 Technical Scheme
4.4.1 Measurement Steps and Methods
4.4.2 Simulation Result
4.5 Concluding Remarks
References
5 Multi-feature Extraction of Mineral Zone of Tabling Through Deep Semantic Segmentation
5.1 Introduction
5.2 Process and Method
5.2.1 Acquisition and Production of Datasets
5.2.2 Construction of Deep Segmentation Model
5.2.3 Training Results of Deep Segmentation Model
5.2.4 Evaluation of Prediction Results of Deep Semantic Segmentation Model
5.3 Multi-scale Geometric Feature Characterization and Extraction of Mineral Zone
5.3.1 Multi-scale Feature Representation of Mineral Zone
5.3.2 Multi-scale Feature Extraction of Mineral Zone
5.3.3 Result of the Extraction of Mineral Zone
5.4 Conclusion
References
6 Research on Mass Image Data Storage Method for Data Center
6.1 Introduction
6.2 Key Technology
6.3 System Design
6.4 Function Realization
6.5 Conclusion
References
7 Research on Virtual and Real Spatial Data Interconnection Mapping Technology for Digital Twin
7.1 Introduction
7.1.1 Research Background
7.1.2 Purpose and Significance
7.1.3 Research Level at Home and Abroad
7.2 Key Technology Research
7.2.1 Data Interconnection, Fusion Information Model Modeling, and Driving Technology Based on Power Grid Reality Twins
7.2.2 Multi-service Continuous Mapping Mechanism and Real-Time Data Panorama Mapping Method in Power Grid “Virtual-Real” Space
7.2.3 Efficient Transmission and Update Integration Technology of Twin Body Data in the Power Grid Environment
7.3 Conclusion
References
8 Prediction of Breast Cancer Via Deep Learning
8.1 Introduction
8.1.1 Introduction to Deep Learning
8.2 Approach
8.3 Method
8.3.1 Experimental Tool
8.3.2 Data Set
8.3.3 Experimental Indicators
8.3.4 Structural and Parametric Design
8.3.5 Experimental Results and Analysis
8.4 Conclusion
References
9 Scheme Design of Network Neighbor Discovery Algorithm Based on the Combination of Directional Antenna and Multi-channel Parallelism
9.1 Introduction
9.2 Neighbor Discovery Protocol Based on Directional Antenna and Multi-channel
9.2.1 Efficient Unidirectional Neighbor Discovery Strategy
9.2.2 Multi-channel Parallel Neighbor Discovery Strategy
9.2.3 A Fast Neighbor Discovery Strategy Combining Directional Antennas and Multi-channel Parallelism
9.3 Design of Combining Directional Antenna and Multi-channel Parallelism
9.3.1 Design of Simulation and Experimental Verification System
9.3.2 Design of a Fast Neighbor Discovery Strategy Combining Two Mechanisms
9.4 Summary and Prospect of the Scheme Design
9.4.1 Feasibility Analysis of Current Scheme
9.4.2 Summary of Current Scheme
9.4.3 Prospect of the Future Works
References
10 Ship Target Detection in Remote Sensing Image Based on Improved RetinaNet
10.1 Introduction
10.2 RetinaNet
10.2.1 Feature Extraction Network
10.2.2 Feature Fusion Structure
10.2.3 Loss Function and Classification Regression Sub-network
10.3 Network Model
10.3.1 Improved Loss Function
10.4 Experimental Dataset
10.4.1 Evaluation Parameters
10.4.2 Comparison of Result
10.5 Conclusion
References
11 Research on the Application of 3D Modeling Technology in Mechanical Structure Teaching
11.1 Introduction
11.2 The Present Situation and Characteristics of 3D Modeling Technology
11.2.1 The Situation
11.2.2 Characteristic
11.3 Current Situation of Mechanical Structure Teaching
11.3.1 The Internal Structure of Equipment is Difficult to Understand
11.3.2 There Are Hidden Dangers in the Actual Installation Operation
11.3.3 The Operation Demonstration Takes a Long Time
11.3.4 Action Demonstration of Equipment is Difficult
11.3.5 High Cost of Teaching
11.4 Using “3D Modeling Technology” to Improve the Quality of Mechanical Structure Teaching
11.4.1 Establishing the Basic Unit of the Model is the Basic Work to Improve the Teaching Quality
11.4.2 Using 3D Model of Equipment to Solve Difficult Problems in Real-Life Teaching
11.4.3 The 3D Model is Used for Simulation Analysis to Achieve Scientific and Accurate Inspection and Maintenance of Equipment
11.5 Conclusion
References
12 2D Numerical Model Used to Investigate the Influence of Vegetation on Geomorphological Evolution of Mudflats
12.1 Introduction
12.2 Research Methods
12.2.1 Model Setup
12.3 Results and Discussion
12.3.1 Evolution of Bare Mudflat
12.3.2 Evolution of Vegetated Mudflat
12.3.3 Influence of Vegetation on the Evolution of Muddy Tidal Flat Landform
12.4 Conclusions
References
13 Design and Experimental Verification of High Functional Density Cubesat System
13.1 Introduction
13.2 Design Requirements for High Functional Density Cubesat
13.2.1 Standardization
13.2.2 Integration
13.3 The System Design of Cubesat with High Functional Density
13.3.1 Standardized Assembly Stack
13.3.2 Energy Integration Design
13.3.3 Autonomous Operation Management
13.3.4 Diversified Task Mode
13.4 On-Orbit Verification
13.5 Expectation
References
14 Space Design of Exhibition Hall Based on Virtual Reality
14.1 Introduction
14.2 Research Method
14.2.1 Design Principles of VR Display
14.2.2 Generation of Panorama of Space Scene in Exhibition Hall
14.3 Result Analysis
14.4 Conclusion
References
15 Identification of Expressway Traffic States Based on the Enhanced FCM Algorithm
15.1 Description of Expressway Traffic States
15.2 State Identification Based on Enhanced FCM Algorithm
15.2.1 FCM Clustering Analysis and Implementation
15.2.2 SAGA-FCM Algorithm
15.3 Case Analysis
15.4 Summary
References
16 Algorithms Applied in Soft Tissue Deformation Simulation: A Survey
16.1 Introduction
16.2 Algorithms for Deformation Simulation of Soft Tissue
16.2.1 Algorithms of Graphics Based Deformation
16.2.2 Algorithms Based on Physical Characteristics
16.3 Results
16.4 Conclusion
References
17 Soft Tissue Cutting Based on Position Dynamics
17.1 Introduction
17.2 Based on Position Dynamics Algorithm and Tetrahedral Mesh Generation
17.2.1 Tetrahedral Mesh Generation
17.2.2 Based on the Principle of Position Dynamics Algorithm
17.3 Interactive Cutting Algorithm
17.3.1 Tetrahedral Cutting
17.3.2 Particle Constraint
17.3.3 Section Treatment
17.4 Simulation Experiment
17.5 Conclusion
References
18 Research on Application Technology of Multi-source Fusion of 3D Digital Resources in Power Grid
18.1 Introduction
18.2 Multimedia Heterogeneous Model Spatial Transformation, Pose Registration, and Comprehensive Reconstruction Technology
18.2.1 Analysis of Demand for 3D Digital Model Media at Different Stages of Infrastructure, Operation, and Maintenance
18.2.2 Discretized Sampling of Research Equipment GIM Model and Attitude Registration Method with Scanned Point Cloud
18.2.3 Research on Comprehensive Reconstruction of Digital Model Through Multi-source Medium Model, Appearance, and Internal Combination
18.3 Multi-source 3D Digital Resource Fusion Analysis Technology with Variable Space and Time
18.4 Conclusions
References
19 Rapid Identification of Herbaceous Biomass Based on Raman Spectrum Analysis
19.1 Introduction
19.2 Materials and Methods
19.2.1 Materials
19.2.2 Experimental Methods
19.2.3 Raman Spectrum Data Processing
19.2.4 Extraction of Key Features Raman Spectrum Region
19.2.5 Machine Learning Model Establishment
19.3 Results and Discussion
19.3.1 Identification of Tobacco Origin
19.3.2 Identification of Tobacco Grade
19.3.3 Identification Model Optimization
19.4 Conclusion
References
20 Research on Spam Detection with a Hybrid Machine Learning Model
20.1 Introduction
20.2 Literature Review
20.3 Proposed Model
20.3.1 Framework of Proposed Model
20.3.2 Stage One Process
20.3.3 Stage Two Process
20.4 Experiment Result
20.4.1 Dataset
20.4.2 Experiment Result
20.5 Conclusion
References
21 Research on Weld Defect Object Detection Based on Multi-channel Fusion Convolutional Neural Network
21.1 Introduction
21.2 Multi-channel Fusion Convolutional Neural Network
21.2.1 ResNet Network Structure Analysis
21.2.2 Multi-channel Fusion ResNet
21.2.3 Faster-RCNN Object Detection Model
21.3 Experiment and Analysis
21.3.1 Weld Defect Dataset
21.3.2 Evaluation Index
21.3.3 Experimental Running Environment
21.3.4 Results
21.4 Conclusion
References
22 Design and Research of Highway Tunnel Construction Positioning System Based on UWB
22.1 Introduction
22.2 Brief Introduction of UWB Positioning Technology
22.3 Highway Tunnel Construction Positioning System Based on UWB
22.3.1 Perception Layer
22.3.2 The Transport Layer
22.3.3 Calculating Layer
22.3.4 The Application Layer
22.4 Positioning System Test
22.4.1 Test Environment Construction
22.4.2 Performance Test
22.4.3 Test Results
22.5 Conclusion
References
23 Research on Cross Domain Data Analysis and Data Mining Technology of Power Grid Digital Benefits
23.1 Introduction
23.2 Power Grid Cross Domain System Requirements
23.3 Research on the Relationship Between Cross Domain Data and Digital System
23.4 Cross Domain Data Mining Process
23.4.1 Clustering Method
23.4.2 Cluster Similarity Measurement
23.4.3 Feature Mining of Historical Data
23.5 Fitting Analysis Technology
23.6 Conclusion
References
24 Graph Convolutional Neural Networks for Drug Target Affinity Prediction in U-Shaped and Skip-Connection Architectures
24.1 Introduction
24.2 Datasets and Methods
24.2.1 Datasets
24.2.2 Drug Molecule Representation
24.2.3 Protein Representation
24.2.4 Model Architecture
24.3 Experiment and Result
24.3.1 Evaluation
24.3.2 Baseline Models
24.3.3 Setting of the Hyperparameters
24.3.4 Variants
24.3.5 Results and Analysis
24.4 Conclusion
References
25 SOC Estimation of Lithium Titanate Battery Based on Variable Temperature Equivalent Model
25.1 Introduction
25.2 Equivalent Model of Lithium Battery
25.2.1 Equivalent Model
25.2.2 OCV-SOC Nonlinear Relationship
25.2.3 Parameter Identification
25.3 Cubature Kalman Filter Algorithm
25.4 Testing and Discussion
25.4.1 Test Under Different Current Conditions
25.4.2 Test Under Different Temperature Conditions
25.4.3 Algorithm Comparison Test
25.4.4 Initial Value Robustness Test
25.5 Conclusion
References
Appendix
Author Index


Smart Innovation, Systems and Technologies 349

Srikanta Patnaik · Roumen Kountchev · Yonghang Tai · Roumiana Kountcheva, Editors

3D Imaging—Multidimensional Signal Processing and Deep Learning: Images, Augmented Reality and Information Technologies, Volume 1

Smart Innovation, Systems and Technologies Volume 349

Series Editors Robert J. Howlett, KES International Research, Shoreham-by-Sea, UK Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK

The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas is particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago, DBLP. All books published in the series are submitted for consideration in Web of Science.


Editors

Srikanta Patnaik, Interscience Institute of Management and Technology, Bhubaneswar, Odisha, India
Roumen Kountchev, Technical University of Sofia, Sofia, Bulgaria
Yonghang Tai, Department of Physics, Yunnan Normal University, Kunming, China
Roumiana Kountcheva, TK Engineering, Sofia, Bulgaria

ISSN 2190-3018 ISSN 2190-3026 (electronic) Smart Innovation, Systems and Technologies ISBN 978-981-99-1229-2 ISBN 978-981-99-1230-8 (eBook) https://doi.org/10.1007/978-981-99-1230-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

This is Volume 1 of the Proceedings of the Fourth Conference on 3D Imaging Technologies—Multidimensional Signal Processing and Deep Learning (3D IT-MSP&DL). 3D imaging technologies have recently attracted significant attention in both research and industry, and the topics cover many related aspects of multidimensional signal processing, deep learning, and big data. 3D IT-MSP&DL'22 provided a wide forum for researchers and academics, as well as practitioners from industry, to meet and exchange ideas and recent research on all aspects of multidimensional signal analysis and processing, their applications, and other related areas. The large number of conference topics attracted researchers working in various scientific and application areas. The papers accepted for publication in the proceedings are arranged in two volumes.

The selection of papers in Volume 1 covers research presenting new methods and related achievements in such areas as: temporal network analysis; hierarchical diagnosis decision-making under uncertain conditions; hybrid storage methods for multi-source heterogeneous data; ball indentation measuring methods; multi-feature extraction of mineral zoning of tabling; mass image data storage; virtual and real spatial data interconnection mapping technology; breast cancer prediction; network neighbor discovery; ship target detection in remote sensing; application of 3D modeling in mechanical structure teaching; influence of vegetation on geomorphological evolution; verification of a high-functional-density cubesat system; exhibition hall space design; identification of expressway traffic; soft tissue deformation and cutting simulation; multi-source fusion of 3D digital resources; identification of herbaceous biomass; intelligent spam detection; weld defect detection; highway tunnel construction positioning; cross-domain data analysis and mining; drug target affinity prediction; state-of-charge estimation of lithium-titanate batteries; and more. In their investigations, the authors used approaches based on multi-channel fusion CNNs, hybrid machine learning models, virtual reality, improved RetinaNet, deep learning, and many other up-to-date theoretical and scientific tools.


The aim of the book is to present the latest achievements of the authors to a wide range of readers: IT specialists, researchers, physicians, Ph.D. students, and other specialists in the area. The book editors express their special thanks to Prof. Lakhmi Jain (Honorary chair); Prof. Dr. Srikanta Patnaik, Prof. Dr. Junsheng Shi, and Prof. Dr. D.Sc. Roumen Kountchev (General chairs); Prof. Yingkai Liu (Organizing chair); Prof. Dr. Yonghang Tai, Dr. Shoulin Yin, and Prof. Dr. Hang Li (Program chairs); and Dr. S. R. Roumiana Kountcheva (International advisory chair). The editors also express their warmest thanks to the excellent Springer team that made this book possible.

Srikanta Patnaik, Bhubaneswar, India
Roumen Kountchev, Sofia, Bulgaria
Yonghang Tai, Kunming, China
Roumiana Kountcheva, Sofia, Bulgaria

January 2023

Contents

1 A Review of Temporal Network Analysis and Applications
Jintao Yu, Bing Xiao, and Yuzhu Cui

2 A Knowledge Representation Method for Hierarchical Diagnosis Decision-Making Under Uncertain Conditions
Yawei Ge, Xiqian Hou, Zhuxuan Meng, and Yue Lu

3 Research and Implementation of Hybrid Storage Method for Multi-source Heterogeneous Data of Electric Distribution Network
Junfeng Qiao, Aihua Zhou, Lin Peng, Lipeng Zhu, Sen Pan, and Pei Yang

4 Research on Measuring Method of Ball Indentation Based on MATLAB Image Edge Detection
Yuan Zhou, Xin Wang, Ya Liu, Xiaoyuan Wang, Zheng Li, and Yanying Wang

5 Multi-feature Extraction of Mineral Zone of Tabling Through Deep Semantic Segmentation
Huizhong Liu and Keshun You

6 Research on Mass Image Data Storage Method for Data Center
Sen Pan, Jing Jiang, Hongbin Qiu, Junfeng Qiao, and Menghan Xu

7 Research on Virtual and Real Spatial Data Interconnection Mapping Technology for Digital Twin
Zhimin He, Lin Peng, Hai Yu, and He Wang

8 Prediction of Breast Cancer Via Deep Learning
Yihe Huang

9 Scheme Design of Network Neighbor Discovery Algorithm Based on the Combination of Directional Antenna and Multi-channel Parallelism
Na Zhao, Fei Gao, and Kaijie Pu

10 Ship Target Detection in Remote Sensing Image Based on Improved RetinaNet
Yandong Sun and Tongliang Fan

11 Research on the Application of 3D Modeling Technology in Mechanical Structure Teaching
Lixiang Qin, Jing Yang, Liang Tang, Quan Gan, and Hongkai Wang

12 2D Numerical Model Used to Investigate the Influence of Vegetation on Geomorphological Evolution of Mudflats
Jiaojiao Ji

13 Design and Experimental Verification of High Functional Density Cubesat System
Yuying Yao, Weida Fu, Xin Guo, Sihan Shi, and Jing Yan

14 Space Design of Exhibition Hall Based on Virtual Reality
Yixuan Wang

15 Identification of Expressway Traffic States Based on the Enhanced FCM Algorithm
Zhuocheng Yang, Liang Hao, Yuchen Liu, and Lei Cai

16 Algorithms Applied in Soft Tissue Deformation Simulation: A Survey
Xiaoyu Cai and Hongfei Yu

17 Soft Tissue Cutting Based on Position Dynamics
Zijun Wang and Hongfei Yu

18 Research on Application Technology of Multi-source Fusion of 3D Digital Resources in Power Grid
Hai Yu, Zhimin He, Lin Peng, Jian Shen, and Kun Qian

19 Rapid Identification of Herbaceous Biomass Based on Raman Spectrum Analysis
Qiaoling Li, Zhongli Ye, Hui Liang, Zhiqiang Yu, Zhou Fang, Guohua Cai, Quanxing Zheng, Li Yan, Hongxiang Zhong, Zhe Xiong, Jun Xu, and Zechun Liu

20 Research on Spam Detection with a Hybrid Machine Learning Model
Yifu Gao, Jiuguang Song, Jia Gao, Na Suo, An Ren, Juan Wang, and Kun Zhang

21 Research on Weld Defect Object Detection Based on Multi-channel Fusion Convolutional Neural Network
Hanlin Geng, Zhaohui Li, and Yuanyuan Zhou

22 Design and Research of Highway Tunnel Construction Positioning System Based on UWB
Hengbo Zhang, Xinke Wang, Yuchen Liu, and Lei Cai

23 Research on Cross Domain Data Analysis and Data Mining Technology of Power Grid Digital Benefits
Gang Wang, Aidi Dong, Changhui Lv, Bo Zhao, and Jianhong Pan

24 Graph Convolutional Neural Networks for Drug Target Affinity Prediction in U-Shaped and Skip-Connection Architectures
Jiale Chen, Xuelian Dong, and Zhongyuan Yang

25 SOC Estimation of Lithium Titanate Battery Based on Variable Temperature Equivalent Model
Chao Song, Jianhua Luo, Xi Chen, and Zhizhao Peng

Appendix
Author Index

About the Editors

Srikanta Patnaik has supervised more than 30 Ph.D. theses and 100 Master's theses in the areas of Computational Intelligence, Machine Learning, Soft Computing Applications, and Re-Engineering. Dr. Patnaik has published more than 100 research papers in international journals and conference proceedings. He is the author of 3 textbooks and has edited more than 100 books and a number of invited book chapters, published by leading international publishers such as IEEE, Elsevier, Springer-Verlag, Kluwer Academic, IOS Press, and SPIE. Dr. Patnaik is the Editor-in-Chief of the International Journal of Information and Communication Technology and the International Journal of Computational Vision and Robotics, published by Inderscience Publishing House, England; Editor of the Journal of Information and Communication Convergence Engineering; and Associate Editor of the Journal of Intelligent and Fuzzy Systems (JIFS). He is also Editor-in-Chief of the book series "Modeling and Optimization in Science and Technology" published by Springer, Germany. Prof. Patnaik was awarded the MHRD Fellowship by the Government of India for the year 1996. He was nominated for MARQUIS Who's Who in 2004 and as International Educator of the Year 2005 by the International Biographical Centre, Great Britain. He was awarded a certificate of merit by The Institute of Engineers (India) for the year 2004-05. He is also a Fellow of IETE and a Life Member of ISTE and CSI. Dr. Patnaik has visited countries such as Japan, China, Hong Kong, Singapore, Indonesia, Iran, Malaysia, the Philippines, South Korea, the United Arab Emirates, Morocco, Algeria, Thailand, and Vietnam to deliver keynote addresses at various conferences and symposiums.

Roumen Kountchev, Ph.D., D.Sc., is a professor at the Faculty of Telecommunications, Department of Radio Communications and Video Technologies, Technical University of Sofia, Bulgaria. His scientific areas of interest are digital signal and image processing, image compression, multimedia watermarking, video communications, pattern recognition, and neural networks. Professor Kountchev has more than 450 papers published in journals and conference proceedings, 20 books, 48 book chapters, and 20 patents. At present, he is a member of the Euro Mediterranean Academy of Arts and Sciences and the president of the Bulgarian Association for Pattern Recognition (member of IAPR). He is Editor-in-Chief of the Intern. Journal of Image Processing and Vision Science and an editorial board member of: the Intern. Journal of Reasoning-based Intelligent Systems; the Intern. Journal Broad Research in Artificial Intelligence and Neuroscience; the KES Focus Group on Intelligent Decision Technologies; the Egyptian Computer Science Journal; the Intern. Journal of Bio-Medical Informatics and e-Health; and the Intern. Journal Intelligent Decision Technologies. He is a member of the Institute of Data Science and Artificial Intelligence and of the Intern. Engineering and Technology Institute. He has been a plenary speaker at more than 30 international scientific conferences and symposia and has edited several books published in the Springer SIST series.

Yonghang Tai studied at Yunnan Normal University, where he received his bachelor's degree. He received his Ph.D. in Computer Science from Deakin University, Melbourne, Australia. He has hosted four funded projects, including a Deakin University Postgraduate Research Full Scholarship and projects of the Yunnan Education Commission and the Yunnan Natural Science Foundation. He has published more than 30 papers, 5 of which have been indexed by SCI. He is a co-editor of the International Journal of Telemedicine and Clinical Practices and of Machine Learning and Data Analytics. His research interests include VR/AR/MR in surgical simulation, physics-based rendering, and medical image processing.

Roumiana Kountcheva received her M.Sc. and Ph.D. at the Technical University of Sofia, Bulgaria, and in 1992 received the title of Senior Researcher. At present, she is the vice-president of TK Engineering, Sofia. She had postgraduate training at Fujitsu and Fanuc, Japan. Her main scientific interests are in image processing, image compression, digital watermarking, pattern recognition, image tensor representation, neural networks, CNC, and programmable controllers. She has more than 200 publications and 5 patents. R. Kountcheva has been a plenary speaker at 23 international scientific conferences and scientific events. She has edited several books published in the Springer SIST series and is a member of international organizations: the Bulgarian Association for Pattern Recognition, the International Research Institute for Economics and Management (IRIEM), and the Institute of Data Science and Artificial Intelligence (IDSAI), and she is an honorary member of the Honorable Editorial Board of the nonprofit peer-reviewed open-access IJBST Journal Group.

Chapter 1

A Review of Temporal Network Analysis and Applications

Jintao Yu, Bing Xiao, and Yuzhu Cui

Abstract With the continuous development of network science, a single, static network structure has become increasingly inadequate for portraying complex systems, while the temporal network is becoming an effective tool for solving this problem. At present, research on temporal networks is still at a preliminary stage, and many areas are worth exploring further. Motivated by this, in this paper we review the modeling and representation of temporal networks, their structural properties and statistical characteristics, and their application analysis; we then analyze the weaknesses of current research and look ahead to future directions of development.

Keywords Temporal network modeling · Application analysis · Structural feature

J. Yu (B) · B. Xiao
Department of Intelligence, Air Force Early Warning Academy, Wuhan 430019, Hubei, China
e-mail: [email protected]

Y. Cui
Zhejiang Lab, Hangzhou 311122, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_1

1.1 Introduction

With the fast advancement of complex systems and network science research, current research hotspots have shifted from single-layer networks to multilayer coupled networks, and from static networks to dynamic networks. Dynamic networks add a temporal dimension compared to static networks, and edges carrying this temporal dimension appear and disappear intermittently with the elapse of time. Networks with these features are denoted temporal networks [1–3]. Many network events are characterized by discontinuity and multiple occurrences, and static networks cannot portray these characteristics well. The temporal network, however, adds new features such as the sequential ordering of edges, the survival duration of edges, and the contact frequency of individuals, which reflect properties that static networks do not have [4].


In terms of data sources, human interactions such as emails, call records, RFID, and WiFi provide ample data for studying temporal networks [5, 6]. The availability of the data needed for temporal network analysis has improved markedly over the last decade. As a result, it has become practical to study the dynamic properties of complex systems through high-resolution time-series sampling. However, it is difficult to represent a temporal network as a simple graph without losing characteristics or changing node information, and consequently the analysis and application of temporal networks differ substantially from those of static networks. To gain a better understanding of temporal networks, it is necessary to sort out their theoretical and application foundations in depth. In this paper, we review research results on networks with time information in terms of modeling and representation methods, structural properties and statistical characteristics, and application analysis. On this basis, we look into future directions of development.

1.2 Modeling and Representations of Temporal Networks

To date, there have been various approaches to modeling networks with temporal properties. In early work on network science, the temporal properties of networks were ignored when abstracting them into graphs; that is, the network was abstracted into a static graph. Although this simplifies the study of networks, ignoring temporal properties leads to overestimating the effective connections among nodes while underestimating the shortest paths among them. In subsequent studies, weights were used to express the total number of connections (or contacts) occurring on each edge. This approach reflects the importance of the number of connections (or contacts) among nodes, but still does not take the temporal properties of the connections into account.

To solve this problem, the time-varying graphs framework was applied to the study of temporal networks [7]. Rosvall and Bergstrom et al. divided the observation window of a temporal network into time segments and abstracted the network into a static graph within each segment [8]. This approach is rather crude because multiple events may occur on certain edges within a single time segment. Kim and Anderson et al. therefore proposed an improved approach in which, when slicing the network, at most one event occurs on each edge within each time slice [9]; each time slice is still abstracted into a static graph. Some scholars cut temporal networks into small networks and abstract each into a sequential graph [10–12]. In fact, how to slice the network is itself a challenge, as is the question of whether each small segment should be studied with static or temporal sequence diagrams. Moreover, not all networks are suitable for slicing.

In [1], it is mentioned that some researchers have used line graphs to model temporal networks: the temporal network is divided into static networks of time slices, which are then combined with line graphs for modeling and analysis. Temporal networks can also be studied by representing the time between events in a linear model and then converting the model into the topology of the network. Others have studied temporal networks through reachability graphs [1], while Pan et al. modeled the whole temporal network directly as a whole, which better maintains its temporal characteristics [13]. The literature [14] used a multilayer network model to describe the temporal network, assuming that an adjacency relation exists only at a certain moment and passes with time, so the intra-layer adjacency relation is ignored and the inter-layer adjacency relation is used to reflect the interaction among individuals. In [15], a model of human spatial preference movement and bursty stochastic interaction reflecting interaction waiting times is proposed, which integrates the movement of individuals over time into the study of time-dependent human interaction networks.

From the perspective of mathematical calculation, the adjacency structure of a temporal network, like that of a static network, can be expressed as a binary tensor [16], which yields neat and compact formulas but requires a large amount of memory for storage. Building on this, Huang proposed the super-adjacency matrix representation, which integrates the intra-layer links of the temporal network as well as the inter-layer associations to capture its link relations [17]. In summary, the common visual representations of the temporal network are shown in Fig. 1.1, where A, B, C, and D represent network nodes; ω is the connection weight of the multiplex network; and t is the split point of the time slice.

Fig. 1.1 Some typical representations of the temporal network: a integrated time graph; b time slicing graph; c multiplex network; d time path graph
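To make the snapshot and super-adjacency representations concrete, the sketch below builds both from a timed contact list in plain numpy. It is a minimal illustration only: the contact list, the node count, and the unit inter-layer coupling weight ω are invented for the example rather than taken from the works cited above.

```python
import numpy as np

# Invented contact list: (i, j, t) means nodes i and j interact at time t.
contacts = [(0, 1, 0), (1, 2, 0), (0, 2, 1), (2, 3, 2), (1, 3, 2)]
n_nodes, n_slices = 4, 3

# Snapshot representation: one static adjacency matrix per time slice.
snapshots = np.zeros((n_slices, n_nodes, n_nodes))
for i, j, t in contacts:
    snapshots[t, i, j] = snapshots[t, j, i] = 1

# Super-adjacency representation: intra-layer links on the block diagonal;
# inter-layer couplings of weight omega tie each node to its own copy in
# the neighbouring time slice.
omega = 1.0  # assumed coupling weight
supra = np.zeros((n_slices * n_nodes, n_slices * n_nodes))
for t in range(n_slices):
    blk = slice(t * n_nodes, (t + 1) * n_nodes)
    supra[blk, blk] = snapshots[t]
    if t + 1 < n_slices:
        nxt = slice((t + 1) * n_nodes, (t + 2) * n_nodes)
        supra[blk, nxt] = supra[nxt, blk] = omega * np.eye(n_nodes)

print(supra.shape)  # (12, 12): one row/column per (node, slice) pair
```

The memory trade-off noted above is visible here: the supra matrix has (|V|·|T|)² entries, which is why compact tensor or event-list representations matter for long observation windows.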

1.3 Structural Properties and Statistical Characteristics

1.3.1 Temporal Motifs and Community Structures

A large number of studies in recent years have focused on the mesoscale properties of temporal networks, including temporal motifs [18] and community structures [19]. Regarding temporal motifs, the literature [20] introduced the concept of motif flow to quantify motifs, as a way to distinguish this concept between static and temporal networks; the literature [21] analyzed gender- and age-related temporal motifs in a large corpus of cell phone call data; and the literature [22] proposed a hybrid Markov chain approach to detect motif structures in stochastic temporal networks. Community structure is a fundamental topological property of complex networks that affects information propagation, yet the delineation of community structure in temporal networks has received relatively little consideration. For example, the literature [23] evaluated the robustness of existing community detection algorithms by optimizing the modularity function of the temporal network in a multilayer representation framework, and the literature [24] proposed the concept of community vitality, which denotes the intensity of life of a community structure within a temporal layer and can be used to describe the temporal evolution of community structure.


Nevertheless, the study of temporal motifs and the detection of community structures at the mesoscale level are still at an exploratory stage, and many problems continue to challenge researchers, such as fast detection algorithms for motifs and community structures, as well as the classification of structural features.
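As a simplified illustration of motif counting, the sketch below scans a time-ordered event list and counts ordered two-event patterns that share a node and fall within a window Δt. The event list and Δt are invented, and full temporal-motif definitions such as that of Kovanen et al. [21] impose stricter Δt-adjacency conditions than this toy version.

```python
from collections import Counter

# Invented directed event list: (u, v, t) means u contacts v at time t.
events = sorted([(0, 1, 1), (1, 2, 3), (2, 0, 4), (1, 2, 9), (2, 3, 10)],
                key=lambda e: e[2])
DELTA_T = 5  # assumed maximum gap between the two events of a motif

motifs = Counter()
for a in range(len(events)):
    u1, v1, t1 = events[a]
    for b in range(a + 1, len(events)):
        u2, v2, t2 = events[b]
        if t2 - t1 > DELTA_T:
            break  # events are time-sorted, so later pairs cannot qualify
        if {u1, v1} & {u2, v2}:          # the two events share a node
            if v1 == u2:
                motifs['chain'] += 1     # u1 -> v1, then v1 -> v2
            elif u1 == u2:
                motifs['out-star'] += 1  # same source twice
            elif v1 == v2:
                motifs['in-star'] += 1   # same target twice
            else:
                motifs['other'] += 1
print(motifs)
```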

1.3.2 Paths and Distances

The distance between two nodes, i.e., the minimal number of connections in the routes linking them, is the most fundamental metric describing their relation in a static network. There are several approaches to generalizing distance to temporal networks, and it is not always clear which one to use. A classical approach considers nodes v_i and v_j at time t: the latency is then the difference between t and the earliest time at which a time-respecting path from v_i reaches v_j, provided such a path exists. Pan and Saramäki [13] investigated several approaches, such as assuming periodic boundary conditions or separating the questions of whether a node is reachable at all and, if so, how long it takes to reach it. The first idea was presented by Holme [25], who defined the reachability time as the average minimum time at which a time-respecting path from node v_i reaches v_j. Another generalization of the temporal network distance takes into account the number of connections in a temporal path.
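The latency just described can be computed with one pass over a time-ordered contact list. The sketch below is a minimal illustration with an invented contact sequence; it assumes instantaneous, undirected contacts and returns, for each node, the earliest time it can be reached from a given source starting at a given time.

```python
import math

# Invented undirected contact sequence: (i, j, t), sorted by time.
contacts = sorted([(0, 1, 2), (1, 2, 5), (0, 2, 7), (2, 3, 9)],
                  key=lambda c: c[2])
n_nodes = 4

def earliest_arrival(source, start_time, contacts, n_nodes):
    """Earliest time each node is reached from `source` along a
    time-respecting path that starts at or after `start_time`."""
    arrival = [math.inf] * n_nodes
    arrival[source] = start_time
    for i, j, t in contacts:              # requires time-sorted contacts
        if t >= arrival[i] and t < arrival[j]:
            arrival[j] = t
        if t >= arrival[j] and t < arrival[i]:
            arrival[i] = t
    return arrival

arrival = earliest_arrival(0, 0, contacts, n_nodes)
latency = [a - 0 for a in arrival]  # temporal distances from node 0 at t = 0
print(latency)                      # [0, 2, 5, 9] for this toy data
```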

1.3.3 Centrality Metrics

The study of node importance ranking based on centrality metrics has made some progress on static networks, but the definition and ranking of node centrality measures in temporal networks need to be revisited and improved because of the introduced temporal dimension. After extensive research in recent years, a set of temporal structure measures has been developed [1, 2, 19], such as temporal paths, reachability, connectivity, average waiting time, network efficiency, minimum spanning trees, closeness centrality, betweenness centrality, and edge persistence patterns.

Temporal networks have their own unique characteristics, and the study of their statistical properties falls into two main categories. One is to study the network as a whole within the full observation window. The other is to slice the network first, study the statistical properties within each small network separately, and then generalize the statistical properties of the whole network from the properties of the slices. For example, in the literature [10–12], the network is sliced and various statistical properties such as the shortest temporal path, closeness centrality, betweenness centrality, and clustering coefficients are computed on the sliced networks. Some commonly used centrality metrics follow.

Temporal closeness:

$$C_t(i) = \frac{N - 1}{\sum_{j \neq i} \tau_t(i, j)}, \tag{1.1}$$

where $\tau_t(i, j)$ is the temporal distance between nodes $v_i$ and $v_j$.

Temporal node degree:

$$D_{[t_1, t_2]}(i) = \frac{\sum_{t = t_1}^{t_2} D_t(i)}{(|V| - 1)(t_2 - t_1)}, \tag{1.2}$$

where $D_t(i)$ is the degree in the integrated temporal network from time $t_1$ to $t_2$ and $|V|$ is the number of nodes.

Temporal betweenness:

$$B_{[t_1, t_2]}(i) = \sum_{\substack{j \neq k \neq i \\ \sigma_{[t_1, t_2]}(j, k) > 0}} \frac{\sigma_{[t_1, t_2]}(j, k, i)}{\sigma_{[t_1, t_2]}(j, k)}, \tag{1.3}$$

where $\sigma_{[t_1, t_2]}(j, k)$ is the set of shortest paths between nodes $v_j$ and $v_k$ from time $t_1$ to $t_2$, and $\sigma_{[t_1, t_2]}(j, k, i)$ is the subset of $\sigma_{[t_1, t_2]}(j, k)$ whose paths go through node $v_i$.
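As an illustration, Eqs. (1.1) and (1.2) translate directly into a few lines of code once temporal distances and time slices are available. The sketch below uses invented toy data; temporal betweenness (1.3) is omitted because it additionally requires enumerating shortest temporal paths. Assigning closeness 0 to nodes with any unreachable target, and taking (t2 - t1) to be the number of slices, are design choices of this sketch rather than prescriptions from the literature.

```python
import numpy as np

def temporal_closeness(tau):
    """Eq. (1.1): tau is an (N, N) matrix of temporal distances, with
    tau[i, j] the latency from i to j (np.inf when unreachable)."""
    n = tau.shape[0]
    closeness = np.zeros(n)
    for i in range(n):
        d = np.delete(tau[i], i)      # distances to the other N - 1 nodes
        closeness[i] = (n - 1) / d.sum() if np.isfinite(d).all() else 0.0
    return closeness

def temporal_degree(snapshots):
    """Eq. (1.2): snapshots is a (T, N, N) binary array of time slices;
    the normalisation uses T as the window length (t2 - t1)."""
    T, n, _ = snapshots.shape
    return snapshots.sum(axis=(0, 2)) / ((n - 1) * T)

# Invented toy data: a symmetric distance matrix and 4 random snapshots.
tau = np.array([[0., 2., 5.], [2., 0., 3.], [5., 3., 0.]])
rng = np.random.default_rng(0)
snaps = np.triu(rng.integers(0, 2, size=(4, 3, 3)), 1)
snaps = snaps + snaps.transpose(0, 2, 1)
print(temporal_closeness(tau))
print(temporal_degree(snaps))
```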

1.4 Application Analysis

1.4.1 Important Node Analysis

Node importance metrics in temporal networks matter for applications such as fault detection. Moreover, since the temporal network is a more accurate abstraction of the real system, the problem of node importance metrics on temporal networks has received much attention [12, 26]. Research on this can be broadly divided into two aspects. One is extended research on metrics such as closeness centrality, degree centrality, and betweenness centrality in temporal networks; for example, in 2012, Kim et al. defined degree centrality, betweenness centrality, and closeness centrality for the temporal network by studying its effective paths [9]. The other is the extended study of node importance metrics based on eigenvector centrality in temporal networks; for example, Taylor et al. introduced the multilayer-network construction into temporal networks, thus extending eigenvector-based node importance metrics to the temporal setting [27].
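To make the eigenvector-based direction concrete, here is a minimal sketch in the spirit of Taylor et al. [27]: power iteration on a supra-adjacency matrix in which each node is coupled to its own copy in the next time slice. The two snapshots and the identity coupling are invented, and this is a simplification rather than the exact construction of [27].

```python
import numpy as np

# Invented example: 3 nodes, 2 slices, identity inter-layer coupling.
A0 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A1 = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
n, T = 3, 2
supra = np.zeros((n * T, n * T))
supra[:n, :n], supra[n:, n:] = A0, A1
supra[:n, n:] = supra[n:, :n] = np.eye(n)

x = np.ones(n * T)
for _ in range(200):                  # plain power iteration
    x = supra @ x
    x /= np.linalg.norm(x)

cent = x.reshape(T, n)                # per-slice node centralities
print(cent.sum(axis=0))               # aggregate node importance over time
```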

1.4.2 Link Prediction

Existing link prediction algorithms mostly operate on static networks, which neglects the evolution of the network structure over time, and there are relatively few studies on the link prediction problem in temporal networks. Some works consider the change over time of a characteristic property of the network structure as a whole; for example, in 2007, Leskovec et al. considered the evolutionary trends of network density and network diameter in a temporal network [28]. A few works start instead from the local network structure. For example, in 2009, Huang and Lin reported a hybrid prediction method that mixes univariate time-series analysis with network-structure-based link prediction models [29]. Dunlavy et al. proposed a link prediction approach for temporal networks from the perspective of matrix and tensor decomposition [30], and Soares and Prudêncio presented a time-series prediction method based on similarity measures over the network structure [31].
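A common recipe in these time-series approaches is to compute a structural similarity score per time slice and then forecast its next value. The sketch below illustrates that recipe in a generic form rather than reproducing any single cited method: common-neighbour scores per snapshot, combined by exponential smoothing. The snapshot history and the smoothing factor are invented.

```python
import numpy as np

def forecast_scores(snapshots, alpha=0.5):
    """Exponentially smooth per-slice common-neighbour scores; the final
    smoothed matrix serves as the predicted score for the next step."""
    smoothed = (snapshots[0] @ snapshots[0]).astype(float)
    for A in snapshots[1:]:
        smoothed = alpha * (A @ A) + (1 - alpha) * smoothed
    return smoothed

# Invented history of three symmetric binary snapshots over 4 nodes.
rng = np.random.default_rng(1)
snaps = []
for _ in range(3):
    A = np.triu(rng.integers(0, 2, size=(4, 4)), 1)
    snaps.append(A + A.T)

scores = forecast_scores(np.array(snaps))
np.fill_diagonal(scores, -np.inf)  # ignore self-links; existing edges could be masked too
i, j = np.unravel_index(np.argmax(scores), scores.shape)
print(f"most likely next link: ({i}, {j})")
```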

1.4.3 Community Segmentation

Most current community discovery algorithms still address the community segmentation problem in static networks. Traditional community discovery algorithms aim to identify hidden substructures in the network, while dynamic community partitioning must also track the evolution of these local topologies [32–34]. In 2007, Palla et al. proposed six types of possible changes in community structures: Birth, Growth, Contraction, Death, Merge, and Split [35]. Sometimes the state of Continue is also considered, and in 2014 Cazabet and Amblard added an eighth possibility: Resurgence [36]. Recently, Rossetti and Cazabet surveyed the existing dynamic community partitioning algorithms and classified them into three categories [37]: initial optimal community partitioning [38], temporal trade-off community partitioning [39–41], and inter-temporal community discovery [19, 42].
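As a small illustration of how such community events can be labeled in practice, the sketch below matches the communities of consecutive slices by Jaccard overlap and assigns a few of the Palla-style event types. The overlap threshold and the community sets are invented, and real trackers handle the event taxonomy far more carefully.

```python
def jaccard(a, b):
    return len(a & b) / len(a | b)

def community_events(prev, curr, thr=0.3):
    """Label simple events between two consecutive slices; `prev` and
    `curr` are lists of node sets, `thr` an assumed overlap threshold."""
    events = []
    for c in curr:
        parents = [p for p in prev if jaccard(p, c) >= thr]
        if not parents:
            events.append(('Birth', c))
        elif len(parents) > 1:
            events.append(('Merge', c))
        elif len(c) > len(parents[0]):
            events.append(('Growth', c))
        elif len(c) < len(parents[0]):
            events.append(('Contraction', c))
        else:
            events.append(('Continue', c))
    for p in prev:
        children = [c for c in curr if jaccard(p, c) >= thr]
        if not children:
            events.append(('Death', p))
        elif len(children) > 1:
            events.append(('Split', p))
    return events

print(community_events([{0, 1, 2}, {3, 4}], [{0, 1, 2, 5}, {3}, {6, 7}]))
```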

1.4.4 Influence Maximization

The influence maximization problem on static graphs is to find k people in a social network as seed nodes such that, under a specific propagation model (e.g., the IC propagation model), information spreading from the k seeds influences as many other people in the network as possible. The first studies of the influence maximization problem on dynamic graphs were conducted in the literature [43] and [44]. After that, a large-scale temporal graph influence maximization algorithm was proposed [45]. In this algorithm, an improved ICT (Independent Cascade model on Temporal graph) propagation model is presented so that information can be propagated on the temporal graph through the ICT model; the PageRank algorithm is then improved to calculate the propagation probability between nodes; and finally, influence maximization on the temporal graph is implemented in two steps on this basis.
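The sketch below illustrates the general idea of cascading along time-respecting contacts combined with greedy, Monte-Carlo seed selection; it is a toy stand-in, not the ICT model or the PageRank-based probabilities of [45]. The contact list, the uniform propagation probability p, and the simulation budget are all invented.

```python
import random

def temporal_ic_spread(seeds, contacts, p, rng):
    """One independent-cascade run where infection travels only along
    contacts in increasing time order. `contacts` is time-sorted (u, v, t)."""
    active = set(seeds)
    for u, v, t in contacts:
        if u in active and v not in active and rng.random() < p:
            active.add(v)
        elif v in active and u not in active and rng.random() < p:
            active.add(u)
    return active

# Invented toy contact list; greedy selection of k seeds.
contacts = sorted([(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 4), (3, 4, 5)],
                  key=lambda c: c[2])
nodes, k, runs = {0, 1, 2, 3, 4}, 2, 200
seeds = set()
for _ in range(k):
    def mean_spread(cand):
        total = 0
        for r in range(runs):   # Monte-Carlo estimate of expected spread
            total += len(temporal_ic_spread(seeds | {cand}, contacts,
                                            p=0.3, rng=random.Random(r)))
        return total / runs
    seeds.add(max(nodes - seeds, key=mean_spread))
print("chosen seeds:", seeds)
```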


1.4.5 Analysis Tools

One analysis tool for temporal networks is the package "teneto", which runs on Python 3.x and is still under continuous improvement. The package provides tools for evaluating temporal network data: temporal network measures, temporal network construction, time-varying/dynamic connectivity derivation, and visualization functions [46]. Thompson et al. described the core principles of temporal network theory and the measurement methods of temporal networks in detail, then applied these measurements to a resting-state fMRI dataset to demonstrate the utility of the method; all of the above descriptions and measurements were carried out with the teneto tools [47].
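A minimal usage sketch follows, assuming teneto around version 0.5.0 as cited in [46]; the constructor and measure names here follow that version's online documentation and may differ in later releases, so treat them as assumptions to be checked against the docs.

```python
# Assumes teneto ~0.5.0 (pip install teneto); API names may have changed.
import teneto

# Build a binary undirected temporal network from an (i, j, t) edge list.
tnet = teneto.TemporalNetwork(nettype='bu')
tnet.network_from_edgelist([[0, 1, 0], [0, 2, 1], [1, 2, 2], [0, 1, 2]])

# Per-node temporal degree centrality, aggregated over time.
deg = teneto.networkmeasures.temporal_degree_centrality(tnet)
print(deg)

# tnet.plot('slice_plot')  # built-in contact-sequence plot (needs matplotlib)
```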

1.5 The Future Outlook

In summary, research on temporal networks can be considered to be still at an early stage of development, so many areas remain to be improved.

1. The representation of the temporal network. There is still a long way to go to establish a general and effective model that characterizes time information so that the temporal network is easy to calculate mathematically and easy to understand in its representation, while significantly reducing the storage required for computation. As for dividing the observation window of the temporal network, it is worth studying how to select the optimal window size adaptively and make the description method more flexible.

2. Modeling the evolution of community structure in the temporal network. Since the structure of a temporal network changes constantly, the community to which a node belongs may also differ or change during the evolution process. The community structure of the temporal network may change dynamically, with communities continuously being born, dying, expanding, contracting, merging, and splitting. How to model or match this dynamic community evolution so that the dynamic community structure of the temporal network can be portrayed and effectively divided is a problem for further investigation.

3. Application research needs further expansion. The application prospects of temporal networks can be extended by building more comprehensive databases, developing more effective analysis tools, and conducting deeper feature learning.

Acknowledgements The authors would like to convey their appreciation for the financial support given by the National Natural Science Foundation and the Project with the National University of Defense and Technology. The authors also agree there has been no conflict of interest in the process of this work.


References

1. Holme, P., Saramäki, J.: Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
2. Blonder, B., Wey, T.W., Dornhaus, A., et al.: Temporal dynamics and network analysis. Methods Ecol. Evol. 3(6), 958–972 (2012)
3. Barrat, A., Cattuto, C., Colizza, V., et al.: Empirical temporal networks of face-to-face human interactions. Eur. Phys. J. Spec. Top. 222(6), 1295–1309 (2013)
4. Scholtes, I., Wider, N., Pfitzner, R., et al.: Causality-driven slow-down vs. speed-up of diffusion in non-Markovian temporal networks. Nat. Commun. 5, 5024 (2014)
5. Barabási, A.: The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005)
6. Stehlé, J., Voirin, N., Barrat, A., et al.: High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE 6(8), e23176 (2011)
7. Casteigts, A., Flocchini, P., Quattrociocchi, W., et al.: Time-varying graphs and dynamic networks. Ad-Hoc, Mob., Wirel. Netw. 346–359 (2011)
8. Rosvall, M., Bergstrom, C.: Mapping change in large networks. PLoS ONE 5(1), e8694 (2010)
9. Kim, H., Anderson, R.: Temporal node centrality in complex networks. Phys. Rev. E 85(2), 026107 (2012)
10. Tang, J.K.: Temporal network metrics and their application to real world networks. Ph.D. Thesis, University of Cambridge (2012)
11. Tang, J.K., Musolesi, M., Mascolo, C., et al.: Temporal distance metrics for social network analysis. In: The 2nd ACM Workshop on Online Social Networks, Barcelona, Spain, pp. 31–36 (2009)
12. Tang, J., Musolesi, M., Mascolo, C., et al.: Analyzing information flows and key mediators through temporal centrality metrics. In: The 3rd Workshop on Social Network Systems, Paris, France, pp. 1–6 (2010)
13. Pan, R.K., Saramäki, J.: Path lengths, correlations, and centrality in temporal networks. Phys. Rev. E 84(1), 016105 (2011)
14. Valdano, E., Ferreri, L., Poletto, C., et al.: Analytical computation of the epidemic threshold on temporal networks. Phys. Rev. X 5(2), 021005 (2015)
15. Zhang, Y.Q., Li, X., Liang, D.: Characterizing bursts of aggregate pairs with individual Poissonian activity and preferential mobility. IEEE Commun. Lett. 19(7), 1225–1228 (2015)
16. Gauvin, L., Panisson, A., Barrat, A., Cattuto, C.: Revealing latent factors of temporal networks for mesoscale intervention in epidemic spread. arXiv preprint arXiv:1501.02758 (2015)
17. Huang, Q.J.: Research on structure modeling and evolution analysis in temporal network. Ph.D. Thesis, National University of Defense and Technology (2019)
18. Holme, P., Saramäki, J.: Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
19. Mucha, P.J., Richardson, T., Macon, K., et al.: Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980), 876–878 (2010)
20. Rocha, L.E., Blondel, V.D.: Flow motifs reveal limitations of the static framework to represent human interactions. Phys. Rev. E 87(4), 042814 (2013)
21. Kovanen, L., Kaski, K., Kertész, J., et al.: Temporal motifs reveal homophily, gender-specific patterns, and group talk in call sequences. Proc. Natl. Acad. Sci. U.S.A. 110(45), 18070–18075 (2013)
22. Liu, K., Cheung, W.K., Liu, J.: Detecting stochastic temporal network motifs for human communication patterns analysis. In: The International Conference on Advances in Social Networks Analysis and Mining, Niagara, Ontario, pp. 533–540 (2013)
23. Bassett, D.S., Porter, M.A., Wymbs, N.F., et al.: Robust detection of dynamic community structure in networks. Chaos 23, 013142 (2013)
24. Fu, C., Li, M., Zou, D.Q., et al.: Community vitality in dynamic temporal networks. Int. J. Distrib. Sens. Netw. 281565 (2013)
25. Holme, P.: Network reachability of real-world contact sequences. Phys. Rev. E 71(4), 046119 (2005)


26. Praprotnik, S., Batagelj, V.: Spectral centrality measures in temporal networks. Ars Mathematica Contemporanea 11(1), 11–33 (2015)
27. Taylor, D., Myers, S.A., Clauset, A., et al.: Eigenvector-based centrality measures for temporal networks. Multiscale Model. Simul. 15(1), 537–574 (2017)
28. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 1–41 (2007)
29. Huang, Z., Lin, D.K.J.: The time-series link prediction problem with applications in communication surveillance. INFORMS J. Comput. 21(2), 286–303 (2009)
30. Dunlavy, D.M., Kolda, T.G., Acar, E.: Temporal link prediction using matrix and tensor factorizations. ACM Trans. Knowl. Discov. Data 5(2), 1–27 (2011)
31. Soares, P.R., Prudêncio, R.: Time series based link prediction. In: The International Joint Conference on Neural Networks, Brisbane, QLD, Australia, pp. 1–7 (2012)
32. Gauvin, L., Panisson, A., Cattuto, C.: Detecting the community structure and activity patterns of temporal networks: A non-negative tensor factorization approach. PLoS ONE 9(1), e86028 (2014)
33. Peixoto, T.P., Rosvall, M.: Modeling sequences and temporal networks with dynamic community structures. Nat. Commun. 8(582), 1–12 (2017)
34. Matias, C., Miele, V.: Statistical clustering of temporal networks through a dynamic stochastic block model. J. R. Stat. Soc. Ser. B-Stat. Methodol. 79(4), 1119–1141 (2017)
35. Palla, G., Barabási, A., Vicsek, T.: Quantifying social group evolution. Nature 446(7136), 664–667 (2007)
36. Cazabet, R., Amblard, F.: Dynamic community detection. Springer (2014)
37. Rossetti, G., Cazabet, R.: Community discovery in dynamic networks: A survey. arXiv:1707.03186v3 (2020)
38. Aynaud, T., Fleury, E., Guillaume, J.L., et al.: Communities in evolving networks: definitions, detection, and analysis techniques. Dyn. Complex Netw. 2, 159–200 (2013)
39. Aynaud, T., Guillaume, J.L.: Static community detection algorithms for evolving networks. In: International Symposium on Modeling and Optimization in Mobile, Ad-Hoc and Wireless Networks, pp. 513–519 (2010)
40. Guo, C.H., Wang, J.J., Zhang, Z.: Evolutionary community structure discovery in dynamic weighted networks. Physica A 413, 565–576 (2014)
41. Liu, F.C., Choi, D., Xie, L., Roeder, K.: Global spectral clustering in dynamic networks. Proc. Natl. Acad. Sci. U.S.A. 115(5), 927–932 (2018)
42. Viard, T., Latapy, M., Magnien, C.: Computing maximal cliques in link streams. Theor. Comput. Sci., 245–252 (2016)
43. Kim, D., Hyun, D., Oh, J., et al.: Influence maximization based on reachability sketches in dynamic graphs. Inf. Sci. 394–395, 217–231 (2017)
44. Wang, Y.H., Fan, Q., Li, Y.C., et al.: Real-time influence maximization on dynamic social streams. In: Proceedings of the VLDB Endowment, Munich, Germany, pp. 805–816 (2017)
45. Wu, A.B., Yuan, Y., Qiao, B.Y., et al.: The influence maximization problem based on large-scale temporal graph. Chin. J. Comput. 42(12), 2647–2664 (2019)
46. Thompson, W.H., Granitz, Harlalka, V., et al.: Wiheto/teneto: 0.5.0 (2020). https://github.com/wiheto/teneto/tree/0.5.0
47. Thompson, W.H., Brantefors, P., Fransson, P.: From static to temporal network theory: Applications to functional brain connectivity. Network Neuroscience 1(2), 69–99 (2017)

Chapter 2

A Knowledge Representation Method for Hierarchical Diagnosis Decision-Making Under Uncertain Conditions

Yawei Ge, Xiqian Hou, Zhuxuan Meng, and Yue Lu

Abstract To effectively solve the problems of hierarchical diagnosis decision-making knowledge representation under uncertain conditions, this paper lists several key characteristics of diagnosis decision-making knowledge for complex systems and, in combination with a hierarchical classification of diagnosis decision-making knowledge, builds ontology models of the related knowledge based on ontology theory, and constructs the semantic hierarchy and its relationships between the global ontology and each core domain ontology. In this way, the paper realizes semantic hierarchical analysis and knowledge association. Through a case study of a modular aircraft support system, the knowledge representation of diagnosis decision-making under uncertain conditions is realized.

Keywords Uncertain conditions · Hierarchical diagnosis decision-making knowledge · Representation · Ontology

2.1 Introduction

Knowledge is a prerequisite for any intelligent behavior. Information obtained from the outside world must be expressed in some form for human beings to keep advancing their cognition in complex, uncertain environments and to act correctly. At present, in many fault diagnosis decision applications, knowledge representation methods are inconsistent, and domain knowledge lacks systematic guidelines and mature standard methods for fault diagnosis and maintenance decision-making in complex systems. With the rapid development of sensor technology, there are differences and diversity in the knowledge information obtained from different types of weapons and equipment.


Monitoring data obtained from different levels of the same complex system also shows fuzziness, multi-source characteristics, and coupling. It is therefore important to study knowledge representation and inference methods for hierarchical diagnosis decision-making under uncertain conditions. Li [1] proposed an object-oriented knowledge representation method for diagnostic Bayesian networks for complex devices, but both exact and approximate inference in diagnostic Bayesian networks are NP-hard: the number of parameters required for a conditional probability table (CPT) grows exponentially with the number of parent nodes and node states, so constructing CPTs demands large amounts of statistics. This makes knowledge expression and inference especially difficult for multi-level, multiply connected complex systems. Di [2] proposed an ontology-based representation model and mechanism for rotating-machine fault knowledge; the ontology was used to represent the fault knowledge and define its semantic relationships, building a classification system and metadata model for diagnostic knowledge. However, its core fault-knowledge ontology still needs improvement, in combination with other methods and semantic environments, to handle uncertain and fuzzy information. The uncertain or incomplete knowledge involved in product conceptual design is analyzed in Xiang [3], which compares the Bayes method, neural networks, evolutionary computation, and evidence theory for uncertain knowledge representation and redundancy handling, and proposes rough-set-based fusion, expression, and inference of uncertain knowledge; these approaches, however, lack semantic understanding and sharing in hierarchical knowledge representation of complex systems.

Knowledge representation is a sub-area of artificial intelligence concerned with how computers recognize, design, and implement information; it is used to derive implicit knowledge, to communicate with people in natural language, to plan future behavior, and to solve problems in areas that usually require human expertise. Since deriving the knowledge implied in information is itself a form of inference, the field is often referred to as knowledge representation and inference [4]. The knowledge representation mechanism and the knowledge inference model are important components of hierarchical diagnosis. Diagnosis decision-making knowledge representation is representation learning oriented to the fault entities and relationships in a fault knowledge base: by mapping the fault entities and relationships, their semantic information can be represented, enabling effective and fast computation over fault entities, relationships, and their complex semantic associations. Much work has been done on the research and application of diagnosis decision-making knowledge representation, but several key issues remain to be solved in order to achieve hierarchical diagnosis decision-making knowledge expression under uncertain conditions:


(1) Diagnosis decision-making knowledge of complex systems is multi-origin, diverse, and inconsistent. The original fault diagnosis decision-making methods depend on knowledge engineering and expert systems, and the complexity and multi-origin nature of system fault knowledge make the fault knowledge hard to express. A single expression method can acquire fault knowledge and support system maintenance effectively, but easily omits complex fault knowledge; fusing two or more expression methods reduces such omissions but increases the difficulty of knowledge acquisition. At the same time, system heterogeneity, the multi-origin and inconsistent nature of diagnosis decision-making knowledge, and insufficient semantic support at the knowledge-base level produce isolated knowledge islands: complex-system knowledge cannot be shared or interoperated, hidden knowledge cannot be explored, and knowledge inference cannot be achieved, which seriously limits the expansion of diagnosis decision-making knowledge. It is therefore necessary to construct a knowledge representation method for complex-system diagnosis decision-making that is more standardized in expression, more universal across domains, more sufficient in semantics, and more complete.

(2) Given the large amount of knowledge information in complex systems and the difficulty of obtaining it, traditional semantic-level knowledge modeling methods can achieve mappings between knowledge items, express grammatical and semantic relationships within knowledge, extend the knowledge expression structure, and so on. However, when making fault diagnosis decisions for complex systems, traditional knowledge representation modeling depends heavily on experts, and it is difficult to ensure the completeness and adequacy of the resulting expression. A hierarchical knowledge representation modeling method for complex-system diagnosis decision-making is therefore needed to effectively express and manage multi-source knowledge in the diagnosis decision-making process.

(3) The causal relationships implied in diagnosis decision-making knowledge of complex systems are path-dependent, hard to distinguish, practical, individual, and implicit. They are difficult to express in text and charts, and difficult to express through CPTs in traditional Bayesian-network-based methods [5]. New methods of causal knowledge expression and reasoning are therefore needed to express the implicit causal knowledge in diagnosis decision-making processes for complex systems.

(4) Diagnosis decision-making knowledge of complex systems takes various forms. Expressing it simply through words, voice, or experience not only burdens the memory and thinking of operators but also reduces the efficiency and accuracy of fault diagnosis to a certain degree. Edward De Bono argues that the stiffness caused by human language can be avoided by using as many "graphics" and as few "words" as possible in one's mind [6].


Therefore, research is needed on diagnosis decision-making expression methods that make the knowledge more intuitive and easier for operators to understand and apply.

In view of these problems in knowledge representation of hierarchical diagnosis decision-making models for complex systems, this paper studies methods of knowledge representation for diagnosis decision-making under uncertain conditions combined with a semantic environment.

2.2 Hierarchical Diagnosis Decision-Making Knowledge Classification

Diagnosis decision-making knowledge is essentially the manifestation of data and information in the field of diagnosis decision-making. By effectively connecting the status data, function information, scheduling information, process information, and other data of the diagnostic object, it can guide the execution and management of diagnostic tasks, solve diagnostic problems, and complete intelligent diagnosis decision-making. The information structure of diagnosis decision-making knowledge is shown in Fig. 2.1.

Fig. 2.1 Information structure diagram of diagnosis decision-making knowledge

Hierarchical diagnosis decision-making knowledge can be divided into four types from the perspective of function and relationship:

Descriptive knowledge: knowledge derived from descriptive information, belonging to "know-what". It describes the concepts, structure, and functions of diagnostic objects and diagnostic states at all levels of hierarchical diagnosis decision-making, as well as the definition and nature of hierarchical diagnosis decision modes.

Diagnostic knowledge: the core of diagnosis decision-making knowledge, belonging to "know-why". It includes fault modes, feature parameters, process assumptions, causal relationships between conclusions in the hierarchical diagnosis decision-making process, and domain experts' empirical solutions to specific fault diagnosis decision-making problems. It is the main basis for obtaining and formulating knowledge strategies for hierarchical diagnosis decision-making problems.

Process knowledge: information reflecting the general process of hierarchical diagnosis decision-making, belonging to "know-how". It serves as assistant reference knowledge for numerical calculation algorithms, feature extraction methods, diagnostic strategy formulation, process inference, parameter monitoring, and so on.

Control knowledge: knowledge about solving the control strategy for the whole hierarchical diagnosis decision process, belonging to "know-who". It describes the control of the diagnosis decision-making process, realizes the selection and activation of knowledge for different levels of diagnosis decision-making tasks, completes control of the diagnostic process, and formulates diagnostic inference strategies and conflict resolution strategies.

Descriptive knowledge and diagnostic knowledge constitute the main body of hierarchical diagnosis decision-making knowledge and are the basis of process knowledge and control knowledge. Together, the four types complete the expression and inference of the diagnosis decision-making process; through self-organizing knowledge management, diagnosis decision-making knowledge from the related fields of a complex system is then used to effectively solve hierarchical diagnosis decision-making problems.
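To make this four-way classification concrete, the following minimal Python sketch tags knowledge items with the four categories and filters them by type. It is an illustration only, not part of the authors' system; all item names, fields, and values are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum

class KnowledgeType(Enum):
    DESCRIPTIVE = "know-what"   # concepts, structure, functions of diagnostic objects
    DIAGNOSTIC = "know-why"     # fault modes, feature parameters, causal relations
    PROCESS = "know-how"        # algorithms, feature extraction, diagnostic strategies
    CONTROL = "know-who"        # selection/activation of knowledge, conflict resolution

@dataclass
class KnowledgeItem:
    name: str
    ktype: KnowledgeType
    content: dict = field(default_factory=dict)

# A tiny knowledge base mixing the four types (hypothetical entries).
kb = [
    KnowledgeItem("air_cond_module_structure", KnowledgeType.DESCRIPTIVE,
                  {"parent": "support_system"}),
    KnowledgeItem("compressor_overheat_mode", KnowledgeType.DIAGNOSTIC,
                  {"feature": "outlet_temp", "threshold": 95.0}),
    KnowledgeItem("fft_feature_extraction", KnowledgeType.PROCESS,
                  {"input": "vibration_signal"}),
    KnowledgeItem("level2_task_dispatch_rule", KnowledgeType.CONTROL,
                  {"activates": ["compressor_overheat_mode"]}),
]

# Control knowledge would select and activate diagnostic knowledge per task level.
diagnostic = [k for k in kb if k.ktype is KnowledgeType.DIAGNOSTIC]
print([k.name for k in diagnostic])
```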

2.3 Hierarchical Diagnosis Decision-Making Knowledge Ontology Modeling

2.3.1 Hierarchical Diagnosis Decision-Making Knowledge Representation Model

Hierarchical diagnosis decision-making is essentially a decision-making process that combines all kinds of diagnosis decision-making information and resources about a complex object to obtain its fault status and maintenance mode. The knowledge expression process of hierarchical diagnosis decision-making obtains the information related to the diagnostic object, processes it appropriately to extract the required knowledge and, guided by the diagnostic goal, expresses that knowledge in a more standardized form, with more sufficient semantics, and in a more intuitive way. In essence, it is a self-organizing process over the information, data, uncertainty, methods, and tools related to the diagnostic object. The hierarchical diagnosis decision-making knowledge representation model is shown in Fig. 2.2.

Fig. 2.2 Hierarchical diagnosis decision-making knowledge representation model

As the diagram shows, many kinds of knowledge are involved in the hierarchical diagnosis decision-making process. How fully the various kinds of diagnosis decision-making knowledge are used, and how effectively they are expressed, directly affects the accuracy of diagnosis and the efficiency of decision-making.

2.3.2 Ontology-Based Knowledge Representation Model

The concept of ontology originated in Western philosophy as a branch of metaphysics, used to describe the abstract nature of objective things. It was introduced into artificial intelligence and knowledge engineering in the 1970s to describe the common understanding of human knowledge in a field. Among the definitions given by many scholars, Gruber's 1993 definition, "an ontology is an explicit formal specification of a conceptual model", is widely accepted [7, 8]. The definition proposed by Fensel [9, 10], "a clear and shared formal description of important conceptual models in a specific domain", contains the following four meanings:

(1) Conceptualization: an ontology is an abstraction of objective things in the real world and a conceptual representation of domain knowledge. The ontology model corresponds to the conceptual framework, interoperability among subjects, and basic theories of a specific field;

(2) Explicitness: the domain knowledge concepts referenced in the ontology are clearly described, and the connections and constraints between the concepts are also clearly defined;

(3) Sharing: the domain knowledge concepts described in the ontology model are generally recognized in the domain and form an accepted set of concepts in the relevant specific field;


(4) Formality: ontology models formally encode the knowledge concepts recognized in specific domains so that machines can understand them.

The ontology representation model has a good conceptual hierarchy and supports logical inference over knowledge, enabling precise definition, formal description, and encoding of the ontology and its internal relationships in the fault diagnosis decision-making process. Traditional ontology-building methods, such as the skeleton method, the Bernaras method, and METHONTOLOGY, offer some guidance for constructing ontology models but rely too heavily on expert experience. Here, combined with the hierarchical diagnosis decision process, the Work Breakdown Structure (WBS) is introduced into the meta-ontology model: diagnosis decision-making activities serve as the guide, knowledge units are extracted by the WBS, knowledge concepts are identified from the diagnosis decision-making activities, and the ontology expression model of hierarchical diagnosis decision knowledge is established from these concepts and the logical relationships between the concepts and the activities. From the bottom level upward, the relationships between concepts and their form of organization in the diagnosis decision-making process can then be described explicitly and clearly. The ontology knowledge representation model is defined as follows:

Definition 1 (Ontology [7]): An ontology can be formally expressed as a six-tuple O = {C, R, A, I, H_C, H_R}. C is the set of concepts; a concept is a semantic object such as a thing, function, behavior, policy, or process, and can itself be represented as a four-tuple c = {ID, V, P, I_c}, where ID is the unique identity of the concept (generally a URI, Universal Resource Identifier), V is the vocabulary contained in the concept, P is the set of basic attributes of the concept, and I_c is the collection of instances of the concept. R is the collection of relationships between concepts, such as SameAs, PartOf, Contains, and Associates. A is the set of axioms, representing the logical connections and constraints that hold over concepts and their relationships. I is the set of instances, the specific entities corresponding to concepts and a basic element of each concept. H_C is the hierarchy between concepts (Hierarchy of Concepts), and H_R is the hierarchy between conceptual relationships (Hierarchy of Relations).

Combining this definition with ontology knowledge representation, hierarchical diagnosis decision-making knowledge is treated as a structured ontological description: a concept of the hierarchical diagnosis decision-making process is the smallest element of knowledge representation; its characteristics are viewed as attributes of ontology concepts; constraints in the diagnosis decision-making process are treated as conceptual axioms; and diagnosis decision-making activities are instances of concepts. The ontology concept hierarchy provides the vertical connections between knowledge expressions, while the relationship hierarchy between ontology concepts provides the horizontal connections. Hierarchical diagnosis decision-making ontology knowledge modeling mainly involves knowledge creation, mapping, and acquisition.
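Definition 1 translates naturally into a data structure. The sketch below is one plausible Python rendering of the six-tuple and the concept four-tuple under the assumptions stated in the definition (URIs as identifiers, relations and hierarchies as sets of tuples); the example entries are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """Four-tuple c = {ID, V, P, I_c}: URI, vocabulary, attributes, instances."""
    uri: str
    vocabulary: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)
    instances: set = field(default_factory=set)

@dataclass
class Ontology:
    """Six-tuple O = {C, R, A, I, H_C, H_R}."""
    concepts: dict            # C: uri -> Concept
    relations: set            # R: (relation name, subject uri, object uri)
    axioms: list              # A: logical constraints over concepts/relations
    instances: set            # I: all instances
    concept_hierarchy: set    # H_C: (sub-concept uri, super-concept uri)
    relation_hierarchy: set   # H_R: (sub-relation, super-relation)

# Hypothetical example: a tiny ontology with two concepts and one PartOf relation.
fault = Concept("urn:dd:Fault", {"fault"}, {"severity": "str"}, {"overheat_01"})
module = Concept("urn:dd:Module", {"module"})
onto = Ontology(
    concepts={c.uri: c for c in (fault, module)},
    relations={("PartOf", "urn:dd:Module", "urn:dd:System")},
    axioms=["PartOf is transitive"],
    instances={"overheat_01"},
    concept_hierarchy={("urn:dd:Fault", "urn:dd:Event")},
    relation_hierarchy=set(),
)
print(sorted(onto.concepts))
```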


(1) Knowledge creation: In the diagnosis and maintenance of complex weapons and equipment, various equipment design documents are referenced and different types of diagnostic and maintenance data are generated at different stages. The diagnostic and maintenance process also records the implementation process, information on the repaired parts, key actions taken, and so on. These take the form of design documents in standard formats, diagnosis and maintenance decision reports written for the actual situation, real-time system status data generated by the monitoring system, fault diagnosis analysis reports produced by the diagnostic system, and the like. On this basis, a hierarchical core ontology for diagnosis decision-making is constructed, and concepts are extracted from the original data for generalization to form the basic ontology. The steps are:

(a) Define the purpose, scope, function, and use of the hierarchical diagnosis decision-making knowledge ontology, and determine the scope of knowledge by combining the experience of domain experts with the actual knowledge needs of the diagnosis decision-making process, so that the ontology meets those needs within the smallest possible range;

(b) Collect and sort the original data of different types produced at different stages of the diagnosis decision-making process, according to the domain and scope of the ontology;

(c) Analyze the documents, reports, and data collected, and extract the ontology from them.

(2) Knowledge mapping: Analyze the existing hierarchical diagnosis decision-making process and related data and, combined with the work breakdown structure, structurally decompose the hierarchical diagnosis decision-making task; extract the concepts and attributes of knowledge from it, and organize the knowledge according to the logical relationships between the concepts to form a complete hierarchical diagnosis decision-making data model. The specific steps are:

(a) Apply work structure decomposition to the hierarchical diagnosis decision-making process to clarify its tasks;

(b) Structurally decompose each diagnosis decision-making task, subdividing the activities into specific subtasks;

(c) Identify the concepts, attributes, and relationships of knowledge extracted from the diagnosis decision-making activities and their subtasks, and organize the knowledge by analyzing the logical relationships between the extracted concepts to form a complete diagnosis decision-making knowledge data model;

(d) Analyze and abstract the extracted knowledge, establish the initial ontology model, and define the important knowledge classes, class attributes, and class relationships in the model to describe the knowledge concepts in the ontology model;

(e) Further refine the ontology and verify its completeness;

(f) Check the ontology model for problems such as missing knowledge classes or contradictions between classes. Such problems reduce the clarity of the knowledge ontology and make logical inference over the knowledge imprecise; if one occurs, return to step (c) and modify the initial ontology.

(3) Knowledge acquisition: By semanticizing the multi-source information in the hierarchical diagnosis decision-making process, a shared ontology is formed that gathers the scattered information sources; these semantically grounded sources are then processed in a unified way. Ontology and semantics work together to achieve information sharing and transmission and to acquire new knowledge. The specific steps are:

(a) On the basis of the complete initial ontology, build a shared ontology that collects multi-source information and realizes the semantics of the information;

(b) Build a shared knowledge ontology model for sharing and reuse with other ontologies.

Hierarchical fault diagnosis and maintenance decision-making for complex systems spans the entire life cycle of the system. Large numbers of diagnosis decision-making knowledge elements come from the system's status monitoring data, from information hidden in the system's production and working environment, and from the empirical thinking of field experts and system users. The knowledge structure is heterogeneous and multi-source and carries great uncertainty. When ontology-based hierarchical diagnosis decision-making knowledge modeling is used, knowledge management must therefore be carried out in an all-round, multi-perspective, and phased way to achieve the expression and inference of diagnosis decision-making knowledge. Combined with the main links of ontology modeling for diagnosis decision-making, the basic ontology-based modeling process consists of the following four steps:

(1) Analyze and filter the task information related to the hierarchical diagnosis decision-making process, and establish a formal knowledge ontology conceptual model;

(2) Construct an extensible semantic level, and abstract and define the concepts and their relationships in the diagnosis decision-making process;

(3) Map the core and local ontologies of the diagnosis decision-making process by combining the semantic level of diagnosis decision-making;


(4) Based on the effective acquisition, use, and storage of diagnosis decision-making semantic knowledge, continuously update the ontology conceptual model of diagnosis decision-making knowledge to achieve the fusion of diagnosis decision-making knowledge.

2.4 The Semantic Level of Hierarchical Diagnosis Decision-Making and Its Knowledge Relevance

Following the ontology-based knowledge modeling process for diagnosis decision-making, an easily extensible semantic level must be constructed for diagnosis decision-making, along with a core ontology for diagnosis decision-making entities and their behaviors built from the core concepts of the application domain. For hierarchical diagnosis decision-making of a complex system, the top ontology domains are defined according to the system's objects, behaviors, entity relationships, hierarchy, and related methods. For specific cases, the semantic hierarchy contained in the core ontology can be extended to meet the diagnosis decision-making requirements that arise continually across the successive levels and stages. The core ontology of diagnosis decision-making therefore comprises the device domain ontology (EquipDomain), process domain ontology (ProcDomain), diagnostic domain ontology (DiagDomain), and maintenance decision domain ontology (MainteDomain); together they define the conceptual abstraction of device entities, diagnosis and decision processes, and their relationships. The core ontology diagram is shown in Fig. 2.3. The relationships between the parts of the core ontology follow the idea of "problem-condition-process-method", as diagrammed in Fig. 2.4.

Fig. 2.3 Hierarchical diagnosis decision-making knowledge model core ontology diagram (the device, process, diagnosis, and decision-making domain ontologies, linked by relations such as object, state, state feedback, hierarchy, structure, support, diagnosis solution, and basis)

Fig. 2.4 Relationships between the parts of the core ontology: the device domain ontology gives the location of a failure and the system hierarchy it belongs to; the diagnosis domain ontology covers fault types, modes, characteristics, and diagnostic methods; the process domain ontology captures the fault status and current system status characteristics; the decision-making domain ontology records the decisions and actions taken to avoid failures

Based on the reality of each part of the core ontology, the core ontology is decomposed, inherited, and extended, the hierarchy of each core domain ontology is constructed, and the global ontology and each domain ontology are built, compiled, expanded, and modified using the Web Ontology Language (OWL) recommended by the World Wide Web Consortium (W3C), completing the knowledge representation of diagnosis decision-making for the whole complex system. In this paper, a graphical ontology editor is used to assist the modeling and development of the hierarchical diagnosis decision-making knowledge ontology. By synthesizing the relationships between the parts of the core ontology, the basic concepts of the global ontology can be defined, including hardware devices (EquipComponent), failures, failure causes (FaultCause), diagnosis decision-making processes, diagnostic methods, and maintenance decisions, which are transformed into classes in the global ontology. Because the ontology-building software does not fully support building ontologies from Chinese concepts, the knowledge concepts of the global ontology and each core domain ontology are abbreviated in English. According to the relationships between the core domain ontologies, the object attributes among the different types in the global ontology are then defined; their specific meanings are shown in Table 2.1. By connecting the classes with object attributes, a complete global ontology model is obtained, whose structure is shown in Fig. 2.5.


Table 2.1 Object attribute definitions for classes in the global ontology

| Object properties | Definition domains | Value domains | Interpretation |
|---|---|---|---|
| Happened at | Failure | Equip component | Occurs in (parts) |
| Has process | Failure | Diagnosis decision process | (Process) of |
| Because of | Failure | Fault cause | Is due to |
| The diagnosis method is | Failure | Diagnosis method | Failure diagnosis method is |
| The maintenance decision is | Failure | Maintenance decision | Failure maintenance decision is |
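The class-and-property pattern of Table 2.1 maps directly onto OWL. The following sketch uses the rdflib library to encode the table's rows as OWL classes and object properties with domains and ranges; the namespace URI, the camel-case property names, and the choice of rdflib are illustrative assumptions, not the authors' actual encoding.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

DD = Namespace("http://example.org/diag-decision#")  # hypothetical namespace
g = Graph()
g.bind("dd", DD)

# Classes of the global ontology (Failure, EquipComponent, FaultCause, ...).
for cls in ("Failure", "EquipComponent", "DiagDecisionProcess",
            "FaultCause", "DiagnosisMethod", "MaintenanceDecision"):
    g.add((DD[cls], RDF.type, OWL.Class))

# Object properties per Table 2.1: (property, definition domain, value domain).
rows = [
    ("happenedAt", "Failure", "EquipComponent"),
    ("hasProcess", "Failure", "DiagDecisionProcess"),
    ("becauseOf", "Failure", "FaultCause"),
    ("diagnosisMethodIs", "Failure", "DiagnosisMethod"),
    ("maintenanceDecisionIs", "Failure", "MaintenanceDecision"),
]
for prop, dom, rng in rows:
    g.add((DD[prop], RDF.type, OWL.ObjectProperty))
    g.add((DD[prop], RDFS.domain, DD[dom]))
    g.add((DD[prop], RDFS.range, DD[rng]))

print(g.serialize(format="turtle"))  # emits the global ontology skeleton in Turtle
```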

Fig. 2.5 Hierarchical diagnosis decision-making global ontology

2.4.1 Device Domain Ontology Semantics and Their Relationships

For complex systems with intricate internal structures and interleaved module units, constructing the device ontology directly takes enormous effort. To reduce the difficulty of building the device domain ontology, a hierarchical structure model is used to decompose the physical structure of the complex system into four levels: system, module, subassembly, and part. Taking a modular aircraft support system as an example, the system layer contains the support system as a whole; the module layer comprises the component modules of the system layer, including the aeronautical power module, air source module, air conditioning module, and so on. By the same hierarchical principle, modules can be subdivided further into the subassembly layer and the part layer. The domain concept knowledge of the diagnosis decision-making knowledge model does not exist in isolation: there are specific, complex relationships among the components of each level, described as the object attributes of the classes, and the components themselves have their own characteristics, described as the data type attributes of the classes. The related components and their relationships in the hierarchy of the support system, such as CentralContrMod, DieselGenerMod, AviaPowerMod, AirSupplMod, AirCondMod, and HydrauMod, are conceptualized; object attributes connect the classes, and the corresponding data type attributes are added to them. A complete device domain ontology model is thus obtained, whose structure is shown in Fig. 2.6.

Fig. 2.6 Domain ontology for diagnosis decision-making devices
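As a minimal illustration of the four-level decomposition, the plain-Python sketch below records "is part of" links and recovers the hierarchy path of a component. The module names echo the abbreviations above; the subassembly and part entries are hypothetical.

```python
# Parent links encode "is part of" across the system/module/subassembly/part levels.
parent = {
    "AviaPowerMod": "SupportSystem",
    "AirSupplMod": "SupportSystem",
    "AirCondMod": "SupportSystem",
    "Compressor": "AirCondMod",       # subassembly level (hypothetical)
    "CompressorValve": "Compressor",  # part level (hypothetical)
}

def hierarchy_path(component: str) -> list:
    """Walk the "is part of" links upward to the system level."""
    path = [component]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

print(hierarchy_path("CompressorValve"))
# ['CompressorValve', 'Compressor', 'AirCondMod', 'SupportSystem']
```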

2.4.2 Process Domain Ontology Semantics and Their Relationships

The purpose of building the process domain ontology is to further refine the diagnosis decision-making process: it captures the current status of device parts, the operation links, the operation status, and the task steps of the diagnosis decision-making process, and abstracts that process into a semantic ontology. The process domain ontology mainly includes the diagnosis decision-making process, the hardware devices, and the current status of the devices. The diagnosis decision-making process is the hierarchical process that completes fault diagnosis and maintenance decision-making; the hardware equipment comprises the equipment components involved in that process; and the current state of a device describes the state of the system and its modules during the process. As with the device domain ontology, conceptual semantics are applied to hierarchical diagnosis decision-making to obtain the classes and the object and data type properties of the process domain ontology. The specific meanings of the related classes and their object and data type attributes are shown in Tables 2.2 and 2.3.


Table 2.2 Interrelated classes in the process domain ontology and their data type attribute meanings

| Class | Subclass | Data type properties |
|---|---|---|
| Process (diagnosis decision-making process) | Data collect proc | Object name, object value (object parameter) |
| | Info proces proc | Feature name, feature value |
| | Stat monit proc | State name (status name), state value (status value) |
| | Diag proc | Diag strategies, diag results |
| | Deci suppo proc | Mainte actions, expertise |
| | Repres proc | Data display, info display, decis display |
| Equip component | System | System name |
| | Module | Module name |
| Condition | System cond | Cond value |
| | Module cond | Cond value |
| | Proc step cond | Proc step number |
| | Fault cond | Cond value |

Table 2.3 Object attribute definitions of classes in the process domain ontology

| Object properties | Definition domains | Value domains | Interpretation |
|---|---|---|---|
| Happened at | Process | Equip component | Occurs in (a part) |
| Is part of | Equip component | Equip component | Is part of |
| Has part | Equip component | Equip component | Is the part |
| Has condition | Equip component/process | Condition | Condition of (part/process) |

By connecting the classes with object attributes and adding the corresponding data type attributes to the classes, a complete process domain ontology model can be obtained; its structure is shown in Fig. 2.7.

2.4.3 Diagnostic Domain Ontology Semantics and Their Relationships

The main concepts in the diagnostic domain ontology include hardware devices, fault types, failure modes, fault causes, and fault diagnosis methods. By giving these concepts ontological semantics and quantifying them numerically, the classes and the object and data type attributes of the diagnostic domain ontology can be obtained, as shown in Tables 2.4 and 2.5.


Fig. 2.7 Diagnosis decision-making process domain ontology

By connecting classes with object attributes and adding corresponding data type attributes to classes, a complete diagnostic domain ontology model can be obtained, and its ontology structure is shown in Fig. 2.8.
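Once the diagnostic domain ontology is encoded in OWL, lookups such as "which methods diagnose this fault mode" become graph queries. The sketch below is a hedged illustration with rdflib and SPARQL, following the "To diagnosis" property listed in Table 2.5; the URIs and facts are hypothetical.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

DD = Namespace("http://example.org/diag-decision#")  # hypothetical namespace
g = Graph()
g.bind("dd", DD)

# Minimal facts: two diagnosis methods targeting one fault mode (hypothetical).
g.add((DD.Overheat, RDF.type, DD.FaultMode))
g.add((DD.VibrationAnalysis, RDF.type, DD.DiagnosisMethod))
g.add((DD.ThermalImaging, RDF.type, DD.DiagnosisMethod))
g.add((DD.VibrationAnalysis, DD.toDiagnosis, DD.Overheat))
g.add((DD.ThermalImaging, DD.toDiagnosis, DD.Overheat))

# Per Table 2.5, "To diagnosis" runs from a diagnosis method to a fault mode.
q = """
SELECT ?method WHERE {
    ?method a dd:DiagnosisMethod ;
            dd:toDiagnosis dd:Overheat .
}"""
for row in g.query(q, initNs={"dd": DD}):
    print(row.method)
```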

2.4.4 Maintenance Decision Domain Ontology Semantics and Their Relationships

The purpose of building the maintenance decision domain ontology is to support the current maintenance decision by combining the results of fault diagnosis with the experience of previous diagnosis and maintenance, abstracting maintenance experience into semantic maintenance cases to form a knowledge ontology. Like the diagnostic domain ontology, the maintenance decision domain ontology mainly includes hardware equipment, maintenance decision problems, maintenance plans, and feedback. Hardware equipment refers to the parts of equipment that need to be repaired. Maintenance decision problems describe the failure status of hardware devices, including the failure phenomena, the conditions under which the failure occurs, the failure causes, and feature attributes. The maintenance scheme is the solution for repairing the faulty hardware equipment, mainly covering requirements, manpower, tools, and parts. Feedback is the evaluation of whether the system is restored to its original normal state after the maintenance plan is implemented, including evaluations, suggestions, and supplementary instructions.


Table 2.4 Diagnostic domain ontology-related classes and their data type attribute meanings

| Class | Subclass | Sub-subclass | Data type properties |
|---|---|---|---|
| Fault type | Component fault | Module fault | Fault name |
| | | Sub assem fault | Fault name |
| | | Part fault | Fault name |
| | Function fault | – | Fault name |
| | Manmade fault | – | Fault name |
| Equip component | System | – | System name |
| | Module | Module classes within the same device domain ontology | Module name |
| | Subassembly | Subassembly classes within the same device domain ontology | Subassembly name |
| | Part | Part classes within the same device domain ontology | Part name |
| Fault mode | – | – | Fault name |
| Fault feature | – | – | Feature parameter |
| Fault cause | ModLevlFauCause | – | Fault cause content |
| | SubassemLevlFauCause | – | Fault cause content |
| | PartLevlFauCause | – | Fault cause content |
| Fault location | – | – | Location name |
| Fault effect | – | – | Effect level |
| Diagnosis method | Mode-based method | – | Method name |
| | Data-based method | – | Method name |
| | Hybrid method | – | Method name |

Table 2.5 Object attribute definitions for classes in the diagnostic domain ontology

| Object properties | Definition domains | Value domains | Interpretation |
|---|---|---|---|
| Happened at | Fault type | Equip component | Occurs in (parts) |
| Is part of | Equip component | Equip component | Is part of |
| Has part | Equip component | Equip component | Is the part |
| The mode is | Fault type | Fault mode | The mode of |
| The effect is | Fault mode | Fault effect | Effect of |
| To diagnosis | Diagnosis method | Fault mode | Method for (a failure) fault mode |
| Because of | Fault mode | Fault mode | Mode of (part/process) |

By giving the concepts involved in the maintenance decision domain ontology ontological semantics and quantifying them numerically, the classes and the object and data type attributes of the maintenance decision domain ontology can be obtained, as shown in Tables 2.6 and 2.7.


Fig. 2.8 Diagnosis decision-making diagnostic domain ontology

By connecting the classes with object attributes and adding the corresponding data type attributes to the classes, a complete ontology model of the maintenance decision domain can be obtained; its structure is shown in Fig. 2.9.
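The maintenance decision domain described above centers on reusing past maintenance cases to support the current decision. One simple way to realize such reuse is feature-overlap retrieval over stored cases, sketched below; this is a generic case-matching illustration under that assumption, not the authors' algorithm, and the case data is hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Feature-set overlap between a new problem and a stored case."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical maintenance cases: (case name, problem features, solution).
cases = [
    ("case_017", {"overheat", "air_cond_mod", "pressure_drop"}, "replace seal"),
    ("case_042", {"overheat", "avia_power_mod"}, "clean cooling duct"),
    ("case_055", {"vibration", "hydrau_mod"}, "re-balance pump"),
]

# Retrieve the most similar past case for a new maintenance problem.
new_problem = {"overheat", "air_cond_mod"}
best = max(cases, key=lambda c: jaccard(new_problem, c[1]))
print(best[0], "->", best[2])  # case_017 -> replace seal
```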

2.5 Conclusions

The paper focuses on the knowledge representation of hierarchical diagnosis decision-making in complex systems under uncertain conditions. First, through problem analysis, several key problems in hierarchical diagnosis decision-making knowledge representation for complex systems are extracted. Second, combined with the knowledge classification, hierarchical knowledge ontology modeling is realized. Finally, by constructing the semantics and relationships between the global ontology and each core domain ontology, the semantic hierarchical analysis and knowledge association of hierarchical diagnosis decision-making are realized.


Table 2.6 Related classes and their data type attribute meanings in the maintenance decision domain ontology

| Class | Subclass | Sub-subclass | Data type properties |
|---|---|---|---|
| Maintenance problem | Problem phenomenon | – | Phenon name |
| | Problem precondition | – | Condition name |
| | Problem feature | – | Feature parameter |
| | Problem cause | ModLevlFauCause | Fault cause content |
| | | SubassemLevlFauCause | Fault cause content |
| | | PartLevlFauCause | Fault cause content |
| Equip component | System | – | System name |
| | Module | Module classes within the same device domain ontology | Module name |
| | Subassembly | Subassembly classes within the same device domain ontology | Subassembly name |
| | Part | Part classes within the same device domain ontology | Part name |
| Maintenance solution | Maintenance requirement | Tools | Types, name |
| | | Manpower | Types, name |
| | | Parts | Types, name |
| Feedback | Maintenance time | – | Time |
| | Evaluations | – | Contents, mainten plan ID, date |
| | Advices | – | Contents, mainten plan ID, date |
| | Memos | – | Contents, mainten plan ID, date |
| Maintenance case | – | – | Case name, case date |

Table 2.7 Object attribute definitions of classes in the maintenance decision domain ontology

| Object properties | Definition domains | Value domains | Interpretation |
|---|---|---|---|
| Happened at | Mainten problem | Equip component | Appears in (a part) |
| Is part of | Equip component | Equip component | Is part of |
| Has part | Equip component | Equip component | Is the part |
| Refer to | Maintenance problem | Maintenance case | Reference (case) |
| The problem is | Maintenance case | Maintenance problem | Reason of |
| The feedback is | Maintenance case | Feedback | Feedback of implementation |
| The solution is | Maintenance case | Maintenance solution | Solution |


Fig. 2.9 Maintenance decision domain ontology

References

1. Li, J.C.: Decision Making Method and Application Research on Fault Diagnosis and Maintenance of Bayesian Network. University of Defense Science and Technology (2002) (in Chinese)
2. Di, X.L.: Ontology-based Knowledge Modeling for Fault Diagnosis of Rotating Machines. Hunan University of Science and Technology (2011) (in Chinese)
3. Xiang, D.: Research on Multi-domain Knowledge Expression, Acquisition and Application in Product Design. Central China University of Science and Technology (2012) (in Chinese)
4. Carlos, R.G.: Advances in Knowledge Representation. InTech Press (2012)
5. Zhang, Q.: DUCG: a new method of expression and reasoning of dynamic uncertain causal knowledge: discrete, static, evidence determination and directed acyclic graph. J. Comput. Sci. 4(33) (2010) (in Chinese)
6. Edward, D.B.: Edward De Bono's Thinking Course. BBC Consumer Publishing, London, UK (1994)
7. Gruber, T.R.: A translation approach to portable ontology specifications. Knowl. Acquis. 5(2), 199–220 (1993)
8. Gruber, T.R.: Towards principles for the design of ontologies used for knowledge sharing. Int. J. Hum. Comput. Stud. 43(5–6), 907–928 (1995)
9. Fensel, D.: The semantic web and its languages. IEEE Comput. Soc. 15(6) (2000)
10. Fensel, D.: Ontologies: Silver Bullet for Knowledge Management and Electronic Commerce. Springer Verlag (2004)

Chapter 3

Research and Implementation of Hybrid Storage Method for Multi-source Heterogeneous Data of Electric Distribution Network

Junfeng Qiao, Aihua Zhou, Lin Peng, Lipeng Zhu, Sen Pan, and Pei Yang

J. Qiao, A. Zhou, L. Peng, L. Zhu, S. Pan, and P. Yang are with the State Grid Key Laboratory of Information & Network Security, State Grid Smart Grid Research Institute Co., Ltd, Nanjing 210003, Jiangsu, China (e-mail: [email protected]).

Abstract Traditional relational databases struggle to store power grid topology data with complex network relationships efficiently, which severely restricts the application of power grid topology analysis and calculation. The graph database is a newer data management, analysis, and calculation technology based on graph theory. The research covers four aspects. First, the basic requirements of the power grid graph database: its data requirements, functional requirements, and performance requirements. Second, its core technologies: the data description method, data storage technology, data modeling technology, and fast retrieval technology for grid graph data. Third, the development and realization of the main functions: the design of the system framework, the selection of development technology, and the realization of the basic functions. Fourth, testing and application verification of the grid graph database.

Keywords Electric distribution network · Data hybrid storage · Multi-source isomerism · Multi-source heterogeneous data · Distributed new energy



3.1 Introduction

The electric power grid is mainly composed of electrical equipment for power generation, transmission, transformation, distribution, and consumption [1]. The equipment is distributed across a wide geographical space: pole towers, transformers, switch blades, and similar equipment and facilities show a typical point-shaped distribution [2]. Substations, switching posts, ring network cabinets, power plants, and the like resemble urban administrative regions in the real world in exhibiting an areal distribution, but in power-industry analysis and application they are generally abstracted as points in the logical sense, so they too can be treated as a special case of point distribution [3]. Lines (including trunk lines, branch lines, and cables), like streets and rivers in geographical space, are linear distribution elements: they reflect connections between discrete points and, specifically in the power system, the logical association of power equipment and facilities such as towers, substations, power plants, and switches [4]. The electrical equipment and facilities of the power grid therefore show the typical geographical distribution of points and lines in space, and the topology information (the connection relationships between equipment) required in analysis and application can be described as an "incomplete topology", rather than a "complete topology" containing "face" topology information [5].

This paper mainly adopts the method of building the network topology relationship: the network equipment and topology relationship data sets are calculated and stored so as to record the connected topology relationships between the objects in the line data set, finally generating the network data set or modifying and updating an existing one. Constructing the topological relationships between power entities is the basis of topological analysis and decision-making. It often involves operations between different types of data sets and the mutual relationships between objects in topological space, covering five basic topological relationships: adjacent, included, intersected, partially covered, and separated. Because the most important part of the power grid structure is the interconnection between lines, this paper mainly handles the topological relationships of objects in the network line data set while forming the power grid topology, ensuring the topological consistency of the data set and thereby facilitating subsequent query and analysis operations.
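As a concrete illustration of building the connected topology from a line data set, the sketch below snaps coincident line endpoints to shared nodes and derives device connectivity from them. It is a simplified sketch assuming coordinate rounding as the snapping rule; the line data is hypothetical.

```python
from collections import defaultdict

# Each line is (device_id, (x1, y1), (x2, y2)); coordinates are hypothetical.
lines = [
    ("L1", (0.0, 0.0), (1.0, 0.0)),
    ("L2", (1.0, 0.0), (2.0, 1.0)),
    ("L3", (2.0, 1.0), (3.0, 1.0)),
]

def node_key(pt, tol=1e-6):
    """Snap a coordinate to a node key so coincident endpoints share one node."""
    return (round(pt[0] / tol), round(pt[1] / tol))

node_lines = defaultdict(set)  # node -> incident line devices
for dev, p, q in lines:
    node_lines[node_key(p)].add(dev)
    node_lines[node_key(q)].add(dev)

# Two devices are topologically connected if they share a node.
adjacency = defaultdict(set)
for devs in node_lines.values():
    for d in devs:
        adjacency[d] |= devs - {d}
print(dict(adjacency))  # {'L1': {'L2'}, 'L2': {'L1', 'L3'}, 'L3': {'L2'}}
```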

3.2 Related Work

With the geometric growth in the amount of power grid equipment, power grid equipment management has been extended to low-voltage and rural network users [6]. The amount of physical equipment will reach 1 billion, and the amount of logical equipment and historical status data is expected to reach 10 billion [7], far exceeding the original design capacity of 1 billion devices [8]. As a result, traditional databases can barely meet the storage demand for massive topology data.


More professional graph databases are needed to support the storage of large-scale device topology data in the future power grid. It is also necessary to investigate the scale of grid graph data that must be stored and managed, understand the actual state of data management and use, and derive the design capacity of the grid graph database [9]. The data in the power grid graph database comes from relational databases, time-series databases, big data platforms, text, drawings, and so on; the access specifications, interface models, and storage methods of the various data types must be studied, and the source, name, and records of the data sources must be managed effectively [10]. It is likewise necessary to investigate the sources and quality of the current grid graph data, understand the collection, generation, storage, and transmission mechanisms of the actual data sources, and formulate the data format requirements and data interface specifications of the grid graph database.

3.3 Research on the Hybrid Storage Method for Multi-source Heterogeneous Data of the Electric Distribution Network

3.3.1 Characteristics of Electric Distribution Network Data

The processing characteristics of grid topology relations stem mainly from the fact that the object of topology processing is the line data set, so some preprocessing of the circuit diagram is required before the grid topology is built. Switches, transformers, and junction boxes act as the connection points in the electrical relationships between the various objects, so they must be treated on the circuit diagram first. The endpoints of a poly-line or line segment are called nodes, and can be divided into suspended nodes, false nodes, and true nodes. Suspended node: an endpoint of a poly-line or line segment that does not coincide with the endpoint of any other poly-line or line segment. False node: a point where the endpoint of one poly-line or line segment coincides with the endpoint of exactly one other poly-line or line segment. True node: a point where the endpoint of a poly-line or line segment coincides with the endpoints of at least two other poly-lines or line segments, with only one point at the intersection. The significant difference between the three is the number of arcs (poly-lines or line segments) incident at the point: one at a suspended node, exactly two at a false node, and at least three at a true node. Vertex: an intermediate turning point of a poly-line is called a vertex; the endpoint of the poly-line is not a vertex but a node. Suspended line: a poly-line or line segment with a suspended node. In order to generate the corresponding true nodes where there are switches, transformers, or junction boxes when establishing the topology, the line should be broken at the corresponding locations to establish corresponding nodes. Physically crossed lines should likewise be broken into two lines at the intersection to obtain false nodes; otherwise the crossed lines cannot be processed in a way that conforms to reality.

Fig. 3.1 Characteristic extraction of electric distribution network data

As shown in Fig. 3.1, typical graph database applications in the Internet industry are first investigated and the general description methods of graph data are studied; then, starting from the feasibility and basic theory of the graph database, the typical characteristics and data characteristics of the power grid graph are analyzed in depth, the power grid graph is abstracted, and the logic model, topology structure, and metadata model of the power grid graph are extracted.
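The suspended/false/true node distinction above reduces to counting the arcs incident at each endpoint. A minimal Python sketch of that classification (all coordinates hypothetical):

```python
from collections import Counter

# Polyline endpoints, one pair per arc; coordinates are hypothetical.
arcs = [((0, 0), (1, 0)), ((1, 0), (2, 0)), ((2, 0), (3, 1)),
        ((2, 0), (3, -1)), ((4, 4), (5, 5))]

incident = Counter()
for p, q in arcs:
    incident[p] += 1
    incident[q] += 1

def classify(node):
    n = incident[node]
    if n == 1:
        return "suspended node"  # dangling endpoint, only one incident arc
    if n == 2:
        return "false node"      # exactly two arcs meet here
    return "true node"           # three or more arcs: a real junction

for node in incident:
    print(node, "->", classify(node))
```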

3.3.2 Hybrid Storage Method for Multi-source Heterogeneous Data of the Electric Distribution Network

The original data of the large-scale power distribution network is kept on a distributed storage system, so any index built over it must also be distributed. If the original graph data is large, its index files will also be large; in addition, the latency of the distributed environment and of data updates increases the difficulty of index maintenance. Indexing graph data is therefore a hard problem in both storage and maintenance.


Fig. 3.2 Hybrid storage and mapping mode of electric distribution data

From the perspective of purpose and actual effect, indexes fall into two categories. The first is built in the cloud computing environment to support ordinary queries, helping to improve data search efficiency; it is mainly used in distributed graph databases. The second is built to speed up computation and is mainly used in graph processing applications such as shortest-path calculation, PageRank calculation, and clustering analysis. In graph data storage, entities are represented as vertices ("points"), and the relationship between two entities is represented by an edge. As shown in Fig. 3.2, to achieve efficient parallel storage and processing it is essential to reduce the coupling between the sub-data blocks handled by distributed processing, and effective data block segmentation is an important means of decoupling. A logically complete large data block is first divided into several parts placed on the working nodes of the distributed storage system; subsequent processing starts a computing task for each distributed sub-data block, each performing the same operation, and when all sub-data blocks have been processed, the entire large data block has been processed once.

3.4 Implementation of the Hybrid Storage Method for Multi-source Heterogeneous Data of the Electric Distribution Network

The electric data is abstracted into RDF (Resource Description Framework) data composed of many RDF statements. Each RDF statement is a triplet (subject, predicate, object): the subject is the described resource, the predicate names an attribute of the subject, and the object represents the value of that attribute, which can be another resource or a literal. From the perspective of the mixed data model, the subject and object correspond to vertices in the graph and the attribute corresponds to a directed edge in the data set, so an RDF statement corresponds to two vertices and a directed edge from the subject vertex to the object vertex. The mapping relationship is shown in Fig. 3.3.


Fig. 3.3 Mixed storage model mapping mode

As shown in Fig. 3.3, by comparing and integrating the RDF tagging framework with XML language representation, a hybrid storage model mapping method oriented to grid asset management and power equipment association analysis is adopted to solve the representation and storage problems of massive power data. It expresses the power grid data correctly; the resulting data system is simple, clear, and easy to understand; and the expression method is easy for computers to organize, manage, retrieve, and store, which ultimately facilitates the expansion of the expressed data.
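The triple-to-graph mapping just described is easy to demonstrate. The sketch below uses the rdflib library to state one grid fact and read it back as two vertices joined by a directed edge; the namespace and device names are hypothetical.

```python
from rdflib import Graph, Namespace

GRID = Namespace("http://example.org/grid#")  # hypothetical namespace
g = Graph()
g.bind("grid", GRID)

# RDF statement: (subject, predicate, object) = (transformer, connectsTo, line).
g.add((GRID.Transformer_T12, GRID.connectsTo, GRID.Line_L7))

# Each triple corresponds to two vertices and one directed edge subject -> object.
for s, p, o in g:
    print(f"vertex {s} --[{p}]--> vertex {o}")
```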

3.4.1 Design of the Hybrid Storage Framework for Power Distribution Data

In the research on electric data resource description, the data is classified according to the characteristics and application occasions of power grid data: low-density graphs, in which objects have few attributes and few connections between them; medium-density graphs, in which objects have many attributes but are mutually independent, that is, the graph contains no rings; and high-density graphs, in which almost all data objects lie on one or more rings and have many attributes. The RDF resource description framework and the XML language are used to formalize the association feature information of the large-scale graph. Graph data segmentation and indexing technology is then studied: breadth-first search (BFS) and the KL/FM algorithms are planned for heuristic segmentation of the graph, a hash plus feature-tree scheme for indexing it, and a key-value database in the Hadoop framework for storing the graph data.
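A hedged sketch of the BFS-based heuristic segmentation mentioned above: each partition is grown by breadth-first traversal until it reaches a size cap, then the next partition is started. The KL/FM refinement step is omitted, and the toy graph and cap are hypothetical.

```python
from collections import deque

def bfs_partition(adj: dict, cap: int) -> list:
    """Greedy BFS partitioning: fill each block with up to `cap` vertices."""
    unassigned, blocks = set(adj), []
    while unassigned:
        seed = next(iter(unassigned))       # start a new block from any free vertex
        block, queue = set(), deque([seed])
        while queue and len(block) < cap:
            v = queue.popleft()
            if v in unassigned:
                unassigned.discard(v)
                block.add(v)
                queue.extend(n for n in adj[v] if n in unassigned)
        blocks.append(block)
    return blocks

# Toy feeder graph: vertex -> neighbors.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
print(bfs_partition(adj, cap=3))  # e.g. [{1, 2, 3}, {4, 5, 6}]
```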


Fig. 3.4 Fusion of electric distributed data

As shown in Fig. 3.4, the hybrid storage method for electric power data first develops description methods for structured, semi-structured, and unstructured power data resources, including RDF tags and key-value forms, and studies how to realize the storage of this power data model in a distributed environment. On this basis, the division algorithm, index algorithm, and storage mode of power data are studied, followed by the algorithm for recovering the original data through data block reconstruction and merging. The load balance of the data nodes is considered so as to complete efficient query and analysis of large-scale, multi-type power data. Although there are many electrical devices in the power grid, their types are limited: common devices in the grid logic data include buses, lines (including cables), transformers, and switch connectors. The buses and lines can be abstracted as edges, while the switch connectors, transformers, and similar devices can be abstracted as vertices.

3.4.2 Implementation of Hybrid Storage Management System for Power Distribution Data

In this paper, an electric grid data block division method based on region division is proposed. Based on an association analysis algorithm, the distributed multi-source heterogeneous data of the grid is divided into mixed partitions. Taking full account of the equalization requirements of grid equipment in a specific region, a multi-scale quad-tree grid of the storage space is quickly constructed in an adaptive way. First, all power equipment is numbered in a global order and all equipment identifications are replaced; in this way, the equipment identification in the power data can be replaced by a globally ordered continuous number, and all equipment numbers can then be divided according to the hybrid division algorithm of the stored data. On this basis, two data block division and integration methods are used together to achieve efficient division of power grid data and database construction. Building on the grid data resource division technology, large-scale grid data resources are stored on multiple storage nodes in the form of partitions, with full consideration given to load balancing and storage redundancy, achieving distributed hybrid efficient storage of grid data.

As shown in Fig. 3.5, since the reconstruction of sub-data blocks is a parallel process, the complexity of the algorithm depends directly on the time cost of the slowest map and reduce node, which is positively related to the size of the data received by that node. The data received by the map nodes is equal, so no balancing is needed there. Newly added data, however, must be distributed evenly among multiple storage nodes through a load balancing strategy. This paper studies an improved FCFS (First Come First Served) strategy that puts all storage tasks for newly added data into a single operation control queue. According to the importance of each storage task, a priority parameter is configured for each operation. When the system selects the next storage operation to execute, it scans the operation queue and chooses the operation according to its submission time and priority (see the sketch following this discussion). In this way, the response time of high-importance operations is greatly improved, and new storage tasks are not allocated to machines whose data storage capacity is saturated, thus realizing load balancing across the data storage nodes.

Fig. 3.5 Electric distribution network data fusion

As described above, the storage processing of large-scale mixed data in the database involves many aspects of the storage mode. Current research focuses on the following aspects: (1) Large-scale data segmentation. The segmentation of large data blocks in the graph database needs to improve connectivity within each subgraph, reduce connectivity between sub-data blocks, maintain balance of data size and graph topology between sub-data blocks, and keep time complexity low. A good data block segmentation algorithm is the basis for reducing the strong coupling of data parallel computing in the cloud computing environment; however, current segmentation technology struggles to achieve good performance in connectivity, balance, and time complexity simultaneously. (2) Large-scale data index structure. Although large-scale data management can improve efficiency by relying on a distributed parallel processing mechanism, adding indexes undoubtedly improves management efficiency further. Indexing mechanisms for distributed data have reached product form but are still being explored and researched. Index research in the cloud computing environment has been carried out for only part of data processing, such as shortest-path computing, and quite a few data processing methods have not considered an index mechanism at all. (3) Query processing and disk storage. Large-scale data query processing in the database is still developing, with much room for performance improvement, and complex applications such as sub-data block mining and data pattern matching queries have not been well solved. In addition, large-scale graph processing systems based on the BSP model, such as Pregel, Hama, and Giraph, are currently memory-based [7], which limits the scale of data processing; integrating hard disk storage into the BSP model and optimizing disk IO remains necessary.
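The following is a minimal, hedged sketch of the improved FCFS queue described above: one global operation queue ordered first by priority and then by submission time, dispatching to non-saturated storage nodes. The priority scale, node representation, and dispatch rule are illustrative assumptions, not the system's actual implementation.

```python
# Minimal sketch of the improved FCFS strategy: one global queue ordered by
# (priority, submission order), dispatching only to non-saturated nodes.
import heapq
import itertools

class StorageTaskQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker preserving FCFS order

    def submit(self, task_id, priority):
        # Lower number = more important; equal priorities fall back to FCFS.
        heapq.heappush(self._heap, (priority, next(self._seq), task_id))

    def dispatch(self, nodes):
        if not self._heap:
            return None
        _, _, task_id = heapq.heappop(self._heap)
        # Skip saturated machines so new tasks never land on full nodes.
        candidates = [n for n in nodes if n["used"] < n["capacity"]]
        if not candidates:
            return None
        target = min(candidates, key=lambda n: n["used"] / n["capacity"])
        return task_id, target["name"]

q = StorageTaskQueue()
q.submit("bulk-import", priority=5)
q.submit("realtime-measurement", priority=1)   # served first despite arriving later
nodes = [{"name": "dn1", "used": 70, "capacity": 100},
         {"name": "dn2", "used": 20, "capacity": 100}]
print(q.dispatch(nodes))                       # ('realtime-measurement', 'dn2')
```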

3.5 Application Results

This paper mainly studies hybrid data storage technology for the power grid. With the rapid development of the power system, the data generated within it will grow at an unprecedented rate, and this massive data poses a great challenge to the storage mechanism of the entire power system. How to store power system data effectively and reasonably is therefore a key point of this project. In this project, a database model based on hybrid data management is built for the distributed multi-source heterogeneous data of the power grid. The database storage system stores data with the device as the unique identifier. By taking advantage of the physical "pointing" characteristics between nodes, it can provide "index-free" association operations for adjacent nodes. It supports real-time, dynamic status updates, which effectively improves the efficiency of data storage and processing. At the same time, data segmentation and integration technology is applied to the multi-source heterogeneous data: the whole system model is divided into sub-data modules, and through data load balancing, large-scale power data is stored across multiple storage nodes in a distributed manner, realizing distributed storage and effectively solving the problem of mixed storage of massive multivariate data.

The traditional data storage method uses different databases to store different types of data, which creates a lot of redundancy and wastes a lot of storage space. The research results of this paper greatly improve the data storage efficiency of traditional storage methods and reduce data redundancy by more than 40%. The reduction of data redundancy can provide good support for future data analysis applications and an important database for the further development of the power business. Next, research on hybrid storage of electric power data will explore an optimal, efficient storage scheme for the grid graph database to support storage requirements of grid nodes at the 100-billion level and edges at the 10-billion level. There are various logical data connection relationships between components in the power grid, such as the connections between lines and switches and between transformers and lines. These topological relationships, together with the various electrical components, constitute a complex power grid, which brings greater challenges to future data storage.

Acknowledgements This work was supported by State Grid Corporation of China's Science and Technology Project (5400-202258431A-2-0-ZN), "Research on deep data fusion and resource sharing technology of new distribution network".

References

1. Tu, Z.J., Mao, Y.C., Wu, M.B., Chen, Y.: Adaptive parameter tuning of power data storage system based on reinforcement learning. Power Syst. Autom. 46(4), 112–122 (2022). (in Chinese)
2. Zeng, F., Yang, X., Su, W., Xiao, X.L., Yi, W.F.: Power data storage and sharing method based on blockchain and data lake. Jiangsu Electr. Eng. 41(3), 48–54 (2022). (in Chinese)
3. Lin, S.H., Li, J.X., Zeng, Z.X.: Research on privacy of grid cloud storage data. Commun. World 27(9), 2 (2022). (in Chinese)
4. Fang, Y., Zhang, L., Sheng, J.Q., Zhang, Y.C., Wang, H.Y.: Design of automatic encryption system for power data storage based on microservice architecture. Autom. Instrum. (1), 189–192+196 (2022). (in Chinese)
5. Wan, C., Wei, L.H., Yang, Q.Y., Yang, C.Y., Su, H.Q.: Research on metadata integrated data storage strategy of power grid industry. Microcomput. Appl. 37(1), 26–28+32 (2021). (in Chinese)
6. Pourbabak, H., Chen, T., Zhang, B., Su, W.: Control and energy management system in microgrids. arXiv preprint arXiv:1705.10196 (2017)
7. Dymora, P., Paszkiewicz, A.: Performance analysis of selected programming languages in the context of supporting decision-making processes for industry 4.0. Appl. Sci. 10(23), 8521 (2020)
8. Wang, L.Y., Fu, H., Yang, Y., Liu, J.: Blockchain based power marketing data storage mechanism. J. Chongqing Univ. 44(8), 156–164 (2021). (in Chinese)
9. Huang, L.S., Zhang, F., Song, K., et al.: Power timing data storage system based on HBase. Sci. Technol. Innov. (11), 101–102 (2021). (in Chinese)
10. Zeng, C.Y., Chen, G., Pang, X.J., Wu, J.Y., Hu, H.S.: Application of data storage technology in security audit platform of power monitoring system. Digit. Technol. Appl. (8), 143–145 (2022). (in Chinese)

Chapter 4

Research on Measuring Method of Ball Indentation Based on MATLAB Image Edge Detection

Yuan Zhou, Xin Wang, Ya Liu, Xiaoyuan Wang, Zheng Li, and Yanying Wang

Abstract The ball pressure test is an important test for evaluating the heat resistance of non-metallic materials. At present, most tests are judged by manual measurement, with low accuracy. In this paper, MATLAB digital image measurement technology is used to transform and enhance the collected indentation image. The Gaussian Laplace operator detects the edge of the image; for each element point on the image edge, the maximum coordinate difference to the other element points is found, all element points on the edge are traversed, and each maximum is placed into a maximum value array. The maximum of this array then gives the diameter of the ball pressure indentation image, improving measurement accuracy and realizing intelligent measurement of the indentation diameter.

Keywords Ball pressure · MATLAB · Image measurement · Indentation · Edge detection

4.1 Introduction

In many safety standards for electrical and electronic products, the ball pressure test is an important method for assessing the heat resistance of non-metallic materials, and whether a material is qualified is determined from the indentation diameter. It is therefore important to study the ball pressure indentation measurement method and explore how to measure the indentation diameter accurately [1]. MATLAB provides powerful data processing and graphic display functions; its highly flexible programmability gives it advantages in data processing that other software lacks, making it an efficient development platform for numerical calculation and graphic display [2].

Y. Zhou (B) · X. Wang · Y. Liu · X. Wang · Z. Li · Y. Wang
Shandong Institute of Inspection On Product Quality, Jinan 250102, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_4


In this paper, a method and system for measuring spherical indentation based on MATLAB image edge detection are proposed. The acquired indentation image is transformed and enhanced using MATLAB digital image measurement technology. The image edge is detected by the Gaussian Laplace operator; for each element point on the image edge, the maximum coordinate difference to the other element points is found and placed into a maximum value array, and the maximum of this array gives the diameter of the ball pressure indentation image, thus improving measurement accuracy and realizing intelligent measurement of the diameter in ball pressure detection.

4.2 Indentation Measurement

Whether the ball pressure test passes is determined by the indentation diameter, which is the direct criterion for judging whether the material is qualified. GB/T 5169.21–2017 puts forward clear requirements: the material is qualified if the indentation diameter does not exceed 2 mm [3]. Theoretically, the indentation diameter is the maximum distance between two tangent points, the tangent point being where the spherical surface at the end of the ball pressing device is geometrically tangent to the indentation concave surface. The determination of the indentation boundary therefore determines the indentation diameter measurement [4]. The following measurement methods are used in actual operation: (1) From the perspective of certification, the most severe method is adopted: the maximum indentation is taken as the final result, and the actual indentation must be within this maximum value. (2) Apparent boundary method: cover the indentation surface with a thin layer of white paper and gently trace the boundary on the paper with a pencil, or paint the indentation with colored ink or color it with a fluorescent agent; then measure the maximum and minimum values with a reading microscope and take the intermediate value as the result. In practice this method depends greatly on the operator's personal experience: results vary between operators, and there are artificial uncertainties in the measurements. (3) Cutting method: the indentation is cut transversely, and the indentation section is measured to calculate the diameter. This method places high requirements on the cutting means and tools; improper cutting damages the indentation shape and affects the test results. (4) Image method: the indentation image is acquired by a camera, and the image is used as the carrier for detection and transmission, in order to extract useful signals from it. With the development of computer technology, accurate measurement through automatic computer measurement of indentation size becomes possible.


(5) Equipment such as hardness testers that directly and accurately image and read indentation diameter data in real time. However, such equipment is generally expensive, cumbersome, and rarely applied. (6) Reflection method: parallel light irradiating a concave mirror is focused at the focal point, and the contact area between the ball indentation and the pressure ball can be regarded as an ideal concave mirror. Under sufficiently strong illumination, the indentation appears very bright due to this condensing property, revealing the actual indentation to be measured. At present, most measurements are made under an optical microscope or with a calibrated magnifying glass. Because different measuring personnel have different habits of operating equipment or different measuring methods, large errors are inevitable. At the same time, measurement takes too long and consumes substantial human resources. When reading measurement data with an indentation value near the critical value, higher demands are placed on the tester [5]. In actual tests, the indentation boundary is usually fuzzy, and the tester cannot accurately determine the starting point of the indentation, which easily causes deviations in the indentation reading. Therefore, the accuracy of the existing measurement methods is low, and measurements by different personnel suffer from human error and subjectivity.

4.3 Edge Detection

An edge is where the gray level of surrounding pixels shows a step change and the gray value is discontinuous; edge detection is an image segmentation method based on such amplitude discontinuity [6]. Generally, the edge of an image is difficult to observe with the naked eye. The objective of edge detection is to extract the feature points where parameters change in the image, the boundary representing a discontinuity at a local position. Edge detection helps select regional feature information and enables more accurate image segmentation [7]. Image edge detection finds the boundary lines between different objects in an image through an algorithm and is mainly divided into filtering, enhancement, detection, and positioning. Filtering reduces the influence of noise on edge detection; enhancement strengthens pixels with obvious changes; detection finds image edges by setting a threshold; positioning roughly determines the position of the edge. In practice, perfectly accurate edges are not obtained. To get a clear indentation boundary, existing methods usually use edge detection operators, which include operators based on the first derivative and operators based on the second derivative. First-derivative operators include the Roberts, Sobel, and Prewitt operators; second-derivative operators include the Laplacian operator and the LOG operator, while the widely used Canny operator is based on first-derivative gradients [8–14]. Most image processing methods for ball pressure detection use the above edge detection operators to find the image edge, but the currently known methods stop at edge finding: measurement of the image diameter still relies on manual work, so measurement accuracy cannot be ensured and an intelligent effect cannot be achieved.

4.4 Technical Scheme

4.4.1 Measurement Steps and Methods

The measurement method of ball pressure indentation based on MATLAB image edge detection proposed in this paper includes the following steps: (1) obtain the image of the ball pressure indentation; (2) preprocess the image; (3) apply the Gaussian Laplace operator to the preprocessed image to obtain the image edge; (4) for each element point on the image edge, find the maximum value of the difference between its coordinates and the coordinates of the other element points, traverse all element points on the edge, and put each obtained maximum into the maximum value array; (5) from the obtained maximum value array, obtain the diameter of the ball indentation image. The measurement process is shown in Fig. 4.1; a rough Python equivalent of steps (2)–(5) is sketched after the implementation details below. The specific implementation adopts the following methods: (1) Acquire an image: fix the distance between the camera and the ball pressure indentation and photograph it to obtain the indentation image. Because cameras have different focal lengths, indentation images taken by different cameras display different image proportions, which affects the design of the algorithm. To overcome this effect, the camera type is fixed: only one camera is used to capture the images, and the shooting distance is fixed. (2) Image preprocessing: first, the imread function reads the image information; then the rgb2gray and im2bw functions grayscale and binarize the image. The purpose is to convert the color image into a digital matrix so that the point coordinates of the image matrix can be calculated numerically.


Fig. 4.1 Flow chart of measurement method

(3) Edge detection: the Gaussian function is used to low-pass filter the binarized gray-scale image, and the Gaussian Laplace operator is then applied to find the boundary of the image. The Laplace operator is a high-pass filter, the sum of the second-order partial derivatives of the image gray function in two perpendicular directions. For a discrete digital image, the second-order difference of the image gray level directly replaces the second-order partial derivative of the continuous case. The Laplace operator is very sensitive to noise, and pseudo edge responses often occur when extracting edges. To overcome this shortcoming, low-pass filtering is applied to the digital image to suppress noise; the Gaussian function is a good normalized low-pass filter and can reduce the impact of noise. On this basis, the Laplacian operator extracts the edges; the combination is called the Gaussian Laplacian, also known as the LOG (Laplacian of Gaussian) operator. The LOG operator is a second-order derivative operator that operates on two-dimensional functions and produces a steep zero crossing at an edge. It is a linear, time-invariant operator, and the origin of its transfer function in the frequency domain is zero, so the Laplace-filtered image has zero average gray level. The LOG operator first uses a Gaussian low-pass filter to smooth steep edges in the image and finally binarizes at the zero gray value to generate closed, connected contours, eliminating all internal points; its detection accuracy is significantly improved. (4) Record the coordinates of the edge points: after Gaussian Laplace processing, find the maximum value of the difference between the coordinates of each element point on the image edge and the coordinates of the other element points, traverse all element points on the edge, and put each obtained maximum into the maximum value array, as follows.


Take any point on the edge of the image as the starting point, compute the difference between its coordinates and the coordinates of each remaining point, take the absolute value of each difference, compare the absolute values, find the largest one, and put it into the maximum value array; then traverse all element points on the edge, putting each point's largest absolute difference into the array. For example, suppose there are 10 element points on the image edge obtained after Gaussian Laplace processing. First, one of the 10 points is taken at random as the starting point, denoted d1, and the remaining 9 points are denoted d2–d10. The coordinates of d1 are compared with those of d2–d10 in turn, and the nine absolute differences, denoted C2–C10, are obtained. These absolute values are compared and the largest, say C3, is put into the maximum value array; thus for d1, the maximum absolute difference C3 is placed in the array. Then all points on the edge, i.e., d2–d10, are traversed in the same way, and the maximum absolute difference of each of d2–d10 is put into the array. In this way, the complete maximum value array is obtained. (5) Calculate the diameter: based on the obtained maximum value array, its maximum value is the diameter of the ball indentation image.
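The following is a hedged Python re-creation of steps (2)–(5). The original pipeline is in MATLAB (imread/rgb2gray/im2bw), so the OpenCV/SciPy library choices, the Otsu threshold, and the LOG edge criterion here are our assumptions; interpreting the "maximum coordinate difference" as the Euclidean distance between edge pixels is also our reading.

```python
# Sketch of the indentation-diameter pipeline: preprocess, LoG edges,
# then the per-point maximum pairwise separation (diameter in pixels).
import cv2
import numpy as np
from scipy import ndimage

def indentation_diameter(path, sigma=2.0):
    # Step (2): read, grayscale, binarize (Otsu stands in for im2bw's level).
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Step (3): Laplacian of Gaussian; a simple magnitude threshold stands in
    # for true zero-crossing detection (an assumption of this sketch).
    log = ndimage.gaussian_laplace(binary.astype(float), sigma=sigma)
    ys, xs = np.nonzero(np.abs(log) > 0.1 * np.abs(log).max())
    pts = np.stack([xs, ys], axis=1).astype(float)

    # Step (4): for each edge point, the largest separation to any other point.
    max_array = []
    for p in pts:
        max_array.append(np.sqrt(((pts - p) ** 2).sum(axis=1)).max())

    # Step (5): the array's maximum is the indentation diameter, in pixels.
    return max(max_array)
```

The double loop is O(n²) in the number of edge points, which is acceptable for a single indentation contour.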

4.4.2 Simulation Result

The original image in Fig. 4.2 below is simulated in MATLAB. After graying and binarization it is shown in Fig. 4.3, and after edge extraction by the Gaussian Laplace operator it is shown in Fig. 4.4. The horizontal line in Fig. 4.2 was drawn manually during visual measurement, and the manually measured diameter is 2.152 mm. With the ball pressure indentation measurement method based on MATLAB image edge detection proposed in this paper, after gray-scale and binary processing (see Fig. 4.3) and edge extraction by the Gaussian Laplace operator (see Fig. 4.4), the diameter corresponding to Fig. 4.2 is 2.1514 mm, which realizes intelligent detection and improves measurement accuracy. The example shows that this method is feasible and effective.


Fig. 4.2 Original image

Fig. 4.3 Diagram after pretreatment

Fig. 4.4 Diagram after Gaussian Laplace operator processing


4.5 Concluding Remarks

The ball pressure test seems simple, but it is not. To improve measurement efficiency, accuracy, and automation, studying intelligent indentation measurement technology is of great practical significance. The ball pressure indentation measurement method based on MATLAB image edge detection proposed in this paper has the following beneficial effects: (1) After the image is grayed and binarized, the Gaussian function first low-pass filters the binarized gray-scale image, and the Laplace operator then extracts the edge, which improves the accuracy of edge extraction. (2) An innovative method is proposed: find the maximum value of the difference between the coordinates of each element point on the image edge and the coordinates of the other element points, traverse all element points on the edge, put each obtained maximum into the maximum value array, and take the maximum of that array as the diameter of the ball indentation image. In the current prior art, when the tester reads the indentation value of an image with a fuzzy indentation boundary, manual visual inspection cannot accurately determine the indentation starting point, resulting in deviation and low accuracy of the reading. Through an example, it is verified that the intelligent method proposed in this paper greatly improves the measurement while eliminating the human error and subjectivity caused by manual measurement and reading. The measurement results are more objective and accurate and have value for popularization and application.

Acknowledgements This paper is supported by the scientific research reserve project of Shandong Quality Inspection Institute, "Research on indentation detection method of ball pressure test based on MATLAB" (2019ZJY007).

References

1. Yang, J., Zhao, Y.K.: Discussion on the new method of indentation size measurement in ball pressure test. Electron. Test. 19, 45–46 (2013). (in Chinese)
2. Liu, Y.L., Zuo, Y., Tang, C.R.: Design of indentation diameter measurement software based on mixed programming of MATLAB and VB. Meas. Technol. 2009(2), 39–41 (2009). (in Chinese)
3. GB/T 5169.21–2017: Fire hazard test for electrical and electronic products, Part 21: abnormal hot bulb pressure test method. China Standards Press (2018). (in Chinese)
4. Xie, Y.T., Chen, M., Xia, Z.Y.: Analysis of test technology of ball pressure test. Low Volt. Apparatus 15, 55–58 (2013). (in Chinese)
5. He, D.S.: Discussion on improvement of indentation measurement technology based on ball pressure comparison test. Metrol. Test. Technol. 2, 3–6 (2014). (in Chinese)
6. Shen, J., Du, Y.R., Gao, H.J.: Research on image edge detection technology. Inf. Technol. 12, 32–34 (2005). (in Chinese)
7. Yang, C.X., Li, M., Li, Y.J.: Analysis of edge detection algorithm based on MATLAB. China Educ. Technol. Equip. 10, 39–44 (2020). (in Chinese)
8. Hong, T.P.: Research on edge detection algorithm of color stone image based on MATLAB. Inf. Commun. 3, 63–64 (2020). (in Chinese)
9. Wang, Y., Liao, J.B., Ji, L.J., et al.: Image measurement of indentation diameter based on image segmentation technology. Autom. Meas. Control. 2, 72–73 (2007). (in Chinese)
10. He, Y.: Image edge detection and algorithm analysis based on Python. Comput. Prod. Circ. (6), 147 (2018). (in Chinese)
11. Wang, H.: Application of Python language in gray image edge detection. Electron. Compon. Inf. Technol. 7, 25–28 (2019). (in Chinese)
12. Mei, Y.Y.: Research on image edge detection algorithm based on Python. Sci. Technol. Innov. 31, 123–124 (2017). (in Chinese)
13. Zhang, Y., Rockett, P.I.: The Bayesian operating point of the Canny edge detector. IEEE Trans. Image Process. 15(11), 3409–3416 (2006)
14. Basu, M.: Gaussian-based edge-detection methods: a survey. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(3), 252–260 (2002)

Chapter 5

Multi-feature Extraction of Mineral Zone of Tabling Through Deep Semantic Segmentation

Huizhong Liu and Keshun You

Abstract Existing segmentation algorithms for mineral zone images can only extract the boundary of the concentrate zone or the separation point of the mineral zone. To obtain fuller and more productive feature information from the mineral zone of the Tabling's separation, deep semantic segmentation models based on DeepLab, U-net, and Xception are constructed. Image datasets of industrial Tabling separation are collected and marked, the corresponding mineral zone image dataset is constructed, and the training and test sets are distributed in a certain proportion and imported into the deep semantic models for training. The training results of these models are compared, and the segmentation of the mineral zone images is evaluated: DeepLab v3+ achieves the highest accuracy, 0.9943, and a mean intersection over union value of 0.989. DeepLab v3+ is therefore adopted as the model for image feature segmentation of the Tabling's mineral zone. Through the corresponding image processing and feature extraction operators, the effective multi-scale features of the Tabling's mineral zone can be extracted well.

Keywords Automation and intelligence of tabling · Tabling's mineral zone · Deep semantic segmentation model · Effective multi-scale features

H. Liu (B) · K. You Jiangxi University of Science and Technology, Ganzhou 341000, China e-mail: [email protected] K. You e-mail: [email protected] H. Liu Research Center of Jiangxi Mining and Metallurgical Mechanical and Electrical Engineering Technology, Ganzhou 341000, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_5


5.1 Introduction

In the beneficiation of tungsten ore, the Tabling is a critical and irreplaceable piece of gravity beneficiation equipment. At present, the number of beneficiation Tablings in use is huge, reaching hundreds of thousands of sets according to incomplete statistics [1, 2]. When mineral pulp is fed onto the surface of the Tabling, the slurry particles are loosened and delaminated by the action of the lateral flushing water and the differential frictional force of the surface. Particles with different specific gravities are distributed at different positions, and a distinctive mineral zone is created by the color contrast between them [3, 4]. In the past, Tabling operations were completed by experienced technicians. By observing the mineral zone, a technician could judge the suitability of the Tabling's state and the interception point of the concentrate, and adjusted the interception point and the water flow rate to control concentrate grade. Because operators' experience and technique differ, mistakes in concentrate interception happened frequently, causing changes in the process indicators; ultimately there was no guarantee that the equipment was in its optimal separation state. Nowadays, MV (machine vision) technologies are used instead of manual identification of the mineral zone to realize automatic concentrate interception, with quite good results. For example, color image segmentation algorithms intelligently optimized by krill herd and improved firefly algorithms have been proposed to obtain optimized thresholds [5–7]. MV technologies have been adopted by the Beijing General Research Institute of Mining and Metallurgy (BGRIMM) for diagnosis and evaluation of the mineral zone and automatic concentrate access on the Tabling [8–10]. Object detection technology can only acquire the coordinate information of the mineral separation point [11]; obtaining the geometric characteristics of the entire concentrate zone and intelligently learning the status of the separation process is impossible with it. Since the MV technology from BGRIMM obtains only the position of the object point, the known mineral zone features are one-sided, and the full information of the mineral zone features cannot be mastered. Deep learning has been increasingly applied to the identification of minerals and has already achieved considerable results [12–14]. With the rapid growth of DCNNs, more and more advanced MV algorithms are being used in industry, providing better solutions for mineral processing intelligence [15, 16]. In view of the low intelligence level of current Tabling equipment and the defects of existing mineral zone image recognition technology, deep semantic segmentation algorithms are used here to segment the mineral zone. Deep segmentation models based on DeepLab v1, U-net, DeepLab-xception, and DeepLab v3+ are constructed. Through observation and calculation of the segmentation effects of the four models, we determine that DeepLab v3+ has the best segmentation effect and the highest Miou. Besides the coordinate position information of the concentrate zone, the DeepLab v3+ algorithm can also obtain the lengths of the borderlines and the area of the mineral zone, which are rich and meaningful features. The more comprehensive the extracted geometric features of the mineral zone are, the stronger the machine learning system's ability to learn the separation state will be. With the mineral zone features as input and the separated concentrate grade and recovery rate as output, a credible machine learning model can be trained. This study lays a solid foundation for real-time detection and smart control of the mineral separation of the Tabling.

5.2 Process and Method

In a tungsten ore processing plant, multistage videos of the mineral zone in the Tabling process are collected, and a representative segment of approximately one minute is used as the practical dataset.

5.2.1 Acquisition and Production of Datasets

In terms of data continuity and representativeness, using a video file as the experimental data is preferable to using many individual images. Frames are extracted from the videos as images, with one image acquired every 10 frames, resulting in approximately 2000 images. In this study, 1500 images are used as the training set, 450 as the verification set, and 50 as the test set. Deep segmentation models require considerable preprocessing of the image data, such as creating training set mask labels and enhancing the image data, to suit the input of each model. To reduce memory usage while the detector is operating, the images are resized to 700*395. For some models, the input image is flipped horizontally, flipped vertically, rotated, mirrored, and cut. The ground truth image's mask label must be created prior to such processing, as shown in Fig. 5.1.

Fig. 5.1 Training set data label making


5.2.2 Construction of Deep Segmentation Model

For semantic segmentation, each pixel in an image must be classified, for example using deep convolutional neural networks or pixel-level decision trees [17]. Deep learning, now the dominant approach to image segmentation, is currently the best classification method in machine learning. Deep segmentation models range from fully convolutional networks in 2014 [18] to SegNet [19], U-net [20], and Dilated Convolutions [21], to DeepLab (v1 & v2) [22, 23] and PSPNet in 2016 [24], and to DeepLab v3 in 2017 [25]; the scores on the VOC2012 dataset have been rising steadily. The goal of DeepLab is to assign semantic labels such as "people," "dogs," and "cats" to each pixel of the input image [26]. DeepLab v1 combines atrous convolution and a DCNN for semantic segmentation, as illustrated in Figs. 5.2 and 5.3 as well as Table 5.1. Building on DeepLab v1, DeepLab v2 effectively segments objects using atrous spatial pyramid pooling (ASPP). DeepLab v3 uses cascaded or parallel multi-scale atrous convolutions to capture multi-scale background pixels and augments ASPP with image-level features. In DeepLab v3+, a simple and efficient decoder module is added to improve the segmentation results of DeepLab v3. As shown in Fig. 5.3, the advanced DeepLab v3+ model continues to improve the architecture of DeepLab v3: to integrate multi-scale features, an Encoder-Decoder structure is introduced, which arbitrarily controls the resolution of the encoder to balance accuracy and time consumption through atrous convolution [27]. There are numerous new convolutional neural network design techniques, as well as numerous variations of U-net; such methods nevertheless keep the main ideas of U-net while adding new modules. As shown in Fig. 5.4, U-net with its U-shaped architecture needs fewer training images while still maintaining segmentation accuracy. It uses a fully convolutional neural network: the left side is a network for feature extraction and the right side is for feature fusion, and finally two convolution operations generate a feature map. A DeepLab network architecture with Xception-41 was also constructed. As shown in Fig. 5.5, the Xception architecture consists of entry, middle, and exit flows; it is an improvement of the Inception-V3 network, adding depthwise separable convolutions and residual connection modules [28]. We define the DeepLab network with Xception-41 as DeepLab-xception. Table 5.1 describes the development of the DeepLab network architecture; DeepLab v3+, which uses Xception-65, is different from DeepLab-xception.

Fig. 5.2 DeepLab (v1 & v2) architecture

Fig. 5.3 DeepLab v3+ architecture

5.2.3 Training Results of Deep Segmentation Model

The MATLAB2020b and Python3.8.8 programming environments are used to construct the deep segmentation models, set the training parameters, and sort out the relevant training data. Table 5.2 lists the hardware and software employed in the experiment. From the prepared dataset, 1500 images of 395*700*3 are imported into the DeepLab v3+ model for training, and the entire training process is completed in MATLAB2020b. The initial learning rate is set to 0.0001, the maximum number of epochs is 20, and the minimum batch size is 8. Overfitting is avoided by employing L2 regularization.
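As a hedged illustration of this training setup (learning rate 1e-4, 20 epochs, batch size 8, L2 regularization via weight decay), the PyTorch sketch below uses torchvision's DeepLabv3-ResNet50 as a stand-in for the authors' DeepLab v3+ (Xception-65), and random tensors as a stand-in for the mineral zone dataset; none of it is the authors' actual code.

```python
# Sketch of the stated training configuration with placeholder data/model.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights=None, num_classes=2)
images = torch.randn(16, 3, 395, 700)          # placeholder images
masks = torch.randint(0, 2, (16, 395, 700))    # placeholder 2-class labels
loader = DataLoader(TensorDataset(images, masks), batch_size=8, shuffle=True)

criterion = nn.CrossEntropyLoss()
# weight_decay adds the lambda * sum(w_j^2) penalty of Eq. (5.1) to E_in.
optimizer = optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)

model.train()
for epoch in range(20):
    for x, y in loader:
        optimizer.zero_grad()
        logits = model(x)["out"]               # (N, 2, H, W)
        loss = criterion(logits, y)            # the E_in part of the loss
        loss.backward()
        optimizer.step()
```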


Table 5.1 Development of DeepLab architecture

            DeepLab v1    DeepLab v2    DeepLab v3    DeepLab v3+
Backbone    VGG16         ResNet        ResNet        Xception-65
Module      Atrous Conv.  Atrous Conv.  Atrous Conv.  Atrous Conv.
            CRF           CRF           ASPP+         ASPP+
                          ASPP                        En-Decoder

Fig. 5.4 U-net architecture

L = E_in + λ ∑_j w_j²  (5.1)

where L is the training loss, E_in is the training error without the regularization term, λ is the regularization parameter, and w_j are the training weights. As shown in Fig. 5.6, the image is divided into object pixels (the mineral zone area, 'StopSign') and background pixels (marked 'Back'). Analyzing the target and background pixel counts yields class training weights, which are used in the model output layer to speed up convergence. Then, the DeepLab v1, U-net, and DeepLab-xception models are constructed in Torch1.9.0 under Python3.8.8. All datasets labeled with LabelMe are imported into these models for training and supervised learning.
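As a hedged illustration of turning the Fig. 5.6 pixel ratio into class weights, the sketch below uses inverse-frequency weighting; the weighting scheme and the synthetic example mask are our assumptions, not the authors' exact procedure.

```python
# Inverse-frequency class weights from background/target pixel counts.
import numpy as np
import torch
from torch import nn

def inverse_frequency_weights(masks, num_classes=2):
    counts = np.bincount(
        np.concatenate([m.ravel() for m in masks]), minlength=num_classes)
    freq = counts / counts.sum()
    w = 1.0 / freq                     # the rarer class gets a larger weight
    return torch.tensor(w / w.sum(), dtype=torch.float32)

demo_mask = np.zeros((395, 700), dtype=np.int64)   # 0 = 'Back'
demo_mask[100:200, 200:500] = 1                    # 1 = 'StopSign' (mineral zone)
weights = inverse_frequency_weights([demo_mask])
criterion = nn.CrossEntropyLoss(weight=weights)    # applied at the output layer
```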


Fig. 5.5 Xception-41 architecture

Table 5.2 Software and hardware

Laboratory equipment: GT62VR 7RE
Device memory: 256 GB SSD, 1 TB
Video memory: GTX1070, 8 GB
Running memory: 16 GB
Programming environment: MATLAB2020b, Python3.8.8
Operating environment: DeepLearn Tool, torch1.9.0
CUDA: 11.4

The training of the models is carried out on a single GPU, and the training accuracy and loss are recorded, as shown in Table 5.3. In PyTorch, the accuracy and loss curves of the DeepLab v1, U-net, and DeepLab-xception training processes are shown in Figs. 5.7 and 5.9. In MATLAB, the training process of the DeepLab v3+ model is visualized as shown in Fig. 5.8. No overfitting occurs in any training run; the accuracy throughout training is over 0.9, and the loss value is below 0.2. In Fig. 5.10, the prepared test set is fed into the trained models to make predictions. Except for the poor results of DeepLab v1, the prediction quality of the U-net, DeepLab-xception, and DeepLab v3+ models is ideal. Table 5.3 shows that the DeepLab v3+ model has the lowest loss and the highest accuracy: 0.0133 and 0.9943, respectively.


Fig. 5.6 Ratio of background pixels and target pixels

Table 5.3 Training results

        DeepLab v1   U-net    DeepLab-xception   DeepLab v3+
Loss    0.2999       0.0563   0.02567            0.0133
Acc     0.9856       0.9921   0.9935             0.9943

5.2.4 Evaluation of Prediction Results of Deep Semantic Segmentation Model

Semantic segmentation is presently employed extensively in several fields, and its most commonly used evaluation index is Miou (mean intersection over union). To understand and calculate this index, a confusion matrix should be established, with TP (true positive), FP (false positive), FN (false negative), and TN (true negative). In this work, Miou compares the ground-truth segmentation with the inferred segmentation; the overlap ratio is defined over the counts of TP, FP, FN, and TN:

mIOU = (1/(k + 1)) ∑_{i=0}^{k} p_ii / (∑_j p_ij + ∑_j p_ji − p_ii)  (5.2)

in which p_ii corresponds to TP + TN and p_ij to FP + FN, so that the per-class confusion matrix form can be described as


Fig. 5.7 Training accuracy of DeepLab V1, U-net, DeepLab-xception

Fig. 5.8 Training process of DeepLab v3+ in MATLAB


Fig. 5.9 Training loss of DeepLab V1, U-net, DeepLab-xception

mIOU = TP / (TP + FP + FN)  (5.3)
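The following is a small sketch of this evaluation for the two-class mineral zone task, building the confusion matrix with NumPy and applying Eqs. (5.2)–(5.3); it is an illustration, not the authors' evaluation script.

```python
# Two-class Miou via the confusion matrix (rows: truth, cols: prediction).
import numpy as np

def mean_iou(pred, gt, num_classes=2):
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (gt.ravel(), pred.ravel()), 1)
    ious = []
    for i in range(num_classes):
        tp = cm[i, i]
        fp = cm[:, i].sum() - tp
        fn = cm[i, :].sum() - tp
        ious.append(tp / (tp + fp + fn))   # per-class form, Eq. (5.3)
    return float(np.mean(ious))            # mean over k + 1 classes, Eq. (5.2)

gt = np.zeros((4, 4), dtype=np.int64); gt[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.int64); pred[1:3, 1:4] = 1
print(mean_iou(pred, gt))
```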

Through calculation, we obtain the Miou values of the four models, as shown in Table 5.4; the fluctuation curves of the IOU are illustrated in Fig. 5.11. The IOU values are all above 0.9, which indicates that the models are well suited to the task of segmenting the mineral zone of the Tabling. In particular, DeepLab v3+ has the highest evaluation value among the four models: its Miou is as high as 0.989, which substantially exceeds its performance of 0.857 Miou on the VOC2012 dataset. Over the development of semantic segmentation models, from the initial Encoder-Decoder framework to the later DeepLab, PSPNet, and RefineNet frameworks, each framework has its own strengths; DeepLab v3+ integrates the ideas of DeepLab, PSPNet, and the Encoder-Decoder design and obtains the best results (Fig. 5.11).


Fig. 5.10 Result of semantic segmentation model prediction

Table 5.4 Evaluation of prediction results of deep segmentation model

Model   DeepLab v1   U-net   DeepLab-xception   DeepLab v3+
Miou    0.972        0.919   0.987              0.989

5.3 Multi-scale Geometric Feature Characterization and Extraction of Mineral Zone

5.3.1 Multi-scale Feature Representation of Mineral Zone

To visualize the variation of mineral zone separations, some representation of the mineral zone characteristics must be adopted, which is also useful for constructing separation index detection methods based on mathematical and physical models [29, 30]. These research directions prove that a mathematical model of the Tabling's separation state can be constructed by machine learning modeling, which is a crucial method for achieving automatic control and parameter optimization of the Tabling separation process [31]. Rich and effective multi-scale image features of the mineral zone strongly support the construction of such models.


Fig. 5.11 IOU value of semantic segmentation model prediction

Using the deep segmentation model previously trained with DeepLab v3+, we can predict multi-scale image attributes, and the multi-scale geometric characteristics of the points, lines, and surfaces of the mineral zone can be calculated. These multi-scale geometric features must first be characterized. As shown in Fig. 5.12, the segmented object area is the mineral zone of the image. It can be described by five feature values: r, θ, l1, l2, and l3. In the figure, A1 is the mineral zone area, A2 is the background area, and r is the ratio of A1 to A2. θ is the included angle between the left and right borderlines, l2 is the length of the left borderline of the mineral zone, and l3 is the length of the right borderline. l1 is the distance from the intersection of the left and bottom borderlines of the mineral zone to the left borderline of the Tabling.

5.3.2 Multi-scale Feature Extraction of Mineral Zone

The multi-scale characteristics of the mineral zone can be roughly described by the five data values above. To extract the multi-scale geometric characteristics of the mineral zone, we must work out how to solve for these values. In general, an image processing algorithm can only extract the pixel coordinates of a point in the image, which has a certain gap with the actual coordinate position. Thus, camera calibration is used to resolve the difference between the pixel distance calculated by the algorithm and the actual distance. Here we define a pixel ratio coefficient a = d/l, in which d is the actual distance and l is the pixel distance. For example, before camera debugging, d is the true length of the left borderline of the Tabling; the pixel length of the left borderline in the image is denoted l, and their ratio is the corresponding pixel ratio coefficient a. Note that after the camera is calibrated, the position and angle of the camera must not be changed; otherwise, the pixel ratio calculated via calibration becomes inconsistent.

Fig. 5.12 Geometric features of the mineral zone image

As shown in Fig. 5.13, to extract the multi-scale geometric characteristics of the mineral zone, the bottom edge is used as the X axis and the left edge as the Y axis to construct a two-dimensional coordinate base on the surface of the Tabling. We define the crossing of the left and bottom borderlines as the coordinate origin o(x1, y1) and extract the mineral zone as an irregular quadrilateral. The four vertices of the quadrilateral are A(x2, y2), B(x3, y3), C(x4, y4), and D(x5, y5); the lengths of the four sides are l2, l3, l4, and l5. The calculation formulas are given in Eqs. (5.4)–(5.9):

l1 = √((x1 − x2)² + (y1 − y2)²)  (5.4)

l2 = √((x5 − x2)² + (y5 − y2)²)  (5.5)

l3 = √((x3 − x4)² + (y3 − y4)²)  (5.6)

l4 = √((x3 − x2)² + (y3 − y2)²)  (5.7)

l5 = √((x5 − x4)² + (y5 − y4)²)  (5.8)

θ = arctan((y4 − y3)/(x4 − x3)) − arctan((y5 − y2)/(x5 − x2))  (5.9)
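As a concreteness check, the following is a small Python sketch of Eqs. (5.4)–(5.9) given the five extracted points. The point names simply follow Fig. 5.13; using math.atan2 instead of a plain arctan of slopes is our own robustness choice, and the sample coordinates are made up.

```python
# Line/angle features of the mineral zone quadrilateral, in pixel units.
import math

def line_angle_features(o, A, B, C, D):
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    l1 = dist(o, A)   # separation point to origin, Eq. (5.4)
    l2 = dist(A, D)   # Eq. (5.5)
    l3 = dist(B, C)   # Eq. (5.6)
    l4 = dist(A, B)   # Eq. (5.7)
    l5 = dist(C, D)   # Eq. (5.8)
    # Included angle between borderlines BC and AD, Eq. (5.9), in degrees.
    theta = math.degrees(math.atan2(C[1] - B[1], C[0] - B[0])
                         - math.atan2(D[1] - A[1], D[0] - A[0]))
    return l1, l2, l3, l4, l5, theta

print(line_angle_features((0, 0), (39, 5), (45, 110), (140, 150), (330, 60)))
```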


Fig. 5.13 Establishment of two-dimensional coordinates under the mineral zone image

Equations (5.4)–(5.8) can only extract the points and lines of the mineral zone. To obtain the surface-scale feature of the concentrate zone, the coordinates of the four vertices of the quadrilateral are used to solve for its area. First, divide the quadrilateral into the two triangles ΔADC and ΔABC, with areas S1 and S2, respectively. The two known sides of ΔADC are l2 and l5. From the extracted A and C coordinate information, the third side l6 (the diagonal) can be calculated as

l6 = √((x4 − x2)² + (y4 − y2)²)  (5.10)

Therefore, the three sides of ΔABC are l3, l4, and l6. According to Heron's formula [32], first find the half perimeter of each triangle:

P1 = (l2 + l5 + l6)/2  (5.11)

P2 = (l3 + l4 + l6)/2  (5.12)

Then find the areas of the two triangles and the total area S of the mineral zone:

S1 = √(P1(P1 − l2)(P1 − l5)(P1 − l6))  (5.13)

S2 = √(P2(P2 − l3)(P2 − l4)(P2 − l6))  (5.14)

S = S1 + S2  (5.15)

Here l1 is the distance between the separation point of the concentrate zone and the origin coordinates; if the coordinates of points B and C are known, l3 can be obtained; if the coordinates of points A and D are known, l2 can be obtained; if A and B are known, l4 can be obtained; and if C and D are known, l5 can be obtained.
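A compact sketch of this area computation, under the same point-naming assumption as the previous snippet:

```python
# Quadrilateral area via Eqs. (5.10)-(5.15): split ABCD along the diagonal
# AC (= l6) and apply Heron's formula to both triangles.
import math

def heron(a, b, c):
    p = (a + b + c) / 2.0                              # Eqs. (5.11)-(5.12)
    return math.sqrt(p * (p - a) * (p - b) * (p - c))  # Eqs. (5.13)-(5.14)

def mineral_zone_area(A, B, C, D):
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    l2, l3, l4, l5 = dist(A, D), dist(B, C), dist(A, B), dist(C, D)
    l6 = dist(A, C)                                    # diagonal, Eq. (5.10)
    return heron(l2, l5, l6) + heron(l3, l4, l6)       # S = S1 + S2, Eq. (5.15)
```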


Fig. 5.14 The whole process of image acquisition, recognition, and feature extraction (the process shown is feature extraction for ore belt image No. 1)

5.3.3 Result of the Extraction of Mineral Zone

Figure 5.14 shows the general idea of feature data acquisition for the mineral zone image. First, the operating parameters are adjusted until the processing state is stable; then the industrial high-definition camera captures frames for the image dataset and assigns each a corresponding number, the computer and camera being connected by a network cable. Each numbered image is segmented by the designed DeepLab v3+ algorithm, and the designed feature extraction algorithm expresses the multi-scale geometric characteristics of the mineral zone. The extracted original geometric features include the distance l1 between the separation point of the mineral zone and the left borderline of the Tabling; the four borderlines l2, l3, l4, and l5 of the mineral zone; the pixel area A of the table surface and the area A1 of the mineral zone, as well as their ratio r; and the included angle θ between the left and right borderlines. The extraction results for some samples are shown in Table 5.5. For the zonal images numbered 1, 2, and 3, according to the principle of the mineral zone feature extraction algorithm, the pixel area A of the table surface is about 48,969 pixels, while the actual area of the table is 75,000 cm². The resulting area-to-pixel ratio coefficient is α = 1.53, and the length-to-pixel ratio coefficient is ε = 1.237. With the image processing software, rich and effective mineral zone image features of the Tabling can be fully extracted. From the data of the three samples, the area A1 of the mineral zone fluctuates around 11,245 cm², the ratio r of the mineral zone area to the table surface area is about 0.15, and the included angle between the left and right borderlines fluctuates around 23°.


Table 5.5 Mineral zone image feature extraction results (in pixels; actual length values require multiplication by the pixel ratio coefficient)

Number  r      θ(°)     l1       l2        l3        l4        l5       l6        A1
1       0.150  23.7542  39.3147  302.8206  210.7189  105.2108  11.4042  157.5389  7357.6
2       0.150  23.5689  38.6581  300.1568  205.3659  104.3246  11.3658  156.3256  7355.3
3       0.148  22.9865  39.5632  209.3654  206.5486  0.0133    11.2653  156.3456  7258.6
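As a quick illustration of the caption's pixel-to-physical conversion, the snippet below scales row 1 of the table with the ratio coefficients quoted in the text; treating α as cm² per pixel and ε as cm per pixel is our assumption about the units.

```python
# Converting Table 5.5 pixel features to physical units with the quoted ratios.
ALPHA = 1.53   # area per pixel (assumed cm^2/px), from 75,000 cm^2 / 48,969 px
EPS = 1.237    # length per pixel (assumed cm/px)

row1 = {"l1": 39.3147, "l2": 302.8206, "A1": 7357.6}
physical = {
    "l1_cm": row1["l1"] * EPS,
    "l2_cm": row1["l2"] * EPS,
    "A1_cm2": row1["A1"] * ALPHA,   # ~11,257 cm^2, matching the text's ~11,245
}
print(physical)
```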

5.4 Conclusion

(1) Research on applying DL algorithms to pattern recognition of the mineral zone is scarce, which is an important factor in the low level of intelligence of Tabling equipment. A deep semantic segmentation algorithm for recognition and feature extraction in mineral zone images is proposed here for the first time and achieves considerable results. (2) The first industry-derived deep learning dataset for mineral zone image segmentation was produced to train the deep segmentation models to qualified accuracy; various existing deep segmentation models were compared, and the best performing DeepLab v3+ algorithm was chosen as the framework through evaluation and comparison of their effects. (3) Finally, the multi-scale geometric features of the mineral zone were effectively extracted using a multi-scale feature extraction algorithm, laying a crucial basis for developing the nonlinear mapping relationship between the mineral zone and the equipment properties.

Acknowledgements The authors are grateful for the experimental platform and technical assistance provided by several institutions and corporations, and extend their sincere gratitude to the Jiangxi Province K&R Development Project and the Talent Project fund for their help and understanding throughout this work.

Fund Project Innovative talent project: Jiangxi province "double thousand plan" (JXSQ2018101046); Jiangxi Province K&R Development Project (20212BBE53026).

References

1. Liu, H.Z.: Application progress and prospects of gravity separation equipment in metal ore beneficiation in my country. Non-Ferr. Metals (Mineral Processing) (Suppl. 1), 18–23 (2011). (in Chinese)
2. Abaka-Wood, G.B., Quast, K., Zanin, M., Addai-Mensah, J., Skinner, W.: A study of the feasibility of upgrading rare earth elements minerals from iron-oxide-silicate rich tailings using Knelson concentrator and Wilfley shaking table. Powder Technol. 344, 897–913 (2019)
3. Keshun, Y., Huizhong, L.: Intelligent deployment solution for tabling adapting deep learning. IEEE Access 11, 22201–22208 (2023). https://doi.org/10.1109/ACCESS.2023.3234075
4. Zhao, Y.L., Zhang, Y.M., Bao, S.X., et al.: Loose-layered model in the process of vanadium extraction and pre-concentration and separation from stone coal. Trans. Nonferrous Metals Soc. China 24(2), 528–535 (2014)
5. You, K., Liu, H.: Research on optimization of control parameters of gravity shaking table. Sci. Rep. 13(1), 1133 (2023)
6. He, L.F., Huang, S.W.: Modified firefly algorithm based multilevel thresholding for color image segmentation. Neurocomputing 240, 152–174 (2017)
7. You, K., Qiu, G., Gu, Y.: Rolling bearing fault diagnosis using hybrid neural network with principal component analysis. Sensors 22(22), 8906 (2022)
8. Liu, L.M., Li, Q., Wu, T., et al.: The design and application of the automatic ore access device of the shaker. Gold 39(10), 48–51 (2018). (in Chinese)
9. Yang, W.W., He, Q.L., Lan, X.X., et al.: Development and application of intelligent inspection robot for mineral processing shaker. Non-Ferr. Metals (Mineral Processing) 5, 102–106 (2020). (in Chinese)
10. Wu, T., Yang, W.W., Guo, J.H., et al.: An intelligent control method for beneficiation shaking table. Beijing: CN108519781A, 2018-09-11 (2018). (in Chinese)
11. Zarie, M., Jahedsaravani, A., Massinaei, M.: Flotation froth image classification using convolutional neural networks. Miner. Eng. 155, 106443 (2020)
12. Wang, L.G., Chen, S.J., Jia, M.T., et al.: Deep learning-based image recognition and beneficiation method of wolframite. Chin. J. Nonferrous Metals 30(5), 1192–1201 (2020). (in Chinese)
13. Keshun, Y.: Study on model construction and control parameter optimization of ore dressing shaking bed sorting process. Jiangxi University of Science and Technology (2022). https://doi.org/10.27176/d.cnki.gnfyc.2022.000557 (in Chinese)
14. Liu, Y., Zhang, Z., Liu, X., et al.: Efficient image segmentation based on deep learning for mineral image classification. Adv. Powder Technol. 32(10), 3885–3903 (2021)
15. Liu, Y., Zhang, Z., Liu, X., Wang, L., Xia, X.: Ore image classification based on small deep learning model: evaluation and optimization of model depth, model structure and data size. Miner. Eng. 172, 107020 (2021)
16. Wang, X., Zhou, J., Wang, Q., Liu, D., Lian, J.: An unsupervised method for extracting semantic features of flotation froth images. Miner. Eng. 176, 107344 (2022)
17. Luo, Y.W., Zheng, L., Guan, T., et al.: Taking a closer look at domain shift: category-level adversaries for semantics consistent domain adaptation. arXiv preprint arXiv:1809.09478 (2019)
18. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
19. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
20. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Springer, Cham (2015)
21. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
22. Chen, L.C., Papandreou, G., Kokkinos, I., et al.: Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062 (2014)
23. Chen, L.C., Papandreou, G., Kokkinos, I., et al.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2017)
24. Zhao, H., Shi, J., Qi, X., et al.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
25. Peng, C., Zhang, X., Yu, G., et al.: Large kernel matters: improve semantic segmentation by global convolutional network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4353–4361 (2017)
26. Chen, L.C., Papandreou, G., Schroff, F., et al.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)
27. Filippo, M.P., Gomes, O.D.F.M., da Costa, G.A.O.P., Mota, G.L.A.: Deep semantic segmentation of opaque and non-opaque minerals from epoxy resin in reflected light microscopy images. Miner. Eng. 170, 107007 (2021)
28. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 801–818 (2018)
29. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
30. Liu, H., You, K.: Research on image multi-feature extraction of ore belt and real-time monitoring of the tabling by semantic segmentation of DeepLab V3+. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds.) Advances in Artificial Intelligence and Security. ICAIS 2022. Communications in Computer and Information Science, vol. 1586. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-06767-9_3
31. Sun, B., et al.: An integrated multi-mode model of froth flotation cell based on fusion of flotation kinetics and froth image features. Miner. Eng. 172, 107169 (2021)
32. Li, S.N., Hua, J.G., Li, J.M., et al.: Optimal non-line-of-sight suppression localization algorithm using Helen's formula. J. Sens. Technol. 31(2), 5 (2018)

Chapter 6

Research on Mass Image Data Storage Method for Data Center

Sen Pan, Jing Jiang, Hongbin Qiu, Junfeng Qiao, and Menghan Xu

Abstract With the advancement of technology and the development of the electric power business, power enterprises have generated a large amount of image data that contains rich potential information, and it is urgent to store this massive-scale image data for further mining and analysis. Starting from the storage requirements of power grid image data, this paper first expounds on the drawbacks of storing small files directly, according to the characteristics of power grid image files. The architecture and functions of grid image small-file storage are then designed and implemented. Finally, the grid image small-file storage method based on the SequenceFile format is briefly summarized.

Keywords Grid image data · Small files · Mass storage · SequenceFile

S. Pan (B) · J. Jiang · H. Qiu · J. Qiao
State Grid Smart Grid Research Institute Co. Ltd., Nanjing, Jiangsu 210003, China
e-mail: [email protected]
State Grid Key Laboratory of Information and Network Security, Nanjing 210003, China

M. Xu
State Grid Jiangsu Electric Power Co. Ltd. Information and Telecommunication Branch, Nanjing, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_6

6.1 Introduction

With the continuous development of video capture and network transmission technology, and especially the increasing scale of power production, power companies have generated a large amount of power grid image data that contains rich potential information. Mining this valuable data can assist company decision-making. The image data of power enterprises are mainly pictures in various formats, including bmp, jpg, png, raw, tiff, gif, etc. The pictures vary in size, and many of them are small files. At a massive scale, a big data platform is usually used for small file storage. A single power grid image data file is generally several megabytes or around ten megabytes, which is smaller than the

size of a single block file on a Hadoop Distributed File System (HDFS) (the default value is 64 MB) [1]. If all the small files are stored directly on HDFS, they put a lot of pressure on the memory of the namenode, because the inode information of each file or directory is stored in namenode memory. In a large-scale small-file image storage scenario, this seriously restricts the access efficiency and speed of the big data platform [2] and is not conducive to subsequent mining, analysis, and utilization. It is therefore urgent to study a method that improves the storage mode for image data dominated by small files. This paper proposes an improved storage method for massive power grid image data based on SequenceFile. The idea of the method is to merge a large number of small files in a certain order, store and record the metadata information of the small files, and extract the small files directly through the metadata information when reading. The method solves the efficiency problem of small-file storage and reading and provides support for the storage of massive image data in the data center.

6.2 Key Technology

Distributed file system: In the face of massive data storage, the traditional array and centralized storage modes can hardly meet the storage requirements, and they have problems in scalability, reliability, and high availability. A distributed file system [3], with core features such as a distributed architecture, block-based storage, dynamic resource expansion, data disaster tolerance, and suitability for cheap servers, supports online secure storage of PB-level data and solves the problem of mass data storage [4]. HDFS uses multi-copy storage to provide high fault tolerance and is naturally installed and deployed on low-cost hardware [5]. It provides high-throughput data read and write capabilities through a distributed architecture and is very suitable for the storage of massive power big data, especially semi-structured data and grid image data. Hadoop also supports heterogeneous storage [6]: different types of storage media on the server (HDD hard disks, SSDs, memory, etc.) provide more storage strategies, which makes HDFS storage more flexible and efficient for various application scenarios.

Storage in the SequenceFile format: A SequenceFile is a flat file designed by Hadoop to store key-value pairs in binary form [7]. A SequenceFile consists of a series of binary key/value records; setting the key to the file name and the value to the file content combines a large number of small files into one large file. SequenceFile files have two advantages: (1) they are splittable, so MapReduce can divide them into chunks and operate on each chunk independently; (2) they support compression, with three compression methods: NONE (no compression), RECORD (compresses only the value in each record), and BLOCK (compresses all records in a block together).
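The merging idea can be sketched as follows — a minimal Python illustration of the key/value packing and the offset bookkeeping that the metadata table later relies on. This is not the actual Hadoop SequenceFile binary format (in production the Hadoop SequenceFile.Writer API would be used); the function and container layout here are illustrative assumptions.

```python
import os

def merge_small_files(image_paths, container_path):
    """Pack many small image files into one container file.

    Mirrors the SequenceFile idea: each record is a (file name, file
    content) pair, and the byte offsets of every record are kept so the
    metadata table (SeqFile_start / SeqFile_end) can locate it later.
    """
    offsets = {}
    with open(container_path, "wb") as container:
        for path in image_paths:
            with open(path, "rb") as f:
                data = f.read()
            start = container.tell()
            # A real SequenceFile stores binary key/value records; here the
            # key (file name) is written as a length-prefixed header.
            name = os.path.basename(path).encode("utf-8")
            container.write(len(name).to_bytes(4, "big"))
            container.write(name)
            container.write(data)
            offsets[os.path.basename(path)] = (start, container.tell())
    return offsets  # persisted to the metadata table in the real system
```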

Columnar storage database: Non-relational columnar databases have a flexible data model [8], very high read and write performance, easy scaling, and low development and maintenance costs. The new columnar storage system supports in-place update of data and avoids additional data processing and data movement. It solves the problem that existing distributed storage components cannot simultaneously handle frequent updates of huge data volumes and high-speed query retrieval [9]. The new columnar storage system offers high performance for both data scanning (scan) and random access, which simplifies users' complex hybrid architectures; it has high CPU efficiency, making the most of advanced processors, and high input-output performance, making full use of advanced persistent storage media. Kudu [10] is a storage system for structured data tables. Any number of tables can be defined in a Kudu cluster, and each table needs a predefined schema. The number of columns in each table is fixed, each column requires a name and type, and one or more columns can be defined as the primary key of each table. The primary key enforces data uniqueness and, like an index, speeds up the update and deletion of data.

6.3 System Design

The core of the method is the combined compression of small files and metadata management [11]. Combined compression solves the performance problems caused by storing many small files, and metadata management solves the problem of rapidly locating and extracting files after they have been merged and compressed.

Metadata management for grid image data. The metadata of a power grid image data file includes attribute information such as image source, image name, shooting time, shooting location, image format, image description, image size, uploader name, upload time, storage path, and image label, as shown in Table 6.1. Based on this metadata, a metadata information class metaDataUnit is constructed, and all fields are defined in metaDataUnit to facilitate unified calls in the metadata interface.

Grid image data storage and extraction. The data storage part is based on big data technology: HDFS stores the power grid image data, and the columnar database Kudu stores the metadata and operation information of the power grid image data. The data storage part provides a REST API that supports data read, write, transfer, and modification operations. In the data interface part, functions for uploading, downloading, updating, and deleting data based on the REST API are developed to realize unified management of power grid image data, as shown in Fig. 6.1. SequenceFile is used to store small files, and multiple small files are merged into one large file stored in the distributed file system.

Table 6.1 Metadata information table for grid image data files

Property item          Field name       Data type  Remarks
Image number           img_id           String     Unique, non-null
Image name             img_name         String     Non-null
Image source           img_source       String
Photo time             img_phototime    Date       YYYY-MM-DD HH:MM:SS
Photo locations        img_location     String     Coordinates or address
Image format           img_filetype     String
Image description      img_desc         String
Image size             img_size         Int        The unit is bytes
Image tag              img_label        String
Uploader name          img_uploaduser   String
Upload time            img_uploadtime   Date       YYYY-MM-DD HH:MM:SS
Storage path           img_savepath     String     When the storage method is SequenceFile, the value is the path of the SequenceFile
Store in SequenceFile  is_seqfile       boolean
SequenceFile offset 1  SeqFile_start    Int        Offset start value
SequenceFile offset 2  SeqFile_end      Int        Offset end value

Fig. 6.1 Power grid image data storage and extraction

The specific process is to first upload the grid image data files to the high-speed buffer of HDFS, then start the small file merging task periodically, traverse the specified HDFS directory, and merge all the uploaded grid image data files into the SequenceFile. Finally, the original small files are deleted, and the metadata information is updated at the same time.
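As a sketch of how the metaDataUnit record described in Sect. 6.3 might be declared, the following is a minimal Python dataclass whose fields follow Table 6.1; the Python types and lowercase field spellings are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class MetaDataUnit:
    """Fields mirror the metadata table for grid image data files (Table 6.1)."""
    img_id: str              # unique, non-null
    img_name: str            # non-null
    img_source: str
    img_phototime: datetime  # YYYY-MM-DD HH:MM:SS
    img_location: str        # coordinates or address
    img_filetype: str
    img_desc: str
    img_size: int            # bytes
    img_label: str
    img_uploaduser: str
    img_uploadtime: datetime
    img_savepath: str        # path of the SequenceFile once merged
    is_seqfile: bool
    seqfile_start: int       # offset start value (SeqFile_start)
    seqfile_end: int         # offset end value (SeqFile_end)
```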

6.4 Function Realization The overall design of the storage method for massive power grid image data is shown in Fig. 6.2, including data storage platform, metadata management, image data upload and merge storage, and image data download parts.

Fig. 6.2 Overall design of storage method for massive power grid image data

(1) Data storage platform construction. The data storage platform includes the storage of SequenceFiles and the storage of metadata. SequenceFiles are stored on HDFS, and metadata are stored in the new columnar database Kudu; HDFS and Kudu are deployed in the same cluster, as shown in Fig. 6.2. The HttpFS service is built on top of HDFS to provide REST HTTP interface services supporting all HDFS file system operations, implementing SequenceFile storage management. The Impala service is built on top of Kudu to provide an SQL entry point, so that metadata operation statements can be transmitted to Impala through JDBC; Impala interprets the SQL and submits it to Kudu for execution, completing the addition, update, and query of metadata. A high-speed cache is built on HDFS: using the heterogeneous storage function of HDFS, a buffer directory is set up to store grid image data that have been uploaded but not yet merged, and the storage policy of the buffer directory is set to ALL_SSD, so that all copies in the directory are stored on SSD, which greatly improves the access speed and efficiency of the buffered data.

(2) Metadata Management Interface. The metadata management function covers the addition, update, and query of metadata. The main process is to call the metadata management interface synchronously during the storage of power grid image data, so as to realize complete storage of the data. Using the new columnar big data database Kudu combined with the high-speed SQL query engine Impala, a metadata storage database is built; interfaces for adding, updating, and querying metadata are designed, and the relevant SQL commands are transmitted to Impala via JDBC. Impala interprets the SQL and submits it to Kudu for execution to implement operations such as adding and updating metadata. The interfaces are as follows: metaDataInsert(metaDataUnit unit), metaDataUpdate(metaDataUnit unit, img_id), metaDataSelect(img_id).
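The paper transmits SQL to Impala via JDBC; as a rough illustration from Python, the same interfaces could look like the sketch below, assuming the impyla client, %s-style parameter substitution, and a hypothetical table name img_metadata (only a subset of the metadata columns is shown).

```python
from impala.dbapi import connect  # impyla client; the paper itself uses JDBC

def meta_data_insert(unit):
    """metaDataInsert: write one metadata record into the Kudu-backed
    Impala table (table name 'img_metadata' is illustrative)."""
    conn = connect(host="impalad-host", port=21050)  # host is an assumption
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO img_metadata (img_id, img_name, img_savepath, "
        "is_seqfile, seqfile_start, seqfile_end) VALUES (%s, %s, %s, %s, %s, %s)",
        (unit.img_id, unit.img_name, unit.img_savepath,
         unit.is_seqfile, unit.seqfile_start, unit.seqfile_end),
    )
    conn.close()

def meta_data_select(img_id):
    """metaDataSelect: locate an image record by its primary key."""
    conn = connect(host="impalad-host", port=21050)
    cur = conn.cursor()
    cur.execute("SELECT * FROM img_metadata WHERE img_id = %s", (img_id,))
    row = cur.fetchone()
    conn.close()
    return row
```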

(3) Small file merge. File merging and storage: the file name of the grid image data is used as the key and the binary content (BytesWritable) of the image data file as the value, which are stored in the SequenceFile; the start position of each image file within the SequenceFile is recorded synchronously.

(4) SequenceFile data file upload. Data upload is achieved through the HttpFS service, by connecting to the REST API of the HttpFS service through HttpClient and then calling the Put method. The main steps are as follows: 1. Create an HttpClient object: CloseableHttpClient client = HttpClients.createDefault(); 2. Create an instance of the request method and specify the request URL; here an HttpPut object is created to send a PUT request: HttpPut httpPut = new HttpPut(url). 3. Set the HTTP request headers and the request parameters: set headers through the setHeader() method, append the request parameters to the url after the "?" symbol, and call the setEntity(InputStreamEntity entity) method to set the file stream. 4. Call execute(HttpPut put) on the HttpClient object to send the request; this method returns an HttpResponse. 5. Call HttpResponse's getAllHeaders(), getHeaders(String name), and other methods to obtain the server's response headers; call HttpResponse's getEntity() method to obtain the HttpEntity object, which wraps the server's response content. 6. Close the connection and release resources. When a power grid image data file is uploaded, its metadata are collected synchronously; after the upload is completed, the metadata interface metaDataInsert is called to save the metadata of the file to the database.

(5) Image data extraction and download. Based on the grid image data name and number, the storage information is retrieved from the metadata table, including the storage path of the SequenceFile containing the image data and the start and end positions within that SequenceFile. The SequenceFile is read from HDFS according to the storage path and the start and end positions, and the image data at the specified location are extracted and output as an image file.
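Since HttpFS exposes the WebHDFS-compatible REST interface, the upload in step (4) can be exercised from any HTTP client. Below is a minimal sketch in Python using requests and the standard two-step WebHDFS CREATE flow (an initial PUT returning a redirect, then a PUT of the content); the host, port, and user are illustrative assumptions.

```python
import requests

HTTPFS = "http://httpfs-host:14000/webhdfs/v1"  # host/port are assumptions

def upload_to_buffer(local_path, hdfs_path, user="hdfs"):
    """PUT a grid image file into the HDFS buffer directory through the
    HttpFS REST API (WebHDFS-compatible two-step CREATE)."""
    url = f"{HTTPFS}{hdfs_path}?op=CREATE&user.name={user}&overwrite=true"
    # Step 1: the initial PUT returns a redirect with the write location.
    r1 = requests.put(url, allow_redirects=False)
    location = r1.headers["Location"]
    # Step 2: PUT the file content to the redirected location.
    with open(local_path, "rb") as f:
        r2 = requests.put(location, data=f,
                          headers={"Content-Type": "application/octet-stream"})
    r2.raise_for_status()
```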

6.5 Conclusion

Through distributed access and storage of massive power grid image data, together with metadata management and storage for the image files, the method provides a flexible, efficient, and unified storage and management approach for massive power grid image data. It realizes "loosely coupled" data integration with good scalability, ensures that massive power grid image data can be opened and shared safely, conveniently, quickly, and smoothly, and supports further data analysis and mining. In terms of technical implementation, based on the Hadoop big data ecosystem, HDFS and MapReduce are used as effective methods to handle massive picture data, and Kudu is used to build the metadata storage module, meeting the requirements for efficient storage and management of massive picture data. When dealing with massive small image data, this method has the advantages of good scalability, easy maintainability, high security and reliability, low development cost, and easy implementation.

Acknowledgements This work is supported by the Science and Technology Project of State Grid Corporation of China (Research on Intelligent Data Management and Feature Engineering Technology for Middle Platform, No. 5700-202058480A-0-0-00).

References

1. Ahad, M.A., Biswas, R.: Dynamic merging based small file storage (DM-SFS) architecture for efficiently storing small size files in hadoop. Proc. Comput. Sci. 132, 1626–1635 (2018)
2. Masadeh, M.B., Azmi, M.S., Ahmad, S.S.S.: Available techniques in hadoop small file issue. Int. J. Electr. Comput. Eng. (IJECE) 10(2), 2097–2101 (2020)
3. Du, Z.N., Zhu, C.J.: A survey of distributed file system. Softw. Eng. Appl. 6(2), 21–27 (2017). (In Chinese)
4. Merceedi, K.J., Sabry, N.A.: A comprehensive survey for hadoop distributed file system. Asian J. Res. Comput. Sci. 11(2), 46–57 (2021)
5. Liu, J., Leng, F.L., Li, S.Q., Bao, Y.B.: A distributed file system based on HDFS. J. Northeast. Univ. (Nat. Sci.) 40(06), 795–800 (2019). (In Chinese)
6. Chen, L., Wu, X.H.: Design of distributed cluster big data dynamic storage system based on hadoop. J. China Acad. Electron. Inf. Technol. 14(06), 593–598 (2019). (In Chinese)
7. https://cwiki.apache.org/confluence/display/HADOOP2/sequencefile
8. Pan, S., Zhu, L.P., Qiao, J.F.: An open sharing pattern design of massive power big data. In: 2019 IEEE 4th International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), pp. 5–9 (2019). https://doi.org/10.1109/ICCCBDA.2019.8725750
9. Hassan, M.U., Yaqoob, I., Zulfiqar, S., Hameed, I.A.: A comprehensive study of HBase storage architecture—a systematic literature review. Symmetry 13(1), 109 (2021)
10. https://kudu.apache.org/docs/index.html
11. Liu, C.L.: Research on Massive Picture Storage Technology. University of Electronic Science and Technology of China (2020). (In Chinese)

Chapter 7

Research on Virtual and Real Spatial Data Interconnection Mapping Technology for Digital Twin

Zhimin He, Lin Peng, Hai Yu, and He Wang

Abstract In recent years, distributed photovoltaics, decentralized wind power, new loads, electric vehicles, etc. have been connected to the power grid on a wide scale, and the distribution network has become more active, with strong fluctuations and large peak-to-valley differences. This puts forward higher requirements for the management and coordination of interacting distributed resources. This paper studies virtual-real spatial data interconnection and mapping technology for digital twins. The first step is to study modeling and driving technology for a data-interconnected, fused mechanism model and information model based on the real-scene twin of the power grid: the interconnection mechanism between the twin and spatial coordinate data and the monitoring data of main and auxiliary equipment is studied, and a dual-mode driving method integrating the mechanism model and data model is built. The second step is to study the multi-service continuous mapping mechanism of the power grid "virtual-real" space and a panoramic mapping method for real-time data, which realizes continuous, real-time mapping of point data and surface data to the digital twin and realizes multi-block, multi-level, and multi-type mapping through the steps of feature extraction, feature matching, model parameter estimation, transformation, and interpolation, so that digital twin data are accurately matched. The third step is to study efficient transmission and update-integration technology for twin data in the power grid environment, including methods to improve the real-time performance and concurrency of data transmission and dynamic update and integration technology for multi-source heterogeneous data. The fourth step is to study two-way intelligent cooperation and integration technology for station equipment, realizing real-time data and state interaction between the virtual digital model and the real physical model, and realizing interconnection and mapping between the station panoramic model and virtual and real data.

Z. He (B) · L. Peng · H. Yu · H. Wang
State Grid Smart Grid Research Institute Co., Ltd., Nanjing 210003, Jiangsu, China
e-mail: [email protected]

Z. He
State Grid Laboratory of Power Cyber-Security Protection and Monitoring Technology, Nanjing, Jiangsu 210003, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_7

Keywords Digital twin · Virtual and real data · Data interconnection mapping

7.1 Introduction

7.1.1 Research Background

On January 29, 2022, the National Development and Reform Commission and the National Energy Administration issued the "14th Five-Year Plan for Modern Energy System", proposing to accelerate the green and low-carbon transformation of energy, adhere to ecological priority and green development, expand the clean energy industry, promote renewable energy substitution and the construction of a new power system, and gradually increase the proportion of new energy [1]. Driven by the strategy of clean replacement and electric energy replacement, distributed photovoltaics, decentralized wind power, new loads, and electric vehicles have been connected to the grid on a wide scale. At present, the power grid is still unable to efficiently store and transport electric energy on a large scale [2]. It is necessary to build a complex power transmission network to transmit the electric energy generated on the power generation side to the consumption side to serve actual production and life. However, as the scale of the power transmission network continues to expand, the types and functions of equipment in the system multiply, and the complexity of the system grows explosively. In addition, the existing clean energy system, including wind power generation, photovoltaic power generation, energy storage power stations, and other elements, is significantly affected by natural factors such as weather and light, which aggravates the fluctuation and uncertainty of the power grid system. These two factors make system management and control increasingly difficult, and traditional large-scale power grid management and control methods face more and more challenges. Effective management and control of complex power grid systems to achieve long-term safe and stable operation of large power grids has become a new problem that urgently needs to be solved [3]. Digital twins construct virtual digital images of physical systems in the real world, which run through the entire life cycle of the physical system and evolve dynamically with it. Digital twin technology is deeply embedded in digital business applications and, compared with traditional power system modeling and analysis, further enhances perception, cognition, intelligence, and control [4]. At the same time, the power grid topology simplifies the complex connection relationships between devices and is a universal expression for analyzing the operating status and connectivity of the power grid. However, the traditional topology representation cannot present information such as the physical space and environment of the equipment behind the 2D entity symbols, which makes it difficult to carry out further 3D spatial analysis and deduction for equipment and facilities on the basis of a 2D complex network. At present, the existing digital twins in substations and new energy stations tend to be static three-dimensional models. On the one hand, they

only present the three-dimensional display of physical entities and data superposition, lacking the composite embedded mapping and linkage of power semantics such as electrical topology of equipment, while the generic digital twin modeling is difficult to grow synchronously with the dynamically evolving power system, resulting in the lack of authenticity of digital twins. On the other hand, most of the twins lack the numerical analysis and prediction of their internal characteristics and evolution laws, and there is no good reasoning tool to support the future state analysis of various grid facility twins [5].

7.1.2 Purpose and Significance

In general, the power grid equipment information platform has insufficient virtual-real mapping and interaction capabilities, and it is difficult for it to support business innovation and intelligent interaction in power grid production and operation, safety supervision, and management. The digital twin uses a physical model, sensor updates, operation history, and other information to complete the physical mapping of power grid equipment in virtual space. It has the characteristics of "full life cycle, real-time/quasi-real-time, and two-way interaction" and can be widely used in business scenarios such as equipment and environment monitoring, power transmission and transformation operation and inspection, on-site safety monitoring, and virtual power plants. To realize the transformation and upgrading of traditional power grid services with digital technology and to improve the production efficiency of all factors, it is necessary to realize efficient physical-digital space mapping through the construction of twin models and to promote the development of diagnosis, forecasting, and decision-making applications based on the power grid digital twin [6].

7.1.3 Research Level at Home and Abroad

In recent years, foreign research on digital twin technology has developed rapidly at both the theoretical and application levels. General Electric (GE) and the University of Cincinnati apply digitization from design to maintenance to optimize product production, but have yet to implement a unified modeling technology for digital twins. ANSYS Corporation of the United States proposed the ANSYS Twin Builder technical solution to create digital twins and quickly connect them to the industrial Internet of Things, improving product performance, reducing the risk of unexpected downtime, and optimizing next-generation products. Compared with the rapid development abroad, domestic research on digital twin technology is still in its infancy. Li et al. [7] proposed a digital twin design framework to describe complex products and explored key technologies in the development process. Tao et al. [8] proposed the concept of a digital twin five-dimensional model and discussed the application prospects of this model in ten

different fields. Zhuang et al. [9] analyzed the similarities and differences between big data and digital twin technology from multiple perspectives and discussed how they can jointly promote the realization of intelligent manufacturing. The Industrial Internet Industry Alliance [10] summarizes the key technologies of digital twins in cyber-physical systems and describes the implementation of digital twin technology across the product life cycle.

7.2 Key Technology Research

The general technical route of this work is shown in Fig. 7.1 and is divided into three steps carried out in turn. The first step is to study modeling and driving technology for the data-interconnected, fused mechanism model and information model based on the real-scene twin of the power grid. The data interconnection mechanism between 3D real-scene video data, image data, spatial coordinate data, and monitoring data of main and auxiliary equipment and the real-scene twin of the power grid is studied, and a dual-mode driving method fusing the mechanism model and data model is built. The second step is to study the "virtual-real" space multi-service continuous mapping mechanism and the real-time data panoramic mapping method, to realize continuous and real-time mapping of point data and planar data to digital twins. Following the steps of feature extraction, feature matching, model parameter estimation, transformation, and interpolation, accurate matching of multi-block, multi-level, and multi-type digital twin data is realized. The third step is to study efficient twin data transmission and update-integration technology in the power grid environment: methods to improve the real-time performance and concurrency of data transmission are studied, along with dynamic update and integration technology for multi-source heterogeneous data.

7.2.1 Data Interconnection, Fusion Information Model Modeling, and Driving Technology Based on Power Grid Reality Twins

Facing the need for multi-service immersive interaction with the digital twin as the carrier, the data interconnection mechanism between 3D real-scene video data, image data, spatial coordinate data, and monitoring data of main and auxiliary equipment and the real-scene twin of the power grid is studied. Monitoring-point management, inspection fault handling, and abnormal status tracking can improve the interactive performance of the real-scene 3D digital twin and enhance human-computer interaction capabilities. To support three-dimensional scene inspection and operation visualization of the power operation environment, panoramic three-dimensional interactive operation of the power transmission and distribution scene is realized, and the data of the entire power business scene is visualized.

Fig. 7.1 Total technical route

The comprehensive use of memory file mapping, visibility discrimination, and multi-level LOD technology reduces the number of point cloud points to be rendered, enabling fast and efficient point cloud rendering on ordinary PCs and supporting real-time monitoring of 3D scenes in power transmission and distribution scenarios. Based on multi-dimensional spatiotemporal data fusion analysis technology for substation equipment, neural networks and the D-S reasoning method are used to realize fusion analysis of equipment holographic state data, inspection data, and the real-scene 3D model. Using a data calculation method that considers both the weighted fusion method and the ratio method, a mathematical model of three-level pixel-feature-decision fusion calculation for multi-dimensional data is established, so that the characteristic values corresponding to the physical equipment and the digital twin model can be effectively fused and analyzed.

For effective fusion analysis of sampled data, principal component transformation is used to remove redundancy in the feature data, separate information to eliminate correlation, improve the independence of the fused data, and verify data correctness, so as to achieve effective fusion analysis of the value data. The hat transformation method realizes information compression, simplifying the fusion analysis of big data and of multimedia, heterogeneous, and unstructured data, and realizes effective fusion analysis under the dynamic evolution of multi-temporal data; multi-protocol standardization methods are used to integrate multimedia, heterogeneous, and unstructured data, establish a unified mathematical fusion model, and realize rapid identification of characteristic data; neural networks and the D-S reasoning method are used to establish a high-precision data preprocessing model to improve the accuracy and reliability of the fusion calculation.
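The redundancy-removal step can be illustrated with a minimal sketch using scikit-learn's PCA on synthetic stand-in data; the data shapes and the 99%-variance threshold are assumptions, not values from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for multi-dimensional equipment state samples
# (rows = samples, columns = partially correlated sensor features).
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 4))
features = np.hstack([base, base[:, :2] + 0.01 * rng.normal(size=(500, 2))])

# Principal component transformation removes redundancy and decorrelates
# the features before they enter the fusion calculation.
pca = PCA(n_components=0.99)  # keep components explaining 99% of variance
decorrelated = pca.fit_transform(features)
print(features.shape, "->", decorrelated.shape)
```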

7.2.2 Multi-service Continuous Mapping Mechanism and Real-Time Data Panorama Mapping Method in Power Grid "Virtual-Real" Space

In the mapping from physical space to digital space at the geometric-behavior level, data must be mapped onto digital twins, and through data visualization methods, multi-source heterogeneous data can be transmitted effectively. In data visualization, the main problems in the processing pipeline are multi-source heterogeneous data integration, the real-time performance and concurrency of data transmission in the monitoring system, and the visual mapping of multi-source heterogeneous data. First, for single-dimensional device monitoring state data, the point-like device state data are attached to anchor points of the digital twin through the association between the device and the model entity. Secondly, for fixed-point cameras deployed in the actual scene, a 2D-3D registration algorithm is used to accurately calibrate the spatial coordinates of inspection points; the inspection points are matched to multi-angle cameras and logically associated with anchor points in the digital twin, and the real-time continuous images collected by the cameras are projected into the digital twin through coordinate transformation. Then, for mobile inspection terminals in the actual scene, on the basis of sensor data collection, the 2D sensor observation data are transformed and mapped into the 3D digital twin through self-positioning and perspective coordinate transformation, and a 3D engine such as WebGL is used to project and render the real-time sensor data, thereby providing a visual monitoring function for panoramic spatiotemporal scene data. Based on multi-dimensional digital twin fusion and registration technology for substation equipment, following the steps of feature extraction, feature matching, model parameter estimation, transformation, and interpolation, accurate matching of multi-block, multi-level, and multi-type digital twin data is realized.

Taking image registration as an example, the partial volume interpolation method is used to linearly distribute the gray values of the digital twin, improving the smoothness of the objective function distribution and realizing image registration; this solves the interface problems among the various kinds of interactive information in registration applications, increases registration accuracy, and realizes matching of multi-dimensional and multi-temporal data. The image registration method based on transform-domain information is adopted, and statistical feature matching and a subspace search method are used for multi-resolution matching to obtain the optimal registration parameter set. An image registration algorithm based on a similarity measure function is used to register the feature images and information, the correspondence between registration elements is established, and the geometric transformation model between the reference image and the floating image and its reference values are calculated, achieving image registration with high accuracy for both feature and non-feature regions. The region growing method, image cutting method, and second-order difference curve detection algorithm are used to determine the transformation parameters, realizing accurate feature matching between the device entity and the 3D digital twin.
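The coordinate transformation that projects observations between the camera and the 3D twin can be illustrated with the standard pinhole model, x = K [R | t] X. The following NumPy sketch is only an illustration of that transformation; the intrinsics and extrinsics shown are made-up values, not parameters from the system described here.

```python
import numpy as np

def project_to_image(points_3d, K, R, t):
    """Project 3D twin-model points into a fixed camera's 2D image plane
    using the homogeneous pinhole model x = K [R | t] X."""
    pts = np.asarray(points_3d, dtype=float)   # (N, 3) world points
    cam = R @ pts.T + t.reshape(3, 1)          # world -> camera frame
    pix = K @ cam                              # camera -> image plane
    return (pix[:2] / pix[2]).T                # normalize to pixel coords

# Illustrative intrinsics/extrinsics for a fixed inspection camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
print(project_to_image([[1.0, 0.5, 10.0]], K, R, t))
```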

7.2.3 Efficient Transmission and Update Integration Technology of Twin Body Data in the Power Grid Environment

The real-time performance and concurrency of multi-source heterogeneous data transmission are important indicators of a virtual monitoring system: real-time performance is the basis for its stable operation, and concurrency reflects its load capacity. On the one hand, the real-time multi-source heterogeneous data in the monitoring system are needed for real-time visual monitoring; on the other hand, they need to be persisted to support data traceability. If the integrated data are sent directly to the visual mapping module and persisted there, a relatively large heap space is required in the computer for storing constants, which greatly affects the operation of the virtual monitoring system. If the integrated data are persisted first and then sent to the visual mapping module, real-time performance suffers. In addition, highly concurrent requests for data interaction can affect data correctness. This problem can be solved well by realizing an asynchronous data transmission mode with the help of caching technology and message queue technology: visual mapping requires high real-time performance, so real-time integrated data enter the cache for visual mapping; data interaction and data persistence place higher demands on data correctness, so each operation request

is added to the message queue and then taken off the queue to be executed. To realize multi-source heterogeneous data integration: data are the core of operation and maintenance monitoring. On the one hand, there are many data sources, such as business systems and sensors; on the other hand, the data structures differ, spanning structured and unstructured data. Because of this multi-source heterogeneity, data access in the monitoring system is relatively complicated, and system development is relatively difficult. To improve the performance of data access and the efficiency of system development, it is necessary to integrate multi-source heterogeneous data: related data from different sources and of different structures are integrated, thereby improving data consistency and data utilization. The difficulties of multi-source heterogeneous data integration in the monitoring process lie in two aspects: (1) Heterogeneity: the data models are heterogeneous, and the types, semantics, and forms of expression of the data differ, so the first problem to be solved in data integration is data encapsulation and unified expression. (2) Distribution: each data source is relatively independent, and there are problems in the performance and development difficulty of data collection and transmission, so efficient data collection and transmission methods must be designed. Therefore, algorithms for the unified expression and integration of multi-source heterogeneous data need to be designed, and data collection needs to be designed at the monitoring-system level.
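The cache-plus-message-queue asynchronous mode described at the start of this subsection can be sketched as follows — a minimal Python illustration using an in-process dict as the real-time cache and a standard-library queue as the persistence queue; in a production system a dedicated cache and message broker would take these roles, and the function names are illustrative.

```python
import queue
import threading

realtime_cache = {}              # latest values, read by the visual mapping
persist_queue = queue.Queue()    # ordered requests, drained asynchronously

def on_sensor_sample(sensor_id, value):
    # Real-time path: update the cache immediately for visualization.
    realtime_cache[sensor_id] = value
    # Correctness path: enqueue the sample for ordered persistence.
    persist_queue.put((sensor_id, value))

def save_to_database(sensor_id, value):
    pass  # placeholder for the actual write to the monitoring database

def persistence_worker():
    while True:
        sensor_id, value = persist_queue.get()  # blocks until a request arrives
        save_to_database(sensor_id, value)
        persist_queue.task_done()

threading.Thread(target=persistence_worker, daemon=True).start()
```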

7.3 Conclusion

This paper proposes a spatial information mapping, embedding, and interconnection solution for the power grid digital twin, which provides panoramic multi-dimensional information resources for power grid production collaboration and addresses the problems of difficult entity semantic labeling, incomplete spatiotemporal interconnection of information, and insufficient depth of information fusion.

Acknowledgements This work is supported by the National Key R&D Program of China (Research and Pilot Plant of Digital Twin Technology for Clean Energy System in Green Low-Carbon Community, No. 2022YFE0105200).

References

1. National Development and Reform Commission and National Energy Administration: "14th Five-Year Plan for Modern Energy System" (2022). https://www.ndrc.gov.cn/xxgk/zcfb/ghwb/202203/t20220322_1320016.html?state=123&code=&state=123. (In Chinese)
2. Huang, Z.: Distribution network planning practice based on power supply reliability. Distrib. Utiliz. 34(3), 42–46 (2017)
3. Li, X.S., Wang, X.: Metagrid: system and architecture of a new generation of smart grid based on parallel grid. Chin. J. Intell. Sci. Technol. 3(4), 387–398 (2021). (In Chinese)
4. Li, H., Wang, H.Q., et al.: Concept, system structure and operating mode of industrial digital twins system. Comput. Integr. Manuf. Syst. 27(12), 3373–3390 (2021). (In Chinese)
5. Dong, L., Song, H., et al.: Application of twin system technology in smart substation. Telecom. Power Technol. 38(4), 146–148 (2021). (In Chinese)
6. Liu, H.L.: Digital twin technology for smart energy system and its application. Electron. Comp. Inf. Technol. 4(12), 168–169 (2022). (In Chinese)
7. Li, H., Tao, F., Wang, H.Q., et al.: Integrated development framework and key technologies of complex product design and manufacturing based on digital twin. Comput. Integr. Manuf. Syst. 25(6), 1320–1336 (2019). (In Chinese)
8. Tao, F., Liu, H.R., Zhang, M., et al.: Digital twin five-dimensional model and application in ten fields. Comput. Integr. Manuf. Syst. 25(1), 1–18 (2019). (In Chinese)
9. Zhuang, C.B., Liu, J.H., Xiong, H., et al.: The connotation, architecture and development trend of product digital twin. Comput. Integr. Manuf. Syst. 23(4), 753–768 (2017). (In Chinese)
10. Industrial Internet Industry Alliance: Industrial Digital Twin White Paper (2021). (In Chinese)

Chapter 8

Prediction of Breast Cancer Via Deep Learning

Yihe Huang

Abstract With the continuous progress of machine learning in image processing, artificial neural networks are finding more and more applications in medical image processing. Whereas earlier work selected a BP neural network for the diagnosis of breast cancer pathological images, this paper selects a deep learning method for prediction, which improves the prediction accuracy and achieves a more effective diagnosis.

Keywords Deep learning · Breast cancer · Pathology image · Cancer prediction

Y. Huang (B)
College of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_8

8.1 Introduction

In recent years, the industrial transformation driven by artificial intelligence (AI) has made remarkable progress all over the world. In the medical field, AI services such as IBM Watson Health are widely used for diagnosis support [1]. Meanwhile, the application of AI-based big data analysis in medical informatics is also advancing; one such application is to predict the probability that a sample is cancerous. In 2020, there were almost 10 million deaths from cancer worldwide. Notably, compared with the 2018 Global Cancer Statistical Report, in 2020 female breast cancer overtook lung cancer as the cancer with the highest incidence in the world, with a total of 685,000 deaths [2]. The diagnosis of breast cancer mainly relies on two methods: pathological image diagnosis and medical image diagnosis. Of these, pathological image diagnosis is the most reliable; compared with methods such as medical image diagnosis, it is more accurate and more suitable as a basis for diagnosis. With the rapid development of computing in recent years, deep learning is making inroads into image processing

[3], and medical images, as a kind of structured data, are easily handled by deep learning approaches. Deep learning has been widely and thoroughly studied [4–6] and applied in areas such as breast cancer, prostate cancer, and brain disease, where its diagnostic accuracy has reached over 95%. This approach does not require operators to have expertise in cancer tissue identification, but rather replaces manual extraction of pathological tissue features with automatic extraction [7]. Using a BP neural network built with sklearn's MLPClassifier — establishing the model by dividing the data into a training set and a test set and performing visual analysis — a prediction accuracy of about 94% can be achieved; at the same time, such methods can reduce the workload of doctors and therefore have high practical value. In previous studies, Dong et al. proposed a feature-fusion-based convolutional neural network method for breast cancer image classification, which exploits the automatic feature extraction of convolutional neural networks: features extracted from two structures are first fused, and the fused features are then classified by a classifier; the method achieved a classification AUC of 89% on the breast cancer image dataset BCDR-F03 [8]. Yuan used a BP neural network, with its excellent feedforward and error back-propagation mechanisms, to analyze and classify breast cancer pathology data with an accuracy of 95.1% [9]. Zhang et al. chose to fuse the GoogLeNet and AlexNet models, which have performed well in image classification, for the diagnosis of gastric cancer pathology images, achieving a sensitivity of 97.6% and a specificity of 99% [10]. In this paper, the deep learning method is applied to the identification of breast pathological sections, and the breast cancer dataset of the BreakHis database is used to further improve the prediction accuracy.

8.1.1 Introduction to Deep Learning

Deep Learning and Convolutional Neural Networks. Deep learning is a branch of machine learning that uses artificial neural networks as an architecture for representation learning on data. Since the concept of deep learning was introduced in 2006, it has continued to make breakthroughs in many fields such as computer vision [11]. The idea of deep learning is to build a neural network containing multiple hidden layers that represents the attributes and features of the data by combining lower-level features into more abstract ones, discovering a distributed feature representation of the data. Compared with the shallow networks of conventional machine learning, deep learning simulates the operating principle of human brain neurons using deeper networks and nonlinear mapping structures. This makes it possible to overcome the vanishing-gradient problem of conventional machine learning methods and their tendency to fall into locally optimal solutions [12]. Figure 8.1 shows a simple deep learning network model.

Fig. 8.1 A simple network model

Neural networks are mainly composed of three parts: neurons, layers, and networks. Each layer has a number of neurons, and the layers are connected to form a neural network. The difference between deep learning and classical machine learning is that deep learning uses a deeper network structure, as shown in Fig. 8.1; however, as the network gets deeper, the error nearly vanishes when it is propagated back to the earlier layers. A convolutional neural network (CNN) is a feedforward neural network whose origins date back to 1962, when the two biologists Hubel and Wiesel, through their study of cat vision, proposed the concept of the receptive field in the cat's visual cortex and found that some cells were very sensitive to visual input [13]. Kunihiko Fukushima proposed the neocognitron model based on Hubel and Wiesel's local receptive field theory in 1980; this was the first CNN model to be implemented [14]. The advantages of convolutional neural networks lie in local connections and weight sharing. In a conventional artificial neural network, each neuron is connected to all neurons in the adjacent layers, and the computational complexity increases exponentially as the network scale grows. The convolutional neural network absorbs the concept of the local receptive field: in a convolutional layer, neurons connect only to the neurons in the receptive field of the preceding layer through convolution kernels, effectively controlling the parameter scale and computational complexity. The convolutional neural network structure includes convolutional layers, pooling layers, and fully connected layers; convolutional and pooling layers appear alternately, and the fully connected layers are typically placed at the end of the network. Convolutional neural networks can efficiently extract features from data [5]. Figure 8.2 is a schematic diagram of convolution and pooling operations on pathological slices. Based on the properties of the image, local features can be abstracted by performing convolution operations on all the different receptive fields using a shared convolution kernel. For a multi-channel feature map, the calculation of a single target feature point after convolution is shown in Formula (8.1):

z = f(ω_d1^T x_d1 + ω_d2^T x_d2 + ... + ω_dn^T x_dn + b)    (8.1)

Fig. 8.2 Schematic diagram of convolution operation on pathological slices

where ω_di denotes the convolution kernel weights for input channel di of the target feature map (the superscript T denotes transpose), x_di denotes the corresponding channel of the input feature map, dn denotes the position of each pixel point, and b denotes the bias value. When building a convolutional neural network, a pooling layer is often used after the convolutional layer to reduce the feature dimension of the convolutional output, which effectively reduces the network parameters and prevents overfitting, thus improving training efficiency. Pooling methods include average pooling, combinatorial pooling, maximum pooling, random pooling, etc. Compared with fully connected networks, convolutional neural networks have strong robustness, especially when the image data are large [15].

Keras Framework. Keras is a deep learning framework featuring ease of use, abstraction, compatibility, and flexibility. Keras includes many kinds of modules, including an optimization method selection module, an objective function module, an activation function module, a parameter initialization module, a layer module, a preprocessing module, and a model selection module. The optimization method, objective function, and activation function modules integrate many of the best recent optimization methods, objective functions, and activation functions. Using these functions can greatly speed up modeling and is critical to improving the neural network's parameters, which is a very big advantage of this framework.
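As a small illustration of assembling these modules, the following is a minimal Keras Sequential CNN — convolution and pooling blocks followed by a fully connected classifier. This generic sketch is not the model used later in this chapter; the input size and layer widths are arbitrary assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# A minimal convolutional network built from Keras modules:
# alternating convolution + pooling, then a fully connected classifier.
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),  # two output classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```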

8.2 Approach

In this paper, we use a deep learning approach to identify breast pathological sections and further improve prediction accuracy using the breast cancer dataset from the BreakHis database.

8.3 Method

8.3.1 Experimental Tool

First, the pathological images are preprocessed by stacking, cropping, etc., to obtain a data set that can be used to train a neural network; the processed data are then fed to the model for training. Anaconda 3 and PyCharm are used as the basic development environment. The development and training tools use scikit-learn, the free machine learning library for the Python programming language, and the matplotlib library is used for 2D plotting. The deep learning framework uses the Sequential model provided by Keras, which provides methods to define complete computational graphs. By adding layers to an existing model or computation graph, complex neural networks, including CNNs, RNNs, and more, can be built; by stacking many layers, a deep neural network is constructed. This brings great convenience to the application of deep learning.

8.3.2 Data Set

In this experiment, the Breast Cancer Histopathological Image Classification database (BreakHis) was used. BreakHis contains images of microscopic biopsies of benign and malignant breast tumors. Institutional review board approval was obtained and all patients provided written informed consent; all data are anonymous. To date, it contains 2,480 benign samples and 5,429 malignant samples [16–18]. A sample of the data set is shown in Fig. 8.3.

8.3.3 Experimental Indicators

In the study of binary classification problems, experiments are often judged using the precision, recall, and F1 metrics, which are based on true positive cases (TP), true negative cases (TN), false positive cases (FP), and false negative cases (FN),

Fig. 8.3 Malignant tumors of the breast as observed at different magnification rates: a 40×, b 100×, c 200×, and d 400×

where precision (Formula 8.2) reflects the proportion of correctly predicted samples among all samples predicted as positive, and recall (Formula 8.3) is defined with respect to the actual positive samples and reflects how many of them are accurately predicted:

P = TP / (TP + FP)    (8.2)

R = TP / (TP + FN)    (8.3)

The F1 score is the harmonic mean of the precision and recall of the classification model, with a maximum value of 1 and a minimum value of 0; the higher the F1 value, the better the model's predictions. The formula is shown in Formula (8.4):

F1 = 2 * (P * R) / (P + R)    (8.4)
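Since scikit-learn is part of the experimental toolchain, these indicators can be computed directly from predictions; the labels below are toy values for illustration only.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # toy ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # toy model predictions

P = precision_score(y_true, y_pred)  # TP / (TP + FP), Formula (8.2)
R = recall_score(y_true, y_pred)     # TP / (TP + FN), Formula (8.3)
F1 = f1_score(y_true, y_pred)        # 2PR / (P + R),  Formula (8.4)
print(P, R, F1)
```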

8.3.4 Structural and Parametric Design

Load and Tag Images. Images are classified into benign and malignant, and all images are loaded from the corresponding folders. The NumPy library is then used to create arrays and import the data.

Classification Dataset. The data set is divided into two groups: 80% training set and 20% test set. Some sample benign and malignant images are shown in Fig. 8.4.

Model Building. DenseNet-201 is a convolutional neural network that is 201 layers deep [19]. DenseNet201 is used as a pre-trained model; using a pre-trained model reduces the number of model parameters and simplifies model construction. DenseNet201 supplies the pre-trained weights, and the learning rate is set to 0.0001.

Fig. 8.4 Some of the benign and malignant images

Fig. 8.5 a Standard neural net; b after applying dropout

On this basis, a GlobalAveragePooling layer and a Dropout layer are used to reduce overfitting. The purpose of the GlobalAveragePooling (GAP) layer is to replace the traditional fully connected layers in a CNN and reduce the number of parameters; GAP enforces the correspondence between feature maps and categories. For a convolutional structure this conversion is natural, and since the GAP layer has no parameters to optimize, it helps avoid overfitting. Dropout can also significantly reduce overfitting by ignoring half of the feature detectors (setting half of the hidden layer nodes to 0) (Fig. 8.5). Batch normalization (BatchNormalization) is used, together with a fully connected layer of 2 neurons with softmax as the activation function for the 2 output classes, benign and malignant. Using BN reduces gradient vanishing, speeds up convergence, and can also prevent overfitting to some extent. Finally, the Adam optimizer is used, with binary cross-entropy (Formula 8.5) as the loss function:

Loss = -(1/N) Σ_{i=1}^{N} [ y_i · log(p(y_i)) + (1 - y_i) · log(1 - p(y_i)) ]    (8.5)

where y_i is the binary label (0 or 1) and p(y_i) is the predicted probability that the output belongs to label y_i. Figure 8.6 shows the output shapes and parameters of each layer.
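A minimal Keras sketch of the architecture just described follows. The DenseNet201 base, GAP, Dropout, BatchNormalization, 2-neuron softmax head, Adam optimizer (learning rate 0.0001), and binary cross-entropy loss are taken from the text; the input image size and dropout rate are assumptions not stated in the paper.

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import DenseNet201

base = DenseNet201(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))  # input size is an assumption

x = layers.GlobalAveragePooling2D()(base.output)  # replaces dense layers
x = layers.Dropout(0.5)(x)                        # rate is an assumption
x = layers.BatchNormalization()(x)
outputs = layers.Dense(2, activation="softmax")(x)  # benign / malignant

model = models.Model(inputs=base.input, outputs=outputs)
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy",  # as in Formula (8.5)
              metrics=["accuracy"])
```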

8.3.5 Experimental Results and Analysis

Model Training. Following the design above, the model was programmed using the PyCharm tool, and the ReduceLROnPlateau function was set so that the learning rate (Fig. 8.7) is reduced when the monitored metric stops improving. The ModelCheckpoint function was used to save the best model during training. A confusion matrix (Fig. 8.8) is output during training; using a confusion matrix overcomes the limitations of relying on classification accuracy alone.
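The two callbacks can be wired up as in the sketch below; the monitored quantities, factor, patience, checkpoint filename, and the train_data/val_data datasets are assumptions for illustration, as the paper does not state them.

```python
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

callbacks = [
    # Reduce the learning rate when validation loss stops improving.
    ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                      patience=2, min_lr=1e-6),
    # Keep only the best model seen during training.
    ModelCheckpoint("best_model.h5", monitor="val_accuracy",
                    save_best_only=True),
]
history = model.fit(train_data, validation_data=val_data,
                    epochs=20, callbacks=callbacks)
```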

Fig. 8.6 Each layer of learning model

Fig. 8.7 Learning rate and loss

Result. The experimental results use the AUC value as the criterion and output the ROC curve (Fig. 8.9). After 20 iterations, the final results are obtained (Table 8.1). The highest accuracy reaches 98.3%.

Fig. 8.8 Confusion matrix

Fig. 8.9 ROC curve

Table 8.1 Result

Accuracy (%)  Precision  Recall  F1 score  ROC-AUC
98.3          0.65       0.95    0.77      0.692

8.4 Conclusion

In this paper, a neural network is built through deep learning to learn from and classify breast pathology image data in order to achieve breast cancer prediction. The softmax activation function in the Keras deep learning framework and the adaptive gradient descent method Adam are combined with the neural network, so that the

network has better performance. At the same time, ModelCheckpoint and ReduceLROnPlateau within the framework are used to speed up model training and optimize the training procedure. The verification results show that the model can be used to predict breast cancer with high prediction accuracy, which provides good guidance and is significant for practical medical applications.

References

1. IBM Watson Health: Health care provider solutions for the modern health system. https://www.ibm.com/watson-health/solutions/healthcare-provider. Accessed 21 July 2021
2. Zhang, et al.: Global and Chinese breast cancer incidence and death trends. Electron. J. Compreh. Cancer Treat. 7(02), 14–20 (2021)
3. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
4. Chen, H., Dou, Q., Wang, X., Qin, J., Heng, P.A.: Mitosis detection in breast cancer histology images via deep cascaded networks. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
5. Kwak, J.T., Hewitt, S.M.: Multiview boosting digital pathology analysis of prostate cancer. Comput. Meth. Prog. Biomed. 142, 91–99 (2017)
6. Sarraf, S., Tofighi, G.: Deep learning-based pipeline to recognize Alzheimer's disease using fMRI data. In: 2016 Future Technologies Conference (FTC), pp. 816–820. IEEE (2016)
7. Zhang, Q.L., Zhao, D., Chi, X.B.: Review of medical imaging diagnosis based on deep learning. Comput. Sci. 44(B11), 1–7 (2017)
8. Dong, Y.F., et al.: Feature fusion based convolutional neural network for breast cancer image classification. J. Hebei Univ. Technol. 47(06), 70–74 (2018)
9. Yuan, X.H.: Analysis and prediction of breast cancer diagnosis based on BP neural network. Dalian University of Technology, MA Thesis (2019)
10. Zhang, Z.C., et al.: A deep learning-based approach to classify pathological images of gastric cancer. Comput. Sci. 45(S2), 263–268 (2018)
11. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
12. Chen, X.Y.: The application of machine learning algorithms in data mining applications. Mod. Electron. Technol. 38(20), 11–14 (2015)
13. Kandel, E.R.: An introduction to the work of David Hubel and Torsten Wiesel. J. Physiol. 587(Pt 12), 2733 (2009)
14. Fukushima, K.: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193–202 (1980)
15. Qiu, W.G., Xie, J., Shen, Y., Xu, J., Liang, J.: Endoscopic image recognition method of gastric cancer based on deep learning model. Expert. Syst. 39(3), e12758 (2021)
16. Spanhol, F.A., Oliveira, L.S., Petitjean, C., Heutte, L.: A dataset for breast cancer histopathological image classification. IEEE Trans. Biomed. Eng. 63(7), 1455–1462 (2015)
17. Spanhol, F.A., Oliveira, L.S., Petitjean, C., Heutte, L.: Breast cancer histopathological image classification using convolutional neural networks. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 2560–2567. IEEE (2016)
18. Spanhol, F.A., Oliveira, L.S., Cavalin, P.R., Petitjean, C., Heutte, L.: Deep features for breast cancer histopathological image classification. In: 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1868–1873. IEEE (2017)
19. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)

Chapter 9

Scheme Design of Network Neighbor Discovery Algorithm Based on the Combination of Directional Antenna and Multi-channel Parallelism

Na Zhao, Fei Gao, and Kaijie Pu

Abstract Existing designs suffer from dynamic blind zones despite high energy efficiency, which lowers connection quality in places such as airports. An asynchronous, heterogeneous, multi-channel fusion mobile Ad-Hoc network with multiple channels and multiple interfaces is an inevitable choice for further improving communication bandwidth and communication quality. The key problem is to study and design a fast neighbor discovery strategy that integrates "directional antenna and multi-channel parallelism" under limited energy. This paper proposes a neighbor discovery algorithm combining directional antennas and multi-channel parallelism. More specifically, our framework uses a multi-channel, multi-interface asynchronous, heterogeneous, multi-channel fusion mobile Ad-Hoc network, and the optimized algorithm enables fast neighbor discovery in energy-constrained situations.

Keywords Ad-hoc network · Network neighbor discovery algorithm · Directional antenna · Multi-channel parallelism

9.1 Introduction

With the rapid development of airport WIFI applications, a single-channel connection can hardly meet the requirements of fast, high-quality connections. As shown in Fig. 9.1, considering the irregularity of nodes in the practical application of airports,

N. Zhao (B) · F. Gao Beijing Polytechnic, Beijing, China e-mail: [email protected] F. Gao e-mail: [email protected] N. Zhao Assumption University, Bangkok, Thailand K. Pu China Power Complete Equipment Co. Ltd., Beijing, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_9


Fig. 9.1 When a WIFI user moves from one zone to another, there may be a blind zone

single-channel neighbor discovery algorithms face problems such as communication blind zones, communication delay, and low communication quality [1]. Multi-channel operation can further reduce the neighbor discovery delay and improve communication quality. A good multi-channel design can solve these problems and reduce energy consumption at the same time. Furthermore, the opportunities for neighbor discovery can be greatly increased, thereby improving energy efficiency.

9.2 Neighbor Discovery Protocol Based on Directional Antenna and Multi-channel

9.2.1 Efficient Unidirectional Neighbor Discovery Strategy

A directional antenna has a high transceiver gain in a specific direction but a small gain in other directions, so it has the advantages of long transmission distance, strong anti-interception ability, and high channel utilization [2]. However, the introduction of directional antennas makes neighbor discovery more difficult, and neighbors must meet the following three conditions for mutual discovery [3, 4]: (1) both sides of the communication must exchange information at the same time; (2) the beams of both sides must point at each other at the same time; (3) the distance between the two nodes must be less than the maximum transmission distance in the scenario. The time required for the RF module to switch from one direction to another is far less than its conversion time from sleep to wake-up. Therefore, in research on directional antennas, two wake-up time slots of different lengths can be considered to further improve performance. Although this idea is similar to multi-channel parallelism, an appropriate algorithm must be designed according to the characteristics of directional antennas. On the other hand, in a unidirectional


Fig. 9.2 Communication success case

Fig. 9.3 Collision case

connection, one node A can receive a signal from another node B, but node B may not be able to receive a signal from node A. Wireless connections are inherently unreliable and asymmetric, which naturally leads to unidirectional neighbor discovery. Effective use of the one-way connection characteristics of wireless networks for data communication helps to improve network connectivity, data transmission capacity, and reliability. In fact, the use of directional antennas increases the radiation irregularity of nodes, which inevitably increases the number of unidirectional connections. For two nodes with directional antennas to communicate with each other, one node needs to be in the receiving state and the other in the transmitting state, and their transceiver beams must cover each other. Therefore, the two nodes must point their beams at each other at the same time, with opposite transmit and receive states. If two or more beams reach the receiving beam of a node, a message collision occurs, after which the node does not receive the packet [5]. In sub-slot 1, node A sends and node B receives. In sub-slot 2, the two nodes swap their transmit and receive states; the case of successful communication is shown in Fig. 9.2. A collision occurs when a node simultaneously receives discovery packets from two or more of its neighbors in a given beam, as shown in Fig. 9.3. In this case, the node cannot parse the information and therefore considers the packet lost. Each node can distinguish between an empty time slot and a collision time slot with a simple energy detector.
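The following toy simulation illustrates the success and collision cases just described; it is a minimal sketch under simplifying geometric assumptions (opposite beam sectors face each other), not the paper's algorithm.

```python
# Toy model of directional-antenna discovery in one sub-slot (illustrative only).
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    mode: str      # "tx" or "rx"
    beam: int      # beam sector the antenna points to (0..sectors-1)

def hears(receiver, transmitters, sectors=4):
    """Return the discovered transmitter, or None on an empty slot / collision."""
    # A transmitting node reaches the receiver only if their beams face each
    # other, modeled here as opposite sectors (a simplifying assumption).
    facing = [t for t in transmitters
              if t.mode == "tx" and (t.beam + sectors // 2) % sectors == receiver.beam]
    if len(facing) == 1:
        return facing[0]           # successful communication (Fig. 9.2)
    return None                    # empty slot, or >=2 beams collide (Fig. 9.3)

a = Node("A", "rx", beam=0)
b = Node("B", "tx", beam=2)
c = Node("C", "tx", beam=2)
print(hears(a, [b]))        # node B is discovered
print(hears(a, [b, c]))     # None: two packets in the same beam -> collision
```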

9.2.2 Multi-channel Parallel Neighbor Discovery Strategy

Some achievements have already been made in the research of multi-channel neighbor discovery algorithms, for example, the algorithm based on lottery-collection theory and flooding mode, the EasiND algorithm [6], and the McDisc algorithm [7]. In the multi-channel case, nodes can send and receive data in parallel on different channels, and adjacent links can transmit at the same time without conflict [8]. EasiND [6] is a packet-based multi-channel neighbor discovery algorithm, but its grouping algorithm is complex to implement, has a large overhead, and yields limited


performance improvement. McDisc [7] is a multi-channel neighbor discovery algorithm based on middleware. It uses middleware to obtain clock synchronization and then allows each node to set its own working channel according to the global clock. It ensures that any two physical neighbor nodes work in the same channel whenever their wake-up slots overlap; thus, neighbor discovery is realized while the discovery delay stays consistent with the single-channel mode. Although existing neighbor discovery algorithms have achieved good energy efficiency, problems remain; for example, some algorithms do not consider packet collisions [9], and the discovery delay increases exponentially after a conflict occurs. Recent studies on single-channel neighbor discovery are based on time slots: time is divided into multiple slots, and in each slot a node chooses to wake up or sleep according to a certain strategy. Existing single-channel algorithms have adopted wake-up and sleep time slots of unequal length, which improves performance; however, wake-up time slots of different lengths have not been used. The main reason for assuming wake-up time slots of equal length is that the size of the wake-up slot is generally determined by the hardware, especially by the RF module's conversion time from sleep to wake-up. Some multi-channel serial neighbor discovery algorithms have been proposed to address conflicts and communication quality, but serial multi-channel operation increases the difficulty of discovery. The RF module needs its circuitry to complete a series of actions from sleep to wake-up; in particular, it takes a certain time for the carrier frequency to rise to a high frequency (such as 2.4 GHz). For multi-channel operation, however, switching the RF module from one channel to another takes less time than switching from sleep to wake-up. Thus, we can consider a wake-up strategy in which the wake-up time slots are of unequal length across channels. This opens up a new line of research in which multi-channel parallelism differs from single-channel neighbor discovery, so as to further improve the energy efficiency of neighbor discovery. The problems to be solved after introducing multi-channel are the following: whether the discovery delay added by the increased difficulty of discovery can be offset by the delay reduction gained from fewer conflicts, and how to balance the two; this balance is the breakthrough point in the quantitative theory of the strategy.

9.2.3 A Fast Neighbor Discovery Strategy Combining Directional Antennas and Multi-channel Parallelism

The use of directional antennas and multi-channel parallel operation can both improve the energy efficiency of neighbor discovery; thus, we can combine the two strategies to achieve better performance. The purpose of this paper is to design a fast neighbor discovery strategy based on the combination of directional antennas and multi-channel parallelism that achieves optimal energy efficiency. In the three-dimensional scenario of "time-channel-main lobe


cumulative coverage width", the problem is described and formalized for "directional antenna and multi-channel parallelism" jointly, and finally the strategy and algorithm with optimal energy efficiency are obtained.

9.3 Design of Combining Directional Antenna and Multi-channel Parallelism

9.3.1 Design of Simulation and Experimental Verification System

The network topology of the project is expressed as G(t) = (V, E(t)), where V represents the set of terminal nodes (handheld terminals, on-board mobile terminals, fixed terminals, etc.) and E(t) denotes the set of directed edges e(i, j), which changes over time. The system has a basic time unit δ; within one time step δ, each node is guaranteed to send a single data packet. The lengths of the sleep and wake-up time slots are integer multiples of the basic time unit (1, 2, etc.); the specific value depends on the theoretical analysis and the particular algorithm strategy. Each node continuously switches between waking and sleeping states according to the work schedule set by the neighbor discovery algorithm. Generally, a node can receive messages only when it is awake; when necessary, however, a node can actively wake up and send messages at any time. We assume G_i = {(t_i^1, d_i^1), (t_i^2, d_i^2), ..., (t_i^m, d_i^m)} represents the work schedule of node i, where t_i^j and d_i^j respectively denote the start time and duration of the jth wake-up slot of node i. The duty cycle of node i is then D_i = (\sum_{j=1}^{m} d_i^j)/T, where T represents the life cycle of the node. In the simulation process, each node is first given the simulation capability of one-way connection and multi-channel parallel operation. Then, the algorithms integrating one-way connection and multi-channel parallel operation are implemented, together with the simulation design of the dynamic blind zone. Furthermore, the designed mechanisms and strategies are implemented in software. Finally, the simulation experiments and performance analysis are carried out. The main performance indicators of the simulation analysis include: average neighbor discovery delay between nodes inside and outside the blind zone; average neighbor discovery delay between nodes in the blind zone; average neighbor discovery delay of the network; average discovery probability of the network; average number of neighbors discovered by a node per unit time; energy efficiency; etc. The basic settings of the simulation are as follows: the network size is 1000 m × 1000 m; the node radiation radius varies from 10 to 100 m; the radiation irregularity of nodes is set to 0–60%; the duty cycle of a node ranges from 0.5 to 6%; the moving speed of nodes ranges from


0 m/s to 20 m/s; the node degree varies from 1.5 to 30; the generation rate of dynamic blind zones is set to 1/s–31/s; the horizontal main lobe width of the directional antenna is set to 45–180 degrees; the number of parallel channels for node communication is set to 1–16 (note that setting it to 1 reduces to single-channel mode); and the node mobility model is set to waypoint and hotspot. Secondly, the modules and network nodes required for the experimental verification system are developed, including directional antenna modification, multi-channel parallel control module modification, and node production. We then verify the actual performance of the various mechanisms and strategies, including the energy-efficient multi-channel parallel neighbor discovery mechanism, the unidirectional neighbor discovery mechanism based on directional antennas, and the fast neighbor discovery strategy integrating directional antennas and multi-channel parallelism.
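As a small worked example of the duty-cycle definition given above, the following snippet computes D_i for a hypothetical work schedule (the slot values are illustrative, not from the paper's experiments):

```python
# Duty cycle D_i = (sum of wake-up slot durations) / T, per the definition above.
def duty_cycle(schedule, lifetime):
    """schedule: list of (start_time, duration) wake-up slots of node i."""
    return sum(d for _, d in schedule) / lifetime

# Hypothetical work schedule: three wake-up slots over a life cycle of T = 1000 units.
G_i = [(0, 5), (200, 5), (700, 10)]
print(duty_cycle(G_i, 1000))  # 0.02 -> a 2% duty cycle
```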

9.3.2 Design of a Fast Neighbor Discovery Strategy Combining Two Mechanisms

In the specific scheme model mentioned above, a rectangular matrix with length t and width n is first constructed. The wake-up timeslot of a node is divided into an L timeslot (listening timeslot, of size τ) and a B timeslot (whose size is a multiple of the basic time unit δ).

G_B = \begin{cases} aD + bD^2 + m, & D_{\min} \le D \le D_{\max} \\ 0, & D < D_{\min} \ \text{or} \ D > D_{\max} \end{cases} \quad (12.6)

where G_B is the biomass production (g/m²); D_min and D_max are the minimum and maximum depths (m), respectively; and the coefficients a, b, and m are chosen so that G_B is 0 at D_min and D_max and equals the maximum biomass G_Bmax at the vertex of the parabola. In the model, D_min is set to 0, and D_max is computed as a function of the tidal range. The second vegetation growth pattern is based on an ecological framework (hereinafter referred to as the ecological distribution of vegetation biomass), which includes the processes of cloning, photosynthesis, and withering of vegetation. This growth pattern uses a simplified canopy-based vegetation photosynthetic growth model [13] to calculate the net primary productivity of vegetation. The biomass and its


spatio-temporal distributions take into account the adaptability of vegetation to the environment [14]. In this study, a typical salt marsh species, Spartina alterniflora, is considered. Following Mudd et al. [11], parameters describing vegetation properties, including the stem density n_s, stem diameter d_s (m), and average stem height h_s (m), are determined. The parameters are expressed as follows:

d_s = 0.0006 \times G_B^{0.3} \quad (12.7)

n_s = 25 \times G_B^{0.3032} \quad (12.8)

h_s = 0.609 \times G_B^{0.1876} \quad (12.9)

12.2.1.3 Coupled Module of Vegetation Dynamics and Morphodynamics

The interactions between ecology and morphology are the foundation of eco-geomorphological models [15]. Previous studies have demonstrated that vegetation can significantly affect hydrodynamics and sediment transport, and can further drive morphological change. It should be noted that the vegetation in the model is assumed to consist of rigid cylindrical plants [16].

1. Influence of Vegetation on Hydrodynamics

Vegetation greatly alters the bed roughness, and thus the flow field:

n_v = (1 + k)\sqrt{(1 - c_v)\,n^2 + \dfrac{2\,C_d\,a_r\,c_v\,\min(h_v, h)\,h^{1/3}}{g\,\pi\,d}} \quad (12.10)

where n_v is the bed roughness when vegetation exists; n is the bed roughness without vegetation; C_d is the drag force coefficient; c_v is the vegetation density; a_r is the shape coefficient; h_v is the height of vegetation under water (m); and d is the diameter of vegetation (m).

2. Influence of Vegetation on Sediment Transport

Firstly, the presence of vegetation may enhance the critical shear stress for erosion, so the erosion flux decreases:

Q_e = M_e \min\left[0;\ \dfrac{\tau}{\tau_{cr,e}} - 1\right] \quad (12.11)

\tau_{cr,veg} = \tau_{cr,e}\left(1 + K_{veg}\,\dfrac{G_B}{G_{B\max}}\right) \quad (12.12)


where M_e is the erosion coefficient (kg/m²/s) and K_veg is a dimensionless calibration parameter. Secondly, vegetation not only traps suspended sediment directly but also serves as a source of organic matter accumulation. The total sedimentation rate considering the effect of vegetation is as follows:

Q_d = Q_{ds} + Q_{dt} + Q_{db} \quad (12.13)

Q_{dt} = C \cdot U \cdot \varepsilon \cdot n_s \cdot d_s \cdot \min[h_s;\ h] \quad (12.14)

Q_{db} = Q_{db0} \cdot \dfrac{G_B}{G_{B\max}} \quad (12.15)

where Q_ds is the natural sedimentation rate due to settling alone; Q_dt is the sedimentation rate caused by vegetation trapping; C is the suspended sediment concentration; U is the flow velocity; ε is the capture efficiency; Q_db is the sedimentation rate of organic matter; and Q_db0 is a typical deposition rate based on field-measurement experience.
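A hedged sketch of how Eqs. (12.11)-(12.15) might be assembled in code is shown below; all parameter values are placeholders, and the use of the vegetated threshold of Eq. (12.12) inside Eq. (12.11), as well as the min(0, ·) sign convention, follow the text as written and are assumptions rather than the authors' exact implementation.

```python
# Sketch of the vegetation-modified erosion and deposition fluxes, Eqs. (12.11)-(12.15).
def erosion_flux(tau, tau_cr_e, M_e, GB, GBmax, K_veg):
    # Eq. (12.12): vegetation raises the critical shear stress for erosion.
    tau_cr_veg = tau_cr_e * (1.0 + K_veg * GB / GBmax)
    # Eq. (12.11), assumed to use the vegetated threshold; with the min(0, .)
    # form, the flux is non-positive, and erosion weakens as tau_cr grows.
    return M_e * min(0.0, tau / tau_cr_veg - 1.0)

def deposition_flux(Q_ds, C, U, eps, n_s, d_s, h_s, h, Q_db0, GB, GBmax):
    Q_dt = C * U * eps * n_s * d_s * min(h_s, h)   # Eq. (12.14): stem trapping
    Q_db = Q_db0 * GB / GBmax                      # Eq. (12.15): organic accretion
    return Q_ds + Q_dt + Q_db                      # Eq. (12.13): total deposition

# Illustrative call with placeholder values:
print(erosion_flux(tau=0.6, tau_cr_e=0.4, M_e=1e-4, GB=800, GBmax=1000, K_veg=1.0))
```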

12.3 Results and Discussion

12.3.1 Evolution of Bare Mudflat

12.3.1.1 Hydrodynamics and Sediment Dynamics During the Early Stage of Evolution

Hydrodynamic processes (e.g., water level, current velocity, and water depth) and sediment dynamics (i.e., suspended sediment concentration, hereinafter SSC) at the observation stations (A, B, C, and D) during the early stage of evolution are presented in Fig. 12.1.

Fig. 12.1 Hydrodynamics and sediment dynamics at observation stations during the early stage of evolution: a water level, b current velocity, c suspended sediment concentration, and d water depth


Observation stations A and B were located in the subtidal zone and remained completely submerged during tidal cycles; their water levels exhibited a sinusoidal shape. In contrast, observation stations C and D were in the intertidal zone and were exposed for some time around low tide. Accordingly, the water depth gradually decreased from the subtidal stations toward the intertidal ones. Current velocity and SSC both showed tidally periodic variations at the observation stations.

12.3.1.2 Medium- and Long-Term Geomorphological Evolution

The medium- and long-term geomorphological evolution of the mudflat without vegetation is shown in Fig. 12.2. A tidal channel system formed and developed in the intertidal zone after a simulation time of 5 years. The tidal channels tended to evolve landward and bifurcate into tidal creeks. Although the tidal channel system incised into the mudflat, sediment still gradually deposited near the tidal channels, and the elevation of the intertidal zone continued to increase, as shown in Fig. 12.2. The evolution of the mudflat profile indicated that the intertidal zone of the mudflat tended to accrete; however, slight erosion due to the development of tidal channels near the low water level was detected. During the early stage of geomorphological evolution, the intertidal zone accreted and propagated seaward at a fast rate, and the elevations of the upper and lower intertidal zones both increased. The rate of deposition slowed down during the late stage of evolution. As a result, the mudflat profile developed a convex shape, with the convex point located near the high water level. After 20 years of simulation, the average slope of the mudflat profile was about 0.087%, less than the initial slope of 0.1%.

Fig. 12.2 Geomorphological evolution of the mudflat: a–d without vegetation; e–h the vegetation biomass parabolic distribution; i–l the vegetation biomass ecological distribution


12.3.2 Evolution of Vegetated Mudflat

12.3.2.1 Hydrodynamics, Sediment Dynamics, and Vegetation Dynamics During the Early Stage of Evolution

Two vegetation growth patterns (i.e., the parabolic distribution and the ecological distribution) were used to simulate the geomorphological evolution of the vegetated mudflat. Hydrodynamics, sediment dynamics, and vegetation dynamics at the four observation stations on the CS profile were analyzed and compared. The hydrodynamic and sedimentological conditions were almost the same under the two vegetation growth patterns; the influence of vegetation on hydrodynamics and sediment dynamics during the early stage of evolution was negligible.

(Figure: water level, current velocity, suspended sediment concentration, and bed shear stress at the observation stations on the CS profile under the parabolic- and ecological-distribution scenarios; panels i–l show the vegetation biomass at the observation stations during three years.)

12.3.2.2 Medium- and Long-Term Geomorphological Evolution

Compared with the bare mudflat, the existence of vegetation significantly affected the evolution of the mudflat. On the one hand, the amount of deposited sediment and the elevation of the upper intertidal zone both increased; on the other hand, the presence of vegetation promoted the development of tidal channels. Different vegetation growth patterns had different effects on the evolution of the mudflat. Under the ecological distribution of vegetation biomass, the deposition platform was relatively narrow, but the overall elevation of the vegetated area was higher; in addition, the tidal channels were more numerous and wider. After a simulation time of 20 years, the mudflat profile was convex in shape. When vegetation was considered, the overall deposition thickness of the mudflat increased, and the convex point moved seaward compared with the mudflat without vegetation. Compared with the 1D model, vegetation in the 2D model has a smaller influence on the average profile shape of the mudflat; its main impacts are the increased sedimentation in the middle and upper parts of the mudflat and the intensified erosion in the tidal channel development area near the low water level. The increase in sedimentation is especially obvious when the vegetation biomass is ecologically distributed, and the slope at the vegetation margin is then steeper. The results also show that on profiles without tidal channels in the upper intertidal zone, the vegetated area can capture more sediment, resulting in more siltation and a higher deposition platform in the upper part of the mudflat. Near the low water level on the CS profile, however, the scour platform is wider when there is no vegetation or when the vegetation biomass is parabolically


distributed; when the biomass is ecologically distributed, the elevation near mean sea level changes abruptly. These results arise because the formation and development of tidal channels are affected by many factors and their locations are random.

12.3.3 Influence of Vegetation on the Evolution of Muddy Tidal Flat Landform

12.3.3.1 Influence of Vegetation on Water Flow

The existence of tidal flat vegetation can slow down water flow, attenuate waves, and promote sedimentation in the vegetation zone. To study the effect of vegetation on mudflat flow, this paper compares and analyzes the maximum flood velocity in several typical regions on the observation profile at different evolution periods, with and without vegetation on the mudflat. The selected areas on the CS profile are the bare flat area (point C), the edge point of the vegetation, and the vegetation area (points D and F). The specific results are shown in Table 12.1, and the shape and biomass of the mudflat profile at the corresponding times are shown in Fig. 12.3. The maximum flood velocity in the table represents the peak flow velocity during a tidal cycle at the beginning of the evolution, and the flow-slowing effect of the vegetation can be judged from this value.

It can be found from Table 12.1 that, regardless of whether vegetation is present, the maximum flood velocity on the CS profile generally decreases from sea to land. Vegetation thus has different effects on reducing the maximum flood velocity at different locations on the mudflat. In the bare flat area, vegetation has little effect on the maximum flood velocity. When the biomass has a parabolic distribution, the maximum flood velocity at the edge of the vegetation is reduced by 6.05%, that at point D in the vegetation area is reduced by 10.12%, and that at point F in the upper intertidal zone by 10.78%. When the biomass is ecologically distributed, the maximum flood velocity at point D does not decrease significantly, that at point F in the upper intertidal zone is reduced by 22.84%, and the velocity at the edge of the vegetation increases instead. Combined with the analysis of the mudflat profile in Fig. 12.3a, a tidal channel appeared at the upper edge of the vegetation due to tidal erosion, which leads to the increase of the maximum flood velocity at the vegetation edge when the biomass is ecologically distributed. Comparing the biomass distribution in Fig. 12.3 with the maximum flood velocities in Table 12.1, it can be seen that in the upper intertidal zone, the flow-reduction effect of vegetation is more obvious under the ecological distribution because of its larger biomass there; the opposite is true in the middle and lower parts of the intertidal zone (such as point D in the vegetation area and near the vegetation edge). Different vegetation biomass distribution methods therefore have different effects on flow velocity, and a suitable vegetation distribution method must be chosen when simulating tidal flat vegetation-dynamics-landform evolution.

Fig. 12.3 a CS profile elevation at the early stage of evolution, b biomass distribution at different stages of evolution

Table 12.1 Maximum flood velocity in different areas and at different times (m/s)

| Scenario | Bare flat (C) | Vegetation edge (parabolic) | Vegetation edge (ecological) | Early stage: vegetation area D | Early stage: upper intertidal zone F | Later stage: vegetation area D | Later stage: upper intertidal zone F |
|---|---|---|---|---|---|---|---|
| No vegetation | 0.270 | 0.248 | 0.245 | 0.247 | 0.232 | 0.089 | 0.113 |
| Parabolic distribution | 0.270 | 0.233 | 0.225 | 0.222 | 0.207 | 0.085 | 0.098 |
| Ecological distribution | 0.270 | 0.290 | 0.321 | 0.247 | 0.179 | 0.045 | 0.102 |
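The percentage reductions quoted above follow directly from the values in Table 12.1; for instance:

```python
# Reductions quoted in the text, computed from Table 12.1 (early-stage values).
def reduction(no_veg, veg):
    return (no_veg - veg) / no_veg * 100.0

print(round(reduction(0.248, 0.233), 2))  # vegetation edge, parabolic: 6.05 %
print(round(reduction(0.247, 0.222), 2))  # point D, parabolic: 10.12 %
print(round(reduction(0.232, 0.207), 2))  # point F, parabolic: 10.78 %
print(round(reduction(0.232, 0.179), 2))  # point F, ecological: 22.84 %
```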

12.3.3.2 Short-Term Simulation Results of Vegetation-Dynamics-Landform

In the previous simulations, the comparisons between mudflats with and without vegetation were not based on identical topographic conditions, so the influence of the vegetation factor alone on mudflat evolution could not be isolated. Therefore, to further explore the influence of a single vegetation factor, the simulation results of the two vegetation models at a fixed time are selected (in this section, the results after 5 years of model operation). Starting from the resulting topography, the influence of vegetation is then ignored, and only hydrodynamics and external sediment supply are considered. A 1-month short-term simulation is carried out to compare and analyze the impact of vegetation alone on the short-term evolution of the mudflat, as shown in Fig. 12.4. The simulation results show that, owing to tidal action, the mudflat has a silting-up trend in the short term, but the shape of the mudflat profile does not change much. Along the profile, the maximum flood velocity during high tide


Fig. 12.4 Sediment deposition at different locations of the tidal flat after short-term simulation, maximum flow velocity during high tide, bed shear stress, and CS profile elevation

corresponds well with the bed shear stress. The two vegetation biomass distributions influence the short-term evolution of the mudflat in roughly the same way. Because vegetation has a flow-slowing effect, the flow velocity in the vegetated area is relatively low when vegetation is present, and the amount of siltation is correspondingly large. When the vegetation biomass is ecologically distributed, the maximum flood velocity during high tide in the vegetated area is reduced to a greater extent, and the sedimentation on the flat surface exceeds that under the parabolic distribution of vegetation.

12.4 Conclusions

A 2D eco-geomorphological model was used to investigate the influence of vegetation on the morphodynamics of a mudflat. Three main conclusions are drawn:

(1) Vegetation significantly affects the evolution of mudflats and can promote the development of tidal channels. When the vegetation biomass is ecologically distributed, the effect of vegetation on the development of tidal channels is more obvious.


(2) The presence of vegetation can reduce the maximum flood velocity at different locations on the mudflat; the larger the vegetation biomass, the stronger the corresponding flow reduction. (3) When vegetation exists on the mudflat, the amount of sedimentation in the intertidal zone is larger, the maximum flood velocity is smaller, and the bed shear stress is larger. This phenomenon is more prominent when the vegetation biomass is ecologically distributed. It is worth emphasizing that the magnitude of the vegetation biomass is a key factor affecting the evolution of tidal flat landforms; therefore, it is particularly important to select an appropriate vegetation biomass distribution in eco-geomorphological simulations of tidal flat evolution.

References

1. Amos, C.L.: Siliciclastic tidal flats. In: Geomorphology and Sedimentology of Estuaries, vol. 53, pp. 273–306 (1995)
2. Friedrichs, C.: Tidal flat morphodynamics: a synthesis. In: Treatise on Estuarine and Coastal Science, vol. 3, pp. 137–170 (2011)
3. Reineck, H.E., Singh, I.B.: Depositional Sedimentary Environments (1980)
4. Coco, G., Zhou, Z., Maanen, B.V., Olabarrieta, M., Tinoco, R., Townend, I.: Morphodynamics of tidal networks: advances and challenges. Mar. Geol. 346, 1–16 (2013)
5. Fan, D., Guo, Y., Ping, W., Shi, J.Z.: Cross-shore variations in morphodynamic processes of an open-coast mudflat in the Changjiang Delta, China: with an emphasis on storm impacts. Cont. Shelf Res. 26(4), 517–538 (2006)
6. Wright, L.D., Thom, B.G.: Coastal depositional landforms: a morphodynamic approach. Prog. Phys. Geogr. 1(3), 412–459 (1977)
7. Dean, R.G., Dalrymple, R.A.: Coastal Processes with Engineering Applications. Cambridge University Press (2004)
8. Dieckmann, R., Osterthun, M., Partenscky, H.W.: Influence of water-level elevation and tidal range on the sedimentation in a German tidal flat area. Prog. Oceanogr. 18(1–4), 151–166 (1987)
9. Lee, S.C., Mehta, A.J.: Problems in characterizing dynamics of mud shore profiles. J. Hydraul. Eng. 123(4), 351–361 (1997)
10. Roberts, W., Hir, P.L., Whitehouse, R.: Investigation using simple mathematical models of the effect of tidal currents and waves on the profile shape of intertidal mudflats. Cont. Shelf Res. 20(10), 1079–1097 (2000)
11. Mudd, S.M., Fagherazzi, S., Morris, J.T., Furbish, D.J.: Flow, sedimentation, and biomass production on a vegetated salt marsh in South Carolina: toward a predictive model of marsh morphologic and ecologic evolution. In: The Ecogeomorphology of Tidal Marshes, pp. 165–188 (2004)
12. Zhu, M., Lan, J., Zhang, X., Sui, G., Yang, X.: Porous carbon derived from Ailanthus altissima with unique honeycomb-like microstructure for high-performance supercapacitors. New J. Chem. 41(11), 4281–4285 (2017)
13. Gao, L., Zhang, Q., Zhu, M., Zhang, X., Sui, G., Yang, X.: Polyhedral oligomeric silsesquioxane modified carbon nanotube hybrid material with a bump structure via polydopamine transition layer. Mater. Lett. 183, 207–210 (2016)
14. Ren, L.J., Li, X.Z., Yang, S.L., et al.: Effects of vegetation changes on salt marshes in Chongming East Beach on the function of siltation and wave reduction in tidal flat wetlands. Chin. J. Ecol. 34(12), 9 (2014)


15. Zhang, J., Cui, S.Y.: Effects of straw and vegetation cover on soil salinity and fertility properties in tidal flat reclamation areas. China Soil Fertil. 000(003), 128–135 (2018)
16. Zhang, X.L., Gu, D.Q., Feng, A.P., et al.: A comparative study on the characteristics and evolution of wetland vegetation in the Yellow River Delta and the southern bank of Laizhou Bay. Bull. Soil Water Conserv. 26(3), 6 (2006)

Chapter 13

Design and Experimental Verification of High Functional Density Cubesat System

Yuying Yao, Weida Fu, Xin Guo, Sihan Shi, and Jing Yan

Abstract Intuitively, a Cubesat can be regarded as a highly concentrated instrument that includes all the necessary elements of a satellite. High cost-performance is its key advantage, distinguishing it from traditional satellites. In this paper, the development trends and task dimensions of Cubesats are analyzed first. Then the standardization and integration design requirements of high functional density Cubesats are detailed, driven by the rapid advance of electronics and device technologies. Technical approaches are proposed for the system design of high functional density Cubesats along two dimensions. Furthermore, a specific design example for a 6U Cubesat is shown in detail, including a standardized stack combination, power system integration, autonomous operation management, and diversified task modes. All these methods are applied to achieve more functions in one Cubesat. The 6U Cubesat has been verified on orbit for 10 months, and large amounts of data have been transmitted from space to Earth. Finally, the key developing technologies of Cubesats are discussed. Keywords High functional density · Cubesat · Standardization · Integration

13.1 Introduction

A CubeSat is a low-cost micro/nanosatellite that adopts international general standards, featuring a universal, modular, and standardized design. It has the characteristics of a short development cycle, flexible launch and deployment, and low launching cost. According to the statistical analysis of global satellite launch data in 2021 [1], the number of Cubesats of less than 50 kg continues to grow, approaching 400 and accounting for 20% of the total in 2021. Throughout the development of Cubesats in recent years, their application fields have gradually expanded; the design indexes of representative Cubesats are shown in Table 13.1 (the volume of a Cubesat is internationally expressed in "U", where 1U is 10 × 10 × 10 cm). Doves are a typical application of a Cubesat network and have higher image resolutions than other

Y. Yao (B) · W. Fu · X. Guo · S. Shi · J. Yan DFH Satellite Co. Ltd, Beijing Haidian District 100094, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_13


cubesats [2]. Aoxiang Star was designed by university students and is an effective vehicle for international teaching programs and cooperation [3]. Capstone extends the field of Cubesats from the Earth to the Moon and opens a new application for deep-space exploration [4]. It is obvious that Cubesat applications have gradually expanded from low Earth orbit to deep space, and space missions have developed from technical experiments to space services. The complexity of single-satellite missions has gradually increased, now covering single-satellite operation, multi-satellite collaboration, and constellation networking [5, 6]. It can be seen that improving functional density on the basis of standardization and low cost is the key technology for Cubesat development.

13.2 Design Requirements for High Functional Density Cubesat

In the face of new application demands and the resource constraints of Cubesats, improving their functional density focuses on the top-level design and application levels and carries out integrated multi-function reuse. The system design requirements differ from those of traditional satellites, mainly as follows.

13.2.1 Standardization

Standardization is the basis of Cubesat design. The standardized design covers not only physical interfaces, such as component size, electrical interfaces, and communication interfaces, but also software aspects, such as communication protocols and data formats. Components are standardized through unified interfaces to meet various application requirements. Standardized design reduces the cost and risk of hardware redesign and improves production efficiency. At the same time, standardization makes the adaptation of stand-alone products more flexible and more compatible, which makes product upgrades easy and improves design flexibility. Furthermore, standardized interfaces help realize on-orbit reconstruction of the satellite software, advance unattended operation, improve the satellite's on-orbit application efficiency and operational reliability, and reduce maintenance costs.

13.2.2 Integration

The integration requirements of a Cubesat include hardware integration and software integration.

Table 13.1 Representative design indexes of Cubesats

| Name | Time | Size | Task type | Payloads | Main indexes | Orbit | Launch | Life | Country |
|---|---|---|---|---|---|---|---|---|---|
| Doves | 2013–2022, multiple batches launched | 3U/5 kg | Earth observation, debris tracking | High-resolution camera (panchromatic, color, near-infrared), optical telescope | Resolution: 2.7–3.7 m; width: 7.3–25 km; varies across satellite batches | (1) 420 km/52° (ISS); (2) 475 km/98° SSO | (1) From ISS platform; (2) carrying launch | 1–3 years | USA |
| Xiwang-2 | 2015 | 1U/8U, 1.5 kg/10 kg | Atmospheric detection, networking application | – | Satellite-ground TT&C: UHF/VHF; power output: solar array 30 W, battery 58 Wh | 524 km/97.47° SSO | Carrying launch | 1 year | China |
| Aoxiang-Star | 2016 | 12U/10 kg | Atmospheric polarization detection, Earth sensor observation | Atmospheric density detector, very-long-baseline interferometry beacon, dual-mode four-frequency GNSS receiver | Atmospheric density detection range: 1.0 × 10^-6 to 5.0 × 10^-3 Pa; GPS pseudorange single-point positioning accuracy: ≤10 m | 350 km/41° low Earth orbit | Carrying launch | 3 months | China |
| Capstone | 2022 | 12U/25 kg | Earth–Moon orbit communication and navigation test | Optical camera, on-SoC atomic clock | Space/Earth TT&C: X-band; space/Moon TT&C: S-band; power output: solar array 114 W, battery 182 Wh | Near-rectilinear halo orbit (NRHO) | Small rocket + upper-stage platform | 22 months | USA |


The hardware integration is based on the design concept of an all-in-one, cableless system, breaking the traditional composition mode that proceeds from instrument or module to subsystem and then to the whole satellite, and combining different functional modules according to an optimization principle. According to actual task requirements, Cubesat design can be implemented through platform-payload integration, mechanical-electrical integration, information processing integration, or multidimensional integration. Software integration builds on hardware integration: whole-satellite data are fused to achieve energy management, thermal control management, attitude control, and safety mode management, as well as unified scheduling and execution of on-orbit tasks. As a whole, the satellite carries out unified mechatronic and electrothermal design and resource planning trade-offs at the system level, which can effectively improve functional density, reduce raw material costs through design optimization, and shorten the development cycle [7].

13.3 The System Design of Cubesat with High Functional Density The system design dimensions of the Cubesat with high functional density are shown in Fig. 13.1. The horizontal aspect of Cubesat system design is aimed at improving satellite capability and reducing satellite envelope. Based on the standard interfaces (communication interface, measurement and control system, data protocol, etc.), the software design expands its functions from the aspects of autonomous operation management, diversified task modes and applications, and the hardware is integrated into a whole satellite or a single computer through standardization and combination design. High function density system design was carried out for a popular science education 6U Cubesat. The system layout of the satellite is shown in Fig. 13.2. Most of the electronic stand-alone circuit boards with independent functions on the satellite are integrated into the assembly stack to achieve maximum integration design.

13.3.1 Standardized Assembly Stack

The Cubesat stack circuit boards are based on the industrial PC104 standard and are electrically connected through the stack connector. The stack connector is configured with power supply and distribution, communication bus, telemetry, and remote-control interfaces. Through reasonable architecture and interface design, the interconnection cables between boards are reduced, realizing a low-cable or cableless design for the electrical equipment and thus reducing the weight and overall cost of the satellite. The assembly stack occupies about 2U, including



Fig. 13.1 System design dimensions of high functional density of Cubesat

Fig. 13.2 General layout of 6U Cubesat

power supply, computer, control unit, measurement and control, data transmission, test payload, and other functional modules. The assembly stack is composed of a frame, a stack of circuit boards, and copper pillars. The envelope size, component height, circuit board combination sequence, and stack connector definitions of each circuit board are uniformly designed. The assembly stack is designed with integrated mechanical and thermal considerations. The system thermal design is based on single-board thermal design, and heat is exchanged along the optimal path to the cooling surface of the satellite. The heat


dissipation area and heat conduction path of each single board are uniformly designed according to the requirements of the whole satellite. The heat dissipation area is located at the edge of the circuit board. The front and back sides of the heat dissipation area, the copper pillar connection positions, and the stack connector edge are keep-out areas for components, and copper pours are required to enhance heat conduction (Figs. 13.3 and 13.4).

Fig. 13.3 Assembly stack of Cubesat


Fig. 13.4 PCB circuit board of assembly stack



13.3.2 Energy Integration Design

According to the Space and Missile Systems Center satellite cost model of the US Air Force Space Command and statistical analysis of a large amount of historical data, the energy system accounts for 12–20% of total satellite cost. Energy integration design is conducive to centralized regulation and transmission of electric power, reducing power loss and improving power density. The integrated power regulation, distribution, and management of the Cubesat is handled by the onboard computer. The secondary power supply converters are mainly centralized for distribution to reduce power conversion loss and improve energy utilization. The solar array accounts for the largest proportion of the energy system. It adopts a multi-component packaging structure or an integrated PCB-based solar array design to achieve a cable-free, lightweight implementation. For a Cubesat, PCB solar wings have advantages in mass, residual magnetism, insulation, and cable reliability, and the process flow is simpler. The 6U satellite is designed with two deployable PCB solar wings of four circuit boards in total, with a distribution area of about 0.2 m². At the end of life, the output power is more than 37 W; the two wings together weigh about 1.3 kg (Fig. 13.5). For a Cubesat with body-mounted solar wings, a multi-component packaging structure is adopted to integrate the satellite structure boards and the electrical interfaces between the boards and the solar wings. The inner side of each satellite structure board is packaged with an intelligent substrate for transmitting electrical signals between boards, and the outer side carries the solar cells. Through intelligent floating connectors, fast board-to-board interconnection and electrical connection inside and outside the satellite can be realized. The electrical interfaces of the intelligent substrate include telemetry signals, remote-control commands, power supply, communication, etc. In addition to meeting the requirements of the power supply system, it also accommodates the power supply and communication requirements of the whole satellite or of all modules based on the same intelligent substrate. The design scheme of the integrated solar array is shown in Fig. 13.6. The integrated design can flexibly add and replace circuit modules or turn different circuit modules on and off

Fig. 13.5 PCB solar array of 6U Cubesat



Fig. 13.6 Integrated body-mounted solar array design scheme

to achieve functional expansion and rapid reconstruction of the satellite. When each circuit module is connected to the satellite's electrical system, the intelligent substrate conducts a rapid self-inspection.
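As a quick arithmetic check on the PCB solar-wing figures quoted above (a rough estimate only, using the stated area, end-of-life power, and mass):

```python
# Rough check of the PCB solar-wing figures quoted in the text.
area_m2 = 0.2          # total distribution area of both wings
power_eol_w = 37.0     # end-of-life output power (lower bound)
mass_kg = 1.3          # mass of both wings

print(power_eol_w / area_m2)   # ~185 W/m^2 areal power density at EOL
print(power_eol_w / mass_kg)   # ~28 W/kg specific power at EOL
```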

13.3.3 Autonomous Operation Management

The development and manufacturing costs of a Cubesat are low, and on-orbit maintenance is mainly based on an autonomous-operation, unattended management mode. The onboard computer system monitors the safety of the energy, measurement and control, attitude, and other components according to satellite telemetry and implements reasonable protection measures. The autonomous operation process of the satellite is shown in Fig. 13.7.


Fig. 13.7 Autonomous operation and management scheme



(1) Orbit entry phase: After satellite-rocket separation, the onboard separation switch closes and the satellite powers on autonomously. Once reliable separation of satellite and rocket has been confirmed, the onboard computer starts the pre-installed orbit-entry program, completes solar wing deployment, antenna deployment, orbit-entry attitude adjustment, and other actions, and activates the satellite's autonomous safety-mode management. (2) On-orbit operation phase: During long-term operation and missions, the status of all satellite platform components must remain within the normal range, corresponding to the thresholds of the different safety modes. The execution conditions of the task mode have the highest priority, followed by energy, attitude, measurement and control, and thermal control. While any safety mode is active, the task mode cannot be executed. The safety mode thresholds can be updated on orbit as required. (3) Satellite state recovery: Power on/off operations in safety mode restore the system parameters to their initial values. When exiting a safety mode, the system parameters, safety thresholds, etc., are restored to their values before the safety mode was entered, ensuring correct satellite operation timing.
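A hedged sketch of the priority logic described above: the task mode runs only while no safety mode is active, with energy checked first, then attitude, measurement and control, and thermal control. The mode names and telemetry fields below are illustrative placeholders, not the satellite's actual software.

```python
# Illustrative autonomous-management check per Fig. 13.7 (names are placeholders).
SAFETY_CHECKS = [            # ordered by priority: power, attitude, TT&C, thermal
    ("power_safe_mode",    lambda tm: tm["bus_voltage_ok"]),
    ("attitude_safe_mode", lambda tm: tm["attitude_ok"]),
    ("ttc_safe_mode",      lambda tm: tm["ttc_link_ok"]),
    ("thermal_safe_mode",  lambda tm: tm["temperature_ok"]),
]

def select_mode(telemetry):
    """Return the first triggered safety mode, or 'task_mode' if all checks pass."""
    for mode, ok in SAFETY_CHECKS:
        if not ok(telemetry):
            return mode      # task mode is blocked while any safety mode is active
    return "task_mode"

tm = {"bus_voltage_ok": True, "attitude_ok": False,
      "ttc_link_ok": True, "temperature_ok": True}
print(select_mode(tm))       # attitude_safe_mode
```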

13.3.4 Diversified Task Modes

The satellite task modes are designed according to the type and number of payloads. For the optical payload carried by the 6U Cubesat and its task characteristics, diversified task modes covering imaging, data storage and transmission, and data processing are designed, as shown in Fig. 13.8. Serving the main task, the micro intelligent processing module provides image processing, task planning, and onboard information management. Automatic imaging is set as the default imaging mode, i.e., "imaging at startup", in which the optical camera images with its default working parameters. According to task requirements, imaging parameters can be set to select ground-station-controlled imaging or timed imaging, and either original-image or thumbnail imaging can be chosen depending on the purpose of the image. The compression ratio of thumbnails can be set by command. The thumbnail data are stored in the UV-band communication module and downlinked to the ground station through the UV telemetry channel. The original image data are stored and processed in the intelligent processing module; original image download, cloud detection, and key target detection can be selected by ground command. The image data processing results are transmitted to the ground through telemetry parameters or the data transmission module.
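A minimal sketch of the imaging-mode routing just described (default "imaging at startup", command-settable thumbnail compression, and separate storage paths); the command fields, stub functions, and module names are assumptions for illustration, not the onboard software.

```python
# Illustrative dispatch of the imaging task modes (all names are placeholders).
def camera_capture(params):        # stub for the optical camera
    return {"pixels": "...", "params": params}

def compress(image, ratio):        # stub thumbnail compression
    return {"thumbnail_of": image, "ratio": ratio}

STORAGE = {"uv_band_module": [], "intelligent_processing_module": []}

def run_imaging_task(cmd=None):
    cmd = cmd or {}
    mode = cmd.get("mode", "automatic")            # default: "imaging at startup"
    image = camera_capture(cmd.get("params", {"exposure": "default"}))
    if cmd.get("thumbnail", True):
        thumb = compress(image, cmd.get("compression_ratio", 8))  # command-settable
        STORAGE["uv_band_module"].append(thumb)    # downlink via UV telemetry channel
    else:
        STORAGE["intelligent_processing_module"].append(image)    # original image path
    return mode

print(run_imaging_task())                                        # automatic
print(run_imaging_task({"mode": "control", "thumbnail": False})) # ground-commanded
```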



Fig. 13.8 Diversified task modes design

13.4 On-Orbit Verification

The 6U Cubesat is deployed by a POD (Picosatellite Orbital Deployer), as shown in Fig. 13.9. The POD is a standard deployer for 6U Cubesats and provides the force to push the satellite out when it receives the separation signal from the rocket. On December 26, 2021 (Beijing time), the satellite was successfully launched as a piggyback payload on a CZ-4C rocket from the Taiyuan Launch Center. Satellite telemetry analysis shows that, after satellite-rocket separation, the satellite automatically ran the orbit-entry procedure: the solar arrays unfolded in about 20 s, the attitude control computer completed rate damping and achieved sun orientation in about 8 min, and the satellite switched to the sun-cruising operation mode, as shown in Fig. 13.10. The solar array current is about 3.5 A, which matches the design value when the sun illuminates at about 90°. At the same time, the sun sensor outputs the correct angles, indicating that the satellite attitude control is operating normally. After checkout following the standard procedure, the 6U Cubesat carried out multi-mode verification including automatic imaging, commanded imaging, and thumbnail imaging, and conducted in-orbit imaging tests of different scenes such as cities, coasts, and mountains, as shown in Fig. 13.11. Up to now, the Cubesat has been in orbit for 10 months and has successfully completed its various test tasks.


Fig. 13.9 The 6U Cubesat in POD for launch

Fig. 13.10 Telemetry curve from omnidirectional acquisition rate damping to sun orientation (top: solar array current; bottom: sun sensor output angle)

13.5 Expectation

The rapid development of Cubesat technology depends closely on its high cost-performance. As the reliability of more and more commercial devices is fully verified, there is still considerable room for exploiting this potential and optimizing Cubesat design and application modes.


Fig. 13.11 Satellite images of 6U Cubesat (left: Dubai Palm island, right: Golden Gate Bridge in San Francisco)

(1) Onboard processing systems based on software reconfiguration can improve the reliability of Cubesats and their adaptability to complex tasks, meeting the requirements of diverse space missions; the intelligent processing module of the 6U Cubesat mentioned above is an example. (2) Multi-satellite networking and task collaboration are an inevitable trend for improving the application efficiency of Cubesats; standardized interfaces, high-performance constellation networking, and autonomous operation control are the key technologies. (3) Standard launch and deployment interfaces are designed to flexibly adapt to various launch and deployment modes. Cubesats can not only improve launch utilization through "one rocket, multiple satellites" but can also be released into space from space stations or by adding small satellites to the upper stage of a rocket [8].

References

1. The market scale, market competition pattern and application prospect of China's satellite Internet industry in 2021. https://baijiahao.baidu.com/s?id=1740921383432404630&wfr=spider&for=pc. Last accessed 12 Aug 2022 (in Chinese)
2. Deep analysis of DOVE satellites system. http://www.360doc.com/content/18/0918/22/43885509_787793213.shtml. Last accessed 19 Aug 2018 (in Chinese)
3. Liu, Y.Y., Zhou, J., Liu, G.H.: Development and prospect of "Aoxiang" series CubeSats. J. Astronaut. 40(10), 1115–1124 (2019) (in Chinese)
4. Liu, Y.Z., Liu, Y.D., Wang, S.: Mission analysis of the first Earth Moon space cube in the world. Space Int. (8), 11–15 (2022) (in Chinese)
5. Franco, P., Dm'io, M., Antonio, V., et al.: DustCube, a nanosatellite mission to binary asteroid 65803 Didymos as part of the ESA AIM mission. Adv. Space Res. (62), 3335–3356 (2018)


6. Patrick, B., Jakob, D., Esa, V., et al.: DISCUS: the deep interior scanning CubeSat mission to a rubble pile near-Earth asteroid. Adv. Space Res. (62), 3357–3368 (2018)
7. Zhang, K.K., Chang, L., Song, W.X.: A preliminary study on integrated design methodology for high functional density microsatellite. In: The 8th Cross-Straits Conference on Aeronautics and Astronautics, Beijing, sponsored by Beijing University of Aeronautics and Astronautics (2012) (in Chinese)
8. Diao, Y., Guo, Z.X., Peng, Y.X.: Inspiration of Japan's "Bird" project model for China's Cubesat project. China Aerosp. (9), 54–57 (2021) (in Chinese)

Chapter 14

Space Design of Exhibition Hall Based on Virtual Reality

Yixuan Wang

Abstract With the emergence of new art forms and technical means, more and more exhibition activities are presented to people with novel artistic expressions relying on brand-new technical means, which has greatly changed people's understanding of exhibition space. The author proposes applying VR (Virtual Reality) display design methods to the space design of exhibition halls and establishes a panoramic view generation model of exhibition halls. Because the optimal simple polygon formed by planar discrete corner points is highly similar to the TSP (Traveling Salesman Problem), this paper directly uses the basic steps of a GA (Genetic Algorithm) for solving the TSP to generate the building ground contour. The research shows that the time consumed by this algorithm is obviously less than that of the algorithm based on feature points. When the number of points is 16, the two algorithms obtain the same criterion function value; as the number of points increases, the optimization effect of the GA becomes more and more obvious, while the error of the feature-point-based algorithm grows. This shows that the optimized GA has a good optimization effect. Keywords Virtual reality · Exhibition hall · Space design · 3D virtual

14.1 Introduction

VR (Virtual Reality) technology is an important direction of simulation technology. It combines simulation technology, computer graphics, man–machine interface technology, multimedia technology, sensor technology, network technology, and other technologies, and is a challenging interdisciplinary frontier research field. With the improvement of the economic level, whether from the angle of the commercial economy or of cultural communication, human beings are no longer satisfied with the traditional two-dimensional information display mode, and the demand for participatory, three-dimensional display modes is increasing [1, 2].


Relatively speaking, graphic works and computer digital works are convenient to display and their data are easy to preserve: they can be sorted and summarized in a unified way and can be saved and displayed in digital or physical form. However, there are many difficulties in preserving solid-model data forms such as animated character modeling, product models, three-dimensional compositions, and product packaging. An ideal virtual environment should make it difficult for people to distinguish between true and false, letting them immerse themselves in the 3D virtual environment simulated by the computer [3–5] and browse within it, so that everything they see, hear, smell, taste, and touch seems real. With the emergence of new art forms and technical means, more and more exhibition activities are presented with novel artistic expression relying on brand-new technical means, which has greatly changed people's understanding of exhibition space [6].

Maf et al. put forward a design method for external space aimed at the various spatial levels in a city, which provides many useful ideas for attracting people to carry out activities in public spaces and promoting social interaction; in this view, the presence of vibrant human activity is the decisive criterion of public space [7]. Wang et al. made comprehensive and meticulous research on the design of the facade, the area around the door, the approach road, parking space, front yard of the main house, side yard and backyard, enclosure, lighting, and color of the external space of buildings [8]. Due to the unbalanced level of digital development of museums and art galleries in China, the integration of digital resources will be very difficult; moreover, some digital art galleries are not carefully planned when they are developed, which directly results in imperfect technology, poor interactive functions, unreasonable structure, and low quality, all of which hinder resource integration [9]. The development of exhibition halls is limited by many factors such as weather, traffic, time, venue, cost, and safety [10]. Combining VR technology with exhibition hall technology is an important means of removing the obstacles to the development of the exhibition industry.

The author puts forward the idea of applying VR display design to the space design of exhibition halls, aiming to use VR display means to alleviate the practical problems in the display and data preservation of college students' artistic design works, hoping to provide a more advanced digital solution for the display and data preservation of artistic design works.

14.2 Research Method

14.2.1 Design Principles of VR Display

Visual effect is the most critical content of VR display design and the focus of the designer's work. The artistry of VR display design usually means that it has good visual effects and produces good visual feelings. However, realism here does not mean that virtual scenes and objects must faithfully reproduce real ones, but


that virtual scenes and objects must convey a natural, convincing visual sense of existence, so as to give visitors a realistic visual feeling. Such a scene or object may not exist in the objective world, but in the virtual environment visitors must feel that it truly exists, whether the artistic expression is realistic or exaggerated. The real and natural visual effect mainly concerns the psychological feeling of the visitors; how to make the virtual world feel immersive, so that visitors feel they are "on the scene", is the most important design principle.

Traditional display methods and media face the limitations of time and space when transmitting information. For VR display design, such limitations are weakened by the characteristics of VR technology. VR display design is a process in which designers make and create digital virtual information, determine a clear display purpose, reasonably use display means, analyze the principal and secondary contradictions of the display design, grasp the key points, and ensure the accurate, reasonable, and efficient transmission of display information. We should carefully consider the needs of people's spiritual life, provide people with practical, emotional, and reasonable information experiences through VR technology, and adhere to the principle of humanized design. On the other hand, VR display design makes the relationship between information and audience closer: it gives designers greater freedom in artistic creation and makes the design closer to caring for people.

In the development of the virtual exhibition hall, the polygon modeling method of geometric modeling technology, which is accurate and relatively delicate in model data, is selected [11]. A 3D geometric scene model with accurate parameters can be built by referring to the scale and size data of the real scene, making the whole virtual scene more realistic. The scenes of the virtual exhibition hall include seven functional areas: foyer, multi-function hall, watercolor exhibition area, oil painting exhibition area, design exhibition area, traditional Chinese painting exhibition area, and sculpture exhibition area. In the modeling process it is often necessary to seek a balance between model detail and rendering performance, that is, to optimize the model as much as possible on the premise of ensuring a realistic scene effect, choosing different modeling techniques according to the priority of the model to be created, as shown in Fig. 14.1.

Fig. 14.1 Virtual exhibition hall scene structure

Panoramic VR display design is based on the combination of traditional photography art and VR technology; therefore, the photography technology


and the processing technology for the captured images are the keys; photography itself is, needless to say, a mature technology and art form. Virtual exhibition design based on 3D modeling has a strong stereoscopic impression, a sense of space, and interactive ability, giving people a strong sense of reality. Because its virtual environment need not copy reality, it gives designers more creative space and freedom, and it is therefore the most promising direction of VR design. Photo-based VR, by contrast, suits applications that must truly show the original features of an environment, but its shortcomings are obvious, such as the large hardware investment in the early stage and the inability to realize real interaction; strictly speaking, it is a quasi-3D virtual display form.

VR exhibition design, as a developing exhibition design method, aims to create interactive, rich, and realistic environmental scenes or objects through relatively low-cost and advanced technical means. The artistry of VR exhibition design is therefore also one of the goals and principles to be pursued: from the 3D geometric modeling of VR to the visual effect design of virtual objects, as well as the rendering of the overall atmosphere, the sound environment, and friendly man–machine interaction design, artistic quality should be pursued throughout.

14.2.2 Generation of Panorama of Space Scene in Exhibition Hall

A clear project orientation of the art gallery is the premise of functional block division. In terms of practical functions, the modern art museum has changed from a place of collection and exhibition to a comprehensive place where collection, research, exhibition, education, communication, service, and other activities coexist. The architecture of the art museum has become more and more complex because of its expanding coverage, and the people it serves tend to be the general public, which indicates that the contemporary art museum has gradually evolved into a multifunctional cultural center. According to the high-tech characteristics of works of art, the space design of modern art galleries strengthens the mobility of exhibition space, traffic space, and public space, which requires comprehensive utilization of all parts of the space and a suitable proportion of gray space. The diversity of display space forms and the interweaving of different streamlines also provide a variety of viewing angles for artworks. The rich changes of space excite the audience's emotions during the exhibition, creating a harmonious space.

The 3D virtual exhibition hall is a digital exhibition hall built with computer graphics technology; it is a 3D interactive experience mode. Based on the traditional exhibition hall, 3D virtual technology is used to transplant the exhibition hall and its exhibits to the Internet for display, publicity, and education activities, breaking through the limitations of time and space in the traditional sense. In terms of interaction, the online pavilion can fully mobilize the enthusiasm of users by using multimedia technology such as text, sound, pictures, videos, and 3D models, while


the physical pavilion only offers some simple interactions through tour guides or auxiliary devices. It can be asserted that the networked 3D virtual exhibition hall will surely become the most valuable exhibition means in the future.

Panoramic mosaic is a kind of image mosaic technology. By integrating multiple images, it provides a larger field of view for the scene where the camera is located or generates a higher-resolution image. In image-based virtual scene generation technology, a virtual scene is composed of a set of discrete viewpoint spaces, where a viewpoint space refers to the scene observed by a participant at a certain viewpoint. Panorama therefore becomes one of the appropriate description forms of a viewpoint space, and a virtual scene can be regarded as a group of discrete panoramas [12].

A full-color image is directly represented by its RGB components; using the palette technique is equivalent to saving the image by the aforementioned method while also saving a copy of the same image in the palette. There is no concept of a grayscale image in the BMP format, but if the R, G, and B of each pixel are exactly the same, that is,

R = G = B = Y   (14.1)

then Y is called the gray value. The sampling frequency of the luminance signal Y is 13.5 MHz, and the sampling frequency of the chrominance signals U and V is 6.75 MHz. Each digitally significant line has 720 luminance sampling points and 360 × 2 color-difference sampling points. The sampling points of each component are uniformly quantized, and PCM coding with 8-bit accuracy is performed for each sample. The gray value is computed as

Y = 0.299R + 0.587G + 0.114B   (14.2)

Texture mapping maps a texture from two-dimensional space onto the surface of a 3D object through a mapping function; essentially, it is a transformation from one coordinate system to another. Texture mapping usually includes forward mapping and reverse mapping: forward mapping defines a two-dimensional texture function and maps the two-dimensional texture onto the surface of the 3D object. Given a point p on the house surface, its corresponding ground coordinates are calculated as follows:

X = X0 + M·X′
Y = Y0 + M·Y′   (14.3)

Usually, the coordinates of the lower-left corner of the textured surface are set to (0, 0) and the coordinates of the upper-right corner to (1, 1), where (X′, Y′) are the image-point coordinates of p relative to the lower-left corner, (X0, Y0) are the ground coordinates corresponding to the lower-left corner, and M is the corresponding scale coefficient of the orthographic texture.
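As a minimal illustration of Eq. (14.3), the following sketch maps an image-point coordinate to ground coordinates (the function and argument names are hypothetical, not from the original system):

    def image_to_ground(x_img, y_img, x0, y0, m):
        """Map image-point coordinates (relative to the lower-left corner)
        to ground coordinates via Eq. (14.3)."""
        return x0 + m * x_img, y0 + m * y_img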


As a classic example of combinatorial optimization, the TSP (Traveling Salesman Problem) is widely used in the fields of transportation, computer networks, circuit board design, logistics, and distribution. In short, any problem that can be abstracted as visiting every node exactly once at the lowest total cost can be treated as an application of the TSP. Because the optimal simple polygon formed by planar discrete corner points is highly similar to the TSP, this paper directly uses the basic steps of a GA (genetic algorithm) for solving the TSP to generate the building ground contour.

Randomized coding may produce self-intersecting polygons, so the number of self-intersections must be penalized to avoid generating too many of them. Let the penalty factor be K and the number of intersections between non-adjacent edges of the polygon be n_Inter. The criterion function is defined as

J_p = K · n_Inter²   if the polygon is self-intersecting
J_p = J(θ_i)          if the polygon is a simple polygon   (14.4)

Because this is a minimization problem, this paper uses a reciprocal transformation to define the fitness function of the GA:

fit = 1 / J_p   (14.5)
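A minimal sketch of this penalized fitness evaluation (illustrative only; the segment-intersection test and the criterion function J are assumed helpers, and all names are hypothetical):

    def segments_intersect(p1, p2, q1, q2):
        # Proper-intersection test via orientation signs.
        def cross(o, a, b):
            return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
        d1 = cross(q1, q2, p1); d2 = cross(q1, q2, p2)
        d3 = cross(p1, p2, q1); d4 = cross(p1, p2, q2)
        return d1*d2 < 0 and d3*d4 < 0

    def fitness(order, points, J, K=1000.0):
        """Fitness of one GA individual (a vertex ordering), per Eqs. (14.4)-(14.5)."""
        n = len(order)
        edges = [(points[order[i]], points[order[(i+1) % n]]) for i in range(n)]
        # Count intersections between non-adjacent edges only.
        n_inter = sum(
            segments_intersect(*edges[i], *edges[j])
            for i in range(n) for j in range(i+2, n)
            if not (i == 0 and j == n-1)
        )
        Jp = K * n_inter**2 if n_inter > 0 else J(order, points)
        return 1.0 / Jp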

For bitmaps using a palette, the image data are the index values of the pixels in the palette; for true-color images, the image data are the actual RGB values. The flow of converting a 24-bit true-color bitmap image to a grayscale image is shown in Fig. 14.2.

Fig. 14.2 Process of converting a true-color bitmap into a grayscale image
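A minimal sketch of this conversion using Eq. (14.2) (assuming NumPy and an RGB array as input; not the paper's original implementation):

    import numpy as np

    def rgb_to_gray(rgb):
        """Convert an HxWx3 uint8 RGB image to grayscale via Y = 0.299R + 0.587G + 0.114B."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b
        return y.astype(np.uint8)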


14.3 Result Analysis

Because the photos obtained from the camera are perspective projections of the scene onto the film plane, while the panorama presents a cylindrical projection of the scene, the images must be preprocessed before stitching: the plane projection, a perspective projection centered on the viewpoint, is converted into a cylindrical projection. In this paper the pixel-filling method is adopted, that is, the cylindrical pixels are mapped back to the plane image; a sketch of this step is given below.

Determining the overlapping area between adjacent images means finding the most similar part of the two images, that is, the translation vector between adjacent images, which is in fact a process of image feature recognition. In the contrast area and search area of two adjacent images, the maximum-gradient point of each column is found and its ordinate recorded, so that two curves reflecting the distribution of the maximum-gradient feature points of the two images are obtained. Through a set of images, the two algorithms are compared and analyzed; the CPU time occupied by each is shown in Table 14.1 and Fig. 14.3.
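A minimal sketch of the plane-to-cylinder warping by backward mapping (illustrative assumptions: focal length f in pixels and the image center as projection center; this is not the paper's exact routine):

    import numpy as np

    def warp_to_cylinder(img, f):
        """Backward-map cylindrical pixels to the plane image (pixel-filling)."""
        h, w = img.shape[:2]
        cx, cy = w / 2.0, h / 2.0
        out = np.zeros_like(img)
        for yc in range(h):
            for xc in range(w):
                theta = (xc - cx) / f            # cylinder angle
                hgt = (yc - cy) / f              # cylinder height
                x = np.tan(theta) * f + cx       # corresponding plane coordinates
                y = hgt * f / np.cos(theta) + cy
                xi, yi = int(round(x)), int(round(y))
                if 0 <= xi < w and 0 <= yi < h:
                    out[yc, xc] = img[yi, xi]
        return out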

Table 14.1 Comparative analysis of algorithm consumption time

Image sequence   Feature point algorithm   Ours
1                1.728                     1.1489
2                1.6413                    1.0729
3                1.6788                    1.0641
4                1.6804                    1.101
5                1.6577                    1.0913
6                1.654                     1.1615
7                1.7194                    1.1102
8                1.6935                    1.127
9                1.6322                    1.1428
10               1.6793                    1.0823
11               1.6151                    1.1808
12               1.7392                    1.1794
13               1.7055                    1.1878
14               1.7372                    1.0951
15               1.7211                    1.0985


Fig. 14.3 Trend of algorithm consumption time

Fig. 14.4 Comparison of criteria function results

By comparison, it can be seen that the algorithm in this paper consumes less time than the algorithm based on feature points. In addition, several groups of discrete points of different sizes were selected, and both the above algorithm and the feature-point-based algorithm were applied to each group; the experimental results are shown in Fig. 14.4. According to the analysis of the experimental results, when the number of points is very small, for example 16, the two algorithms obtain the same criterion function value. With the increase in the number of points, the optimization effect of the optimized GA becomes more and more obvious, while the error of the algorithm based on feature points becomes larger, which shows that the optimized GA achieves a good optimization effect.


14.4 Conclusion

With the improvement of the economic level, whether from the angle of the commercial economy or of cultural communication, human beings are no longer satisfied with the traditional two-dimensional information display mode, and the demand for participatory, three-dimensional display modes is increasing. The development of exhibition halls is limited by many factors such as weather, traffic, time, venue, cost, and safety, and combining VR technology with exhibition hall technology is an important means of removing the obstacles to the development of the exhibition industry. In this paper, research on VR-based exhibition hall space design is carried out, and a panorama generation model of the exhibition hall space scene is established. The research shows that the time consumed by the proposed algorithm is obviously less than that of the algorithm based on feature points. When the number of points is 16, the two algorithms obtain the same criterion function value; as the number of points increases, the optimization effect of the GA becomes more and more obvious, while the error of the feature-point-based algorithm becomes larger.

References

1. Kochi, N., Isobe, S., Hayashi, A., Kodama, K., Tanabata, T.: Introduction of all-around 3D modeling methods for investigation of plants. Int. J. Autom. Technol. 15(3), 301–312 (2021)
2. Gao, J., Shi, J.D., Wang, J.Z.: Modeling and simulation for small-tracked mobile robots. J. Beijing Inst. Technol. (2), 211–217 (2016)
3. Liu, J.: Application of UAV photogrammetry and 3D modeling in mine geological environment monitoring. Acta Geol. Sin. (Engl. Ed.) 93(2), 437–438 (2019)
4. Chatzivasileiadi, A., Wardhana, N.M., Jabi, W., Aish, R., Lannon, S.: Characteristics of 3D solid modeling software libraries for non-manifold modeling. Comput.-Aided Des. Appl. 16(3), 496–518 (2019)
5. Sui, Z., Mu, J., Wang, T., Zhang, S.: Evaluation of energy saving of residential buildings in North China using back-propagation neural network and virtual reality modeling. J. Energy Eng. 148(3), 04022013 (2022)
6. Jin, C., Zou, F., Yang, X., Liu, K.: 3-D virtual design and microstructural modeling of asphalt mixture based on a digital aggregate library. Comput. Struct. 242, 106378 (2021)
7. Fuessinger, M.A., Schwarz, S., Neubauer, J., et al.: Virtual reconstruction of bilateral midfacial defects by using statistical shape modeling. J. Cranio-Maxillofac. Surg. 47(7), 1054–1059 (2019)
8. Wang, Z.C., Shi, L., Yang, Y.S.: Application of virtual reality technology in the space design of offshore platform living quarters. China Offshore Platf. 35(2), 6–11 (2020)
9. Huang, W.W.: Research on architectural space environment design based on virtual reality technology. Autom. Instrum. 4, 31–33 (2017)
10. Li, B.: Characteristics and application of scene design in modern space display. Bonding 40(8), 92–94 (2019)
11. Zhao, Z.Y., Lv, J., Pan, W.J., Hou, Y.K., Fu, Q.W.: Research on VR space cognition based on virtual representation behavior. J. Eng. Des. 27(3), 9 (2020)
12. Zhang, W.Z., Gao, J., Gao, Z.H., Yu, H., Liu, K., Huang, Y.: Cosmic scale experience design from the perspective of virtual reality. Exp. Technol. Manag. 36(1), 80–83, 88 (2019)

Chapter 15

Identification of Expressway Traffic States Based on the Enhanced FCM Algorithm
Zhuocheng Yang, Liang Hao, Yuchen Liu, and Lei Cai

Abstract In order to enhance the accuracy and effectiveness of traffic state identification, a fuzzy C-means algorithm based on a simulated annealing genetic algorithm (SAGA-FCM) is proposed. First, according to the characteristics of expressway traffic, traffic states are divided into five levels based on the Van Aerde model. Second, because expressway traffic is inherently fuzzy, flow, speed, and density are taken as the characteristic attributes of the sample data, and an enhanced fuzzy C-means clustering method, SAGA-FCM, is proposed for the identification of traffic states; it overcomes the problems that the traditional FCM algorithm is sensitive to the initial clustering centers and easily falls into local optima. Finally, the M25 motorway is used as an example to evaluate traffic conditions. The results are consistent with the measured traffic conditions, which verifies the effectiveness of the method.

Keywords Traffic engineering · Traffic state identification · Clustering algorithm · Macroscopic fundamental diagram · Expressway

15.1 Description of Expressway Traffic States

Traffic states are objective reflections of the overall operation of traffic, which can be represented by a set of index variables that reflect the traffic in different aspects and at different granularities. They have the characteristics of dynamic randomness, continuity, and periodicity [1]. Traffic state identification can be divided into normal identification and abnormal identification: the former needs to identify not only traffic congestion but also the other types of traffic flow, while the latter focuses mainly on the identification of occasional traffic congestion. This paper mainly studies identification under normal conditions.


Traffic congestion is one of the most influential, longest-lasting, and most frequent traffic problems. Research abroad usually identifies traffic states by studying traffic anomaly events, while scholars in China usually adopt road occupancy, queuing vehicles, and traffic speed as the criteria for traffic state identification, comprehensively evaluating whether congested traffic flow exists on the basis of thresholds of traffic parameters such as road capacity and road occupancy. In automatic traffic state identification, traffic congestion is considered to exist only when several parameters, such as road occupancy, traffic speed, and traffic flow, exceed their thresholds within a certain period of time.

Models in traditional traffic flow theory usually describe traffic behavior with the theory of traditional physics, mathematics, and other basic disciplines. Although such models are reasonable, simple, and have clear physical meaning, they carry many restrictions. In contrast, machine learning uses data-driven methods to dig deeply into data information that is sensitive to abnormal features and their changes; it pays more attention to the application value of models and to algorithm research, and is mainly used for the study of complex traffic flow. Some machine learning methods can detect abnormal traffic events well and capture changes in traffic speed, road occupancy, and the presence of congested flow, so they can be used for traffic state identification [2].

In this paper, the macroscopic fundamental diagram combined with FCM (the fuzzy C-means clustering method) is adopted to identify the traffic states of the expressway. The macroscopic fundamental diagram describes the relationships among continuous traffic flow q, speed v, and density ρ on a road section based on historical data, using statistical methods in graphical form; it includes a speed–flow model, a speed–density model, and a flow–density model. Since the three traffic flow parameters satisfy q = v·ρ, the three basic graphs can be derived and converted from one another [3].

In order to study the relationship between expressway traffic flow parameters, a four-parameter single-structure Van Aerde model was selected as the modeling basis, which involves several key characteristic parameters: v_f, the free-flow speed (km/h); v_c, the critical speed (km/h); ρ_c, the critical density (pcu/km/lane); ρ_j, the blocking density (pcu/km/lane); and q_c, the road capacity (pcu/h/lane).

c1 = v_f (2v_c − v_f) / (ρ_j v_c²)   (15.1)

c2 = v_f (v_f − v_c)² / (ρ_j v_c²)   (15.2)

c3 = 1/q_c − v_f / (ρ_j v_c²)   (15.3)

q = v / (c1 + c2/(v_f − v) + c3·v)   (15.4)

The relationship curves of the Van Aerde model are shown in Fig. 15.1.


Fig. 15.1 Van Aerde model relationship curve
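A minimal sketch of calibrating Eqs. (15.1)–(15.3) and evaluating the flow–speed relation of Eq. (15.4) (parameter values in the example are illustrative, not measured values from the paper):

    def van_aerde_coeffs(vf, vc, rho_j, qc):
        """Compute c1, c2, c3 from the key characteristic parameters, Eqs. (15.1)-(15.3)."""
        c1 = vf * (2 * vc - vf) / (rho_j * vc**2)
        c2 = vf * (vf - vc) ** 2 / (rho_j * vc**2)
        c3 = 1.0 / qc - vf / (rho_j * vc**2)
        return c1, c2, c3

    def flow_from_speed(v, vf, vc, rho_j, qc):
        """Flow q (pcu/h/lane) for a given speed v with 0 < v < vf, Eq. (15.4)."""
        c1, c2, c3 = van_aerde_coeffs(vf, vc, rho_j, qc)
        return v / (c1 + c2 / (vf - v) + c3 * v)

    # Example with illustrative parameters:
    q = flow_from_speed(v=70.0, vf=110.0, vc=80.0, rho_j=120.0, qc=2000.0)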

In the study of road-section traffic states, traffic states can be divided into multiple grades based on the macroscopic fundamental diagram. As shown in Fig. 15.2, the expressway traffic states can be divided into the free-flow state, the crowded-flow state, and the blocked-flow state. Based on the above analysis, this paper further divides traffic states into unimpeded, basically unimpeded, mild congestion, moderate congestion, and severe congestion; in Fig. 15.3, the points A, B, C, and D divide the traffic states successively into these levels.

Unimpeded: The traffic flow is free as a whole, and the sample points fall in the low-density, low-flow interval. Vehicles run free from external interference in the road network. Analysis of the historical data shows that these samples appear mainly in the early morning.

Basically unimpeded: The sample points are also distributed in the low-density, low-flow interval, but the traffic flow state is worse than free flow and the span of the speed distribution is larger. At this time, the traffic of most sections is unimpeded

Fig. 15.2 The traffic states are divided into three levels


Fig. 15.3 The traffic states are divided into five levels

or close to the free-flow state, and the traffic flow of a few sections is unstable; however, no long-lasting congestion or queues form, so the impact on the overall performance of the road is small.

Mild congestion: The average flow is close to road capacity, but the average density is lower than the critical density (the density at capacity). At this time, the density of most sections is lower than or close to the critical density, while the density of a few sections is higher, but the overall average density remains below the critical density. If traffic demand continues to increase, the average flow will gradually decrease.

Moderate congestion: The average density has exceeded the critical density, the average flow begins to decline from capacity, and queuing on the road begins to deteriorate. The differences between the traffic flow states of individual sections increase, and all three section-level flow states appear, but the overall state is still controllable. If the traffic demand entering the expressway is reduced at this time, a steady traffic flow can be gradually restored or maintained; if demand continues to increase, the expressway will soon become clogged and may even run in a state of paralysis. This level of traffic flow mainly occurs in the morning and evening peak hours.

Severe congestion or obstruction: After the expressway enters the obstructed state, the data sample points are distributed in the high-density, low-flow interval. Large-scale congestion has formed, most road sections are in the blocked-flow state, and vehicles run slowly. Although some sections are unblocked or only slightly congested, they cannot drive the operation of the whole system. If corresponding traffic control and dredging measures are not adopted in time to limit or prevent traffic inflow, traffic operation will tend toward paralysis.


15.2 State Identification Based on Enhanced FCM Algorithm

15.2.1 FCM Clustering Analysis and Implementation

The theory of FCM is rooted in fuzzy set theory and fuzzy logic. The basic idea of FCM is to make the data belonging to the same category as consistent as possible (the smaller the intra-class variance, the better) and the data belonging to different categories as different as possible (the greater the inter-class variance, the better). During clustering, the membership degree of each data point with respect to each clustering center takes a value in the interval [0, 1]. Compared with hard C-means clustering, fuzzy C-means is best suited to samples whose membership lies on the boundary between classes and to problems with imprecise class boundaries, so it can reflect the actual situation more accurately; its disadvantage is a relatively slow convergence speed. In the unsupervised case, the closer a data point is to a clustering center, the greater the probability that the point belongs to that cluster [4]. The steps of the FCM algorithm are as follows; a sketch of the update steps is given after the list.

Step 1: Given the number of fuzzy subclasses c and the fuzziness exponent m, preliminarily set the clustering centers C^(0), the membership matrix U, the maximum iteration number N_max, and the iterative convergence threshold ε.

Step 2: Calculate the new fuzzy membership matrix U^(b).

Step 3: Calculate the new fuzzy clustering centers C^(b) according to the result of Step 2.

Step 4: Judge whether ||U^(b) − U^(b−1)|| ≤ ε holds. If the result is stable, stop and output the optimal clustering centers C and the sample membership matrix U; otherwise, set b = b + 1 and return to Step 2.
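A minimal NumPy sketch of the standard FCM updates behind Steps 2–3 (an illustration of the textbook algorithm, not the authors' MATLAB program):

    import numpy as np

    def fcm(X, c, m=2.0, max_iter=100, eps=1e-5, seed=0):
        """Fuzzy C-means on data X of shape (n_samples, n_features)."""
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        U = rng.random((c, n))
        U /= U.sum(axis=0)                          # memberships sum to 1 per sample
        for _ in range(max_iter):
            Um = U ** m
            centers = (Um @ X) / Um.sum(axis=1, keepdims=True)          # Step 3
            d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-12
            U_new = 1.0 / (d ** (2 / (m - 1)))
            U_new /= U_new.sum(axis=0)                                   # Step 2
            if np.linalg.norm(U_new - U) <= eps:                         # Step 4
                U = U_new
                break
            U = U_new
        return centers, U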

15.2.2 SAGA-FCM Algorithm

The traditional FCM algorithm has several shortcomings, such as sensitivity to the initial clustering centers, difficulty in determining the number of clusters, and a tendency to fall into local optima, which directly affect the effectiveness and rationality of FCM for traffic state identification. The simulated annealing algorithm (SA) has a strong local search ability and can prevent the search process from getting stuck in local optima, but its global search ability is insufficient. Conversely, although the local search ability of the genetic algorithm (GA) is poor, it has a strong ability to steer the overall search process. The SAGA hybrid algorithm combining SA and GA lets the two complement each other and improves the overall performance of the algorithm. The SAGA-FCM algorithm, which combines SAGA with the traditional FCM algorithm, can overcome the defects of traditional FCM and improve the validity and rationality of the identification results [5].


Fig. 15.4 The process of the SAGA-FCM algorithm

The process of the SAGA-FCM algorithm is shown in Fig. 15.4.
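As an illustration of how SA is typically embedded in the GA loop, the following is a generic sketch of the Metropolis acceptance step (the exact wiring in SAGA-FCM follows Fig. 15.4 and [5]):

    import math, random

    def sa_accept(old_cost, new_cost, temperature):
        """Metropolis rule: always accept improvements; accept worse
        solutions with probability exp(-(new - old)/T) to escape local optima."""
        if new_cost <= old_cost:
            return True
        return random.random() < math.exp(-(new_cost - old_cost) / temperature)

    # Cooling schedule as used in the experiments below: T <- 0.8 * T,
    # from an initial temperature of 100 down to a termination temperature of 1.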

15.3 Case Analysis

According to traffic flow theory, there is a nonlinear relationship between traffic flow parameters. Under the same traffic flow, the road may be in an unimpeded state or in a crowded state, showing different traffic speeds; therefore, a single traffic flow parameter cannot comprehensively evaluate traffic states. Most of the fixed traffic detectors currently used in road traffic flow detection systems can provide basic traffic data such as flow, speed, and density, so these three parameters are selected as the input variables for the identification of traffic states [6–8].


This experiment uses MATLAB R2020b to implement the FCM algorithm and the SAGA-FCM algorithm. The system clustering method, the traditional FCM algorithm, and the SAGA-FCM algorithm are used to cluster traffic states, with consistent parameters for all three: population size 100, maximum number of generations 100, crossover probability 0.7, mutation probability 0.01, initial annealing temperature 100, termination temperature 1, cooling coefficient 0.8, and fuzziness coefficient 2 [9, 10].

The traffic states of the M25 motorway near Heathrow Airport were analyzed, covering two-way 4-lane, two-way 5-lane, and two-way 6-lane segments (Fig. 15.5). From 1 August to 30 September 2019, seven observation stations monitored the M25 traffic flow in real time for 61 days. The raw data collected include key fields such as date, time, flow, and speed, with a data granularity of 15 min (Table 15.1). Analysis of the data from observation station A in August shows that the maximum free flow of this section is 1500 pcu/h/lane; when the maximum density is 20 pcu/km/lane, the corresponding maximum traffic flow is 2000 pcu/h/lane (Fig. 15.6). The traffic data collected continuously for 31 days can basically reflect the traffic conditions of this section. Different colors in Fig. 15.7 represent different time periods; there are 8 colors in total, that is, a day is divided into 8 time periods

Fig. 15.5 Location of the M25 motorway

Table 15.1 A sample of the raw observation-station data (15-min records for 2019/8/1, from 0:14:00 to 4:29:00). Columns: local date; local time; day type ID (9 for all rows); total carriageway flow; total flow of vehicles less than 5.2 m; total flow of vehicles 5.21–6.6 m; total flow of vehicles 6.61–11.6 m; total flow of vehicles above 11.6 m; speed value; quality index (15 for all rows); NTIS model version (10 for all rows). In this sample the total carriageway flow ranges from 164 to 313, and the speed value lies between roughly 100 and 105.

Fig. 15.6 Van Aerde model relationship curve, August 2019 (observation station A)

every 3 h. Different shapes represent different traffic states; there are five shapes, corresponding to the five traffic states. In August 2019, the traffic states in decreasing order of sample quantity are 3, 1, 4, 5, and 2; that is, from most to least frequent, the states of the road section are mild congestion, unimpeded, moderate congestion, severe congestion, and basically unimpeded (Fig. 15.8).

Fig. 15.7 Clustering results of the SAGA-FCM algorithm

Fig. 15.8 Traffic state statistics for August 2019 (sample size, 0–800, for each of the five states: unimpeded, basically unimpeded, mild congestion, moderate congestion, severe congestion)

The traffic states of a consecutive week, from August 5 to August 11, 2019, were identified as shown in Fig. 15.9. It can be seen from Fig. 15.9 that there are obvious morning and evening peaks on workdays, with different degrees of congestion after 6 o'clock. During the morning peak (6:00–9:00) congestion is serious, mainly because a large number of passengers travel to Heathrow Airport in the morning, most of them by car, causing congestion near the M25 motorway. On non-workdays there is no concentrated commuting time, traffic conditions are relatively stable, and serious congestion rarely occurs; in addition, morning congestion starts about 1–2 h later than on workdays and lasts a shorter time.

Through actual investigation of the traffic data of this section, statistical analysis of the data, and surveys of citizens' actual feelings about the road operation, the actual traffic states of this section are found to be basically consistent with the identification results of the SAGA-FCM algorithm.

The change rate of the distance between each sample and its clustering center can be used to compare the clustering effect of two algorithms. For sample i it is denoted as

C_i = (D_i,A − D_i,B) / D_i,B   (15.5)

where D_i,A denotes the distance of sample i from its clustering center under clustering algorithm A, and D_i,B the corresponding distance under algorithm B [11]. The change rate C_i reflects the difference in clustering effect between the two algorithms: a positive value indicates that the distance between each sample and its clustering center decreases under algorithm B, i.e., algorithm B enhances the intra-class compactness and its clustering centers better represent the characteristics of the samples, so algorithm B is superior to algorithm A. The higher the value of C_i, the more obvious the improvement of algorithm B over algorithm A. The change rates of distance between the FCM algorithm and the system clustering method, and between the SAGA-FCM algorithm and the FCM algorithm, were calculated; the results are shown in Fig. 15.10.
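A one-line sketch of Eq. (15.5) over all samples (assuming arrays of per-sample distances; illustrative only):

    import numpy as np

    def change_rate(d_a, d_b):
        """Per-sample change rate of distance to the cluster center, Eq. (15.5)."""
        return (np.asarray(d_a) - np.asarray(d_b)) / np.asarray(d_b)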


Fig. 15.9 Traffic states on workdays and non-workdays

Through the calculation of the change rate of distance for the above algorithms, it can be concluded that, among the algorithms used for traffic state identification, the SAGA-FCM algorithm and the traditional FCM algorithm are significantly better than system clustering, and the SAGA-FCM algorithm is better than the traditional FCM algorithm.


Fig. 15.10 Change rate of distance between FCM algorithm and system clustering method (left); change rate of distance between SAGA-FCM algorithm and FCM algorithm (right)

15.4 Summary

In this paper, the SAGA-FCM algorithm is used for traffic state identification and the following conclusions are drawn: (1) The comprehensive effect of the SAGA-FCM algorithm for traffic state identification is better than that of the traditional FCM algorithm and the system clustering method. (2) The SAGA-FCM algorithm overcomes the traditional FCM algorithm's sensitivity to the initial clustering centers and its insufficient global search ability; it can obtain a better global optimal solution and make the identification results more accurate. (3) The SAGA-FCM algorithm generally improves the effectiveness of traffic state identification, thereby improving the rationality and accuracy of the discrimination results while reducing the randomness of artificial classification and the roughness of traditional algorithms.

References

1. Sun, X.L.: Research on evaluation and prediction of urban road traffic condition and its application. Beijing Jiaotong University (2013) (in Chinese)
2. Jia, R., Dai, S.H., Huang, N., Li, S.Y., Liu, Z.Y.: Literature review on traffic congestion identification methods. J. South China Univ. Technol. (Nat. Sci. Ed.) 49(04), 124–139 (2021) (in Chinese)
3. Zhu, L.: Research on theory and method of urban expressway traffic situation assessment. Beijing Jiaotong University (2013) (in Chinese)
4. Zhang, H.Z., Wang, J.: Improved FCM clustering algorithm based on initial cluster center selection. Comput. Sci. 36(06), 206–209 (2009) (in Chinese)
5. Zhou, K.L., Yang, S.L.: Power load characteristics classification based on improved fuzzy C-means algorithm. Prot. Control Power Syst. 40(22), 58–63 (2012) (in Chinese)
6. Shang, Y., Li, X.G., Jia, B., Yang, Z.Z., Liu, Z.: Freeway traffic state estimation method based on multisource data. J. Transp. Eng., Part A: Syst. 148(4), 04022005 (2022)
7. Alexander, G., Noel, H., Michail, M., Anastasios, K.: An experimental urban case study with various data sources and a model for traffic estimation. Sensors 22(1), 144 (2021)
8. Dahiya, G., Asakura, Y.: Exploring the performance of streaming-data-driven traffic state estimation method using complete trajectory data. Int. J. Intell. Transp. Syst. Res. 19(3), 572–586 (2021)
9. Jiang, J.F., Chen, Q.S., Xue, J., Wang, H.B., Chen, Z.J.: A novel method about the representation and discrimination of traffic state. Sensors 20(18), 5039 (2020)
10. Risso, M.A., Bhouri, N., Rubiales, A.J., Lotito, P.A.: A constrained filtering algorithm for freeway traffic state estimation. Transp. A: Transp. Sci. 16(2), 316–336 (2020)
11. Babu, C.N., Sure, P., Bhuma, C.M.: Sparse Bayesian learning assisted approaches for road network traffic state estimation. IEEE Trans. Intell. Transp. Syst. 22(3), 1733–1741 (2020)

Chapter 16

Algorithms Applied in Soft Tissue Deformation Simulation: A Survey
Xiaoyu Cai and Hongfei Yu

Abstract In modern medical science, Minimally Invasive Surgery (MIS) is one of the most direct and effective ways to treat malignant lesions. The training of MIS is now no longer limited to actual surgical anatomy; there is a growing tendency toward VR technology, which depends on computer graphics and haptic rendering. Besides surgical equipment simulation and graphics rendering, Soft Tissue Deformation (STD) simulation is another critical technique in VR-based MIS simulation. We have therefore collected a large number of STD algorithms from 1986 to 2022, in particular the haptic feedback algorithms for STD, which can be divided into two categories: algorithms based on image distortion and algorithms based on simulating physical properties. After classifying the two categories, we compare the algorithms along dimensions such as efficiency, accuracy, complexity, and rendering speed.

Keywords Minimally invasive surgery simulation · Soft tissue deformation algorithm · Haptic feedback algorithm

16.1 Introduction

A major issue for the medical field is to offer patients more convenient and safer surgery, and achieving faster postoperative healing and less pain after the anesthesia fades also troubles practitioners. With the rapid development of synchronous video technology, computer technology, and the high-frequency electric knife, minimally invasive surgery (MIS) has become the backbone of surgery over the past 30 years. With the help of advanced computer-aided surgery systems, surgery has become safer, more reliable, and more accurate, wounds have become smaller, and some procedures even avoid epidermal trauma entirely, as in NOTES (Natural Orifice Translumenal Endoscopic Surgery).


The key technology in virtual MIS is the simulation of soft tissue deformation (STD). With the development of computer graphics technology, many methods have been proposed for soft tissue modeling, collision detection, and deformation. The most important technical points of STD lie in the real-time performance and accuracy of the deformation; how to make STD exceed the threshold of human sensory perception without sacrificing authenticity, both tactile and visual, has become a major topic for its improvement. In terms of haptic rendering, because the material characteristics of biological tissue are very complicated, including nonlinear elasticity, viscosity, and internal stress in soft tissue [1–3], the evolution of tactile rendering algorithms for STD has lagged behind. In order to address the problems above, we have collected many papers in search of an STD method with high real-time performance and good accuracy.

16.2 Algorithms for Deformation Simulation of Soft Tissue

16.2.1 Algorithms of Graphics Based Deformation

Image-based deformation algorithms are constructed mainly from the perspective of pure geometric construction, with less consideration of the physical characteristics of the rendered object.

Free Form Deformation
Sederberg et al. [4] proposed the Free Form Deformation (FFD) method in 1986, an important and enduring method for editing simple 3D geometric models. The steps of this method are summarized in Table 16.1.

Laplace Operator Method and Poisson Operator Method
Zhou et al. [5] and Sorkine et al. [6, 7] proposed Laplacian-based algorithms for large mesh deformation in ACM publications between 2004 and 2006, and Yu et al. [8] and Xu et al. [9] conceived a related approach based on the Poisson operator. The processing is divided into three steps, as shown in Table 16.2.

16.2.2 Algorithms Based on Physical Characteristics

Algorithms based on physical characteristics approximate the force properties of soft tissues with a model built on the tissues' actual physical properties, so that the model can achieve a realistic haptic feedback effect.

Table 16.1 Steps of free form deformation

Step 1 (Parallelepiped space creation): Create a parallelepiped frame around the model, establish a local coordinate system, and calculate the coordinates of the geometric model vertices in that system. Any point X has (s, t, u) coordinates given by

X = X0 + sS + tT + uU,
s = (T × U) · (X − X0) / ((T × U) · S),
t = (S × U) · (X − X0) / ((S × U) · T),
u = (S × T) · (X − X0) / ((S × T) · U)   (1)

where S, T, and U are the three edge vectors of the parallelepiped. Note that for any point interior to the parallelepiped, 0 < s < 1, 0 < t < 1, and 0 < u < 1.

Step 2 (Frame control point control): Move the frame control points, using the local coordinates (s, t, u) of the frame vertices to control the world coordinates, and use Bernstein polynomials to calculate the world coordinates of the geometric model vertices:

X_ffd = Σ_{i=0}^{l} C(l,i)(1 − s)^(l−i) s^i [ Σ_{j=0}^{m} C(m,j)(1 − t)^(m−j) t^j [ Σ_{k=0}^{n} C(n,k)(1 − u)^(n−k) u^k · P_ijk ] ]   (2)

where X_ffd is a vector containing the Cartesian coordinates of the displaced point and P_ijk is a vector containing the Cartesian coordinates of the control point.
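A minimal sketch of evaluating Eq. (2) for one vertex (assuming precomputed local coordinates (s, t, u) and a control-point lattice P of shape (l+1, m+1, n+1, 3); the names are illustrative):

    import numpy as np
    from math import comb

    def ffd_point(s, t, u, P):
        """Trivariate Bernstein evaluation of the deformed position, Eq. (2)."""
        l, m, n = P.shape[0] - 1, P.shape[1] - 1, P.shape[2] - 1
        x = np.zeros(3)
        for i in range(l + 1):
            bi = comb(l, i) * (1 - s) ** (l - i) * s ** i
            for j in range(m + 1):
                bj = comb(m, j) * (1 - t) ** (m - j) * t ** j
                for k in range(n + 1):
                    bk = comb(n, k) * (1 - u) ** (n - k) * u ** k
                    x += bi * bj * bk * P[i, j, k]
        return x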


Table 16.2 Steps of the Laplace operator method

Step 1 (Model assumption): Assume the model is a triangular mesh. Let V = (v1, v2, …, vn) be the mesh vertex positions and N_i be the index set of the vertices adjacent to v_i. The Laplacian coordinate (LC) of a vertex v_i is

l_i = Σ_{j∈N_i} w_ij (V_j − V_i)

Step 2 (Vertex position updating): Let V^t and l^t be the vertex positions and LCs at time t. Using the current LCs l^t, the vertex positions V^{t+1} can be computed, since the LCs and the current handle positions become the constraints of Eq. (3), which leads to the normal equations (4):

AV = b   (3)

AᵀA V^{t+1} = Aᵀ b^t   (4)

where A is the coefficient matrix of the constraints and b^t is assembled from the current LCs and handle positions.

Step 3 (Laplacian coordinate updating): After Step 2, the LCs are updated to match the current deformed surface. To maintain the original sizes of the features, the magnitudes of the LCs should be preserved, so the updated LC is defined by Eq. (5), in which n_i^{t+1} is the unit curvature normal computed from the vertex positions V^{t+1}:

l_i^{t+1} = ||l_i^0|| n_i^{t+1}   (5)

185

Table 16.3 Finite element model algorithm processes steps Steps

Content

Methods

1

Soft tissue model division

A soft tissue model can be divided into multiple tetrahedral structures, and the displacement u = (u, v, w) of these tetrahedral structures can be replaced as (6), where I is a 3 × 3 Identity matrix, while N is a linear profile function defined as (7) [ ] u = N e ue = I Ni I N j I Nm I N p [uej uej uem uep ]T (6) Nr =

ar +br x+cr y+dr z ,r 6V

= i, j, m, p

(7)

while,

ai=1 =

bi=1 =

| | | | | | | | | x j yj z j | | 1 yj z j | | | | | (−1)i−1 || xm ym z m || (−1)i || 1 ym z m || | | | | | x p yp z p | | 1 yp z p | | | | | | xj 1 zj | | | i| ci=1 = (−1) | xm 1 z m || di=1 = | | | xp 1 zp | | | | | | x j yj 1 | | | (−1)i || xm ym 1 || | | | x p yp 1 | V is the volume of every element 2

Deformation energy calculation

Each element u above is used to represent the overall deformation force and the stress ε can be expressed as ε = BU , where B is a displacement differentiation matrix of 6 × 3n e and U is a 3n e × 1 vector which is to represented the displacement of every node. Therefore, it can be presumed that the pressure of linear elastic material σ relate to ε as a linear relationship via Young’s Modulus. Then the energy E is (8), while D is a symmetric elastic matrix ∫ ∫ T E = & 21 σ T εd V = 21 ε Dεd V = (8) ∫ 1 T 2 (BU ) D(BU )d V

and limits the radial deformation of the model. In addition, each diagonal of each hexahedron corresponds to a “shear” spring, which limits the shearing of the model. Generally, the model is consisted of two parts which are exhibited in Table 16.4.

186

X. Cai and H. Yu

Fig. 16.1 Mass-spring model

Position-Based-Dynamic Position-Based-Dynamic was firstly proposed by Müller et al. [15] in 2006 and has been improved in various of ways [16–20]. For example, the well-known digital graphic company Nvidia Corporation has developed a soft deformation algorithm based on PBD model for clothes, fluids and rigid bodies simulating [21]. It is a dynamic simulation algorithm in computer vision, mainly used for various scenes that need real-time simulation. A PBD’s working flow could be encapsulated into 17 lines as Table 16.5 shown. Cell Neural Network Model The proposed neural network is the cellular neural network (CNN), which is a localinterconnected array-computing structure. The idea was first proposed by Chua and Yang [22] in 1988 on IEEE Transactions on Circuits and Systems. In 2019 Jinao Zhang et al. [21] transfer the idea into STD for surgical simulation. The core idea of this method could be divided into two parts in Table 16.6.

16 Algorithms Applied in Soft Tissue Deformation Simulation: A Survey

187

Table 16.4 Parts of mass-spring model Parts

Content

Details

1

Model decomposition

Assuming that the number of particles in the mode is n, the state of the whole model at the moment can be determined by the position xi and velocity vi of each particle i = 1 . . . n. The resultant force f i on each particle obtains from the internal force provided by the spring which connects to the particle and the external force (gravity, friction, force generated by user manipulation, etc.) on the particle. Generally, linear elastic spring is used in the model. According to Hooke’s law, the elastic force exerted on particle i by the spring between particles j can be represented as (9). Where X i j = (X i − X j ) is the difference of position vector X i and X j , while ks is the stiffness coefficient and li j is the original length of the spring | | X f i j = ks (| X i j | − li j ) |X ii jj | (9)

2

Model adjustment

However, in reality, elastic objects do not only exhibit ideal elastic properties. Elastic objects lose energy during deformation. Therefore, a damping between particles should be added into the model during a relative motion like Eq. (10) shown (| | ) X f i j = ks | X i j | − li j | X i j | + kd (V j − V i ) (10) ij

16.3 Results Each STD simulation algorithm utilizes its own unique property. The general features and capabilities are summarized in Table 16.7. It can be analyzed from the above table that the algorithms based on PositionBased-Dynamics show a good convergence and great controllability with STD while Cell Neural Net Model perform a great potential in low cost of calculation power and high algorithm extensibility. Besides, Physical properties-based algorithms perform higher calculate speed and more stability than Graphics-based do.

16.4 Conclusion Virtual surgical robot needs a set of stable and real-time available STD simulation algorithm to provide haptic feedback whose performance are close to real surgery. The purpose of this paper is to collect a variety of simulation algorithms for STD, introduce the working flow during the surgery simulation processing and compare their advantages and limitations to find a more suitable simulation algorithm for

188

X. Cai and H. Yu

Table 16.5 Processing lines of position-based-dynamic Parts

Line

Content

Initialize the state variables

1

For all vertices i

2

Initialize X i = X 0i , V i = V 0i , wi = 1/m i

3

End for

4

Loop

5

For all vertices i do V i ← V i + ∆twi f ex t (X i )

Manipulate the velocities

6

dampVelocities(V 1 , . . . . . . , V N )

Estimates P i for new locations of the vertices are computed using an explicit Euler integration step

7

For all vertices i do P i ← X i + ∆t V i

The iterative solver manipulates position estimates the constraints

8

For all vertices i do generateCollisionConstraints (X i → P i )

9

Loop solverIterations times

10

projectConstraints (C1 , . . . . . . , C M+Mcoll , P 1 , . . . . . . , P N )

11

End loop

The positions of the vertices are moved to the optimized estimates and the velocities are updated accordingly

End loop

12

For all vertices i

13

V i ← ( P i − X i )/∆t

14

Xi ← Pi

15

End for

16

velocityUpdate (V 1 , . . . . . . , VN )

17

End loop

Table 16.6 Parts of cell neural network model Parts

Content

Details

1

Cell description Cell state

xi (t) = C x dvdt

∑ − R1x vxi (t) + C( j)∈Nr (i) ∑ C( j )∈Nr (i )

Cell output Conditions 2

(11)

A(i; j )v y j (t) + B(i; j)vu j (t) + Ii v yi (t) = f (vxi (t))

= 21 (|vxi (t) + K | − |vxi (t) − K |), K ≥ 1# |vxi (0)| ≤ K ; |vui | ≤ K

(12) (13)

Neural Since the load of mechanical is applied to the model, the cell current propagation of can be applied as a strain energy density for the direction of external potential energy force like Eq. (14) shown. Where I i is the vector form of cell current Ii , σ and ε are the stress and strain at the point, and F is the external force and ||F|| is the magnitude ∫ F I i = σ dε ||F|| (14)

Table 16.7 General features and capabilities of STD simulation algorithms

Based on graphics:

FFD: Free-form deformation of solid geometric models [4], 1986, Sederberg, T.W.; Parry, S.R.
  Advantages: simple model; readily understood; low learning cost. Limitations: graphics only; the time complexity is O(n³).

Laplace operator: Large mesh deformation using the volumetric graph Laplacian [5], 2005, Zhou, K.; Huang, J.; Snyder, J.; Liu, X.; Bao, H.; et al.
  Advantages: produces visually pleasing results; the iterative framework automatically updates the non-constrained vertices. Limitations: the input mesh has to be 2-manifold; connectivity is changed.

Laplace operator: Dual Laplacian editing for meshes [7], 2006, Au, O.K.-C.; Tai, C.L.; et al.
  Advantages: allows two mesh boundaries to have very different shapes, sizes, and roughness. Limitations: only guarantees C0 continuity between constrained and free vertices.

Poisson operator: Mesh editing with Poisson-based gradient field manipulation [8], 2004, Yu, Y.; Zhou, K.; et al.; Differential Mesh Processing [9], 2006, Dong Xu.

Based on physical characteristics:

FEM: An improved scheme of an interactive finite element model for 3D soft-tissue cutting and deformation [10], 2005, Wen Wu; Pheng Ann Heng.
  Advantages: great simulation accuracy. Limitations: high computing power needed.

XFEM: XFEM framework for cutting soft tissue including topological changes in a surgery simulation [12], 2010, Gutiérrez, L.F.; Ramos, F.
  Advantages: adequate accuracy with low computing power. Limitations: insufficient consideration of self-collisions of discontinuous elements.

MSM: Research on an improved mass spring model and collision detection algorithm [13], 2019, Hong Zhicha.
  Limitations: lack of real-time performance; lack of stability.

PBD: Position based dynamics [15], 2007, Müller, M.; Heidelberger, B.; Hennix, M.; A triangle bending constraint model for position-based dynamics [16], 2010, Kelager, M.; Niebe, S.; Erleben, K.; A GPU-based implementation of position-based dynamics for interactive deformable bodies [17], 2013, Fratarcangeli, M.; Pellacini, F.; Fast simulation of cloth-rigid body based on PBD [18], 2020, Huang, S.; Zang, H.
  Advantages: good convergence; very robust; great controllability. Limitations: constraints become arbitrarily stiff as the iteration count increases.

CNNM: Neural network modelling of soft tissue deformation for surgical simulation [21], 2019, Zhang, J.; Zhong, Y.; Gu, C.
  Advantages: stable and fast; real-time available. Limitations: needs feasibility verification.

16 Algorithms Applied in Soft Tissue Deformation Simulation: A Survey

191

complex structures of soft tissues. It can be concluded that, with the rapid growth of computer processing speed, physics-based soft tissue deformation algorithms can gradually replace the traditional graphics-based deformation algorithms. For example, both the PBD algorithm and the more recent CNNM algorithm reduce the difficulty of deformation to a certain extent and increase the speed of the deformation calculation. Future research will focus on extending these algorithms and applying them in real surgical operation scenes such as cutting, joining, tearing, burning, and suturing. In addition, studying the coupling capability of these algorithms with virtual reality systems will be included in the scope of study.

References

1. Zhang, Q.K., et al.: A non-linear deformable model for simulation of real-time soft tissue deformation. J. Appl. Sci. 25(3) (2007)
2. Yan, L.X.: Research of the Key Techniques on Virtual Surgery. Zhejiang University, pp. 6–10 (2001). (In Chinese)
3. Tai, Y.H.: Research on Real-time Physics-based Deformation. Deakin University (2018)
4. Sederberg, T.W., Parry, S.R.: Free-form deformation of solid geometric models. ACM SIGGRAPH Comput. Graph. 20(4), 151–160 (1986)
5. Zhou, K., Huang, J., Snyder, J., Liu, X., Bao, H., Guo, B., Shum, H.Y.: Large mesh deformation using the volumetric graph laplacian. In: ACM SIGGRAPH 2005 Papers, pp. 496–503 (2005)
6. Sorkine, O., Cohen-Or, D., Lipman, Y., Alexa, M., Rössl, C., Seidel, H.P.: Laplacian surface editing. In: Proceedings of the 2004 Eurographics/ACM SIGGRAPH Symposium on Geometry Processing, pp. 175–184 (2004)
7. Au, O.K.C., Tai, C.L., Liu, L., Fu, H.: Dual Laplacian editing for meshes. IEEE Trans. Vis. Comput. Graph. 12(3), 386–395 (2006)
8. Yu, Y., Zhou, K., Xu, D., Shi, X., Bao, H., Guo, B., Shum, H.Y.: Mesh editing with poisson-based gradient field manipulation. In: ACM SIGGRAPH 2004 Papers, pp. 644–651 (2004)
9. Dong, X.: Differential Mesh Processing. Zhejiang University, Hangzhou (2006). (In Chinese)
10. Wu, W., Heng, P.A.: An improved scheme of an interactive finite element model for 3D soft-tissue cutting and deformation. Vis. Comput. 21(8–10), 707–716 (2005)
11. Bro-Nielsen, M.: Fast fluid registration of medical images. In: Visualization in Biomedical Computing (1996)
12. Gutiérrez, L.F., Ramos, F.: XFEM framework for cutting soft tissue including topological changes in a surgery simulation (2010)
13. Hong, Z.C.: Research on an Improved Mass Spring Model and Collision Detection Algorithm. Nanchang University (2019). (In Chinese)
14. Wang, Y.Z.: Research on Improved Mass-Spring Models and Their Application in Virtual Surgery. National University of Defense Technology (2006). (In Chinese)
15. Müller, M., Heidelberger, B., Hennix, M., et al.: Position based dynamics. J. Vis. Commun. Image Represent. 18(2), 109–118 (2007)
16. Kelager, M., Niebe, S., Erleben, K.: A triangle bending constraint model for position-based dynamics. VRIPHYS 10, 31–37 (2010)
17. Fratarcangeli, M., Pellacini, F.: A GPU-based implementation of position-based dynamics for interactive deformable bodies. J. Graph. Tools 17(3), 59–66 (2013)
18. Huang, S., Zang, H.: Fast simulation of cloth-rigid body based on PBD. IOP Conf. Ser.: Mater. Sci. Eng. 782(2), 022039 (2020)
19. Macklin, M., Müller, M., Chentanez, N.: XPBD: position-based simulation of compliant constrained dynamics. In: Proceedings of the 9th International Conference on Motion in Games, pp. 49–54 (2016)
20. PhysX. NVIDIA. https://developer.nvidia.com/physx-sdk. Last accessed 12 Nov 2022
21. Zhang, J., Zhong, Y., Gu, C.: Neural network modelling of soft tissue deformation for surgical simulation. Artif. Intell. Med. 97, 61–70 (2019)
22. Chua, L.O., Yang, L.: Cellular neural networks: theory. IEEE Trans. Circuits Syst. 35(10), 1257–1272 (1988)

Chapter 17

Soft Tissue Cutting Based on Position Dynamics Zijun Wang and Hongfei Yu

Abstract In order to keep pace with the rapid development of virtual medical surgery, a soft tissue cutting model based on position-based dynamics is proposed. A particle-constrained tetrahedral mesh is used to reconstruct the soft tissue of the human body, and convenient, real-time operation is achieved through a visual interactive platform and force feedback calculation. The user can control the surgical instrument to reach the organ lesion area through the force feedback stroke area and can also adjust the instrument to its optimal position and posture. Experimental simulation fully demonstrates that the model remains stable as the time step increases; at the same time, expressing the applied forces in the form of constraints yields high computational efficiency, which can meet the high standards of human-computer interaction. Keywords Position based dynamics · Particle constraint · Tetrahedron reconstruction · Force feedback calculation

17.1 Introduction

With the rapid development of virtual medical technology, soft tissue constructed with simple models can no longer meet the authenticity requirements of the surgical process, especially for partial excision of organs, where a series of details such as physiological characteristics, blood vessel rupture, and fine soft tissue structure are difficult to reproduce during surgery. This is a huge test for the convergence of modern medical technology and virtual technology. The emergence of minimally invasive interventional surgery (MIS) [1] has enabled the transition from traditional surgery to minimally invasive procedures. MIS enters the lesion area through a tiny incision in the patient's body, reducing the

pain and postoperative morbidity, effectively shortening the patient's treatment cycle, and improving the patient's recovery rate. However, due to the limited operating range of MIS, surgeons are required to have many years of operating experience and to react quickly within the limited field of view of the instrument-tip camera, which demands a rigorous assessment of the surgeon's operating skills. Therefore, to reduce the time surgeons spend mastering soft tissue cutting and to bridge the differences in texture and interaction between cadavers and living bodies during surgical training, this paper proposes an immersive surgical interactive platform based on position-based dynamics (PBD), which can provide users with visual, tactile, and other sensory simulation. By building a virtual surgical scene, users can reproduce the operating environment and quickly establish a three-dimensional sense of it.

17.2 Position-Based Dynamics Algorithm and Tetrahedral Mesh Generation

The validity of the soft tissue model directly affects the authenticity of the surgery, including the simulation of the organs and the user's stereoscopic sense of the operation. At present, common soft tissue models include the Finite Element Method (FEM), the Mass Spring Method (MSM), and Position-Based Dynamics (PBD). The position-based dynamic model describes the forces exerted on the system through constraints [2], and the particle displacements and shape changes of the model are computed directly from particle positions. It breaks through the traditional force-based physical model and no longer updates object positions through forces, accelerations, and Newton's second law [3]. The position-based dynamics model proposed by Müller et al. [4] effectively solves the over-adjustment problem of MSM. The PBD model has higher computational efficiency than the MSM model and also avoids the accuracy problem of non-convergence to the true values in the FEM model. The position-based dynamic model has good adaptability in soft tissue modeling, and its real-time performance and stability are better than those of the FEM and MSM models.

17.2.1 Tetrahedral Mesh Generation

Soft tissue modeling is mainly divided into surface meshes and volume meshes. The object surface is formed by fitting vertices and polygons. The surface mesh has a good visualization effect [5] and low computation time, but it cannot describe the internal structure and stresses of organs; it is only applicable to simulating the surface layer and local regions of organs and therefore cannot simulate whole-organ soft tissue. The volume mesh, dominated by the tetrahedral volume mesh, avoids the unreality caused by the loss of internal information in surface-mesh modeling, improves the internal support of the model, and at the same time retains the rendering advantages of the surface mesh, which improves the physical accuracy of the simulated soft tissue, as shown in Fig. 17.1.

Fig. 17.1 Tetrahedral mesh inside the body model: (a) tetrahedral mesh rabbit model; (b) tetrahedral mesh cylinder model

17.2.2 Principle of the Position-Based Dynamics Algorithm

Position-based dynamics generally controls the deformation and displacement of objects through geometric constraints, and constraint functions such as stretching, self-collision, and bending can all be defined. In 2013, Macklin realized friction constraints for coupled rigid and flexible bodies [6]. A position-based dynamics model can be expressed by its vertices and the defined constraint functions. Suppose the tetrahedral mesh of the physical model has N vertices and M corresponding constraints. Each vertex i ∈ [1, ..., N] has mass m_i, position X_i, and velocity V_i. Each constraint C_j, j ∈ [1, ..., M], can be expressed as follows:

C_j(X_1, ..., X_N) = 0 or C_j(X_1, ..., X_N) ≥ 0   (17.1)

Each constraint has a stiffness coefficient that controls how strongly it is enforced. The PBD algorithm first computes an initial predicted position for each vertex; under the influence of the applied forces, the vertices are then shifted to satisfy all constraints:

C_j > 0 (j = 1, 2, ..., M)   (17.2)

Algorithm implementation: First, initialize the vertex positions and velocities of the tetrahedral mesh, and update the current velocity according to V_i(t) = V_i(t − ∆t) + ∆t w_i f_ext (where ∆t is the time interval and w_i = 1/m_i). Second, damping is applied to the refreshed velocities to reduce the displacement speed of the vertices caused by the applied force. Then, the vertex position is predicted according to p_i(t) = X_i(t − ∆t) + ∆t V_i(t). For details, refer to the flow chart of the position-based dynamics algorithm in Fig. 17.2.
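As an illustration of the constraint projection step referred to above, the sketch below projects a single stretch (distance) constraint onto the predicted positions; the stiffness k and the inverse masses are example assumptions, not values from the paper.

```python
import numpy as np

def project_distance_constraint(p_i, p_j, w_i, w_j, rest_len, k=1.0):
    """Project one stretch constraint C(p_i, p_j) = |p_i - p_j| - rest_len.

    Moves the predicted positions along the constraint gradient,
    weighted by the inverse masses w_i = 1/m_i and w_j = 1/m_j.
    """
    d = p_i - p_j
    dist = np.linalg.norm(d)
    if dist < 1e-12 or (w_i + w_j) == 0.0:
        return p_i, p_j
    n = d / dist                             # constraint gradient direction
    lam = (dist - rest_len) / (w_i + w_j)
    return p_i - k * w_i * lam * n, p_j + k * w_j * lam * n

p_i, p_j = np.array([0.0, 0.0, 0.0]), np.array([2.0, 0.0, 0.0])
print(project_distance_constraint(p_i, p_j, 1.0, 1.0, rest_len=1.0))
# both particles move 0.5 toward each other, so C becomes exactly 0
```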


Fig. 17.2 Flow chart of position-based dynamics algorithm

17.3 Interactive Cutting Algorithm

To achieve realistic cutting in virtual medical surgery, two force feedback devices are used to simulate the scalpel and tweezers in real time. The organ model and the Touch devices are automatically loaded and initialized in the simulation, and the initialized force feedback positions and the soft tissue model are displayed. The user can control the surgical instrument to reach the organ lesion area through the force feedback stroke area and can also adjust the instrument to its optimal position and posture.

17.3.1 Tetrahedral Cutting

In virtual surgery, in addition to information such as the soft tissue incision position, incision direction, and the cutting trajectory of the instrument, attention should also be paid to the internal composition exposed after cutting. When the surgical instrument collides with a tetrahedron, its motion splits the single tetrahedron into multiple smaller tetrahedra. The splitting and refinement of tetrahedral primitives directly affect the processing speed: if processing is too slow, the cutting process stutters, resulting in insufficient user interactivity. In Fig. 17.3a, yellow marks the cutting path of the scalpel. When a tetrahedron intersects and collides with the cutting surface, it is first stored in a collection; the intersection points with the cutting surface are then calculated and stored, together with the distance from each intersection point to the nearby vertices. When the distance is less than a threshold, the vertex is marked to be moved onto the cutting surface; if the distance is greater than or equal to the threshold, the element is marked for splitting. Comparing Fig. 17.3c and d, when the tetrahedron is cut, the cutting surface is generated where the surgical instrument first collides, and the number of particles increases.

Fig. 17.3 Schematic diagram of tetrahedron cutting track and particle distribution: (a) intersection of surgical instrument and tetrahedron; (b) change of tetrahedron cutting surface; (c) tetrahedral particle distribution before cutting; (d) distribution of tetrahedral particles after cutting

17.3.2 Particle Constraint

This paper adopts the real-time particle visual-effect simulation technology of Nvidia Flex. The core idea of Flex is that everything is a particle system connected by constraints, which enhances the visual effect. The Flex API extension library contains a solver for creating soft bodies, which is mainly used to convert mesh primitives into single particles and to group the resulting particles into clusters. Because shape matching resists bending and torsion in the model, bending and stretching deformation can be controlled by setting the stiffness of links and clusters separately. Flex stores the position, velocity, and mass information of each particle in the [x, y, z, 1/m] format. In the real-time simulation, each particle has its own solver pass that computes collisions between particles and between particles and meshes. After a collision, the particles find the best rigid transformation through least squares, so that the displaced particles can be matched back to the rigid state. The advantage of the position-based dynamic model is that it can limit the movement of particles to the direction of the constraint gradient, perpendicular to the simulated rigid body motion, so as to maintain energy conservation and prevent particle clustering, allowing the virtual surgery to achieve a high degree of realism.

Fig. 17.4 Simulation of cutting tetrahedron generation
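The "best rigid transformation through least squares" mentioned above is the classical shape-matching step: find the rotation and translation that best map the particles' rest configuration onto their deformed positions. The following sketch shows one standard SVD-based way to compute it; it is an illustrative reconstruction, not Nvidia Flex's internal implementation.

```python
import numpy as np

def shape_match(q, p, m):
    """Least-squares rigid transform mapping rest positions q to current p.

    q, p : (n, 3) rest and deformed particle positions; m : (n,) masses.
    Returns rotation R and translation t; goal positions are g_i = R q_i + t.
    """
    cq = np.average(q, axis=0, weights=m)      # rest center of mass
    cp = np.average(p, axis=0, weights=m)      # current center of mass
    A = (m[:, None] * (p - cp)).T @ (q - cq)   # mass-weighted moment matrix
    U, _, Vt = np.linalg.svd(A)                # polar decomposition via SVD
    R = U @ Vt
    if np.linalg.det(R) < 0:                   # avoid reflections
        U[:, -1] *= -1
        R = U @ Vt
    t = cp - R @ cq
    return R, t

# Particles are then pulled toward the goals g_i = R q_i + t, scaled by
# the cluster stiffness, which is what resists bending and torsion.
```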

17.3.3 Section Treatment

Cutting algorithms commonly use either complete or incomplete cutting. The incomplete cutting algorithm separates the parts that have not been cut from the cut tetrahedra to reduce the amount of calculation; however, because it is only suitable for one-sided partial cutting, the result lacks a sense of wholeness. Complete cutting is based on the minimum-subset method, which improves the calculation speed; the section structure is smooth and complete, and new tetrahedra are generated after cutting. Reconstructing the tetrahedral mesh is conducive to understanding the internal structure of the soft tissue with stable interaction, as shown in Fig. 17.4. If the intersection of the cutting face is MON, the intersection points are duplicated as M'O'N'. Triangles MON and M'O'N' are inserted to form the notch, and face ABD is split into face DMO and face ABM'O'. Similarly, face ACD is split into face DMN and face M'N'CA, and face DBC is split into face DNO and face O'N'CB. Then BM', BN', and AN' are connected, and faces ABN' and BN'M' are inserted. Finally, five new tetrahedra DMNO, BM'N'O', ABM'N', ABCN', and ABO'N' are reconstructed. The tetrahedral mesh is rebuilt on the cutting surface to form the surface mesh of the cut; at the same time, the particles on the cutting surface shift and the number of particle constraints increases.

17.4 Simulation Experiment

Three groups of experiments were conducted on the Unity 2018 simulation platform with OpenGL, OpenHaptics, and two UR5 force feedback devices. Figure 17.5 shows the independently constructed and rendered lung model being dragged and cut.

Fig. 17.5 Simulated independent lung cutting: (a) complete lung compression; (b) partial cutting of the complete lung

As shown in Fig. 17.6, the algorithm based on position dynamics can effectively simulate the deformation of soft tissue and smoothly display the cutting process. The internal structure of the cutting surface is clear, and there is no stuttering despite the large amount of particle motion calculation. A high frame rate is maintained for different models such as the cylinder, the independent lung tissue, and the virtual abdominal cavity. The comparison of the cutting performance of the different models shows that position-based dynamics has clear advantages in force feedback capability, cutting speed, complexity, robustness, and quality change.

17.5 Conclusion

The most critical technical problem in virtual medical surgery is balancing authenticity against real-time simulation. Focusing on precision requires optimized, fine-grained modeling and cutting design, which increases the simulation calculation time; therefore, the simulation speed should be improved as much as possible on the premise of ensuring authenticity. In this paper, a position-based dynamics physical model is adopted to avoid the problem that traditional mass-spring models cannot describe volume-preserving constraints. Constraint functions are used to control the particles directly, which avoids both the unstable overshoot caused by explicit integration and the excessive computation of implicit integration. In the experimental simulation, three conventional surgical operations with force feedback equipment (pinching, pressing, and cutting soft tissue) are demonstrated, realizing the tactile sense and the authenticity of the surgical simulation. For cutting, the processing of the cut-surface model, the simulation of particle displacement and splitting, and the generation of the surface mesh after cutting the volume mesh are shown. The advantages of position-based dynamics in simulating real-time soft tissue cutting are fully verified, as shown in Fig. 17.7.


Fig. 17.6 Virtual lung cutting surgery: (a) simulation of the virtual lung surgery scene; (b) simulated virtual lung surgery clipping; (c) simulation of the virtual lung surgery cutting trajectory; (d) simulated virtual lung surgery cutting surface

Fig. 17.7 Cutting performance of different models


References

1. Jia, S.Y., Pan, Z.K.: A preliminary study on soft tissue cutting in virtual surgery simulation. J. Shanghai Univ. 10(s), 177–183 (2004)
2. Wu, W., Sun, J., Heng, P.A.: A hybrid condensed finite element model for interactive 3D soft tissue cutting. Stud. Health Technol. Inform. 94 (2003)
3. Rizoiu, I.M., DeShazer, L.G., Eversole, L.R.: Soft tissue cutting with a pulsed 30-Hz Er, Cr:YSGG laser. In: Biomedical Optoelectronic Instrumentation, vol. 2396, pp. 273–283. SPIE (1995)
4. Müller, M., Heidelberger, B., Teschner, M., Gross, M.: Meshless deformations based on shape matching. ACM Trans. Graph. (TOG) 24(3), 471–478 (2005)
5. Loukas, C., Nikiteas, N., Kanakis, M., Georgiou, E.: Deconstructing laparoscopic competence in a virtual reality simulation environment. Surgery 149(6), 750–760 (2011)
6. Pan, J.J., Chang, J., Yang, X., Liang, H., Zhang, J.J., Qureshi, T., Howell, R., Hickish, T.: Virtual reality training and assessment in laparoscopic rectum surgery. Int. J. Med. Robot. Comput. Assist. Surg. 11(2), 194–209 (2015)

Chapter 18

Research on Application Technology of Multi-source Fusion of 3D Digital Resources in Power Grid Hai Yu, Zhimin He, Lin Peng, Jian Shen, and Kun Qian

Abstract In view of the technical requirements for 3D model resource fusion in power grid infrastructure and operation and inspection, this paper studies spatial transformation, pose registration, and comprehensive reconstruction algorithms for multi-media heterogeneous models and analyzes multi-source 3D digital resource fusion technology across variable time and space. A fusion registration analysis method for 3D digital resources oriented to multi-medium and changeable space-time is proposed, forming a management scheme for updating the 3D digital models used in infrastructure and operation inspection. The results support the fusion analysis and interaction of 3D digital resources and provide a technical basis for the fusion application of multi-source and polymorphic 3D models. Keywords 3D digital resource integration · Pose registration · Comprehensive reconstruction

18.1 Introduction

In recent years, State Grid Corporation of China has carried out a series of pilot applications of grid digital twin technology, such as an intelligent Internet of Things digital twin service platform, an intelligent operation and inspection service platform for transmission lines, and high-precision three-dimensional panoramic displays of transmission lines, achieving the three-dimensional live

display and initial application of grid equipment. However, research on the multi-source integration of 3D digital resources and shared service technology is not yet deep enough to support the comprehensive application of 3D digital models of power grids in the fields of infrastructure, operation, and inspection, and has the following shortcomings [1]. The three-dimensional models of power grid equipment come in various formats, lack standardized and unified access interfaces, and are difficult to interoperate. For example, the design stage uses the State Grid Grid Information Model (GIM) data format, while the operation and maintenance stage uses three-dimensional models in formats such as 3D Max, LAS, and laser point cloud data. The large volume of State Grid GIM 3D data resources produced in the design phase cannot be used effectively in the operation and maintenance phase, so it is difficult to realize their full value, resulting in a huge waste of State Grid digital resources and repeated investment in construction. On the other hand, the GIM 3D models from the design stage have not been verified against actual operation and maintenance, so it is difficult to ensure that they are consistent with the on-site entities. Most current 3D applications of power grid equipment are limited to using the "shell" of the 3D model to meet the needs of panoramic visualization; they lack intelligent interaction, modeling simulation, fusion diagnosis, and other functions, and thus lack the "soul" of the digital twin. They do not really reflect the core connotation of digital twin technology and cannot let end users experience the huge benefits brought by applying 3D digital models of power grids. Therefore, it is necessary to study a 3D digital base that provides unified access and output of 3D models in different formats, model refinement, lightweighting, model assembly and calibration, and model management, so as to realize the interoperability of 3D models of power grid equipment in different formats. This will enable 3D data models such as GIM formed in the design stage to be applied effectively in the operation and maintenance stage and give full play to the full life-cycle value of the State Grid's huge 3D design resources.

18.2 Multimedia Heterogeneous Model Spatial Transformation, Pose Registration, and Comprehensive Reconstruction Technology

Firstly, based on the requirements for multi-source and polymorphic 3D digital models at different stages of infrastructure construction, operation, and maintenance, we analyze the characteristics of typical GIM models, point cloud models, and other multimedia heterogeneous models and their problems in infrastructure, operation, and maintenance applications. Secondly, the discretized sampling of the equipment GIM model and its pose registration method with the scanned point cloud model are studied, and the spatial pose transformation relationship between the heterogeneous models is estimated to realize the refined completion of the GIM model. Finally, on the basis of pose registration, a unified coordinate system is established to realize the comprehensive reconstruction of the digital model of the plant and station, integrating multi-source media models and combining appearance with interior [2, 3].

18.2.1 Analysis of Demand for 3D Digital Model Media at Different Stages of Infrastructure, Operation, and Maintenance

The model commonly used in the infrastructure phase is GIM/PModel. In the infrastructure stage, the 3D model comes from 3D parametric modeling and design software, and the demand for the 3D model is mainly to provide the design basis for construction and acceptance. The models commonly used in the inspection stage are 3D MAX and LAS laser point cloud 3D models. In the inspection stage, the 3D model is mainly reconstructed from scene scans, and it provides the digital base of the environment for inspection requirements such as camera calibration, personnel positioning, equipment status, and digital twin mapping with absolute scale information. The GIM model is characterized by the absence of absolute coordinate values and the lack of texture, material, and lighting; the three-dimensional mesh model it represents has only vertices and patches, without coordinates for continuous points on the surface. In contrast, the point cloud model reflects the physical scale of the real world: all surface points have absolute coordinate values, and the 3D shaded point cloud reconstructed by joint lidar and vision scanning also carries color texture. If the large number of 3D GIM models formed in the infrastructure stage of State Grid Corporation can be handed over and applied in production operation and maintenance, the cost of later 3D modeling for the equipment department will be greatly reduced [4, 5].

18.2.2 Discretized Sampling of the Equipment GIM Model and Pose Registration with the Scanned Point Cloud

The "elements" of a triangular mesh model are vertices and faces, and may also include edges, depth map samples, and triangle strips. Before computing point cloud feature descriptors and performing registration, the triangular mesh model must first be discretized. One approach is to extract the point cloud directly from the vertices of each facet in the Polygon or Stanford Triangle Format (PLY) model. This method is simple


and feasible, but has the following disadvantages: (1) retaining only the vertex data of the triangular facets makes the point cloud too sparse, which is not conducive to subsequent calculations; (2) in flat regions where the curvature changes little, the triangular patches are large in area and few in number, so direct vertex extraction makes this part of the point cloud sparse, while in regions of strongly changing curvature the patches are small in area and many in number, so direct vertex extraction makes this part of the point cloud too dense. The result is a highly uneven point cloud density distribution, which is not conducive to subsequent calculations. Another option is the classical random uniform triangle sampling method. Although the probability of generating sampling points within each triangular patch is uniform, the distances between points vary: some points lie very close together and form clusters, while others are far apart, and the spatial distribution is rather random. Such points cannot accurately describe the geometry of the surface, which adversely affects matching and registration [6–9]. To address this, this work adopts a two-stage mesh sampling algorithm based on the Poisson-disk distribution, as shown in Fig. 18.1. In the first stage, the random uniform sampling algorithm is used to obtain enough initial sample points to meet the requirements of unbiased and maximal sampling. The number of sample points generated in each triangular region is adaptively controlled, set by the algorithm to be proportional to the area of the triangular patch. Let the side lengths of triangular patch i be a_i, b_i, c_i, let S_tri_i be its area, and let num_i be the number of candidate points to be sampled:

S_tri_i = sqrt(p_i (p_i − a_i)(p_i − b_i)(p_i − c_i))   (18.1)

p_i = (a_i + b_i + c_i) / 2   (18.2)

num_i / num_total = S_tri_i / S_total, i = 1, 2, ..., t   (18.3)
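A compact sketch of this sampling scheme is given below; it implements the area-proportional first stage of Eqs. (18.1)–(18.3) together with the minimum-radius filter of the second stage described in the next paragraph. The triangle areas are computed with a cross product (equivalent to Heron's formula in Eq. (18.1)), and all function and parameter names are assumptions for illustration.

```python
import numpy as np

def sample_mesh(vertices, faces, num_total=2000, r_min=0.01, rng=None):
    """Two-stage Poisson-disk-style sampling of a triangular mesh."""
    rng = np.random.default_rng(rng)
    v0, v1, v2 = (vertices[faces[:, k]] for k in range(3))
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    # stage 1: per-triangle candidate counts proportional to area, Eq. (18.3)
    counts = np.ceil(num_total * areas / areas.sum()).astype(int)

    candidates = []
    for (a, b, c), n in zip(zip(v0, v1, v2), counts):
        u, v = rng.random(n), rng.random(n)
        flip = u + v > 1.0                  # fold points back into the triangle
        u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]
        candidates.append(a + u[:, None] * (b - a) + v[:, None] * (c - a))
    candidates = np.vstack(candidates)
    rng.shuffle(candidates)

    kept = []                               # stage 2: minimum-radius filter
    for p in candidates:
        if all(np.dot(p - q, p - q) >= r_min * r_min for q in kept):
            kept.append(p)
    return np.asarray(kept)
```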

Then, in the second stage, a minimum-radius threshold is used for comparison and filtering, so that the distance between any two points in the retained sample set is greater than the set threshold radius. As shown in Figs. 18.2 and 18.3, the discretized triangular mesh model must then be transformed into the coordinate system of the original point cloud by rotation and translation; the purpose of point cloud registration is to find the transformation relationship between the point cloud and the model, namely:

P_t = R · P_s + T   (18.4)

where P_s is the source point cloud, P_t is the target point cloud, R is the rotation matrix, and T is the translation matrix.

Fig. 18.1 Random uniform sampling (left) and Poisson-disk sampling (right)

Fig. 18.2 3D model discretization sampling method

Fig. 18.3 Sample of 3D model discretization

Point cloud registration can be divided into two stages: coarse registration and fine registration. Since fine registration has certain requirements on the initial poses of the two point clouds, coarse registration is often used first to reduce the misalignment between the point clouds, so as to provide a better initial pose for fine registration. The above point cloud pose registration is realized by coarse registration algorithms represented by SAC-IA and Super4PCS and a fine registration algorithm represented by ICP. The ICP algorithm mainly includes three steps: (1) obtain the nearest corresponding point set; (2) minimize the error; and (3) repeat these two steps until the registration error is less than the threshold, achieving registration error correction [10–12]. Pose registration yields the spatial pose transformation relationship between the models. On this basis, each point of the discretized mesh device model is assigned the color vector of the corresponding point in the real-scene scanning point cloud, compensating for the lack of textures and materials of GIM models in the operation and inspection business.
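As an illustration of the three ICP steps just listed, the sketch below implements a minimal point-to-point ICP fine registration with numpy and a SciPy k-d tree. It is a simplified stand-in, not the SAC-IA/Super4PCS + ICP pipeline of the paper, and all names and tolerances are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iters=50, tol=1e-6):
    """Minimal point-to-point ICP fine registration: P_t ~ R @ P_s + T."""
    R, T = np.eye(3), np.zeros(3)
    src = source.copy()
    tree = cKDTree(target)
    prev_err = np.inf
    for _ in range(iters):
        dist, idx = tree.query(src)             # step 1: nearest correspondences
        matched = target[idx]
        cs, ct = src.mean(0), matched.mean(0)   # step 2: least-squares alignment
        H = (src - cs).T @ (matched - ct)
        U, _, Vt = np.linalg.svd(H)
        Ri = Vt.T @ U.T
        if np.linalg.det(Ri) < 0:               # guard against reflection
            Vt[-1] *= -1
            Ri = Vt.T @ U.T
        Ti = ct - Ri @ cs
        src = src @ Ri.T + Ti
        R, T = Ri @ R, Ri @ T + Ti              # accumulate the transform
        err = dist.mean()                       # step 3: repeat until converged
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R, T
```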

18.2.3 Comprehensive Reconstruction of the Digital Model from Multi-source Media Models, Combining Appearance and Interior

Through the previous steps, the GIM model is registered with the point cloud data of the equipment, and the position and attitude of the model in the unified world coordinate system are determined, realizing the efficient individual reconstruction of typical power equipment. The GIM model of each device is mapped into a unified point cloud scene coordinate system through a rigid body transformation, and the CAD model is superimposed at the pose of the corresponding device in the scene scan point cloud. In addition, according to the pose registration results, the real-scene scanning point cloud reflecting the external appearance of the device and the mesh model reflecting its internal mechanism can be aligned, spliced, and superimposed, thereby realizing the comprehensive reconstruction of multi-source, multi-level digital models.

18.3 Multi-source 3D Digital Resource Fusion Analysis Technology with Variable Space and Time

For the time-varying data sources common in the inspection business, such as on-site images, on-site sensor data, and business data, dedicated data source interfaces are designed, including timed sampling access to business data, synchronous replication access to business data, real-time sensor data collection and access, file data import, and data handover access. On the basis of coordinate transformation and open access interfaces for the various heterogeneous 3D model digital resources, a publish/subscribe function is developed for these time-varying data sources to realize access to distributed dynamic data. To meet the demand for fusion analysis of multi-source 3D digital resources across changing time and space, and facing multi-source 3D digital model data generated at multiple times and places, unified data format standards are formulated, 3D digital resources are saved at historical time points, and the correspondence between time, the real-world 3D model data, and the holographic equipment state data is established [13–16].

For the typical point/plane data of the power grid business system, the on-site sensors of plants and stations, and monitoring image data, the perspective transformation relationships under different coordinate systems are analyzed, and the correlation mapping and projection superposition mechanism between the time-varying data and the equipment model is studied. Based on multi-dimensional spatiotemporal data fusion analysis for substation equipment, neural networks and the D-S reasoning method are used to realize the fusion analysis of the holographic equipment state data, the sampled data, and the real 3D model. The weighted fusion method and the ratio method are used to establish a mathematical model of pixel-, feature-, and decision-level fusion of multidimensional data, effectively fusing and analyzing the feature values corresponding to the physical equipment and the digital twin model and achieving effective fusion analysis of big data and sampled data. The tasseled cap transform is adopted for information compression, simplifying the fusion analysis of big data with multimedia, heterogeneous, and unstructured data and achieving effective fusion analysis during the dynamic evolution of multiple spatio-temporal data. A multi-protocol standardization method is used to fuse multimedia, heterogeneous, and unstructured data, establish a unified mathematical fusion model, and realize rapid identification of feature data.

Through point cloud segmentation and extraction of typical equipment, 3D shape descriptor extraction, and shape retrieval, the entity semantics of the equipment are recognized, and the semantic information of the 3D digital model is embedded to achieve a multi-level, all-round description of the environment. The point cloud of the substation scene is collected by a three-dimensional laser scanner. Although some devices in the scene carry considerable noise, different devices are relatively independent in space, so the DBSCAN point cloud segmentation method based on density clustering is used to segment the point clouds of the different device targets. After a single object is segmented, Principal Component Analysis (PCA) is used to obtain its minimum bounding box and determine its pose in space. The PCA-based algorithm for determining the position, pose, and minimum bounding box of a point cloud target proceeds as follows: first calculate the covariance matrix of the point cloud data, then calculate the eigenvectors v1, v2, v3 of the covariance matrix, and then calculate the center of gravity c of the point cloud. The pose of the point cloud in the world coordinate system is T = [v1, v2, v3, c].
The point cloud is then transformed from the world coordinate system to the local coordinate system via P1 = T^(−1) · P, where P is the point cloud in the world coordinate system before transformation and P1 is the point cloud in the local coordinate system after transformation.

Finally, the axis-aligned bounding box (AABB) of the point cloud in the local coordinate system is calculated, and the AABB is then transformed back to the world coordinate system; that is, the position and pose of the point cloud in the world coordinate system and its minimum bounding box are obtained. The effect of device clustering and segmentation is shown in Fig. 18.4.

Fig. 18.4 Clustering segmentation of single equipment in point cloud
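The segmentation-plus-pose pipeline described above can be sketched as follows, using scikit-learn's DBSCAN for the density clustering and an eigendecomposition of the covariance matrix for the PCA pose; the parameter values are placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def segment_and_pose(points, eps=0.5, min_samples=30):
    """Cluster a station point cloud into devices, then estimate each
    device pose T = [v1, v2, v3, c] by PCA and its local bounding box."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    results = []
    for lbl in set(labels) - {-1}:              # label -1 is DBSCAN noise
        P = points[labels == lbl]
        c = P.mean(axis=0)                      # center of gravity c
        cov = np.cov((P - c).T)                 # covariance matrix
        _, vecs = np.linalg.eigh(cov)           # eigenvectors v1, v2, v3
        T = np.eye(4)                           # pose: rotation + centroid
        T[:3, :3], T[:3, 3] = vecs, c
        P_local = (P - c) @ vecs                # P1 = T^(-1) * P in 3x3 form
        aabb_min, aabb_max = P_local.min(0), P_local.max(0)  # local AABB
        results.append((T, aabb_min, aabb_max))
    return results
```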

18.4 Conclusions

This paper analyzes the characteristics of typical GIM models, point cloud models, and other multimedia heterogeneous models and their problems in infrastructure and operation inspection applications. Aiming at the deficiencies of the GIM model in texture, material, and lighting, the discretized sampling of the equipment GIM model and its pose registration with the scanned point cloud model are studied, and the spatial pose transformation relationship between the heterogeneous models is estimated to achieve the refinement of the GIM model. On the basis of pose registration, a unified coordinate system is established to realize the comprehensive reconstruction of the digital model of the power station from multi-source media models, combining exterior and interior. The mechanism of correlation mapping and projection superposition between the equipment model, the grid business system data, and the spatio-temporal data of on-site plant and station sensors is studied, realizing multidimensional spatio-temporal big data analysis that fuses the holographic equipment state data with the real three-dimensional model. Through point cloud segmentation and extraction of typical equipment, 3D shape descriptor extraction, and shape retrieval, the entity semantics of the equipment are identified and the semantic information of the 3D digital model is expanded, including the topological structure of the environment space, equipment semantics, and place semantics, achieving a multi-level, all-round description of the environment.

Acknowledgements Thanks for the support of the State Grid Corporation of China Science and Technology Project "Research on digital twin management and interactive technology for efficient coordination of county energy internet" (5700-202224202A-1-1-ZN).


References

1. Wu, K.H., Wang, X.H., Bai, X.W.: A review on geographic information system in electric power. FCC2010 (2010)
2. Zhang, J.M., Xu, A.C., Li, H.X.: An automatic engineering configuration system for substation automation based on SVG/XML/CIM. Autom. Electr. Power Syst. 28(14), 54–57 (2004). (In Chinese)
3. Li, S.Q., Liu, Z.H., Qiu, F., et al.: Production line failure warning based on digital twin technology. Dev. Innov. Electromech. Prod. 34(1), 70–73 (2021). (In Chinese)
4. Wang, J.X., Hu, W.: Preliminary analysis on the application of BIM technology in smart city "digital twin" construction projects. Smart Build. Smart City 1, 94–95 (2021)
5. Mandolla, C., Petruzzelli, A.M., Percoco, G., Urbinati, A.: Building a digital twin for additive manufacturing through the exploitation of blockchain: a case analysis of the aircraft industry. Comput. Ind. 109, 134–152 (2019)
6. Chen, X.X., Bai, Y., Wang, S.J., et al.: Application of digital twin technology in intelligent inspection of substations. China Strat. Emerg. Ind. (Theoretical Edition) 5, 1 (2019)
7. Tang, W.H., Chen, X.Y., Qian, T., et al.: Digital twin technology and its application for smart energy systems. China Eng. Sci. 22(4), 74–85 (2020)
8. Chen, W.: Research on Smart Substation Technology and Its Design Application. Shandong University, Jinan (2017)
9. Zhao, Z.J.: Research on the application of clock synchronization system technology in smart substations. Yunnan Electr. Power Technol. 41(6), 20–23 (2013)
10. Fan, Z.X., Chen, F., Zhou, W., et al.: Research on the application of key technologies in intelligent substation automation system. China High-tech Enterpr. 32, 20–21 (2012)
11. Li, M., Qin, L.J., Guo, Q., et al.: Power grid topology analysis based on CIM model. Electr. Power Sci. Eng. 27(9), 18–22 (2011)
12. Cao, N., Li, G., Wang, D.Q.: Discussion on key technologies and construction methods of smart substations. Power Syst. Protect. Contr. 39(5), 63–68 (2011)
13. Li, C., Mahadevan, S., Ling, Y., Choze, S., Wang, L.: Dynamic Bayesian network for aircraft wing health monitoring digital twin. AIAA J. 55(3), 930–941 (2017)
14. Zhang, P.C., Gao, X.: System structure of digital substation. Power Grid Technol. 24, 73–77 (2006). (In Chinese)
15. Li, M.C., Wang, Y.P., Li, X.W., et al.: Analysis of intelligent substation and technical characteristics. Power Syst. Protect. Contr. 38(18), 59–62, 79 (2010)
16. Yang, T., Zhao, L.Y., Wang, C.S.: A review of the application of artificial intelligence in power systems and integrated energy systems. Autom. Electr. Power Syst. 43(1), 2–14 (2019). (In Chinese)

Chapter 19

Rapid Identification of Herbaceous Biomass Based on Raman Spectrum Analysis Qiaoling Li, Zhongli Ye, Hui Liang, Zhiqiang Yu, Zhou Fang, Guohua Cai, Quanxing Zheng, Li Yan, Hongxiang Zhong, Zhe Xiong, Jun Xu, and Zechun Liu Abstract To realize the rapid online identification of tobacco quality style within a few seconds, this paper proposes a rapid identification method based on Raman spectrum analysis. The method quickly obtains the Raman spectrum of tobacco samples and then establishes a database mapping the tobacco information to the Raman signal. The KNN (K-Nearest Neighbors) algorithm is used as the identification algorithm for the tobacco information. Rapid and accurate identification of tobacco origin through the Raman signal in the range 2700–3500 cm−1 is realized, with an origin identification accuracy of 95.7%. To identify the tobacco grade accurately, competitive adaptive reweighted sampling (CARS) is used to select the key characteristic Raman spectral regions that determine the characteristics of the tobacco grade. Combined with the KNN algorithm, rapid and accurate identification of tobacco grades by analyzing the signals of the key characteristic Raman regions is realized. Using the key characteristic Raman regions in the ranges 800–1800 cm−1 and 2700–3500 cm−1, the accuracy of tobacco grade identification reaches 87.0%. The results show that by analyzing the key characteristic Raman band signals of unknown tobacco samples, combined with the identification algorithm proposed in this paper, efficient identification of the quality style of unknown tobacco can be realized. Keywords Component · Formatting · Style · Styling


19.1 Introduction

The quality style of tobacco is closely related to the processes of tobacco planting, initial drying, threshing, and redrying. The quality style characteristics of tobacco from different producing areas, grades, and types differ significantly. Traditional tobacco classification methods mainly rely on artificial smoking and personnel experience. After the tobacco enters the factory, industrial technicians evaluate the grade of the tobacco again by sensory rating and detect its relevant physical and chemical characteristics. The evaluation results are used as the basis for using the tobacco raw materials in subsequent applications. With the increasing requirements for the grade and quality of raw materials in tobacco product manufacturing, it is of great significance to establish accurate and rapid classification methods for tobacco raw materials and realize their fine classification. At present, a variety of quality style classification methods for tobacco raw materials have been developed. Sensory assessment evaluates the tobacco quality style through the sensory perception of cigarette tasters, assisted by professional equipment. Xiang et al. [1] stated that sensory assessment of tobacco can give priority to the subjective feelings of consumers, but it is highly subjective: different cigarette tasters may have their own biases, and these subjective differences can decrease the accuracy of identification of the tobacco quality style. The chemical composition detection method detects the content of water-soluble sugar, total nitrogen, nicotine, chlorine, and other substances in tobacco based on the national standard. It is reported that the measured chemical contents are accurate and can be used to determine the tobacco quality style [2–4]. However, chemical composition detection takes a long time and must be carried out in the laboratory, so it is difficult to achieve rapid detection and cannot be used on the production line. In recent years, near-infrared spectroscopy [5–7] has been used to detect and analyze the surface quality of biomass. Researchers [8–11] have developed a tobacco quality style classification method based on near-infrared spectrum analysis, which identifies the tobacco leaf quality style by analyzing the differences in the near-infrared spectra of different tobacco leaves. Near-infrared spectroscopy has the advantages of fast speed, good repeatability, high sensitivity, and low sample consumption, but it has strict requirements on the moisture content of the tobacco samples, making it difficult to meet the needs of rapid online detection during production. Researchers have also established classification models based on sample image features, combining the morphological characteristics of tobacco with machine learning and neural network algorithms. However, due to the similar morphological characteristics of tobacco and the interference of lighting with image identification, the accuracy of the identification results decreases significantly when the image quality is not high [12–16]. Davies et al. [17] developed a thermogravimetric analysis-mass spectrometry method to identify the volatiles of tobacco samples, but this method is also time-consuming and places high professional demands on operators. Therefore,


it can be seen that there is a lack of a method that can quickly and accurately identify the tobacco quality style online. To solve this problem, this paper develops a method to quickly identify the quality style of tobacco through Raman spectrum analysis. Raman spectrum detection can quickly, nondestructively, and conveniently probe the chemical structure of carbon-based raw materials. Using a self-developed analytical method for the characteristic parameters of the tobacco Raman spectrum together with the identification algorithm, the relationship between the characteristic parameters of the tobacco Raman spectrum and the chemical characteristics that determine the quality style is established, realizing rapid and accurate identification of the tobacco quality style.

19.2 Materials and Methods

19.2.1 Materials

Eighty different tobacco samples were collected from Fujian Province (F) and Yunnan Province (Y). According to the sensory perception of professionals, differences in the chemical composition of the tobacco itself cause differences in smoke release during smoking. The 80 tobacco samples were classified into four grades, named grade 1, grade 2, grade 3, and grade 4. Table 19.1 shows the classification information of the tobacco samples; the symbol F-1, for example, indicates samples classified as grade 1 whose origin was Fujian Province.

Table 19.1 Classification information of 80 tobacco samples

Grade | Symbol(a, b) | Number
1 | F-1 | 7
2 | F-2 | 7
3 | F-3 | 14
4 | F-4 | 10
1 | Y-1 | 10
2 | Y-2 | 10
3 | Y-3 | 10
4 | Y-4 | 12

(a) F indicates that the tobacco sample was from Fujian Province, and Y indicates that the tobacco sample was from Yunnan Province
(b) The numbers mean the grades of the tobacco sample according to the sensory feelings


19.2.2 Experimental Methods

The Raman spectra of the 80 tobacco samples were recorded on a Nicolet iS50R Fourier transform Raman spectrometer (Thermo Scientific) with an InGaAs detector. The excitation light source was a laser with a wavelength of 1064 nm; the laser spot on the sample surface was about 50 µm in diameter, and the laser output power was 50 mW. Each spectrum was accumulated over 300 scans, and the recorded spectral range was 100–3700 cm−1. For each sample, to improve the accuracy of the Raman measurement, 8 different positions were selected at random for testing, and the 8 measurements were averaged (outliers with large deviations were removed) to reduce the random error of the experiment and the systematic error caused by the unevenness of the tobacco sample itself.

19.2.3 Raman Spectrum Data Processing

The Raman spectra of tobacco were processed as follows. Firstly, the Raman spectrum curve [18] was smoothed with the Savitzky-Golay (SG) convolution smoothing method. As a polynomial fitting method, SG smoothing performs a least-squares polynomial fit within a moving window and replaces the center point of the window with the weighted average of the surrounding points, thereby eliminating random noise in the Raman spectral data. The first-order and second-order Raman spectra were then selected from the smoothed curve. Researchers [19, 20] have established that the range of the first-order Raman spectrum is 800–1800 cm−1 and the range of the second-order Raman spectrum is 2700–3500 cm−1. In the Raman spectrum of tobacco, the first-order and second-order spectra mainly reflect the chemical structure characteristics of the tobacco. To correct for fluorescence interference during sample detection, the fluorescence signals of the first-order and second-order Raman spectra are removed, as shown in Fig. 19.1. The Raman intensity after removal of the fluorescence signal can be expressed as Eq. (19.1):

I'_c = I_c − ((I_b − I_a)/(X_b − X_a)) · X_c   (19.1)

where Ia , Ib , and Ic are the Raman intensity values of points A, B, and C, respectively. Xa , Xb , and Xc are the Raman shift values of points A, B, and C, respectively, and Ic ' is the Raman intensity value of point C after fluorescence signal removal.


Fig. 19.1 Fluorescence signal removal method of the first and second order Raman spectrum
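For illustration, the sketch below applies SG smoothing and then removes the fluorescence background of one band following Eq. (19.1), implemented here as subtraction of the straight line through the two band end points A and B. The band limits, window length, and polynomial order are assumptions chosen for the example, not the paper's settings.

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess_raman(shift, intensity, band=(800.0, 1800.0),
                     window=21, polyorder=3):
    """SG-smooth a Raman spectrum, cut one band, and remove the linear
    fluorescence background between the band end points, cf. Eq. (19.1).

    shift, intensity : 1-D numpy arrays of Raman shift (cm^-1) and counts
    """
    smoothed = savgol_filter(intensity, window, polyorder)  # SG smoothing
    mask = (shift >= band[0]) & (shift <= band[1])
    x, y = shift[mask], smoothed[mask]
    # straight line through end points A (first) and B (last) of the band
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    baseline = y[0] + slope * (x - x[0])
    return x, y - baseline                                  # fluorescence removed
```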

19.2.4 Extraction of the Key Characteristic Raman Spectrum Region

To maximize the accuracy and reliability of the Raman spectrum data used in the tobacco quality style prediction model, this paper proposes an identification method for the key effective characteristic bands of the Raman spectrum based on competitive adaptive reweighted sampling (CARS), developed by Gou et al. [21]. In each iteration, the wavelength points whose regression coefficients in the Partial Least Squares (PLS) regression model have larger absolute values are retained as a new subset, the points with smaller weights are removed, and a PLS model is re-established on the new subset. After many iterations, the Raman band subset with the smallest root mean square error (RMSE) in PLS cross-validation is selected as the key effective characteristic band. Because CARS involves a degree of randomness, more than five runs are performed to select the best Raman band data for the tobacco quality style prediction model. The specific process is shown in Fig. 19.2. The CARS algorithm removes irrelevant characteristic wavelengths that show no significant differences across tobacco quality styles and selects the key effective characteristic Raman bands that reflect the tobacco quality style. The selected key bands can thus replace the full Raman spectrum as the data source for the tobacco quality style identification model, effectively improving both the accuracy and the computational efficiency of the model.
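A simplified, hedged sketch of such a CARS-style selection loop is shown below. It keeps an exponentially shrinking set of wavelengths with the largest absolute PLS regression coefficients and scores each subset by cross-validated RMSE; the Monte Carlo resampling of calibration subsets used in the full published CARS algorithm is omitted, and y is assumed to be a numeric grade label.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

def cars_select(X, y, runs=50, final_frac=0.05, n_comp=5, cv=5):
    """Simplified CARS-style wavelength selection with a PLS model.

    X : (n_samples, n_wavelengths) spectra; y : numeric quality labels.
    Returns the indices of the best-scoring wavelength subset.
    """
    n_wl = X.shape[1]
    idx = np.arange(n_wl)
    best_idx, best_score = idx.copy(), -np.inf
    for r in range(1, runs + 1):
        pls = PLSRegression(n_components=min(n_comp, len(idx), X.shape[0] - 1))
        pls.fit(X[:, idx], y)
        weights = np.abs(np.asarray(pls.coef_).ravel())
        keep = max(2, int(n_wl * final_frac ** (r / runs)))   # exponential decay
        keep = min(keep, len(idx))
        idx = idx[np.argsort(weights)[-keep:]]                # largest |coef| survive
        score = cross_val_score(
            PLSRegression(n_components=min(n_comp, keep, X.shape[0] - 1)),
            X[:, idx], y, cv=cv,
            scoring="neg_root_mean_squared_error").mean()     # -RMSECV
        if score > best_score:
            best_score, best_idx = score, idx.copy()
    return np.sort(best_idx)
```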

19.2.5 Machine Learning Model Establishment

The KNN (k-nearest neighbors) algorithm is used to identify the tobacco quality grade and origin, and a database containing the known grade and origin information of the tobacco samples is established.


Fig. 19.2 Identification and selection method for the key effective characteristic bands of the Raman spectrum in this paper

According to their origin, the collected tobacco samples are divided into category F (samples from Fujian) and category Y (samples from Yunnan), with 38 samples of F and 42 samples of Y. Samples of F and Y are randomly selected: 3/4 of each category are used as the training set, and 1/4 are used as the test set to evaluate the accuracy of the identification model. Among the F samples, the numbers of samples with original grades 1, 2, 3, and 4 are 7, 7, 14, and 10, giving 5, 5, 10, and 7 samples in the training set and 2, 2, 4, and 3 in the test set; among the Y samples, the numbers of samples with original grades 1, 2, 3, and 4 are 10, 10, 10, and 12, giving 7, 7, 7, and 9 samples in the training set and 3, 3, 3, and 3 in the test set, as shown in Table 19.2.

Table 19.2 Sample quantity of the test set and training set

              Class | Level 1 | Level 2 | Level 3 | Level 4
Training set  F     |    5    |    5    |   10    |    7
              Y     |    7    |    7    |    7    |    9
Testing set   F     |    2    |    2    |    4    |    3
              Y     |    3    |    3    |    3    |    3
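A minimal Python sketch of this stratified 3/4–1/4 split and the KNN scan over K = 1–10 is shown below; the random feature matrix is a hypothetical stand-in for the preprocessed Raman intensities.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# X: (80, n_bands) matrix of preprocessed Raman intensities,
# y: origin labels ('F' or 'Y'); both hypothetical placeholders here.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 200))
y = np.array(["F"] * 38 + ["Y"] * 42)

# 3/4 training, 1/4 testing, stratified by origin as in the paper
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

for k in range(1, 11):          # scan K = 1..10 as in Table 19.3
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(k, round(knn.score(X_te, y_te), 3))
```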


19.3 Results and Discussion

19.3.1 Identification of Tobacco Origin

Comparing the chemical structures corresponding to the first-order and second-order Raman spectra of the tobacco samples shows that the two regions reflect different chemical structures and therefore carry different information about tobacco quality style. Accordingly, this paper used the first-order Raman data, the second-order Raman data, and the combination of both as data sources for the identification model, and determined the Raman data source with the best identification accuracy by comparing the results. Table 19.3 shows the accuracy of origin identification for each of the three data sources; with the KNN algorithm, different K values were tested.

According to the results in Table 19.3, the identification accuracy for tobacco origin differs between Raman data sources. Compared with using the first-order data alone or using the first-order and second-order data together, the model using the second-order Raman data alone showed the highest identification accuracy. The reason is that different Raman bands reflect different chemical properties of the tobacco sample; the prediction results indicate that the differences between tobacco samples caused by different growing origins are mainly reflected in the second-order Raman spectrum. Therefore, when identifying tobacco-producing areas, the second-order Raman band data should be selected as the data source for the identification model. The results in Table 19.3 also show that all identification accuracies first increased and then decreased with increasing K. For the model using the second-order Raman spectrum as the data source, the accuracy was highest at K = 3, reaching up to 95.7%.

Table 19.3 Accuracy of tobacco origin identification when using the first-order Raman data, the second-order Raman data, and their combination as the identification data source

Raman spectrum source      | K=1   | K=2   | K=3   | K=4   | K=5   | K=6   | K=7   | K=8   | K=9   | K=10
First-order                | 82.6% | 82.6% | 91.3% | 82.6% | 82.6% | 78.3% | 82.6% | 82.6% | 82.6% | 82.6%
Second-order               | 91.3% | 91.3% | 95.7% | 87.0% | 91.3% | 91.3% | 87.0% | 91.3% | 87.0% | 87.0%
First-order + Second-order | 82.6% | 82.6% | 87.0% | 82.6% | 82.6% | 82.6% | 82.6% | 87.0% | 82.6% | 78.3%


19.3.2 Identification of Tobacco Grade

To improve the identification accuracy of the tobacco grade, this paper carried out the identification in two steps: in the first step, the samples of grades 1 and 2 are taken as the first category and the samples of grades 3 and 4 as the second category, and the KNN algorithm is used to separate these two categories; in the second step, the samples of grades 1 and 2 and the samples of grades 3 and 4 are further identified within their respective categories. With this two-step identification method, the database sample capacity in each identification step is larger, so the identification accuracy can be improved (a brief code sketch of this two-step scheme follows Table 19.4).

Using the method of Sect. 19.3.1 to identify the tobacco grade, it was found that the first-order Raman spectrum contains a large amount of information on the chemical structures and functional groups that determine the tobacco grade, and it gives the highest accuracy for grade identification. Therefore, the first-order Raman spectrum data were used as the data source for the identification model, K values of 1–10 were tested, and the resulting accuracies are shown in Table 19.4. When the number of samples in the model is insufficient, the identification accuracy decreases; the two-step identification method mitigates this. According to the results in Table 19.4, the highest identification accuracy between grade 1 and grade 2 is 62.5%. For the samples of grades 3 and 4, K = 5–7 can be selected, and the identification accuracy reaches up to 75%. These results show that the accuracy is poor when the first-order and second-order Raman spectrum data are used directly, so it is necessary to further select the key effective Raman spectrum regions from these data and eliminate the invalid regions to maximize the accuracy of the identification model.

Table 19.4 Accuracy of tobacco grade identification when using first-order Raman data as the identification data source with the two-step identification method

Classification            | K=1   | K=2   | K=3   | K=4   | K=5  | K=6  | K=7   | K=8   | K=9   | K=10
First-step classification | 65%   | 65%   | 65%   | 70%   | 70%  | 70%  | 80%   | 80%   | 70%   | 70%
Class 1 and class 2       | 62.5% | 62.5% | 37.5% | 50%   | 50%  | 50%  | 62.5% | 62.5% | 50%   | 50%
Class 3 and class 4       | 58.3% | 66.7% | 66.7% | 66.7% | 75%  | 75%  | 75%   | 66.7% | 58.3% | 66.7%
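The sketch below illustrates the two-step scheme under stated assumptions: the feature matrix and grade labels are hypothetical placeholders, and the K values are illustrative rather than the paper's tuned settings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def two_step_grade_id(X_tr, grades_tr, X_te, k1=7, k2=5):
    """Two-step grade identification sketch (grades 1-4).

    Step 1 separates {1, 2} from {3, 4}; step 2 resolves the exact
    grade inside the predicted coarse class, as in Sect. 19.3.2.
    """
    coarse_tr = np.where(np.isin(grades_tr, [1, 2]), 0, 1)
    step1 = KNeighborsClassifier(n_neighbors=k1).fit(X_tr, coarse_tr)
    coarse_pred = step1.predict(X_te)

    # one fine-grained KNN per coarse class
    fine = {}
    for c, members in ((0, [1, 2]), (1, [3, 4])):
        mask = np.isin(grades_tr, members)
        fine[c] = KNeighborsClassifier(n_neighbors=k2).fit(X_tr[mask], grades_tr[mask])

    pred = np.empty(len(X_te), dtype=int)
    for c in (0, 1):
        sel = coarse_pred == c
        if sel.any():
            pred[sel] = fine[c].predict(X_te[sel])
    return pred

# Hypothetical demo with 27 training samples (grades 1-4)
rng = np.random.default_rng(0)
X = rng.normal(size=(27, 30))
g = np.repeat([1, 2, 3, 4], [5, 5, 10, 7])
print(two_step_grade_id(X, g, rng.normal(size=(5, 30))))
```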


19.3.3 Identification Model Optimization

In Sects. 19.3.1 and 19.3.2, the full Raman spectrum data were used as the data source for the identification model, but the best identification accuracy could not be obtained. The reason is that although the quality styles of the tobacco samples differ, the samples are essentially the same species and their Raman spectra are quite similar. Moreover, the chemical structures reflected by most of the Raman spectrum data have little impact on tobacco quality style, and this part of the spectrum inevitably introduces system noise into the model, decreasing the identification accuracy. When the differences caused by systematic error are larger than the differences caused by the samples themselves, using this part of the Raman spectrum data to establish the identification model increases the computational load and reduces the accuracy of the model because of the redundant variables. Therefore, to improve the identification accuracy of the tobacco grade from the existing Raman spectrum data, it is necessary to further select the key effective Raman spectrum region that determines the tobacco grade, so as to maximize the accuracy and efficiency of the identification model.

In this paper, the CARS algorithm was used to select the key effective Raman spectrum region. Following the two-step identification method proposed in Sect. 19.3.2, the first identification step was optimized first. After multiple runs of the CARS algorithm, in the first-order Raman spectrum the bands 899–907 cm−1, 955–965 cm−1, 994–1002 cm−1, 1042–1052 cm−1, 1100–1109 cm−1, 1129–1138 cm−1, 1177–1187 cm−1, 1206–1225 cm−1, 1323–1331 cm−1, 1447–1457 cm−1, and 1765–1775 cm−1 were preferred as the key effective Raman spectrum region. In the second-order Raman spectrum, the bands 2702–2713 cm−1, 2828–2838 cm−1, 2895–2906 cm−1, 3050–3079 cm−1, 3088–3099 cm−1, 3156–3176 cm−1, 3233–3242 cm−1, 3319–3330 cm−1, and 3397–3495 cm−1 were preferred. When coupling the first-order and second-order Raman spectra, the bands 899–907 cm−1, 955–965 cm−1, 1129–1138 cm−1, 2695–2751 cm−1, 2818–2837 cm−1, 3050–3107 cm−1, and 3156–3194 cm−1 were preferred. According to these selection results, the amount of data in the selected Raman segments is significantly reduced and the redundant spectrum is effectively removed.

The KNN algorithm was then used in the first step of the tobacco grade classification with the key effective Raman spectrum region as the data source; K values of 1, 3, 5, 7, 9, 11, and 13 were tested. The accuracy results are shown in Table 19.5. According to these results, it is better to use the key effective region of the first-order Raman spectrum as the identification data source: the result is best at K = 13, with the accuracy significantly improved, reaching up to 82.6%. In contrast, the accuracy is poor when the key effective region of the second-order Raman spectrum is used. This further confirms that the chemical structures determining the tobacco grade are mainly reflected in the range of the first-order Raman spectrum.


Table 19.5 Accuracy of tobacco grade identification (first step) when using first-order Raman data, second-order Raman data, and their combination as the identification data source with the two-step identification method

Data source                | K=1   | K=3   | K=5   | K=7   | K=9   | K=11  | K=13
First-order data           | 70.0% | 60.9% | 78.3% | 78.3% | 78.3% | 78.3% | 82.6%
Second-order data          | 56.5% | 69.6% | 65.2% | 65.2% | 65.2% | 73.9% | 73.9%
First-order + Second-order | 69.6% | 65.6% | 60.9% | 78.3% | 78.3% | 78.3% | 69.6%

Based on the results of the first step, the second identification step was carried out. First, the samples of grade 1 and grade 2 are distinguished, with the Raman bands again selected by the CARS algorithm. After many calculations, among the first-order Raman spectrum bands, 984–1013 cm−1, 1081–1090 cm−1, 1158–1177 cm−1, 1727–1736 cm−1, and 1784–1794 cm−1 are preferred as the key effective region. In the second-order Raman spectrum segment, 2695–2703 cm−1, 2731–2752 cm−1, 2770–2780 cm−1, 2847–2858 cm−1, 3001–3012 cm−1, 3107–3118 cm−1, 3127–3147 cm−1, 3204–3214 cm−1, 3387–3398 cm−1, and 3483–3495 cm−1 are preferred. When coupling the first-order and second-order Raman spectra, 946–960 cm−1, 1023–1032 cm−1, 1061–1090 cm−1, 1437–1447 cm−1, and 3069–3078 cm−1 are preferred. When distinguishing the grade 3 and grade 4 samples of the second category, in the first-order Raman spectrum the bands 899–907 cm−1, 955–965 cm−1, 994–1002 cm−1, 1042–1052 cm−1, 1100–1109 cm−1, 1129–1138 cm−1, 1177–1187 cm−1, 1206–1225 cm−1, 1323–1331 cm−1, 1447–1457 cm−1, and 1765–1775 cm−1 are preferred as the key effective region; in the second-order Raman spectrum, 2702–2713 cm−1, 2828–2838 cm−1, 2895–2906 cm−1, 3050–3079 cm−1, 3088–3099 cm−1, 3156–3176 cm−1, 3233–3242 cm−1, 3319–3330 cm−1, and 3397–3495 cm−1 are preferred; and when coupling the first-order and second-order data, 899–907 cm−1, 955–965 cm−1, 1129–1138 cm−1, 2695–2751 cm−1, 2818–2837 cm−1, 3050–3107 cm−1, and 3156–3194 cm−1 are preferred.

The KNN algorithm was then used in the second step of the tobacco grade classification with the key effective Raman spectrum region as the data source; K values of 1, 3, 5, 7, 9, and 11 were tested. The accuracy results are shown in Tables 19.6 and 19.7. According to the identification results in Table 19.6, much better results are achieved when identifying the grade 1 and grade 2 samples with the key effective region of the first-order Raman data: the effect is best at K = 5, with an accuracy of 80%; compared with the identification results using the full Raman spectrum data, the accuracy rate improved from 62.5% to 80%. According to the results in Table 19.7, much better results are also achieved when identifying the grade 3 and grade 4 samples with the key effective region of the first-order Raman data.


Table 19.6 Identification accuracy for the samples of grade 1 and grade 2 using the key effective Raman spectrum region as the data source

Data source                | K=1 | K=3 | K=5 | K=7 | K=9 | K=11
First-order data           | 50% | 60% | 80% | 60% | 70% | 70%
Second-order data          | 50% | 60% | 70% | 70% | 60% | 40%
First-order + Second-order | 50% | 60% | 70% | 70% | 60% | 60%

Table 19.7 Identification accuracy for the samples of grade 3 and grade 4 using the key effective Raman spectrum region as the data source

Data source                | K=1   | K=3   | K=5   | K=7   | K=9   | K=11
First-order data           | 53.8% | 76.9% | 84.6% | 84.6% | 76.9% | 92.3%
Second-order data          | 46.2% | 61.5% | 76.9% | 76.9% | 76.9% | 76.9%
First-order + Second-order | 69.2% | 61.5% | 61.5% | 76.9% | 84.6% | 84.6%

The effect is best at K = 11, with an accuracy of 92.3%; compared with the identification results using the full Raman spectrum data, the accuracy rate improved from 75% to 92.3%. Based on the optimized identification model and method above, the identification results for the 23 samples in the test set are shown in Table 19.8, with an accuracy of 87.0%. These results show that the CARS algorithm can effectively select the key effective Raman spectrum region that reflects the differences in tobacco quality style. When this key effective Raman spectrum region is used as the data source for tobacco grade identification, the accuracy is significantly improved.

Table 19.8 Identification results of the tobacco samples in the test set using the identification model in this paper

Type of tobacco             | Manual classification | Raman prediction
Fujian YLC2Y-2017 Tobacco   | F-1 | F-1
Fujian ELC2Y-2017 Tobacco   | F-1 | F-1
Yunnan YLC1Y-2017 Tobacco   | Y-1 | Y-2
Yunnan YLC1Y-2017 Tobacco   | Y-1 | Y-2
Yunnan YLC1-2017 Tobacco    | Y-1 | Y-1
Fujian YLC1HDY-2017 Tobacco | F-2 | F-2
Fujian ELC1HDY-2017 Tobacco | F-2 | F-2
Yunnan ELC2Y-2017 Tobacco   | Y-2 | Y-2
Yunnan C3L-2017 Tobacco     | Y-2 | Y-2
Yunnan ELC2-2017 Tobacco    | Y-2 | Y-2
Fujian C4F-2017 Tobacco     | F-3 | F-3
Fujian C4F-2017 Tobacco     | F-3 | F-3
Fujian SLC1CB-2017 Tobacco  | F-3 | F-3
Fujian C34LF-2017 Tobacco   | F-3 | F-3
Yunnan C3L-2017 Tobacco     | Y-3 | Y-3
Yunnan C4F-2017 Tobacco     | Y-3 | Y-3
Yunnan C4F-2017 Tobacco     | Y-3 | F-3
Fujian B2F-2017 Tobacco     | F-4 | F-3
Fujian B2F-2017 Tobacco     | F-4 | F-4
Fujian B2F-2017 Tobacco     | F-4 | F-4
Yunnan B12FY-2017 Tobacco   | Y-4 | Y-4
Yunnan SLBB03-2017 Tobacco  | Y-4 | Y-4
Yunnan YLBB01-2017 Tobacco  | Y-4 | Y-4

19.4 Conclusion

By rapidly obtaining high-quality Raman spectrum data of tobacco samples and selecting the key effective Raman spectrum region that determines tobacco quality style as the data source of the identification model, this paper established a database linking tobacco quality style information to the corresponding Raman band data, together with a mapping relationship model between the characteristic parameters of the tobacco Raman spectrum and its quality style information. Combining cluster analysis with the KNN algorithm, an accuracy of 95.7% was achieved for tobacco origin identification when the second-order Raman data were used as the identification data source. When the key effective region of the first-order Raman spectrum selected by the CARS algorithm was used as the data source of the identification model, an accuracy of 87.0% for tobacco grade identification

was achieved. A fast identification method for tobacco quality style based on Raman spectrum analysis has thus been successfully established, realizing in-situ, online, fast (within seconds) identification of tobacco quality style.

Acknowledgements This work was financially supported by the R&D fund from China Tobacco Fujian Industrial Co., Ltd. (No. 2012-42). The assistance from the Analytical and Testing Center of Huazhong University of Science and Technology is also acknowledged.

References

1. Xiang, B., et al.: Simultaneous identification of geographical origin and grade of flue-cured tobacco using NIR spectroscopy. Vib. Spectrosc. 111, 103182 (2020)
2. Huang, L., et al.: Comparative analysis of the volatile components in cut tobacco from different locations with gas chromatography-mass spectrometry (GC-MS) and combined chemometric methods. Anal. Chim. Acta 575(2), 236–245 (2006)
3. Shin, H.S., et al.: Sensitive and simple method for the determination of nicotine and cotinine in human urine, plasma and saliva by gas chromatography-mass spectrometry. J. Chromatogr. B-Analyt. Technol. Biomed. Life Sci. 769(1), 177–183 (2002)
4. Xu, W.H., et al.: Flow injection techniques in aquatic environmental analysis: recent applications and technological advances. Crit. Rev. Anal. Chem. 35(3), 237–246 (2005)
5. Iber, B.T., et al.: A review of various sources of chitin and chitosan in nature. J. Renew. Mater. 10(4), 1097–1123 (2022)
6. Jiang, T., et al.: Prediction and analysis of surface quality of Northeast China ash wood during water-jet assisted CO2 laser cutting. J. Renew. Mater. 9(1), 119–128 (2021)
7. Xin, X., et al.: Dynamic mechanical and chemorheology analysis for the blended epoxy system with polyurethane modified resin. J. Renew. Mater. 10(4), 1081–1095 (2022)
8. Huang, Y., et al.: Predicting heavy metals in dark sun-cured tobacco by near-infrared spectroscopy modeling based on the optimized variable selections. Ind. Crops Prod. 172 (2021)
9. Shao, Y., He, Y., Wang, Y.: A new approach to discriminate varieties of tobacco using vis/near infrared spectra. Eur. Food Res. Technol. 224(5), 591–596 (2007)


10. Zhang, L., Ding, X., Hou, R.: Classification modeling method for near-infrared spectroscopy of tobacco based on multimodal convolution neural networks. J. Analyt. Methods Chem. (2020)
11. Zhang, Y., et al.: Quantitative analysis of routine chemical constituents in tobacco by near-infrared spectroscopy and support vector machine. Spectrochim. Acta Part A-Mol. Biomol. Spectrosc. 71(4), 1408–1413 (2008)
12. Chang, C.F., et al.: Quantitative evaluation of high-resolution features in images of negatively stained tobacco mosaic virus. Ultramicroscopy 11(1), 3–11 (1983)
13. Guru, D.S., et al.: Machine vision based classification of tobacco leaves for automatic harvesting. Intell. Autom. Soft Comput. 18(5), 581–590 (2012)
14. Sari, Y., Pramunendar, R.A.: Classification quality of tobacco leaves as cigarette raw material based on artificial neural networks. Int. J. Comput. Trends Technol. 50(3), 147–150 (2017)
15. Yin, Y., Xiao, Y., Yu, H.: An image selection method for tobacco leave grading based on image information. Eng. Agric. Environ. Food 8(3), 148–154 (2015)
16. Zhang, F., Zhang, X.: Classification and quality evaluation of tobacco leaves based on image processing and fuzzy comprehensive evaluation. Sensors 11(3), 2369–2384 (2011)
17. Davies, A., et al.: Identification of volatiles from heated tobacco biomass using direct thermogravimetric analysis-mass spectrometry and target factor analysis. Thermochim. Acta 668, 132–141 (2018)
18. Magdy, N., Ayad, M.F.: Two smart spectrophotometric methods for the simultaneous estimation of simvastatin and ezetimibe in combined dosage form. Spectrochim. Acta Part A-Mol. Biomol. Spectrosc. 137, 685–691 (2015)
19. Antunes, E.F., et al.: Comparative study of first- and second-order Raman spectra of MWCNT at visible and infrared laser excitation. Carbon 44(11), 2202–2211 (2006)
20. Sadezky, A., et al.: Raman micro spectroscopy of soot and related carbonaceous materials: spectral analysis and structural information. Carbon 43(8), 1731–1742 (2005)
21. Gou, J., et al.: A class-specific mean vector-based weighted competitive and collaborative representation method for classification. Neural Netw. 150, 12–27 (2022)

Chapter 20

Research on Spam Detection with a Hybrid Machine Learning Model Yifu Gao, Jiuguang Song, Jia Gao, Na Suo, An Ren, Juan Wang, and Kun Zhang

Abstract Since the beginning of the twentieth century, with the rapid development and popularization of computer technology, e-mail has become an indispensable communication tool in people's social life and has greatly simplified daily activities such as learning and working. However, alongside this convenience, e-mail has also generated spam messages that seriously threaten the safety of e-mail users. Although spam detection technology has been deeply studied and widely applied, traditional spam detection methods mostly rely on static features extracted from the mails; such methods have great limitations and cannot effectively deal with new malicious mail attacks that are complex, aggressive, destructive, and targeted. Therefore, drawing on the rapid development of machine learning and artificial intelligence technology, this paper proposes a spam detection model based on a hybrid machine learning method: first, a text pre-processing procedure including word tokenization, stop-word removal, and feature vector extraction is presented; then, a spam detection classifier based on a linear-regression hybrid of three commonly-used machine learning methods, namely SVM, ANN, and RF, is discussed. The results show that each of the three models produces good results, but the hybrid model presents better performance, demonstrating the effectiveness of the hybrid method.

Keywords Spam detection · Machine learning · Hybrid model

Y. Gao · J. Song (B) · J. Gao · A. Ren
The Research Institute of Petroleum Exploration and Development, No. 20 Xueyuan Rd, Beijing 100083, P. R. China
e-mail: [email protected]
Y. Gao, e-mail: [email protected]
J. Gao, e-mail: [email protected]
A. Ren, e-mail: [email protected]

N. Suo
China Petroleum Pipeline Telecom and Electricity Engineering Co., Ltd., No. 49, Jin Guang Rd, Guangyang District, Langfang, Hebei Province, P. R. China
e-mail: [email protected]

J. Wang · K. Zhang
Kunlun Digital Technology Co., Ltd., Level 5 Building B2 Block A12, The North Huanghe Street, ChangPing District, Beijing 100083, P. R. China
e-mail: [email protected]
K. Zhang, e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_20

20.1 Introduction

Spam usually refers to unsolicited, mass e-mail that is forced into the user's mailbox. In 1978, Gary Thuerk sent the same email to 600 people at the same time; he was reprimanded and told not to do it again. This was the first "spam" in human history. At present, with the rapid development of computer information technology, e-mail brings convenience to people but also generates a lot of spam. There is no clear line between ham mail and spam, but spam usually has the following obvious characteristics:

1. There is no clear sender, return address, or return method;
2. It sends advertisements, pornography, or other irrelevant materials to users without their permission and affects their daily life and work;
3. It uses e-mail to engage in illegal activities such as pyramid selling and heresy, which endangers people's personal and property safety.

During the COVID-19 epidemic in 2020–2021, the global economy was seriously affected. Criminals carried out a large number of phishing mail attacks themed on COVID-19, and foreign forces and hacker spy organizations continued to conduct advanced persistent attacks on key areas in China. Worldwide, many countries suffer from spam emails every day; for example, as of October 18, 2021, the country from which the highest number of spam emails was sent within one day was the United States, with around 8.6 billion, while the numbers of spam emails sent from China and India are also relatively high compared with the total number of daily sent emails [1]. Figure 20.1 shows the daily number of spam emails sent worldwide as of October 18th 2021, by country, in billions. To sum up, spam detection is of great significance.

The remaining parts of this paper are organized as follows: Sect. 20.2 discusses the literature review, including traditional spam email detection methods and related work; Sect. 20.3 presents our method for spam detection with machine learning technologies; the experiments are discussed in Sect. 20.4; finally, a conclusion is drawn.


Fig. 20.1 Daily number of spam emails sent worldwide as of October 18th 2021, by country in billions

20.2 Literature Review

Since the first emergence of spam in 1978, experts and scholars all over the world have spared no effort in researching and practicing spam detection technology and have achieved fruitful research results. As shown in Fig. 20.2, an e-mail is mainly composed of e-mail addresses (sender and receiver, and the e-mail IP addresses of sender and receiver), subject, and message content (keywords, body, attachments). These parts are the key objects of study for spam detection technology and the important basis for its extraction, statistics, analysis, and judgment.

Fig. 20.2 The email sample


According to [2], spam detection can be conducted with traditional methods or with machine learning approaches. Among the traditional methods, the most commonly-used techniques include blacklists/whitelists, signatures, and so on. The machine learning approaches include Bayesian methods, SVM, neural networks, Markov models, memory-based pattern discovery, etc.

Blacklist detection technology [3], also known as address detection technology, is the earliest spam detection technology: emails sent by a blacklisted sender are judged as spam and blocked. Suppose that a rule-based blacklist detection system is established at server A. When a new email arrives, the system first checks the sender's server address, proxy server address, sender's email address, and other relevant information in the email header, and then matches them against the information in the blacklist. If this address information matches a record in the blacklist, the email is judged as spam and intercepted.

Whitelist detection technology [4–6], also known as safe list technology, was proposed by Hall in 1998. In contrast to the blacklist, emails from senders listed in the whitelist are judged as legitimate. Some users may prefer to receive spam emails rather than miss a legitimate email, so users can define, set, and maintain their own whitelist as needed. When the sender address of a newly received email matches an address in the whitelist, the email is judged as ham and received directly.

For spam detection methods based on machine learning, [7] discussed algorithms such as Naïve Bayes (NB), Multi-layered Perceptron (MLP), J48, and Linear Discriminant Analysis (LDA) and compared their performance. In [8], Bayesian classification, k-NN, ANNs, SVMs, and the Artificial Immune System were presented and their performance was analyzed on the SpamAssassin public mail corpus. In [9], the author compared and analyzed many commonly-used machine learning models, and the results showed that a data pre-processing phase with the N-Grams method gives better performance for spam detection in most cases.

20.3 Proposed Model

In this part, we discuss the framework of our proposed spam detection model, a hybrid based on three machine learning methods.

20.3.1 Framework of Proposed Model

The proposed model for email spam detection includes two major stages. Stage one is mainly for feature vector extraction and consists of collecting and pre-processing the dataset; the pre-processing usually includes word tokenization, stop-word removal, and extraction of a feature vector for each email. Stage two is designed


to classify spam emails with a linear-regression hybrid model based on three commonly-used machine learning methods, namely SVM [10], ANN [11], and RF [12].

20.3.2 Stage One Process

As described in Fig. 20.3, the stage one process in this paper consists of word tokenization, stop-word removal, and feature extraction.

Some words appear frequently in sentences or documents but do not convey much useful information for readers, such as "and", "in", "or", "on", and "so". Such words are commonly marked as stop words. If these stop words were taken into account for spam detection, they could reduce the final accuracy of spam classification or degrade the model's performance by increasing the computational cost of training. Therefore, before extracting the feature vectors of the emails, we locate these stop words in the text of all emails and remove them.

Tokenization means splitting a text or sentence into elements such as words, numbers, punctuation, symbols, and other characters that need no further decomposition in subsequent processing; punctuation marks, special symbols, and the like with no significant meaning are discarded during tokenization. In English, sentences are naturally separated by spaces, commas, quotation marks, full stops, etc., so segmentation is relatively easy. In this paper, we applied Aho-Corasick [14], a dictionary matching algorithm, to identify tokens in the text of the emails. First, a large number of related words are collected to form a relatively comprehensive English dictionary library (such as the Oxford Dictionary), organized as an AC tree; this automaton is then used to match strings against the English dictionary, and whenever an English string is found, a word is successfully recognized.

For the feature vector extraction process, the widely-used Term Frequency-Inverse Document Frequency (TF-IDF) method is used to extract the feature vector from the texts. A brief flowchart of the process is shown in Fig. 20.4, and the specific principle of TF-IDF is described in [13].
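As a rough illustration of the stage one pipeline, the Python sketch below uses scikit-learn's TfidfVectorizer, which performs tokenization, stop-word removal, and TF-IDF weighting in one step. Note that its regex-based tokenizer stands in for the dictionary-based Aho-Corasick matcher described above, and the toy email texts are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for email bodies (hypothetical examples)
emails = [
    "win a free prize now, click here",
    "meeting agenda for the project on Monday",
]

# Tokenization, stop-word removal ('english' built-in list), TF-IDF weighting
vectorizer = TfidfVectorizer(stop_words="english", lowercase=True)
X = vectorizer.fit_transform(emails)          # (n_emails, n_terms) sparse matrix

print(vectorizer.get_feature_names_out())     # vocabulary after stop-word removal
print(X.toarray().round(2))                   # TF-IDF feature vectors
```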

20.3.3 Stage Two Process

After the stage one process, the text of each email is represented by a feature vector designed to contain as much useful information about the email as possible. A hybrid machine learning method is designed to classify these feature vectors in two main steps: the first step applies three commonly-used machine learning methods, namely SVM, ANN, and RF; the second step integrates the three methods with linear regression.

Fig. 20.3 The flowchart of the proposed model

Fig. 20.4 The flowchart of stage one


An ANN usually consists of an input layer, hidden layers, and an output layer, and each layer can have one or more neurons; for large datasets, more hidden layers are used. The input layer accepts the input feature vector and passes the information to the hidden layers, and the calculation result is finally produced by the output layer. Furthermore, to avoid the vanishing gradient problem, the ReLU function was chosen as the activation function.

The linear SVM aims to find a hyperplane that separates all the input data points in the input vector space into different classes while maximizing the margin between those classes. If no hyperplane can linearly separate the input points, a kernel function is introduced to map all input vectors from the low-dimensional space into a high-dimensional one. Commonly-used kernel functions include the polynomial, sigmoid, and Gaussian kernels.

The RF model, an ensemble learning method, uses multiple decision trees to classify data: for a given set of data, each tree votes on an overall classification, and the random forest algorithm selects the class with the most votes.

Previous research has shown that these traditional machine learning methods can predict spam with high accuracy. To push their performance on spam detection further, we explored a hybrid of the three methods. The linear integration model [14] is designed to combine them:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 \qquad (20.1)$$

where $X_1$, $X_2$, and $X_3$ are the predicted results of the SVM, ANN, and RF models, respectively, and the coefficients $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ are determined by the least squares method.
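The paper's experiments were run in Matlab; purely as an illustration, the following Python sketch reproduces the same idea with scikit-learn and NumPy, using random placeholder data and the hyperparameters mentioned in Sect. 20.4.2 (sigmoid-kernel SVM, an ANN with hidden layers of 4 and 5 neurons).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Hypothetical TF-IDF features and labels (1 = spam, 0 = ham)
rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(200, 20)), rng.integers(0, 2, 200)
X_te = rng.normal(size=(50, 20))

models = [SVC(kernel="sigmoid"),
          MLPClassifier(hidden_layer_sizes=(4, 5), max_iter=500),
          RandomForestClassifier()]
P_tr = np.column_stack([m.fit(X_tr, y_tr).predict(X_tr) for m in models])
P_te = np.column_stack([m.predict(X_te) for m in models])

# Least-squares fit of Y = b0 + b1*X1 + b2*X2 + b3*X3 (Eq. 20.1)
A = np.hstack([np.ones((len(P_tr), 1)), P_tr])
beta, *_ = np.linalg.lstsq(A, y_tr.astype(float), rcond=None)
y_hat = np.hstack([np.ones((len(P_te), 1)), P_te]) @ beta  # hybrid score
spam_pred = (y_hat > 0.5).astype(int)                      # threshold at 0.5
```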

20.4 Experiment Result

20.4.1 Dataset

The dataset was collected from an oil and gas company; all the messages were delivered to a particular server between July 18, 2020 and November 16, 2022. This dataset is used to verify the performance of the proposed spam detection model. It is divided into a training dataset and a testing dataset: the training dataset includes 80,000 ham emails and 20,000 spam emails, and the testing dataset has 10,000 ham emails and 5044 spam messages.

Table 20.1 The RMSE of the three models over the dataset

Model | Training dataset | Testing dataset
SVM   | 0.74             | 0.61
ANN   | 0.86             | 0.64
RF    | 0.85             | 0.72

Table 20.2 The RMSE of all models over the dataset

Model  | Training dataset | Testing dataset
SVM    | 0.74             | 0.61
ANN    | 0.86             | 0.64
RF     | 0.85             | 0.72
Hybrid | 0.90             | 0.78

20.4.2 Experiment Result

First, we evaluated the SVM, ANN, and RF individually for spam detection. The SVM is configured with a sigmoid kernel function; the ANN is designed with two hidden layers of 4 and 5 neurons, respectively, and an output layer with 1 neuron. All configuration and execution were done in Matlab. After the models were set up, the training dataset was used to train each of them, and the testing dataset was then fed into each trained model to calculate its RMSE. Table 20.1 records the RMSE of these models on the training dataset and the testing dataset.

From Table 20.1, we conclude that the RF model gives the most accurate predictions and the SVM the least accurate, with the ANN between them. These results demonstrate the feasibility of all three models for spam detection.

Based on the SVM, ANN, and RF, we then built the linear-regression hybrid method; the weight of each model is calculated by the least squares method on the training dataset. The testing dataset was then fed into the hybrid model, and Table 20.2 shows the RMSE on the training and testing datasets. From Table 20.2 it is clear that the hybrid model performs better than the other models, and its result also meets the practical standard.

20.5 Conclusion

In this paper, a spam detection model with a hybrid machine learning method was proposed and investigated. We described the model's flow chart, which includes data collection, data pre-processing, and model configuration, and used a custom dataset to verify the performance of the proposed model. First, we discussed the


three commonly-used models for spam detection, SVM, ANN, and RF, whose performance demonstrates their feasibility for the task. Furthermore, to improve prediction accuracy, a hybrid model with linear regression was proposed, and the experimental results revealed that the hybrid model based on the three aforementioned models produces better performance. As future work, the hybrid model will first be tested against other real-time email datasets and then considered for installation on the production system of the oil and gas company to help system users automatically detect spam messages. Meanwhile, more machine learning methods will be discussed and explored for the challenge of spam detection; hopefully, the new model will meet expectations and the demands of spam detection.

References

1. https://www.statista.com/statistics/1270488/spam-emails-sent-daily-by-country/. Last Accessed 11 Dec 2022
2. Saab, S.A., Mitri, N., Awad, M.: Ham or spam? A comparative study for some content-based classification algorithms for email filtering. In: MELECON 2014—2014 17th IEEE Mediterranean Electrotechnical Conference, pp. 339–343. IEEE (2014)
3. Swe, M.M., Myo, N.N.: Fake accounts detection on twitter using blacklist. In: 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), pp. 562–566. IEEE (2018)
4. Heron, S.: Technologies for spam detection. Netw. Secur. 1, 11–15 (2009)
5. Esquivel, H., Akella, A., Mori, T.: On the effectiveness of IP reputation for spam filtering. In: 2010 Second International Conference on Communication Systems and Networks (COMSNETS 2010), pp. 1–10. IEEE (2010)
6. Trifunovic, S., Kurant, M., Hummel, K.A., Legendre, F.: Preventing spam in opportunistic networks. Comput. Commun. 41, 31–42 (2014)
7. Deepa, L., Radha, N.: Supervised learning approach for spam classification analysis using data mining tools. Int. J. Comput. Sci. Eng. 2(9), 2783–2789 (2010)
8. Mujtaba, G., Shuib, L., Raj, R.G., Majeed, N., Al-Garadi, M.A.: Email classification research trends: review and open issues. IEEE Access 5, 9044–9064 (2017)
9. Sattu, N.: A Study of Machine Learning Algorithms on Email Spam Classification. Southeast Missouri State University (2020)
10. Huang, S., Cai, N., Pacheco, P.P., et al.: Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genom. Proteom. 15(1), 41–51 (2018)
11. Cadenas, E., Rivera, W.: Wind speed forecasting in three different regions of Mexico, using a hybrid ARIMA–ANN model. Renew. Energy 35(12), 2732–2738 (2010)
12. Biau, G., Scornet, E.: A random forest guided tour. TEST 25(2), 197–227 (2016)
13. Aizawa, A.: An information-theoretic perspective of tf–idf measures. Inf. Process. Manag. 39(1), 45–65 (2003)
14. Weisberg, S.: Applied Linear Regression. John Wiley & Sons (2005)

Chapter 21

Research on Weld Defect Object Detection Based on Multi-channel Fusion Convolutional Neural Network Hanlin Geng, Zhaohui Li, and Yuanyuan Zhou

Abstract Aiming at the problems of low efficiency and strong subjectivity in the current detection of weld defects by radiographic imaging technology, an object detection method for weld defects based on a multi-channel fusion convolutional neural network is proposed. In this method, images of weld defects are encoded and input into multiple feature extraction channels formed by the parallel fusion of CNNs. The extracted features are then fused by a full connection layer and the feature vectors are output; finally, the final output is obtained by Softmax classification. The proposed method is verified with weld defect images from actual production. The experimental results indicate that the mAP of the multi-channel fusion convolutional neural network reaches 76.37%, and its detection accuracy for weld defects is higher than that of other networks such as ResNet-50 and VGG-16. The proposed method can be applied to the X-ray intelligent detection of weld defects and similar scenarios.

Keywords X-ray image · Weld defect · Object detection · Multi-channel fusion

21.1 Introduction

With the rapid development of industrial technology, welding forming technology is widely used in the aerospace, equipment, shipbuilding, and automobile industries. Owing to the influence of the environment, personnel operation, welding process, and other factors, many types of defects can occur in the weld seam. According to the national standard for the classification and description of weld defects [1], weld defects are divided into six types: weld tumor, porosity, lack of penetration, lack of fusion, slag inclusion, and crack. These defects affect the performance of products and, in serious cases, can even cause production safety accidents. In actual production, X-ray flaw detection is often used to detect internal weld defects, but a large batch of images is generated in the inspection process.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Patnaik et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 349, https://doi.org/10.1007/978-981-99-1230-8_21

237

238

H. Geng et al.

process. Furthermore, the manual evaluation method is inefficient and depends on the experience of inspectors. Therefore, the detection standards and accuracy will be restricted by the subjectivity of personnel [2, 3]. With the rapid growth of artificial intelligence, many scholars have engaged indepth research on the intelligent detection of weld defects. The proposal of Convolutional Neural Network (CNN) in deep learning has promoted the development of object detection. Chen et al. [4] introduced the residual block into MobileNet and used ELU activation function to replace ReLU to solve the network degradation problem in MobileNet training. Compared with other networks, it has higher accuracy and less computational complexity. Jiang et al. [5] constructed an improved CNN model (IPFCNN) to solve the problem of low feature selection ability of traditional CNN and applied it to weld defect identification. The results indicated that the recognition accuracy was higher than that of traditional ones. Fan et al. [6] used the superpixel segmentation algorithm to construct a new CNN model, which solved the problem of the proportion of RoI in weld defect images and improved the feature extraction ability of the network, achieving an overall recognition rate of 97.8% in the experiment. Hi et al. [7] established a deep learning fusion network model for comprehensive analysis of waveform features and image features based on CNN and TCN, which solved the problem of lack of feature integration analysis in existing models and achieved high recognition accuracy. To sum up, although existing deep learning detection methods can achieve a certain accuracy in the identification task of weld defects, they cannot make detailed classification of different defects with similar morphological characteristics. For example, porosity and slag inclusion with the same circular characteristics. Therefore, based on the Faster-RCNN, this paper optimizes and improves ResNet, connects multiple feature extraction networks in parallel, and fuses them with the full connection layer. During training, the defects of the same type are encoded and input into the channel to extract the specific features of the weld defects. Finally, radiographic images of welding seams in actual production are used for verification, and the results indicate that the proposed network model achieves higher recognition accuracy than the generic model.

21.2 Multi-channel Fusion Convolutional Neural Network 21.2.1 ResNet Network Structure Analysis Compared with the common network, ResNet constructs Residual units between every two layers by shortcut and uses Residual Block to stack the network. The deep network established with Residual Block can solve the network degradation problem well [8]. Identity Mappings in Deep Residual Networks [9] further analyzed ResNet back propagation theory and adjusted the structure of Residual Block. The new structure is shown in Fig. 21.1.


Fig. 21.1 Residual block

The new structure uses the shortcut as the trunk path and the residual path as a bypass. The trunk path maintains the "purity" of the shortcut so that the information flow is transmitted intact in both forward and backward propagation; thus, the network blocks at either end of the shortcut are not affected by the weighted parameters of the intermediate layers when transmitting information. Batch Normalization and ReLU are uniformly placed in front of the weight layers on the residual path as pre-activation, so that the network is easier to optimize and has stronger generalization ability, avoiding network degradation.

ResNet can efficiently extract image features for most images. However, the shapes of weld defects are changeable, and some defects (porosity and slag inclusion) have similar morphological characteristics. In such cases, ResNet cannot extract defect features effectively, which degrades the network training effect and limits further improvement of the detection accuracy.
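For concreteness, a minimal PyTorch sketch of such a pre-activation bottleneck block is shown below; the channel widths and the identity-only shortcut are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class PreActBottleneck(nn.Module):
    """Pre-activation bottleneck block: BN and ReLU precede each conv,
    and the shortcut carries the input through unchanged (identity)."""

    def __init__(self, in_ch, mid_ch):
        super().__init__()
        self.residual = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, in_ch, kernel_size=1, bias=False),
        )

    def forward(self, x):
        # trunk path (shortcut) stays "pure"; the bypass adds the residual
        return x + self.residual(x)

block = PreActBottleneck(256, 64)
out = block(torch.randn(1, 256, 56, 56))   # shape preserved: (1, 256, 56, 56)
```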

21.2.2 Multi-channel Fusion ResNet

As Hi et al. pointed out in [7], weld defect data have two characteristics: image and waveform. In that work, a parallel CNN and TCN were built to extract the graphic features and waveform features from the weld defect data, respectively. Then


a full connection layer was used to fuse the two extracted feature sets, and Softmax was used to classify the fused features, with good results. Inspired by that work, this paper uses a multi-channel convolutional neural network for the feature extraction of the various types of weld defects. Based on ResNet, a multi-channel fusion convolutional neural network (MC-ResNet) is proposed; its structure is shown in Fig. 21.2.

MC-ResNet establishes three channels for the three kinds of weld defects. Each CNN channel contains 1 max pooling layer, 4 Bottleneck convolution blocks, 1 average pooling layer, and 1 full connection layer. Each Bottleneck block consists of 3 convolutional layers, and the Bottleneck blocks are stacked in series, so a single channel has 54 convolution layers. The parameter settings of each layer of MC-ResNet are shown in Table 21.1.

The input weld defect images are first processed by One-Hot encoding. Before this, it must be ensured that each sample in the weld defect image data contains only one type of defect. Images with the same code are then packed into a batch, so all samples in a batch have the same defect type; the code corresponding to each type of weld defect is shown in Table 21.2. Each batch is then input into the channel corresponding to its code, which extracts only the characteristics of that defect type. After that,

Fig. 21.2 Structure of the multi-channel fusion convolutional neural network


Table 21.1 MC-ResNet network structure (channels 1–3 are identical)

Layer name | Output size | Channels 1–3
Input      | 224×224     | One-Hot encoding
Layer 0    | 112×112     | 7×7, 64, stride 2
Layer 1    | 28×28       | 3×3 max pool, stride 2; [1×1, 64; 3×3, 64; 1×1, 256] × 3
Layer 2    | 14×14       | [1×1, 128; 3×3, 128; 1×1, 512] × 4
Layer 3    | 7×7         | [1×1, 256; 3×3, 256; 1×1, 1024] × 6
Layer 4    | 1×1         | [1×1, 512; 3×3, 512; 1×1, 2048] × 3
Layer 5    | 1×1         | Average pool, 1000-d fc
Layer 6    |             | 1000-d fc, softmax
FLOPs      |             | 11.4 × 10^9

the extracted features are input into the full connection layer of the channel. Through the above operations, the corresponding defect features are extracted by the three channels respectively. A full connection layer is then used to fuse the features extracted from the three channels and output the feature vectors, and the final feature vectors are classified by a Softmax classifier. The multi-channel feature fusion process for weld defect images is shown in Fig. 21.3.

In MC-ResNet, the CNNs with feature extraction function are combined in parallel, and the number of feature extraction channels is set according to the types of weld defects. The weld defect data used in this paper contain three types, porosity, slag inclusion, and crack, so three feature extraction channels are set up in MC-ResNet.

Table 21.2 Weld defect one-hot encoding

Type of defect | Encoding
Porosity       | 0001
Slag inclusion | 0010
Crack          | 0100


Fig. 21.3 Multi-channel feature fusion process of MC-ResNet

The number of channels can be extended according to the number of object types to be detected in the dataset. Compared with ResNet, the proposed network sets up feature extraction channels according to the number of detected object types. The "purity" of the training samples in a single channel is guaranteed, which effectively reduces the detection task within each channel to a binary classification between "defective" and "non-defective"; the feature extraction ability is thus improved. By using specific channels to extract the features of specific objects, the proposed network simplifies the multi-object detection problem and makes the network pay more attention to the local difference information in images, which is beneficial for extracting details and improves the detection accuracy of weld defects.
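A minimal PyTorch sketch of this multi-channel fusion idea is shown below, assuming one ResNet-50 trunk per defect type, concatenation of the per-channel features, a fusing full connection layer, and Softmax classification. The layer sizes are illustrative; note also that during the paper's training, batches are routed to the channel matching their one-hot code, whereas this sketch simply runs all channels on the input at inference.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class MCResNet(nn.Module):
    """Sketch of multi-channel fusion: parallel ResNet-50 trunks,
    full-connection fusion of their features, Softmax classification."""

    def __init__(self, n_channels=3, n_classes=3, feat_dim=1000):
        super().__init__()
        self.channels = nn.ModuleList(
            [resnet50(num_classes=feat_dim) for _ in range(n_channels)])
        self.fusion = nn.Linear(n_channels * feat_dim, n_classes)

    def forward(self, x):
        feats = [ch(x) for ch in self.channels]       # per-channel features
        fused = self.fusion(torch.cat(feats, dim=1))  # full-connection fusion
        return torch.softmax(fused, dim=1)

model = MCResNet()
probs = model(torch.randn(2, 3, 224, 224))  # (2, 3) class probabilities
```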

21.2.3 Faster-RCNN Object Detection Model

Faster-RCNN is a two-stage object detection model proposed by Ross B. Girshick [10–12]. Structurally, Faster-RCNN integrates feature extraction, proposal region extraction, bounding box regression, and classification into one network, which significantly improves the comprehensive performance of the model, especially the detection speed. Faster-RCNN can be divided into four main parts:

(1) Backbone: Faster-RCNN uses a set of "Conv + ReLU + pooling" layers to extract the feature map of the images. This feature map is shared by the subsequent RPN and full connection layers.
(2) Region Proposal Network: the RPN generates anchors, determines whether each anchor is positive or negative, and then obtains accurate anchors by bounding box regression.
(3) ROI pooling: this layer collects the feature maps and anchors; after integrating this information, the feature maps of the anchors are extracted and sent to the subsequent full connection layer to determine the object category.
(4) Classification and regression: the specific categories of the objects are calculated from the proposal feature maps; meanwhile, a bounding box regression is performed again to obtain the precise location of each bounding box.

In this paper, MC-ResNet with FPN is adopted as the backbone of Faster-RCNN. The entire framework of Faster-RCNN is shown in Fig. 21.4.

Fig. 21.4 Overall framework of Faster-RCNN
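As a sketch of how such a detector can be assembled, the following snippet uses recent torchvision APIs; the standard ResNet-50 + FPN backbone stands in for the paper's MC-ResNet, which would be wrapped in the same way, and the class count assumes the three defect types plus background.

```python
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# Illustrative stand-in backbone: ResNet-50 with FPN (instead of MC-ResNet)
backbone = resnet_fpn_backbone(backbone_name="resnet50", weights=None)
model = FasterRCNN(backbone, num_classes=4)  # 3 defect types + background

model.eval()
with torch.no_grad():
    preds = model([torch.rand(3, 224, 224)])  # list of image tensors
print(preds[0]["boxes"].shape, preds[0]["labels"], preds[0]["scores"])
```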

21.3 Experiment and Analysis

21.3.1 Weld Defect Dataset

In this paper, network training is carried out with radiographic images of welding seams produced by an aerospace research institute during 2016–2021. The original data contain 1,540 pictures of four types: porosity, slag inclusion, crack, and no defect, each sample accompanied by a test report issued by professional inspectors. In actual production, the detection process differs between workpieces, so the radiographic image quality of the welding seams is not consistent, and some images suffer from over-exposure, noise, or blur. To improve the image quality of the dataset, data cleaning is first performed on the original samples to filter out unqualified images. A Gaussian filter is then used to remove noise in the images, and image sharpness is improved by gamma correction and other image enhancement methods.
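A minimal sketch of this denoising and enhancement step is given below, assuming OpenCV; the kernel size, gamma value, and file names are illustrative placeholders, not the paper's settings.

```python
import cv2
import numpy as np

def enhance_radiograph(img_gray, gamma=0.8, ksize=5):
    """Denoise a weld radiograph and adjust contrast.

    Gaussian filtering suppresses noise; gamma correction brightens or
    darkens mid-tones. gamma and ksize values here are illustrative.
    """
    denoised = cv2.GaussianBlur(img_gray, (ksize, ksize), 0)
    norm = denoised.astype(np.float32) / 255.0
    corrected = np.power(norm, gamma)            # gamma correction
    return (corrected * 255).astype(np.uint8)

img = cv2.imread("weld.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
if img is not None:
    cv2.imwrite("weld_enhanced.png", enhance_radiograph(img))
```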


After a qualified dataset is obtained, "Labelme" is used to annotate the samples, and the annotated dataset is organized in the "PascalVOC" format. Annotated example images are shown in Fig. 21.5. Finally, to avoid over-fitting during network training due to the small sample size, the number of images in the dataset is extended to 2844 by data augmentation, and the dataset is divided into a training set and a validation set of 1422 images each. The dataset division is shown in Table 21.3.

Fig. 21.5 Radiographic images of porosity, slag inclusion, crack, no defect

Table 21.3 Division of the weld defect dataset

Dataset        | Porosity | Slag inclusion | Crack | Total
Training set   | 657      | 470            | 295   | 1422
Validation set | 657      | 471            | 294   | 1422


21.3.2 Evaluation Index

To verify the effectiveness of MC-ResNet, mean average precision (mAP) is used to measure the performance of the network. According to the combination of the real defect category and the predicted defect category, the samples are divided into four types: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). The confusion matrix composed of these four sample types is shown in Fig. 21.6.

Fig. 21.6 A diagram of the confusion matrix

A predicted defect is counted as TP when its intersection over union (IoU) with the ground truth exceeds the threshold value; in this paper, IoU = 0.5 is set as the threshold, so predictions above this value are regarded as positive samples and those below it as negative samples. The precision and recall are then calculated with the formulas in Table 21.4, the P-R curve is drawn from the precision and recall values, and the average precision (AP) is the area under the P-R curve. mAP is calculated by averaging the AP values of all categories and is used as the evaluation index for the predictive ability of the network.

Table 21.4 Calculation formulas of the evaluation indices

Evaluation index | Formula
Precision        | $precision = \frac{TP}{TP + FP}$
Recall           | $recall = \frac{TP}{TP + FN}$
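For illustration, the sketch below computes the AP of one defect class as the area under the precision-recall curve from ranked detections; the matching of boxes to ground truth at IoU > 0.5 is assumed to have been done beforehand, and the demo numbers are hypothetical.

```python
import numpy as np

def average_precision(scores, is_tp, n_gt):
    """Area under the P-R curve for one defect class (all-point AP).

    scores: confidence of each predicted box; is_tp: whether the box
    matched a ground-truth defect with IoU > 0.5; n_gt: number of
    ground-truth defects. mAP is the mean of AP over all classes.
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(1.0 - tp)
    precision = cum_tp / (cum_tp + cum_fp)
    recall = cum_tp / n_gt

    # step-wise summation of precision over recall increments
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

print(average_precision([0.9, 0.8, 0.6, 0.4], [1, 0, 1, 1], n_gt=4))
```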

21.3.3 Experimental Running Environment

The experiment is run under Ubuntu Server 20.04, using the PyTorch 1.10.0 deep learning framework and a Python 3.8 environment. The CPU used in the experiment is an Intel Xeon Gold 6330 with 48 GB memory; the GPU is an Nvidia RTX 3090 with 24 GB. CUDA 11.2 is used as the GPU computing framework, and cuDNN 8.3 is used for accelerating deep neural network computation.

21.3.4 Results

To verify the effectiveness of MC-ResNet, MC-ResNet, ResNet-50, and VGG-16 are used as the backbone of Faster-RCNN for a horizontal comparative analysis. At the same time, one-stage methods represented by YOLO and SSD are used for a longitudinal comparative analysis. All methods use the dataset described in Table 21.3 for training and validation, with 15 epochs. To prevent experimental randomness from affecting the results, each experiment is repeated 5 times and the average of the 5 results is taken as the final result. The results are shown in Table 21.5.

Table 21.5 Identification results of weld defects by different methods

Method    | Precision (%): Porosity | Slag inclusion | Crack | mAP (%)
MC-ResNet | 79.11                   | 68.13          | 81.89 | 76.37
ResNet-50 | 78.72                   | 46.00          | 84.06 | 69.67
VGG-16    | 80.03                   | 45.25          | 75.12 | 66.81
SSD       | 77.47                   | 52.60          | 77.86 | 69.30
YOLO      | 66.23                   | 42.11          | 47.34 | 52.00


It can be seen from Table 21.5 that the mAP of MC-ResNet is the highest, achieving the best recognition results. From the perspective of network structure, ResNet-50 and MC-ResNet have the same network depth, so there is little difference in their precision for porosity and crack. However, the precision of ResNet-50 for slag inclusion is relatively low, only 46.00%. As can be seen from Fig. 21.5, porosity and slag inclusion share the same circular features and differ mainly in their grey values. ResNet-50, which mainly relies on shape features for recognition, has difficulty distinguishing such differences. In comparison, MC-ResNet sets up multiple channels for feature extraction, which improves the ability to distinguish defects with similar shape features and increases the precision for slag inclusion to 68.13%. VGG-16 has a shallow network depth, so its feature extraction performance is inferior to ResNet-50 and other deep networks, and it therefore has low precision and mAP. One-stage methods such as YOLO and SSD unify object classification and bounding box regression and directly generate the classification probabilities and bounding box coordinates of objects, so at the algorithmic level the precision of one-stage methods is lower than that of two-stage methods. The SSD method here uses ResNet-50 as its backbone, so its overall defect identification performance is higher than that of VGG-16; however, compared with Faster-RCNN using MC-ResNet or ResNet-50 as the backbone, its precision and mAP are still low. Figure 21.7 shows the curves of mAP and loss of the methods during training. As can be seen from Fig. 21.7, after 10 epochs the losses of the ResNet-50, VGG-16, SSD, and YOLO methods converge gradually, and their mAPs are basically stable. However, the loss of the MC-ResNet method is still decreasing at the end of 14 epochs, and its mAP also shows an upward trend. It can therefore be expected that, by increasing the number of epochs, the precision and overall recognition ability of the MC-ResNet method can be further improved.

21.4 Conclusion In this paper, a weld defect object detection model based on a multi-channel fusion convolutional neural network is proposed, and the effectiveness of the proposed model is verified through experiments. The main conclusions are as follows: (1) Based on ResNet-50, a feature extraction strategy of parallel fusion of convolution layers is proposed, which extends the existing deep learning model construction method, enhances the ability to extract features of defects with similar shapes, and improves the precision of deep learning models for weld defect detection. (2) The proposed model is verified on weld defect radiographic images generated in actual production. The results indicate that the MC-ResNet model has


Fig. 21.7 Curves of mAP and loss of several methods

higher precision than the ResNet-50, VGG-16, SSD, and YOLO methods, especially for defects with similar shape features. Compared with ResNet-50, the mAP of MC-ResNet improves by 6.70%, and the precision for slag inclusion defects increases by 22.12%. This method can subsequently be applied to radiographic inspection of welding seams in welded products. Acknowledgements This research is supported by China Aerospace Science and Technology Corporation for the study on high efficiency digital ray inspection and evaluation technology of welding seams of aluminum alloy tanks of carrier rockets (Project No. GXGY-2020-08).

References

1. GB/T 6417–86: Classification and description of metal fusion weld defects
2. Wang, R., Hu, Y.L., Liu, W.P., Li, H.T.: Weld defect detection based on edge AI in X-ray image. Trans. China Weld. Inst. 43(01), 79–84+118 (2022)
3. Liu, H., Liu, X.J., Wang, Y.F., et al.: Research on weld defect classification technology based on compound convolution neural network structure. Acta Aeronaut. Astronaut. Sin. 43(S1), 726928 (2022). (In Chinese)
4. Chen, Y.F., Peng, H.S., Wang, J.T., et al.: Detection and recognition of weld defects based on lightweight convolutional neural network. Autom. & Instrum. 37(1) (2022)
5. Jiang, H.Q., He, S., Gao, J.M., et al.: An improved convolutional neural network for weld defect recognition. J. Mech. Eng. 56(08), 235–242 (2020)
6. Fan, D., Hu, A.D., Huang, J.K.: X-ray image defect recognition method for pipe weld based on improved convolutional neural network. Trans. China Weld. Inst. 41(01), 7–11+97 (2020)


7. Hi, Z.L., Jiang, H.Q., Yang, D.Y., et al.: A deep learning fusion model of wave and image data for weld defect recognition. J. Xi'an Jiaotong Univ. 55(05), 73–82 (2021)
8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
9. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision—ECCV 2016. Lecture Notes in Computer Science, vol. 9908. Springer, Cham (2016)
10. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
11. Girshick, R.: Fast R-CNN. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1440–1448 (2015)
12. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031

Chapter 22

Design and Research of Highway Tunnel Construction Positioning System Based on UWB Hengbo Zhang, Xinke Wang, Yuchen Liu, and Lei Cai

Abstract Aiming at the problem of real-time positioning of construction vehicles and personnel in highway tunnels, and taking the tunnel construction environment into account, this paper makes full use of information technology, analyses the basic principles of UWB indoor positioning technology, and on this basis proposes a UWB positioning solution for highway tunnel construction. Finally, a test environment is set up to analyse the location accuracy of the positioning system. The results show that the system fully meets the positioning requirements of tunnel construction personnel and vehicles. The system is composed of four parts: the perception layer, the transmission layer, the solution layer, and the application layer. Through Internet of Things technology, functions such as positioning detection, track playback, and one-button alarm for construction personnel and vehicles can be realized, which effectively improves the level of intelligent management and security control during the construction stage of highway tunnels. Keywords Tunnel · Positioning · UWB · The Internet of things · Construction safety

22.1 Introduction With the rapid growth of China's economy, the construction of highway tunnels in China changes with each passing day. According to statistics, in the past ten years the annual mileage of highway tunnels built in China has exceeded 1100 km [1], and the construction scale is increasing year by year. In the process of tunnel engineering construction, the environment is relatively closed and there are no GNSS signals; concealment is strong, construction risk is high, the working space is limited, and the construction process is complex.

H. Zhang · X. Wang · Y. Liu (B) · L. Cai Beijing GOTEC ITS Technology Co., Ltd, Beijing 100088, China e-mail: [email protected] H. Zhang e-mail: [email protected]


When a safety accident occurs during tunnel construction, it is difficult to achieve real-time accurate positioning and timely, effective rescue, which is likely to cause many casualties and property losses. Therefore, how to ensure the personal safety of tunnel construction personnel has become an urgent problem to be solved. With the rapid development of the Internet of Things and remote intelligent monitoring technology, the in-depth application of various positioning technologies in the transportation industry provides favorable conditions for intelligent construction supervision of highway tunnels. For the application scenario of highway tunnels, compared with other indoor positioning technologies, UWB positioning technology has unique advantages in security, anti-interference, and positioning accuracy. In this paper, the basic principle of UWB indoor positioning technology and its applicability in highway tunnels are analysed [2], and a highway tunnel construction positioning system based on UWB is proposed. UWB base stations are arranged in the tunnel according to a certain topological structure to locate and track monitored objects carrying positioning tags. Through background software, managers can view the status information of monitored targets in real time, providing strong technical support for tunnel construction safety.

22.2 Brief Introduction of UWB Positioning Technology As a new carrier-free communication technology, UWB has become a hot spot in wireless indoor positioning in recent years [3]. The technology has a wide spectrum, operating in the 3.1–10.6 GHz band with a transmit power spectral density below −41 dBm/MHz. Based on its ultra-wide bandwidth, UWB sends a series of very narrow, low-power pulses across the frequency band, with data transmission rates ranging from tens to hundreds of Mbps. The main positioning algorithm is based on signal arrival time. UWB indoor positioning has low transmission power, high safety, strong anti-interference ability, and high positioning accuracy. Its disadvantage is that it requires proprietary positioning terminals; however, for tunnel construction personnel and vehicles, tags can be integrated into smart safety helmets, smart bracelets, and vehicle-mounted terminal modules, so deployment is relatively convenient and easy to manage, making UWB very suitable for positioning highway tunnel construction vehicles and personnel. Building on UWB indoor positioning technology, this paper studies and proposes a UWB-based highway tunnel construction positioning system and related applications [4].


22.3 Highway Tunnel Construction Positioning System Based on UWB The highway tunnel construction positioning system based on UWB is composed of four parts: the perception layer, the transmission layer, the solution layer, and the application layer. Based on this system, construction personnel and vehicles in the tunnel can be precisely positioned. The overall architecture of the system is shown in Fig. 22.1.

22.3.1 Perception Layer Mainly by UWB system of location sensing layer base stations, tags, labels, and tags constitute, among them, the people label position mainly for construction personnel and tend to put tag in helmets, intelligent bracelet, car label position mainly for

Fig. 22.1 Overall system architecture diagram (API: Application Program Interface; UDP: User Datagram Protocol; HTTP: Hyper Text Transfer Protocol; TCP: Transmission Control Protocol; POE: Power Over Ethernet)


construction vehicles, with the positioning tag integrated into the vehicle-mounted positioning module. UWB positioning base stations are arranged at a certain spacing along the construction tunnel according to its length, and the spacing can be larger where line-of-sight conditions are good [5]. After the positioning base stations are installed, a coordinate reference point is selected, the distance from each base station to this point is measured, and the coordinate position of each positioning base station is determined and entered into the positioning software. Through mutual communication between the positioning tags and the UWB base stations, the location information and movement status of the monitored targets can be tracked in real time. The management staff of the monitoring centre can check the number, location, and movement track of positioning targets in real time through the application software [6].

22.3.2 The Transport Layer In this system, a WiFi wireless communication module is integrated into each UWB base station, and the positioning base stations in the tunnel connect to the server wirelessly. Because the UWB base stations are laid out in the tunnel at a certain spacing, comprehensive wireless communication coverage of the tunnel can be achieved, meeting the data transmission demands of the system. Data transmission between the UWB positioning base station at the tunnel mouth and the server should be carried over optical fiber to effectively ensure the accuracy and reliability of data transmission.

22.3.3 Calculating Layer The solution layer is the core part of the positioning system and runs the UWB positioning engine software. Mainstream UWB positioning algorithms include time of arrival (TOA), time difference of arrival (TDOA), and angle of arrival (AOA). TOA has high requirements for clock synchronization and relatively poor positioning accuracy [7]. Therefore, the positioning algorithm of this system uses a fusion of TDOA and AOA to solve for position.
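As a rough illustration of how a position can be solved from TDOA measurements, the following is a generic least-squares sketch assuming SciPy; it is not the TDOA/AOA fusion engine of the actual system, and the function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

C = 299_792_458.0  # speed of light, m/s

def solve_tdoa(anchors, tdoas, x0):
    """Estimate a tag position from TDOA measurements.
    anchors: (k, 2) base-station coordinates; tdoas: arrival-time
    differences of anchors 1..k-1 relative to anchor 0; x0: initial
    position guess. A generic least-squares solver, not the fusion
    engine of the actual system."""
    def residual(p):
        d = np.linalg.norm(anchors - p, axis=1)      # distances to each anchor
        return (d[1:] - d[0]) - C * np.asarray(tdoas)
    return least_squares(residual, x0).x
```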

22.3.4 The Application Layer The application layer mainly uses the location data of the measured targets calculated by the solution layer [8] to perform identification and


positioning calculation. Through the application software it mainly realizes the positioning target display and track playback functions, and also provides intelligent attendance, emergency warning, quantity statistics, intelligent access control, and video linkage functions.

22.4 Positioning System Test 22.4.1 Test Environment Construction To verify the stability and positioning accuracy of the UWB-based highway tunnel construction positioning system, a test environment was set up at the road test ground of the Ministry of Transport. UWB base station equipment was arranged on both sides of the lanes in the tunnel at a spacing of 50 m and wired to POE (Power Over Ethernet) switches. In the positioning system control software, the fixed coordinate position and spacing of each base station were marked. The topology of the test environment in the tunnel is shown in Fig. 22.2.

22.4.2 Performance Test The vehicles and personnel with positioning tags were allowed to pass through the positioning area at low speed, and the positioning errors of 50 m base station spacing in the tunnel were tested respectively. Through the database, the generation of positioning data was checked whether the time was periodic [9]. During driving, the system tool is used to check the success rate of receiving positioning signals and


Fig. 22.2 Tunnel topology view


the positioning accuracy. Figure 22.3 shows the parameter setting interface, and Fig. 22.4 shows the base station layout.

Fig. 22.3 The interface of parameter setting

Fig. 22.4 Base station layout


Table 22.1 Test result 1

| Time (s) | Theoretical distance (m) | Test software positioning distance (m) | Error (absolute value) (m) |
|----------|--------------------------|----------------------------------------|----------------------------|
| 1 | 5.56 | 5.02 | 0.54 |
| 2 | 11.12 | 11.78 | 0.66 |
| 3 | 16.68 | 15.98 | 0.70 |
| 4 | 22.24 | 21.73 | 0.51 |
| 5 | 27.80 | 27.12 | 0.68 |
| 6 | 33.36 | 33.96 | 0.60 |
| 7 | 38.92 | 38.11 | 0.81 |
| 8 | 44.48 | 44.23 | 0.25 |
| 9 | 50.04 | 50.87 | 0.83 |
| 10 | 55.60 | 55.63 | 0.03 |
| 11 | 61.16 | 61.92 | 0.76 |
| 12 | 66.72 | 66.11 | 0.61 |
| 13 | 72.28 | 72.98 | 0.70 |

22.4.3 Test Results The theoretical distance is calculated according to the formula s = v × t, where v is the constant vehicle speed (unit: m/s) and t is the time (unit: s). When the construction vehicle travels at a uniform speed of 20 km/h in the tunnel, one section is selected for error statistics; the results are shown in Table 22.1. When the construction vehicle travels at a uniform speed of 5 km/h (imitating the normal walking speed of construction personnel in the tunnel), the error statistics are shown in Table 22.2. The test shows that when the construction vehicle travels at a uniform 20 km/h, the positioning accuracy can be controlled within 1 m, and when construction personnel move at normal walking speed, the positioning accuracy can reach within 0.2 m. Further analysis shows that the deviation in positioning accuracy is mainly caused by the relatively fast speed of construction vehicles, which leads to a high signal packet loss rate [10]; when movement is relatively slow, positioning accuracy is relatively high. Overall, the test results show that the positioning system fully meets the positioning accuracy requirements of tunnel construction personnel and vehicles.


Table 22.2 Test result 2

| Time (s) | Theoretical distance (m) | Test software positioning distance (m) | Error (absolute value) (m) |
|----------|--------------------------|----------------------------------------|----------------------------|
| 1 | 1.39 | 1.31 | 0.08 |
| 2 | 2.78 | 2.77 | 0.01 |
| 3 | 4.17 | 4.19 | 0.02 |
| 4 | 5.56 | 5.43 | 0.13 |
| 5 | 6.94 | 6.88 | 0.06 |
| 6 | 8.33 | 8.22 | 0.11 |
| 7 | 9.72 | 9.61 | 0.11 |
| 8 | 11.11 | 12.01 | 0.90 |
| 9 | 12.50 | 12.03 | 0.47 |
| 10 | 13.89 | 13.78 | 0.11 |
| 11 | 15.28 | 15.33 | 0.05 |
| 12 | 16.67 | 16.59 | 0.08 |
| 13 | 18.06 | 18.22 | 0.16 |
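A minimal sketch of the error computation described above, using s = v × t and a few measured values from Table 22.1; the snippet only reproduces the arithmetic, not the test software.

```python
# Theoretical distance s = v * t and absolute positioning error,
# illustrated with the first three measurements of Table 22.1 (v = 20 km/h).
v = 20 / 3.6                      # 20 km/h in m/s
measured = [5.02, 11.78, 15.98]   # positioning distances from Table 22.1
for t, m in enumerate(measured, start=1):
    s = v * t                     # theoretical distance at time t
    print(f"t={t}s  theory={s:.2f}m  measured={m}m  error={abs(s - m):.2f}m")
```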

22.5 Conclusion This paper focuses on the real-time positioning of tunnel construction personnel and construction vehicles, studies the algorithms and principles of UWB indoor positioning technology, and on this basis designs the architecture, composition, and functions of the positioning system. The system uses UWB pulse radio waves to locate and monitor construction personnel and vehicles in the tunnel. Managers can obtain the accurate position and motion state of a positioning target in real time and play back its historical track [11]. At the same time, the positioning system also provides one-button alarm, intelligent attendance, emergency warning, quantity statistics, intelligent access control, and video linkage functions, which can ensure the personal safety of construction personnel to the maximum extent and effectively improve the intelligent management level of tunnel construction [12].

References

1. Cao, L., Zhang, Z., Yu, X., et al.: Application and research of indoor positioning system based on UWB. Ind. Control. Comput. 35(1), 3 (2022). (In Chinese)
2. Zhou, Z., Zhu, X., Zeng, D.: Research on indoor wireless location technology. Ind. Control. Comput. 35(3), 3 (2022). (In Chinese)
3. Liu, C., Ni, X., Xia, L., et al.: Application of indoor personnel positioning system based on UWB. Internet of Things Technology (2022). (In Chinese)
4. Zhang, N., Fu, W.P., Meng, R., et al.: A review of indoor spatial positioning methods. Sci. Technol. Eng. 22(3), 11 (2022). (In Chinese)


5. Du, X., Zhu, W., Wen, X., et al.: UWB ultra-wideband indoor positioning technology. Digital Technology and Applications (2021). (In Chinese)
6. Dong, X.B.: Research and implementation of high-precision indoor positioning system based on UWB. Guilin University of Electronic Technology (2021). (In Chinese)
7. Yang, D., Tang, X., Li, B., et al.: Review of indoor location technology based on ultra-wideband. Global Positioning System 40(5), 7 (2015). (In Chinese)
8. Ling, W.: Design and implementation of regional positioning system based on UWB technology. Harbin Institute of Technology, Electronic and Communication Engineering (2021). (In Chinese)
9. Wu, Y.: Research and implementation of indoor location algorithm based on DOA and TDOA. Guangxi Normal University (2021). (In Chinese)
10. Yu, W., Qiu, Y., He, X.: Research and design of object location module based on UWB. Internet of Things Technology (2021). (In Chinese)
11. Maalek, R., Sadeghpour, F.: Accuracy assessment of ultra-wide band technology in locating dynamic resources in indoor scenarios. Autom. Constr. 63, 12–26 (2016)
12. Tabaa, M., Diou, C., Saadane, R., Dandache, A.: LOS/NLOS identification based on stable distribution feature extraction and SVM classifier for UWB on-body communications. Procedia Comput. Sci. 32, 882–887 (2014)

Chapter 23

Research on Cross Domain Data Analysis and Data Mining Technology of Power Grid Digital Benefits Gang Wang, Aidi Dong, Changhui Lv, Bo Zhao, and Jianhong Pan

Abstract The digital power grid system integrates flexible resources such as traditional power sources and distributed power sources and is characterized by high time variability and high complexity; it urgently needs cross domain data support. However, traditional evaluation methods cannot evaluate cross domain data, so the evaluation is neither targeted nor effective, and its practicality needs to be improved. This paper analyzes the requirements of cross domain intelligent construction and business services and the characteristics of power grid digital cross domain data, studies the internal coupling relationship between digital system application functions and cross domain data, and proposes a coupling evaluation model to measure the digital system and cross domain data. The process of cross domain data mining, cluster similarity measurement, historical data feature mining, and other technologies are studied to realize feature analysis of historical cross domain data. Fitting analysis technology is studied, and correlation analysis and regression analysis are used to achieve fitting analysis of cross domain data, so as to support the technical and economic analysis of cross domain data of the power grid digital system. Keywords Digital project · Technical and economic evaluation · Cross domain data analysis · Data mining technology

G. Wang (B) State Grid Smart Grid Research Institute CO., LTD, Nanjing 210003, China e-mail: [email protected] State Grid Key Laboratory of Information and Network Security, Nanjing 210003, China A. Dong · J. Pan State Grid Jilin Electric Power Company Limited, Changchun 130021, China C. Lv · B. Zhao Power Economic Research Institute of Jilin Electric Power Co, Ltd, Changchun 130011, China


23.1 Introduction At present, digital project evaluation is mainly based on manual subjective evaluation and qualitative analysis of the importance of project needs, focusing mainly on the digital system functions and expected results of the project. Traditional evaluation methods cannot specifically evaluate the requirements of digital business systems, resulting in poor evaluation results. Facing the demand for digital construction based on analysis of business system data characteristics, traditional evaluation methods can no longer meet the demand for cross domain digital construction evaluation [1, 2]. At the same time, the efficiency of traditional technical and economic methods is low and cannot meet the needs of large-scale digital construction, rapid development, and in-depth application [3, 4]. It is therefore necessary to study digital cross domain technical and economic data analysis and data mining technology.

23.2 Power Grid Cross Domain System Requirements The power grid digital system integrates flexible resources such as distributed power generation and has the characteristics of high time variability and high complexity. It needs fast and flexible operation decision-making technology to ensure safe and economic operation. However, the business links of power grid transmission, transformation, and distribution rely on cross domain data, and the high complexity and time variability of the digital power grid make the already limited cross domain data support even worse [5, 6]. In view of these problems, the key lies in mining the potential connections between data and improving the applicability of cross domain data, so as to resolve the contradiction between the time variability and complexity of each link of the power grid and the cross domain data support capability. As a new means of production, data resources act as basic resources and innovation engines and promote an overall improvement in business insight, decision-making power, and optimization power. In the information construction stage, the company's professional digital systems were built separately, and data resources were scattered and inconsistent, resulting in difficulties in cross-professional business collaboration and data sharing. In the digital construction stage, it is necessary to further extend the coverage of data resources such as data models to all business domains to achieve full coverage of core business. From the enterprise perspective, horizontal barriers among specialties should be thoroughly broken down to realize the sharing and reuse of business data across the whole area [7, 8]. The digital system and cross domain data are fundamentally coupled. A digital system is composed of the enterprise architecture, application architecture, data architecture, technology architecture, and other subsystems [9, 10]. There is mutual feedback between subsystems, and the coupling between subsystems is the core of the coupling of the whole system. Subsystem coupling is mainly accomplished


through the interaction and interconnection of coupling elements. The more interaction elements in the coupling, the more coupling nodes and the greater the coupling effect. However, because coupling interaction has a cost, the number of coupling elements is not infinite but has utility boundaries. Cross domain data is the core of the data architecture, while application functions are the core of the application architecture. The internal coupling relationship between the application functions of the digital system and cross domain data restricts the orderly and collaborative development of the whole digital system construction.

23.3 Research on the Relationship Between Cross Domain Data and Digital System Digital system construction generates a large amount of business data. As a new means of production, data resources act as basic resources and innovation engines and promote an overall improvement in business insight, decision-making power, and optimization power. The more comprehensive the application of each business digital system, the richer the business data generated, and the better it can reflect the business status and solve business needs. The core data are often cross domain data, which are used for business applications in other business domains. In addition to satisfying its own business, the construction of each business digital system also needs to provide cross domain data to support the company's overall digital construction. One of the central tasks of building a digital system around the data center is to build the data center well, and the data in the data center are often cross domain data coming from the various business digital systems. The digital system is based on the construction of the enterprise middle office, and all the data stored in the enterprise middle office are cross domain data. If a business digital system is not based on the cross domain data of the enterprise middle office, it will be unable to obtain cross domain data, or it will create multi-source data whose consistency is difficult to ensure. Moreover, during subsequent system operation, operation and maintenance personnel will have to enter data repeatedly, affecting the effectiveness of system construction. The internal coupling relationship between the digital system and cross domain data seriously restricts the effectiveness of system construction, so a coupling evaluation model must be built. The coupling evaluation model is A = B × C × D, where A is the coupling evaluation factor and B, C, and D are its key evaluation factors: B is the proportion of cross domain data used by the business system among all the data the business system uses; C is the proportion of cross domain data generated by the business system among all the data it generates; and D is the proportion of cross domain data generated by the core functions of the business system among all the data the business system generates.
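A minimal sketch of the coupling evaluation factor A = B × C × D computed from data counts; the function and parameter names are hypothetical.

```python
def coupling_factor(used_cross, used_total, gen_cross, gen_total, core_cross):
    """Coupling evaluation factor A = B * C * D (names are illustrative).
    B: share of cross domain data among all data the business system uses;
    C: share of cross domain data among all data it generates;
    D: share of cross domain data generated by its core functions among
    all data it generates."""
    B = used_cross / used_total
    C = gen_cross / gen_total
    D = core_cross / gen_total
    return B * C * D
```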


In this paper, the coupling evaluation model is mainly used to measure the digital system and its cross domain data, and the data are used for correlation analysis, regression analysis, lead-lag analysis, and other fitting analyses, so as to support the construction of a quantitative dynamic analysis model and realize analysis of the relationship between the digital system and cross domain data as well as technical and economic analysis of the project.

23.4 Cross Domain Data Mining Process The cross domain data mining process is shown in Fig. 23.1. (1) Based on the cross domain coupling evaluation results, determine appropriate mining tasks. (2) Collect and preprocess the original data; preprocessing filters typical data, completes missing data, cleans abnormal data, and converts data formats. (3) Select the appropriate data mining method (clustering, classification, association rules, time series, etc.) according to the definition of the cross domain data mining task, then select an algorithm suited to the mining task from the algorithms of that method and use it to mine. (4) For the results obtained from data mining, convert the information based on cross domain data analysis business rules and present the effective information to users. This paper uses many mining methods, such as statistics, information retrieval, machine learning, and pattern recognition, to mine technical and economic analysis data. According to the different scenarios of technical and economic analysis, such as digital basic capability, common service capability, operation support capability, power grid production, enterprise operation, customer service, industrial ecology, and government services, different methods are adopted, including classification, regression analysis, association rules, clustering, and change and deviation analysis.

23.4.1 Clustering Method Clustering analysis divides the original data set into different cross domain topics through certain calculations, so that cross domain data within the same topic have high similarity while data in different topics have low similarity. Common clustering analysis methods can be divided into the following categories.

Fig. 23.1 Process diagram of cross domain data mining


(1) Division method. Through continuous iteration, data objects with similar characteristics are placed in the same class and data objects with different characteristics in different classes. Each group must contain objects, and each object can belong to only one group. (2) Hierarchy method. The hierarchical method decomposes or merges the objects in the data set, producing a tree-like hierarchical structure. All data start in one cluster, which is then divided into smaller and smaller groups by some strategy, so that the similarity of the data within each group becomes higher and higher, until each group meets the termination condition or becomes a separate cluster. (3) Density-based clustering. Density-based cluster analysis considers each final cluster to be composed of sample points gathered together: densely distributed data objects form high-density areas, while the scattered data objects in the gaps between clusters form low-density areas. The algorithm separates the areas where data are scattered from the areas where data are concentrated, so as to find the target clusters. (4) Grid-based clustering. This kind of clustering divides the data set into several data units and clusters with the data unit as the smallest unit. In this way, only the number of grid cells matters, not the dataset itself. (5) Model-based clustering. This method combines the data set with a mathematical model, assumes a suitable mathematical model for each target cluster, and then assigns the data conforming to the characteristics of the model to the corresponding class. The mathematical model is generally a function based on a probability density distribution or one reflecting the correlation.

23.4.2 Cluster Similarity Measurement The results of clustering analysis depend on the measurement of the similarity of data objects, so when analyzing data objects it is often necessary to consider the similarity between cross domain systems; the corresponding similarity matrix can then be calculated from these similarities. According to the business characteristics, the following methods are adopted. The first is measurement with a distance function; common distance functions include (1) Euclidean distance, (2) Manhattan distance, and (3) Mahalanobis distance. The second is measurement with a similarity coefficient; common similarity coefficients include: (1) the cosine of the included angle, calculated with the usual formula and used as the similarity coefficient, and (2) the Jaccard coefficient, usually used to represent the similarity between sets.
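Minimal sketches of the distance functions and similarity coefficients listed above, assuming NumPy; Mahalanobis distance is omitted because it additionally requires the covariance matrix of the data.

```python
import numpy as np

def euclidean(x, y):
    return np.linalg.norm(x - y)

def manhattan(x, y):
    return np.abs(x - y).sum()

def cosine_sim(x, y):
    # cosine of the included angle between two feature vectors
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

def jaccard(a, b):
    # similarity between two sets: |intersection| / |union|
    return len(a & b) / len(a | b)
```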


23.4.3 Feature Mining of Historical Data Clustering analysis is a common and important data mining method. In this paper, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and other clustering analyses are used to mine historical data features. The main methods are as follows. As a typical density-based clustering algorithm, the basic idea of DBSCAN is to first scan the whole power grid cross domain business data set and then determine a central point according to knowledge of the power grid cross domain business field. A central point is a cross domain data object whose neighborhood contains a number of sampling points greater than or equal to a given lower limit. Starting from a cross domain core point, all data points related to it by density are found, and these data points are then used to expand the core according to the power grid cross domain service analysis rules. By traversing all central points near a power grid cross domain service central point, all related data points in the connected service system can be found until no further expansion is possible. After all data objects of the power grid business system have been scanned, the data points that have not been grouped are scanned again and the above process is repeated until all points are classified. After grouping terminates, any remaining ungrouped data points are noise. CURE algorithm. The CURE algorithm is a hierarchical clustering method that selects some typical business samples from the whole cross domain data set, classifies them into several categories, clusters them locally, and then clusters the whole data set on this basis. The algorithm flow is as follows: (1) a random sample is extracted from the cross domain data objects and divided into several sets according to the power grid cross domain business rules, each set being clustered as an independent data set; (2) outliers are isolated from the data sets according to business rules, and if a class is not updated and stays in its initial state, that cluster is removed; (3) the clustering results of the divided sets are re-integrated according to the business aggregation rules, yielding new classes; the cluster centers of the small sets fall into the new clusters and move closer to or shrink toward the cluster center according to the contraction factor; (4) the whole data set is clustered using the experience of business experts and business rules, and each data object is assigned to the cluster of the nearest representative point. K-means algorithm. The basic idea of this algorithm is to randomly select k data objects from the cross domain data set as initial cluster centers according to the needs of power grid cross domain business analysis, then calculate the distance or other similarity measures between the other data objects and the initial cluster centers according to the similarity measurement method and business domain knowledge,


and then assign the other cross domain business system data objects to the group most similar to their nearest initial cluster center. After allocation, the cluster center of each group (the mean of all data objects in the group) is recalculated, and the next iteration is carried out until certain termination conditions are met. The algorithm process is as follows: (1) k data objects are randomly selected from the cross domain data set, in combination with knowledge of the power grid business domain, and used as the initial cluster centers; (2) according to the defined mean calculation method, combined with power grid expert knowledge, the mean of each obtained cluster is calculated and regarded as the central object of the class; the distance from each data object to all central objects is then calculated according to the power grid business rules, and the remaining data objects are assigned to the cluster of the central object with the minimum distance; (3) the mean (central object) of each changed cluster is recalculated; (4) steps (2) to (3) are repeated until the division of the data objects no longer changes. The k-means algorithm has the following characteristics: it runs fast, has a relatively simple logic structure without repeated nested iteration, and, combined with knowledge of the power grid business field, is suitable for large volumes of power grid cross domain data. At the same time, the number of categories must be given in advance based on the experience of power grid experts, which affects the final result; it can only process numerical data; and the clustering result is strongly affected by the initial cluster centers, so different initializations may lead to different clustering effects.
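For illustration, the density-based clustering and k-means procedures described above can be run with scikit-learn (assumed here); CURE is not included in scikit-learn, and the parameter values below are placeholders rather than values from this study.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

X = np.random.rand(200, 4)        # placeholder for cross domain feature vectors

# Density-based clustering; eps and min_samples would come from
# domain knowledge in practice. The label -1 marks noise points.
db_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# K-means with the number of clusters fixed in advance, as described above.
km_labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)
```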

23.5 Fitting Analysis Technology The results of cross domain data analysis are affected by cross domain analysis requirements and business association factors. Correlation analysis and regression analysis based on power grid cross domain business rules are needed to analyze the correlations between different indicators. 1. Cross domain data correlation analysis can analyze the relationships between variables and their strength. The steps are as follows: (1) Before the correlation analysis, first understand the general relationship between historical cross domain data variables and the business rules of the source business systems through scatter plots. (2) Calculate the correlation coefficient. A scatter plot can show the relationship between power grid cross domain collaborative variables, but not precisely; the correlation coefficient must also be obtained through analysis of the relevant power grid business rules to accurately reflect the degree of correlation numerically.


Fig. 23.2 Step chart of cross domain data regression analysis

Based on the coupling evaluation model, appropriate correlation analysis indicators are selected according to the characteristics of the historical data; AHP, the entropy weight method, and cluster analysis are used to preliminarily screen the indicators to be processed and select candidate indicators; and data mining, text mining, machine learning, and big data analysis are used to analyze the correlations of the relevant indicators on the basis of historical reliable data. The Pearson correlation coefficients, Spearman's rank correlation coefficients, and Kendall's correlation coefficients of the relevant indicators are calculated in different analysis and evaluation business scenarios to build an indicator correlation model based on the business scenario and the company's degree of digitization. 2. Regression analysis technology for power grid cross domain data analysis. The digital system is built on business requirements, technology drivers, and other factors, so it is necessary to analyze which sub-business functions, technical requirements, and related influencing factors cause changes in the key indicators of the overall digital project and to what degree. The steps of cross domain data regression analysis are shown in Fig. 23.2: (1) according to the business rules of the power grid, combined with the cross domain theme needs, determine the existing data and the relationships between the independent and dependent variables, and preliminarily set the regression equation based on the power grid cross domain theme; (2) find reasonable regression coefficients; (3) use cross domain business data and correlation indicators for correlation testing to determine the correlation coefficients; (4) after the business rule requirements are met, determine the future status of the relevant indicators according to the obtained regression equation, the development history of the power grid, and its technical conditions, and calculate the confidence interval of the predicted value. Based on the coupling evaluation model, appropriate regression analysis indicators and regression equations are selected according to the characteristics of the historical data and optimized using AHP, the entropy weight method, and cluster analysis, and the existing data and relationships between the relevant independent and dependent variables are analyzed using data mining, text mining, machine learning, and big data analysis based on the business scenario, the company's degree of digitization, and historical credible data. The


regression equation based on the power grid cross domain analysis theme is preliminarily set, reasonable cross domain system regression coefficients are calculated, and cross domain system data and indicators are used for correlation testing to determine the power grid cross domain system indicator correlation coefficients, thereby constructing a regression analysis model oriented to the economic analysis of power grid digital technology.
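As an illustration of the correlation and regression computations named above, the following is a minimal sketch assuming SciPy; the data here are synthetic placeholders, not power grid indicators.

```python
import numpy as np
from scipy import stats

x = np.random.rand(100)                  # placeholder indicator series
y = 2 * x + np.random.normal(0, 0.1, 100)

pearson_r, _  = stats.pearsonr(x, y)     # linear correlation
spearman_r, _ = stats.spearmanr(x, y)    # rank correlation
kendall_t, _  = stats.kendalltau(x, y)   # ordinal association

# Simple linear regression fit of y on x.
slope, intercept, r, p, se = stats.linregress(x, y)
```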

23.6 Conclusion This paper analyzes the requirements of power grid cross domain intelligent construction and business services and the characteristics of power grid digital cross domain data, studies the internal coupling relationship between the application functions of the power grid digital system and cross domain data, and proposes a coupling evaluation model to measure the correlation between the power grid digital system and cross domain data. The process of power grid cross domain data mining, cluster similarity measurement, historical data feature mining, and other technologies are studied to realize power grid cross domain data feature analysis. Correlation analysis and regression analysis methods for power grid cross domain business scenarios are studied to realize cross domain data fitting analysis for the power grid digital system, so as to support its technical and economic analysis. Acknowledgements This work is supported by the Science and Technology Project of State Grid Corporation of China (Research on the evaluation system and technology tools of digital technology and economy of enterprises, No. 5700-202129180A-0-0-00).

References

1. Jun, W., Yongqiang, H.: Discussion on information architecture management of power enterprises. Power Inf. Commun. Technol. 14(6), 14–17 (2016). (In Chinese)
2. Bogner, E., Voelklein, T., Schroedel, O.: Study based analysis on the current digitalization degree in the manufacturing industry in Germany. Procedia CIRP 57, 14–19 (2016)
3. Xiang, M., Chenhan, S.: Practice of enterprise architecture method based on TOGAF. Digit. Commun. World 4, 194–196 (2017). (In Chinese)
4. Yoo, Y., Henfridsson, O., Lyytinen, K.: Research commentary—the new organizing logic of digital innovation: an agenda for information systems research. Inf. Syst. Res. 21(4), 724–735 (2010)
5. Tong, Y., Li, Y.: The evaluation of enterprise informatization performance based on AHP/GHE/DEA. In: Proceedings of the International Conference on Management (2007)
6. Paulus, R.D., Schatton, H., Bauernhansl, T.: Ecosystems, strategy and business models in the age of digitization—how the manufacturing industry is going to change its logic. Procedia CIRP 57, 8–13 (2016)
7. Bin, C., Xiaoyi, Y.: Research and practice of power enterprise information management based on enterprise architecture. Enterp. Manag. S2, 526–527 (2016). (In Chinese)


8. Jiachen, T.: Research on enterprise information architecture design of the company. Wuhan University of Technology (2016). (In Chinese)
9. Yongwei, L., Qian, S., Bo, L.: Enterprise information architecture design. China Sci. Technol. Inf. 18, 43–44 (2015). (In Chinese)
10. Chen, F., Ziyang, X.: Research and application of digital transformation technology in power enterprises. China New Commun. 22(04), 43–45 (2020). (In Chinese)

Chapter 24

Graph Convolutional Neural Networks for Drug Target Affinity Prediction in U-Shaped and Skip-Connection Architectures Jiale Chen, Xuelian Dong, and Zhongyuan Yang

Abstract It is common knowledge that traditional new medication development is a costly, drawn-out procedure with considerable safety uncertainty. Drug target affinity (DTA) prediction is an important step in drug discovery and drug research; if the accuracy of DTA prediction can be significantly improved, the cost of new drug design and development can potentially be reduced substantially, so DTA prediction is a very important topic. Precise and thorough characterization of drugs and proteins is the key to this task. With the advancement of deep learning, it has become popular for academics to integrate deep learning into DTA prediction in an effort to increase accuracy. Models such as DeepDTA, WideDTA, and GraphDTA are basically trained on drug-molecule information and protein-molecule information separately, do not make good use of their graph relationships or deep graph information, and, as model depth has increased over the years, results that should have been excellent have been held back by the difficulty of training. Inspired by Unet, this paper proposes a new method, UGraphDTA, which uses a new U-shaped architecture and a skip-connection architecture to enable the model to understand deeper graph information. The novelty of this method is the use of skip connections in the convolutional network, which allows the model to utilize both the original molecular graph structure information and the information after graph convolution, enhancing the model's prediction ability on DTA tasks. The prediction performance of UGraphDTA is empirically shown to be better than that of other baseline models, which indicates that our proposed U-shaped convolutional architecture for drug target affinity prediction, which mines the deep information of drugs and proteins, is effective.

J. Chen (B) · X. Dong · Z. Yang School of Computer Science, University of South China, Hengyang, China e-mail: [email protected] X. Dong e-mail: [email protected] Z. Yang e-mail: [email protected]


Keywords GCN · Unet · Deep learning · Drug repurposing · Drug-target binding affinity

24.1 Introduction Drug discovery with conventional chemical procedures is a time- and money-consuming undertaking. According to currently available data, creating a new medication can cost billions of dollars, and it takes the FDA in the United States 10–15 years to approve a product before it can be sold on the market [1]. The drug development process can be sped up by drug target affinity (DTA) prediction, a crucial step in drug discovery [2]. Before machine learning helped with drug development, scientists used computational techniques such as molecular dynamics simulations and molecular docking to predict DTA, but these techniques are inefficient and expensive to run. Thanks to current advances in computer technology, and in particular the application of artificial intelligence in biological research, we have an opportunity to make the drug discovery process faster by significantly shrinking the space that must be searched. DTA prediction with machine learning offers information on the potency of drug binding to target proteins, and this knowledge can be used to determine whether a drug molecule can attach to a protein. Once we have knowledge of the structure and binding sites of the proteins, we can use molecular-simulation tools and molecular docking to create thorough simulations and obtain deeply accurate results. Currently, most of the latest drug-target affinity prediction algorithms are founded on machine learning. In more detail, given any drug-target pair, machine learning methods extract representations of the drug and the target separately and combine them for prediction. For example, Cheng et al. [3] extract drug substructures as drug features and protein structural-domain information as protein features, then compute the tensor product of the drug and protein features as the features of the drug-protein interaction pair and feed them into a biased SVM for classification training. KronRLS [4] and SimBoost [5] are both machine learning-based models for predicting drug-protein affinity. Deep learning models have supplanted machine learning models as the method of choice for DTA prediction in recent years; they predict outcomes more accurately than the machine learning methods mentioned above. Drugs are represented as SMILES strings and proteins by sequences in the two benchmark datasets used by deep learning models such as DeepDTA [6], which is well known for drug-target affinity (DTA) prediction. In DeepDTA, two convolutional networks were created for the representations of molecules and proteins, and the benchmark tests produced promising results. Based on DeepDTA, WideDTA [7] was further improved by introducing the Ligand Maximum Common Substructure (LMCS) and Protein Domains and Motifs (PDM) and encoding them


into four representations using four CNNs. By creating a graph in which atoms are the nodes and bonds are the edges, GraphDTA [8] brought graph neural networks (GNNs) [9] into DTA prediction and used this method to describe drug compounds; the protein sequence representation is extracted with CNNs, and the GNN model operates on the molecular graph to improve the DTA prediction performance. With deep learning, the basic goal is to deepen the model to obtain superior results [10]. The network, however, degrades as the number of layers rises: the training loss typically declines progressively and eventually tends to saturate, but as the network depth increases further, the training loss instead rises, i.e., the network degenerates. As a result, in order to improve DTA prediction, the network degradation that arises in the DTA task must also be addressed. Most currently available methods use graph convolutional neural networks for information extraction, and as the graph gets larger, training becomes very difficult. The proposal of Unet [11] significantly increased the prediction accuracy of semantic segmentation in image segmentation. Unet uses skip connections and four upsampling stages, ensuring that the final recovered feature map contains more low-level features and allowing the fusion of features at different scales; thanks to the four upsampling stages, the segmentation map can recover fine details such as edges. Although researchers have started to combine this idea with graph neural networks, so far no one has used U-shaped architectures in graph convolutional neural networks. In human motion prediction, Lingwei Dang et al. propose the Multi-Scale Residual Graph Convolution Network (MSRGCN) [12]. They use a module named Descending and Ascending GCN blocks (GCBlock) to extract features at four scales; the structure inside this GCBlock is actually the residual block [13] commonly used in CNNs. Using the residual structure complements the network, making it easier to train and faster to converge. In this study, we suggest a novel approach called UGraphDTA inspired by Unet. Two graphs, of drug molecules and of proteins, are generated and trained separately, and the technique builds models using structural data from the compounds and proteins. A drug-target pair is the input, and the output is a measure of the pair's binding affinity, which reduces the DTA prediction problem to a regression task. Our approach differs in that we change the architecture of the graph convolution network for drugs to a U-shape while retaining graph convolution, and we add a skip-connection architecture so that the model can directly capture the original structural information of drugs and proteins together with the neighborhood information obtained through graph convolution. The method uses a new U-shaped architecture that allows the model to understand deeper graph information. The other innovation of the method is the use of skip connections in the convolutional network, which allows the model to utilize both the original molecular graph structure information and the information obtained by convolving the graph,


enhancing the model’s predictive power for DTA tasks and increasing the training speed, allowing for faster model fitting. For performance assessment, we make use of DeepDTA’s suggested benchmark dataset. The Davis and KIBA datasets are used in our benchmark test. The experiments show that the prediction performance of UGraphDTA outperforms other baseline models, which indicates that our proposed modeling approach can effectively mine deep drug-protein information and provide good drug-target affinity prediction results.

24.2 Datasets and Methods

24.2.1 Datasets

The two most popular datasets for the DTA problem, Davis [14] and KIBA [15], which were proposed by DeepDTA and utilized for performance evaluation, were employed in our experiments. In the Davis dataset, the degree of drug-protein interaction is expressed by the dissociation constant K_d. Since the distribution range of K_d values is too large, this paper first converts K_d to logarithmic space, obtaining pK_d values ranging from 5.0 to 10.8 via Eq. (24.1):

$$pK_d = -\log_{10}\left(\frac{K_d}{10^{9}}\right) \quad (24.1)$$
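As a quick numeric check of Eq. (24.1), the conversion can be sketched in a few lines of Python (kd_to_pkd is a hypothetical helper of ours, not the authors' code):

```python
# A minimal sketch of Eq. (24.1): convert a Davis dissociation constant
# (given in nM) into logarithmic pK_d space. The helper name is our own.
import numpy as np

def kd_to_pkd(kd_nm):
    return -np.log10(kd_nm / 1e9)

print(kd_to_pkd(10_000.0))  # -> 5.0, the lower end of the reported Davis range
```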

The Davis dataset contains 72 compounds and 442 proteins, along with their corresponding affinity values; the average length of a compound's SMILES [16] string is 64. The KIBA dataset contains combined kinase inhibitor bioactivities from different sources, such as K_i, K_d, and IC_50; these bioactivities are processed into the KIBA score, which is what we use in training and prediction. The KIBA dataset contains 229 proteins and 2116 compounds, with 118,254 drug-target affinity values ranging from 0.0 to 17.2. The average length of a SMILES string for KIBA compounds is 58, and the average length of an amino acid sequence is 728. Information on the datasets used in this paper is summarized in Table 24.1.

Table 24.1 Selected data from the Davis and KIBA datasets

| Datasets | Compound | Protein | Affinity |
|----------|----------|---------|----------|
| Davis    | 72       | 442     | 30,056   |
| KIBA     | 2116     | 229     | 118,254  |


24.2.2 Drug Molecule Representation

The currently popular approach represents drug molecules as SMILES strings, which we process into graphs using RDKit [17]. As graph neural network (GNN) theory develops rapidly, an increasing number of graph-based methods are being applied to chemistry, and representing drug molecules as graphs has become natural: the atoms of a molecule serve as nodes and the chemical bonds as edges, so each molecule is easily converted into a graph. To describe each node (atom), this paper uses the atomic features designed in DeepChem [18] as node features. Five types of information make up a node's features: the atom symbol, the number of adjacent atoms, the number of adjacent hydrogens, the implicit valence of the atom, and whether or not the atom is part of an aromatic structure. All five feature types are one-hot encoded, producing a 78-dimensional binary feature vector per node. We then use RDKit to convert each SMILES string into an undirected graph G = {V, E}, where V is the set of vertices (the atoms in the drug molecule) and E is the set of edges describing the relationships between atoms. We use the link adjacency matrix C ∈ R^{n×n}, the atomic feature matrix N ∈ R^{n×l}, and the bond feature matrix B ∈ R^{m×k} to represent the graph of one drug molecule, where n denotes the number of atoms, m the number of edges, l the dimension of the atomic features, and k the dimension of the edge features. This final input is fed into the model for training.
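A hedged sketch of this featurization (not the authors' exact code; the one-hot vocabularies below are illustrative placeholders, so the feature width differs from the 78 dimensions above) could look as follows with RDKit:

```python
# A minimal sketch: convert a SMILES string into node features and an edge
# list, with atoms as nodes and bonds as edges. Vocabularies are placeholders.
import numpy as np
from rdkit import Chem

ATOM_SYMBOLS = ['C', 'N', 'O', 'S', 'F', 'Cl', 'Br', 'I', 'P', 'Other']

def one_hot(value, choices):
    vec = [int(value == c) for c in choices]
    if sum(vec) == 0:          # unseen values fall into the last bucket
        vec[-1] = 1
    return vec

def smiles_to_graph(smiles):
    mol = Chem.MolFromSmiles(smiles)
    # Node features: symbol, degree, attached Hs, implicit valence, aromaticity
    features = []
    for atom in mol.GetAtoms():
        f = one_hot(atom.GetSymbol(), ATOM_SYMBOLS)
        f += one_hot(atom.GetDegree(), list(range(11)))
        f += one_hot(atom.GetTotalNumHs(), list(range(11)))
        f += one_hot(atom.GetImplicitValence(), list(range(11)))
        f += [int(atom.GetIsAromatic())]
        features.append(f)
    # Undirected graph: one pair of directed edges per chemical bond
    edges = []
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        edges += [(i, j), (j, i)]
    return np.array(features), np.array(edges).T  # (n, d) and (2, m)

x, edge_index = smiles_to_graph('CC(=O)Oc1ccccc1C(=O)O')  # aspirin
```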

24.2.3 Protein Representation

Protein molecules are currently represented by their sequences. As in DgraphDTA [19], since a protein molecule has on the order of a few hundred residues, it can be represented by constructing a graph in which the residues are the nodes. The chain of links between consecutive residues alone carries no spatial information, so a contact map is used in this work. A contact map is a representation of protein structure, a 2D encoding of 3D protein structural information, and it can be obtained as the output of protein structure prediction. Moreover, the contact map output, which is usually a matrix, has exactly the form of the adjacency matrix required for graph neural network training; using a contact map is therefore an excellent way to join the two kinds of information together.


In this paper, we use Pconsc4 [20] to generate contact maps for the proteins. Pconsc4 is a fast, simple library for contact map prediction, and its performance is comparable to currently available methods.
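Assuming the predictor returns an L × L matrix of contact probabilities for a protein of L residues (an assumption of ours about the output format, not a quote from the Pconsc4 documentation), turning it into the adjacency matrix used for GNN training is a one-step thresholding:

```python
# A sketch under the stated assumption about the contact-map output format.
import numpy as np

def contact_map_to_adjacency(prob_map, threshold=0.5):
    """Binarize a predicted contact map into a graph adjacency matrix."""
    adj = (prob_map >= threshold).astype(np.float32)
    np.fill_diagonal(adj, 1.0)   # self-loops so each residue sees itself
    return adj

prob_map = np.random.rand(120, 120)        # placeholder prediction
prob_map = (prob_map + prob_map.T) / 2     # contact maps are symmetric
adj = contact_map_to_adjacency(prob_map)
```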

24.2.4 Model Architecture

In this study, we present a novel convolutional neural network with a U-shaped architecture for predicting drug-target affinity, which regresses the DTA values by integrating the raw and deep information of the molecular graphs; the overall architecture is shown in Fig. 24.1. Drug molecules are represented by SMILES strings, which we convert to graphs with RDKit; for proteins, the contact map converts the protein sequence into a graph. These two graphs are then input into their respective channels for feature extraction. The features extracted by the two channels pass through a flatten layer and are finally concatenated to obtain the prediction. Our proposed model, UGraphDTA, thus uses two different channels for the data. As mentioned above, both drug molecules and proteins are represented as graphs, but differently: drug molecules use the RDKit-processed molecular graphs, while proteins use contact maps. Note that all the convolutions in Fig. 24.2 refer specifically to graph convolutions. The drug graph is input to a 1x graph convolutional neural network layer. On the left-hand side (the encoder), the output of each graph convolution layer is fed to the next layer for convolution; on the right-hand side (the decoder), the output of each layer is fed to the previous-scale layer. The output of each encoder layer is also passed through a skip connection to the decoder layer of the same scale, where an add or concat operation is performed.

Fig. 24.1 The architecture of UGraphDTA


Fig. 24.2 The architecture of the U-shaped channel

U-Shaped Architecture Channel. A 4-layer graph convolutional neural network is utilized for drug graph encoding and a 3-layer graph convolutional neural network for drug graph decoding, with skip connections in between for information transfer. The U-shaped channel architecture can be viewed in Fig. 24.2. The encoder has 4 layers of graph convolution: the first layer convolves the features at scale 1x without expanding them, the second layer scales the features to 2x, the third to 4x, and the fourth to 6x. Each layer's convolution output is passed through a ReLU [21] activation function. GCN is a method for semi-supervised learning on graph-structured data; in the encoders of this paper, each layer performs the convolution

$$g\left(H_l, A\right) = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H_l W_{l+1} \quad (24.2)$$

$$H_e^{l+1} = f\left(H_e^{l}, A\right) = \sigma\left(g\left(H_e^{l}, A\right)\right) \quad (24.3)$$

where Ã = A + I_N is the adjacency matrix with added self-connections, A is the adjacency matrix of the graph with shape (n, n), I_N is the identity matrix, D̃ is the degree matrix of Ã, and σ(·) denotes the activation function (ReLU throughout this paper). W_l denotes the trainable weight matrix of layer l, and H_e^l is the input of encoder layer l; in particular, H_e^0 = X. In the decoder, we use a 3-layer graph convolutional neural network: the first layer reduces the features to 4x, the second to 2x, and the third to 1x, and each layer's output is summed with the output of the corresponding encoder layer. For example, the output of the first layer of the decoder


is added to the output of the third layer of the encoder and then sent to the second layer of the decoder for convolution:

$$H_d^{l+1} = \sigma\left(g\left(H_d^{l}, A\right) + H_e^{l}\right) \quad (24.4)$$

where H_d^l denotes the input of decoder layer l; in particular, H_d^0 = H_e^4, and the H_e^l term is set to 0 (i.e., not used) when computing H_d^1 (Fig. 24.3).

Fig. 24.3 The architecture of the skip connection channel

Skip Connection Channel. In the channel for extracting protein features, we use a three-layer graph convolutional neural network with convolutional feature sizes of 1x, 2x, and 4x, respectively. After the protein graph is input to this channel, feature extraction is performed by graph convolution layers of different sizes in turn, and the output of the last layer is combined with the output of the first layer to obtain the final result.

Flatten Layer. This layer unifies the output dimensions of the different channels into 128-dimensional feature vectors. It consists of a multi-layer perceptron (MLP) [22] with ReLU activation and Dropout [23]. After each of the two channels is processed by its flatten layer, both output 128-dimensional feature vectors, which are fused and fed into a three-layer fully connected block. The fully connected layers regress the DTA value; to avoid overfitting, we also add two Dropout layers in the middle with a Dropout coefficient of 0.2.
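As an illustration only (not the authors' released code), the two channels could be sketched in PyTorch Geometric roughly as follows; the layer widths, pooling choice, and class names are our assumptions, mirroring the 1x/2x/4x/6x scaling described above:

```python
# A hedged sketch of the two channels of UGraphDTA; GCNConv implements the
# propagation rule of Eq. (24.2) followed by the activation of Eq. (24.3).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class UShapedDrugChannel(torch.nn.Module):
    """4-layer GCN encoder + 3-layer GCN decoder with additive skips (Eq. 24.4)."""
    def __init__(self, d):
        super().__init__()
        self.enc = torch.nn.ModuleList([
            GCNConv(d, d), GCNConv(d, 2 * d),
            GCNConv(2 * d, 4 * d), GCNConv(4 * d, 6 * d)])
        self.dec = torch.nn.ModuleList([
            GCNConv(6 * d, 4 * d), GCNConv(4 * d, 2 * d), GCNConv(2 * d, d)])

    def forward(self, x, edge_index, batch):
        skips = []
        for conv in self.enc:
            x = F.relu(conv(x, edge_index))
            skips.append(x)
        # Decoder layer l adds the encoder output of the matching scale
        for l, conv in enumerate(self.dec):
            x = F.relu(conv(x, edge_index)) + skips[2 - l]
        return global_mean_pool(x, batch)   # one vector per molecule

class SkipProteinChannel(torch.nn.Module):
    """3-layer GCN whose last output is combined with the first (skip connection)."""
    def __init__(self, d):
        super().__init__()
        self.conv1, self.conv2 = GCNConv(d, d), GCNConv(d, 2 * d)
        self.conv3 = GCNConv(2 * d, d)  # back to d so the skip can be added

    def forward(self, x, edge_index, batch):
        h1 = F.relu(self.conv1(x, edge_index))
        h2 = F.relu(self.conv2(h1, edge_index))
        h3 = F.relu(self.conv3(h2, edge_index))
        return global_mean_pool(h3 + h1, batch)
```

In this sketch, the two pooled vectors would each pass through a flatten layer to 128 dimensions and be concatenated before the fully connected regression head described above.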


24.3 Experiment and Result

UGraphDTA is implemented based on PyTorch [24], a popular open-source Python deep learning framework, and the GCN in UGraphDTA is implemented using PyTorch Geometric. All baseline methods were run on our machines using their open-source code, and all model evaluation experiments used their original metrics. In general, almost every model reports MSE and CI scores; some earlier models are missing the corresponding r_m^2 scores. In our experiments, the dataset is shuffled and randomly divided into 5 parts, 4 of which are used as the training set and the remainder as the validation set. After the model is trained on the training set, it is immediately validated on the validation set, and we record the optimal results.
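The split described above can be sketched with scikit-learn (an assumption of ours about tooling; the authors' exact split procedure is not specified):

```python
# A minimal sketch of a shuffled 5-fold train/validation split over pair indices.
import numpy as np
from sklearn.model_selection import KFold

pairs = np.arange(30056)   # e.g., indices of the Davis drug-target pairs
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, valid_idx) in enumerate(kf.split(pairs)):
    print(f"fold {fold}: {len(train_idx)} train / {len(valid_idx)} valid")
```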

24.3.1 Evaluation

In this paper, we develop a regression model for predicting drug-target affinity and evaluate the DTA task using two metrics applicable to regression: the concordance index (CI) and the mean squared error (MSE). The concordance index is formulated as Eq. (24.5):

$$CI = \frac{1}{C} \sum_{b_i > b_j} h\left(y_i - y_j\right) \quad (24.5)$$

where y_i denotes the predicted value of the sample with the larger affinity value b_i, y_j denotes the predicted value of the sample with the smaller affinity value b_j, C is a normalizing constant, and h(x) is the step function

$$h(x) = \begin{cases} 1, & x > 0 \\ 0.5, & x = 0 \\ 0, & x < 0 \end{cases} \quad (24.6)$$

The mean squared error reflects the average degree of difference between the predicted and labeled values, with b_i and y_i denoting the labeled and model-output (predicted) values and N the number of samples:

$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left(b_i - y_i\right)^2 \quad (24.7)$$

Table 24.2 The hyperparameters used in training the UGraphDTA model

| Hyperparameters | Settings |
|-----------------|----------|
| Epoch           | 2000     |
| Batch size      | 200      |
| Optimizer       | Adam     |
| Learning rate   | 0.005    |
| U-net layers    | 4        |

Moreover, there is another measurement proposed in DeepDTA, the r_m^2 index, calculated as Eq. (24.8):

$$r_m^2 = r^2 \times \left(1 - \sqrt{r^2 - r_0^2}\right) \quad (24.8)$$

where r^2 and r_0^2 denote the squared correlation coefficient with intercept and the squared correlation coefficient without intercept, respectively.
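For concreteness, hedged reference implementations of the three metrics of Eqs. (24.5)-(24.8) might look as follows; the O(N^2) CI loop favors clarity over speed, and the no-intercept r_0^2 below uses one common formulation (an assumption, since the source does not spell it out):

```python
# Sketches of CI (Eqs. 24.5-24.6), MSE (Eq. 24.7), and r_m^2 (Eq. 24.8).
import numpy as np

def concordance_index(b, y):
    """Average step-function score over all pairs with b_i > b_j."""
    num, denom = 0.0, 0.0
    for i in range(len(b)):
        for j in range(len(b)):
            if b[i] > b[j]:
                denom += 1.0
                diff = y[i] - y[j]
                num += 1.0 if diff > 0 else (0.5 if diff == 0 else 0.0)
    return num / denom

def mse(b, y):
    return float(np.mean((np.asarray(b) - np.asarray(y)) ** 2))

def rm2(b, y):
    """r_m^2 = r^2 * (1 - sqrt(|r^2 - r0^2|)); r0^2 from a no-intercept fit."""
    b, y = np.asarray(b, float), np.asarray(y, float)
    r2 = np.corrcoef(b, y)[0, 1] ** 2
    k = np.sum(b * y) / np.sum(y ** 2)                 # no-intercept slope
    r02 = 1 - np.sum((b - k * y) ** 2) / np.sum((b - b.mean()) ** 2)
    return r2 * (1 - np.sqrt(np.abs(r2 - r02)))
```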

24.3.2 Baseline Models

To evaluate the model proposed in this paper, we compare it with six classical state-of-the-art models on the two datasets, Davis and KIBA: KronRLS, SimBoost, DeepDTA, WideDTA, GraphDTA, and DgraphDTA.

24.3.3 Setting of the Hyperparameters

Training a model requires many hyperparameter settings, and UGraphDTA is no exception. Since training is very time-consuming, some parameters were set from experience to mitigate the time cost. The hyperparameter settings used in this experiment are shown in Table 24.2.
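A minimal sketch wiring the Table 24.2 settings into PyTorch (the model object below is a stand-in placeholder of ours, not the actual UGraphDTA network):

```python
# Training configuration per Table 24.2; data loading and the real model
# construction are omitted.
import torch

model = torch.nn.Linear(256, 1)   # placeholder for the UGraphDTA network
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
loss_fn = torch.nn.MSELoss()
EPOCHS, BATCH_SIZE = 2000, 200
```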

24.3.4 Variants

Our experiments include several variants. M0 is the original method, based on DgraphDTA. M1 uses only the Skip Connection Architecture, applied in both channels. M2 uses the U-shaped architecture in both channels. M3 uses only the U-shaped architecture (in the drug channel), and M4 uses only the Skip Connection Architecture (in the protein channel). The last variant, abbreviated M5, uses the U-shaped architecture in the drug molecule channel and the Skip Connection Architecture in the protein channel. The experimental results for each variant are shown in Table 24.3.


Table 24.3 The scores of the different variants on the Davis test set

| Method | Drug architecture | Protein architecture | MSE | CI | r_m^2 |
|--------|-------------------|----------------------|-------|-------|-------|
| M0 | — | — | 0.224 | 0.882 | 0.635 |
| M1 | Skip connection | Skip connection | 0.229 | 0.888 | 0.680 |
| M2 | U-shaped architecture | U-shaped architecture | 0.250 | 0.884 | 0.651 |
| M3 | U-shaped architecture | — | 0.228 | 0.884 | 0.695 |
| M4 | — | Skip connection | 0.227 | 0.886 | 0.684 |
| M5 | U-shaped architecture | Skip connection | 0.208 | 0.894 | 0.706 |


24.3.5 Results and Analysis

The MSE, CI, and r_m^2 scores on the independent validation sets of the Davis and KIBA datasets are shown in Tables 24.4 and 24.5. These results show that the U-shaped architecture has reached the state-of-the-art level. On the Davis dataset, as shown in Table 24.3, the model numbered M5, which uses the U-shaped architecture for the drug molecule channel and the skip connection architecture for the protein channel, obtained an MSE of 0.208, a CI of 0.894, and an r_m^2 of 0.706, making it the best model in the experiment. To determine the contribution of each architecture in the new model, we analyze them separately. The CI and r_m^2 of M3 are significantly better than those of M0, where no new architecture is used, and the CI and r_m^2 of M4 are slightly better than those of M0. From this we conclude that the proposed U-shaped architecture is effective for training the drug molecule channel, while the proposed Skip Connection Architecture is effective for training the protein channel.

Table 24.4 The MSE, CI, and r_m^2 scores on the Davis test set, sorted by MSE

| Method | Proteins and compounds | MSE | CI | r_m^2 |
|-----------|------------------------|-------|-------|-------|
| DeepDTA | S-W & CNN | 0.420 | 0.886 | — |
| KronRLS | S-W & Pubchem Sim | 0.379 | 0.871 | — |
| SimBoost | S-W & Pubchem Sim | 0.282 | 0.872 | — |
| WideDTA | PS + PDM & LS + LSCS | 0.262 | 0.886 | — |
| DgraphDTA | GCN + GCN | 0.224 | 0.890 | 0.635 |
| GraphDTA | GIN + CNN | 0.223 | 0.892 | — |
| UGraphDTA | M5 | 0.208 | 0.894 | 0.706 |


Table 24.5 The MSE, CI, and r_m^2 scores on the KIBA test set, sorted by MSE

| Method | Proteins and compounds | MSE | CI | r_m^2 |
|-----------|------------------------|-------|-------|-------|
| KronRLS | S-W & Pubchem Sim | 0.411 | 0.782 | — |
| SimBoost | S-W & Pubchem Sim | 0.222 | 0.836 | — |
| DeepDTA | S-W & CNN | 0.204 | 0.854 | — |
| WideDTA | PS + PDM & LS + LSCS | 0.179 | 0.875 | — |
| DgraphDTA | GCN + GCN | 0.141 | 0.895 | 0.767 |
| GraphDTA | GIN + CNN | 0.139 | 0.891 | — |
| UGraphDTA | M5 | 0.134 | 0.897 | 0.776 |

The proposed U-shaped architecture is also effective for training the protein channel. M5, which uses the proposed architectures in both channels, shows an even more significant improvement: its MSE, CI, and r_m^2 scores are all much better than M0's. Finally, the best-performing variant is compared with the baseline models on the two datasets. Table 24.4 shows the scores of each model on the independent test set of the Davis dataset: the proposed model improves significantly over the two machine learning models KronRLS and SimBoost, is also better than DeepDTA and WideDTA, and further improves on GraphDTA and DgraphDTA. On the KIBA dataset, the results in Table 24.5 show that the proposed model improves very significantly in CI and MSE scores compared to the baseline models. These comparisons show that the proposed model extracts the structural information of drugs and proteins more effectively and predicts their affinity better.

24.4 Conclusion

In this paper, a new model structure is proposed in which drug molecules and proteins are each represented as graphs for training, with a different architecture for each graph. On the one hand, we use a U-shaped architecture for the drug molecule channel, which accelerates model fitting by using the original graph information while learning deep graph features. On the other hand, for the protein channel we use a skip connection architecture, which substantially reduces the complexity and learning difficulty of that channel. Overall, the proposed model combines a U-shaped architecture with skip connections and achieves better results than the baseline models.


References

1. Ashburn, T.T., Thor, K.B.: Drug repositioning: identifying and developing new uses for existing drugs. Nat. Rev. Drug Discovery 3(8), 673–683 (2004)
2. Parenti, M.D., Rastelli, G.: Advances and applications of binding affinity prediction methods in drug discovery. Biotechnol. Adv. 30(1), 244–250 (2012)
3. Cheng, Z.Z., Zhou, S.G., Wang, Y., et al.: Effectively identifying compound-protein interactions by learning from positive and unlabeled examples. IEEE/ACM Trans. Comput. Biol. Bioinform. 15(6), 1832–1843 (2018)
4. Pahikkala, T., et al.: Toward more realistic drug–target interaction predictions. Brief. Bioinform. 16(2), 325–337 (2015)
5. He, T., Heidemeyer, M., Ban, F., Cherkasov, A., Ester, M.: SimBoost: a read-across approach for predicting drug–target binding affinities using gradient boosting machines. J. Cheminformatics 9(1) (2017)
6. Öztürk, H., Özgür, A., Ozkirimli, E.: DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 34(17), i821–i829 (2018)
7. Öztürk, H., Ozkirimli, E., Özgür, A.: WideDTA: prediction of drug–target binding affinity. arXiv:1902.04166 (2019)
8. Nguyen, T., Le, H., Quinn, T.P., Nguyen, T., Le, T.D., Venkatesh, S.: GraphDTA: predicting drug–target binding affinity with graph neural networks. Bioinformatics (2020)
9. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv:1609.02907 (2017)
10. Ghosh, S., Ghosh, S.: Exploring the ideal depth of neural network when predicting question deletion on community question answering. arXiv:1912.03585 (2019). https://arxiv.org/abs/1912.03585. Last accessed 24 Aug 2022
11. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. arXiv.org (2015)
12. Dang, L., Nie, Y., Long, C., Zhang, Q., Li, G.: MSR-GCN: multi-scale residual graph convolution networks for human motion prediction. arXiv:2108.07152 (2021). https://arxiv.org/abs/2108.07152. Last accessed 22 Aug 2022
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv.org (2015)
14. Pahikkala, T., Airola, A., Pietilä, S., Shakyawar, S., Szwajda, A., Tang, J., Aittokallio, T.: Brief. Bioinf. 16, 325–337 (2014). https://tdcommons.ai/multi_pred_tasks/dti/#davis. Last accessed 22 Aug 2022
15. He, T., Heidemeyer, M., Ban, F., Cherkasov, A., Ester, M.: J. Cheminf. 9, 24 (2017). https://tdcommons.ai/multi_pred_tasks/dti/#kiba. Last accessed 22 Aug 2022
16. Weininger, D.: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Model. 28(1), 31–36 (1988)
17. Landrum, G.: RDKit: open-source cheminformatics (2006)
18. DeepChem: GitHub (2022). https://github.com/deepchem/deepchem. Last accessed 24 Aug 2022
19. Jiang, M., et al.: Drug–target affinity prediction using graph neural network and contact maps. RSC Adv. 10(35), 20701–20712 (2020)
20. PconsC4: GitHub (2022). https://github.com/ElofssonLab/PconsC4. Last accessed 24 Aug 2022
21. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. Proceedings.mlr.press (2011)
22. Rosenblatt, F.: Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, Washington DC (1961)
23. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(56), 1929–1958 (2014)
24. pytorch/pytorch: GitHub (2021). https://github.com/pytorch/pytorch

Chapter 25

SOC Estimation of Lithium Titanate Battery Based on Variable Temperature Equivalent Model

Chao Song, Jianhua Luo, Xi Chen, and Zhizhao Peng
Army Academy of Armored Forces, Beijing, China
e-mail: [email protected]

Abstract To ensure normal service life and battery safety, accurate estimation of the state of charge (SOC) of lithium titanate batteries is of great significance. To further improve the accuracy of SOC estimation, an equivalent circuit model considering temperature factors is established according to the external properties of the lithium titanate battery at various ambient temperatures. Then, based on the proposed variable temperature equivalent model, the battery SOC is estimated using the Cubature Kalman Filter (CKF) algorithm. Finally, the SOC estimates are compared with the true values, and the maximum estimation error is found to be within ±2% under various temperature conditions. According to the test results, the suggested SOC estimation algorithm based on the variable temperature equivalent model has good temperature adaptability and high estimation accuracy.

Keywords Lithium titanate battery · Variable temperature equivalent model · Cubature Kalman filter · SOC estimation

25.1 Introduction

At present, accurate estimation of the state of charge (SOC) is one of the main technologies of the battery management system. Accurate SOC information helps protect batteries, prevent overcharge and overdischarge, and improve battery utilization. However, unlike current, voltage, and temperature, SOC cannot be measured directly or indirectly through sensors; as a state variable that is not directly observable, it requires an appropriate algorithm to estimate it in real time. Available SOC estimation algorithms are roughly classified into equivalent model methods and pure data-driven methods.


The equivalent model method is generally based on establishing an appropriate equivalent model and estimating SOC in combination with the open circuit voltage approach; the pure data-driven approach typically involves extensive measurement data from which SOC is predicted. The ampere-hour integration approach [1] is direct and simple, but uncertainty in the initial state and the accumulation of current noise cause the SOC estimate to drift, so it is usually used as the experimental reference value of SOC. Both the open circuit voltage approach [2] and the internal resistance approach [3] depend on a functional relation established between the battery SOC and static battery parameters, which reflects the SOC indirectly. However, measuring the open circuit voltage requires the electrolyte to be evenly distributed in the battery, so the battery must stand for a long period; similarly, it is very difficult to measure the internal resistance of a battery in real time. In fact, neither of these two approaches enables real-time estimation of SOC as a dynamic variable. Pure data-driven methods, for example support vector regression [4], fuzzy logic reasoning [5], and artificial neural networks [6], establish a nonlinear relation between battery input and output from massive training data; this places high demands on data measurement and computing hardware, which currently makes them challenging to apply in engineering practice. The equivalent model method relies on a high-precision battery model and can correct the initial SOC error. The advantage of the particle filter (PF) is that, being probability-based, it is well suited to state estimation of nonlinear systems and is not constrained by a Gaussian noise model; however, the large number of particles brings higher computational cost. The improved Kalman filter (KF) [7] can find the optimal estimate for nonlinear Gaussian systems. In reference [8], combined with the nonlinear relation between open circuit voltage and SOC, the Extended Kalman Filter (EKF) algorithm was employed to estimate SOC; the Double Extended Kalman Filter and the Unscented Kalman Filter (UKF) [9] have also been widely researched for battery SOC estimation. Most of the existing research, however, concentrates on SOC estimation in specific scenarios and experimental conditions. This paper focuses on the effect of temperature on the model of lithium titanate batteries to further improve the accuracy of SOC estimation. A variable temperature equivalent model is established by analyzing the external features of the lithium titanate battery at various ambient temperatures; on this basis, the SOC is estimated using the Cubature Kalman Filter (CKF) algorithm, and a variety of test conditions are used to confirm the accuracy and temperature adaptability of the presented algorithm. The results of this study help improve the battery thermal management system and the reliability of the battery system under different temperature conditions.


25.2 Equivalent Model of Lithium Battery

25.2.1 Equivalent Model

As shown in Fig. 25.1, the temperature-dependent second-order RC equivalent circuit model employed in this paper contains a voltage source, two parallel RC networks, and an ohmic internal resistance. The continuous state-space equations of the battery model follow from Kirchhoff's laws:

$$\begin{cases} \dot{U}_1 = -\dfrac{1}{R_1 C_1} U_1 + \dfrac{1}{C_1} I \\ \dot{U}_2 = -\dfrac{1}{R_2 C_2} U_2 + \dfrac{1}{C_2} I \\ U_t = OCV - U_1 - U_2 - R_0 I \end{cases} \quad (25.1)$$

where R_0 is the ohmic internal resistance; R_1 and R_2 are the polarization internal resistances; U_1 and U_2 are the voltages across the two RC networks; C_1 and C_2 are the polarization capacitances; U_t is the terminal voltage; I is the current; and OCV is the open circuit voltage.
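To illustrate Eq. (25.1), a forward-Euler simulation of the terminal voltage for a given current profile can be sketched as follows (all parameter values here are illustrative placeholders, not the identified ones):

```python
# A minimal sketch simulating the second-order RC model of Eq. (25.1)
# with a forward Euler step of dt seconds.
import numpy as np

def simulate_terminal_voltage(current, ocv, R0=0.03, R1=0.02, C1=2e3,
                              R2=0.05, C2=5e4, dt=1.0):
    """Return Ut over time for a current profile (A) and a constant OCV (V)."""
    U1 = U2 = 0.0
    Ut = np.empty(len(current))
    for k, I in enumerate(current):
        U1 += dt * (-U1 / (R1 * C1) + I / C1)
        U2 += dt * (-U2 / (R2 * C2) + I / C2)
        Ut[k] = ocv - U1 - U2 - R0 * I
    return Ut

Ut = simulate_terminal_voltage(np.full(600, 1.5), ocv=2.3)  # 1 C pulse, 600 s
```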

Fig. 25.1 Equivalent circuit model

25.2.2 OCV-SOC Nonlinear Relationship

The accurate description of the nonlinear OCV-SOC properties of lithium batteries affects both the expression of the equivalent model and the accuracy of the subsequent SOC estimation. The OCV-SOC data were collected using the small-current interpolation method, and Fig. 25.2 shows the test results. Taking the results at one temperature as an example, the OCV decreases with decreasing SOC overall; the change of OCV is relatively gentle in the middle SOC region and becomes steep in the high and low SOC regions. At very low temperatures, the change of OCV no longer follows the conventional law: OCV increases sharply in the low SOC region and decreases in the high SOC region.


Fig. 25.2 Nonlinear relationship curve of OCV-SOC


Therefore, if the influence of temperature is ignored, the expression accuracy of the OCV is affected, which in turn affects the accuracy of SOC estimation for a battery operating at various temperatures.

25.2.3 Parameter Identification

The off-line identification of the model parameters is based on the Hybrid Pulse Power Characterization (HPPC) experiment. The specific experimental steps are as follows:

1. Charge the battery at room temperature at a 1/3 C rate; the SOC is then 1. Let the battery stand for 1 h.
2. Discharge the battery for 20 min at a 1/3 C rate and let it stand for 30 min.
3. Discharge with a pulsed current at a 1 C rate for 10 s and stand for 40 s; then charge for 10 s and stand for 40 s. This completes one HPPC experiment.
4. Repeat steps (2) and (3) until the SOC of the battery is 0.
5. Repeat steps (1) to (4) at ambient temperatures of −20 °C, −15 °C, −10 °C, 0 °C, 20 °C, and 40 °C, and record the relevant data.

Off-line identification of the ohmic internal resistance: the ohmic polarization effect disappears at the end of the current pulse, so it is the ohmic internal resistance that causes the transient changes in the battery terminal voltage at the beginning and end of a current pulse. Figure 25.3 presents the identification results under different temperature conditions; for a given SOC, the lower the temperature, the higher the ohmic internal resistance.

Fig. 25.3 Identification results of R0

Identification of the polarization resistance and capacitance parameters: during the short pulse discharge of HPPC, the polarization effect causes the terminal voltage to rise slowly in response and to reach a stable recovery after the pulse discharge. This stage can be treated as a zero-input response, so the terminal voltage response satisfies

$$y = OCV(SOC) - c_1 e^{-c_3 t} - c_2 e^{-c_4 t} \quad (25.2)$$

where c_1 = I R_1, c_2 = I R_2, c_3 = 1/(R_1 C_1), and c_4 = 1/(R_2 C_2); y represents the terminal voltage data of one SOC stage. Figure 25.4 shows the identification results at various temperatures. The overall change of R_1 is relatively uniform, with a peak only around 25% SOC. C_1 is clearly affected by temperature in the middle SOC region: the higher the temperature, the greater C_1. R_2 and C_2 follow a similar trend, and SOC influences them more than temperature does, especially at SOC = 1. The fitting quality of the terminal voltage reflects the identification accuracy of the battery parameters; as an example, Fig. 25.5 shows the fitted and measured terminal voltages under intermittent discharge conditions. The fitting error of the terminal voltage is within 10 mV, indicating that the above method meets the reasonable error range.

Fig. 25.4 Identification results of the polarization resistance and capacitance parameters: (a) identification results of R1; (b) identification results of C1; (c) identification results of R2; (d) identification results of C2

Fig. 25.5 Verification of identification results: (a) terminal voltage; (b) terminal voltage fitting error
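The zero-input-response fit of Eq. (25.2) that underlies this identification can be sketched with SciPy (the data, pulse current, and initial guesses below are placeholders of ours):

```python
# A sketch of fitting Eq. (25.2) to a relaxation segment and recovering
# R1, C1, R2, C2 from the fitted coefficients.
import numpy as np
from scipy.optimize import curve_fit

def relaxation(t, c1, c2, c3, c4, ocv):
    return ocv - c1 * np.exp(-c3 * t) - c2 * np.exp(-c4 * t)

t = np.linspace(0, 40, 41)                       # 40 s rest after the pulse
u_meas = relaxation(t, 0.02, 0.01, 0.5, 0.01, 2.3) \
         + 1e-4 * np.random.randn(len(t))        # synthetic "measured" data
p0 = [0.01, 0.01, 0.1, 0.01, 2.3]                # rough initial guesses
(c1, c2, c3, c4, ocv), _ = curve_fit(relaxation, t, u_meas, p0=p0)

I_pulse = 1.5                                    # pulse current in A
R1, R2 = c1 / I_pulse, c2 / I_pulse              # from c1 = I*R1, c2 = I*R2
C1, C2 = 1 / (R1 * c3), 1 / (R2 * c4)            # from c3 = 1/(R1*C1), etc.
```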

25.3 Cubature Kalman Filter Algorithm

The cubature Kalman filter (CKF) selects a set of cubature points according to the cubature criterion and propagates them through the nonlinear function to approximately compute the posterior mean and error covariance, that is, to approximate the Gaussian integral in the nonlinear Gaussian filter.



The continuous state equation is discretized by the Euler method to obtain the state estimation equation of the filter:

$$\begin{bmatrix} SOC_{k+1} \\ U_{1,k+1} \\ U_{2,k+1} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 - \frac{T}{R_1 C_1} & 0 \\ 0 & 0 & 1 - \frac{T}{R_2 C_2} \end{bmatrix} \begin{bmatrix} SOC_k \\ U_{1,k} \\ U_{2,k} \end{bmatrix} + \begin{bmatrix} -\frac{\eta T}{C_n} \\ \frac{T}{C_1} \\ \frac{T}{C_2} \end{bmatrix} I_k \quad (25.3)$$

$$U_{t,k+1} = OCV_{k+1}\left(SOC_{k+1}\right) - U_{1,k+1} - U_{2,k+1} - I_{k+1} R_{0,k+1} \quad (25.4)$$

where [SOC_k, U_{1,k}, U_{2,k}]^T is the state variable, I_k is the input variable, the terminal voltage U_{t,k+1} is the output variable, C_n is the rated capacity of the battery in Ah, and T is the sampling time (1 s in this paper). The state and measurement equations of the lithium battery are

$$\begin{cases} x_{k+1} = f\left(x_k, u_k\right) + \omega_k \\ y_k = h\left(x_k, u_k\right) + \nu_k \end{cases} \quad (25.5)$$

where the first equation is the state estimation equation and the second is the system observation equation; f(·) and h(·) are the nonlinear transfer functions, x_k the state variables, y_k the system output, and u_k the system input. The specific steps of the CKF algorithm are summarized as follows:

1. Initialization
(a) Initialize the state variables and variance: for k = 0, set

$$x_0 = E\left[x_0\right], \quad P_0 = E\left[\left(x_0 - \hat{x}_0\right)\left(x_0 - \hat{x}_0\right)^T\right] \quad (25.6)$$

(b) Initialize the process noise covariance matrix Q_0 and the observation noise covariance matrix R_0.
(c) Initialize the standard cubature point set:

$$\xi^{(i)} = \begin{cases} \sqrt{n}\,[1]_{(i)}, & i = 1, 2, \ldots, n \\ -\sqrt{n}\,[1]_{(i-n)}, & i = n+1, n+2, \ldots, 2n \end{cases} \quad (25.7)$$



where n = 3 is the state dimension, [1] is the 3 × 3 identity matrix, and (i) denotes the ith column vector.

2. Time update
(a) Decompose the error covariance, where chol(·) denotes the Cholesky decomposition of a matrix:

$$S_{k-1} = \operatorname{chol}\left(P_{k-1}\right) \quad (25.8)$$

$$P_{k-1} = S_{k-1} S_{k-1}^T \quad (25.9)$$

(b) Compute the cubature points:

$$x_{k-1}^{(i)} = \hat{x}_{k-1} + S_{k-1}\,\xi^{(i)}, \quad i = 1, 2, \ldots, 2n \quad (25.10)$$

(c) Propagate the cubature points and compute the predicted state:

$$x_{k/k-1}^{(i)} = A\,x_{k-1}^{(i)} + B\,u_{k-1}, \quad i = 1, 2, \ldots, 2n \quad (25.11)$$

$$\bar{x}_{k/k-1} = \frac{1}{2n} \sum_{i=1}^{2n} x_{k/k-1}^{(i)} \quad (25.12)$$

(d) Compute the propagation covariance:

$$P_{k/k-1} = \frac{1}{2n} \sum_{i=1}^{2n} \left(x_{k/k-1}^{(i)} - \bar{x}_{k/k-1}\right)\left(x_{k/k-1}^{(i)} - \bar{x}_{k/k-1}\right)^T + Q \quad (25.13)$$

3. Measurement update
(a) Decompose the error covariance:

$$S_{k/k-1} = \operatorname{chol}\left(P_{k/k-1}\right) \quad (25.14)$$

(b) Recalculate the cubature points:

$$x_{k/k-1}^{(i)} = \bar{x}_{k/k-1} + S_{k/k-1}\,\xi^{(i)}, \quad i = 1, 2, \ldots, 2n \quad (25.15)$$

(c) Propagate the cubature points and compute the predicted measurements:

$$y_{k/k-1}^{(i)} = C\,x_{k/k-1}^{(i)} + D\,u_k, \quad i = 1, 2, \ldots, 2n \quad (25.16)$$

$$\bar{y}_{k/k-1} = \frac{1}{2n} \sum_{i=1}^{2n} y_{k/k-1}^{(i)} \quad (25.17)$$

(d) Calculate the estimated covariances:

$$P_{k/k-1}^{y} = \frac{1}{2n} \sum_{i=1}^{2n} \left(y_{k/k-1}^{(i)} - \bar{y}_{k/k-1}\right)\left(y_{k/k-1}^{(i)} - \bar{y}_{k/k-1}\right)^T + R \quad (25.18)$$

$$P_{k/k-1}^{xy} = \frac{1}{2n} \sum_{i=1}^{2n} \left(x_{k/k-1}^{(i)} - \bar{x}_{k/k-1}\right)\left(y_{k/k-1}^{(i)} - \bar{y}_{k/k-1}\right)^T \quad (25.19)$$

(e) Calculate the Kalman gain:

$$K_k = P_{k/k-1}^{xy}\left(P_{k/k-1}^{y}\right)^{-1} \quad (25.20)$$

(f) Update the predicted state:

$$error_k = y_k - \bar{y}_{k/k-1} \quad (25.21)$$

$$\hat{x}_k = \bar{x}_{k/k-1} + K_k \cdot error_k \quad (25.22)$$

(g) Update the error covariance:

$$P_k = P_{k/k-1} - K_k P_{k/k-1}^{y} K_k^T \quad (25.23)$$
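To make the recursion above concrete, here is a compact numpy sketch of one CKF predict/update step under the linear-propagation form of Eqs. (25.11) and (25.16); the measurement function h, which would wrap the OCV-SOC lookup of Eq. (25.4), and all numeric values are placeholders of ours:

```python
# A sketch of Eqs. (25.7)-(25.23) for a scalar measurement.
import numpy as np

def cubature_points(x, P):
    n = len(x)
    S = np.linalg.cholesky(P)                              # Eq. (25.8)
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])   # Eq. (25.7)
    return x[:, None] + S @ xi                             # Eq. (25.10), (n, 2n)

def ckf_step(x, P, u, y, A, B, h, Q, R):
    n = len(x)
    # --- time update ---
    X = cubature_points(x, P)
    Xp = A @ X + (B * u)[:, None]                          # Eq. (25.11)
    x_pred = Xp.mean(axis=1)                               # Eq. (25.12)
    dX = Xp - x_pred[:, None]
    P_pred = dX @ dX.T / (2 * n) + Q                       # Eq. (25.13)
    # --- measurement update ---
    X2 = cubature_points(x_pred, P_pred)                   # Eqs. (25.14)-(25.15)
    Y = np.array([h(X2[:, i], u) for i in range(2 * n)])   # Eq. (25.16)
    y_pred = Y.mean()                                      # Eq. (25.17)
    dY = (Y - y_pred)[None, :]
    Pyy = float(dY @ dY.T) / (2 * n) + R                   # Eq. (25.18)
    Pxy = (X2 - x_pred[:, None]) @ dY.T / (2 * n)          # Eq. (25.19)
    K = Pxy / Pyy                                          # Eq. (25.20)
    x_new = x_pred + (K * (y - y_pred)).ravel()            # Eqs. (25.21)-(25.22)
    P_new = P_pred - (K @ K.T) * Pyy                       # Eq. (25.23)
    return x_new, P_new
```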

25.4 Testing and Discussion

This paper uses a lithium titanate battery with a nominal capacity of 1.5 Ah, a charge/discharge cut-off voltage of 2.8/1.5 V, a nominal voltage of 2.3 V, a working temperature range of −40 to 60 °C, a room-temperature cycle life of 10,000–20,000 cycles, and an energy density of 50–85 Wh/kg. Section 25.4.1 tests the current adaptability of the algorithm under DST and pulse discharge; Sect. 25.4.2 tests it at various temperatures to validate the proposed temperature-dependent model; Sect. 25.4.3 compares different algorithms to verify the superiority of the suggested estimator; and Sect. 25.4.4 verifies the robustness of the algorithm to the initial value.

25.4.1 Test Under Different Current Conditions

Figures 25.6 and 25.7 show the test results under the Dynamic Stress Test (DST) and pulse discharge conditions, respectively; in each figure, panels (a) and (b) present the SOC value and the SOC estimation error, panel (c) the terminal voltage, and panel (d) the terminal voltage estimation error. Under both current conditions, the maximum SOC estimation error of the CKF algorithm is below 1.5% and the terminal voltage estimation error is within 20 mV. Even with an incorrect initial value, the algorithm still converges quickly to the true state.

Fig. 25.6 Test results under the DST condition: (a) estimated value of SOC; (b) SOC estimation error; (c) terminal voltage; (d) terminal voltage fitting error

Fig. 25.7 Test results under the pulse discharge condition: (a) estimated value of SOC; (b) SOC estimation error; (c) terminal voltage; (d) terminal voltage fitting error

25.4.2 Test Under Different Temperature Conditions

The algorithm's test results under high, normal, and low temperature conditions are shown in Fig. 25.8. The SOC estimation error is smallest at low temperature and largest at high temperature when the CKF algorithm uses the variable temperature equivalent model proposed in this study; nevertheless, the maximum SOC estimation error under all three temperature conditions is less than 2%, meeting the accuracy requirements. Furthermore, because temperature affects the expression accuracy of the model parameters, especially the nonlinear OCV-SOC function, ignoring the temperature influence greatly reduces the estimation accuracy of the algorithm and can even cause divergence.

Fig. 25.8 Estimates of SOC at different temperatures: (a) estimated value of SOC; (b) SOC estimation error

25.4.3 Algorithm Comparison Test

Figure 25.9 compares the EKF and CKF algorithms under the DST discharge condition. The battery model uses the parameters identified at room temperature, and the noise matrix coefficients and initial covariance of the two algorithms are kept consistent: Q = 10^{-7} · I_{3×3}, R = 0.01, and P_0 = diag(0.001, 0.001, 1). As panel (b) shows, the fluctuation trends of the SOC estimation errors of the two algorithms are basically the same. In the high SOC region, the estimation accuracy of EKF is slightly better than that of CKF, but CKF shows clear advantages in the subsequent discharge process. The maximum estimation error appears in the low SOC region, owing to the sudden drop in the open circuit voltage and the variation in the model parameters there, but the maximum error of the two algorithms is less than 1.5%. Table 25.1 lists the mean absolute error (MEAN), maximum absolute error (MAX), and root mean square error (RMSE) after the algorithms converge stably (after 1000 s): the MAX, MEAN, and RMSE of EKF are 2.0730%, 0.8492%, and 0.9838%, and those of CKF are 1.2553%, 0.6540%, and 0.5357%, respectively.

Fig. 25.9 Comparison of SOC estimates under different algorithms: (a) estimated value of SOC; (b) SOC estimation error

Table 25.1 Comparison of error in SOC estimates

| Error/% | Max    | Mean   | RMSE   |
|---------|--------|--------|--------|
| EKF     | 2.0730 | 0.8492 | 0.9838 |
| CKF     | 1.2553 | 0.6540 | 0.5357 |

25.4.4 Initial Value Robustness Test

Figure 25.10 shows the test results of the CKF algorithm with different initial SOC values. As the local enlargement in the figure shows, the algorithm converges quickly and stably to the true state within 300 s under initial errors of 20, 50, and 70%, respectively. In addition, different initial values affect the steady-state estimation accuracy of SOC only slightly and negligibly. Therefore, even with large differences in the initial SOC value, the CKF algorithm retains high robustness and fast convergence speed.

Fig. 25.10 SOC estimation results with different initial errors


25.5 Conclusion

According to the external properties of lithium titanate batteries at various ambient temperatures, this paper establishes an equivalent model that considers temperature factors and determines the unknown model parameters by least-squares fitting. The model improves adaptability to temperature changes and significantly outperforms conventional models. Furthermore, this study employs the Cubature Kalman Filter algorithm to further improve the accuracy of SOC estimation. The results under various test conditions also indicate that the developed algorithm has good robustness and estimation accuracy.

References

1. Ji, Y.J., Qiu, S.L., Li, G.: Simulation of second-order RC equivalent circuit model of lithium battery based on variable resistance and capacitance. J. Cent. South Univ. 27(9), 2606–2613 (2020)
2. Wang, Y., Zhao, L., Cheng, J., Zhou, J., Wang, S.: A state of charge estimation method of lithium-ion battery based on fused open circuit voltage curve. Appl. Sci. 10(4), 1264 (2020)
3. Chao, S., Kawamura, A.: A new way of state of charge using internal resistance for lead acid battery, vol. 2, pp. 565–570 (2020)
4. Schwunk, S., Armbruster, N., Straub, S., Kehl, J., Vetter, M.: Particle filter for state of charge and state of health estimation for lithium–iron phosphate batteries. J. Power Sources 239, 705–710 (2013)
5. Shi, Q.S., Zhang, C.H., Cui, N.X.: Estimation of battery state-of-charge using ν-support vector regression algorithm. Int. J. Automot. Technol. 9, 759–764 (2008)
6. He, H.W., Zhang, X.W.: Online model-based estimation of state-of-charge and open-circuit voltage of lithium-ion batteries in electric vehicles. Energy 39, 310–318 (2012)
7. Bhangu, B.S., Bentley, P., Stone, D.A., Bingham, C.M.: Nonlinear observers for predicting state-of-charge and state-of-health of lead-acid batteries for hybrid-electric vehicles. IEEE Trans. Veh. Technol. 54(3), 783–794 (2005)
8. Huang, Z.: SOC estimation of Li-ion battery based on improved EKF algorithm. Int. J. Automot. Technol. 22(2), 335–340 (2021)
9. Zheng, Y., He, F., Wang, W.: A way to identify and estimate SOC based on different temperatures and driving conditions. Electronics 8(12), 1391 (2019)

Appendix

This book contains the proceedings of the Fourth Conference “3D Imaging—Multidimensional Signal Processing and Deep Learning (3DIT-MSP&DL’ 2022)”, Volume 1: “Images, Augmented Reality and Information Technologies”.


Author Index

C: Cai, Guohua, 213; Cai, Lei, 167, 251; Cai, Xiaoyu, 181; Chen, Jiale, 271; Chen, Xi, 285; Cui, Yuzhu, 1

D: Dong, Aidi, 261; Dong, Xuelian, 271

F: Fang, Zhou, 213; Fan, Tongliang, 109; Fu, Weida, 143

G: Gan, Quan, 121; Gao, Fei, 99; Gao, Jia, 227; Gao, Yifu, 227; Geng, Hanlin, 237; Ge, Yawei, 11; Guo, Xin, 143

H: Hao, Liang, 167; He, Zhimin, 77, 203; Hou, Xiqian, 11; Huang, Yihe, 87

J: Jiang, Jing, 69; Ji, Jiaojiao, 131

L: Liang, Hui, 213; Li, Qiaoling, 213; Liu, Huizhong, 51; Liu, Ya, 41; Liu, Yuchen, 167, 251; Liu, Zechun, 213; Li, Zhaohui, 237; Li, Zheng, 41; Luo, Jianhua, 285; Lu, Yue, 11; Lv, Changhui, 261

M: Meng, Zhuxuan, 11

P: Pan, Jianhong, 261; Pan, Sen, 31, 69; Peng, Lin, 31, 77, 203; Peng, Zhizhao, 285; Pu, Kaijie, 99

Q: Qian, Kun, 203; Qiao, Junfeng, 31, 69; Qin, Lixiang, 121; Qiu, Hongbin, 69

R: Ren, An, 227

S: Shen, Jian, 203; Shi, Sihan, 143; Song, Chao, 285; Song, Jiuguang, 227; Sun, Yandong, 109; Suo, Na, 227

T: Tang, Liang, 121

W: Wang, Gang, 261; Wang, He, 77; Wang, Hongkai, 121; Wang, Juan, 227; Wang, Xiaoyuan, 41; Wang, Xin, 41; Wang, Xinke, 251; Wang, Yanying, 41; Wang, Yixuan, 157; Wang, Zijun, 193

X: Xiao, Bing, 1; Xiong, Zhe, 213; Xu, Jun, 213; Xu, Menghan, 69

Y: Yang, Jing, 121; Yang, Pei, 31; Yang, Zhongyuan, 271; Yang, Zhuocheng, 167; Yan, Jing, 143; Yan, Li, 213; Yao, Yuying, 143; Ye, Zhongli, 213; You, Keshun, 51; Yu, Hai, 77, 203; Yu, Hongfei, 181, 193; Yu, Jintao, 1; Yu, Zhiqiang, 213

Z: Zhang, Hengbo, 251; Zhang, Kun, 227; Zhao, Bo, 261; Zhao, Na, 99; Zheng, Quanxing, 213; Zhong, Hongxiang, 213; Zhou, Aihua, 31; Zhou, Yuan, 41; Zhou, Yuanyuan, 237; Zhu, Lipeng, 31