Handbook of Big Geospatial Data [1 ed.] 3030554619, 9783030554613

This handbook covers a wide range of topics related to the collection, processing, analysis, and use of geospatial data


Language: English. Pages: 652 [633]. Year: 2021.




Table of contents :
Preface
Overview of the Book
Contents
Part I Spatial Computing Systems and Applications
1 IBM PAIRS: Scalable Big Geospatial-Temporal Data and Analytics As-a-Service
1.1 Introduction
1.2 PAIRS Architecture Overview
1.3 Key-Value Store Design and Performance
1.4 PAIRS User Experience
1.4.1 Data Service
1.4.2 Search or Discovery Service
1.4.3 Analytics Platform Service
1.5 Selected Industry Applications
1.5.1 PAIRS Enabled Improvements in Weather Forecasting
1.5.2 Vegetation Management
1.6 Conclusion and PAIRS Resources
References
2 Big Geospatial Data Processing Made Easy: A Working Guide to GeoSpark
2.1 Introduction
2.2 Background
2.2.1 Cluster Computing Systems
2.2.2 Spatial Queries
2.3 Overview
2.4 Spatial RDD Layer
2.4.1 Supported Spatial Data Sources
2.4.2 Spatial RDD Built-In Geometrical Library
2.4.3 Spatial RDD Partitioning
2.4.4 Spatial RDD Index
2.4.5 Spatial RDD Customized Serializer
2.5 Spatial Query Processing Layer
2.5.1 Spatial Range Query
2.5.2 Spatial K Nearest Neighbors (KNN) Query
2.5.3 Spatial Join Query
2.6 Perform Spatial Data Analytics in GeoSpark
2.6.1 Run Queries Using RDD APIs
2.6.2 Run Queries Using SQL APIs
2.6.3 Interact with GeoSpark via Zeppelin Notebook
References
3 Indoor 3D: Overview on Scanning and Reconstruction Methods
3.1 Introduction
3.1.1 Terminology
3.2 Properties of Indoor Environments and Identification of Scanning and Reconstruction Problems
3.3 Map Representations
3.4 Development of Indoor Scanning Systems
3.4.1 Single Sensor Methods and Multi-sensor Systems
3.4.1.1 Carriable Systems
3.4.1.2 Mobile Platforms
3.4.1.3 Micro Aerial Vehicles
3.5 Iterative Closest Point SLAM
3.5.1 The ICP Algorithm
3.5.2 Computing Optimal Poses
3.5.3 Marker and Feature-Based Registration
3.5.4 ICP-Based SLAM
3.5.5 Assessing the SLAM Errors
3.6 Indoor 3D Reconstruction
3.6.1 Space Subdivision and Room Segmentation
3.6.2 Reconstruction of Walls
3.6.3 Grammar Approach
3.6.4 Detection and Reconstruction of Openings
3.6.5 Reconstructing Occluded Data by Machine Learning
3.7 Applications
3.8 Future Trends
3.9 Exercises for Students
References
4 Big Earth Observation Data Processing for Disaster Damage Mapping
4.1 Monitoring Disasters from Space
4.2 Earth Observation Satellites
4.2.1 Optical Satellite Missions
4.2.2 SAR Satellite Missions
4.3 Land Cover Mapping
4.4 Disaster Mapping
4.4.1 Flood Mapping
4.4.2 Landslide Mapping
4.4.3 Building Damage Mapping
4.5 Conclusion and Future Lines
References
5 Spatial Data Reduction Through Element-of-Interest (EOI) Extraction
5.1 Introduction
5.2 Methods to Obtain EOI from Georeferenced Big Data
5.2.1 Methods Commonly Used in the Remote Sensing and Mapping Fields
5.2.1.1 Pixel-Based Methods
5.2.1.2 Object-Based Methods
5.2.1.3 Machine Learning
5.2.2 Methods to Analyze Social Media and Location-Based Data
5.2.2.1 Data Mining
5.2.2.2 Data Analytics
5.2.2.3 Machine Learning
5.3 Use Cases in the Active and Passive Big Data Spatial Realms
5.3.1 Active Use Cases
5.3.2 Passive Use Cases
5.4 Conclusion
References
6 Semantic Graphs to Reflect the Evolution of Geographic Divisions
6.1 Introduction
6.2 Context
6.2.1 Not Fully Interconnected Data
6.2.2 Broken Time-Series
6.2.3 Removal of Changes
6.3 Towards a Change in Representation with the Semantic Web
6.3.1 Open Data
6.3.2 Linked Data
6.3.3 Semantic Data
6.3.4 Linked Open Geospatial Data
6.4 Modeling Geospatial Changes in the Semantic Web
6.4.1 Identity and Changes
6.4.2 Modeling Changes
6.4.2.1 Standard Space and Time Ontologies
6.4.2.2 Fundamentals for the Modeling of Evolving Geospatial Entities
6.4.3 Ontological Approaches for the Modeling of Evolving Entities
6.4.3.1 Versioning Approach
6.4.3.2 SNAP and SPAN Approach
6.4.3.3 Ontologies for Fluents Approach
6.4.4 Ontological Approaches for the Modeling of Evolving Geospatial Entities
6.5 Contributions
6.6 Conclusion and Perspectives
References
Part II Trajectories, Event and Movement Data
7 Big Spatial Flow Data Analytics
7.1 Introduction
7.2 Flow Mapping & Geovisualization
7.2.1 Flow Aggregation
7.2.2 Edge Bundling
7.2.3 Visual Analytics and Tools
7.3 Spatial Data Mining Methods
7.3.1 Spatial Outlier Detection
7.3.2 Flow Clustering
7.4 Spatial Statistical Methods
7.4.1 Spatial Patterns Detection
7.4.2 From Patterns to Spatial Interaction Models
7.5 Conclusion
References
8 Semantic Trajectories Data Models
8.1 Introduction
8.2 Preliminaries
8.2.1 Historical Perspective
8.2.2 Spatial vs. Semantic Trajectories
8.3 A Semantic Trajectory Meta-model
8.4 Semantic Trajectory Data Models: A Purpose-Driven Taxonomy
8.4.1 Conceptual Representation
8.4.2 Database Logical Models
8.4.3 Query Processing
8.4.4 Data Analytics
8.5 Final Remarks and Research Directions
References
9 Multi-attribute Trajectory Data Management
9.1 Introduction
9.2 Related Work
9.2.1 Enriching Spatio-Temporal Trajectories
9.2.2 Indexing Spatio-Temporal Trajectories
9.3 Problem Definition
9.3.1 Data Representation
9.3.2 Queries
9.4 Indexing Multi-attribute Trajectories
9.4.1 An Overview of the Structure
9.4.2 Packing Trajectories
9.4.3 Partitioning Trajectories
9.4.4 BAR
9.4.5 Updating the Index
9.4.6 The Generality
9.5 Query Algorithms
9.5.1 An Outline
9.5.2 Processing RQMAT
9.5.3 Processing CRQMAT
9.5.4 Processing CkNN_MAT
9.6 The System Development
9.6.1 The Architecture
9.6.2 A Tool for GPS Data Clean
9.6.3 The Generation of Multi-attribute Values and Query Interface
9.6.4 MDBF: A Tool for Monitoring Database Files
9.7 Performance Evaluation
9.7.1 Evaluation of RQMAT
9.7.2 Evaluation of CRQMAT
9.7.3 Evaluation of CkNN_MAT
9.8 Future Directions
9.8.1 Data Analytics
9.8.2 Intelligent Trajectory Data Management
9.9 Conclusions
References
10 Mining Colocation from Big Geo-Spatial Event Data on GPU
10.1 Introduction
10.2 GPU Computing
10.3 Related Work
10.4 Problem Statement
10.4.1 Basic Concepts
10.4.2 Problem Definition
10.5 Approach
10.5.1 Algorithm Overview
10.5.2 Cell-Aggregate-Based Upper Bound Filter
10.5.3 Refinement Algorithms
10.6 Evaluation
10.6.1 Results on Synthetic Data
10.6.1.1 Effect of the Number of Instances
10.6.1.2 Effect of Clumpiness
10.6.1.3 Comparison on Filter and Refinement
10.6.2 Results on Real World Dataset
10.6.2.1 Effect of Minimum Participation Index Threshold
10.6.2.2 Comparison of Filter and Refinement
10.7 Discussion and Conclusion
References
11 Automatic Urban Road Network Extraction From Massive GPS Trajectories of Taxis
11.1 Introduction
11.2 Literature Review
11.2.1 Density-Based Approaches
11.2.2 Cluster-Based Approaches
11.3 Methodology
11.3.1 Trajectory Compression
11.3.2 Identification of the Trajectory Points Along the Road
11.3.2.1 Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
11.3.2.2 Anisotropic Perspective on Local Point Density
11.3.2.3 Anisotropic Density-Based Clusters with Noise (ADCN) Algorithm
11.3.2.4 ADCN Algorithm in Road Network Extraction
11.3.3 Road Network Generation
11.3.3.1 Road Density Surface Generation
11.3.3.2 Collapse Surface to Centerline
11.4 Case Study
11.4.1 Data
11.4.2 Experiment
11.4.2.1 Evaluation Metrics
11.4.2.2 Results
11.5 Conclusion and Future Work
References
12 Exploratory Analysis of Massive Movement Data
12.1 Introduction
12.2 Movement Data Characteristics & Their Relation to Big Data Vs
12.2.1 Variety
12.2.2 Velocity & Volume
12.3 Exploratory Data Analysis (EDA)
12.4 EDA Tasks for Massive Movement Data
12.4.1 Task 1: Spatio-Temporal Lookup or Range Queries
12.4.1.1 Challenge 1: Trajectory Indexing
12.4.1.2 Challenge 2: Spatio-Temporal Visualizations of Massive Movement Data
12.4.2 Task 2: Similar Trajectory Search and Join
12.4.2.1 Challenge 3: Building and Segmenting Trajectories
12.4.2.2 Challenge 4: Moving Object Identifiers
12.4.3 Task 3: Density Mapping and Other Grid-Based Summarizations
12.4.3.1 Challenge 5: Representativeness & Bias
12.4.4 Task 4: Extracting Events & Places
12.4.4.1 Challenge 6: Data Quality or Veracity
12.4.5 Task 5: Detection of Outliers and Anomalies
12.4.5.1 Challenge 7: Anomaly Detection Performance
12.5 Privacy
12.5.1 k-Anonymity
12.5.2 Differential Privacy
12.5.3 Privacy by Design
12.6 Recommended EDA Workflow for Massive Movement Data
12.6.1 Establishing an Overview
12.6.2 Putting Movement Records in Context
12.6.3 Extracting Trajectories & Events
12.6.4 Exploring Patterns in Trajectory and Event Data
12.6.5 Analyzing Outliers
12.7 Conclusions
References
Part III Statistics, Uncertainty and Data Quality
13 Spatio-Temporal Data Quality: Experience from Provision of DOT Traveler Information
13.1 Introduction
13.2 Example Data Quality Problems
13.3 Data Quality Attributes
13.4 Data Quality Assessment Methods
13.5 Enhanced Methods
13.5.1 General Definitions
13.5.2 General Approach
13.5.3 Interpolation to Model Ground Truth
13.5.4 Our SMART Approach
13.5.5 Artificial Data Set
13.5.6 Evaluation
13.5.7 Evaluation Using an Artificial Data Set
13.5.8 December 2015 MADIS California Data
13.5.9 December 2017 MADIS Montana Data
13.5.10 December 2015–2017 USGS Streamflow Data
13.5.11 Evaluation Summary
13.6 Further Research and Development Topics
Bibliography
14 Uncertain Spatial Data Management: An Overview
14.1 Introduction
14.2 Discrete and Continuous Models for Uncertain Data
14.2.1 Existing Models for Uncertain Data
14.2.2 Discrete Models
14.2.3 Continuous Models
14.3 Possible World Semantics
14.4 Existing Uncertain Spatial Database Management Systems
14.5 Probabilistic Result Semantics
14.5.1 Object Based Probabilistic Result Semantics
14.5.2 Result Based Probabilistic Result Semantics
14.6 Probabilistic Query Predicates
14.6.1 Probabilistic Threshold Queries
14.6.2 Probabilistic Top-k Queries
14.6.3 Discussion
14.7 The Paradigm of Equivalent Worlds
14.7.1 Equivalent Worlds
14.7.2 Exploiting Equivalent Worlds for Efficient Algorithms
14.8 Case Study: Range Queries and the Sum of Independent Bernoulli Trials
14.8.1 Poisson-Binomial Recurrence
14.8.1.1 Complexity Analysis
14.8.2 Generating Functions
14.8.2.1 Complexity Analysis
14.9 Advanced Techniques for Managing Uncertain Spatial Data
14.10 Summary
References
15 Spatial Statistics, or How to Extract Knowledge from Data
15.1 Introduction
15.2 Spatial Data
15.3 Geostatistical Models
15.3.1 Covariogram Estimation
15.3.2 Modeling Approaches
15.3.3 Dimensionality Reduction of the Spatial Covariance Matrix
15.4 Spatial Regression Models
15.4.1 Specification of Spatial Weighting Matrices
15.4.2 Inferences on Parameter Estimates
15.4.3 Estimation Procedures
15.5 Case Study
15.6 Conclusion
15.7 Further Reading
References
Part IV Information Retrieval from Multimedia Spatial Datasets
16 A Survey of Textual Data & Geospatial Technology
16.1 Introduction
16.2 Research Questions & Different Notions of "Where"
16.3 Spatial Indexing
16.3.1 Spatial Data Structures
16.3.2 Spatially Enabled Database Management Systems
16.4 Address Geocoding
16.5 Geoparsing and Spatial Resolution
16.5.1 Toponym Resolution
16.5.2 Geospatial Expression Resolution
16.6 Content Enrichment with Geospatial Metadata
16.7 Hybrid Textual/Spatial Document Retrieval
16.8 Geofencing
16.9 Applications
16.9.1 Location Search
16.9.2 Crime Mapping, Hotspot Analysis and Forecasting
16.9.3 Political Analysis and Intelligence Applications
16.9.4 Healthcare Applications
16.9.5 Location-Based Services and Location-Aware Advertising
16.9.6 Other Applications
16.10 Summary, Conclusion and Future Work
Appendix: Ancillary Tasks
Augmenting Gazetteers via Web Mining
Curating Gold Standard Data for Evaluation and Training
Bibliography
References
17 Harnessing Heterogeneous Big Geospatial Data
17.1 Introduction
17.2 Geospatial Data Conflation
17.3 Geospatial Data Integration
17.4 Geospatial Data Enrichment
17.5 Summary
References
18 Big Historical Geodata for Urban and Environmental Research
18.1 Introduction
18.2 Data Sources and Time Spans
18.3 From the Data Source to Big Geospatial Data
18.4 Potentials of Big Historical Geodata
18.4.1 Human-Environment Interactions
18.4.2 Land Change Model Calibration
18.4.3 Data-Driven Geoscience and Geodata Science
18.4.4 Digital Humanities and Cultural Heritage
18.4.5 Urban Research and Spatial Planning
18.5 Conclusion
References
19 Harvesting Big Geospatial Data from Natural Language Texts
19.1 Introduction and Motivation
19.2 Methods and Tools
19.2.1 Toponym Recognition
19.2.2 Toponym Resolution
19.2.3 Developed Geoparsers and Tools
19.2.4 Location Inference from Language Modeling
19.2.5 Summary
19.3 Applications of Geospatial Data Harvested from Texts
19.3.1 Understanding Places and Human Experiences
19.3.2 Situation Awareness for Emergency Response
19.3.3 Place Relations in Virtual or Cognitive Space
19.4 Summary and Future Directions
References
20 Automating Information Extraction from Large Historical Topographic Map Archives: New Opportunities and Challenges
20.1 Introduction
20.2 Digital Historical Map Archives
20.3 Preprocessing Methods
20.3.1 Automated Georeferencing
20.3.2 Spatial Data Alignment
20.3.3 Exploratory Methods
20.4 Automated Map Content Recognition and Extraction
20.4.1 Training Data Collection
20.4.2 Recognition and Extraction Methods
20.5 Conclusions and Outlook
References
Part V Governance, Infrastructures and Society
21 The Integration of Decision Maker's Requirements to Develop a Spatial Data Warehouse
21.1 Introduction
21.2 Overview of the Existing Approaches
21.3 Overview of the Proposal
21.4 GeoCIM Definition
21.5 Classification of the GeoCIMs Models
21.6 K == Random Number of the Clusters Containing Adjacent Objects
21.7 From GeoCIM to GeoPIM
21.7.1 GeoPIM Definition
21.7.2 Formal Transformations from GeoCIM to GeoPIM
21.8 Using Topological Relationships to Enrich Dimension Hierarchies
21.8.1 Geo SM Definition
21.9 Transformations from GeoPIM to GeoPSM
21.10 Experimentation
21.10.1 Transition from the Requirements Model to the Implementation Model of a SDW
21.11 Case Study
21.12 Evaluation of the Proposal
21.13 Conclusion
References
22 Smart Cities
22.1 Introduction
22.2 History and Background: A Brief Review
22.3 Defining Smart Cities in Practice
22.4 Context Variables Affecting Smart Cities
22.4.1 Structural Factors
22.4.2 Economic Development
22.4.3 Technology
22.4.4 Effective Environmentally-Progressive Governance
22.5 The Role of Data
22.5.1 Smart City and Big Data
22.5.2 Real-Time Data
22.5.3 Open Government Data
22.5.4 The Semantic Web and Linked Open Data
22.5.4.1 OpenStreetMap
22.5.4.2 GeoNames
22.6 Examples of Smart Cities
22.7 Future Directions
22.8 Conclusion
References
23 The 4th Paradigm in Multiscale Data Representation: Modernizing the National Geospatial Data Infrastructure
23.1 Access to Nationally Managed Spatial Data in the United States
23.2 Chronology and Current Status of NSDI in the United States
23.2.1 Geospatial Interoperability Reference Architecture (GIRA)
23.2.2 Geospatial Platform
23.2.3 Cloud Computing
23.2.4 Multiagency Geospatial Acquisition
23.3 The Role of the Fourth Paradigm
23.4 Activities for Short- and Longer-Term NSDI Implementation
23.4.1 Short-Term Goals: Integrate NSDI Across Spatial and Temporal Scales
23.4.2 Longer-Term Goals: Aligning NSDI with User Needs and Demands
23.5 Implications and Prospects of the Fourth Paradigm for the NSDI
References
24 INSPIRE: The Entry Point to Europe's Big Geospatial Data Infrastructure
24.1 Introduction
24.2 Big Data in the EU
24.3 INSPIRE State of Play
24.3.1 Legal, Technical and Organisational Framework
24.3.2 INSPIRE Geoportal
24.4 INSPIRE as a Big Data Infrastructure
24.4.1 Characteristics of INSPIRE in Terms of Big Data
24.4.2 Challenges from the User Perspective
24.4.2.1 Discoverability of Datasets
24.4.2.2 Combining National Datasets to Create Pan-European Products
24.4.2.3 Data Access and Consumption by Clients
24.4.2.4 Cloud Infrastructures
24.5 Conclusions and Outlook
References
