Data Warehouse and Data Mining: Concepts, techniques and real life applications

Data warehousing and data mining are essential technologies in the field of data analysis and business intelligence.

Language: English; Pages: 214; Year: 2024

Table of contents:
Cover
Title Page
Copyright Page
Dedication Page
About the Author
About the Reviewer
Acknowledgement
Preface
Table of Contents
1. Introduction to Data Warehousing
Introduction
Structure
Objectives
Data warehousing
History of data warehouse
Decision support system development (1970s–1980s)
Online analytical processing was first introduced in the 1980s
Concepts for data warehousing in the 1980s and 1990s
Ralph Kimball’s dimensional modelling first appeared in the 1990s
ETL and data integration advancements between the 1990s and 2000s
Columnar databases and in-memory computing adoption between 2000 and 2010
Big Data and Cloud data warehousing (2010s–present)
Data warehouse works
Sources of data
Extract, Transform, and Load
Data organization and storage
Types of data warehouses
Enterprise data warehouse
Data mart
Operational data store
General stages of data warehouse
Analysis of requirements
Modelling data
Extraction of data
Data loading
Transformation of data
Data management and storage
Management of metadata
Analysis and querying
Upkeep and evolution
Need of data warehouse
Uses and trends of data warehouse
Data warehouse applications in various industries
Banking
Retail
Healthcare
Marketing
Manufacturing
Telecommunications
Trends of data warehouse
Database management system versus data warehouse
Metadata
Types of metadata
Descriptive metadata
Administrative metadata
Structural metadata
Technical metadata
Provenance metadata
Rights metadata
Preservation metadata
Role of metadata
Metadata repository
Benefits of metadata
Challenges for metadata management
Multidimensional data model
Data cubes
Data cube classification
Operations in data cubes
Advantages of data cubes
Data cube disadvantages
Schemas for multidimensional database
Star schema
Snowflake schema
Star schema
Fact table
Snowflake schema
Fact table
Fact constellation in data warehouse
General structure of fact constellation
Fact constellation schema architecture
Conclusion
Exercises
2. Data Warehouse Process and Architecture
Introduction
Structure
Objectives
Objectives of data warehouse architecture
Single-tier architecture
Two-tier architecture
Three-tier architecture
Properties of data warehouse architecture
Types of data warehouse architecture
Single-tier architecture
Two-tier architecture
Three-tier architecture
Advantages of data warehouse architecture
Disadvantages of data warehouse architecture
Data warehouse database
Online transaction processing and online analytical processing
Online transaction processing
Online analytical processing
Advantages and disadvantages of Online transaction processing and online analytical processing
Types of online analytical processing
Examples of all types, advantages and disadvantages of different OLAP systems
Servers
Data warehouse manager
Conclusion
Exercises
3. Data Warehouse Implementation
Introduction
Structure
Objectives
Introduction of data warehouse implementation
Data warehouse implementation guidelines
Components of data warehouse implementation
Benefits of implementing a data warehouse
Computation of data cubes
Subtracting two cubes for differences
Modeling Online Analytical Processing data
Online Analytical Processing queries manager
Data warehouse back-end tools
Complex aggregation at multiple granularities
Tuning and testing of data warehouse
Tuning and testing of data warehouse: Enhancing performance and reliability
Conclusion
Exercises
4. Data Mining Definition and Task
Introduction
Structure
Objectives
Introduction to data mining
Defining business objectives/problem definition
Importance
Stakeholder meetings
Problem definition
Prioritization
Scope definition
Deliverables
Benefits
Data mining tasks
Data mining versus data analysis
Data mining functionality
Knowledge Discovery in Databases versus data mining
Data mining techniques
Classification
Clustering
Association rule mining
Regression analysis
Correlation versus causation
Anomaly detection
Global outliers
Contextual outliers
Collective outliers
Sequence pattern mining
Text mining
Data mining tools and applications
Conclusion
Exercises
5. Data Mining Query Languages
Introduction
Structure
Objectives
Introduction to data mining query languages
Syntax of Data Mining Query Languages
Structured Query Language
GROUP BY
ORDER BY
WINDOW operations (using OVER)
Creating a view as part of data preprocessing
Various clauses to explore the attributes or dimensions
Data Mining Query Language
Multidimensional Expressions
MDX query clauses
MDX functions
Datalog
XQuery
XPath and XQuery
CSS selectors
Web scraping libraries
JSONPath
SPARQL
RESTful APIs
Web Scraping Frameworks
Graph Query Languages
Standardization of Data Mining Languages
Data specification
DMQL for characterization
DMQL for discrimination
Specifying knowledge
Hierarchy specification
Pattern presentation and visualization specification
Data mining languages and standardization of data mining
Conclusion
Exercises
6. Data Mining Techniques
Introduction
Structure
Objectives
Data mining techniques
Association rules
Mathematical explanation of association rule and all of these components
Types of association rules in data mining
Algorithms for generating association rules in data mining
Clustering techniques
Types of clustering
Clustering methods
Partitioning method
Hierarchical methods
Density-based method
Grid-based method
Model-based methods
Constraint-based method
Decision tree
Rough sets
Support vector machines and fuzzy techniques
Support vector machines
Hyperplane
Support vectors
Margin
Fuzzy techniques
Conclusion
Exercises
7. Mining Complex Data Objects
Introduction
Structure
Objectives
Mining complex data objects
Time-series data mining
Sequential pattern mining in symbolic sequences
Data mining of biological sequences
Graph pattern mining
Statistical modeling of networks
Mining spatial data
Mining cyber-physical system data
Mining multimedia data
Mining web data
Mining text data
Mining spatiotemporal data
Mining data streams
Spatial databases
Multimedia databases
Multimedia databases and temporal data
Difference between spatial and temporal data
Different types of multimedia applications
Challenges with multimedia database
Architecture for multimedia data mining
Applications of multimedia mining
Time series and sequence data mining
Components of time series
Categories of time series movements
Text databases and mining
World Wide Web
Streaming data processing: An in-depth exploration
Types of windows for aggregating streaming data
Conclusion
Exercises
Index
