Machine Learning with the Elastic Stack: Gain valuable insights from your data with Elastic Stack's machine learning features [2 ed.] 9781801070034

Elastic Stack, previously known as the ELK stack, is a log analysis solution that helps users ingest, process, and analy

English, 437 pages, 2021

Table of contents:
Cover
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
Section 1 – Getting Started with Machine Learning with Elastic Stack
Chapter 1: Machine Learning for IT
Overcoming the historical challenges in IT
Dealing with the plethora of data
The advent of automated anomaly detection
Unsupervised versus supervised ML
Using unsupervised ML for anomaly detection
Defining unusual
Learning what's normal
Probability models
Learning the models
De-trending
Scoring of unusualness
The element of time
Applying supervised ML to data frame analytics
The process of supervised learning
Summary
Chapter 2: Enabling and Operationalization
Technical requirements
Enabling Elastic ML features
Enabling ML on a self-managed cluster
Enabling ML in the cloud – Elasticsearch Service
Understanding operationalization
ML nodes
Jobs
Bucketing data in a time series analysis
Feeding data to Elastic ML
The supporting indices
Anomaly detection orchestration
Anomaly detection model snapshots
Summary
Section 2 – Time Series Analysis – Anomaly Detection and Forecasting
Chapter 3: Anomaly Detection
Technical requirements
Elastic ML job types
Dissecting the detector
The function
The field
The partition field
The by field
The over field
The "formula"
Exploring the count functions
Other counting functions
Detecting changes in metric values
Metric functions
Understanding the advanced detector functions
rare
Frequency rare
Information content
Geographic
Time
Splitting analysis along categorical features
Setting the split field
The difference between splitting using partition and by_field
Understanding temporal versus population analysis
Categorization analysis of unstructured messages
Types of messages that are good candidates for categorization
The process used by categorization
Analyzing the categories
Categorization job example
When to avoid using categorization
Managing Elastic ML via the API
Summary
Chapter 4: Forecasting
Technical requirements
Contrasting forecasting with prophesying
Forecasting use cases
Forecasting theory of operation
Single time series forecasting
Looking at forecast results
Multiple time series forecasting
Summary
Chapter 5: Interpreting Results
Technical requirements
Viewing the Elastic ML results index
Anomaly scores
Bucket-level scoring
Normalization
Influencer-level scoring
Influencers
Record-level scoring
Results index schema details
Bucket results
Record results
Influencer results
Multi-bucket anomalies
Multi-bucket anomaly example
Multi-bucket scoring
Forecast results
Querying for forecast results
Results API
Results API endpoints
Getting the overall buckets API
Getting the categories API
Custom dashboards and Canvas workpads
Dashboard "embeddables"
Anomalies as annotations in TSVB
Customizing Canvas workpads
Summary
Chapter 6: Alerting on ML Analysis
Technical requirements
Understanding alerting concepts
Anomalies are not necessarily alerts
In real-time alerting, timing matters
Building alerts from the ML UI
Defining sample anomaly detection jobs
Creating alerts against the sample jobs
Simulating some real-time anomalous behavior
Receiving and reviewing the alerts
Creating an alert with a watch
Understanding the anatomy of the legacy default ML watch
Custom watches can offer some unique functionality
Summary
Chapter 7: AIOps and Root Cause Analysis
Technical requirements
Demystifying the term "AIOps"
Understanding the importance and limitations of KPIs
Moving beyond KPIs
Organizing data for better analysis
Custom queries for anomaly detection datafeeds
Data enrichment on ingest
Leveraging the contextual information
Analysis splits
Statistical influencers
Bringing it all together for RCA
Outage background
Correlation and shared influencers
Summary
Chapter 8: Anomaly Detection in Other Elastic Stack Apps
Technical requirements
Anomaly detection in Elastic APM
Enabling anomaly detection for APM
Viewing the anomaly detection job results in the APM UI
Creating ML Jobs via the data recognizer
Anomaly detection in the Logs app
Log categories
Log anomalies
Anomaly detection in the Metrics app
Anomaly detection in the Uptime app
Anomaly detection in the Elastic Security app
Prebuilt anomaly detection jobs
Anomaly detection jobs as detection alerts
Summary
Section 3 – Data Frame Analysis
Chapter 9: Introducing Data Frame Analytics
Technical requirements
Learning how to use transforms
Why are transforms useful?
The anatomy of a transform
Using transforms to analyze e-commerce orders
Exploring more advanced pivot and aggregation configurations
Discovering the difference between batch and continuous transforms
Analyzing social media feeds using continuous transforms
Using Painless for advanced transform configurations
Introducing Painless
Working with Python and Elasticsearch
A brief tour of the Python Elasticsearch clients
Summary
Further reading
Chapter 10: Outlier Detection
Technical requirements
Discovering the four techniques used for outlier detection
Understanding feature influence
How does outlier detection differ from anomaly detection?
Applying outlier detection in practice
Evaluating outlier detection with the Evaluate API
Hyperparameter tuning for outlier detection
Summary
Chapter 11: Classification Analysis
Technical requirements
Classification: from data to a trained model
Feature engineering
Evaluating the model
Taking your first steps with classification
Classification under the hood: gradient boosted decision trees
Introduction to decision trees
Gradient boosted decision trees
Hyperparameters
Interpreting results
Summary
Further reading
Chapter 12: Regression
Technical requirements
Using regression analysis to predict house prices
Using decision trees for regression
Summary
Further reading
Chapter 13: Inference
Technical requirements
Examining, exporting, and importing your trained models with the Trained Models API
A tour of the Trained Models API
Exporting and importing trained models with the Trained Models API and Python
Understanding inference processors and ingest pipelines
Handling missing or corrupted data in ingest pipelines
Using inference processor configuration options to gain more insight into your predictions
Importing external models into Elasticsearch using eland
Learning about supported external models in eland
Training a scikit-learn DecisionTreeClassifier and importing it into Elasticsearch using eland
Summary
Appendix: Anomaly Detection Tips
Technical requirements
Understanding influencers in split versus non-split jobs
Using one-sided functions to your advantage
Ignoring time periods
Ignoring an upcoming (known) window of time
Ignoring an unexpected window of time, after the fact
Using custom rules and filters to your advantage
Creating custom rules
Benefiting from custom rules for a "top-down" alerting philosophy
Anomaly detection job throughput considerations
Avoiding the over-engineering of a use case
Using anomaly detection on runtime fields
Summary
Why subscribe?
About Packt
Other Books You May Enjoy
Index
