Bayesian Optimization in Action [Team-IRA] (ISBN: 1633439070, 9781633439078)

Bayesian optimization helps pinpoint the best configuration for your machine learning models with speed and accuracy.
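For a sense of what that looks like in practice, here is a minimal sketch of the loop the book builds up (a Gaussian process surrogate plus a decision-making policy), written with the BoTorch and GPyTorch libraries the later chapters use. The objective function, search bounds, and loop length below are illustrative assumptions, not material from the book.

    # Minimal BayesOpt loop: fit a GP to the data seen so far, then let an
    # acquisition policy (here, Expected Improvement) pick the next query.
    import torch
    from botorch.models import SingleTaskGP
    from botorch.fit import fit_gpytorch_mll
    from botorch.acquisition import ExpectedImprovement
    from botorch.optim import optimize_acqf
    from gpytorch.mlls import ExactMarginalLogLikelihood

    torch.set_default_dtype(torch.float64)

    def objective(x):
        # Stand-in for an expensive black box, e.g., validation accuracy
        # as a function of one hyperparameter (hypothetical example).
        return torch.sin(5 * x) * (1 - torch.tanh(x ** 2))

    bounds = torch.tensor([[-2.0], [2.0]])  # one-dimensional search space
    train_x = torch.rand(5, 1) * 4 - 2      # a few initial random queries
    train_y = objective(train_x)

    for _ in range(10):
        gp = SingleTaskGP(train_x, train_y)              # GP surrogate
        fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
        policy = ExpectedImprovement(gp, best_f=train_y.max())
        next_x, _ = optimize_acqf(policy, bounds=bounds, q=1,
                                  num_restarts=10, raw_samples=100)
        train_x = torch.cat([train_x, next_x])           # query and record
        train_y = torch.cat([train_y, objective(next_x)])

    print("best x found:", train_x[train_y.argmax()].item())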

Language: English · Pages: 424 [426] · Year: 2023


  • Commentary: Thanks to Team-IRA for the True PDF

Table of contents:
Bayesian Optimization in Action
contents
forewords
preface
acknowledgments
about this book
Who should read this book?
How this book is organized: A roadmap
About the code
liveBook discussion forum
about the author
About the technical editor
about the cover illustration
1 Introduction to Bayesian optimization
1.1 Finding the optimum of an expensive black box function
1.1.1 Hyperparameter tuning as an example of an expensive black box optimization problem
1.1.2 The problem of expensive black box optimization
1.1.3 Other real-world examples of expensive black box optimization problems
1.2 Introducing Bayesian optimization
1.2.1 Modeling with a Gaussian process
1.2.2 Making decisions with a BayesOpt policy
1.2.3 Combining the GP and the optimization policy to form the optimization loop
1.2.4 BayesOpt in action
1.3 What will you learn in this book?
Summary
Part 1—Modeling with Gaussian processes
2 Gaussian processes as distributions over functions
2.1 How to sell your house the Bayesian way
2.2 Modeling correlations with multivariate Gaussian distributions and Bayesian updates
2.2.1 Using multivariate Gaussian distributions to jointly model multiple variables
2.2.2 Updating MVN distributions
2.2.3 Modeling many variables with high-dimensional Gaussian distributions
2.3 Going from a finite to an infinite Gaussian
2.4 Implementing GPs in Python
2.4.1 Setting up the training data
2.4.2 Implementing a GP class
2.4.3 Making predictions with a GP
2.4.4 Visualizing predictions of a GP
2.4.5 Going beyond one-dimensional objective functions
2.5 Exercise
Summary
3 Customizing a Gaussian process with the mean and covariance functions
3.1 The importance of priors in Bayesian models
3.2 Incorporating what you already know into a GP
3.3 Defining the functional behavior with the mean function
3.3.1 Using the zero mean function as the base strategy
3.3.2 Using the constant function with gradient descent
3.3.3 Using the linear function with gradient descent
3.3.4 Using the quadratic function by implementing a custom mean function
3.4 Defining variability and smoothness with the covariance function
3.4.1 Setting the scales of the covariance function
3.4.2 Controlling smoothness with different covariance functions
3.4.3 Modeling different levels of variability with multiple length scales
3.5 Exercise
Summary
Part 2—Making decisions with Bayesian optimization
4 Refining the best result with improvement-based policies
4.1 Navigating the search space in BayesOpt
4.1.1 The BayesOpt loop and policies
4.1.2 Balancing exploration and exploitation
4.2 Finding improvement in BayesOpt
4.2.1 Measuring improvement with a GP
4.2.2 Computing the Probability of Improvement
4.2.3 Running the PoI policy
4.3 Optimizing the expected value of improvement
4.4 Exercises
4.4.1 Exercise 1: Encouraging exploration with PoI
4.4.2 Exercise 2: BayesOpt for hyperparameter tuning
Summary
5 Exploring the search space with bandit-style policies
5.1 Introduction to the MAB problem
5.1.1 Finding the best slot machine at a casino
5.1.2 From MAB to BayesOpt
5.2 Being optimistic under uncertainty with the Upper Confidence Bound policy
5.2.1 Optimism under uncertainty
5.2.2 Balancing exploration and exploitation
5.2.3 Implementation with BoTorch
5.3 Smart sampling with the Thompson sampling policy
5.3.1 One sample to represent the unknown
5.3.2 Implementation with BoTorch
5.4 Exercises
5.4.1 Exercise 1: Setting an exploration schedule for the UCB
5.4.2 Exercise 2: BayesOpt for hyperparameter tuning
Summary
6 Using information theory with entropy-based policies
6.1 Measuring knowledge with information theory
6.1.1 Measuring uncertainty with entropy
6.1.2 Looking for a remote control using entropy
6.1.3 Binary search using entropy
6.2 Entropy search in BayesOpt
6.2.1 Searching for the optimum using information theory
6.2.2 Implementing entropy search with BoTorch
6.3 Exercises
6.3.1 Exercise 1: Incorporating prior knowledge into entropy search
6.3.2 Exercise 2: BayesOpt for hyperparameter tuning
Summary
Part 3—Extending Bayesian optimization to specialized settings
7 Maximizing throughput with batch optimization
7.1 Making multiple function evaluations simultaneously
7.1.1 Making use of all available resources in parallel
7.1.2 Why can’t we use regular BayesOpt policies in the batch setting?
7.2 Computing the improvement and upper confidence bound of a batch of points
7.2.1 Extending optimization heuristics to the batch setting
7.2.2 Implementing batch improvement and UCB policies
7.3 Exercise 1: Extending TS to the batch setting via resampling
7.4 Computing the value of a batch of points using information theory
7.4.1 Finding the most informative batch of points with cyclic refinement
7.4.2 Implementing batch entropy search with BoTorch
7.5 Exercise 2: Optimizing airplane designs
Summary
8 Satisfying extra constraints with constrained optimization
8.1 Accounting for constraints in a constrained optimization problem
8.1.1 Constraints can change the solution of an optimization problem
8.1.2 The constraint-aware BayesOpt framework
8.2 Constraint-aware decision-making in BayesOpt
8.3 Exercise 1: Manual computation of constrained EI
8.4 Implementing constrained EI with BoTorch
8.5 Exercise 2: Constrained optimization of airplane design
Summary
9 Balancing utility and cost with multifidelity optimization
9.1 Using low-fidelity approximations to study expensive phenomena
9.2 Multifidelity modeling with GPs
9.2.1 Formatting a multifidelity dataset
9.2.2 Training a multifidelity GP
9.3 Balancing information and cost in multifidelity optimization
9.3.1 Modeling the costs of querying different fidelities
9.3.2 Optimizing the amount of information per dollar to guide optimization
9.4 Measuring performance in multifidelity optimization
9.5 Exercise 1: Visualizing average performance in multifidelity optimization
9.6 Exercise 2: Multifidelity optimization with multiple low-fidelity approximations
Summary
10 Learning from pairwise comparisons with preference optimization
10.1 Black-box optimization with pairwise comparisons
10.2 Formulating a preference optimization problem and formatting pairwise comparison data
10.3 Training a preference-based GP
10.4 Preference optimization by playing king of the hill
Summary
11 Optimizing multiple objectives at the same time
11.1 Balancing multiple optimization objectives with BayesOpt
11.2 Finding the boundary of the most optimal data points
11.3 Seeking to improve the optimal data boundary
11.4 Exercise: Multiobjective optimization of airplane design
Summary
Part 4—Special Gaussian process models
12 Scaling Gaussian processes to large datasets
12.1 Training a GP on a large dataset
12.1.1 Setting up the learning task
12.1.2 Training a regular GP
12.1.3 Problems with training a regular GP
12.2 Automatically choosing representative points from a large dataset
12.2.1 Minimizing the difference between two GPs
12.2.2 Training the model in small batches
12.2.3 Implementing the approximate model
12.3 Optimizing better by accounting for the geometry of the loss surface
12.4 Exercise
Summary
13 Combining Gaussian processes with neural networks
13.1 Data that contains structures
13.2 Capturing similarity within structured data
13.2.1 Using a kernel with GPyTorch
13.2.2 Working with images in PyTorch
13.2.3 Computing the covariance of two images
13.2.4 Training a GP on image data
13.3 Using neural networks to process complex structured data
13.3.1 Why use neural networks for modeling?
13.3.2 Implementing the combined model in GPyTorch
Summary
Appendix—Solutions to the exercises
A.1 Chapter 2: Gaussian processes as distributions over functions
A.2 Chapter 3: Incorporating prior knowledge with the mean and covariance functions
A.3 Chapter 4: Refining the best result with improvement-based policies
A.3.1 Exercise 1: Encouraging exploration with Probability of Improvement
A.3.2 Exercise 2: BayesOpt for hyperparameter tuning
A.4 Chapter 5: Exploring the search space with bandit-style policies
A.4.1 Exercise 1: Setting an exploration schedule for Upper Confidence Bound
A.4.2 Exercise 2: BayesOpt for hyperparameter tuning
A.5 Chapter 6: Using information theory with entropy-based policies
A.5.1 Exercise 1: Incorporating prior knowledge into entropy search
A.5.2 Exercise 2: BayesOpt for hyperparameter tuning
A.6 Chapter 7: Maximizing throughput with batch optimization
A.6.1 Exercise 1: Extending TS to the batch setting via resampling
A.6.2 Exercise 2: Optimizing airplane designs
A.7 Chapter 8: Satisfying extra constraints with constrained optimization
A.7.1 Exercise 1: Manual computation of constrained EI
A.7.2 Exercise 2: Constrained optimization of airplane design
A.8 Chapter 9: Balancing utility and cost with multifidelity optimization
A.8.1 Exercise 1: Visualizing average performance in multifidelity optimization
A.8.2 Exercise 2: Multifidelity optimization with multiple low-fidelity approximations
A.9 Chapter 11: Optimizing multiple objectives at the same time
A.10 Chapter 12: Scaling Gaussian processes to large datasets
index
