Preface |
|
ix | |
Notation |
|
xiii | |
|
|
1 | (14) |
|
1.1 Formalization of Optimization |
|
|
2 | (3) |
|
1.2 The Bayesian Approach |
|
|
5 | (10) |
|
|
15 | (30) |
|
2.1 Definition and Basic Properties |
|
|
16 | (2) |
|
2.2 Inference with Exact and Noisy Observations |
|
|
18 | (8) |
|
2.3 Overview of Remainder of Chapter |
|
|
26 | (1) |
|
2.4 Joint Gaussian Processes |
|
|
26 | (2) |
|
|
28 | (2) |
|
|
30 | (3) |
|
2.7 Existence and Uniqueness of Global Maxima |
|
|
33 | (2) |
|
2.8 Inference with Non-Gaussian Observations and Constraints |
|
|
35 | (6) |
|
2.9 Summary of Major Ideas |
|
|
41 | (4) |
|
3 Modeling With Gaussian Processes |
|
|
45 | (22) |
|
3.1 The Prior Mean Function |
|
|
46 | (3) |
|
3.2 The Prior Covariance Function |
|
|
49 | (2) |
|
3.3 Notable Covariance Functions |
|
|
51 | (3) |
|
3.4 Modifying and Combining Covariance Functions |
|
|
54 | (7) |
|
3.5 Modeling Functions on High-Dimensional Domains |
|
|
61 | (3) |
|
3.6 Summary of Major Ideas |
|
|
64 | (3) |
|
4 Model Assessment, Selection, And Averaging |
|
|
67 | (20) |
|
4.1 Models and Model Structures |
|
|
68 | (2) |
|
4.2 Bayesian Inference over Parametric Model Spaces |
|
|
70 | (3) |
|
4.3 Model Selection via Posterior Maximization |
|
|
73 | (1) |
|
|
74 | (4) |
|
4.5 Multiple Model Structures |
|
|
78 | (3) |
|
4.6 Automating Model Structure Search |
|
|
81 | (3) |
|
4.7 Summary of Major Ideas |
|
|
84 | (3) |
|
5 Decision Theory For Optimization |
|
|
87 | (22) |
|
5.1 Introduction to Bayesian Decision Theory |
|
|
89 | (2) |
|
5.2 Sequential Decisions with a Fixed Budget |
|
|
91 | (8) |
|
5.3 Cost and Approximation of the Optimal Policy |
|
|
99 | (4) |
|
5.4 Cost-Aware Optimization and Termination as a Decision |
|
|
103 | (3) |
|
5.5 Summary of Major Ideas |
|
|
106 | (3) |
|
6 Utility Functions For Optimization |
|
|
109 | (14) |
|
6.1 Expected Utility of Terminal Recommendation |
|
|
109 | (5) |
|
|
114 | (1) |
|
|
115 | (1) |
|
6.4 Dependence on Model of Objective Function |
|
|
116 | (1) |
|
6.5 Comparison of Utility Functions |
|
|
117 | (2) |
|
6.6 Summary of Major Ideas |
|
|
119 | (4) |
|
7 Common Bayesian Optimization Policies |
|
|
123 | (34) |
|
7.1 Example Optimization Scenario |
|
|
124 | (1) |
|
7.2 Decision-Theoretic Policies |
|
|
124 | (3) |
|
|
127 | (2) |
|
|
129 | (2) |
|
7.5 Probability of Improvement |
|
|
131 | (4) |
|
7.6 Mutual Information and Entropy Search |
|
|
135 | (6) |
|
7.7 Multi-Armed Bandits and Optimization |
|
|
141 | (4) |
|
7.8 Maximizing a Statistical Upper Bound |
|
|
145 | (3) |
|
|
148 | (2) |
|
7.10 Other Ideas in Policy Construction |
|
|
150 | (6) |
|
7.11 Summary of Major Ideas |
|
|
156 | (1) |
|
8 Computing Policies With Gaussian Processes |
|
|
157 | (44) |
|
8.1 Notation for Objective Function Model |
|
|
157 | (1) |
|
|
158 | (9) |
|
8.3 Probability of Improvement |
|
|
167 | (3) |
|
8.4 Upper Confidence Bound |
|
|
170 | (1) |
|
8.5 Approximate Computation for One-Step Lookahead |
|
|
171 | (1) |
|
|
172 | (4) |
|
|
176 | (4) |
|
8.8 Mutual Information with x* |
|
|
180 | (7) |
|
8.9 Mutual Information with f* |
|
|
187 | (5) |
|
8.10 Averaging over a Space of Gaussian Processes |
|
|
192 | (4) |
|
8.11 Alternative Models: Bayesian Neural Networks, etc. |
|
|
196 | (4) |
|
8.12 Summary of Major Ideas |
|
|
200 | (1) |
|
|
201 | (12) |
|
9.1 Gaussian Process Inference, Scaling, and Approximation |
|
|
201 | (6) |
|
9.2 Optimizing Acquisition Functions |
|
|
207 | (3) |
|
9.3 Starting and Stopping Optimization |
|
|
210 | (2) |
|
9.4 Summary of Major Ideas |
|
|
212 | (1) |
|
|
213 | (32) |
|
|
213 | (2) |
|
10.2 Useful Function Spaces for Studying Convergence |
|
|
215 | (5) |
|
10.3 Relevant Properties of Covariance Functions |
|
|
220 | (4) |
|
10.4 Bayesian Regret with Observation Noise |
|
|
224 | (8) |
|
10.5 Worst-Case Regret with Observation Noise |
|
|
232 | (5) |
|
10.6 The Exact Observation Case |
|
|
237 | (4) |
|
10.7 The Effect of Unknown Hyperparameters |
|
|
241 | (2) |
|
10.8 Summary of Major Ideas |
|
|
243 | (2) |
|
11 Extensions And Related Settings |
|
|
245 | (42) |
|
11.1 Unknown Observation Costs |
|
|
245 | (4) |
|
11.2 Constrained Optimization and Unknown Constraints |
|
|
249 | (3) |
|
11.3 Synchronous Batch Observations |
|
|
252 | (10) |
|
11.4 Asynchronous Observation with Pending Experiments |
|
|
262 | (1) |
|
11.5 Multifidelity Optimization |
|
|
263 | (3) |
|
11.6 Multitask Optimization |
|
|
266 | (3) |
|
11.7 Multiobjective Optimization |
|
|
269 | (7) |
|
11.8 Gradient Observations |
|
|
276 | (1) |
|
11.9 Stochastic and Robust Optimization |
|
|
277 | (4) |
|
11.10 Incremental Optimization of Sequential Procedures |
|
|
281 | (1) |
|
11.11 Non-Gaussian Observation Models and Active Search |
|
|
282 | (3) |
|
|
285 | (2) |
|
12 A Brief History Of Bayesian Optimization |
|
|
287 | (8) |
|
12.1 Historical Precursors and Optimal Design |
|
|
287 | (1) |
|
12.2 Sequential Analysis and Bayesian Experimental Design |
|
|
287 | (2) |
|
12.3 The Rise of Bayesian Optimization |
|
|
289 | (1) |
|
12.4 Later Rediscovery and Development |
|
|
290 | (2) |
|
12.5 Multi-Armed Bandits to Infinite-Armed Bandits |
|
|
292 | (2) |
|
|
294 | (1) |
A The Gaussian Distribution |
|
295 | (6) |
B Methods For Approximate Bayesian Inference |
|
301 | (6) |
C Gradients |
|
307 | (6) |
D Annotated Bibliography Of Applications |
|
313 | (18) |
References |
|
331 | (22) |
Index |
|
353 | |