Preface |
|
xv | |
Biography |
|
xvii | |
|
|
xix | |
|
|
xxi | |
|
|
1 | (130) |
|
1 Discrete Spike-and-Slab Priors: Models and Computational Aspects |
|
|
3 | (22) |
|
|
|
4 | (1) |
|
1.2 Spike-and-Slab Priors for Linear Regression Models |
|
|
4 | (4) |
|
1.2.1 Stochastic Search MCMC |
|
|
7 | (1) |
|
1.2.2 Prediction via Bayesian Model Averaging |
|
|
8 | (1) |
|
1.3 Spike-and-Slab Priors for Non-Gaussian Data |
|
|
8 | (3) |
|
1.3.1 Compositional Count Data |
|
|
9 | (2) |
|
1.4 Structured Spike-and-Slab Priors for Biomedical Studies |
|
|
11 | (4) |
|
|
13 | (1) |
|
1.4.2 Spiked Nonparametric Priors |
|
|
14 | (1) |
|
1.5 Scalable Bayesian Variable Selection |
|
|
15 | (3) |
|
1.5.1 Variational Inference |
|
|
17 | (1) |
|
|
18 | (1) |
|
|
19 | (6) |
|
2 Recent Theoretical Advances with the Discrete Spike-and-Slab Priors |
|
|
25 | (32) |
|
|
|
|
26 | (1) |
|
2.2 Optimal Recovery in Gaussian Sequence Models |
|
|
27 | (5) |
|
2.2.1 Minimax Rate in Nearly Black Gaussian Mean Models |
|
|
27 | (1) |
|
2.2.2 Optimal Bayesian Recovery in ℓq-norm |
|
|
28 | (3) |
|
2.2.3 Optimal Contraction Rate for Other Variants of Priors |
|
|
31 | (1) |
|
2.2.4 Slow Contraction Rate for Light-tailed Priors |
|
|
32 | (1) |
|
2.3 Sparse Linear Regression Model |
|
|
32 | (9) |
|
2.3.1 Prior Construction and Assumptions |
|
|
33 | (1) |
|
2.3.2 Compatibility Conditions on the Design Matrix |
|
|
34 | (2) |
|
2.3.3 Posterior Contraction Rate |
|
|
36 | (1) |
|
2.3.4 Variable Selection Consistency |
|
|
37 | (1) |
|
2.3.5 Variable Selection with Discrete Spike and Zellner's g-Priors |
|
|
38 | (1) |
|
2.3.6 Bernstein-von Mises Theorem for the Posterior Distribution |
|
|
39 | (2) |
|
2.4 Extension to Generalized Linear Models |
|
|
41 | (9) |
|
2.4.1 Construction of the GLM Family |
|
|
42 | (1) |
|
2.4.2 Clipped GLM and Connections to Regression Settings |
|
|
42 | (2) |
|
2.4.3 Construction of Sparsity Favoring Prior |
|
|
44 | (1) |
|
2.4.4 Assumptions on Data Generating Distribution and Prior |
|
|
45 | (3) |
|
2.4.5 Adaptive Rate-Optimal Posterior Contraction Rate in ℓ1-norm |
|
|
48 | (2) |
|
2.5 Optimality Results for Variational Inference in Linear Regression Models |
|
|
50 | (2) |
|
|
52 | (1) |
|
|
53 | (4) |
|
3 Theoretical and Computational Aspects of Continuous Spike-and-Slab Priors |
|
|
57 | (24) |
|
|
|
58 | (1) |
|
3.2 Variable Selection in Linear Models |
|
|
58 | (2) |
|
3.3 Continuous Spike-and-Slab Priors |
|
|
60 | (1) |
|
3.3.1 Shrinking and Diffusing Priors |
|
|
60 | (1) |
|
3.3.2 Spike-and-Slab LASSO |
|
|
61 | (1) |
|
3.4 Theoretical Properties |
|
|
61 | (7) |
|
3.4.1 Variable Selection Consistency |
|
|
62 | (1) |
|
|
63 | (3) |
|
|
66 | (2) |
|
|
68 | (4) |
|
3.5.1 Skinny Gibbs for Scalable Posterior Sampling |
|
|
68 | (2) |
|
3.5.2 Skinny Gibbs for Non-Normal Spike-and-Slab Priors |
|
|
70 | (2) |
|
|
72 | (1) |
|
|
72 | (5) |
|
|
77 | (4) |
|
4 Spike-and-Slab Meets LASSO: A Review of the Spike-and-Slab LASSO |
|
|
81 | (28) |
|
|
|
|
|
82 | (1) |
|
4.2 Variable Selection in High-Dimensions: Frequentist and Bayesian Strategies |
|
|
83 | (2) |
|
4.2.1 Penalized Likelihood Approaches |
|
|
83 | (1) |
|
4.2.2 Spike-and-Slab Priors |
|
|
84 | (1) |
|
4.3 The Spike-and-Slab LASSO |
|
|
85 | (4) |
|
4.3.1 Prior Specification |
|
|
85 | (1) |
|
4.3.2 Selective Shrinkage and Self-Adaptivity to Sparsity |
|
|
86 | (2) |
|
4.3.3 The Spike-and-Slab LASSO in Action |
|
|
88 | (1) |
|
4.4 Computational Details |
|
|
89 | (3) |
|
4.4.1 Coordinate-wise Optimization |
|
|
89 | (2) |
|
4.4.2 Dynamic Posterior Exploration |
|
|
91 | (1) |
|
4.4.3 EM Implementation of the Spike-and-Slab LASSO |
|
|
91 | (1) |
|
4.5 Uncertainty Quantification |
|
|
92 | (1) |
|
4.5.1 Debiasing the Posterior Mode |
|
|
92 | (1) |
|
4.5.2 Posterior Sampling for the Spike-and-Slab LASSO |
|
|
93 | (1) |
|
|
93 | (3) |
|
4.6.1 Example on Synthetic Data |
|
|
93 | (2) |
|
4.6.2 Bardet-Biedl Syndrome Gene Expression Study |
|
|
95 | (1) |
|
4.7 Methodological Extensions |
|
|
96 | (4) |
|
4.8 Theoretical Properties |
|
|
100 | (1) |
|
|
101 | (2) |
|
|
103 | (6) |
|
5 Adaptive Computational Methods for Bayesian Variable Selection |
|
|
109 | (22) |
|
|
|
|
109 | (3) |
|
5.1.1 Some Reasons to be Cheerful |
|
|
111 | |
|
5.1.2 Adaptive Monte Carlo Methods |
|
|
112 | (1) |
|
5.2 Some Adaptive Approaches to Bayesian Variable Selection |
|
|
112 | (1) |
|
5.3 Two Adaptive Algorithms |
|
|
113 | (4) |
|
|
115 | (1) |
|
5.3.2 Non-Gaussian Models |
|
|
116 | (1) |
|
|
117 | (8) |
|
5.4.1 Simulated Example: Linear Regression |
|
|
117 | (2) |
|
5.4.2 Fine Mapping for Systemic Lupus Erythematosus |
|
|
119 | (2) |
|
5.4.3 Analysing Environmental DNA Data |
|
|
121 | (4) |
|
|
125 | (2) |
|
|
127 | (4) |
|
II Continuous Shrinkage Priors |
|
|
131 | (68) |
|
6 Theoretical Guarantees for the Horseshoe and Other Global-Local Shrinkage Priors |
|
|
133 | (28) |
|
|
|
134 | (3) |
|
|
134 | (1) |
|
6.1.2 Global-Local Shrinkage Priors and Spike-and-Slab Priors |
|
|
134 | (1) |
|
6.1.3 Performance Measures |
|
|
135 | (2) |
|
6.2 Global-Local Shrinkage Priors |
|
|
137 | (2) |
|
|
139 | (10) |
|
6.3.1 Non-Adaptive Posterior Concentration Theorems |
|
|
140 | (3) |
|
|
143 | (2) |
|
6.3.3 Adaptive Posterior Concentration Theorems |
|
|
145 | (2) |
|
6.3.4 Other Sparsity Assumptions |
|
|
147 | (1) |
|
6.3.5 Implications for Practice |
|
|
147 | (2) |
|
6.4 Uncertainty Quantification Guarantees |
|
|
149 | (3) |
|
|
149 | (2) |
|
|
151 | (1) |
|
6.4.3 Implications for Practice |
|
|
152 | (1) |
|
6.5 Variable Selection Guarantees |
|
|
152 | (2) |
|
6.5.1 Thresholding on the Amount of Shrinkage |
|
|
152 | (1) |
|
6.5.2 Checking for Zero in Marginal Credible Intervals |
|
|
153 | (1) |
|
|
154 | (1) |
|
|
155 | (6) |
|
7 MCMC for Global-Local Shrinkage Priors in High-Dimensional Settings |
|
|
161 | (18) |
|
|
|
|
161 | (1) |
|
7.2 Global-Local Shrinkage Priors |
|
|
162 | (1) |
|
|
163 | (8) |
|
7.3.1 Sampling Structured High-Dimensional Gaussians |
|
|
164 | (3) |
|
7.3.2 Blocking can be Advantageous |
|
|
167 | (2) |
|
7.3.3 Geometric Convergence |
|
|
169 | (2) |
|
|
171 | (2) |
|
|
173 | (2) |
|
|
175 | (4) |
|
8 Variable Selection with Shrinkage Priors via Sparse Posterior Summaries |
|
|
179 | (20) |
|
|
|
|
|
180 | (1) |
|
8.2 Penalized Credible Region Selection |
|
|
180 | (6) |
|
|
181 | (1) |
|
8.2.2 Global-Local Shrinkage Priors |
|
|
182 | (2) |
|
8.2.3 Example: Simulation Studies |
|
|
184 | (1) |
|
8.2.4 Example: Mouse Gene Expression Real-time PCR |
|
|
185 | (1) |
|
8.3 Approaches Based on Other Posterior Summaries |
|
|
186 | (1) |
|
8.4 Model Selection for Logistic Regression |
|
|
187 | (1) |
|
8.5 Graphical Model Selection |
|
|
188 | (2) |
|
|
190 | (2) |
|
8.7 Time-Varying Coefficients |
|
|
192 | (1) |
|
|
193 | (2) |
|
|
195 | (4) |
|
III Extensions to Various Modeling Frameworks |
|
|
199 | (150) |
|
9 Bayesian Model Averaging in Causal Inference |
|
|
201 | (26) |
|
|
|
9.1 Introduction to Causal Inference |
|
|
202 | (4) |
|
9.1.1 Potential Outcomes, Estimands, and Identifying Assumptions |
|
|
203 | (1) |
|
9.1.2 Estimation Strategies Using Outcome Regression, Propensity Scores, or Both |
|
|
204 | (1) |
|
9.1.3 Why Use BMA for Causal Inference? |
|
|
205 | (1) |
|
9.2 Failure of Traditional Model Averaging for Causal Inference Problems |
|
|
206 | (2) |
|
9.3 Prior Distributions Tailored Towards Causal Estimation |
|
|
208 | (4) |
|
9.3.1 Bayesian Adjustment for Confounding Prior |
|
|
209 | (2) |
|
9.3.2 Related Prior Distributions that Link Treatment and Outcome Models |
|
|
211 | (1) |
|
9.4 Bayesian Estimation of Treatment Effects |
|
|
212 | (5) |
|
9.4.1 Outcome Model Based Estimation |
|
|
212 | (1) |
|
9.4.2 Incorporating the Propensity Score into the Outcome Model |
|
|
213 | (1) |
|
9.4.3 BMA Coupled with Traditional Frequentist Estimators |
|
|
214 | (1) |
|
9.4.4 Analysis of Volatile Compounds on Cholesterol Levels |
|
|
215 | (2) |
|
9.5 Assessment of Uncertainty |
|
|
217 | (2) |
|
9.6 Extensions to Shrinkage Priors and Nonlinear Regression Models |
|
|
219 | (2) |
|
|
221 | (2) |
|
|
223 | (4) |
|
10 Variable Selection for Hierarchically-Related Outcomes: Models and Algorithms |
|
|
227 | (24) |
|
|
|
|
|
228 | (1) |
|
10.2 Model Formulations, Computational Challenges and Tradeoffs |
|
|
229 | (5) |
|
10.3 Illustrations on Published Case Studies |
|
|
234 | (7) |
|
10.3.1 Modelling eQTL Signals across Multiple Tissues |
|
|
234 | (4) |
|
10.3.2 Modelling eQTL Hotspots under Different Experimental Conditions |
|
|
238 | (3) |
|
|
241 | (4) |
|
|
245 | (6) |
|
11 Bayesian Variable Selection in Spatial Regression Models |
|
|
251 | (20) |
|
|
|
|
251 | (1) |
|
|
252 | (1) |
|
11.3 Regression Coefficients as Spatial Processes |
|
|
253 | (2) |
|
11.3.1 Spatially-Varying Coefficient Model |
|
|
253 | (1) |
|
11.3.2 Scalar-on-Image Regression |
|
|
254 | (1) |
|
11.4 Sparse Spatial Processes |
|
|
255 | (6) |
|
11.4.1 Discrete Mixture Priors |
|
|
256 | (4) |
|
11.4.2 Continuous Shrinkage Priors |
|
|
260 | (1) |
|
11.5 Application to Microbial Fungi across US Households |
|
|
261 | (1) |
|
|
262 | (5) |
|
|
267 | (4) |
|
12 Effect Selection and Regularization in Structured Additive Distributional Regression |
|
|
271 | (26) |
|
|
|
|
|
272 | (1) |
|
12.2 Structured Additive Distributional Regression |
|
|
273 | (4) |
|
12.2.1 Basic Model Structure |
|
|
273 | (2) |
|
12.2.2 Predictor Components |
|
|
275 | (1) |
|
12.2.3 Common Response Distributions |
|
|
276 | (1) |
|
12.2.4 Basic MCMC Algorithm |
|
|
276 | (1) |
|
12.3 Effect Selection Priors |
|
|
277 | (5) |
|
|
277 | (1) |
|
12.3.2 Spike-and-Slab Priors for Effect Selection |
|
|
278 | (3) |
|
12.3.3 Regularization Priors for Effect Selection |
|
|
281 | (1) |
|
12.4 Application: Childhood Undernutrition in India |
|
|
282 | (6) |
|
|
282 | (1) |
|
12.4.2 A Main Effects Location-Scale Model |
|
|
283 | (3) |
|
12.4.3 Decomposing an Interaction Surface |
|
|
286 | (2) |
|
12.5 Other Regularization Priors for Functional Effects |
|
|
288 | (3) |
|
12.5.1 Locally Adaptive Regularization |
|
|
288 | (1) |
|
12.5.2 Shrinkage towards a Functional Subspace |
|
|
289 | (2) |
|
12.6 Summary and Discussion |
|
|
291 | (2) |
|
|
293 | (4) |
|
13 Sparse Bayesian State-Space and Time-Varying Parameter Models |
|
|
297 | (30) |
|
Sylvia Frühwirth-Schnatter |
|
|
|
|
298 | (1) |
|
13.2 Univariate Time-Varying Parameter Models |
|
|
299 | (4) |
|
13.2.1 Motivation and Model Definition |
|
|
299 | (2) |
|
13.2.2 The Inverse Gamma Versus the Ridge Prior |
|
|
301 | (2) |
|
13.2.3 Gibbs Sampling in the Non-Centered Parametrization |
|
|
303 | (1) |
|
13.3 Continuous Shrinkage Priors for Sparse TVP Models |
|
|
303 | (7) |
|
13.3.1 From the Ridge Prior to Continuous Shrinkage Priors |
|
|
303 | (4) |
|
13.3.2 Efficient MCMC Inference |
|
|
307 | (1) |
|
13.3.3 Application to US Inflation Modelling |
|
|
308 | (2) |
|
13.4 Spike-and-Slab Priors for Sparse TVP Models |
|
|
310 | (5) |
|
13.4.1 From the Ridge Prior to Spike-and-Slab Priors |
|
|
310 | (3) |
|
|
313 | (1) |
|
13.4.3 Application to US Inflation Modelling |
|
|
314 | (1) |
|
|
315 | (6) |
|
13.5.1 Including Stochastic Volatility |
|
|
315 | (1) |
|
13.5.2 Sparse TVP Models for Multivariate Time Series |
|
|
316 | (2) |
|
13.5.3 Non-Gaussian Outcomes |
|
|
318 | (1) |
|
13.5.4 Log Predictive Scores for Comparing Shrinkage Priors |
|
|
318 | (2) |
|
13.5.5 BMA Versus Continuous Shrinkage Priors |
|
|
320 | (1) |
|
|
321 | (2) |
|
|
323 | (4) |
|
14 Bayesian Estimation of Single and Multiple Graphs |
|
|
327 | (22) |
|
|
|
|
328 | (1) |
|
14.2 Bayesian Approaches for Single Graph Estimation |
|
|
328 | (3) |
|
14.2.1 Background on Graphical Models |
|
|
328 | (1) |
|
14.2.2 Bayesian Priors for Undirected Networks |
|
|
329 | (1) |
|
14.2.3 Bayesian Priors for Directed Networks |
|
|
330 | (1) |
|
14.2.4 Bayesian Network Inference for Non-Gaussian Data |
|
|
331 | (1) |
|
14.3 Multiple Graphs with Shared Structure |
|
|
331 | (5) |
|
|
332 | (1) |
|
|
332 | (1) |
|
14.3.3 Simulation and Case Studies |
|
|
333 | (3) |
|
|
336 | (1) |
|
14.4 Multiple Graphs with Shared Edge Values |
|
|
336 | (5) |
|
|
337 | (1) |
|
|
337 | (2) |
|
14.4.3 Analysis of Neuroimaging Data |
|
|
339 | (2) |
|
14.5 Multiple DAGs and Other Multiple Graph Approaches |
|
|
341 | (1) |
|
|
342 | (1) |
|
|
343 | (2) |
|
|
345 | (4) |
|
IV Other Approaches to Bayesian Variable Selection |
|
|
349 | (112) |
|
15 Bayes Factors Based on g-Priors for Variable Selection |
|
|
351 | (20) |
|
|
|
|
351 | (3) |
|
15.2 Variable Selection in the Gaussian Linear Model |
|
|
354 | (9) |
|
15.2.1 Objective Prior Specifications |
|
|
354 | (2) |
|
|
356 | (1) |
|
15.2.3 BayesVarSel and Applications |
|
|
357 | (4) |
|
15.2.4 Sensitivity to Prior Inputs |
|
|
361 | (2) |
|
15.3 Variable Selection for Non-Gaussian Data |
|
|
363 | (2) |
|
15.3.1 glmBfp and Applications |
|
|
364 | (1) |
|
|
365 | (2) |
|
|
367 | (4) |
|
16 Balancing Sparsity and Power: Likelihoods, Priors, and Misspecification |
|
|
371 | (24) |
|
|
|
|
372 | (1) |
|
16.2 BMS in Regression Models |
|
|
373 | (2) |
|
16.3 Interpreting BMS Under Misspecification |
|
|
375 | (1) |
|
|
376 | (1) |
|
16.5 Prior Elicitation and Robustness |
|
|
377 | (1) |
|
16.6 Validity of Model Selection Uncertainty |
|
|
378 | (1) |
|
16.7 Finite-Dimensional Results |
|
|
379 | (1) |
|
16.8 High-Dimensional Results |
|
|
380 | (2) |
|
16.9 Balancing Sparsity and Power |
|
|
382 | (3) |
|
|
385 | (4) |
|
|
385 | (2) |
|
|
387 | (1) |
|
16.10.3 Survival Analysis of Serum Free Light Chain Data |
|
|
388 | (1) |
|
|
389 | (2) |
|
|
391 | (4) |
|
17 Variable Selection and Interaction Detection with Bayesian Additive Regression Trees |
|
|
395 | (20) |
|
|
|
|
|
|
396 | (1) |
|
|
396 | (3) |
|
17.2.1 Specification of the BART Regularization Prior |
|
|
397 | (1) |
|
17.2.2 Posterior Calculation and Information Extraction |
|
|
398 | (1) |
|
17.3 Model-Free Variable Selection with BART |
|
|
399 | (4) |
|
17.3.1 Variable Selection with the Boston Housing Data |
|
|
400 | (3) |
|
17.4 Model-Free Interaction Detection with BART |
|
|
403 | (2) |
|
17.4.1 Variable Selection and Interaction Detection with the Friedman Simulation Setup |
|
|
403 | (1) |
|
17.4.2 Interaction Detection with the Boston Housing Data |
|
|
404 | (1) |
|
17.5 A Utility Based Approach to Variable Selection using BART Inference |
|
|
405 | (5) |
|
17.5.1 Step 1: BART Inference |
|
|
407 | (1) |
|
17.5.2 Step 2: Subset Search |
|
|
407 | (1) |
|
17.5.3 Step 3: Uncertainty Assessment |
|
|
408 | (2) |
|
|
410 | (3) |
|
|
413 | (2) |
|
18 Variable Selection for Bayesian Decision Tree Ensembles |
|
|
415 | (26) |
|
|
|
|
416 | (2) |
|
|
416 | (1) |
|
18.1.2 Possible Strategies |
|
|
417 | (1) |
|
18.2 Bayesian Additive Regression Trees |
|
|
418 | (3) |
|
18.2.1 Decision Trees and their Priors |
|
|
418 | (2) |
|
|
420 | (1) |
|
18.3 Variable Importance Scores |
|
|
421 | (3) |
|
18.3.1 Empirical Bayes and Variable Importance Scores |
|
|
421 | (3) |
|
18.4 Sparsity Inducing Priors on s |
|
|
424 | (6) |
|
18.4.1 The Uniform Prior on s |
|
|
424 | (1) |
|
18.4.2 The Dirichlet Prior |
|
|
424 | (2) |
|
18.4.3 The Spike-and-Forest Prior |
|
|
426 | (2) |
|
18.4.4 Finite Gibbs Priors |
|
|
428 | (2) |
|
18.5 An Illustration: The WIPP Dataset |
|
|
430 | (3) |
|
|
433 | (2) |
|
18.6.1 Interaction Detection |
|
|
433 | (2) |
|
18.6.2 Structure in Predictors |
|
|
435 | (1) |
|
|
435 | (2) |
|
|
437 | (4) |
|
19 Stochastic Partitioning for Variable Selection in Multivariate Mixture of Regression Models |
|
|
441 | (20) |
|
|
|
|
442 | (1) |
|
19.2 Mixture of Univariate Regression Models |
|
|
443 | (3) |
|
|
443 | (2) |
|
19.2.2 Variable Selection |
|
|
445 | (1) |
|
19.3 Stochastic Partitioning for Multivariate Mixtures |
|
|
446 | (3) |
|
|
446 | (1) |
|
19.3.2 Prior Specification |
|
|
446 | (2) |
|
|
448 | (1) |
|
19.3.4 Posterior Inference |
|
|
448 | (1) |
|
19.4 Spavs and Application |
|
|
449 | (3) |
|
19.4.1 Choice of Hyperparameters and Other Input Values |
|
|
449 | (1) |
|
19.4.2 Post-Processing of MCMC Output and Posterior Inference |
|
|
450 | (2) |
|
|
452 | (5) |
|
|
457 | (4) |
Index |
|
461 | |