Bayesian Modeling of Spatio-Temporal Data with R [Hardback]

(University of Southampton)
  • Format: Hardback, 434 pages, height x width: 234x156 mm, weight: 1900 g, 60 Tables, black and white; 79 Line drawings, color; 20 Line drawings, black and white; 79 Illustrations, color; 20 Illustrations, black and white
  • Series: Chapman & Hall/CRC Interdisciplinary Statistics
  • Publication date: 02-Mar-2022
  • Publisher: Chapman & Hall/CRC
  • ISBN-10: 0367277980
  • ISBN-13: 9780367277987
Applied sciences, both physical and social, such as atmospheric, biological, climate, demographic, economic, ecological, environmental, oceanic and political, routinely gather large volumes of spatial and spatio-temporal data in order to make wide-ranging inference and prediction. Ideally such inferential tasks should be approached through modelling, which aids in estimation of uncertainties in all conclusions drawn from such data. Unified Bayesian modelling, implemented through user-friendly software packages, provides a crucial key to unlocking the full power of these methods for solving challenging practical problems.

Key features of the book:

Accessible, detailed discussion of most aspects of Bayesian methods and computations, with worked examples, numerical illustrations and exercises

A spatial statistics jargon buster chapter that enables the reader to build up a vocabulary without getting clouded in modeling and technicalities

Computation and modeling illustrations are provided with the help of the dedicated R package bmstdr, allowing the reader to use well-known packages and platforms such as rstan, INLA, spBayes, spTimer, spTDyn, CARBayes and CARBayesST

Included are R code notes detailing the algorithms used to produce all the tables and figures, with data and code available via an online supplement

Two dedicated chapters discuss practical examples of spatio-temporal modeling of point referenced and areal unit data

Throughout, the emphasis is on validating models by splitting data into training and test sets, following the philosophy of machine learning and data science

This book is designed to make spatio-temporal modeling and analysis accessible and understandable to a wide audience of students and researchers, from mathematicians and statisticians to practitioners in the applied sciences. It presents most of the modeling with the help of R commands written in a purposefully developed R package to facilitate spatio-temporal modeling. It does not compromise on rigour, as it presents the underlying theories of Bayesian inference and computation in standalone chapters, which will appeal to those interested in the theoretical details. By avoiding hard-core mathematics and calculus, this book aims to be a bridge that closes the statistical knowledge gap among applied scientists.

Reviews

"This book is a fine addition to the literature on linear modelling of spatio-temporal data, both geostatistical and areal unit; the linkage to the author's R package bmstdr is particularly useful." Peter Diggle (Lancaster University, UK)

"This book provides a heroic solo effort by an author who is at the top of the game in Bayesian spatio-temporal analysis. The author is a leader in this community with regard to computation for fitting these demanding hierarchical models. This volume enables applied researchers to implement sound Bayesian modeling, rather than "procedure-based" analysis, to address challenging spatio-temporal issues. The fact that it emphasizes modeling in building bridges to the practitioner's application is one of its strongest virtues. The book is well illustrated with lots of graphics and boxes of code, doing this primarily within bmstdr and ggplot, two well-developed R-packages. Attractively, the book emphasizes model assessment and comparison in predictive space, a necessity with spatio-temporal data. The book's accessibility is much appreciated, exemplified by a useful "jargon" chapter for basic ideas, complemented with suitable figures. In summary, there are several competitors out there now but this book finds its own place in terms of bringing state-of-the-art modeling approachably to exigent application." Alan Gelfand, Duke University, USA

"This book fills an essential gap in the literature about spatial-temporal data modelling. It provides a valuable gentle introduction to the theory and current practice of Bayesian modelling without the need for the reader to fully master the deep statistical theories underpinned by rigorous calculus-based mathematics. Every topic in the book is linked to elaborations in R that take the reader to the practical level quickly. The book provides valuable insights on all the steps of spatial-temporal data analysis, from the initial exploration to the more refined models. The language is not too technical, and the students will really appreciate chapter 2, Jargon of Spatial and Spatio-Temporal Modelling, summarising all relevant definitions in the field. I teach a class on spatial statistics, and I will be happy to use this book as a suggested textbook." Giovanna Jona-Lasinio, Sapienza University of Rome, Italy

"Bayesian spatio-temporal modelling is a complex research field with a daunting array of potential models to choose between and software packages to use. This book is an invaluable guide to statisticians and non-statisticians alike who are new to spatio-temporal modelling, by providing them with an accessible introduction to both Bayesian modelling ideas and the array of different types of spatio-temporal data structures and models that are available. Key to this is the array of practical examples that are illustrated throughout the book, which along with the discussion of the software options for fitting these models will enable others new to the field to easily apply the methods to their own data. The author is an expert in spatio-temporal modelling with long experience in this area with a diverse range of application specialities, and he provides clear and concise descriptions of all the key ideas and concepts." Duncan Lee, University of Glasgow, Scotland

"The quality of the paper and the printing is excellent. Many of the figures are in colors. Some are quite small, but with the scripts on the website you can recreate them yourself if needed. In summary, a good book with an emphasis on careful statistical modelling. On the publisher's website (https://www.routledge.com/Bayesian-Modeling-of-Spatio-Temporal-Data-with-R/Sahu/p/book/9780367277987) the table of contents and more information are available." Paul Eilers, ISCB Book Reviews

"There are twelve chapters, two appendices, an excellent bibliography, and an extensive glossary in this book. The topics covered in this book are examples of spatio-temporal data, needed jargons for stochastic processes, exploratory data analytic methods, Bayesian inferential techniques, Bayesian computations, point referenced spatial-temporal data with modeling, area unit data modeling, Gaussian processes, statistical densities, and chapter exercises with solutions in the appendices. The bibliography is extensive and up to date. The readers ought to read first and recognize the terminologies before starting to read this book. Some special features of this well-written book are about ocean chlorophyll data analyses, COVID-19 data analytic results, isotropy, Matern covariance function, Monte Carlo integration, Hubbard Brook precipitation data analytic results, childhood vaccination data in Kenya, method of batching, and autoregressive processes among others." Ramalingam Shanmugam, Texas State University

"Sujit Sahu has been prolific at writing papers and creating R packages for spatio-temporal modelling. . . The book fulfils three roles: an introduction to spatio-temporal data analysis; a detailed reference text on Bayesian computation for spatio-temporal models; and a comprehensive vignette for the accompanying R package bmstdr. An elegant web site contains a full set of code for reproducing the analyses in the book. This book is a useful beyond-the-basics resource for anyone wanting to use random-effects models for solving scientific problems involving spatio-temporal data. The book has 12 chapters plus appendices and feels like a longer text than the 400 pages it actually has. It is very well structured as a reference text and a reasonably knowledgeable statistician could absorb most sections without having read the previous material. Specifically, theory and computational methods are covered thoroughly but it is possible to read the more applied sections treating the inferential algorithm as a black box. There are extensive examples which consider both spatial prediction and using inference on model parameters to understand the underlying physical process." Patrick E. Brown, University of Toronto, Canada

Introduction
Preface
1 Examples of spatio-temporal data
1.1 Introduction
1.2 Spatio-temporal data types
1.3 Point referenced data sets used in the book
1.3.1 New York air pollution data set
1.3.2 Air pollution data from England and Wales
1.3.3 Air pollution in the eastern US
1.3.4 Hubbard Brook precipitation data
1.3.5 Ocean chlorophyll data
1.3.6 Atlantic ocean temperature and salinity data set
1.4 Areal unit data sets used in the book
1.4.1 Covid-19 mortality data from England
1.4.2 Childhood vaccination coverage in Kenya
1.4.3 Cancer rates in the United States
1.4.4 Hospitalization data from England
1.4.5 Child poverty in London
1.5 Conclusion
1.6 Exercises
2 Jargon of spatial and spatio-temporal modeling
2.1 Introduction
2.2 Stochastic processes
2.3 Stationarity
2.4 Variogram and covariogram
2.5 Isotropy
2.6 Matern covariance function
2.7 Gaussian processes (GP) GP(0, C(Ψ))
2.8 Space-time covariance functions
2.9 Kriging or optimal spatial prediction
2.10 Autocorrelation and partial autocorrelation
2.11 Measures of spatial association for areal data
2.12 Internal and external standardization for areal data
2.13 Spatial smoothers
2.14 CAR models
2.15 Point processes
2.16 Conclusion
2.17 Exercises
3 Exploratory data analysis methods
3.1 Introduction
3.2 Exploring point reference data
3.2.1 Non-spatial graphical exploration
3.2.2 Exploring spatial variation
3.3 Exploring spatio-temporal point reference data
3.4 Exploring areal Covid-19 case and death data
3.4.1 Calculating the expected numbers of cases and deaths
3.4.2 Graphical displays and covariate information
3.5 Conclusion
3.6 Exercises
4 Bayesian inference methods
4.1 Introduction
4.2 Prior and posterior distributions
4.3 The Bayes theorem for probability
4.4 Bayes theorem for random variables
4.5 Posterior ∝ Likelihood × Prior
4.6 Sequential updating of the posterior distribution
4.7 Normal-Normal example
4.8 Bayes estimators
4.8.1 Posterior mean
4.8.2 Posterior median
4.8.3 Posterior mode
4.9 Credible interval
4.10 Prior Distributions
4.10.1 Conjugate prior distribution
4.10.2 Locally uniform prior distribution
4.10.3 Non-informative prior distribution
4.11 Posterior predictive distribution
4.11.1 Normal-Normal example
4.12 Prior predictive distribution
4.13 Inference for more than one parameter
4.14 Normal example with both parameters unknown
4.15 Model choice
4.15.1 The Bayes factor
4.15.2 Posterior probability of a model
4.15.3 Hypothesis testing
4.16 Criteria-based Bayesian model selection
4.16.1 The DIC
4.16.2 The WAIC
4.16.3 Posterior predictive model choice criteria (PMCC)
4.17 Bayesian model checking
4.17.1 Nuisance parameters
4.18 The pressing need for Bayesian computation
4.19 Conclusion
4.20 Exercises
5 Bayesian computation methods
5.1 Introduction
5.2 Two motivating examples for Bayesian computation
5.3 Monte Carlo integration
5.4 Importance sampling
5.5 Rejection sampling
5.6 Notions of Markov chains for understanding MCMC
5.7 Metropolis-Hastings algorithm
5.8 The Gibbs sampler
5.9 Hamiltonian Monte Carlo
5.10 Integrated nested Laplace approximation (INLA)
5.11 MCMC implementation issues and MCMC output processing
5.11.1 Diagnostics based on visual plots and autocorrelation
5.11.2 How many chains?
5.11.3 Method of batching
5.12 Computing Bayesian model choice criteria
5.12.1 Computing DIC
5.12.2 Computing WAIC
5.12.3 Computing PMCC
5.12.4 Computing the model choice criteria for the New York air pollution data
5.13 Conclusion
5.14 Exercises
6 Bayesian modeling for point referenced spatial data
6.1 Introduction
6.2 Model versus procedure based methods
6.3 Formulating linear models
6.3.1 Data set preparation
6.3.2 Writing down the model formula
6.3.3 Predictive distributions
6.4 Linear model for spatial data
6.4.1 Spatial model fitting using bmstdr
6.5 A spatial model with nugget effect
6.5.1 Marginal model implementation
6.6 Model fitting using software packages
6.6.1 spBayes
6.6.2 R-Stan
6.6.3 R-inla
6.7 Model choice
6.8 Model validation methods
6.8.1 Four most important model validation criteria
6.8.2 K-fold cross-validation
6.8.3 Illustrating the model validation statistics
6.9 Posterior predictive checks
6.10 Conclusion
6.11 Exercises
7 Bayesian modeling for point referenced spatio-temporal data
7.1 Introduction
7.2 Models with spatio-temporal error distribution
7.2.1 Posterior distributions
7.2.2 Predictive distributions
7.2.3 Simplifying the expressions: $\Sigma_{12}H^{-1}$ and $\Sigma_{12}H^{-1}\Sigma_{21}$
7.2.4 Estimation of υ
7.2.5 Illustration of a spatio-temporal model fitting
7.3 Independent GP model with nugget effect
7.3.1 Full model implementation using spTimer
7.3.2 Marginal model implementation using Stan
7.4 Auto regressive (AR) models
7.4.1 Hierarchical AR Models using spTimer
7.4.2 AR modeling using INLA
7.5 Spatio-temporal dynamic models
7.5.1 A spatially varying dynamic model spTDyn
7.5.2 A dynamic spatio-temporal model using spBayes
7.6 Spatio-temporal models based on Gaussian predictive processes (GPP)
7.7 Performance assessment of all the models
7.8 Conclusion
7.9 Exercises
8 Practical examples of point referenced data modeling
8.1 Introduction
8.2 Estimating annual average air pollution in England and Wales
8.3 Assessing probability of non-compliance in air pollution
8.4 Analyzing precipitation data from the Hubbard Experimental Forest
8.4.1 Exploratory data analysis
8.4.2 Modeling and validation
8.4.3 Predictive inference from model fitting
8.4.3.1 Selecting gauges for possible downsizing
8.4.3.2 Spatial patterns in 3-year rolling average annual precipitation
8.4.3.3 Catchment specific trends in annual precipitation
8.4.3.4 A note on model fitting
8.5 Assessing annual trends in ocean chlorophyll levels
8.6 Modeling temperature data from roaming ocean Argo floats
8.6.1 Predicting an annual average temperature map
8.7 Conclusion
8.8 Exercises
9 Bayesian forecasting for point referenced data
9.1 Introduction
9.2 Exact forecasting method for GP
9.2.1 Example: Hourly ozone levels in the Eastern US
9.3 Forecasting using the models implemented in spTimer
9.3.1 Forecasting using GP models
9.3.2 Forecasting using AR models
9.3.3 Forecasting using the GPP models
9.4 Forecast calibration methods
9.4.1 Theory
9.4.2 Illustrating the calibration plots
9.5 Example comparing GP, AR and GPP models
9.6 Example: Forecasting ozone levels in the Eastern US
9.7 Conclusion
9.8 Exercises
10 Bayesian modeling for areal unit data
10.1 Introduction
10.2 Generalized linear models
10.2.1 Exponential family of distributions
10.2.2 The link function
10.2.3 Offset
10.2.4 The implied likelihood function
10.2.5 Model specification using a GLM
10.3 Example: Bayesian generalized linear model
10.3.1 GLM fitting with binomial distribution
10.3.2 GLM fitting with Poisson distribution
10.3.3 GLM fitting with normal distribution
10.4 Spatial random effects for areal unit data
10.5 Revisited example: Bayesian spatial generalized linear model
10.5.1 Spatial GLM fitting with binomial distribution
10.5.2 Spatial GLM fitting with Poisson distribution
10.5.3 Spatial GLM fitting with normal distribution
10.6 Spatio-temporal random effects for areal unit data
10.6.1 Linear model of trend
10.6.2 Anova model
10.6.3 Separable model
10.6.4 Temporal autoregressive model
10.7 Example: Bayesian spatio-temporal generalized linear model
10.7.1 Spatio-temporal GLM fitting with binomial distribution
10.7.2 Spatio-temporal GLM fitting with Poisson distribution
10.7.3 Examining the model fit
10.7.4 Spatio-temporal GLM fitting with normal distribution
10.8 Using INLA for model fitting and validation
10.9 Conclusion
10.10 Exercises
11 Further examples of areal data modeling
11.1 Introduction
11.2 Assessing childhood vaccination coverage in Kenya
11.3 Assessing trend in cancer rates in the US
11.4 Localized modeling of hospitalization data from England
11.4.1 A localized model
11.4.2 Model fitting results
11.5 Assessing trend in child poverty in London
11.5.1 Adaptive CAR-AR model
11.5.2 Model fitting results
11.6 Conclusion
11.7 Exercises
12 Gaussian processes for data science and other applications
12.1 Introduction
12.2 Learning methods and their Bayesian interpretations
12.2.1 Learning with empirical risk minimization
12.2.2 Learning by complexity penalization
12.2.3 Supervised learning and generalized linear models
12.2.4 Ridge regression, LASSO and elastic net
12.2.5 Regression trees and random forests
12.3 Gaussian Process (GP) prior-based machine learning
12.3.1 Example: predicting house prices
12.4 Use of GP in Bayesian calibration of computer codes
12.5 Conclusion
12.6 Exercises
Appendix A Statistical densities used in the book
A.1 Continuous
A.2 Discrete
Appendix B Answers to selected exercises
B.1 Solutions to Exercises in Chapter 4
B.2 Solutions to Exercises in Chapter 5
Bibliography
Glossary
Index
Sujit K. Sahu is a Professor of Statistics at the University of Southampton. He has co-authored more than 60 papers on Bayesian computation and modeling of spatio-temporal data. He has also contributed to writing specialist R packages for modeling and analysis of such data.