
E-book: Time Series for Data Science: Analysis and Forecasting [Taylor & Francis e-book]

Wayne Woodward, Bivin Sadler, Stephen Robertson (Southern Methodist University, Dallas, Texas, USA)
  • Format: 528 pages, 74 Tables, black and white; 268 Line drawings, black and white; 4 Halftones, black and white; 272 Illustrations, black and white
  • Series: Chapman & Hall/CRC Texts in Statistical Science
  • Publication date: 01-Aug-2022
  • Publisher: Chapman & Hall/CRC
  • ISBN-13: 9781003089070
  • Taylor & Francis e-book
  • Price: 170,80 €*
  • * This price grants an unlimited number of concurrent users access for an unlimited period
  • List price: 244,00 €
  • You save 30%
Data science students and practitioners want forecasts that work, and they don't want to be constrained to a single forecasting strategy. Time Series for Data Science: Analysis and Forecasting therefore discusses techniques of ensemble modelling for combining information from several strategies. Covering time series regression models, exponential smoothing, Holt-Winters forecasting, and neural networks, it places a particular emphasis on classical ARMA and ARIMA models, an emphasis that is often lacking from other textbooks on the subject.

This book is an accessible guide that doesn't require a background in calculus to be engaging, but it does not shy away from deeper explanations of the techniques discussed.
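The ensemble idea described in the blurb can be sketched in a few lines. The following is an illustrative example only, not code from the book or its R package; the function name and the equal-weight averaging scheme are our own assumptions.

```python
import numpy as np

# Illustrative sketch only (not from the book or its tswge package):
# an ensemble forecast that combines several strategies by weighted averaging.
def ensemble_forecast(forecasts, weights=None):
    """Combine a list of equal-length forecast arrays into one forecast."""
    f = np.asarray(forecasts, dtype=float)       # shape: (n_models, horizon)
    if weights is None:
        weights = np.full(len(f), 1.0 / len(f))  # default: equal weights
    return np.asarray(weights, dtype=float) @ f  # one value per horizon step

# Example: average a naive (flat) forecast with a drift forecast, 3 steps ahead.
naive = np.array([10.0, 10.0, 10.0])
drift = np.array([10.5, 11.0, 11.5])
print(ensemble_forecast([naive, drift]))  # equal-weight combination
```

More elaborate schemes weight each model by its holdout accuracy (for example, by inverse rolling-window RMSE), but the combining step stays the same.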

Features:

  • Provides thorough coverage and comparison of a wide array of time series models and methods: exponential smoothing, Holt-Winters, ARMA and ARIMA, deep learning models including RNNs, LSTMs, and GRUs, and ensemble models composed of combinations of these models.
  • Introduces the factor table representation of ARMA and ARIMA models. This representation is not available in any other book at this level and is extremely useful in both practice and pedagogy.
  • Uses real-world examples that can be readily found via web links from sources such as the US Bureau of Statistics, the Department of Transportation, and the World Bank.
  • Comes with an accompanying R package that is easy to use and requires little or no previous R experience. The package implements the wide variety of models and methods presented in the book and has tremendous pedagogical use.
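The factor table highlighted above can be sketched numerically: factor the AR characteristic polynomial and report, for each root, the absolute reciprocal (its proximity to the unit circle) and the implied system frequency. This is a hypothetical Python sketch under our own naming, not the tswge implementation.

```python
import numpy as np

# Hypothetical sketch (our own naming, not the tswge implementation):
# a factor table for an AR(p) model factors the characteristic polynomial
# 1 - phi_1 z - ... - phi_p z^p and reports, for each root, the absolute
# reciprocal (closeness to the unit circle) and the implied frequency.
def factor_table(phis):
    # Ascending-power coefficients of 1 - phi_1 z - ... - phi_p z^p
    coeffs = np.concatenate(([1.0], -np.asarray(phis, dtype=float)))
    roots = np.roots(coeffs[::-1])  # np.roots expects highest degree first
    rows = []
    for r in roots:
        recip = 1.0 / abs(r)  # < 1 for a stationary factor
        freq = abs(np.arctan2(r.imag, r.real)) / (2.0 * np.pi)
        rows.append((recip, freq))
    return sorted(rows, reverse=True)  # near-unit-circle (dominant) roots first

# AR(2) model that appears in the table of contents: (1 - 1.4B + 0.65B^2) X_t = a_t
for recip, freq in factor_table([1.4, -0.65]):
    print(f"abs reciprocal = {recip:.4f}, frequency = {freq:.4f}")
```

For this model the roots are a complex-conjugate pair with absolute reciprocal √0.65 ≈ 0.806 and frequency ≈ 0.083, i.e. a stationary pseudo-cyclic component with a period of roughly twelve time units.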
Preface
Acknowledgments
Authors
1 Working with Data Collected Over Time
1.1 Introduction
1.2 Time Series Datasets
1.2.1 Cyclic Data
1.2.1.1 Sunspot Data
1.2.1.2 DFW Temperature Data
1.2.1.3 Air Passengers Data
1.2.2 Trends
1.2.2.1 Real Datasets That Have Trending Behavior
1.2.2.2 The Problem with Trends
1.3 The Programming Language R
1.3.1 The tswge Time Series Package
1.3.2 Base R
1.3.3 Plotting Time Series Data in R
1.3.4 The ts Object
1.3.4.1 Creating a ts Object
1.3.4.2 More About ts Objects
1.3.5 The plotts.wge Function in tswge
1.3.5.1 Modifying the Appearance of Plots Using the tswge plotts.wge Function
1.3.6 Loading Time Series Data into R
1.3.6.1 The csv File
1.3.6.2 The txt File
1.3.6.3 Other File Formats
1.3.7 Accessing Time Series Data
1.3.7.1 Accessing Data from the Internet
1.3.7.2 Business/Proprietary Data: Ozona Bar and Grill
1.4 Dealing with Messy Data
1.4.1 Preparing Time Series Data for Analysis: Cleaning, Wrangling, and Imputation
1.4.1.1 Missing Data
1.4.1.2 Downloading When No csv Download Option Is Available
1.4.1.3 Data That Require Cleaning and Wrangling
1.4.1.4 Programmatic Method of Ingesting and Wrangling Data from Tables on Web Pages
1.5 Concluding Remarks
Appendix 1A
2 Exploring Time Series Data
2.1 Understanding and Visualizing Data
2.1.1 Smoothing Time Series Data
2.1.1.1 Smoothing Data Using a Centered Moving Average Smoother
2.1.1.2 Other Methods Available for Smoothing Data
2.1.1.3 Moving Average Smoothing versus Aggregating
2.1.1.4 Using Moving Average Smoothing for Estimating Trend in Data with Fixed Cycle Lengths
2.1.2 Decomposing Seasonal Data
2.1.2.1 Additive Decompositions
2.1.2.2 Multiplicative Decompositions
2.1.3 Seasonal Adjustment
2.1.3.1 Additive Seasonal Adjustment
2.1.3.2 Multiplicative Seasonal Adjustment
2.2 Forecasting
2.2.1 Predictive Moving Average Smoother
2.2.2 Exponential Smoothing
2.2.2.1 Forecasting with Exponential Smoothing beyond the Observed Dataset
2.2.3 Holt-Winters Forecasting
2.2.3.1 Additive Holt-Winters Equations
2.2.3.2 Multiplicative Holt-Winters Equations
2.2.4 Assessing the Accuracy of Forecasts
2.3 Concluding Remarks
Appendix 2A
3 Statistical Basics for Time Series Analysis
3.1 Statistics Basics
3.1.1 Univariate Data
3.1.2 Multivariate Data
3.1.2.1 Measuring Relationships between Two Random Variables in a Bivariate Random Sample
3.1.2.2 Assessing Association from a Bivariate Random Sample
3.1.3 Independent vs Dependent Data
3.2 Time Series and Realizations
3.2.1 Multiple Realizations
3.2.1.1 Time Series 1: Xt
3.2.1.2 Time Series 2: Vt (Example 3.3 Continued)
3.2.2 The Effect of Realization Length
3.3 Stationary Time Series
3.3.1 Plotting the Autocorrelations of a Stationary Process
3.3.2 Estimating the Parameters of a Stationary Process
3.3.2.1 Estimating μ
3.3.2.2 Estimating the Variance
3.3.2.3 Estimating the Autocovariance and Autocorrelation
3.3.2.4 Plotting Sample Autocorrelations
3.4 Concluding Remarks
Appendix 3A
Appendix 3B
4 The Frequency Domain
4.1 Trigonometric Review and Terminology
4.2 The Spectral Density
4.2.1 Euler's Formula
4.2.2 Definition and Properties of the Spectrum and Spectral Density
4.2.2.1 The Nyquist Frequency
4.2.2.2 Frequency f = 0
4.2.2.3 The Spectral Density and the Autocorrelation Function
4.2.3 Estimating the Spectral Density
4.2.3.1 The Sample Spectral Density
4.2.3.2 Smoothing the Sample Spectral Density
4.2.3.3 Parzen Spectral Density Estimate vs Sample Autocorrelations
4.2.3.4 Why We Plot Spectral Densities in Log Scale
4.3 Smoothing and Filtering
4.3.1 Types of Filters
4.3.2 The Butterworth Filter
4.4 Concluding Remarks
Appendix 4A
5 ARMA Models
5.1 The Autoregressive Model
5.1.1 The AR(1) Model
5.1.1.1 The AR(1) in Backshift Operator Notation
5.1.1.2 The AR(1) Characteristic Polynomial and Characteristic Equation
5.1.1.3 Properties of a Stationary AR(1) Model
5.1.1.4 Spectral Density of an AR(1)
5.1.1.5 AR(1) Models with Positive Roots of the Characteristic Equation
5.1.1.6 AR(1) Models with Roots Close to +1
5.1.1.7 AR(1) Models with Negative Roots of the Characteristic Equation
5.1.1.8 Nonstationary 1st-Order Models
5.1.1.9 Final Comments Regarding AR(1) Models
5.1.2 The AR(2) Model
5.1.2.1 Facts about the AR(2) Model
5.1.2.2 Operator Notation and Characteristic Equation for an AR(2)
5.1.2.3 Stationary AR(2) with Two Real Roots
5.1.2.4 Stationary AR(2) with Complex Conjugate Roots
5.1.2.5 Summary of AR(1) and AR(2) Behavior
5.1.3 The AR(p) Model
5.1.3.1 Facts about the AR(p) Model
5.1.3.2 Operator Notation and Characteristic Equation for an AR(p)
5.1.3.3 Factoring the AR(p) Characteristic Polynomial
5.1.3.4 Factor Tables for AR(p) Models
5.1.3.5 Dominance of Roots Close to the Unit Circle
5.1.4 Linear Filters, the General Linear Process, and AR(p) Models
5.1.4.1 AR(1) in GLP Form
5.1.4.2 AR(p) in GLP Form
5.2 Autoregressive-Moving Average (ARMA) Models
5.2.1 Moving Average Models
5.2.1.1 The MA(1) Model
5.2.1.2 The MA(2) Model
5.2.1.3 The General MA(q) Model
5.2.1.4 Invertibility
5.2.2 ARMA(p,q) Models
5.2.2.1 Stationarity and Invertibility of an ARMA(p,q) Process
5.2.2.2 AR versus ARMA Models
5.3 Concluding Remarks
Appendix 5A
Appendix 5B
6 ARMA Fitting and Forecasting
6.1 Fitting ARMA Models to Data
6.1.1 Estimating the Parameters of an ARMA(p,q) Model
6.1.1.1 Maximum Likelihood Estimation of the φ and θ Coefficients of an ARMA Model
6.1.1.2 Estimating μ
6.1.1.3 Estimating σa²
6.1.1.4 Alternative Estimates for AR(p) Models
6.1.2 ARMA Model Identification
6.1.2.1 Plotting the Data and Checking for White Noise
6.1.2.2 Model Identification Types
6.1.2.3 AIC-Type Measures for ARMA Model Fitting
6.1.2.4 The Special Case of AR Model Identification
6.2 Forecasting Using an ARMA(p,q) Model
6.2.1 ARMA Forecasting Setting, Notation, and Strategy
6.2.1.1 Strategy and Notation
6.2.1.2 Forecasting Xt0+ℓ for ℓ ≤ 0
6.2.1.3 Forecasting at0+ℓ for ℓ > 0
6.2.2 Forecasting Using an AR(p) Model
6.2.2.1 Forecasting Using an AR(1) Model
6.2.3 Basic Formula for Forecasting Using an ARMA(p,q) Model
6.2.4 Eventual Forecast Functions
6.2.5 Probability Limits for ARMA Forecasts
6.2.5.1 Facts about Forecast Errors
6.2.5.2 Lack of Symmetry
6.2.6 Assessing Forecast Performance
6.2.6.1 How "Good" Are the Forecasts?
6.2.6.2 Some Strategies for Using RMSE to Measure Forecast Performance
6.3 Concluding Remarks
Appendix 6A
7 ARIMA and Seasonal Models
7.1 ARIMA(p,d,q) Models
7.1.1 Properties of the ARIMA(p,d,q) Model
7.1.1.1 Some ARIMA(p,d,q) Models
7.1.1.2 Characteristic Equations for Models (a)-(c)
7.1.1.3 Limiting Autocorrelations
7.1.1.4 Lack of Attraction to a Mean
7.1.1.5 Random Trends
7.1.1.6 Differencing an ARIMA(0,1,0) Model
7.1.1.7 ARIMA Models with Stationary and Nonstationary Components
7.1.1.8 The Stationary AR(2) Model: (1 - 1.4B + 0.65B²)Xt = at
7.1.2 Model Identification and Parameter Estimation of ARIMA(p,d,q) Models
7.1.2.1 Deciding Whether to Include One or More 1-B Factors (That Is, Unit Roots) in the Model
7.1.2.2 General Procedure for Fitting an ARIMA(p,d,q) Model to a Set of Time Series Data
7.1.3 Forecasting with ARIMA Models
7.1.3.1 ARMA Forecast Formula
7.2 Seasonal Models
7.2.1 Properties of Seasonal Models
7.2.1.1 Some Seasonal Models
7.2.2 Fitting Seasonal Models to Data
7.2.2.1 Overfitting
7.2.3 Forecasting Using Seasonal Models
7.3 ARCH and GARCH Models
7.3.1 The ARCH(1) Model
7.3.2 The ARCH(p) and GARCH(p,q) Processes
7.3.3 Assessing the Appropriateness of an ARCH/GARCH Fit to a Set of Data
7.3.4 Fitting ARCH/GARCH Models to Simulated Data
7.3.5 Modeling Daily Rates of Return Data
7.4 Concluding Remarks
Appendix 7A
Appendix 7B
8 Time Series Regression
8.1 Line+Noise Models
8.1.1 Testing for Linear Trend
8.1.1.1 Testing for Trend Using Simple Linear Regression
8.1.1.2 A t-test Simulation
8.1.1.3 Cochrane-Orcutt Test for Trend
8.1.1.4 Bootstrap-Based Test for Trend
8.1.1.5 Other Methods for Testing for Trend in Time Series Data
8.1.2 Fitting Line+Noise Models to Data
8.1.3 Forecasting Using Line+Noise Models
8.2 Cosine Signal+Noise Models
8.2.1 Fitting a Cosine Signal+Noise Model to Data
8.2.2 Forecasting Using Cosine Signal+Noise Models
8.2.2.1 Using fore.sigplusnoise.wge
8.2.3 Deciding Whether to Fit a Cosine Signal+Noise Model to a Set of Data
8.2.3.1 A Closer Look at the Cyclic Behavior
8.3 Concluding Remarks
Appendix 8A
9 Model Assessment
9.1 Residual Analysis
9.1.1 Checking Residuals for White Noise
9.1.1.1 Check Residual Sample Autocorrelations against 95% Limit Lines
9.1.1.2 Ljung-Box Test
9.1.2 Checking the Residuals for Normality
9.2 Case Study 1: Modeling the Global Temperature Data
9.2.1 A Stationary Model
9.2.1.1 Checking the Residuals
9.2.1.2 Realizations and Their Characteristics
9.2.1.3 Forecasting Based on the ARMA(4,1) Model
9.2.2 A Correlation-Based Model with a Unit Root
9.2.2.1 Checking the Residuals
9.2.2.2 Realizations and Their Characteristics
9.2.2.3 Forecasting Based on the ARIMA(0,1,1) Model
9.2.3 Line+Noise Models for the Global Temperature Data
9.2.3.1 Checking the Residuals, at, for White Noise
9.2.3.2 Realizations and Their Characteristics
9.2.3.3 Forecasting Based on the Signal-plus-Noise Model
9.2.3.4 Other Forecasts
9.3 Case Study 2: Comparing Models for the Sunspot Data
9.3.1 Selecting the Models for Comparison
9.3.2 Do the Models Whiten the Residuals?
9.3.3 Do Realizations and Their Characteristics Behave Like the Data?
9.3.4 Do Forecasts Reflect What Is Known about the Physical Setting?
9.3.4.1 Final Comments about the Models Fit to the Sunspot Data
9.4 Comprehensive Analysis of Time Series Data: A Summary
9.5 Concluding Remarks
Appendix 9A
10 Multivariate Time Series
10.1 Introduction
10.2 Multiple Regression with Correlated Errors
10.2.1 Notation for Multiple Regression with Correlated Errors
10.2.2 Fitting Multiple Regression Models to Time Series Data
10.2.2.1 Including a Trend Term in the Multiple Regression Model
10.2.2.2 Adding Lagged Variables
10.2.2.3 Using Lagged Variables and a Trend Variable
10.2.3 Cross Correlation
10.3 Vector Autoregressive (VAR) Models
10.3.1 Forecasting with VAR(p) Models
10.3.1.1 Univariate Forecasts
10.3.1.2 VAR Analysis
10.3.1.3 Comparing RMSEs
10.3.1.4 Final Comments
10.4 Relationship between MLR and VAR Models
10.5 A Comprehensive and Final Example: Los Angeles Cardiac Mortality
10.5.1 Applying the VAR(p) Model to the Cardiac Mortality Data
10.5.2 The Seasonal VAR(p) Model
10.5.3 Forecasting the Future
10.5.3.1 Short vs. Long Term Forecasts
10.6 Conclusion
Appendix 10A
Appendix 10B
11 Deep Neural Network-Based Time Series Models
11.1 Introduction
11.2 The Perceptron
11.3 The Extended Perceptron for Univariate Time Series Data
11.3.1 A Neural Network Similar to the AR(1)
11.3.1.1 The Architecture
11.3.1.2 Fitting the MLP
11.3.1.3 Forecasting
11.3.1.4 Cross Validation Using the Rolling Window RMSE
11.3.2 A Neural Network Similar to AR(p): Adding More Lags
11.3.3 A Deeper Neural Network: Adding a Hidden Layer
11.3.3.1 Differences and Seasonal "Dummies"
11.4 The Extended Perceptron for Multivariate Time Series Data
11.4.1 Forecasting Melanoma Using Sunspots
11.4.1.1 Architecture
11.4.1.2 Fitting the Baseline Model
11.4.1.3 Forecasting Future Sunspot Data for Predicting Future Melanoma
11.4.1.4 Forecasting the Last Eight Years of Melanoma
11.4.1.5 Fitting a Competing Model
11.4.1.6 Assessing the Competing Model on the Last Eight Years of Melanoma Data
11.4.1.7 Forecasting the Next Eight Years of Melanoma
11.4.2 Forecasting Cardiac Mortality Using Temperature and Particulates
11.4.2.1 General Architecture
11.4.2.2 Train/Test Split
11.4.2.3 Forecasting Covariates: Temperature and Particulates
11.4.2.4 Model without Seasonal Indicator Variables
11.4.2.5 Model with Seasonal Indicator Variables
11.5 An "Ensemble" Model
11.5.1 Final Forecasts for the Next Fifty-Two Weeks
11.5.2 Final Forecasts for the Next Three Years (Longer Term Forecasts)
11.6 Concluding Remarks
Appendix 11A
Appendix 11B
References
Index
Wayne Woodward, Bivin Sadler, Stephen Robertson