
Foundations of Predictive Analytics [Paperback]

  • Format: Paperback / softback, 338 pages, height x width: 234x156 mm, weight: 453 g
  • Publication date: 05-Sep-2019
  • Publisher: Chapman & Hall/CRC
  • ISBN-10: 0367381680
  • ISBN-13: 9780367381684
Drawing on the authors' two decades of experience in applied modeling and data mining, Foundations of Predictive Analytics presents the fundamental background required for analyzing data and building models for many practical applications, such as consumer behavior modeling, risk and marketing analytics, and other areas. It also discusses a variety of practical topics that are frequently missing from similar texts.

The book begins with the statistical and linear algebra/matrix foundation of modeling methods, from distributions to cumulant and copula functions to the Cornish-Fisher expansion and other useful but hard-to-find statistical techniques. It then describes common and unusual linear methods as well as popular nonlinear modeling approaches, including additive models, trees, support vector machines, fuzzy systems, clustering, naïve Bayes, and neural nets. The authors go on to cover methodologies used in time series and forecasting, such as ARIMA, GARCH, and survival analysis. They also present a range of optimization techniques and explore several special topics, such as Dempster-Shafer theory.

An in-depth collection of the most important fundamental material on predictive analytics, this self-contained book provides the necessary information for understanding various techniques for exploratory data analysis and modeling. It explains the algorithmic details behind each technique (including underlying assumptions and mathematical formulations) and shows how to prepare and encode data, select variables, use model goodness measures, normalize odds, and perform reject inference.

Web Resource: The book's website at www.DataMinerXL.com offers the DataMinerXL software for building predictive models. The site also includes more examples and information on modeling.
List of Figures
List of Tables
Preface
1 Introduction
1.1 What Is a Model?
1.2 What Is a Statistical Model?
1.3 The Modeling Process
1.4 Modeling Pitfalls
1.5 Characteristics of Good Modelers
1.6 The Future of Predictive Analytics
2 Properties of Statistical Distributions
2.1 Fundamental Distributions
2.1.1 Uniform Distribution
2.1.2 Details of the Normal (Gaussian) Distribution
2.1.3 Lognormal Distribution
2.1.4 Γ Distribution
2.1.5 Chi-Squared Distribution
2.1.6 Non-Central Chi-Squared Distribution
2.1.7 Student's t-Distribution
2.1.8 Multivariate t-Distribution
2.1.9 F-Distribution
2.1.10 Binomial Distribution
2.1.11 Poisson Distribution
2.1.12 Exponential Distribution
2.1.13 Geometric Distribution
2.1.14 Hypergeometric Distribution
2.1.15 Negative Binomial Distribution
2.1.16 Inverse Gaussian (IG) Distribution
2.1.17 Normal Inverse Gaussian (NIG) Distribution
2.2 Central Limit Theorem
2.3 Estimate of Mean, Variance, Skewness, and Kurtosis from Sample Data
2.4 Estimate of the Standard Deviation of the Sample Mean
2.5 (Pseudo) Random Number Generators
2.5.1 Mersenne Twister Pseudorandom Number Generator
2.5.2 Box-Muller Transform for Generating a Normal Distribution
2.6 Transformation of a Distribution Function
2.7 Distribution of a Function of Random Variables
2.7.1 Z = X + Y
2.7.2 Z = X · Y
2.7.3 (Z1, Z2, ..., Zn) = (X1, X2, ..., Xn) · Y
2.7.4 Z = X/Y
2.7.5 Z = max(X, Y)
2.7.6 Z = min(X, Y)
2.8 Moment Generating Function
2.8.1 Moment Generating Function of Binomial Distribution
2.8.2 Moment Generating Function of Normal Distribution
2.8.3 Moment Generating Function of the Γ Distribution
2.8.4 Moment Generating Function of Chi-Square Distribution
2.8.5 Moment Generating Function of the Poisson Distribution
2.9 Cumulant Generating Function
2.10 Characteristic Function
2.10.1 Relationship between Cumulative Function and Characteristic Function
2.10.2 Characteristic Function of Normal Distribution
2.10.3 Characteristic Function of Γ Distribution
2.11 Chebyshev's Inequality
2.12 Markov's Inequality
2.13 Gram-Charlier Series
2.14 Edgeworth Expansion
2.15 Cornish-Fisher Expansion
2.15.1 Lagrange Inversion Theorem
2.15.2 Cornish-Fisher Expansion
2.16 Copula Functions
2.16.1 Gaussian Copula
2.16.2 t-Copula
2.16.3 Archimedean Copula
3 Important Matrix Relationships
3.1 Pseudo-Inverse of a Matrix
3.2 A Lemma of Matrix Inversion
3.3 Identity for a Matrix Determinant
3.4 Inversion of Partitioned Matrix
3.5 Determinant of Partitioned Matrix
3.6 Matrix Sweep and Partial Correlation
3.7 Singular Value Decomposition (SVD)
3.8 Diagonalization of a Matrix
3.9 Spectral Decomposition of a Positive Semi-Definite Matrix
3.10 Normalization in Vector Space
3.11 Conjugate Decomposition of a Symmetric Definite Matrix
3.12 Cholesky Decomposition
3.13 Cauchy-Schwartz Inequality
3.14 Relationship of Correlation among Three Variables
4 Linear Modeling and Regression
4.1 Properties of Maximum Likelihood Estimators
4.1.1 Likelihood Ratio Test
4.1.2 Wald Test
4.1.3 Lagrange Multiplier Statistic
4.2 Linear Regression
4.2.1 Ordinary Least Squares (OLS) Regression
4.2.2 Interpretation of the Coefficients of Linear Regression
4.2.3 Regression on Weighted Data
4.2.4 Incrementally Updating a Regression Model with Additional Data
4.2.5 Partitioned Regression
4.2.6 How Does the Regression Change When Adding One More Variable?
4.2.7 Linearly Restricted Least Squares Regression
4.2.8 Significance of the Correlation Coefficient
4.2.9 Partial Correlation
4.2.10 Ridge Regression
4.3 Fisher's Linear Discriminant Analysis
4.4 Principal Component Regression (PCR)
4.5 Factor Analysis
4.6 Partial Least Squares Regression (PLSR)
4.7 Generalized Linear Model (GLM)
4.8 Logistic Regression: Binary
4.9 Logistic Regression: Multiple Nominal
4.10 Logistic Regression: Proportional Multiple Ordinal
4.11 Fisher Scoring Method for Logistic Regression
4.12 Tobit Model: A Censored Regression Model
4.12.1 Some Properties of the Normal Distribution
4.12.2 Formulation of the Tobit Model
5 Nonlinear Modeling
5.1 Naive Bayesian Classifier
5.2 Neural Network
5.2.1 Back Propagation Neural Network
5.3 Segmentation and Tree Models
5.3.1 Segmentation
5.3.2 Tree Models
5.3.3 Sweeping to Find the Best Cutpoint
5.3.4 Impurity Measure of a Population: Entropy and Gini Index
5.3.5 Chi-Square Splitting Rule
5.3.6 Implementation of Decision Trees
5.4 Additive Models
5.4.1 Boosted Tree
5.4.2 Least Squares Regression Boosting Tree
5.4.3 Binary Logistic Regression Boosting Tree
5.5 Support Vector Machine (SVM)
5.5.1 Wolfe Dual
5.5.2 Linearly Separable Problem
5.5.3 Linearly Inseparable Problem
5.5.4 Constructing Higher-Dimensional Space and Kernel
5.5.5 Model Output
5.5.6 C-Support Vector Classification (C-SVC) for Classification
5.5.7 ε-Support Vector Regression (ε-SVR) for Regression
5.5.8 The Probability Estimate
5.6 Fuzzy Logic System
5.6.1 A Simple Fuzzy Logic System
5.7 Clustering
5.7.1 K Means, Fuzzy C Means
5.7.2 Nearest Neighbor, K Nearest Neighbor (KNN)
5.7.3 Comments on Clustering Methods
6 Time Series Analysis
6.1 Fundamentals of Forecasting
6.1.1 Box-Cox Transformation
6.1.2 Smoothing Algorithms
6.1.3 Convolution of Linear Filters
6.1.4 Linear Difference Equation
6.1.5 The Autocovariance Function and Autocorrelation Function
6.1.6 The Partial Autocorrelation Function
6.2 ARIMA Models
6.2.1 MA(q) Process
6.2.2 AR(p) Process
6.2.3 ARMA(p, q) Process
6.3 Survival Data Analysis
6.3.1 Sampling Method
6.4 Exponentially Weighted Moving Average (EWMA) and GARCH(1, 1)
6.4.1 Exponentially Weighted Moving Average (EWMA)
6.4.2 ARCH and GARCH Models
7 Data Preparation and Variable Selection
7.1 Data Quality and Exploration
7.2 Variable Scaling and Transformation
7.3 How to Bin Variables
7.3.1 Equal Interval
7.3.2 Equal Population
7.3.3 Tree Algorithms
7.4 Interpolation in One and Two Dimensions
7.5 Weight of Evidence (WOE) Transformation
7.6 Variable Selection Overview
7.7 Missing Data Imputation
7.8 Stepwise Selection Methods
7.8.1 Forward Selection in Linear Regression
7.8.2 Forward Selection in Logistic Regression
7.9 Mutual Information, KL Distance
7.10 Detection of Multicollinearity
8 Model Goodness Measures
8.1 Training, Testing, Validation
8.2 Continuous Dependent Variable
8.2.1 Example: Linear Regression
8.3 Binary Dependent Variable (Two-Group Classification)
8.3.1 Kolmogorov-Smirnov (KS) Statistic
8.3.2 Confusion Matrix
8.3.3 Concordant and Discordant
8.3.4 R² for Logistic Regression
8.3.5 AIC and SBC
8.3.6 Hosmer-Lemeshow Goodness-of-Fit Test
8.3.7 Example: Logistic Regression
8.4 Population Stability Index Using Relative Entropy
9 Optimization Methods
9.1 Lagrange Multiplier
9.2 Gradient Descent Method
9.3 Newton-Raphson Method
9.4 Conjugate Gradient Method
9.5 Quasi-Newton Method
9.6 Genetic Algorithms (GA)
9.7 Simulated Annealing
9.8 Linear Programming
9.9 Nonlinear Programming (NLP)
9.9.1 General Nonlinear Programming (GNLP)
9.9.2 Lagrange Dual Problem
9.9.3 Quadratic Programming (QP)
9.9.4 Linear Complementarity Programming (LCP)
9.9.5 Sequential Quadratic Programming (SQP)
9.10 Nonlinear Equations
9.11 Expectation-Maximization (EM) Algorithm
9.12 Optimal Design of Experiment
10 Miscellaneous Topics
10.1 Multidimensional Scaling
10.2 Simulation
10.3 Odds Normalization and Score Transformation
10.4 Reject Inference
10.5 Dempster-Shafer Theory of Evidence
10.5.1 Some Properties in Set Theory
10.5.2 Basic Probability Assignment, Belief Function, and Plausibility Function
10.5.3 Dempster-Shafer's Rule of Combination
10.5.4 Applications of Dempster-Shafer Theory of Evidence: Multiple Classifier Function
Appendix A Useful Mathematical Relations
A.1 Information Inequality
A.2 Relative Entropy
A.3 Saddle-Point Method
A.4 Stirling's Formula
A.5 Convex Function and Jensen's Inequality
Appendix B DataMinerXL - Microsoft Excel Add-In for Building Predictive Models
B.1 Overview
B.2 Utility Functions
B.3 Data Manipulation Functions
B.4 Basic Statistical Functions
B.5 Modeling Functions for All Models
B.6 Weight of Evidence Transformation Functions
B.7 Linear Regression Functions
B.8 Partial Least Squares Regression Functions
B.9 Logistic Regression Functions
B.10 Time Series Analysis Functions
B.11 Naive Bayes Classifier Functions
B.12 Tree-Based Model Functions
B.13 Clustering and Segmentation Functions
B.14 Neural Network Functions
B.15 Support Vector Machine Functions
B.16 Optimization Functions
B.17 Matrix Operation Functions
B.18 Numerical Integration Functions
B.19 Excel Built-in Statistical Distribution Functions
Bibliography
Index
James Wu is a Fixed Income Quant with extensive expertise in a wide variety of applied analytical solutions in consumer behavior modeling and financial engineering. He previously worked at ID Analytics, Morgan Stanley, JPMorgan Chase, Los Alamos Computational Group, and CASA. He earned a PhD from the University of Idaho.

Stephen Coggeshall is the Chief Technology Officer of ID Analytics. He previously worked at Los Alamos Computational Group, Morgan Stanley, HNC Software, CASA, and Los Alamos National Laboratory. Over a career of more than 20 years, Dr. Coggeshall has helped teams of scientists develop practical solutions to difficult business problems using advanced analytics. He earned a PhD from the University of Illinois and was named 2008 Technology Executive of the Year by the San Diego Business Journal.