|
1 Introduction to Predictive Analytics |
|
|
1 | (26) |
|
1.1 Predictive Analytics in Action |
|
|
2 | (4) |
|
|
6 | (5) |
|
|
7 | (4) |
|
|
11 | (1) |
|
1.3.1 Predictive Analytics |
|
|
11 | (1) |
|
|
12 | (1) |
|
1.5 Machine Learning Techniques |
|
|
13 | (4) |
|
1.6 Predictive Analytics Model |
|
|
17 | (2) |
|
1.7 Opportunities in Analytics |
|
|
19 | (2) |
|
1.8 Introduction to the Automobile Insurance Claim Fraud Example |
|
|
21 | (2) |
|
|
23 | (4) |
|
|
25 | (2) |
|
2 Know Your Data: Data Preparation |
|
|
27 | (28) |
|
2.1 Classification of Data |
|
|
27 | (2) |
|
2.1.1 Qualitative Versus Quantitative |
|
|
28 | (1) |
|
2.1.2 Scales of Measurement |
|
|
28 | (1) |
|
2.2 Data Preparation Methods |
|
|
29 | (3) |
|
2.2.1 Inconsistent Formats |
|
|
30 | (1) |
|
|
30 | (1) |
|
|
31 | (1) |
|
2.2.4 Other Data Cleansing Considerations |
|
|
32 | (1) |
|
2.3 Data Sets and Data Partitioning |
|
|
32 | (1) |
|
2.4 SAS Enterprise Miner™ Model Components |
|
|
32 | (21) |
|
2.4.1 Step 1. Create Three of the Model Components |
|
|
33 | (2) |
|
2.4.2 Step 2. Import an Excel File and Save as a SAS File |
|
|
35 | (3) |
|
2.4.3 Step 3. Create the Data Source |
|
|
38 | (4) |
|
2.4.4 Step 4. Partition the Data Source |
|
|
42 | (2) |
|
2.4.5 Step 5. Data Exploration |
|
|
44 | (1) |
|
2.4.6 Step 6. Missing Data |
|
|
44 | (3) |
|
2.4.7 Step 7. Handling Outliers |
|
|
47 | (4) |
|
2.4.8 Step 8. Categorical Variables with Too Many Levels |
|
|
51 | (2) |
|
|
53 | (2) |
|
|
54 | (1) |
|
3 What Do Descriptive Statistics Tell Us |
|
|
55 | (32) |
|
3.1 Descriptive Analytics |
|
|
56 | (1) |
|
3.2 The Role of the Mean, Median, and Mode |
|
|
56 | (1) |
|
3.3 Variance and Distribution |
|
|
57 | (3) |
|
3.4 The Shape of the Distribution |
|
|
60 | (5) |
|
|
60 | (1) |
|
|
61 | (4) |
|
3.5 Covariance and Correlation |
|
|
65 | (2) |
|
|
67 | (11) |
|
3.6.1 Variable Clustering |
|
|
68 | (7) |
|
3.6.2 Principal Component Analysis |
|
|
75 | (3) |
|
|
78 | (1) |
|
3.8 Analysis of Variance (ANOVA) |
|
|
79 | (1) |
|
|
80 | (1) |
|
|
81 | (1) |
|
|
82 | (1) |
|
|
83 | (4) |
|
|
85 | (2) |
|
4 Predictive Models Using Regression |
|
|
87 | (36) |
|
|
88 | (1) |
|
4.1.1 Classical Assumptions |
|
|
88 | (1) |
|
4.2 Ordinary Least Squares |
|
|
89 | (1) |
|
4.3 Simple Linear Regression |
|
|
90 | (1) |
|
4.3.1 Determining Relationship Between Two Variables |
|
|
90 | (1) |
|
4.3.2 Line of Best Fit and Simple Linear Regression Equation |
|
|
90 | (1) |
|
4.4 Multiple Linear Regression |
|
|
91 | (4) |
|
4.4.1 Metrics to Evaluate the Strength of the Regression Line |
|
|
92 | (1) |
|
|
93 | (1) |
|
4.4.3 Selection of Variables in Regression |
|
|
93 | (2) |
|
4.5 Principal Component Regression |
|
|
95 | (1) |
|
4.5.1 Principal Component Analysis Revisited |
|
|
95 | (1) |
|
4.5.2 Principal Component Regression |
|
|
95 | (1) |
|
4.6 Partial Least Squares |
|
|
95 | (1) |
|
|
96 | (5) |
|
4.7.1 Binary Logistic Regression |
|
|
97 | (2) |
|
4.7.2 Examination of Coefficients |
|
|
99 | (1) |
|
4.7.3 Multinomial Logistic Regression |
|
|
100 | (1) |
|
4.7.4 Ordinal Logistic Regression |
|
|
100 | (1) |
|
4.8 Implementation of Regression in SAS Enterprise Miner™ |
|
|
101 | (3) |
|
4.8.1 Regression Node Train Properties: Class Targets |
|
|
101 | (1) |
|
4.8.2 Regression Node Train Properties: Model Options |
|
|
102 | (1) |
|
4.8.3 Regression Node Train Properties: Model Selection |
|
|
102 | (2) |
|
4.9 Implementation of Two-Factor Interaction and Polynomial Terms |
|
|
104 | (2) |
|
4.9.1 Regression Node Train Properties: Equation |
|
|
105 | (1) |
|
4.10 DMINE Regression in SAS Enterprise Miner™ |
|
|
106 | (3) |
|
|
106 | (2) |
|
|
108 | (1) |
|
4.11 Partial Least Squares Regression in SAS Enterprise Miner™ |
|
|
109 | (4) |
|
4.11.1 Partial Least Squares Properties |
|
|
109 | (2) |
|
4.11.2 Partial Least Squares Results |
|
|
111 | (2) |
|
4.12 Least Angle Regression in SAS Enterprise Miner™ |
|
|
113 | (4) |
|
4.12.1 Least Angle Regression Properties |
|
|
114 | (1) |
|
4.12.2 Least Angle Regression Results |
|
|
115 | (2) |
|
4.13 Other Forms of Regression |
|
|
117 | (1) |
|
|
118 | (5) |
|
|
121 | (2) |
|
5 The Second of the Big 3: Decision Trees |
|
|
123 | (22) |
|
5.1 What Is a Decision Tree? |
|
|
123 | (2) |
|
5.2 Creating a Decision Tree |
|
|
125 | (1) |
|
5.3 Classification and Regression Trees (CART) |
|
|
126 | (1) |
|
5.4 Data Partitions and Decision Trees |
|
|
127 | (2) |
|
5.5 Creating a Decision Tree Using SAS Enterprise Miner™ |
|
|
129 | (8) |
|
|
136 | (1) |
|
5.6 Creating an Interactive Decision Tree Using SAS Enterprise Miner™ |
|
|
137 | (3) |
|
5.7 Creating a Maximal Decision Tree Using SAS Enterprise Miner™ |
|
|
140 | (3) |
|
|
143 | (2) |
|
|
144 | (1) |
|
6 The Third of the Big 3: Neural Networks |
|
|
145 | (30) |
|
6.1 What Is a Neural Network? |
|
|
145 | (2) |
|
6.2 History of Neural Networks |
|
|
147 | (2) |
|
6.3 Components of a Neural Network |
|
|
149 | (2) |
|
6.4 Neural Network Architectures |
|
|
151 | (2) |
|
6.5 Training a Neural Network |
|
|
153 | (1) |
|
6.6 Radial Basis Function Neural Networks |
|
|
154 | (1) |
|
6.7 Creating a Neural Network Using SAS Enterprise Miner™ |
|
|
155 | (7) |
|
6.8 Using SAS Enterprise Miner™ to Automatically Generate a Neural Network |
|
|
162 | (6) |
|
6.9 Explaining a Neural Network |
|
|
168 | (3) |
|
|
171 | (4) |
|
|
173 | (2) |
|
7 Model Comparisons and Scoring |
|
|
175 | (24) |
|
|
175 | (1) |
|
|
176 | (2) |
|
|
178 | (2) |
|
|
180 | (1) |
|
7.5 Memory-Based Reasoning |
|
|
181 | (3) |
|
|
184 | (1) |
|
7.7 Comparing Predictive Models |
|
|
185 | (5) |
|
7.7.1 Evaluating Fit Statistics: Which Model Do We Use? |
|
|
187 | (3) |
|
7.8 Using Historical Data to Predict the Future: Scoring |
|
|
190 | (5) |
|
7.8.1 Analyzing and Reporting Results |
|
|
191 | (2) |
|
|
193 | (1) |
|
|
194 | (1) |
|
7.9 The Importance of Predictive Analytics |
|
|
195 | (2) |
|
7.9.1 What Should We Expect for Predictive Analytics in the Future? |
|
|
196 | (1) |
|
|
197 | (2) |
|
|
198 | (1) |
|
8 Finding Associations in Data Through Cluster Analysis |
|
|
199 | (34) |
|
8.1 Applications and Uses of Cluster Analysis |
|
|
199 | (1) |
|
8.2 Types of Clustering Techniques |
|
|
200 | (1) |
|
8.3 Hierarchical Clustering |
|
|
200 | (11) |
|
8.3.1 Agglomerative Clustering |
|
|
201 | (5) |
|
8.3.2 Divisive Clustering |
|
|
206 | (4) |
|
8.3.3 Agglomerative Versus Divisive Clustering |
|
|
210 | (1) |
|
8.4 Non-hierarchical Clustering |
|
|
211 | (10) |
|
|
211 | (4) |
|
8.4.2 Initial Centroid Selection |
|
|
215 | (1) |
|
8.4.3 Determining the Number of Clusters |
|
|
216 | (3) |
|
8.4.4 Evaluating Your Clusters |
|
|
219 | (2) |
|
8.5 Hierarchical Versus Non-hierarchical |
|
|
221 | (1) |
|
8.6 Cluster Analysis Using SAS Enterprise Miner™ |
|
|
221 | (3) |
|
|
222 | (1) |
|
8.6.2 Additional Key Properties of the Cluster Node |
|
|
222 | (2) |
|
8.7 Applying Cluster Analysis to the Insurance Claim Fraud Data Set |
|
|
224 | (7) |
|
|
231 | (2) |
|
|
232 | (1) |
|
9 Text Analytics: Using Qualitative Data to Support Quantitative Results |
|
|
233 | (22) |
|
9.1 What Is Text Analytics? |
|
|
234 | (1) |
|
9.2 Information Retrieval |
|
|
235 | (2) |
|
|
237 | (3) |
|
|
240 | (1) |
|
|
241 | (2) |
|
|
243 | (3) |
|
|
246 | (3) |
|
|
249 | (2) |
|
|
251 | (1) |
|
|
252 | (3) |
|
|
254 | (1) |
Appendix A Data Dictionary for the Automobile Insurance Claim Fraud Data Example |
|
255 | (2) |
Appendix B Can You Predict the Money Laundering Cases? |
|
257 | (8) |
References |
|
265 | (2) |
Index |
|
267 | |