Preface | xi

Chapter 1 An Introduction to Data Mining | 1 | (15)
1 | (1)
2 | (1)
1.3 The Need for Human Direction of Data Mining | 3 | (1)
1.4 The Cross-Industry Standard Process for Data Mining | 4 | (2)
1.4.1 CRISP-DM: The Six Phases | 5 | (1)
1.5 Fallacies of Data Mining | 6 | (2)
1.6 What Tasks Can Data Mining Accomplish? | 8 | (8)
8 | (1)
8 | (2)
10 | (1)
10 | (2)
12 | (2)
14 | (1)
14 | (1)
15 | (1)

Chapter 2 Data Preprocessing | 16 | (35)
2.1 Why Do We Need to Preprocess the Data? | 17 | (1)
17 | (2)
2.3 Handling Missing Data | 19 | (3)
2.4 Identifying Misclassifications | 22 | (1)
2.5 Graphical Methods for Identifying Outliers | 22 | (1)
2.6 Measures of Center and Spread | 23 | (3)
26 | (1)
2.8 Min-Max Normalization | 26 | (1)
2.9 Z-Score Standardization | 27 | (1)
28 | (1)
2.11 Transformations to Achieve Normality | 28 | (7)
2.12 Numerical Methods for Identifying Outliers | 35 | (1)
36 | (1)
2.14 Transforming Categorical Variables into Numerical Variables | 37 | (1)
2.15 Binning Numerical Variables | 38 | (1)
2.16 Reclassifying Categorical Variables | 39 | (1)
2.17 Adding an Index Field | 39 | (1)
2.18 Removing Variables That Are Not Useful | 39 | (1)
2.19 Variables That Should Probably Not Be Removed | 40 | (1)
2.20 Removal of Duplicate Records | 41 | (1)
2.21 A Word About ID Fields | 41 | (10)
42 | (6)
48 | (1)
48 | (2)
50 | (1)

Chapter 3 Exploratory Data Analysis | 51 | (40)
3.1 Hypothesis Testing Versus Exploratory Data Analysis | 51 | (1)
3.2 Getting to Know the Data Set | 52 | (3)
3.3 Exploring Categorical Variables | 55 | (7)
3.4 Exploring Numeric Variables | 62 | (7)
3.5 Exploring Multivariate Relationships | 69 | (2)
3.6 Selecting Interesting Subsets of the Data for Further Investigation | 71 | (1)
3.7 Using EDA to Uncover Anomalous Fields | 71 | (1)
3.8 Binning Based on Predictive Value | 72 | (2)
3.9 Deriving New Variables: Flag Variables | 74 | (3)
3.10 Deriving New Variables: Numerical Variables | 77 | (1)
3.11 Using EDA to Investigate Correlated Predictor Variables | 77 | (3)
80 | (11)
82 | (6)
88 | (1)
88 | (1)
89 | (2)

Chapter 4 Univariate Statistical Analysis | 91 | (18)
4.1 Data Mining Tasks in Discovering Knowledge in Data | 91 | (1)
4.2 Statistical Approaches to Estimation and Prediction | 92 | (1)
4.3 Statistical Inference | 93 | (1)
4.4 How Confident Are We in Our Estimates? | 94 | (1)
4.5 Confidence Interval Estimation of the Mean | 95 | (2)
4.6 How to Reduce the Margin of Error | 97 | (1)
4.7 Confidence Interval Estimation of the Proportion | 98 | (1)
4.8 Hypothesis Testing for the Mean | 99 | (2)
4.9 Assessing the Strength of Evidence Against the Null Hypothesis | 101 | (1)
4.10 Using Confidence Intervals to Perform Hypothesis Tests | 102 | (2)
4.11 Hypothesis Testing for the Proportion | 104 | (5)
105 | (1)
106 | (1)
106 | (3)

Chapter 5 Multivariate Statistics | 109 | (29)
5.1 Two-Sample t-Test for Difference in Means | 110 | (1)
5.2 Two-Sample Z-Test for Difference in Proportions | 111 | (1)
5.3 Test for Homogeneity of Proportions | 112 | (2)
5.4 Chi-Square Test for Goodness of Fit of Multinomial Data | 114 | (1)
115 | (3)
118 | (4)
5.7 Hypothesis Testing in Regression | 122 | (1)
5.8 Measuring the Quality of a Regression Model | 123 | (1)
5.9 Dangers of Extrapolation | 123 | (2)
5.10 Confidence Intervals for the Mean Value of y Given x | 125 | (1)
5.11 Prediction Intervals for a Randomly Chosen Value of y Given x | 125 | (1)
126 | (1)
5.13 Verifying Model Assumptions | 127 | (11)
131 | (4)
135 | (1)
135 | (1)
136 | (2)

Chapter 6 Preparing to Model the Data | 138 | (11)
6.1 Supervised Versus Unsupervised Methods | 138 | (1)
6.2 Statistical Methodology and Data Mining Methodology | 139 | (1)
139 | (2)
141 | (1)
6.5 Bias-Variance Trade-Off | 142 | (2)
6.6 Balancing the Training Data Set | 144 | (1)
6.7 Establishing Baseline Performance | 145 | (4)
146 | (1)
147 | (1)
147 | (2)

Chapter 7 k-Nearest Neighbor Algorithm | 149 | (16)
149 | (1)
7.2 k-Nearest Neighbor Algorithm | 150 | (3)
153 | (3)
156 | (2)
7.4.1 Simple Unweighted Voting | 156 | (1)
156 | (2)
7.5 Quantifying Attribute Relevance: Stretching the Axes | 158 | (1)
7.6 Database Considerations | 158 | (1)
7.7 k-Nearest Neighbor Algorithm for Estimation and Prediction | 159 | (1)
160 | (1)
7.9 Application of k-Nearest Neighbor Algorithm Using IBM SPSS Modeler | 160 | (5)
162 | (1)
163 | (1)
164 | (1)

Chapter 8 Decision Trees | 165 | (22)
8.1 What Is a Decision Tree? | 165 | (2)
8.2 Requirements for Using Decision Trees | 167 | (1)
8.3 Classification and Regression Trees | 168 | (6)
174 | (5)
179 | (1)
8.6 Comparison of the C5.0 and CART Algorithms Applied to Real Data | 180 | (7)
183 | (1)
184 | (1)
185 | (1)
185 | (2)

Chapter 9 Neural Networks | 187 | (22)
9.1 Input and Output Encoding | 188 | (2)
9.2 Neural Networks for Estimation and Prediction | 190 | (1)
9.3 Simple Example of a Neural Network | 191 | (2)
9.4 Sigmoid Activation Function | 193 | (1)
194 | (4)
9.5.1 Gradient Descent Method | 194 | (1)
9.5.2 Back-Propagation Rules | 195 | (1)
9.5.3 Example of Back-Propagation | 196 | (2)
198 | (1)
198 | (1)
199 | (2)
201 | (1)
9.10 Application of Neural Network Modeling | 202 | (7)
204 | (3)
207 | (1)
207 | (1)
207 | (2)

Chapter 10 Hierarchical and k-Means Clustering | 209 | (19)
209 | (3)
10.2 Hierarchical Clustering Methods | 212 | (1)
10.3 Single-Linkage Clustering | 213 | (1)
10.4 Complete-Linkage Clustering | 214 | (1)
215 | (1)
10.6 Example of k-Means Clustering at Work | 216 | (3)
10.7 Behavior of MSB, MSE, and Pseudo-F as the k-Means Algorithm Proceeds | 219 | (1)
10.8 Application of k-Means Clustering Using SAS Enterprise Miner | 220 | (3)
10.9 Using Cluster Membership to Predict Churn | 223 | (5)
224 | (2)
226 | (1)
226 | (1)
226 | (2)

Chapter 11 Kohonen Networks | 228 | (19)
11.1 Self-Organizing Maps | 228 | (2)
230 | (1)
11.2.1 Kohonen Networks Algorithm | 231 | (1)
11.3 Example of a Kohonen Network Study | 231 | (4)
235 | (1)
11.5 Application of Clustering Using Kohonen Networks | 235 | (2)
11.6 Interpreting the Clusters | 237 | (5)
240 | (2)
11.7 Using Cluster Membership as Input to Downstream Data Mining Models | 242 | (5)
243 | (2)
245 | (1)
245 | (1)
245 | (2)

Chapter 12 Association Rules | 247 | (19)
12.1 Affinity Analysis and Market Basket Analysis | 247 | (2)
12.1.1 Data Representation for Market Basket Analysis | 248 | (1)
12.2 Support, Confidence, Frequent Itemsets, and the A Priori Property | 249 | (2)
12.3 How Does the A Priori Algorithm Work? | 251 | (4)
12.3.1 Generating Frequent Itemsets | 251 | (2)
12.3.2 Generating Association Rules | 253 | (2)
12.4 Extension from Flag Data to General Categorical Data | 255 | (1)
12.5 Information-Theoretic Approach: Generalized Rule Induction Method | 256 | (2)
257 | (1)
12.6 Association Rules Are Easy to Do Badly | 258 | (1)
12.7 How Can We Measure the Usefulness of Association Rules? | 259 | (1)
12.8 Do Association Rules Represent Supervised or Unsupervised Learning? | 260 | (1)
12.9 Local Patterns Versus Global Models | 261 | (5)
262 | (1)
263 | (1)
263 | (1)
264 | (2)

Chapter 13 Imputation of Missing Data | 266 | (11)
13.1 Need for Imputation of Missing Data | 266 | (1)
13.2 Imputation of Missing Data: Continuous Variables | 267 | (3)
13.3 Standard Error of the Imputation | 270 | (1)
13.4 Imputation of Missing Data: Categorical Variables | 271 | (1)
13.5 Handling Patterns in Missingness | 272 | (5)
273 | (3)
276 | (1)
276 | (1)
276 | (1)

Chapter 14 Model Evaluation Techniques | 277 | (17)
14.1 Model Evaluation Techniques for the Description Task | 278 | (1)
14.2 Model Evaluation Techniques for the Estimation and Prediction Tasks | 278 | (2)
14.3 Model Evaluation Techniques for the Classification Task | 280 | (1)
14.4 Error Rate, False Positives, and False Negatives | 280 | (3)
14.5 Sensitivity and Specificity | 283 | (1)
14.6 Misclassification Cost Adjustment to Reflect Real-World Concerns | 284 | (1)
14.7 Decision Cost/Benefit Analysis | 285 | (1)
14.8 Lift Charts and Gains Charts | 286 | (3)
14.9 Interweaving Model Evaluation with Model Building | 289 | (1)
14.10 Confluence of Results: Applying a Suite of Models | 290 | (4)
291 | (1)
291 | (1)
291 | (1)
291 | (3)

Appendix: Data Summarization and Visualization | 294 | (15)
Index | 309