Preface  xiii
Acknowledgment  xv
Author  xvii
|
Chapter 1 Introduction to Machine Learning  1
1.1 Machine Learning, Statistical Analysis, and Data Science  2
1.2 Machine Learning: A First Example  3
1.2.1 Attribute-Value Format  3
1.2.2 A Decision Tree for Diagnosing Illness  4
1.3 Machine Learning Strategies  6
1.3.4 Unsupervised Clustering  11
1.3.5 Market Basket Analysis  12
1.4 Evaluating Performance  12
1.4.1 Evaluating Supervised Models  12
1.4.2 Two-Class Error Analysis  13
1.4.3 Evaluating Numeric Output  14
1.4.4 Comparing Models by Measuring Lift  15
1.4.5 Unsupervised Model Evaluation  17
|
Chapter 2 Introduction to R  25
2.1 Introducing R and RStudio  25
2.2.3 The Global Environment  32
2.4 Obtaining Help and Additional Information  38
|
Chapter 3 Data Structures and Manipulation  41
3.1.1 Character Data and Factors  42
3.2 Single-Mode Data Structures  44
3.2.2 Matrices and Arrays  46
3.3 Multimode Data Structures  47
3.4 Writing Your Own Functions  50
3.4.1 Writing a Simple Function  50
3.4.2 Conditional Statements  52
3.4.4 Recursive Programming  57
|
Chapter 4 Preparing the Data  61
4.1 A Process Model for Knowledge Discovery  61
4.2 Creating a Target Dataset  62
4.2.1 Interfacing R with the Relational Model  64
4.2.2 Additional Sources for Target Data  66
4.3.2 Preprocessing with R  67
4.4.2 Data Type Conversion  72
4.4.3 Attribute and Instance Selection  72
4.4.4 Creating Training and Test Set Data  74
4.4.5 Cross Validation and Bootstrapping  74
|
Chapter 5 Supervised Statistical Techniques  79
5.1 Simple Linear Regression  79
5.2 Multiple Linear Regression  85
5.2.1 Multiple Linear Regression: An Example  85
5.2.2 Evaluating Numeric Output  88
5.2.3 Training/Test Set Evaluation  89
5.2.4 Using Cross Validation  91
5.2.5 Linear Regression with Categorical Data  93
5.3 Logistic Regression  99
5.3.1 Transforming the Linear Regression Model  100
5.3.2 The Logistic Regression Model  100
5.3.3 Logistic Regression with R  101
5.3.4 Creating a Confusion Matrix  104
5.3.5 Receiver Operating Characteristics (ROC) Curves  104
5.3.6 The Area under an ROC Curve  108
5.4 Naive Bayes Classifier  109
5.4.1 Bayes Classifier: An Example  109
5.4.2 Zero-Valued Attribute Counts  112
5.4.5 Experimenting with Naive Bayes  115
|
Chapter 6 Tree-Based Methods  127
6.1 A Decision Tree Algorithm  127
6.1.1 An Algorithm for Building Decision Trees  128
6.1.2 C4.5 Attribute Selection  128
6.1.3 Other Methods for Building Decision Trees  133
6.2 Building Decision Trees: C5.0  133
6.2.1 A Decision Tree for Credit Card Promotions  134
6.2.2 Data for Simulating Customer Churn  135
6.2.3 Predicting Customer Churn with C5.0  136
6.3 Building Decision Trees: rpart  137
6.3.1 An rpart Decision Tree for Credit Card Promotions  139
6.3.2 Train and Test rpart: Churn Data  141
6.3.3 Cross Validation rpart: Churn Data  143
6.4 Building Decision Trees: J48  147
6.5 Ensemble Techniques for Improving Performance  149
6.5.3 Boosting: An Example with C5.0  150
|
Chapter 7 Rule-Based Techniques  161
7.1.1 The Spam Email Dataset  162
7.1.2 Spam Email Classification: C5.0  163
7.2 A Basic Covering Rule Algorithm  165
7.2.1 Generating Covering Rules with JRip  166
7.3 Generating Association Rules  169
7.3.1 Confidence and Support  169
7.3.2 Mining Association Rules: An Example  170
7.3.3 General Considerations  173
7.3.4 RWeka's Apriori Function  173
7.4 Shake, Rattle, and Roll  177
|
Chapter 8 Neural Networks  189
8.1 Feed-Forward Neural Networks  190
8.1.1 Neural Network Input Format  190
8.1.2 Neural Network Output Format  192
8.1.3 The Sigmoid Evaluation Function  193
8.2 Neural Network Training: A Conceptual View  194
8.2.1 Supervised Learning with Feed-Forward Networks  194
8.2.2 Unsupervised Clustering with Self-Organizing Maps  195
8.3 Neural Network Explanation  196
8.4 General Considerations  197
8.5 Neural Network Training: A Detailed View  198
8.5.1 The Backpropagation Algorithm: An Example  198
8.5.2 Kohonen Self-Organizing Maps: An Example  202
8.6 Building Neural Networks with R  203
8.6.1 The Exclusive-or Function  204
8.6.2 Modeling Exclusive-or with MLP: Numeric Output  206
8.6.3 Modeling Exclusive-or with MLP: Categorical Output  210
8.6.4 Modeling Exclusive-or with neuralnet: Numeric Output  212
8.6.5 Modeling Exclusive-or with neuralnet: Categorical Output  214
8.6.6 Classifying Satellite Image Data  216
8.6.7 Testing for Diabetes  220
8.7 Neural Net Clustering for Attribute Evaluation  223
8.8 Time Series Analysis  227
8.8.1 Stock Market Analytics  227
8.8.2 Time Series Analysis: An Example  228
8.8.4 Modeling the Time Series  230
8.8.5 General Considerations  232
|
Chapter 9 Formal Evaluation Techniques  239
9.1 What Should Be Evaluated?  240
9.2.1 Single-Valued Summary Statistics  242
9.2.2 The Normal Distribution  242
9.2.3 Normal Distributions and Sample Means  244
9.2.4 A Classical Model for Hypothesis Testing  245
9.3 Computing Test Set Confidence Intervals  247
9.4 Comparing Supervised Models  249
9.4.1 Comparing the Performance of Two Models  251
9.4.2 Comparing the Performance of Two or More Models  252
9.5 Confidence Intervals for Numeric Output  253
|
Chapter 10 Support Vector Machines  257
10.1 Linearly Separable Classes  259
10.3 Experimenting with Linearly Separable Data  265
10.4 Microarray Data Mining  267
10.4.1 DNA and Gene Expression  267
10.4.2 Preprocessing Microarray Data: Attribute Selection  268
10.4.3 Microarray Data Mining: Issues  269
10.5 A Microarray Application  269
10.5.1 Establishing a Benchmark  270
10.5.2 Attribute Elimination  271
|
Chapter 11 Unsupervised Clustering Techniques  279
11.1 The K-Means Algorithm  280
11.1.1 An Example Using K-Means  280
11.1.2 General Considerations  283
11.2 Agglomerative Clustering  284
11.2.1 Agglomerative Clustering: An Example  284
11.2.2 General Considerations  286
11.3 Conceptual Clustering  287
11.3.1 Measuring Category Utility  287
11.3.2 Conceptual Clustering: An Example  288
11.3.3 General Considerations  290
11.4 Expectation Maximization  291
11.5 Unsupervised Clustering with R  292
11.5.1 Supervised Learning for Cluster Evaluation  292
11.5.2 Unsupervised Clustering for Attribute Evaluation  294
11.5.3 Agglomerative Clustering: A Simple Example  297
11.5.4 Agglomerative Clustering of Gamma-Ray Burst Data  298
11.5.5 Agglomerative Clustering of Cardiology Patient Data  301
11.5.6 Agglomerative Clustering of Credit Screening Data  303
|
Chapter 12 A Case Study in Predicting Treatment Outcome  311
12.2 A Measure of Treatment Success  314
12.3 Target Data Creation  315
12.6.1 Two-Class Experiments  316
12.7 Interpretation and Evaluation  318
12.7.1 Should Patients' Torsos Rotate?  318
|
Bibliography  321
Appendix A Supplementary Materials and More Datasets  327
Appendix B Statistics for Performance Evaluation  329
Subject Index  335
Index of R Functions  341
Script Index  343