| Preface |
|
xiii | |
| Acknowledgments |
|
xv | |
| Presentational Conventions |
|
xvii | |
| About the Companion Website |
|
xix | |
|
Part I Introductory Background |
|
|
1 | (18) |
|
1 What Can We Do With Data? |
|
|
3 | (16) |
|
1.1 Big Data and Data Science |
|
|
4 | (1) |
|
1.2 Big Data Architectures |
|
|
5 | (1) |
|
|
|
6 | (1) |
|
|
|
7 | (2) |
|
1.5 A Short Taxonomy of Data Analytics |
|
|
9 | (1) |
|
|
|
10 | (2) |
|
1.6.1 Breast Cancer in Wisconsin |
|
|
11 | (1) |
|
1.6.2 Polish Company Insolvency Data |
|
|
11 | (1) |
|
1.7 A Project on Data Analytics |
|
|
12 | (4) |
|
1.7.1 A Little History on Methodologies for Data Analytics |
|
|
12 | (2) |
|
|
|
14 | (1) |
|
1.7.3 The CRISP-DM Methodology |
|
|
15 | (1) |
|
1.8 How this Book is Organized |
|
|
16 | (2) |
|
1.9 Who Should Read this Book |
|
|
18 | (1) |
|
Part II Getting Insights from Data |
|
|
19 | (140) |
|
|
|
21 | (28) |
|
|
|
22 | (3) |
|
2.2 Descriptive Univariate Analysis |
|
|
25 | (15) |
|
2.2.1 Univariate Frequencies |
|
|
25 | (2) |
|
2.2.2 Univariate Data Visualization |
|
|
27 | (5) |
|
2.2.3 Univariate Statistics |
|
|
32 | (6) |
|
2.2.4 Common Univariate Probability Distributions |
|
|
38 | (2) |
|
2.3 Descriptive Bivariate Analysis |
|
|
40 | (7) |
|
2.3.1 Two Quantitative Attributes |
|
|
41 | (4) |
|
2.3.2 Two Qualitative Attributes, at Least one of them Nominal |
|
|
45 | (1) |
|
2.3.3 Two Ordinal Attributes |
|
|
46 | (1) |
|
|
|
47 | (1) |
|
|
|
47 | (2) |
|
3 Descriptive Multivariate Analysis |
|
|
49 | (22) |
|
3.1 Multivariate Frequencies |
|
|
49 | (1) |
|
3.2 Multivariate Data Visualization |
|
|
50 | (9) |
|
3.3 Multivariate Statistics |
|
|
59 | (7) |
|
3.3.1 Location Multivariate Statistics |
|
|
59 | (1) |
|
3.3.2 Dispersion Multivariate Statistics |
|
|
60 | (6) |
|
3.4 Infographics and Word Clouds |
|
|
66 | (1) |
|
|
|
66 | (1) |
|
|
|
67 | (1) |
|
|
|
67 | (1) |
|
|
|
68 | (3) |
|
4 Data Quality and Preprocessing |
|
|
71 | (28) |
|
|
|
71 | (6) |
|
|
|
72 | (2) |
|
|
|
74 | (1) |
|
|
|
75 | (1) |
|
|
|
76 | (1) |
|
|
|
77 | (1) |
|
4.2 Converting to a Different Scale Type |
|
|
77 | (6) |
|
4.2.1 Converting Nominal to Relative |
|
|
78 | (3) |
|
4.2.2 Converting Ordinal to Relative or Absolute |
|
|
81 | (1) |
|
4.2.3 Converting Relative or Absolute to Ordinal or Nominal |
|
|
82 | (1) |
|
4.3 Converting to a Different Scale |
|
|
83 | (2) |
|
|
|
85 | (1) |
|
4.5 Dimensionality Reduction |
|
|
86 | (10) |
|
4.5.1 Attribute Aggregation |
|
|
88 | (1) |
|
4.5.1.1 Principal Component Analysis |
|
|
88 | (3) |
|
4.5.1.2 Independent Component Analysis |
|
|
91 | (1) |
|
4.5.1.3 Multidimensional Scaling |
|
|
91 | (1) |
|
4.5.2 Attribute Selection |
|
|
92 | (1) |
|
|
|
92 | (1) |
|
|
|
93 | (1) |
|
|
|
94 | (1) |
|
4.5.2.4 Search Strategies |
|
|
95 | (1) |
|
|
|
96 | (1) |
|
|
|
96 | (3) |
|
|
|
99 | (26) |
|
|
|
100 | (7) |
|
5.1.1 Differences between Values of Common Attribute Types |
|
|
101 | (2) |
|
5.1.2 Distance Measures for Objects with Quantitative Attributes |
|
|
103 | (1) |
|
5.1.3 Distance Measures for Non-conventional Attributes |
|
|
104 | (3) |
|
5.2 Clustering Validation |
|
|
107 | (1) |
|
5.3 Clustering Techniques |
|
|
108 | (14) |
|
|
|
110 | (1) |
|
5.3.1.1 Centroids and Distance Measures |
|
|
110 | (1) |
|
5.3.1.2 How K-means Works |
|
|
111 | (4) |
|
|
|
115 | (2) |
|
5.3.3 Agglomerative Hierarchical Clustering Technique |
|
|
117 | (2) |
|
5.3.3.1 Linkage Criterion |
|
|
119 | (1) |
|
|
|
120 | (2) |
|
|
|
122 | (1) |
|
|
|
123 | (2) |
|
6 Frequent Pattern Mining |
|
|
125 | (26) |
|
|
|
127 | (12) |
|
6.1.1 Setting the min_sup Threshold |
|
|
128 | (3) |
|
6.1.2 Apriori -- a Join-based Method |
|
|
131 | (2) |
|
|
|
133 | (1) |
|
|
|
134 | (4) |
|
6.1.5 Maximal and Closed Frequent Itemsets |
|
|
138 | (1) |
|
|
|
139 | (3) |
|
6.3 Behind Support and Confidence |
|
|
142 | (5) |
|
6.3.1 Cross-support Patterns |
|
|
143 | (1) |
|
|
|
144 | (1) |
|
|
|
145 | (2) |
|
6.4 Other Types of Pattern |
|
|
147 | (2) |
|
6.4.1 Sequential Patterns |
|
|
147 | (1) |
|
6.4.2 Frequent Sequence Mining |
|
|
148 | (1) |
|
6.4.3 Closed and Maximal Sequences |
|
|
148 | (1) |
|
|
|
149 | (1) |
|
|
|
149 | (2) |
|
7 Cheat Sheet and Project on Descriptive Analytics |
|
|
151 | (8) |
|
7.1 Cheat Sheet of Descriptive Analytics |
|
|
151 | (3) |
|
7.1.1 On Data Summarization |
|
|
151 | (1) |
|
|
|
151 | (2) |
|
7.1.3 On Frequent Pattern Mining |
|
|
153 | (1) |
|
7.2 Project on Descriptive Analytics |
|
|
154 | (5) |
|
7.2.1 Business Understanding |
|
|
154 | (1) |
|
|
|
155 | (1) |
|
|
|
155 | (2) |
|
|
|
157 | (1) |
|
|
|
158 | (1) |
|
|
|
158 | (1) |
|
Part III Predicting the Unknown |
|
|
159 | (108) |
|
|
|
161 | (26) |
|
8.1 Predictive Performance Estimation |
|
|
164 | (7) |
|
|
|
164 | (1) |
|
|
|
165 | (4) |
|
8.1.3 Predictive Performance Measures for Regression |
|
|
169 | (2) |
|
8.2 Finding the Parameters of the Model |
|
|
171 | (11) |
|
|
|
171 | (2) |
|
|
|
173 | (2) |
|
8.2.2 The Bias-variance Trade-off |
|
|
175 | (2) |
|
|
|
177 | (2) |
|
|
|
179 | (1) |
|
|
|
180 | (1) |
|
8.2.4 Methods that use Linear Combinations of Attributes |
|
|
181 | (1) |
|
8.2.4.1 Principal Components Regression |
|
|
181 | (1) |
|
8.2.4.2 Partial Least Squares Regression |
|
|
182 | (1) |
|
8.3 Technique and Model Selection |
|
|
182 | (1) |
|
|
|
183 | (1) |
|
|
|
184 | (3) |
|
|
|
187 | (24) |
|
9.1 Binary Classification |
|
|
188 | (4) |
|
9.2 Predictive Performance Measures for Classification |
|
|
192 | (7) |
|
9.3 Distance-based Learning Algorithms |
|
|
199 | (4) |
|
9.3.1 K-nearest Neighbor Algorithms |
|
|
199 | (3) |
|
9.3.2 Case-based Reasoning |
|
|
202 | (1) |
|
9.4 Probabilistic Classification Algorithms |
|
|
203 | (5) |
|
9.4.1 Logistic Regression Algorithm |
|
|
205 | (2) |
|
9.4.2 Naive Bayes Algorithm |
|
|
207 | (1) |
|
|
|
208 | (2) |
|
|
|
210 | (1) |
|
10 Additional Predictive Methods |
|
|
211 | (30) |
|
10.1 Search-based Algorithms |
|
|
211 | (10) |
|
10.1.1 Decision Tree Induction Algorithms |
|
|
212 | (5) |
|
10.1.2 Decision Trees for Regression |
|
|
217 | (1) |
|
|
|
218 | (1) |
|
10.1.2.2 Multivariate Adaptive Regression Splines |
|
|
219 | (2) |
|
10.2 Optimization-based Algorithms |
|
|
221 | (17) |
|
10.2.1 Artificial Neural Networks |
|
|
222 | (2) |
|
|
|
224 | (6) |
|
10.2.1.2 Deep Networks and Deep Learning Algorithms |
|
|
230 | (3) |
|
10.2.2 Support Vector Machines |
|
|
233 | (4) |
|
10.2.2.1 SVM for Regression |
|
|
237 | (1) |
|
|
|
238 | (1) |
|
|
|
239 | (2) |
|
11 Advanced Predictive Topics |
|
|
241 | (18) |
|
|
|
241 | (5) |
|
|
|
243 | (1) |
|
|
|
244 | (1) |
|
|
|
245 | (1) |
|
|
|
246 | (2) |
|
11.3 Non-binary Classification Tasks |
|
|
248 | (5) |
|
11.3.1 One-class Classification |
|
|
248 | (1) |
|
11.3.2 Multi-class Classification |
|
|
249 | (1) |
|
11.3.3 Ranking Classification |
|
|
250 | (1) |
|
11.3.4 Multi-label Classification |
|
|
251 | (1) |
|
11.3.5 Hierarchical Classification |
|
|
252 | (1) |
|
11.4 Advanced Data Preparation Techniques for Prediction |
|
|
253 | (2) |
|
11.4.1 Imbalanced Data Classification |
|
|
253 | (1) |
|
11.4.2 For Incomplete Target Labeling |
|
|
254 | (1) |
|
11.4.2.1 Semi-supervised Learning |
|
|
254 | (1) |
|
|
|
255 | (1) |
|
11.5 Description and Prediction with Supervised Interpretable Techniques |
|
|
255 | (1) |
|
|
|
256 | (3) |
|
12 Cheat Sheet and Project on Predictive Analytics |
|
|
259 | (8) |
|
12.1 Cheat Sheet on Predictive Analytics |
|
|
259 | (1) |
|
12.2 Project on Predictive Analytics |
|
|
259 | (8) |
|
12.2.1 Business Understanding |
|
|
260 | (1) |
|
12.2.2 Data Understanding |
|
|
260 | (5) |
|
|
|
265 | (1) |
|
|
|
265 | (1) |
|
|
|
265 | (1) |
|
|
|
266 | (1) |
|
Part IV Popular Data Analytics Applications |
|
|
267 | (36) |
|
13 Applications for Text, Web and Social Media |
|
|
269 | (34) |
|
|
|
269 | (9) |
|
|
|
271 | (1) |
|
13.1.2 Feature Extraction |
|
|
271 | (1) |
|
|
|
272 | (1) |
|
|
|
272 | (3) |
|
13.1.2.3 Conversion to Structured Data |
|
|
275 | (1) |
|
13.1.2.4 Is the Bag of Words Enough? |
|
|
276 | (1) |
|
|
|
277 | (1) |
|
|
|
277 | (1) |
|
13.1.4.1 Sentiment Analysis |
|
|
278 | (1) |
|
|
|
278 | (1) |
|
|
|
278 | (13) |
|
|
|
279 | (1) |
|
13.2.2 Recommendation Tasks |
|
|
280 | (1) |
|
13.2.3 Recommendation Techniques |
|
|
281 | (1) |
|
13.2.3.1 Knowledge-based Techniques |
|
|
281 | (1) |
|
13.2.3.2 Content-based Techniques |
|
|
282 | (1) |
|
13.2.3.3 Collaborative Filtering Techniques |
|
|
282 | (7) |
|
|
|
289 | (2) |
|
13.3 Social Network Analysis |
|
|
291 | (9) |
|
13.3.1 Representing Social Networks |
|
|
291 | (3) |
|
13.3.2 Basic Properties of Nodes |
|
|
294 | (1) |
|
|
|
294 | (1) |
|
|
|
294 | (1) |
|
|
|
295 | (1) |
|
|
|
296 | (1) |
|
13.3.2.5 Clustering Coefficient |
|
|
297 | (1) |
|
13.3.3 Basic and Structural Properties of Networks |
|
|
297 | (1) |
|
|
|
297 | (1) |
|
|
|
297 | (2) |
|
|
|
299 | (1) |
|
13.3.3.4 Clustering Coefficient |
|
|
299 | (1) |
|
|
|
299 | (1) |
|
13.3.4 Trends and Final Remarks |
|
|
299 | (1) |
|
|
|
300 | (3) |
| Appendix A Comprehensive Description of the CRISP-DM Methodology |
|
303 | (8) |
| References |
|
311 | (4) |
| Index |
|
315 | |