Preface  xi
Authors  xv
  xvii
  xxi
List of Examples and R illustrations  xxv
Symbol Description  xxvii
|
1 Introduction and Overview  1
1.1 What is contamination?  4
1.2 Evaluating robustness  6
1.2.1  7
1.2.2 Local robustness: the influence function  8
1.2.3 Global robustness: the breakdown point  10
1.2.4 Global robustness: the maximum bias  11
1.3 What is data reduction?  12
1.3.1 Dimension reduction  13
1.3.2 Sample reduction  14
1.4 An overview of robust dimension reduction  15
1.5 An overview of robust sample reduction  18
1.6 Datasets  22
1.6.1 G8 macroeconomic data  22
1.6.2 Handwritten digits data  23
1.6.3  24
1.6.4 Metallic oxide data  24
1.6.5 Spam detection data  25
1.6.6 Video surveillance data  26
1.6.7 Water treatment plant data  27
|
2 Multivariate Estimation Methods  29
2.1 Robust univariate methods  30
2.1.1  31
2.1.2  32
2.1.3 Redescending M estimators  33
2.1.4  35
2.1.5 Measuring outlyingness  40
2.2 Classical multivariate estimation  42
2.3 Robust multivariate estimation  43
2.3.1 Multivariate M estimators  45
2.3.2 Multivariate S estimators  47
2.3.3 Multivariate MM estimators  48
2.3.4 Minimum Covariance Determinant  49
2.3.5  51
2.3.6 Other multivariate estimators  53
2.4 Identification of multivariate outliers  55
2.4.1 Multiple testing strategy  56
2.5 Examples  59
2.5.1 Italian demographics data  59
2.5.2 Star cluster CYG OB1 data  61
2.5.3  68
|
Part I Dimension Reduction  71
Introduction to Dimension Reduction  73
|
3 Principal Component Analysis  75
3.1  76
3.2 PCA based on robust covariance estimation  80
3.3 PCA based on projection pursuit  82
3.4  83
3.5 PCA in high dimensions  84
3.6 Outlier identification using principal components  85
3.7 Examples  87
3.7.1  87
3.7.2  93
3.7.3 Video surveillance data  96
|
4 Sparse Robust PCA  101
4.1 Basic concepts and sPCA  102
4.2  105
4.3 Choice of the degree of sparsity  107
4.4 Sparse projection pursuit  108
4.5 Examples  109
4.5.1  109
4.5.2  113
|
5 Canonical Correlation Analysis  117
5.1 Classical canonical correlation analysis  117
5.1.1 Interpretation of the results  119
5.1.2 Selection of the number of canonical variables  120
5.2 CCA based on robust covariance estimation  121
5.3  122
5.4 Examples  122
5.4.1  122
5.4.2  128
|
6 Factor Analysis  133
6.1  133
6.1.1 Fitting the FA model  135
6.2 Robust factor analysis  138
6.3 Examples  138
6.3.1  138
6.3.2  142
|
Part II Sample Reduction  145
Introduction to Sample Reduction  147
|
7 k-means and Model-Based Clustering  149
7.1 A brief overview of applications of cluster analysis  149
7.2  150
7.3  151
7.4 Model-based clustering  156
7.4.1 Likelihood inference  157
7.4.2 Distribution of component densities  159
7.4.3 Examples of model-based clustering  162
7.5 Choosing the number of clusters  164
|
8 Robust Clustering  171
8.1 Partitioning Around Medoids  171
8.2 Trimmed k-means  174
8.2.1 The double minimization problem involved with trimmed k-means  175
8.3 Snipped k-means  177
8.3.1 Snipping and the component-wise contamination model  178
8.3.2 Minimization of the loss function for snipped k-means  179
8.4 Choosing the trimming and snipping levels  181
8.5 Examples  184
8.5.1 Metallic oxide data  185
8.5.2 Handwritten digits data  186
|
9 Robust Model-Based Clustering  189
9.1 Robust heterogeneous clustering based on trimming  190
9.1.1 A robust CEM for model estimation: the tclust algorithm  191
9.1.2  193
9.2 Robust heterogeneous clustering based on snipping  195
9.2.1 A robust CEM for model estimation: the sclust algorithm  197
9.2.2  200
9.3 Examples  202
9.3.1 Metallic oxide data  202
9.3.2 Water treatment plant data  204
|
10 Double Clustering  209
10.1 Double k-means  210
10.2 Trimmed double k-means  212
10.3 Snipped double k-means  214
10.4 Robustness properties  214
|
11 Discriminant Analysis  219
11.1 Classical discriminant analysis  219
11.2 Robust discriminant analysis  222
|
A Use of the Software R for Data Reduction  231
A.1 Multivariate estimation methods  231
A.2  235
A.3  238
A.4 Canonical correlation analysis  239
A.5  240
A.6 Classical k-means and model based clustering  240
A.7  241
A.8 Robust double clustering  243
A.9 Discriminant analysis  244
Bibliography  245
Index  267