Preface |
|
xix | |
Abbreviations |
|
xxiii | |
|
|
1 | (12) |
|
|
2 | (2) |
|
1.2 Dimensional Reduction |
|
|
4 | (1) |
|
|
4 | (1) |
|
1.4 Classification Rules of Thumb |
|
|
5 | (4) |
|
1.5 DNA Microarray Datasets Used |
|
|
9 | (2) |
|
|
11 | (2) |
Part I Class Discovery |
|
13 | (146) |
|
2 Crisp K-Means Cluster Analysis |
|
|
15 | (32) |
|
|
15 | (1) |
|
|
16 | (2) |
|
|
18 | (2) |
|
|
20 | (4) |
|
|
24 | (11) |
|
2.5.1 Davies-Bouldin Index |
|
|
25 | (1) |
|
|
25 | (1) |
|
2.5.3 Intracluster Distance |
|
|
26 | (1) |
|
2.5.4 Intercluster Distance |
|
|
27 | (3) |
|
|
30 | (1) |
|
2.5.6 Hubert's Γ Statistic |
|
|
31 | (1) |
|
2.5.7 Randomization Tests for Optimal Value of K |
|
|
31 | (4) |
|
2.6 V-Fold Cross-Validation |
|
|
35 | (2) |
|
2.7 Cluster Initialization |
|
|
37 | (7) |
|
2.7.1 K Randomly Selected Microarrays |
|
|
37 | (3) |
|
2.7.2 K Random Partitions |
|
|
40 | (1) |
|
2.7.3 Prototype Splitting |
|
|
41 | (3) |
|
|
44 | (1) |
|
|
44 | (1) |
|
|
45 | (2) |
|
3 Fuzzy K-Means Cluster Analysis |
|
|
47 | (10) |
|
|
47 | (1) |
|
3.2 Fuzzy K-Means Algorithm |
|
|
47 | (2) |
|
|
49 | (5) |
|
|
54 | (1) |
|
|
54 | (3) |
|
|
57 | (24) |
|
|
57 | (1) |
|
|
57 | (6) |
|
4.2.1 Feature Transformation and Reference Vector Initialization |
|
|
59 | (1) |
|
|
60 | (1) |
|
|
61 | (2) |
|
|
63 | (4) |
|
4.3.1 Feature Transformation and Reference Vector Initialization |
|
|
63 | (3) |
|
4.3.2 Reference Vector Weight Learning |
|
|
66 | (1) |
|
4.4 Cluster Visualization |
|
|
67 | (4) |
|
4.4.1 Crisp K-Means Cluster Analysis |
|
|
67 | (1) |
|
4.4.2 Adjacency Matrix Method |
|
|
68 | (1) |
|
4.4.3 Cluster Connectivity Method |
|
|
69 | (1) |
|
4.4.4 Hue-Saturation-Value (HSV) Color Normalization |
|
|
69 | (2) |
|
4.5 Unified Distance Matrix (U Matrix) |
|
|
71 | (1) |
|
|
71 | (2) |
|
|
73 | (2) |
|
4.8 Nonlinear Dimension Reduction |
|
|
75 | (4) |
|
|
79 | (2) |
|
5 Unsupervised Neural Gas |
|
|
81 | (10) |
|
|
81 | (1) |
|
|
82 | (1) |
|
|
82 | (3) |
|
5.3.1 Feature Transformation and Prototype Initialization |
|
|
82 | (1) |
|
|
83 | (2) |
|
5.4 Nonlinear Dimension Reduction |
|
|
85 | (2) |
|
|
87 | (1) |
|
|
88 | (3) |
|
6 Hierarchical Cluster Analysis |
|
|
91 | (16) |
|
|
91 | (1) |
|
|
91 | (5) |
|
6.2.1 General Programming Methods |
|
|
91 | (1) |
|
6.2.2 Step 1: Cluster-Analyzing Arrays as Objects with Genes as Attributes |
|
|
92 | (2) |
|
6.2.3 Step 2: Cluster-Analyzing Genes as Objects with Arrays as Attributes |
|
|
94 | (2) |
|
|
96 | (1) |
|
|
96 | (9) |
|
6.4.1 Heatmap Color Control |
|
|
96 | (1) |
|
6.4.2 User Choices for Clustering Arrays and Genes |
|
|
97 | (1) |
|
6.4.3 Distance Matrices and Agglomeration Sequences |
|
|
98 | (6) |
|
6.4.4 Drawing Dendrograms and Heatmaps |
|
|
104 | (1) |
|
|
105 | (2) |
|
|
107 | (12) |
|
|
107 | (3) |
|
|
110 | (1) |
|
|
111 | (5) |
|
|
116 | (1) |
|
|
117 | (2) |
|
8 Text Mining: Document Clustering |
|
|
119 | (20) |
|
|
119 | (1) |
|
|
119 | (1) |
|
8.3 Streams and Documents |
|
|
120 | (1) |
|
|
120 | (1) |
|
|
120 | (1) |
|
|
121 | (1) |
|
|
121 | (1) |
|
|
121 | (3) |
|
|
124 | (1) |
|
8.8 Main Terms Representing Concept Vectors |
|
|
124 | (1) |
|
|
125 | (2) |
|
|
127 | (10) |
|
|
137 | (1) |
|
|
137 | (2) |
|
9 Text Mining: N-Gram Analysis |
|
|
139 | (20) |
|
|
139 | (1) |
|
|
140 | (1) |
|
|
141 | (13) |
|
|
154 | (2) |
|
|
156 | (3) |
Part II Dimension Reduction |
|
159 | (46) |
|
10 Principal Components Analysis |
|
|
161 | (28) |
|
|
161 | (1) |
|
10.2 Multivariate Statistical Theory |
|
|
161 | (9) |
|
10.2.1 Matrix Definitions |
|
|
162 | (1) |
|
10.2.2 Principal Component Solution of R |
|
|
163 | (1) |
|
10.2.3 Extraction of Principal Components |
|
|
164 | (2) |
|
10.2.4 Varimax Orthogonal Rotation of Components |
|
|
166 | (2) |
|
10.2.5 Principal Component Score Coefficients |
|
|
168 | (1) |
|
10.2.6 Principal Component Scores |
|
|
169 | (1) |
|
|
170 | (1) |
|
10.4 When to Use Loadings and PC Scores |
|
|
170 | (1) |
|
|
171 | (11) |
|
10.5.1 Correlation Matrix R |
|
|
171 | (1) |
|
10.5.2 Eigenanalysis of Correlation Matrix R |
|
|
172 | (2) |
|
10.5.3 Determination of Loadings and Varimax Rotation |
|
|
174 | (2) |
|
10.5.4 Calculating Principal Component (PC) Scores |
|
|
176 | (6) |
|
10.6 Rules of Thumb for PCA |
|
|
182 | (4) |
|
|
186 | (1) |
|
|
187 | (2) |
|
11 Nonlinear Manifold Learning |
|
|
189 | (16) |
|
|
189 | (1) |
|
11.2 Correlation-Based PCA |
|
|
190 | (1) |
|
|
191 | (1) |
|
|
192 | (1) |
|
|
192 | (1) |
|
11.6 Local Linear Embedding |
|
|
193 | (1) |
|
11.7 Locality Preserving Projections |
|
|
194 | (1) |
|
|
195 | (1) |
|
11.9 NLML Prior to Classification Analysis |
|
|
195 | (2) |
|
11.10 Classification Results |
|
|
197 | (3) |
|
|
200 | (3) |
|
|
203 | (2) |
Part III Class Prediction |
|
205 | (420) |
|
|
207 | (66) |
|
|
207 | (1) |
|
12.2 Filtering versus Wrapping |
|
|
208 | (1) |
|
|
209 | (2) |
|
|
209 | (1) |
|
|
209 | (1) |
|
12.3.3 Measurement Scales |
|
|
210 | (1) |
|
|
211 | (1) |
|
|
211 | (2) |
|
|
213 | (41) |
|
12.5.1 Continuous Features |
|
|
213 | (6) |
|
|
219 | (17) |
|
12.5.3 Randomization Tests |
|
|
236 | (1) |
|
12.5.4 Multitesting Problem |
|
|
237 | (5) |
|
12.5.5 Filtering Qualitative Features |
|
|
242 | (4) |
|
12.5.6 Multiclass Gini Diversity Index |
|
|
246 | (1) |
|
12.5.7 Class Comparison Techniques |
|
|
247 | (3) |
|
12.5.8 Generation of Nonredundant Gene List |
|
|
250 | (4) |
|
|
254 | (5) |
|
12.6.1 Greedy Plus Takeaway (Greedy PTA) |
|
|
254 | (4) |
|
|
258 | (1) |
|
|
259 | (11) |
|
|
270 | (1) |
|
|
270 | (3) |
|
13 Classifier Performance |
|
|
273 | (24) |
|
|
273 | (1) |
|
13.2 Input-Output, Speed, and Efficiency |
|
|
273 | (4) |
|
13.3 Training, Testing, and Validation |
|
|
277 | (3) |
|
13.4 Ensemble Classifier Fusion |
|
|
280 | (3) |
|
13.5 Sensitivity and Specificity |
|
|
283 | (1) |
|
|
284 | (1) |
|
|
285 | (1) |
|
13.8 Receiver-Operator Characteristic (ROC) Curves |
|
|
286 | (9) |
|
|
295 | (2) |
|
|
297 | (14) |
|
|
297 | (2) |
|
|
299 | (1) |
|
|
299 | (1) |
|
14.4 Cross-Validation Results |
|
|
300 | (3) |
|
|
303 | (3) |
|
14.6 Multiclass ROC Curves |
|
|
306 | (2) |
|
|
308 | (2) |
|
|
310 | (1) |
|
|
310 | (1) |
|
15 Decision Tree Classification |
|
|
311 | (20) |
|
|
311 | (3) |
|
|
314 | (1) |
|
15.3 Terminal Nodes and Stopping Criteria |
|
|
315 | (1) |
|
|
315 | (1) |
|
|
315 | (3) |
|
15.6 Cross-Validation Results |
|
|
318 | (8) |
|
|
326 | (1) |
|
|
327 | (2) |
|
|
329 | (2) |
|
|
331 | (30) |
|
|
331 | (2) |
|
|
333 | (1) |
|
|
334 | (4) |
|
16.4 Strength and Correlation |
|
|
338 | (4) |
|
16.5 Proximity and Supervised Clustering |
|
|
342 | (3) |
|
16.6 Unsupervised Clustering |
|
|
345 | (3) |
|
16.7 Class Outlier Detection |
|
|
348 | (2) |
|
|
350 | (1) |
|
|
350 | (7) |
|
|
357 | (1) |
|
|
358 | (3) |
|
|
361 | (18) |
|
|
361 | (1) |
|
|
362 | (1) |
|
|
363 | (1) |
|
17.4 Cross-Validation Results |
|
|
364 | (5) |
|
|
369 | (4) |
|
17.6 Multiclass ROC Curves |
|
|
373 | (1) |
|
|
374 | (3) |
|
|
377 | (1) |
|
|
378 | (1) |
|
|
379 | (14) |
|
|
379 | (1) |
|
|
380 | (1) |
|
18.3 Cross-Validation Results |
|
|
380 | (4) |
|
|
384 | (2) |
|
18.5 Multiclass ROC Curves |
|
|
386 | (1) |
|
|
386 | (3) |
|
|
389 | (2) |
|
|
391 | (2) |
|
19 Linear Discriminant Analysis |
|
|
393 | (22) |
|
|
393 | (1) |
|
19.2 Multivariate Matrix Definitions |
|
|
394 | (2) |
|
19.3 Linear Discriminant Analysis |
|
|
396 | (7) |
|
|
397 | (1) |
|
19.3.2 Cross-Validation Results |
|
|
397 | (4) |
|
|
401 | (1) |
|
19.3.4 Multiclass ROC Curves |
|
|
402 | (1) |
|
19.3.5 Decision Boundaries |
|
|
403 | (1) |
|
19.4 Quadratic Discriminant Analysis |
|
|
403 | (3) |
|
19.5 Fisher's Discriminant Analysis |
|
|
406 | (5) |
|
|
411 | (1) |
|
|
412 | (3) |
|
20 Learning Vector Quantization |
|
|
415 | (18) |
|
|
415 | (2) |
|
20.2 Cross-Validation Results |
|
|
417 | (1) |
|
|
417 | (9) |
|
20.4 Multiclass ROC Curves |
|
|
426 | (2) |
|
|
428 | (1) |
|
|
428 | (2) |
|
|
430 | (3) |
|
|
433 | (16) |
|
|
433 | (1) |
|
21.2 Binary Logistic Regression |
|
|
434 | (5) |
|
21.3 Polytomous Logistic Regression |
|
|
439 | (4) |
|
21.4 Cross-Validation Results |
|
|
443 | (1) |
|
|
444 | (1) |
|
|
444 | (3) |
|
|
447 | (2) |
|
22 Support Vector Machines |
|
|
449 | (38) |
|
|
449 | (1) |
|
22.2 Hard-Margin SVM for Linearly Separable Classes |
|
|
449 | (3) |
|
22.3 Kernel Mapping into Nonlinear Feature Space |
|
|
452 | (1) |
|
22.4 Soft-Margin SVM for Nonlinearly Separable Classes |
|
|
452 | (2) |
|
22.5 Gradient Ascent Soft-Margin SVM |
|
|
454 | (11) |
|
22.5.1 Cross-Validation Results |
|
|
455 | (2) |
|
|
457 | (8) |
|
22.5.3 Multiclass ROC Curves |
|
|
465 | (1) |
|
22.5.4 Decision Boundaries |
|
|
465 | (1) |
|
22.6 Least-Squares Soft-Margin SVM |
|
|
465 | (16) |
|
22.6.1 Cross-Validation Results |
|
|
470 | (7) |
|
|
477 | (1) |
|
22.6.3 Multiclass ROC Curves |
|
|
477 | (1) |
|
22.6.4 Decision Boundaries |
|
|
477 | (4) |
|
|
481 | (2) |
|
|
483 | (4) |
|
23 Artificial Neural Networks |
|
|
487 | (38) |
|
|
487 | (1) |
|
|
488 | (1) |
|
23.3 Basics of ANN Training |
|
|
488 | (9) |
|
23.3.1 Backpropagation Learning |
|
|
493 | (3) |
|
23.3.2 Resilient Backpropagation (RPROP) Learning |
|
|
496 | (1) |
|
|
496 | (1) |
|
23.4 ANN Training Methods |
|
|
497 | (5) |
|
23.4.1 Method 1: Gene Dimensional Reduction and Recursive Feature Elimination for Large Gene Lists |
|
|
497 | (5) |
|
23.4.2 Method 2: Gene Filtering and Selection |
|
|
502 | (1) |
|
|
502 | (2) |
|
23.6 Batch versus Online Training |
|
|
504 | (1) |
|
|
504 | (1) |
|
23.8 Cross-Validation Results |
|
|
504 | (2) |
|
|
506 | (1) |
|
23.10 Multiclass ROC Curves |
|
|
506 | (7) |
|
23.11 Decision Boundaries |
|
|
513 | (1) |
|
23.12 RPROP versus Backpropagation |
|
|
513 | (9) |
|
|
522 | (1) |
|
|
522 | (3) |
|
|
525 | (18) |
|
|
525 | (2) |
|
|
527 | (1) |
|
24.3 Cross-Validation Results |
|
|
527 | (1) |
|
|
528 | (8) |
|
24.5 Multiclass ROC Curves |
|
|
536 | (1) |
|
|
537 | (3) |
|
|
540 | (2) |
|
|
542 | (1) |
|
25 Neural Adaptive Learning with Metaheuristics |
|
|
543 | (30) |
|
25.1 Multilayer Perceptrons |
|
|
544 | (1) |
|
|
544 | (5) |
|
25.3 Covariance Matrix Self-Adaptation-Evolution Strategies |
|
|
549 | (7) |
|
25.4 Particle Swarm Optimization |
|
|
556 | (4) |
|
25.5 Ant Colony Optimization |
|
|
560 | (7) |
|
|
560 | (2) |
|
25.5.2 Continuous-Function Approximation |
|
|
562 | (5) |
|
|
567 | (1) |
|
|
567 | (6) |
|
|
573 | (18) |
|
|
573 | (1) |
|
|
574 | (1) |
|
26.3 Cross-Validation Results |
|
|
574 | (8) |
|
|
582 | (1) |
|
26.5 Multiclass ROC Curves |
|
|
582 | (2) |
|
26.6 Class Decision Boundaries |
|
|
584 | (2) |
|
|
586 | (2) |
|
|
588 | (3) |
|
|
591 | (10) |
|
|
591 | (4) |
|
|
595 | (1) |
|
27.3 Cross-Validation Results |
|
|
596 | (1) |
|
|
597 | (1) |
|
|
597 | (2) |
|
|
599 | (2) |
|
28 Covariance Matrix Filtering |
|
|
601 | (24) |
|
|
601 | (1) |
|
28.2 Covariance and Correlation Matrices |
|
|
601 | (1) |
|
|
602 | (6) |
|
28.4 Component Subtraction |
|
|
608 | (2) |
|
28.5 Covariance Matrix Shrinkage |
|
|
610 | (3) |
|
28.6 Covariance Matrix Filtering |
|
|
613 | (8) |
|
|
621 | (1) |
|
|
622 | (3) |
Appendixes |
|
625 | (78) |
|
|
627 | (12) |
|
|
627 | (1) |
|
|
628 | (2) |
|
|
630 | (2) |
|
|
632 | (7) |
|
|
633 | (1) |
|
A.4.2 Multiplication Rule and Conditional Probabilities |
|
|
634 | (1) |
|
A.4.3 Multiplication Rule for Independent Events |
|
|
635 | (1) |
|
A.4.4 Elimination Rule (Disease Prevalence) |
|
|
636 | (1) |
|
A.4.5 Bayes' Rule (Pathway Probabilities) |
|
|
637 | (2) |
|
|
639 | (16) |
|
|
639 | (3) |
|
|
642 | (5) |
|
B.3 Sample Mean, Covariance, and Correlation |
|
|
647 | (1) |
|
|
648 | (1) |
|
|
649 | (1) |
|
|
650 | (1) |
|
|
650 | (1) |
|
B.8 Symmetric Eigenvalue Problem |
|
|
650 | (1) |
|
B.9 Generalized Eigenvalue Problem |
|
|
651 | (1) |
|
|
652 | (3) |
|
|
655 | (10) |
|
|
655 | (1) |
|
|
655 | (1) |
|
|
656 | (1) |
|
|
656 | (1) |
|
|
656 | (1) |
|
C.6 Product and Summation Operators |
|
|
657 | (1) |
|
|
657 | (1) |
|
|
658 | (7) |
|
|
665 | (14) |
|
|
665 | (3) |
|
|
668 | (10) |
|
|
678 | (1) |
|
E Probability Distributions |
|
|
679 | (20) |
|
E.1 Basics of Hypothesis Testing |
|
|
679 | (3) |
|
E.2 Probability Functions: Source of p Values |
|
|
682 | (1) |
|
|
682 | (4) |
|
|
686 | (3) |
|
|
689 | (3) |
|
E.6 Pseudo-Random-Number Generation |
|
|
692 | (6) |
|
E.6.1 Standard Uniform Distribution |
|
|
692 | (1) |
|
E.6.2 Normal Distribution |
|
|
693 | (1) |
|
E.6.3 Lognormal Distribution |
|
|
694 | (1) |
|
E.6.4 Binomial Distribution |
|
|
695 | (1) |
|
E.6.5 Poisson Distribution |
|
|
696 | (1) |
|
E.6.6 Triangle Distribution |
|
|
697 | (1) |
|
E.6.7 Log-Triangle Distribution |
|
|
698 | (1) |
|
|
698 | (1) |
|
|
699 | (4) |
Index |
|
703 | |