Preface |
|
xiii | |
|
|
1 | |
|
|
4 | |
|
1.2 Data-mining techniques |
|
|
5 | |
|
|
6 | |
|
|
11 | |
|
|
16 | |
|
1.2.4 Finding local patterns |
|
|
16 | |
|
1.3 Why use matrix decompositions? |
|
|
17 | |
|
1.3.1 Data that comes from multiple processes |
|
|
18 | |
|
1.3.2 Data that has multiple causes |
|
|
19 | |
|
1.3.3 What are matrix decompositions used for? |
|
|
20 | |
|
|
23 | |
|
|
23 | |
|
2.2 Interpreting decompositions |
|
|
28 | |
|
2.2.1 Factor interpretation - hidden sources |
|
|
29 | |
|
2.2.2 Geometric interpretation - hidden clusters |
|
|
29 | |
|
2.2.3 Component interpretation - underlying processes |
|
|
32 | |
|
2.2.4 Graph interpretation - hidden connections |
|
|
32 | |
|
|
34 | |
|
|
34 | |
|
2.3 Applying decompositions |
|
|
36 | |
|
2.3.1 Selecting factors, dimensions, components, or waystations |
|
|
36 | |
|
2.3.2 Similarity and clustering |
|
|
41 | |
|
2.3.3 Finding local relationships |
|
|
42 | |
|
2.3.4 Sparse representations |
|
|
43 | |
|
|
44 | |
|
|
45 | |
|
2.4.1 Algorithms and complexity |
|
|
45 | |
|
2.4.2 Data preparation issues |
|
|
45 | |
|
2.4.3 Updating a decomposition |
|
|
46 | |
|
3 Singular Value Decomposition (SVD) |
|
|
49 | |
|
|
49 | |
|
|
54 | |
|
3.2.1 Factor interpretation |
|
|
54 | |
|
3.2.2 Geometric interpretation |
|
|
56 | |
|
3.2.3 Component interpretation |
|
|
60 | |
|
3.2.4 Graph interpretation |
|
|
61 | |
|
|
62 | |
|
3.3.1 Selecting factors, dimensions, components, and waystations |
|
|
62 | |
|
3.3.2 Similarity and clustering |
|
|
70 | |
|
3.3.3 Finding local relationships |
|
|
73 | |
|
3.3.4 Sampling and sparsifying by removing values |
|
|
76 | |
|
3.3.5 Using domain knowledge or priors |
|
|
77 | |
|
|
77 | |
|
3.4.1 Algorithms and complexity |
|
|
77 | |
|
|
78 | |
|
|
78 | |
|
3.5.1 The workhorse of noise removal |
|
|
78 | |
|
3.5.2 Information retrieval — Latent Semantic Indexing (LSI) |
|
|
78 | |
|
3.5.3 Ranking objects and attributes by interestingness |
|
|
81 | |
|
3.5.4 Collaborative filtering |
|
|
81 | |
|
3.5.5 Winnowing microarray data |
|
|
86 | |
|
|
87 | |
|
|
87 | |
|
3.6.2 The CUR decomposition |
|
|
87 | |
|
|
91 | |
|
4.1 Graphs versus datasets |
|
|
91 | |
|
|
95 | |
|
4.3 Eigenvalues and eigenvectors |
|
|
96 | |
|
|
97 | |
|
|
98 | |
|
4.6 Overview of the embedding process |
|
|
101 | |
|
4.7 Datasets versus graphs |
|
|
102 | |
|
4.7.1 Mapping Euclidean space to an affinity matrix |
|
|
103 | |
|
4.7.2 Mapping an affinity matrix to a representation matrix |
|
|
104 | |
|
|
110 | |
|
|
111 | |
|
|
114 | |
|
|
115 | |
|
4.12 The ATHENS system for novel-knowledge discovery |
|
|
118 | |
|
|
121 | |
|
5 SemiDiscrete Decomposition (SDD) |
|
|
123 | |
|
|
123 | |
|
|
132 | |
|
5.2.1 Factor interpretation |
|
|
133 | |
|
5.2.2 Geometric interpretation |
|
|
133 | |
|
5.2.3 Component interpretation |
|
|
134 | |
|
5.2.4 Graph interpretation |
|
|
134 | |
|
|
134 | |
|
|
134 | |
|
5.3.2 Similarity and clustering |
|
|
135 | |
|
|
138 | |
|
|
139 | |
|
5.5.1 Binary nonorthogonal matrix decomposition |
|
|
139 | |
|
6 Using SVD and SDD together |
|
|
141 | |
|
|
142 | |
|
|
143 | |
|
6.1.9 Applying SDD to the truncated correlation matrices |
|
|
143 | |
|
6.2 Applications of SVD and SDD together |
|
|
144 | |
|
6.2.1 Classifying galaxies |
|
|
144 | |
|
6.2.2 Mineral exploration |
|
|
145 | |
|
6.2.3 Protein conformation |
|
|
151 | |
|
7 Independent Component Analysis (ICA) |
|
|
155 | |
|
|
156 | |
|
|
159 | |
|
7.2.1 Factor interpretation |
|
|
159 | |
|
7.2.2 Geometric interpretation |
|
|
159 | |
|
7.2.3 Component interpretation |
|
|
160 | |
|
7.2.4 Graph interpretation |
|
|
160 | |
|
|
160 | |
|
7.3.1 Selecting dimensions |
|
|
160 | |
|
7.3.2 Similarity and clustering |
|
|
161 | |
|
|
161 | |
|
|
163 | |
|
7.5.1 Determining suspicious messages |
|
|
163 | |
|
7.5.2 Removing spatial artifacts from microarrays |
|
|
166 | |
|
7.5.3 Finding al Qaeda groups |
|
|
169 | |
|
8 Non-Negative Matrix Factorization (NNMF) |
|
|
173 | |
|
|
174 | |
|
|
177 | |
|
8.2.1 Factor interpretation |
|
|
177 | |
|
8.2.2 Geometric interpretation |
|
|
177 | |
|
8.2.3 Component interpretation |
|
|
178 | |
|
8.2.4 Graph interpretation |
|
|
178 | |
|
|
178 | |
|
|
178 | |
|
|
179 | |
|
8.3.3 Similarity and clustering |
|
|
180 | |
|
|
180 | |
|
8.4.1 Algorithms and complexity |
|
|
180 | |
|
|
180 | |
|
|
181 | |
|
|
181 | |
|
8.5.2 Microarray analysis |
|
|
181 | |
|
8.5.3 Mineral exploration revisited |
|
|
182 | |
|
|
189 | |
|
9.1 The Tucker3 tensor decomposition |
|
|
190 | |
|
|
193 | |
|
9.3 Applications of tensor decompositions |
|
|
194 | |
|
|
194 | |
|
9.3.2 Words, documents, and links |
|
|
195 | |
|
9.3.3 Users, keywords, and time in chat rooms |
|
|
195 | |
|
|
196 | |
10 Conclusion |
|
197 | |
Appendix A Matlab scripts |
|
203 | |
Bibliography |
|
223 | |
Index |
|
233 | |