| | 1 | (12) |
| | 1 | (3) |
| Probabilistic methods in computational linguistics | 1 | (1) |
| Supervised and unsupervised training | 2 | (1) |
| | 3 | (1) |
| | 4 | (4) |
| Major varieties of learning problem | 4 | (2) |
| | 6 | (1) |
| | 7 | (1) |
| | 8 | (1) |
| Organization and assumptions | 8 | (5) |
| | 8 | (2) |
| | 10 | (1) |
| | 11 | (2) |
| Self-training and Co-training | 13 | (18) |
| | 13 | (5) |
| | 13 | (1) |
| | 14 | (2) |
| | 16 | (2) |
| | 18 | (10) |
| | 19 | (1) |
| | 20 | (3) |
| | 23 | (2) |
| Symmetry of features and instances | 25 | (2) |
| | 27 | (1) |
| | 28 | (3) |
| Applications of Self-Training and Co-Training | 31 | (12) |
| | 31 | (2) |
| | 33 | (2) |
| | 35 | (1) |
| | 36 | (7) |
| | 36 | (2) |
| Word-sense disambiguation | 38 | (2) |
| | 40 | (3) |
| | 43 | (24) |
| | 43 | (5) |
| | 43 | (2) |
| k-nearest-neighbor classifier | 45 | (3) |
| | 48 | (5) |
| | 48 | (2) |
| | 50 | (2) |
| | 52 | (1) |
| Evaluating detectors and classifiers that abstain | 53 | (9) |
| Confidence-rated classifiers | 53 | (1) |
| | 54 | (3) |
| Idealized performance curves | 57 | (2) |
| | 59 | (3) |
| Binary classifiers and ECOC | 62 | (5) |
| Mathematics for Boundary-Oriented Methods | 67 | (28) |
| | 67 | (7) |
| Representing a hyperplane | 67 | (2) |
| Eliminating the threshold | 69 | (1) |
| | 70 | (2) |
| Naive Bayes decision boundary | 72 | (2) |
| | 74 | (9) |
| | 74 | (2) |
| | 76 | (3) |
| Differentiation of vector and matrix expressions | 79 | (2) |
| An example: linear regression | 81 | (2) |
| | 83 | (12) |
| | 83 | (1) |
| | 84 | (3) |
| | 87 | (4) |
| | 91 | (4) |
| Boundary-Oriented Methods | 95 | (36) |
| | 97 | (6) |
| | 97 | (2) |
| | 99 | (1) |
| | 100 | (1) |
| The perceptron algorithm as gradient descent | 101 | (2) |
| | 103 | (2) |
| | 105 | (9) |
| | 110 | (1) |
| | 111 | (2) |
| | 113 | (1) |
| Support Vector Machines (SVMs) | 114 | (15) |
| | 114 | (2) |
| | 116 | (3) |
| | 119 | (2) |
| Slack in the separable case | 121 | (2) |
| | 123 | (2) |
| | 125 | (2) |
| Training a transductive SVM | 127 | (2) |
| Null-category noise model | 129 | (2) |
| | 131 | (22) |
| | 131 | (1) |
| | 132 | (5) |
| | 132 | (1) |
| | 133 | (3) |
| | 136 | (1) |
| | 137 | (2) |
| | 139 | (4) |
| | 139 | (1) |
| Pseudo relevance feedback | 140 | (3) |
| | 143 | (3) |
| | 146 | (6) |
| Clustering by propagation | 146 | (1) |
| Self-training as propagation | 147 | (3) |
| Co-training as propagation | 150 | (2) |
| | 152 | (1) |
| | 153 | (22) |
| | 153 | (10) |
| Definition and geometric interpretation | 153 | (3) |
| The linear discriminant decision boundary | 156 | (3) |
| Decision-directed approximation | 159 | (3) |
| | 162 | (1) |
| | 163 | (12) |
| | 163 | (1) |
| Relative frequency estimation | 164 | (2) |
| | 166 | (3) |
| | 169 | (6) |
| | 175 | (18) |
| | 175 | (7) |
| The conditional independence assumption | 176 | (2) |
| The power of conditional independence | 178 | (4) |
| Agreement-based self-teaching | 182 | (2) |
| | 184 | (8) |
| Applied to self-training and co-training | 184 | (2) |
| | 186 | (1) |
| Markov chains and random walks | 187 | (5) |
| | 192 | (1) |
| | 193 | (28) |
| | 194 | (2) |
| | 196 | (2) |
| | 198 | (5) |
| | 203 | (10) |
| | 203 | (2) |
| | 205 | (4) |
| | 209 | (1) |
| | 210 | (3) |
| | 213 | (2) |
| | 215 | (5) |
| | 220 | (1) |
| Mathematics for Spectral Methods | 221 | (16) |
| | 221 | (3) |
| | 221 | (1) |
| Matrices as linear operators | 222 | (1) |
| | 222 | (2) |
| Eigenvalues and eigenvectors | 224 | (3) |
| Definition of eigenvalues and eigenvectors | 224 | (1) |
| | 225 | (1) |
| Orthogonal diagonalization | 226 | (1) |
| Eigenvalues and the scaling effects of a matrix | 227 | (9) |
| | 227 | (1) |
| | 228 | (2) |
| | 230 | (2) |
| | 232 | (2) |
| The Courant-Fischer minimax theorem | 234 | (2) |
| | 236 | (1) |
| | 237 | (40) |
| | 237 | (14) |
| | 237 | (2) |
| | 239 | (2) |
| | 241 | (2) |
| | 243 | (8) |
| Spectra of matrices and graphs | 251 | (6) |
| | 252 | (1) |
| Relating matrices and graphs | 253 | (3) |
| The Laplacian matrix and graph spectrum | 256 | (1) |
| | 257 | (8) |
| The second smallest eigenvector of the Laplacian | 257 | (2) |
| The cut size and the Laplacian | 259 | (1) |
| | 260 | (2) |
| | 262 | (1) |
| | 263 | (2) |
| Spectral methods for semisupervised learning | 265 | (10) |
| Harmonics and harmonic functions | 265 | (2) |
| | 267 | (1) |
| The Laplacian and random fields | 268 | (2) |
| Harmonic functions and the Laplacian | 270 | (2) |
| Using the Laplacian for regularization | 272 | (2) |
| Transduction to induction | 274 | (1) |
| | 275 | (2) |
| Bibliography | 277 | (24) |
| Index | 301 | |