Series Foreword | xi | |
Preface | xiii | |
Introduction to Semi-Supervised Learning | 1 | (12) |
Supervised, Unsupervised, and Semi-Supervised Learning | 1 | (3) |
When Can Semi-Supervised Learning Work? | 4 | (4) |
Classes of Algorithms and Organization of This Book | 8 | (5) |
| 13 | (90) |
A Taxonomy for Semi-Supervised Learning Methods | 15 | (18) |
The Semi-Supervised Learning Problem | 15 | (2) |
Paradigms for Semi-Supervised Learning | 17 | (5) |
| 22 | (9) |
| 31 | (2) |
Semi-Supervised Text Classification Using EM | 33 | (24) |
| 33 | (2) |
A Generative Model for Text | 35 | (6) |
Experimental Results with Basic EM | 41 | (2) |
Using a More Expressive Generative Model | 43 | (6) |
Overcoming the Challenges of Local Maxima | 49 | (5) |
| 54 | (3) |
Risks of Semi-Supervised Learning | 57 | (16) |
Do Unlabeled Data Improve or Degrade Classification Performance? | 57 | (2) |
Understanding Unlabeled Data: Asymptotic Bias | 59 | (4) |
The Asymptotic Analysis of Generative Semi-Supervised Learning | 63 | (4) |
The Value of Labeled and Unlabeled Data | 67 | (2) |
| 69 | (1) |
Model Search and Robustness | 70 | (1) |
| 71 | (2) |
Probabilistic Semi-Supervised Clustering with Constraints | 73 | (30) |
| 74 | (1) |
HMRF Model for Semi-Supervised Clustering | 75 | (6) |
| 81 | (12) |
Active Learning for Constraint Acquisition | 93 | (3) |
| 96 | (4) |
| 100 | (1) |
| 101 | (2) |
II Low-Density Separation | 103 | (88) |
Transductive Support Vector Machines | 105 | (14) |
| 105 | (3) |
Transductive Support Vector Machines | 108 | (3) |
Why Use Margin on the Test Set? | 111 | (1) |
Experiments and Applications of TSVMs | 112 | (2) |
Solving the TSVM Optimization Problem | 114 | (2) |
Connection to Related Approaches | 116 | (1) |
| 116 | (3) |
Semi-Supervised Learning Using Semi-Definite Programming | 119 | (18) |
Relaxing SVM Transduction | 119 | (7) |
An Approximation for Speedup | 126 | (2) |
General Semi-Supervised Learning Settings | 128 | (1) |
| 129 | (4) |
| 133 | (1) |
Appendix: The Extended Schur Complement Lemma | 134 | (3) |
Gaussian Processes and the Null-Category Noise Model | 137 | (14) |
| 137 | (4) |
| 141 | (2) |
Process Model and Effect of the Null-Category | 143 | (2) |
Posterior Inference and Prediction | 145 | (2) |
| 147 | (2) |
| 149 | (2) |
| 151 | (18) |
| 151 | (1) |
Derivation of the Criterion | 152 | (3) |
| 155 | (3) |
| 158 | (2) |
| 160 | (6) |
| 166 | (1) |
Appendix: Proof of Theorem 9.1 | 166 | (3) |
Data-Dependent Regularization | 169 | (22) |
| 169 | (5) |
Information Regularization on Metric Spaces | 174 | (8) |
Information Regularization and Relational Data | 182 | (7) |
| 189 | (2) |
| 191 | (84) |
Label Propagation and Quadratic Criterion | 193 | (24) |
| 193 | (1) |
Label Propagation on a Similarity Graph | 194 | (4) |
| 198 | (7) |
From Transduction to Induction | 205 | (1) |
Incorporating Class Prior Knowledge | 205 | (1) |
Curse of Dimensionality for Semi-Supervised Learning | 206 | (9) |
| 215 | (2) |
The Geometric Basis of Semi-Supervised Learning | 217 | (20) |
| 217 | (3) |
Incorporating Geometry in Regularization | 220 | (4) |
| 224 | (5) |
Data-Dependent Kernels for Semi-Supervised Learning | 229 | (2) |
Linear Methods for Large-Scale Semi-Supervised Learning | 231 | (1) |
Connections to Other Algorithms and Related Work | 232 | (2) |
| 234 | (3) |
| 237 | (14) |
| 237 | (2) |
| 239 | (6) |
| 245 | (4) |
| 249 | (2) |
Semi-Supervised Learning with Conditional Harmonic Mixing | 251 | (24) |
| 251 | (4) |
Conditional Harmonic Mixing | 255 | (1) |
| 256 | (5) |
Incorporating Prior Knowledge | 261 | (1) |
Learning the Conditionals | 261 | (1) |
| 262 | (1) |
| 263 | (10) |
| 273 | (2) |
IV Change of Representation | 275 | (8) |
Graph Kernels by Spectral Transforms | 277 | (6) |
| 278 | (2) |
Kernels by Spectral Transforms | 280 | (1) |
| 281 | (1) |
Optimizing Alignment Using QCQP for Semi-Supervised Learning | 282 | (1) |
V Semi-Supervised Kernels with Order Constraints | 283 | (112) |
| 285 | (4) |
| 289 | (4) |
Spectral Methods for Dimensionality Reduction | 293 | (16) |
| 293 | (2) |
| 295 | (2) |
| 297 | (6) |
| 303 | (3) |
| 306 | (3) |
| 309 | (24) |
| 309 | (3) |
| 312 | (9) |
| 321 | (6) |
Semi-Supervised Learning Using Density-Based Metrics | 327 | (2) |
Conclusions and Future Work | 329 | (2) |
Semi-Supervised Learning in Practice | 331 | (2) |
| 333 | (10) |
| 333 | (1) |
| 334 | (3) |
| 337 | (3) |
| 340 | (3) |
Semi-Supervised Protein Classification Using Cluster Kernels | 343 | (18) |
| 343 | (2) |
Representations and Kernels for Protein Sequences | 345 | (3) |
Semi-Supervised Kernels for Protein Sequences | 348 | (4) |
| 352 | (6) |
| 358 | (3) |
Prediction of Protein Function from Networks | 361 | (16) |
| 361 | (3) |
Graph-Based Semi-Supervised Learning | 364 | (2) |
Combining Multiple Graphs | 366 | (3) |
Experiments on Function Prediction of Proteins | 369 | (5) |
| 374 | (3) |
| 377 | (18) |
| 377 | (6) |
Application of SSL Methods | 383 | (7) |
| 390 | (5) |
| 395 | (114) |
An Augmented PAC Model for Semi-Supervised Learning | 397 | (24) |
| 398 | (2) |
| 400 | (3) |
Sample Complexity Results | 403 | (9) |
| 412 | (4) |
Related Models and Discussion | 416 | (5) |
Metric-Based Approaches for Semi-Supervised Regression and Classification | 421 | (32) |
| 421 | (2) |
Metric Structure of Supervised Learning | 423 | (3) |
| 426 | (10) |
| 436 | (9) |
| 445 | (4) |
| 449 | (4) |
Transductive Inference and Semi-Supervised Learning | 453 | (20) |
| 453 | (2) |
Problem of Generalization in Inductive and Transductive Inference | 455 | (2) |
Structure of the VC Bounds and Transductive Inference | 457 | (1) |
The Symmetrization Lemma and Transductive Inference | 458 | (1) |
Bounds for Transductive Inference | 459 | (1) |
The Structural Risk Minimization Principle for Induction and Transduction | 460 | (2) |
Combinatorics in Transductive Inference | 462 | (1) |
Measures of the Size of Equivalence Classes | 463 | (2) |
Algorithms for Inductive and Transductive SVMs | 465 | (5) |
| 470 | (1) |
Conclusion: Transductive Inference and the New Problems of Inference | 470 | (1) |
Beyond Transduction: Selective Inference | 471 | (2) |
A Discussion of Semi-Supervised Learning and Transduction | 473 | (36) |
| 479 | (20) |
| 499 | (4) |
| 503 | (6) |
Index | 509 | |