Contributors xv

PART 1 INTRODUCTION TO META LEARNING

Chapter 1 Learning to learn in medical applications 3 (24)
1.3.2 Optimization-based learning 8 (1)
1.3.3 Model-based learning 8 (1)
1.4 Task construction in meta learning 9 (1)
1.5 Representation learning in meta learning 10 (1)
1.6 Unsupervised / self-supervised meta learning 11 (1)
1.7 Meta learning applications 12 (7)
1.7.2 Few-shot image generation 16 (1)
1.8 Relation to federated learning 19 (1)
1.10 Conclusion and outlook 20 (7)

Chapter 2 Introduction to meta learning 27 (10)
2.1 History of meta learning 27 (1)
2.2.1 Supervised learning 28 (1)
2.3 Meta learning formulation 30 (2)
2.3.1 Task distribution perspective 30 (1)
2.3.2 Meta learning from bilevel optimization view 31 (1)
2.3.3 Feed-forward model view 31 (1)
2.4 Meta learning taxonomy 32 (5)

Chapter 3 Metric learning algorithms for meta learning 37 (16)
3.1 Siamese networks for meta learning 37 (1)
3.1.2 Training and testing procedure 38 (1)
3.2.2 Full-context embeddings 40 (1)
3.2.3 Episodic training strategy 41 (1)
3.3 Prototypical networks 42 (2)
3.3.2 Reinterpretation as a linear model 43 (1)
3.3.3 Comparison to matching networks 44 (1)
3.4.2 Relation networks for meta learning 45 (1)
3.4.3 Relationship to existing models 46 (1)
3.5 Graph neural networks 46 (7)
3.5.3 Graph neural networks for meta learning 49 (1)
3.5.4 Relationship with existing models 49 (2)

Chapter 4 Meta learning by optimization 53 (12)
4.1 Optimization as model 53 (2)
4.1.2 Parameter sharing and training strategy 54 (1)
4.2 Model-agnostic meta learning 55 (3)
4.2.1 A model-agnostic meta learning algorithm 56 (1)
4.3 Almost no inner loop meta learning 58 (7)
4.3.1 Freezing layer representations 59 (1)
4.3.2 Representational similarity experiments 59 (1)
4.3.3 Feature reuse happens early in learning 60 (2)

Chapter 5 Model-based meta learning 65 (10)
5.1 Memory-augmented neural networks 65 (2)
5.1.2 Least recently used access 66 (1)
5.1.3 Training and testing procedure 66 (1)
5.2 Dynamic few-shot visual learning 67 (4)
5.2.2 Few-shot classification weight generator 68 (2)

Chapter 6 Meta learning for domain generalization 75 (14)
6.4 Meta learning Domain Generalization (MLDG) 77 (3)
6.4.1 First order interpretation 78 (1)
6.4.2 Sequential extension 79 (1)
6.5.1 Learning the regularizer 81 (1)
6.5.2 Training the final model 82 (1)
6.5.3 Summary of the training pipeline 83 (1)

PART 2 META LEARNING FOR MEDICAL IMAGING

Chapter 7 Few-shot chest x-ray diagnosis using discriminative ensemble learning 89 (28)
7.2.1 Deep CNN-based chest x-ray diagnosis 90 (1)
7.3.2 Saliency-based classifier 94 (4)
7.4 Experiments & results 101 (11)
7.4.2 Performance measures & comparisons 102 (5)
7.4.4 Notes on clinical applications 111 (1)
Appendix 7.A On feature selection for autoencoders 112 (1)

Chapter 8 Domain generalization of deep networks for medical image segmentation via meta learning 117 (24)
8.2.1 Domain generalization 118 (1)
8.3 Domain generalization with shape-aware meta learning 119 (5)
8.3.2 Experimental results 122 (2)
8.4 Federated domain generalization with meta learning in continuous frequency space 124 (12)
8.4.3 Experimental results 130 (6)

Chapter 9 Meta learning for adaptable lung nodule image analysis 141 (20)
9.3.1 Memory-augmented capsule network 144 (1)
9.3.2 FastCaps++ as feature extractor network 144 (2)
9.3.3 Memory-augmented task network 146 (2)
9.3.4 Episodic training with simulated domain shift 148 (1)
9.4.1 Data-I: LUNA-16 lung nodule dataset 150 (1)
9.4.2 Data-II: collected lung nodule dataset 150 (1)
9.4.3 Data-III: collected incidental lung nodule dataset 151 (1)
9.5 Experiments and results 151 (4)
9.5.1 Training procedure and implementation details 151 (2)
9.5.2 Baseline deep neural network performance 153 (1)
9.5.3 Evaluation of adaptive classifier 153 (2)

Chapter 10 Few-shot segmentation of 3D medical images 161 (24)
10.1.1 Background on few-shot segmentation 162 (1)
10.1.2 Challenges for medical few-shot segmentation 162 (1)
10.2.2 Few-shot segmentation using deep learning 164 (1)
10.3.1 Problem setup for few-shot segmentation 165 (1)
10.3.2 Architectural design 165 (3)
10.3.4 Volumetric segmentation strategy 169 (1)
10.4 Dataset and experimental setup 170 (1)
10.4.1 Dataset description 170 (1)
10.4.2 Problem formulation 170 (1)
10.4.3 Hyperparameters for training the network 171 (1)
10.5 Experimental results and discussion 171 (9)
10.5.1 'Squeeze & excitation'-based interaction 171 (2)
10.5.2 Effect of skip connections in the architecture 173 (1)
10.5.3 Model complexity of the conditioner arm 174 (1)
10.5.4 Effect of the support slice budget 174 (1)
10.5.5 Comparison with existing approaches 175 (2)
10.5.6 Comparison with upper bound model 177 (1)
10.5.7 Qualitative results 178 (1)
10.5.8 Dependence on support set 178 (2)
10.5.9 Discussion on spatial SE as interaction blocks 180 (1)
List of IDs in the VISCERAL dataset 181 (1)

Chapter 11 Smart task design for meta learning medical image analysis systems 185 (28)
11.2.2 Breast screening from DCE-MRI 188 (1)
11.2.3 Microscopy image cell segmentation 189 (1)
11.4 Task-augmentation weakly-supervised meta learning 191 (3)
11.5 Unsupervised task formation meta learning 194 (1)
11.6 Cross-domain few-shot meta learning 195 (3)
11.7.2 Implementation details 198 (2)

PART 3 META LEARNING FOR BIOMEDICAL AND HEALTH INFORMATICS

Chapter 12 AGILE - a meta learning framework for few-shot brain cell classification 213 (22)
12.2.1 Brain cell type classification 214 (1)
12.2.2 Few-shot classification (FSC) 215 (1)
12.3.1 Task-augmented meta learning 217 (2)
12.3.2 Active learning with Bayesian uncertainty 219 (4)
12.3.3 Binary cell type classifier 223 (1)
12.4 Experiments and results 223 (9)
12.4.2 Baselines and metrics 226 (1)
12.4.3 Rat brain cell FSC 227 (4)
12.4.4 Human brain cell FSC 231 (1)

Chapter 13 Few-shot learning for dermatological disease diagnosis 235 (18)
13.3.2 Understanding the role of multiple clusters 240 (1)
13.4.1 Experimental setup 241 (2)
13.4.3 Comparison between PCN and PN 244 (2)
13.6 Role of hyperparameters 246 (4)
13.6.1 Qualitative results 247 (3)

Chapter 14 Knowledge-guided meta learning for disease prediction 253 (22)
14.3 Analysis of meta learning on TCGA data 256 (5)
14.3.1 Modified MAML for pan-cancer prediction 256 (2)
14.3.2 Pan-cancer prediction 258 (3)
14.4 Transfer learning vs. meta learning 261 (2)
14.4.1 Experimental results 262 (1)
14.5 Knowledge-guided meta learning for healthcare 263 (7)

Chapter 15 Case study: few-shot pill recognition 275 (26)
15.3.1 Pill segmentation and localization 279 (2)
15.3.2 Multistream CNN for pill recognition 281 (6)
15.4 Proposed CURE pill database 287 (2)
15.5 Experimental results 289 (6)
15.5.2 Imprinted text detection & recognition 290 (2)
15.6 Demonstration for few-shot pill recognition: the 'Pill Finder' application 295 (1)

Chapter 16 Meta learning for anomaly detection in fundus photographs 301 (30)
16.2 Related machine learning frameworks 302 (1)
16.3.1 The OPHDIAT screening network 304 (1)
16.3.2 The OphtaMaine screening network 304 (1)
16.4.1 The OPHDIAT dataset 304 (1)
16.4.2 The OphtaMaine dataset 305 (1)
16.5 From frequent to rare ocular anomaly detection 305 (6)
16.5.1 Deep learning for frequent condition detection 307 (1)
16.5.2 Feature space definition 308 (1)
16.5.3 t-distributed stochastic neighbor embedding (t-SNE) 309 (1)
16.5.4 Feature space dimension reduction 309 (1)
16.5.5 Probability function estimation 310 (1)
16.5.6 Detecting rare conditions in one image 310 (1)
16.6 Experiments on the OPHDIAT dataset 311 (6)
16.6.1 Reference, validation, and testing 311 (1)
16.6.2 Parameter selection 312 (1)
16.6.3 Detection performance 312 (2)
16.6.4 Heatmap generation 314 (1)
16.6.5 Comparison with other machine learning frameworks 315 (2)
16.7 From specific to general ocular anomaly detection 317 (2)
16.7.1 The anomaly detection algorithm 317 (1)
16.7.2 Development of the anomaly detection algorithm using the OPHDIAT dataset 317 (1)
16.7.3 Performance comparison with the baseline method 318 (1)
16.7.4 Adaptation of the anomaly detection algorithm to the general population 318 (1)
16.7.5 Evaluation of deep learning algorithms 318 (1)

Chapter 17 Rare disease classification via difficulty-aware meta learning 331 (20)
17.2.1 Skin lesion classification and segmentation from dermoscopic images 333 (1)
17.2.2 Few-shot learning on skin lesion classification 334 (1)
17.2.3 Rare disease diagnosis 335 (1)
17.2.4 Meta learning in medical image analysis 335 (1)
17.3.1 Preliminary knowledge 336 (1)
17.3.3 Difficulty-aware meta learning framework 337 (2)
17.3.4 Meta-training details 339 (1)
17.4 Experiments and results 340 (3)
17.4.1 Case study 1: ISIC 2018 skin lesion dataset 340 (2)
17.4.2 Case study 2: validation on real clinical data 342 (1)

PART 4 OTHER META LEARNING APPLICATIONS

Chapter 18 Improved MR image reconstruction using federated learning 351 (18)
18.3.1 FL-based MRI reconstruction 354 (2)
18.3.2 FL-MR with cross-site modeling 356 (1)
18.3.3 Training and implementation details 357 (1)
18.4 Experiments and results 358 (7)
18.4.2 Evaluation of the generalizability 360 (2)
18.4.3 Evaluation of FL-based collaborations 362 (1)

Chapter 19 Neural architecture search for medical image applications 369 (16)
19.1 Neural architecture search: background 369 (4)
19.1.3 Evaluation strategy 372 (1)
19.2 NAS for medical imaging 373 (6)
19.2.1 NAS for medical image classification 374 (1)
19.2.2 NAS for medical image segmentation 375 (3)
19.2.3 NAS for other medical image applications 378 (1)

Chapter 20 Meta learning in the big data regime 385 (10)
20.2.1 Approximating pseudo-label gradient with meta learning 386 (1)
20.2.2 Experimental setup 387 (1)
20.2.3 Results and discussion 388 (1)
20.3.2 Classifier baseline 389 (1)
20.3.3 Meta-baseline approach 390 (1)
20.3.4 Experimental setup 391 (1)
20.3.5 Results and discussion 391 (1)

Index 395