|
|
xi | |
About the editors |
|
xiii | |
Preface |
|
xv | |
|
1 The dramatically changing face of computer vision |
|
|
|
|
1.1 Introduction - computer vision and its origins |
|
|
1 | (3) |
|
1.2 Part A - Understanding low-level image processing operators |
|
|
4 | (11) |
|
1.3 Part B - 2-D object location and recognition |
|
|
15 | (14) |
|
1.4 Part C - 3-D object location and the importance of invariance |
|
|
29 | (26) |
|
1.5 Part D - Tracking moving objects |
|
|
55 | (6) |
|
1.6 Part E - Texture analysis |
|
|
61 | (7) |
|
1.7 Part F - From artificial neural networks to deep learning methods |
|
|
68 | (18) |
|
|
86 | (7) |
|
|
87 | (6) |
|
2 Advanced methods for robust object detection |
|
|
|
|
|
|
93 | (2) |
|
|
95 | (1) |
|
|
96 | (1) |
|
|
97 | (1) |
|
|
98 | (3) |
|
|
101 | (2) |
|
|
103 | (3) |
|
2.8 Multiscale feature representation |
|
|
106 | (4) |
|
|
110 | (2) |
|
|
112 | (1) |
|
|
113 | (2) |
|
2.12 Detection performances |
|
|
115 | (1) |
|
|
115 | (4) |
|
|
116 | (3) |
|
3 Learning with limited supervision |
|
|
|
|
|
|
119 | (1) |
|
3.2 Context-aware active learning |
|
|
120 | (9) |
|
3.3 Weakly supervised event localization |
|
|
129 | (8) |
|
3.4 Domain adaptation of semantic segmentation using weak labels |
|
|
137 | (7) |
|
3.5 Weakly-supervised reinforcement learning for dynamical tasks |
|
|
144 | (7) |
|
|
151 | (8) |
|
|
153 | (6) |
|
4 Efficient methods for deep learning |
|
|
|
|
|
|
|
159 | (11) |
|
4.2 Efficient neural network architectures |
|
|
170 | (15) |
|
|
185 | (6) |
|
|
185 | (6) |
|
5 Deep conditional image generation |
|
|
|
|
|
|
191 | (3) |
|
5.2 Visual pattern learning: a brief review |
|
|
194 | (1) |
|
5.3 Classical generative models |
|
|
195 | (2) |
|
5.4 Deep generative models |
|
|
197 | (3) |
|
5.5 Deep conditional image generation |
|
|
200 | (1) |
|
5.6 Disentanglement for controllable synthesis |
|
|
201 | (15) |
|
5.7 Conclusion and discussions |
|
|
216 | (5) |
|
|
216 | (5) |
|
6 Deep face recognition using full and partial face images |
|
|
|
|
|
221 | (6) |
|
6.2 Components of deep face recognition |
|
|
227 | (4) |
|
6.3 Face recognition using full face images |
|
|
231 | (2) |
|
6.4 Deep face recognition using partial face data |
|
|
233 | (4) |
|
6.5 Specific model training for full and partial faces |
|
|
237 | (2) |
|
6.6 Discussion and conclusions |
|
|
239 | (4) |
|
|
240 | (3) |
|
7 Unsupervised domain adaptation using shallow and deep representations |
|
|
|
|
|
|
|
243 | (1) |
|
7.2 Unsupervised domain adaptation using manifolds |
|
|
244 | (3) |
|
7.3 Unsupervised domain adaptation using dictionaries |
|
|
247 | (11) |
|
7.4 Unsupervised domain adaptation using deep networks |
|
|
258 | (12) |
|
|
270 | (5) |
|
|
270 | (5) |
|
8 Domain adaptation and continual learning in semantic segmentation |
|
|
|
|
|
|
|
275 | (2) |
|
8.2 Unsupervised domain adaptation |
|
|
277 | (14) |
|
|
291 | (7) |
|
|
298 | (7) |
|
|
299 | (6) |
|
|
|
|
|
305 | (3) |
|
9.2 Template-based methods |
|
|
308 | (6) |
|
9.3 Online-learning-based methods |
|
|
314 | (9) |
|
9.4 Deep learning-based methods |
|
|
323 | (4) |
|
9.5 The transition from tracking to segmentation |
|
|
327 | (4) |
|
|
331 | (6) |
|
|
332 | (5) |
|
10 Long-term deep object tracking |
|
|
|
|
|
|
337 | (4) |
|
10.2 Short-term visual object tracking |
|
|
341 | (4) |
|
10.3 Long-term visual object tracking |
|
|
345 | (22) |
|
|
367 | (6) |
|
|
368 | (5) |
|
11 Learning for action-based scene understanding |
|
|
|
|
|
|
373 | (2) |
|
11.2 Affordances of objects |
|
|
375 | (8) |
|
11.3 Functional parsing of manipulation actions |
|
|
383 | (7) |
|
11.4 Functional scene understanding through deep learning with language and vision |
|
|
390 | (7) |
|
|
397 | (2) |
|
|
399 | (7) |
|
|
399 | (7) |
|
12 Self-supervised temporal event segmentation inspired by cognitive theories |
|
|
|
|
|
|
|
406 | (2) |
|
12.2 The event segmentation theory from cognitive science |
|
|
408 | (2) |
|
12.3 Version 1: single-pass temporal segmentation using prediction |
|
|
410 | (11) |
|
12.4 Version 2: segmentation using attention-based event models |
|
|
421 | (7) |
|
12.5 Version 3: spatio-temporal localization using prediction loss map |
|
|
428 | (12) |
|
12.6 Other event segmentation approaches in computer vision |
|
|
440 | (3) |
|
|
443 | (7) |
|
|
444 | (6) |
|
13 Probabilistic anomaly detection methods using learned models from time-series data for multimedia self-aware systems |
|
|
|
|
|
|
|
|
450 | (1) |
|
13.2 Base concepts and state of the art |
|
|
451 | (7) |
|
13.3 Framework for computing anomaly in self-aware systems |
|
|
458 | (9) |
|
13.4 Case study results: anomaly detection on multisensory data from a self-aware vehicle |
|
|
467 | (9) |
|
|
476 | (5) |
|
|
477 | (4) |
|
14 Deep plug-and-play and deep unfolding methods for image restoration |
|
|
|
|
|
|
481 | (3) |
|
14.2 Half quadratic splitting (HQS) algorithm |
|
|
484 | (1) |
|
14.3 Deep plug-and-play image restoration |
|
|
485 | (7) |
|
14.4 Deep unfolding image restoration |
|
|
492 | (3) |
|
|
495 | (9) |
|
14.6 Discussion and conclusions |
|
|
504 | (7) |
|
|
505 | (6) |
|
15 Visual adversarial attacks and defenses |
|
|
|
|
|
|
|
511 | (1) |
|
|
512 | (2) |
|
15.3 Properties of an adversarial attack |
|
|
514 | (1) |
|
15.4 Types of perturbations |
|
|
515 | (1) |
|
|
515 | (7) |
|
|
522 | (1) |
|
15.7 Image classification |
|
|
523 | (6) |
|
15.8 Semantic segmentation and object detection |
|
|
529 | (1) |
|
|
529 | (2) |
|
15.10 Video classification |
|
|
531 | (2) |
|
15.11 Defenses against adversarial attacks |
|
|
533 | (4) |
|
|
537 | (8) |
|
|
538 | (7) |
Index |
|
545 | |