About the Authors |
|
xiii | |
Preface |
|
xvii | |
Acknowledgements |
|
xxi | |
List of Abbreviations |
|
xxiii | |
Part I Fundamentals and Basic Elements |
|
1 | (208) |
|
1 From Signal Processing to Machine Learning |
|
|
3 | (10) |
|
1.1 A New Science is Born: Signal Processing |
|
|
3 | (2) |
|
1.1.1 Signal Processing Before Being Coined |
|
|
3 | (1) |
|
1.1.2 1948: Birth of the Information Age |
|
|
4 | (1) |
|
1.1.3 1950s: Audio Engineering Catalyzes Signal Processing |
|
|
4 | (1) |
|
1.2 From Analog to Digital Signal Processing |
|
|
5 | (2) |
|
1.2.1 1960s: Digital Signal Processing Begins |
|
|
5 | (1) |
|
1.2.2 1970s: Digital Signal Processing Becomes Popular |
|
|
6 | (1) |
|
1.2.3 1980s: Silicon Meets Digital Signal Processing |
|
|
6 | (1) |
|
1.3 Digital Signal Processing Meets Machine Learning |
|
|
7 | (1) |
|
1.3.1 1990s: New Application Areas |
|
|
7 | (1) |
|
1.3.2 1990s: Neural Networks, Fuzzy Logic, and Genetic Optimization |
|
|
7 | (1) |
|
1.4 Recent Machine Learning in Digital Signal Processing |
|
|
8 | (5) |
|
1.4.1 Traditional Signal Assumptions Are No Longer Valid |
|
|
8 | (1) |
|
1.4.2 Encoding Prior Knowledge |
|
|
8 | (1) |
|
1.4.3 Learning and Knowledge from Data |
|
|
9 | (1) |
|
1.4.4 From Machine Learning to Digital Signal Processing |
|
|
9 | (1) |
|
1.4.5 From Digital Signal Processing to Machine Learning |
|
|
10 | (3) |
|
2 Introduction to Digital Signal Processing |
|
|
13 | (84) |
|
2.1 Outline of the Signal Processing Field |
|
|
13 | (25) |
|
2.1.1 Fundamentals on Signals and Systems |
|
|
14 | (7) |
|
|
21 | (3) |
|
|
24 | (4) |
|
|
28 | (2) |
|
|
30 | (1) |
|
2.1.6 System Identification |
|
|
31 | (5) |
|
2.1.7 Blind Source Separation |
|
|
36 | (2) |
|
2.2 From Time-Frequency to Compressed Sensing |
|
|
38 | (10) |
|
2.2.1 Time-Frequency Distributions |
|
|
38 | (3) |
|
|
41 | (3) |
|
2.2.3 Sparsity, Compressed Sensing, and Dictionary Learning |
|
|
44 | (4) |
|
2.3 Multidimensional Signals and Systems |
|
|
48 | (4) |
|
2.3.1 Multidimensional Signals |
|
|
49 | (2) |
|
2.3.2 Multidimensional Systems |
|
|
51 | (1) |
|
2.4 Spectral Analysis on Manifolds |
|
|
52 | (5) |
|
2.4.1 Theoretical Fundamentals |
|
|
52 | (2) |
|
|
54 | (3) |
|
2.5 Tutorials and Application Examples |
|
|
57 | (37) |
|
2.5.1 Real and Complex Signal Processing and Representations |
|
|
57 | (6) |
|
2.5.2 Convolution, Fourier Transform, and Spectrum |
|
|
63 | (4) |
|
2.5.3 Continuous-Time Signals and Systems |
|
|
67 | (3) |
|
2.5.4 Filtering Cardiac Signals |
|
|
70 | (4) |
|
2.5.5 Nonparametric Spectrum Estimation |
|
|
74 | (3) |
|
2.5.6 Parametric Spectrum Estimation |
|
|
77 | (4) |
|
|
81 | (3) |
|
2.5.8 Time-Frequency Representations and Wavelets |
|
|
84 | (3) |
|
2.5.9 Examples for Spectral Analysis on Manifolds |
|
|
87 | (7) |
|
2.6 Questions and Problems |
|
|
94 | (3) |
|
3 Signal Processing Models |
|
|
97 | (68) |
|
|
97 | (1) |
|
3.2 Vector Spaces, Basis, and Signal Models |
|
|
98 | (13) |
|
3.2.1 Basic Operations for Vectors |
|
|
98 | (2) |
|
|
100 | (1) |
|
|
101 | (1) |
|
|
102 | (2) |
|
3.2.5 Complex Signal Models |
|
|
104 | (1) |
|
3.2.6 Standard Noise Models in DSP |
|
|
105 | (2) |
|
3.2.7 The Role of the Cost Function |
|
|
107 | (2) |
|
3.2.8 The Role of the Regularizer |
|
|
109 | (2) |
|
3.3 Digital Signal Processing Models |
|
|
111 | (11) |
|
3.3.1 Sinusoidal Signal Models |
|
|
112 | (1) |
|
3.3.2 System Identification Signal Models |
|
|
113 | (3) |
|
3.3.3 Sinc Interpolation Models |
|
|
116 | (4) |
|
3.3.4 Sparse Deconvolution |
|
|
120 | (1) |
|
|
121 | (1) |
|
3.4 Tutorials and Application Examples |
|
|
122 | (38) |
|
3.4.1 Examples of Noise Models |
|
|
123 | (9) |
|
3.4.2 Autoregressive Exogenous System Identification Models |
|
|
132 | (6) |
|
3.4.3 Nonlinear System Identification Using Volterra Models |
|
|
138 | (2) |
|
3.4.4 Sinusoidal Signal Models |
|
|
140 | (4) |
|
3.4.5 Sinc-based Interpolation |
|
|
144 | (8) |
|
3.4.6 Sparse Deconvolution |
|
|
152 | (5) |
|
|
157 | (3) |
|
3.5 Questions and Problems |
|
|
160 | (1) |
|
3.A MATLAB simpleInterp Toolbox Structure |
|
|
161 | (4) |
|
4 Kernel Functions and Reproducing Kernel Hilbert Spaces |
|
|
165 | (44) |
|
|
165 | (4) |
|
4.2 Kernel Functions and Mappings |
|
|
169 | (5) |
|
4.2.1 Measuring Similarity with Kernels |
|
|
169 | (1) |
|
4.2.2 Positive-Definite Kernels |
|
|
169 | (1) |
|
4.2.3 Reproducing Kernel in Hilbert Space and Reproducing Property |
|
|
170 | (3) |
|
|
173 | (1) |
|
|
174 | (5) |
|
4.3.1 Tikhonov's Regularization |
|
|
175 | (1) |
|
4.3.2 Representer Theorem and Regularization Properties |
|
|
176 | (2) |
|
4.3.3 Basic Operations with Kernels |
|
|
178 | (1) |
|
4.4 Constructing Kernel Functions |
|
|
179 | (5) |
|
|
179 | (1) |
|
4.4.2 Properties of Kernels |
|
|
180 | (1) |
|
4.4.3 Engineering Signal Processing Kernels |
|
|
181 | (3) |
|
4.5 Complex Reproducing Kernel in Hilbert Spaces |
|
|
184 | (2) |
|
4.6 Support Vector Machine Elements for Regression and Estimation |
|
|
186 | (5) |
|
4.6.1 Support Vector Regression Signal Model and Cost Function |
|
|
186 | (1) |
|
4.6.2 Minimizing Functional |
|
|
187 | (4) |
|
4.7 Tutorials and Application Examples |
|
|
191 | (14) |
|
4.7.1 Kernel Calculations and Kernel Matrices |
|
|
191 | (3) |
|
4.7.2 Basic Operations with Kernels |
|
|
194 | (3) |
|
4.7.3 Constructing Kernels |
|
|
197 | (2) |
|
|
199 | (3) |
|
4.7.5 Application Example for Support Vector Regression Elements |
|
|
202 | (3) |
|
|
205 | (1) |
|
4.9 Questions and Problems |
|
|
205 | (4) |
Part II Function Approximation and Adaptive Filtering |
|
209 | (224) |
|
5 A Support Vector Machine Signal Estimation Framework |
|
|
211 | (30) |
|
|
211 | (2) |
|
5.2 A Framework for Support Vector Machine Signal Estimation |
|
|
213 | (3) |
|
5.3 Primal Signal Models for Support Vector Machine Signal Processing |
|
|
216 | (11) |
|
5.3.1 Nonparametric Spectrum and System Identification |
|
|
218 | (2) |
|
5.3.2 Orthogonal Frequency Division Multiplexing Digital Communications |
|
|
220 | (2) |
|
5.3.3 Convolutional Signal Models |
|
|
222 | (3) |
|
|
225 | (2) |
|
5.4 Tutorials and Application Examples |
|
|
227 | (11) |
|
5.4.1 Nonparametric Spectral Analysis with Primal Signal Models |
|
|
227 | (1) |
|
5.4.2 System Identification with Primal Signal Model gamma-filter |
|
|
228 | (2) |
|
5.4.3 Parametric Spectral Density Estimation with Primal Signal Models |
|
|
230 | (1) |
|
5.4.4 Temporal Reference Array Processing with Primal Signal Models |
|
|
231 | (2) |
|
5.4.5 Sinc Interpolation with Primal Signal Models |
|
|
233 | (1) |
|
5.4.6 Orthogonal Frequency Division Multiplexing with Primal Signal Models |
|
|
233 | (5) |
|
5.5 Questions and Problems |
|
|
238 | (3) |
|
6 Reproducing Kernel Hilbert Space Models for Signal Processing |
|
|
241 | (40) |
|
|
241 | (1) |
|
6.2 Reproducing Kernel Hilbert Space Signal Models |
|
|
242 | (16) |
|
6.2.1 Kernel Autoregressive Exogenous Identification |
|
|
244 | (3) |
|
6.2.2 Kernel Finite Impulse Response and the gamma-filter |
|
|
247 | (1) |
|
6.2.3 Kernel Array Processing with Spatial Reference |
|
|
248 | (1) |
|
6.2.4 Kernel Semiparametric Regression |
|
|
249 | (9) |
|
6.3 Tutorials and Application Examples |
|
|
258 | (21) |
|
6.3.1 Nonlinear System Identification with Support Vector Machine-Autoregressive and Moving Average |
|
|
258 | (2) |
|
6.3.2 Nonlinear System Identification with the gamma-filter |
|
|
260 | (4) |
|
6.3.3 Electric Network Modeling with Semiparametric Regression |
|
|
264 | (8) |
|
|
272 | (3) |
|
6.3.5 Spatial and Temporal Antenna Array Kernel Processing |
|
|
275 | (4) |
|
6.4 Questions and Problems |
|
|
279 | (2) |
|
7 Dual Signal Models for Signal Processing |
|
|
281 | (52) |
|
|
281 | (1) |
|
7.2 Dual Signal Model Elements |
|
|
281 | (2) |
|
7.3 Dual Signal Model Instantiations |
|
|
283 | (6) |
|
7.3.1 Dual Signal Model for Nonuniform Signal Interpolation |
|
|
283 | (1) |
|
7.3.2 Dual Signal Model for Sparse Signal Deconvolution |
|
|
284 | (1) |
|
7.3.3 Spectrally Adapted Mercer Kernels |
|
|
285 | (4) |
|
7.4 Tutorials and Application Examples |
|
|
289 | (42) |
|
7.4.1 Nonuniform Interpolation with the Dual Signal Model |
|
|
290 | (2) |
|
7.4.2 Sparse Deconvolution with the Dual Signal Model |
|
|
292 | (2) |
|
7.4.3 Doppler Ultrasound Processing for Fault Detection |
|
|
294 | (2) |
|
7.4.4 Spectrally Adapted Mercer Kernels |
|
|
296 | (8) |
|
7.4.5 Interpolation of Heart Rate Variability Signals |
|
|
304 | (5) |
|
7.4.6 Denoising in Cardiac Motion-Mode Doppler Ultrasound Images |
|
|
309 | (7) |
|
7.4.7 Indoor Location from Mobile Devices Measurements |
|
|
316 | (6) |
|
7.4.8 Electroanatomical Maps in Cardiac Navigation Systems |
|
|
322 | (9) |
|
7.5 Questions and Problems |
|
|
331 | (2) |
|
8 Advances in Kernel Regression and Function Approximation |
|
|
333 | (54) |
|
|
333 | (1) |
|
8.2 Kernel-Based Regression Methods |
|
|
333 | (15) |
|
8.2.1 Advances in Support Vector Regression |
|
|
334 | (4) |
|
8.2.2 Multi-output Support Vector Regression |
|
|
338 | (1) |
|
8.2.3 Kernel Ridge Regression |
|
|
339 | (2) |
|
8.2.4 Kernel Signal-to-Noise Regression |
|
|
341 | (2) |
|
8.2.5 Semi-supervised Support Vector Regression |
|
|
343 | (2) |
|
8.2.6 Model Selection in Kernel Regression Methods |
|
|
345 | (3) |
|
8.3 Bayesian Nonparametric Kernel Regression Models |
|
|
348 | (12) |
|
8.3.1 Gaussian Process Regression |
|
|
349 | (10) |
|
8.3.2 Relevance Vector Machines |
|
|
359 | (1) |
|
8.4 Tutorials and Application Examples |
|
|
360 | (22) |
|
8.4.1 Comparing Support Vector Regression, Relevance Vector Machines, and Gaussian Process Regression |
|
|
360 | (2) |
|
8.4.2 Profile-Dependent Support Vector Regression |
|
|
362 | (2) |
|
8.4.3 Multi-output Support Vector Regression |
|
|
364 | (2) |
|
8.4.4 Kernel Signal-to-Noise Ratio Regression |
|
|
366 | (2) |
|
8.4.5 Semi-supervised Support Vector Regression |
|
|
368 | (1) |
|
8.4.6 Bayesian Nonparametric Model |
|
|
369 | (1) |
|
8.4.7 Gaussian Process Regression |
|
|
370 | (9) |
|
8.4.8 Relevance Vector Machines |
|
|
379 | (3) |
|
|
382 | (1) |
|
8.6 Questions and Problems |
|
|
383 | (4) |
|
9 Adaptive Kernel Learning for Signal Processing |
|
|
387 | (46) |
|
|
387 | (1) |
|
9.2 Linear Adaptive Filtering |
|
|
387 | (5) |
|
9.2.1 Least Mean Squares Algorithm |
|
|
388 | (1) |
|
9.2.2 Recursive Least-Squares Algorithm |
|
|
389 | (3) |
|
9.3 Kernel Adaptive Filtering |
|
|
392 | (1) |
|
9.4 Kernel Least Mean Squares |
|
|
392 | (6) |
|
9.4.1 Derivation of Kernel Least Mean Squares |
|
|
393 | (1) |
|
9.4.2 Implementation Challenges and Dual Formulation |
|
|
394 | (1) |
|
9.4.3 Example on Prediction of the Mackey-Glass Time Series |
|
|
395 | (1) |
|
9.4.4 Practical Kernel Least Mean Squares Algorithms |
|
|
396 | (2) |
|
9.5 Kernel Recursive Least Squares |
|
|
398 | (8) |
|
9.5.1 Kernel Ridge Regression |
|
|
398 | (1) |
|
9.5.2 Derivation of Kernel Recursive Least Squares |
|
|
399 | (2) |
|
9.5.3 Prediction of the Mackey-Glass Time Series with Kernel Recursive Least Squares |
|
|
401 | (1) |
|
9.5.4 Beyond the Stationary Model |
|
|
402 | (3) |
|
9.5.5 Example on Nonlinear Channel Identification and Reconvergence |
|
|
405 | (1) |
|
9.6 Explicit Recursivity for Adaptive Kernel Models |
|
|
406 | (5) |
|
9.6.1 Recursivity in Hilbert Spaces |
|
|
406 | (2) |
|
9.6.2 Recursive Filters in Reproducing Kernel Hilbert Spaces |
|
|
408 | (3) |
|
9.7 Online Sparsification with Kernels |
|
|
411 | (3) |
|
9.7.1 Sparsity by Construction |
|
|
411 | (2) |
|
9.7.2 Sparsity by Pruning |
|
|
413 | (1) |
|
9.8 Probabilistic Approaches to Kernel Adaptive Filtering |
|
|
414 | (4) |
|
9.8.1 Gaussian Processes and Kernel Ridge Regression |
|
|
415 | (1) |
|
9.8.2 Online Recursive Solution for Gaussian Processes Regression |
|
|
416 | (1) |
|
9.8.3 Kernel Recursive Least Squares Tracker |
|
|
417 | (1) |
|
9.8.4 Probabilistic Kernel Least Mean Squares |
|
|
418 | (1) |
|
|
418 | (1) |
|
9.9.1 Selection of Kernel Parameters |
|
|
418 | (1) |
|
9.9.2 Multi-Kernel Adaptive Filtering |
|
|
419 | (1) |
|
9.9.3 Recursive Filtering in Kernel Hilbert Spaces |
|
|
419 | (1) |
|
9.10 Tutorials and Application Examples |
|
|
419 | (11) |
|
9.10.1 Kernel Adaptive Filtering Toolbox |
|
|
420 | (1) |
|
9.10.2 Prediction of a Respiratory Motion Time Series |
|
|
421 | (2) |
|
9.10.3 Online Regression on the KIN40K Dataset |
|
|
423 | (2) |
|
9.10.4 The Mackey-Glass Time Series |
|
|
425 | (2) |
|
9.10.5 Explicit Recursivity on Reproducing Kernel in Hilbert Space and Electroencephalogram Prediction |
|
|
427 | (1) |
|
9.10.6 Adaptive Antenna Array Processing |
|
|
428 | (2) |
|
9.11 Questions and Problems |
|
|
430 | (3) |
Part III Classification, Detection, and Feature Extraction |
|
433 | (156) |
|
10 Support Vector Machine and Kernel Classification Algorithms |
|
|
435 | (68) |
|
|
435 | (1) |
|
10.2 Support Vector Machine and Kernel Classifiers |
|
|
435 | (17) |
|
10.2.1 Support Vector Machines |
|
|
435 | (6) |
|
10.2.2 Multiclass and Multilabel Support Vector Machines |
|
|
441 | (6) |
|
10.2.3 Least-Squares Support Vector Machine |
|
|
447 | (1) |
|
10.2.4 Kernel Fisher's Discriminant Analysis |
|
|
448 | (4) |
|
10.3 Advances in Kernel-Based Classification |
|
|
452 | (25) |
|
10.3.1 Large Margin Filtering |
|
|
452 | (2) |
|
10.3.2 Semi-supervised Learning |
|
|
454 | (6) |
|
10.3.3 Multiple Kernel Learning |
|
|
460 | (2) |
|
10.3.4 Structured-Output Learning |
|
|
462 | (6) |
|
|
468 | (9) |
|
10.4 Large-Scale Support Vector Machines |
|
|
477 | (8) |
|
10.4.1 Large-Scale Support Vector Machine Implementations |
|
|
477 | (1) |
|
10.4.2 Random Fourier Features |
|
|
478 | (2) |
|
10.4.3 Parallel Support Vector Machine |
|
|
480 | (3) |
|
|
483 | (2) |
|
10.5 Tutorials and Application Examples |
|
|
485 | (16) |
|
10.5.1 Examples of Support Vector Machine Classification |
|
|
485 | (7) |
|
10.5.2 Example of Least-Squares Support Vector Machine |
|
|
492 | (1) |
|
10.5.3 Kernel-Filtering Support Vector Machine for Brain-Computer Interface Signal Classification |
|
|
493 | (1) |
|
10.5.4 Example of Laplacian Support Vector Machine |
|
|
494 | (4) |
|
10.5.5 Example of Graph-Based Label Propagation |
|
|
498 | (1) |
|
10.5.6 Examples of Multiple Kernel Learning |
|
|
498 | (3) |
|
|
501 | (1) |
|
10.7 Questions and Problems |
|
|
502 | (1) |
|
11 Clustering and Anomaly Detection with Kernels |
|
|
503 | (40) |
|
|
503 | (3) |
|
|
506 | (8) |
|
11.2.1 Kernelization of the Metric |
|
|
506 | (2) |
|
11.2.2 Clustering in Feature Spaces |
|
|
508 | (6) |
|
11.3 Domain Description Via Support Vectors |
|
|
514 | (4) |
|
11.3.1 Support Vector Domain Description |
|
|
514 | (1) |
|
11.3.2 One-Class Support Vector Machine |
|
|
515 | (1) |
|
11.3.3 Relationship Between Support Vector Domain Description and Density Estimation |
|
|
516 | (1) |
|
11.3.4 Semi-supervised One-Class Classification |
|
|
517 | (1) |
|
11.4 Kernel Matched Subspace Detectors |
|
|
518 | (4) |
|
11.4.1 Kernel Orthogonal Subspace Projection |
|
|
518 | (2) |
|
11.4.2 Kernel Spectral Angle Mapper |
|
|
520 | (2) |
|
11.5 Kernel Anomaly Change Detection |
|
|
522 | (3) |
|
11.5.1 Linear Anomaly Change Detection Algorithms |
|
|
522 | (1) |
|
11.5.2 Kernel Anomaly Change Detection Algorithms |
|
|
523 | (2) |
|
11.6 Hypothesis Testing with Kernels |
|
|
525 | (4) |
|
11.6.1 Distribution Embeddings |
|
|
526 | (1) |
|
11.6.2 Maximum Mean Discrepancy |
|
|
527 | (1) |
|
11.6.3 One-Class Support Measure Machine |
|
|
528 | (1) |
|
11.7 Tutorials and Application Examples |
|
|
529 | (12) |
|
11.7.1 Example on Kernelization of the Metric |
|
|
529 | (1) |
|
11.7.2 Example on Kernel k-Means |
|
|
530 | (1) |
|
11.7.3 Domain Description Examples |
|
|
531 | (3) |
|
11.7.4 Kernel Spectral Angle Mapper and Kernel Orthogonal Subspace Projection Examples |
|
|
534 | (2) |
|
11.7.5 Example of Kernel Anomaly Change Detection Algorithms |
|
|
536 | (4) |
|
11.7.6 Example on Distribution Embeddings and Maximum Mean Discrepancy |
|
|
540 | (1) |
|
|
541 | (1) |
|
11.9 Questions and Problems |
|
|
542 | (1) |
|
12 Kernel Feature Extraction in Signal Processing |
|
|
543 | (46) |
|
|
543 | (2) |
|
12.2 Multivariate Analysis in Reproducing Kernel Hilbert Spaces |
|
|
545 | (10) |
|
12.2.1 Problem Statement and Notation |
|
|
545 | (1) |
|
12.2.2 Linear Multivariate Analysis |
|
|
546 | (3) |
|
12.2.3 Kernel Multivariate Analysis |
|
|
549 | (2) |
|
12.2.4 Multivariate Analysis Experiments |
|
|
551 | (4) |
|
12.3 Feature Extraction with Kernel Dependence Estimates |
|
|
555 | (15) |
|
12.3.1 Feature Extraction Using Hilbert-Schmidt Independence Criterion |
|
|
556 | (7) |
|
12.3.2 Blind Source Separation Using Kernels |
|
|
563 | (7) |
|
12.4 Extensions for Large-Scale and Semi-supervised Problems |
|
|
570 | (5) |
|
12.4.1 Efficiency with the Incomplete Cholesky Decomposition |
|
|
570 | (1) |
|
12.4.2 Efficiency with Random Fourier Features |
|
|
570 | (1) |
|
12.4.3 Sparse Kernel Feature Extraction |
|
|
571 | (2) |
|
12.4.4 Semi-supervised Kernel Feature Extraction |
|
|
573 | (2) |
|
12.5 Domain Adaptation with Kernels |
|
|
575 | (12) |
|
12.5.1 Kernel Mean Matching |
|
|
578 | (1) |
|
12.5.2 Transfer Component Analysis |
|
|
579 | (2) |
|
12.5.3 Kernel Manifold Alignment |
|
|
581 | (4) |
|
12.5.4 Relations between Domain Adaptation Methods |
|
|
585 | (1) |
|
12.5.5 Experimental Comparison between Domain Adaptation Methods |
|
|
585 | (2) |
|
|
587 | (1) |
|
12.7 Questions and Problems |
|
|
588 | (1) |
References |
|
589 | (42) |
Index |
|
631 | |