Preface  xix
Acknowledgement  xxi
…  xxiii
…  xxv
…  xxix
…  xxxi
|
1 Markov Chain and its Applications  1 (16)
1.1 …  1 (1)
1.2 …  2 (6)
1.2.1 …  2 (1)
1.2.2 …  3 (1)
1.2.2.1 Transition probability  3 (2)
1.2.2.2 State transition matrix  5 (3)
1.3 Prediction Using Markov Chain  8 (5)
1.3.1 …  8 (3)
1.3.2 Long-run Probability  11 (1)
1.3.2.1 Algebraic solution  11 (1)
1.3.2.2 …  12 (1)
1.4 Applications of Markov Chains  13 (4)
1.4.1 Absorbing Nodes in a Markov Chain  14 (3)
|
2 Hidden Markov Modelling (HMM)  17 (18)
2.1 …  17 (1)
2.2 Emission Probabilities  18 (2)
2.3 A Hidden Markov Model  20 (6)
2.3.1 Setting up the HMM Model  21 (1)
2.3.2 HMM in Pictorial Form  22 (4)
2.4 The Three Great Problems in HMM  26 (5)
2.4.1 …  26 (1)
2.4.1.1 Problem 1: Classification or the likelihood problem (find p(O|λ))  26 (1)
2.4.1.2 Problem 2: Trajectory estimation problem  26 (1)
2.4.1.3 Problem 3: System identification problem  26 (1)
2.4.2 Solution to Problem 1: Estimation of Likelihood  26 (1)
2.4.2.1 …  27 (1)
2.4.2.2 Forward recursion  27 (1)
2.4.2.3 Backward recursion  28 (2)
2.4.2.4 Solution to Problem 2: Trajectory estimation problem  30 (1)
2.5 State Transition Table  31 (2)
2.5.1 …  31 (1)
2.5.2 Output Symbol Table  32 (1)
2.6 Solution to Problem 3: Find the Optimal HMM  33 (1)
…  33 (1)
…  34 (1)
|
3 Introduction to Kalman Filters  35 (24)
3.1 …  35 (1)
3.2 …  35 (4)
3.2.1 Step (1): Calculate Kalman Gain  37 (2)
3.3 …  39 (9)
3.3.1 Models of the State Variables  42 (1)
3.3.1.1 Using prediction and measurements in Kalman filters  42 (2)
3.3.2 Gaussian Representation of State  44 (4)
3.4 …  48 (8)
3.4.1 State Matrix for an Object Moving in a Single Direction  48 (4)
3.4.1.1 Tracking including measurements  52 (1)
3.4.2 State Matrix of an Object Moving in Two Dimensions  52 (2)
3.4.3 Objects Moving in Three-Dimensional Space  54 (2)
3.5 Kalman Filter Models with Noise  56 (3)
…  57 (2)
|
4 …  59 (16)
4.1 …  59 (1)
4.2 Processing Steps in Kalman Filter  59 (16)
4.2.1 Covariance Matrices  59 (3)
4.2.2 Computation Methods for Covariance Matrix  62 (1)
4.2.2.1 …  62 (3)
4.2.2.2 Deviation matrix computation method  65 (2)
4.2.3 Iterations in Kalman Filter  67 (8)
|
5 Genetic Algorithms  75 (18)
5.1 …  75 (1)
5.2 Steps in Genetic Algorithm  75 (1)
5.3 Terminology of Genetic Algorithms (GAs)  76 (2)
5.4 …  78 (3)
5.4.1 Generic Requirements of a Fitness Function  78 (3)
5.5 …  81 (3)
5.5.1 …  81 (1)
5.5.2 …  82 (1)
5.5.2.1 Single-position crossover  82 (1)
…  83 (1)
…  83 (1)
…  84 (1)
5.6 Maximizing a Function of a Single Variable  84 (3)
5.7 Continuous Genetic Algorithms  87 (6)
5.7.1 Lowest Elevation on Topographical Maps  87 (3)
5.7.2 Application of GA to Temperature Recording with Sensors  90 (1)
…  91 (2)
|
6 Calculus on Computational Graphs  93 (12)
6.1 …  93 (2)
6.1.1 Elements of Computational Graphs  94 (1)
6.2 …  95 (1)
6.3 Computing Partial Derivatives  96 (3)
6.3.1 Partial Derivatives: Two Cases of the Chain Rule  97 (1)
6.3.1.1 Linear chain rule  98 (1)
6.3.1.2 …  98 (1)
6.3.1.3 Multiple loop chain rule  99 (1)
6.4 Computing Integrals  99 (3)
…  100 (1)
…  100 (2)
6.5 Multipath Compound Derivatives  102 (3)
|
7 Support Vector Machines  105 (34)
7.1 …  105 (1)
7.2 Essential Mathematics of SVM  106 (5)
7.2.1 Introduction to Hyperplanes  107 (3)
7.2.2 Parallel Hyperplanes  110 (1)
7.2.3 Distance between Two Parallel Planes  110 (1)
7.3 Support Vector Machines  111 (3)
7.3.1 …  111 (1)
7.3.2 Linearly Separable Case  112 (2)
7.4 Location of Optimal Hyperplane (Primal Problem)  114 (4)
7.4.1 …  114 (2)
7.4.2 Distance of a Point xi from Separating Hyperplane  116 (1)
7.4.2.1 Margin for support vector points  117 (1)
7.4.3 Finding Optimal Hyperplane Problem  117 (1)
…  117 (1)
7.5 The Lagrangian Optimization Function  118 (6)
7.5.1 Optimization Involving Single Constraint  119 (1)
7.5.2 Optimization with Multiple Constraints  120 (1)
7.5.2.1 Single inequality constraint  121 (1)
7.5.2.2 Multiple inequality constraints  122 (1)
7.5.3 Karush-Kuhn-Tucker Conditions  123 (1)
7.6 SVM Optimization Problems  124 (5)
7.6.1 The Primal SVM Optimization Problem  124 (1)
7.6.2 The Dual Optimization Problem  125 (1)
7.6.2.1 Reformulation of the dual algorithm  126 (3)
7.7 Linear SVM (Non-linearly Separable) Data  129 (10)
7.7.1 …  129 (1)
7.7.1.1 Primal formulation including slack variable  130 (1)
7.7.1.2 Dual formulation including slack variable  130 (1)
7.7.1.3 Choosing C in soft margin cases  131 (1)
7.7.2 Non-linear Data Classification Using Kernels  132 (3)
7.7.2.1 Polynomial kernel function  135 (1)
7.7.2.2 Multi-layer perceptron (Sigmoidal) kernel  136 (1)
7.7.2.3 Gaussian radial basis function  136 (1)
7.7.2.4 Creating new kernels  137 (1)
…  137 (2)
|
8 Artificial Neural Networks  139 (14)
8.1 …  139 (1)
8.2 …  139 (14)
8.2.1 Activation Functions  143 (1)
8.2.1.1 …  143 (1)
8.2.1.2 Hyperbolic tangent  144 (1)
8.2.1.3 Rectified Linear Unit (ReLU)  144 (1)
8.2.1.4 …  145 (1)
8.2.1.5 Parametric rectifier  145 (1)
8.2.1.6 …  146 (1)
8.2.1.7 …  146 (1)
8.2.1.8 Error calculation  146 (3)
8.2.1.9 Output layer node  149 (1)
8.2.1.10 Hidden layer nodes  150 (1)
8.2.1.11 Summary of derivations  151 (2)
|
9 Training of Neural Networks  153 (18)
9.1 …  153 (1)
9.2 Practical Neural Network  153 (1)
9.3 Backpropagation Model  154 (3)
9.3.1 Computational Graph  154 (3)
9.4 Backpropagation Example with Computational Graphs  157 (1)
9.5 …  158 (2)
9.6 Practical Training of Neural Networks  160 (6)
9.6.1 Forward Propagation  161 (2)
9.6.2 Backward Propagation  163 (1)
9.6.2.1 Adapting the weights  164 (2)
9.7 Weight Initialisation Methods  166 (3)
9.7.1 Xavier Initialisation  167 (2)
9.7.2 Batch Normalisation  169 (1)
…  169 (2)
…  170 (1)
|
10 Recurrent Neural Networks  171 (14)
10.1 …  171 (1)
10.2 Introduction to Recurrent Neural Networks  171 (3)
10.3 Recurrent Neural Network  174 (11)
|
11 Convolutional Neural Networks  185 (20)
11.1 …  185 (1)
11.2 Convolution Matrices  185 (2)
11.2.1 Three-Dimensional Convolution in CNN  187 (1)
11.3 …  187 (6)
11.3.1 Design of Convolution Kernel  190 (1)
11.3.1.1 Separable Gaussian kernel  191 (1)
11.3.1.2 Separable Sobel kernel  192 (1)
11.3.1.3 Computation advantage  192 (1)
11.4 Convolutional Neural Networks  193 (8)
11.4.1 Concepts and Hyperparameters  193 (1)
11.4.1.1 …  193 (1)
11.4.1.2 Zero-padding (P)  193 (2)
11.4.1.3 Receptive field (R)  195 (1)
11.4.1.4 …  195 (1)
11.4.1.5 Activation function using rectified linear unit  195 (1)
11.4.2 CNN Processing Stages  196 (1)
11.4.2.1 Convolution layer  196 (3)
11.4.3 …  199 (2)
11.4.4 The Fully Connected Layer  201 (1)
11.5 CNN Design Principles  201 (1)
…  202 (3)
…  203 (2)
|
12 Principal Component Analysis  205 (16)
12.1 …  205 (1)
12.2 …  205 (8)
12.2.1 Covariance Matrices  209 (4)
12.3 Computation of Principal Components  213 (8)
12.3.1 PCA Using Vector Projection  213 (1)
12.3.2 PCA Computation Using Covariance Matrices  214 (3)
12.3.3 PCA Using Singular-Value Decomposition  217 (1)
12.3.4 Applications of PCA  218 (1)
12.3.4.1 Face recognition  219 (1)
…  220 (1)
|
13 Moment-Generating Functions  221 (18)
13.1 Moments of Random Variables  221 (3)
13.1.1 Central Moments of Random Variables  222 (1)
13.1.2 Properties of Moments  222 (2)
13.2 Univariate Moment-Generating Functions  224 (2)
13.3 Series Representation of MGF  226 (2)
13.3.1 Properties of Probability Mass Functions  227 (1)
13.3.2 Properties of Probability Distribution Functions ƒ(x)  227 (1)
13.4 Moment-Generating Functions of Discrete Random Variables  228 (4)
13.4.1 Bernoulli Random Variable  228 (1)
13.4.2 Binomial Random Variables  229 (2)
13.4.3 Geometric Random Variables  231 (1)
13.4.4 Poisson Random Variable  232 (1)
13.5 Moment-Generating Functions of Continuous Random Variables  232 (4)
13.5.1 Exponential Distributions  232 (1)
13.5.2 Normal Distribution  233 (2)
13.5.3 Gamma Distribution  235 (1)
13.6 Properties of Moment-Generating Functions  236 (1)
13.7 Multivariate Moment-Generating Functions  236 (2)
13.7.1 The Law of Large Numbers  237 (1)
…  238 (1)
|
14 Characteristic Functions  239 (4)
14.1 Characteristic Functions  239 (1)
14.1.1 Properties of Characteristic Functions  240 (1)
14.2 Characteristic Functions of Discrete Single Random Variables  240 (3)
14.2.1 Characteristic Function of a Poisson Random Variable  240 (1)
14.2.2 Characteristic Function of Binomial Random Variable  241 (1)
14.2.3 Characteristic Functions of Continuous Random Variables  242 (1)
|
15 Probability-Generating Functions  243 (16)
15.1 Probability-Generating Functions  243 (1)
15.2 Discrete Probability-Generating Functions  243 (10)
15.2.1 …  244 (2)
15.2.2 Probability-Generating Function of Bernoulli Random Variable  246 (1)
15.2.3 Probability-Generating Function for Binomial Random Variable  246 (1)
15.2.4 Probability-Generating Function for Poisson Random Variable  247 (1)
15.2.5 Probability-Generating Functions of Geometric Random Variables  248 (2)
15.2.6 Probability-Generating Function of Negative Binomial Random Variables  250 (1)
15.2.6.1 Negative binomial probability law  251 (2)
15.3 Applications of Probability-Generating Functions in Data Analytics  253 (6)
15.3.1 Discrete Event Applications  253 (1)
15.3.1.1 …  253 (1)
15.3.1.2 …  253 (1)
15.3.2 Modelling of Infectious Diseases  254 (1)
15.3.2.1 Early extinction probability  255 (1)
15.3.2.1.1 Models of extinction probability  256 (1)
…  257 (2)
|
16 Digital Identity Management System Using Artificial Neural Networks  259 (18)
16.1 …  259 (1)
16.2 Digital Identity Metrics  259 (2)
16.3 …  261 (2)
16.3.1 Fingerprint and Face Verification Challenges  261 (1)
16.3.2 …  262 (1)
16.3.3 …  262 (1)
16.4 Biometrics System Architecture  263 (3)
16.4.1 Fingerprint Recognition  264 (1)
16.4.2 …  264 (2)
16.5 …  266 (1)
16.6 Artificial Neural Networks  267 (2)
16.6.1 Artificial Neural Networks Implementation  268 (1)
16.7 Multimodal Digital Identity Management System Implementation  269 (3)
16.7.1 Terminal, Fingerprint Scanner and Camera  269 (1)
16.7.2 Fingerprint and Face Recognition SDKs  270 (1)
16.7.3 …  271 (1)
16.7.4 Verification: Connect to Host and Select Verification  271 (1)
16.7.4.1 …  271 (1)
16.7.4.2 Successful verification  271 (1)
…  272 (5)
…  272 (5)
|
17 Probabilistic Neural Network Classifiers for IoT Data Classification  277 (14)
17.1 …  277 (1)
17.2 Probabilistic Neural Network (PNN)  278 (2)
17.3 Generalized Regression Neural Network (GRNN)  280 (2)
17.4 Vector Quantized GRNN (VQ-GRNN)  282 (4)
17.5 …  286 (1)
17.6 Conclusion and Future Work  287 (4)
…  288 (3)
|
18 MML Learning and Inference of Hierarchical Probabilistic Finite State Machines  291 (36)
18.1 …  291 (1)
18.2 Finite State Machines (FSMs) and PFSMs  292 (4)
18.2.1 Mathematical Definition of a Finite State Machine  293 (1)
18.2.2 Representation of an FSM in a State Diagram  293 (3)
18.3 MML Encoding and Inference of PFSMs  296 (11)
18.3.1 …  297 (1)
18.3.1.1 Assertion code for hypothesis H  297 (3)
18.3.1.2 Assertion code for data D generated by hypothesis H  300 (1)
18.3.2 Inference of PFSM Using MML  301 (1)
18.3.2.1 Inference of PFSM by ordered merging (OM)  302 (1)
18.3.2.1.1 First stage merge  302 (1)
18.3.2.1.2 Second stage merge  303 (1)
18.3.2.1.3 Third stage merge  303 (1)
18.3.2.1.4 Ordered merging (OM) algorithm  304 (1)
18.3.2.2 Inference of PFSM using simulated annealing (SA)  305 (1)
18.3.2.2.1 Simulated annealing (SA)  306 (1)
18.3.2.2.2 Simulated annealing (SA) algorithm  307 (1)
18.4 Hierarchical Probabilistic Finite State Machine (HPFSM)  307 (7)
18.4.1 …  309 (1)
18.4.2 MML Assertion Code for the Hypothesis H of HPFSM  310 (3)
18.4.3 Encoding the Transitions of HPFSM  313 (1)
18.5 …  314 (9)
18.5.1 Experiments on Artificial Datasets  314 (1)
18.5.1.1 …  314 (3)
18.5.1.2 …  317 (2)
18.5.2 Experiments on ADL Datasets  319 (4)
…  323 (4)
…  324 (3)
Solution to Exercises  327 (2)
Index  329 (6)
About the Author  335