Preface .... xv
Author .... xvii

Chapter 1 … .... 1
1.1 The Internet of Things (IoT) .... 1
1.2 IoT Application Domains .... 1
1.3 … .... 4
1.4 Performance Evaluation and Modeling of IoT Systems .... 5
1.5 Machine Learning and Statistical Techniques for IoT .... 9
… .... 11
… .... 13
… .... 13

Chapter 2 Review of Probability Theory .... 15
2.1 … .... 15
2.2 Discrete Random Variables .... 16
2.2.1 The Binomial Random Variable .... 16
2.2.2 The Geometric Random Variable .... 17
2.2.3 The Poisson Random Variable .... 17
2.2.4 The Cumulative Distribution .... 18
2.3 Continuous Random Variables .... 20
2.3.1 The Uniform Random Variable .... 21
2.3.2 The Exponential Random Variable .... 21
2.3.3 Mixtures of Exponential Random Variables .... 24
2.3.4 The Normal Random Variable .... 26
2.4 The Joint Probability Distribution .... 28
2.4.1 The Marginal Probability Distribution .... 29
2.4.2 The Conditional Probability .... 30
2.5 Expectation and Variance .... 31
2.5.1 The Expectation and Variance of Some Random Variables .... 33
… .... 36
… .... 37

Chapter 3 Simulation Techniques .... 39
3.1 … .... 39
3.2 The Discrete-Event Simulation Technique .... 40
3.2.1 Recertification of IoT Devices: A Simple Model .... 40
3.2.2 Recertification of IoT Devices: A More Complex Model .... 43
3.3 Generating Random Numbers .... 46
3.3.1 Generating Pseudo-Random Numbers .... 46
3.3.2 Generating Random Variates .... 48
3.4 … .... 54
3.4.1 … .... 56
3.4.2 Selecting the Unit Time .... 58
3.5 Estimation Techniques .... 58
3.5.1 Collecting Endogenously Created Data .... 58
3.5.2 Transient-State versus Steady-State Simulation .... 60
3.5.3 Estimation of the Confidence Interval of the Mean .... 61
3.5.4 Estimation of the Confidence Interval of a Percentile .... 69
3.5.5 Estimation of the Confidence Interval of a Probability .... 70
3.5.6 Achieving a Required Accuracy .... 70
3.6 Validation of a Simulation Model .... 71
… .... 71
… .... 72
… .... 73
Task 1 (to be completed after you read Section 3.2) .... 73
Task 2 (to be completed after you read Section 3.3) .... 73
Task 3 (to be completed after you read Section 3.5) .... 73
… .... 74

Chapter 4 Hypothesis Testing .... 75
4.1 Statistical Hypothesis Testing for a Mean .... 75
4.1.1 … .... 78
4.1.2 Hypothesis Testing for the Difference between Two Population Means .... 79
4.1.3 Hypothesis Testing for a Proportion .... 80
4.1.4 Type I and Type II Errors .... 81
4.2 Analysis of Variance (ANOVA) .... 82
… .... 85
… .... 86
… .... 87

Chapter 5 Multivariable Linear Regression .... 89
5.1 Simple Linear Regression .... 90
5.2 Multivariable Linear Regression .... 93
5.2.1 Significance of the Regression Coefficients .... 94
5.2.2 … .... 95
5.2.3 … .... 99
5.2.4 … .... 100
5.2.5 Data Transformations .... 100
5.3 … .... 101
5.4 Polynomial Regression .... 105
5.5 Confidence and Prediction Intervals .... 107
5.6 Ridge, Lasso, and Elastic Net Regression .... 108
5.6.1 Ridge Regression .... 109
5.6.2 Lasso Regression .... 111
5.6.3 Elastic Net Regression .... 111
… .... 112
… .... 113
… .... 113
Task 1 Basic Statistics Analysis .... 114
Task 2 Simple Linear Regression .... 114
Task 3 Linear Multivariable Regression .... 114
… .... 115
… .... 115

Chapter 6 Time Series Forecasting .... 117
6.1 A Stationary Time Series .... 117
6.1.1 How to Recognize Seasonality .... 120
6.1.2 Techniques for Removing Non-Stationary Features .... 123
6.2 Moving Average or Smoothing Models .... 126
6.2.1 The Simple Average Model .... 126
6.2.2 The Exponential Moving Average Model .... 127
6.2.3 The Average Age of a Model .... 128
6.2.4 Selecting the Best Value for k and α .... 129
6.3 The Moving Average MA(q) Model .... 129
6.3.1 Derivation of the Mean and Variance of X_t .... 131
6.3.2 Derivation of the Autocorrelation Function of the MA(1) .... 132
6.3.3 Invertibility of MA(q) .... 133
6.4 The Autoregressive Model .... 134
6.4.1 … .... 134
6.4.2 Stationarity Condition of AR(p) .... 137
6.4.3 Derivation of the Coefficients a_i, i = 1, 2, ..., p .... 138
6.4.4 Determination of the Order of AR(p) .... 140
6.5 The Non-Seasonal ARIMA(p, d, q) Model .... 141
6.5.1 Determination of the ARIMA Parameters .... 143
6.6 The Decomposition Model .... 145
6.6.1 Basic Steps for the Decomposition Model .... 146
6.7 … .... 148
6.8 … .... 149
6.9 Vector Autoregression .... 151
… .... 154
… .... 157
… .... 157
… .... 157
Task 1 Check for Stationarity .... 158
Task 2 Fit a Simple Moving Average Model .... 158
Task 3 Fit an Exponential Smoothing Model .... 158
Task 4 Fit an ARMA(p, q) Model .... 158
Task 5 Comparison of All the Models .... 159
… .... 159

Chapter 7 Dimensionality Reduction .... 161
7.1 A Review of Eigenvalues and Eigenvectors .... 161
7.2 Principal Component Analysis (PCA) .... 163
7.2.1 … .... 165
7.3 Linear and Multiple Discriminant Analysis .... 171
7.3.1 Linear Discriminant Analysis (LDA) .... 172
7.3.2 Multiple Discriminant Analysis (MDA) .... 176
… .... 178
… .... 178

Chapter 8 Clustering Techniques .... 179
8.1 … .... 179
8.2 Hierarchical Clustering .... 182
8.2.1 The Hierarchical Clustering Algorithm .... 184
8.2.2 … .... 185
8.3 The k-Means Algorithm .... 187
8.3.1 … .... 188
8.3.2 Determining the Number k of Clusters .... 189
8.4 The Fuzzy c-Means Algorithm .... 192
8.5 The Gaussian Mixture Decomposition .... 193
8.6 The DBSCAN Algorithm .... 195
8.6.1 Determining MinPts and ε .... 198
8.6.2 Advantages and Disadvantages of DBSCAN .... 198
… .... 199
… .... 200
… .... 200
Task 1 Hierarchical Clustering .... 201
… .... 201
… .... 201
… .... 201
… .... 201

Chapter 9 Classification Techniques .... 203
9.1 The k-Nearest Neighbor (k-NN) Method .... 203
9.1.1 … .... 205
9.1.2 Using Kernels with the k-NN Method .... 206
9.1.3 Curse of Dimensionality .... 208
9.1.4 … .... 209
9.1.5 Advantages and Disadvantages of the k-NN Method .... 210
9.2 The Naive Bayes Classifier .... 210
9.2.1 The Simple Bayes Classifier .... 211
9.2.2 The Naive Bayes Classifier .... 212
9.2.3 The Gaussian Naive Bayes Classifier .... 214
9.2.4 Advantages and Disadvantages .... 216
9.2.5 The k-NN Method Using Bayes' Theorem .... 216
9.3 Decision Trees .... 217
9.3.1 … .... 219
9.3.2 Classification Trees .... 220
9.3.3 Pre-Pruning and Post-Pruning .... 226
9.3.4 Advantages and Disadvantages of Decision Trees .... 230
9.3.5 Decision Trees Ensemble Methods .... 231
9.4 Logistic Regression .... 232
9.4.1 The Binary Logistic Regression .... 233
9.4.2 Multinomial Logistic Regression .... 237
9.4.3 Ordinal Logistic Regression .... 238
… .... 240
… .... 241
… .... 242

Chapter 10 Artificial Neural Networks .... 243
10.1 The Feedforward Artificial Neural Network .... 243
10.2 Other Artificial Neural Networks .... 246
10.3 Activation Functions .... 247
10.4 Calculation of the Output Value .... 249
10.5 Selecting the Number of Layers and Nodes .... 251
10.6 The Backpropagation Algorithm .... 252
10.6.1 The Gradient Descent Algorithm .... 254
10.6.2 Calculation of the Gradients .... 256
10.7 Stochastic, Batch, Mini-Batch Gradient Descent Methods .... 266
10.8 Feature Normalization .... 267
10.9 … .... 268
10.9.1 The Early Stopping Method .... 269
10.9.2 … .... 270
10.9.3 The Dropout Method .... 272
10.10 Selecting the Hyper-Parameters .... 272
10.10.1 Selecting the Learning Rate γ .... 273
10.10.2 Selecting the Regularization Parameter λ .... 274
… .... 275
… .... 276
… .... 276
Task 1 Train a Feedforward Neural Network .... 277
Task 2 Automatic Grid Search .... 277
Task 3 Compare the Best Trained Neural Network Model with Multivariable Regression .... 277
… .... 277

Chapter 11 Support Vector Machines .... 279
11.1 … .... 279
11.2 The SVM Algorithm: Linearly Separable Data .... 283
11.3 Soft-Margin SVM (C-SVM) .... 287
11.4 The SVM Algorithm: Non-Linearly Separable Data .... 289
11.5 … .... 293
11.6 … .... 294
11.7 Selecting the Best Values for C and γ .... 296
11.8 ε-Support Vector Regression (ε-SVR) .... 298
… .... 300
… .... 301
… .... 301
… .... 302
… .... 302

Chapter 12 Hidden Markov Models .... 303
12.1 … .... 303
12.2 Hidden Markov Models: An Example .... 307
12.3 The Three Basic HMM Problems .... 308
12.3.1 Problem 1: The Evaluation Problem .... 309
12.3.2 Problem 2: The Decoding Problem .... 309
12.3.3 Problem 3: The Learning Problem .... 309
12.4 Mathematical Notation .... 310
12.5 Solution to Problem 1 .... 310
12.5.1 A Brute Force Solution .... 311
12.5.2 The Forward-Backward Algorithm .... 312
12.6 Solution to Problem 2 .... 317
12.6.1 The Heuristic Solution .... 317
12.6.2 The Viterbi Algorithm .... 319
12.7 Solution to Problem 3 .... 324
12.8 Selection of the Number of States N .... 328
12.9 … .... 330
12.10 Continuous Observation Probability Distributions .... 331
12.11 Autoregressive HMMs .... 332
… .... 333
… .... 333
… .... 333
Task 1 … .... 334
Task 2 Estimate the Most Probable Sequence Q .... 334
… .... 335
… .... 335

Appendix A Some Basic Concepts of Queueing Theory .... 337
Appendix B Maximum Likelihood Estimation (MLE) .... 343
B.1 … .... 343
B.2 Relation of MLE to Bayesian Inference .... 346
B.3 MLE and the Least Squares Method .... 347
B.4 MLE of the Gaussian MA(1) .... 347
B.5 MLE of the Gaussian AR(1) .... 349
Index .... 351