About the Author | xiii
About the Technical Reviewer | xv
Acknowledgments | xvii
Introduction | xix

Chapter 1 Mathematical Foundations | 1
    Matrix Operations and Manipulations | 5
    Linear Independence of Vectors | 9
    Identity Matrix or Operator | 11
    Pseudo Inverse of a Matrix | 16
    Unit Vector in the Direction of a Specific Vector | 17
    Projection of a Vector in the Direction of Another Vector | 17
    Successive Partial Derivatives | 25
    Hessian Matrix of a Function | 25
    Maxima and Minima of Functions | 26
    Local Minima and Global Minima | 28
    Positive Semi-Definite and Positive Definite | 29
    Multivariate Convex and Non-convex Functions Examples | 31
    Unions, Intersection, and Conditional Probability | 35
    Chain Rule of Probability for Intersection of Events | 37
    Mutually Exclusive Events | 37
    Conditional Independence of Events | 38
    Probability Mass Function | 38
    Probability Density Function | 39
    Expectation of a Random Variable | 39
    Variance of a Random Variable | 39
    Some Common Probability Distributions | 45
    Maximum Likelihood Estimate | 52
    Hypothesis Testing and p Value | 53
    Formulation of Machine-Learning Algorithm and Optimization Techniques | 55
    Optimization Techniques for Machine Learning | 66
    Constrained Optimization Problem | 77
    A Few Important Topics in Machine Learning | 79
    Dimensionality Reduction Methods | 79
    Regularization Viewed as a Constraint Optimization Problem | 86

Chapter 2 Introduction to Deep-Learning Concepts and TensorFlow | 89
    Deep Learning and Its Evolution | 89
    Perceptrons and Perceptron Learning Algorithm | 92
    Geometrical Interpretation of Perceptron Learning | 96
    Limitations of Perceptron Learning | 97
    Hidden Layer Perceptrons' Activation Function for Non-linearity | 100
    Different Activation Functions for a Neuron/Perceptron | 102
    Learning Rule for Multi-Layer Perceptrons Network | 108
    Backpropagation for Gradient Computation | 109
    Generalizing the Backpropagation Method for Gradient Computation | 111
    Common Deep-Learning Packages | 118
    TensorFlow Basics for Development | 119
    Gradient-Descent Optimization Methods from a Deep-Learning Perspective | 123
    Learning Rate in Mini-batch Approach to Stochastic Gradient Descent | 129
    XOR Implementation Using TensorFlow | 138
    Linear Regression in TensorFlow | 143
    Multi-class Classification with SoftMax Function Using Full-Batch Gradient Descent | 146
    Multi-class Classification with SoftMax Function Using Stochastic Gradient Descent | 149

Chapter 3 Convolutional Neural Networks | 153
    Linear Time Invariant (LTI) / Linear Shift Invariant (LSI) Systems | 153
    Convolution for Signals in One Dimension | 155
    Analog and Digital Signals | 158
    Two-dimensional Unit Step Function | 161
    2D Convolution of a Signal with an LSI System Unit Step Response | 163
    2D Convolution of an Image to Different LSI System Responses | 165
    Common Image-Processing Filters | 169
    Sobel Edge-Detection Filter | 175
    Convolutional Neural Networks | 178
    Components of Convolutional Neural Networks | 179
    Backpropagation Through the Convolutional Layer | 182
    Backpropagation Through the Pooling Layers | 186
    Weight Sharing Through Convolution and Its Advantages | 187
    Translation Invariance Due to Pooling | 189
    Dropout Layers and Regularization | 190
    Convolutional Neural Network for Digit Recognition on the MNIST Dataset | 192
    Convolutional Neural Network for Solving Real-World Problems | 196
    Different Architectures in Convolutional Neural Networks | 206
    Guidelines for Using Transfer Learning | 212
    Transfer Learning with Google's InceptionV3 | 213
    Transfer Learning with Pre-trained VGG16 | 216

Chapter 4 Natural Language Processing Using Recurrent Neural Networks | 223
    Vector Representation of Words | 227
    Continuous Bag of Words (CBOW) | 228
    Continuous Bag of Words Implementation in TensorFlow | 231
    Skip-Gram Model for Word Embedding | 235
    Skip-Gram Implementation in TensorFlow | 237
    Global Co-occurrence Statistics-based Word Vectors | 240
    Word Analogy with Word Vectors | 249
    Introduction to Recurrent Neural Networks | 252
    Predicting the Next Word in a Sentence Through RNN Versus Traditional Methods | 255
    Backpropagation Through Time (BPTT) | 256
    Vanishing and Exploding Gradient Problem in RNNs | 259
    Solution to Vanishing and Exploding Gradients Problem in RNNs | 260
    Long Short-Term Memory (LSTM) | 262
    LSTM in Reducing Exploding- and Vanishing-Gradient Problems | 263
    MNIST Digit Identification in TensorFlow Using Recurrent Neural Networks | 265
    Gated Recurrent Unit (GRU) | 274

Chapter 5 Unsupervised Learning with Restricted Boltzmann Machines and Auto-encoders | 279
    Bayesian Inference: Likelihood, Priors, and Posterior Probability Distribution | 281
    Markov Chain Monte Carlo Methods for Sampling | 286
    Restricted Boltzmann Machines | 294
    Training a Restricted Boltzmann Machine | 299
    Burn-in Period and Generating Samples in Gibbs Sampling | 306
    Using Gibbs Sampling in Restricted Boltzmann Machines | 306
    A Restricted Boltzmann Machine Implementation in TensorFlow | 309
    Collaborative Filtering Using Restricted Boltzmann Machines | 313
    Deep Belief Networks (DBNs) | 317
    Feature Learning Through Auto-encoders for Supervised Learning | 325
    Kullback-Leibler (KL) Divergence | 327
    Sparse Auto-Encoder Implementation in TensorFlow | 329
    A Denoising Auto-Encoder Implementation in TensorFlow | 333

Chapter 6 Advanced Neural Networks | 345
    Binary Thresholding Method Based on Histogram of Pixel Intensities | 345
    Watershed Algorithm for Image Segmentation | 349
    Image Segmentation Using K-means Clustering | 352
    Fully Convolutional Network (FCN) | 356
    Fully Convolutional Network with Downsampling and Upsampling | 358
    Semantic Segmentation in TensorFlow with Fully Connected Neural Networks | 365
    Image Classification and Localization Network | 373
    Generative Adversarial Networks | 378
    Maximin and Minimax Problem | 379
    Minimax and Saddle Points | 382
    GAN Cost Function and Training | 383
    Vanishing Gradient for the Generator | 386
    TensorFlow Implementation of a GAN Network | 386
    TensorFlow Models' Deployment in Production | 389

Index | 393