About the editors | xv | |
Preface | xvii | |
1 Computer vision and recognition-based safe automated systems | 1 | (16) |
| 2 | (2) |
1.1.1 Role of computer vision in automation | 3 | (1) |
1.1.2 Organization of the chapter | 3 | (1) |
1.2 Literature survey of safe automation systems | 4 | (1) |
1.3 Application of computer vision technology in automation | 5 | (6) |
1.3.1 Using face ID in mobile devices | 6 | (1) |
1.3.2 Automated automobiles | 6 | (1) |
1.3.3 Computer vision in agriculture | 7 | (1) |
1.3.4 Computer vision in the health sector | 8 | (1) |
1.3.5 Computer vision in the e-commerce industry | 8 | (2) |
| 10 | (1) |
1.3.7 Classifying and detecting objects | 10 | (1) |
1.3.8 Congregation data for training algorithms | 10 | (1) |
1.3.9 Low-light mode with computer vision | 10 | (1) |
1.4 Ensuring safety during COVID-19 using computer vision | 11 | (1) |
1.4.1 AI started from bringing humans closer to forcing them in keeping apart | 11 | (1) |
1.4.2 Access control through computer vision | 11 | (1) |
1.4.3 Thermal fever detection cameras | 11 | (1) |
1.4.4 Social distancing detection | 11 | (1) |
1.4.5 Sanitization prioritization | 12 | (1) |
1.4.6 Face mask compliance | 12 | (1) |
1.5 Discussion and conclusion | 12 | (5) |
| 13 | (4) |
2 DLA: deep learning accelerator | 17 | (34) |
Seyedeh Yasaman Hosseini Mirmahaleh |
| 18 | (1) |
2.2 ASIC-based design accelerator | 19 | (4) |
2.3 FPGA-based design accelerator | 23 | (2) |
2.4 NoC-based design accelerator | 25 | (7) |
2.5 Flow mapping and its impact on DLAs' performance | 32 | (8) |
2.6 A heuristic or dynamic algorithm's role on a DLA's efficiency | 40 | (4) |
2.7 Brief state-of-the-art survey | 44 | (7) |
| 46 | (5) |
3 Intelligent image retrieval system using deep neural networks | 51 | (32) |
| 52 | (2) |
3.2 Conventional content-based image retrieval (CBIR) system | 54 | (2) |
3.2.1 Semantic-based image retrieval (SBIR) system | 56 | (1) |
| 56 | (2) |
3.4 Image retrieval using convolutional neural networks (CNN) | 58 | (9) |
3.5 Image retrieval using autoencoders | 67 | (7) |
3.6 Image retrieval using generative adversarial networks (GAN) | 74 | (9) |
| 79 | (4) |
4 Handwritten digits recognition using dictionary learning | 83 | (34) |
| 84 | (3) |
4.1.1 Optical character recognition | 84 | (1) |
4.1.2 Handwritten recognition | 85 | (2) |
| 87 | (2) |
| 89 | (3) |
| 92 | (7) |
4.4.1 Dictionary pair learning model | 93 | (1) |
4.4.2 Incoherent dictionary pair learning (InDPL) | 94 | (2) |
4.4.3 Labeled projective dictionary pair learning | 96 | (3) |
4.5 Input data preparation | 99 | (2) |
4.5.1 Image preprocessing | 99 | (1) |
4.5.2 Histogram of oriented gradient | 99 | (2) |
4.5.3 Classification stage | 101 | (1) |
| 101 | (1) |
| 102 | (8) |
| 102 | (4) |
4.7.2 Benchmarking results | 106 | (4) |
| 110 | (7) |
| 111 | (6) |
5 Handwriting recognition using CNN and its optimization approach | 117 | (28) |
| 118 | (2) |
| 120 | (2) |
| 122 | (4) |
5.3.1 Convolutional neural network | 122 | (1) |
5.3.2 Gated convolutional neural network | 122 | (1) |
5.3.3 Gated recurrent unit (GRU) | 123 | (1) |
5.3.4 Connectionist temporal classification (CTC) | 123 | (1) |
| 124 | (1) |
5.3.6 Bi-directional gated recurrent unit (BiGRU) | 124 | (1) |
5.3.7 Squeeze and excited network (SENet) | 124 | (1) |
5.3.8 Linear bottleneck network | 125 | (1) |
5.3.9 Encoder and decoder model | 125 | (1) |
| 126 | (7) |
| 126 | (1) |
| 126 | (1) |
| 127 | (5) |
| 132 | (1) |
5.4.5 Training configurations | 132 | (1) |
| 132 | (1) |
5.4.7 Inference time testing | 132 | (1) |
5.4.8 Visualize inside the model | 133 | (1) |
| 133 | (8) |
5.5.1 Experiment 1: Bluche versus Puigcerver versus Flor model | 133 | (1) |
5.5.2 Experiment 2: performance comparison of the encoder | 134 | (1) |
5.5.3 Experiment 3: performance comparison of the decoder | 135 | (1) |
5.5.4 Experiment 4: performance comparison of the skipped connection | 136 | (2) |
5.5.5 Experiment 5: performance comparison of other ResFlor model | 138 | (1) |
5.5.6 Experiment 6: ResFlor residual with SE network | 139 | (1) |
5.5.7 Experiment 7: ResFlor with residual and bottleneck network | 140 | (1) |
| 141 | (1) |
5.7 Conclusion and future work | 142 | (3) |
| 142 | (1) |
| 142 | (3) |
6 Real-time face mask detection on edge IoT devices | 145 | (20) |
6.1 IoT devices and object detection | 145 | (2) |
6.1.1 IoT devices and object detection | 145 | (1) |
6.1.2 Real-time object detection on edge IoT devices | 146 | (1) |
6.1.3 A generic detection algorithm | 146 | (1) |
| 147 | (1) |
6.3 Traditional feature extraction techniques | 147 | (2) |
6.3.1 Histogram of oriented gradients (HOG) | 148 | (1) |
6.3.2 Scale invariant feature transform (SIFT) | 148 | (1) |
6.3.3 Speeded up robust features (SURF) | 149 | (1) |
6.4 Traditional detection methods | 149 | (1) |
6.4.1 Histogram of oriented gradients with support vector machines (HOG + SVM) | 149 | (1) |
6.5 Traditional face detection techniques | 150 | (2) |
6.5.1 Viola-Jones Haar cascade method | 150 | (2) |
| 152 | (1) |
6.7 Deep learning for object detection | 152 | (4) |
6.7.1 Convolutional neural networks (CNNs) | 152 | (1) |
6.7.2 Object detection using deep learning | 153 | (1) |
6.7.3 Faster RCNN for object detection | 154 | (1) |
6.7.4 Enhancing faster RCNN with MobileNet | 155 | (1) |
6.8 Internet and deep learning | 156 | (2) |
6.8.1 Client-server architecture | 156 | (2) |
6.9 Edge IoT architecture | 158 | (1) |
6.10 Implementing an edge IoT architecture | 158 | (3) |
| 158 | (1) |
6.10.2 Backend using Node.js | 159 | (2) |
6.10.3 MongoDB as database | 161 | (1) |
| 161 | (1) |
| 161 | (4) |
| 162 | (3) |
7 Current challenges and applications of DeepFake systems | 165 | (18) |
7.1 Introduction to DeepFake | 165 | (1) |
| 166 | (1) |
7.2 Various DeepFake detection methods available and their limitations | 166 | (8) |
7.2.1 Traditional detection methods | 167 | (4) |
7.2.2 Methods based on deep learning | 171 | (3) |
7.3 Applications used to forge the multimedia | 174 | (1) |
7.4 Current challenges and future of the technology | 175 | (2) |
7.4.1 Quality of DeepFake dataset | 175 | (1) |
7.4.2 Performance evaluation | 176 | (1) |
7.4.3 Explainability of detection results | 176 | (1) |
7.4.4 Temporal aggregation | 176 | (1) |
7.4.5 Social media laundering | 176 | (1) |
| 177 | (6) |
| 178 | (5) |
8 Vehicle control system based on eye, iris, and gesture recognition with eye tracking | 183 | (20) |
| 183 | (1) |
| 184 | (4) |
8.2.1 How eye tracker works | 185 | (3) |
| 188 | (3) |
| 190 | (1) |
8.4 Applications of eye tracking | 191 | (4) |
8.5 Top eye tracking hardware companies | 195 | (1) |
| 196 | (2) |
| 198 | (5) |
| 199 | (4) |
9 Sentiment analysis using deep learning | 203 | (20) |
9.1 Sentiment analysis: an interesting problem | 204 | (1) |
9.2 Sentiment and opinions | 205 | (1) |
9.3 Components of opinion | 205 | (3) |
9.3.1 Levels in sentiment analysis | 206 | (1) |
9.3.2 Classification techniques | 207 | (1) |
9.3.3 Classification types | 207 | (1) |
| 208 | (1) |
| 209 | (1) |
| 210 | (1) |
9.7 Hybrid learning approaches | 210 | (1) |
| 211 | (1) |
9.8.1 Deep belief network | 211 | (1) |
9.8.2 Convolutional neural networks | 211 | (1) |
9.8.3 Stacked autoencoders | 211 | (1) |
9.9 Convolutional neural networks | 212 | (2) |
| 212 | (1) |
| 213 | (1) |
| 213 | (1) |
| 214 | (5) |
9.10.1 Datasets and experimental setup | 215 | (1) |
| 216 | (1) |
9.10.3 Effect of filter region size | 217 | (1) |
9.10.4 Effect of number of filters | 217 | (1) |
9.10.5 Effect of different classifiers | 218 | (1) |
9.11 Conclusions and future scope | 219 | (4) |
| 220 | (3) |
10 Classification of prefeature extracted images with deep convolutional neural network in facial emotion recognition of vehicle driver | 223 | (30) |
10.1 Introduction and related work | 223 | (3) |
| 226 | (9) |
| 226 | (1) |
| 227 | (1) |
10.2.3 Prefeature extraction | 227 | (3) |
10.2.4 Convolutional neural networks | 230 | (2) |
| 232 | (2) |
| 234 | (1) |
10.2.7 System configuration | 235 | (1) |
10.3 Experiments and results | 235 | (12) |
10.4 Vehicle driver emotion recognition experimental setup, results, and discussion | 247 | (2) |
| 249 | (4) |
| 249 | (1) |
| 249 | (4) |
11 MobileNet architecture and its application to computer vision | 253 | (24) |
| 254 | (1) |
| 255 | (7) |
11.2.1 Artificial neural network | 255 | (3) |
11.2.2 Convolution neural network | 258 | (4) |
11.2.3 Deep convolution neural network | 262 | (1) |
11.3 Benchmarked convolutional neural network | 262 | (1) |
| 262 | (1) |
| 262 | (1) |
11.4 MobileNet architecture | 262 | (5) |
| 262 | (2) |
| 264 | (2) |
| 266 | (1) |
| 266 | (1) |
11.5 Model optimization techniques | 267 | (1) |
11.5.1 Quantization technique | 267 | (1) |
11.6 Quantized deep convolutional neural network | 267 | (2) |
| 268 | (1) |
11.7 Case study: healthcare domain | 269 | (2) |
11.7.1 Diabetic retinopathy | 269 | (1) |
11.7.2 Kaggle diabetic retinopathy image datasets | 269 | (1) |
| 270 | (1) |
11.7.4 Experiment results and discussion | 270 | (1) |
| 270 | (1) |
11.8 Selected MobileNet application | 271 | (2) |
11.8.1 Image classification | 271 | (1) |
| 272 | (1) |
| 273 | (1) |
| 273 | (4) |
| 273 | (4) |
12 Study on traffic enforcement cameras monitoring to detect the wrong-way movement of vehicles using deep convolutional neural network | 277 | (24) |
| 277 | (1) |
| 278 | (1) |
12.3 Techniques for data collection | 279 | (1) |
12.3.1 Closed-circuit television | 279 | (1) |
| 279 | (1) |
12.4 Purpose and benefit of the cameras monitoring system | 280 | (1) |
12.5 Techniques used in the monitoring of vehicles | 281 | (6) |
12.5.1 Convolution neural network | 281 | (3) |
| 284 | (1) |
12.5.3 Fast region-based convolution neural network | 285 | (1) |
| 285 | (1) |
12.5.5 Single-shot MultiBoxDetector | 286 | (1) |
12.5.6 You Only Look Once | 286 | (1) |
| 287 | (8) |
12.6.1 The detection of wrong-way drive of automobiles based on appearance using deep convolutional neural network | 287 | (2) |
12.6.2 Real-time wrong-direction detection based on deep learning | 289 | (2) |
12.6.3 A vehicle finding and counting system based on vision using deep learning | 291 | (2) |
12.6.4 A highway automobile discovery algorithm based on CNN | 293 | (2) |
12.6.5 Comparison of case studies | 295 | (1) |
| 295 | (6) |
| 297 | (4) |
13 Glasses for smart tourism applications | 301 | (36) |
Praveen Kumar Reddy Maddikunta |
| 302 | (3) |
| 304 | (1) |
13.1.2 Contribution of our work | 305 | (1) |
| 305 | (2) |
13.3 Existing technologies related to smart glasses | 307 | (6) |
13.3.1 Applications of smart glasses | 307 | (1) |
13.3.2 Smart glasses technology in the market | 307 | (1) |
13.3.3 Smart glasses solutions papers | 307 | (6) |
| 313 | (1) |
13.5 Functional architecture and technologies relevant | 314 | (6) |
13.5.1 Voice to text conversion-KALDI | 314 | (2) |
| 316 | (1) |
13.5.3 Facial features extraction | 317 | (1) |
13.5.4 Object (plant and animal) identification | 317 | (1) |
| 318 | (1) |
| 319 | (1) |
| 319 | (1) |
13.5.8 Text to speech conversion | 320 | (1) |
13.6 Proposed style of interaction (KBSIS) | 320 | (5) |
| 320 | (1) |
13.6.2 Dataset and model used | 321 | (1) |
| 322 | (1) |
13.6.4 Inference from input and processing | 322 | (1) |
| 323 | (2) |
13.7 Results and discussion | 325 | (5) |
| 325 | (1) |
| 325 | (1) |
13.7.3 Text from image and Translate | 325 | (1) |
13.7.4 Remembering face and naming | 325 | (2) |
13.7.5 Face characteristics | 327 | (1) |
| 328 | (1) |
13.7.7 Plant identification and search | 328 | (1) |
13.7.8 Animal identification and search | 328 | (2) |
| 330 | (7) |
| 330 | (7) |
14 Renal calculi detection using modified grey wolf optimization | 337 | (14) |
| 337 | (1) |
| 338 | (5) |
14.2.1 Image segmentation | 338 | (1) |
14.2.2 Grey wolf optimization | 339 | (2) |
| 341 | (2) |
14.3 Proposed approach for renal calculi detection | 343 | (3) |
14.3.1 Challenges in renal calculi detection | 343 | (1) |
| 343 | (3) |
14.4 Experiment and results | 346 | (2) |
| 346 | (1) |
14.4.2 Performance analysis | 346 | (2) |
14.5 Conclusions and future scope | 348 | (3) |
| 349 | (2) |
15 On multi-class aerial image classification using learning machines | 351 | (34) |
| 352 | (1) |
| 352 | (8) |
15.2.1 Deep learning networks | 353 | (4) |
| 357 | (1) |
15.2.3 Challenges for deep learning | 358 | (1) |
15.2.4 Challenges related to aerial video classification | 359 | (1) |
| 359 | (1) |
15.3 Learning architecture and classification | 360 | (6) |
15.3.1 Supervised learning architectures | 360 | (1) |
15.3.2 Unsupervised learning | 361 | (1) |
15.3.3 Deep learning for planning and situational awareness | 362 | (1) |
15.3.4 Deep learning for motion control | 362 | (1) |
| 363 | (1) |
| 364 | (2) |
| 366 | (4) |
15.4.1 Weight initialization | 366 | (1) |
15.4.2 Convolutional methods | 367 | (1) |
15.4.3 Activation functions | 367 | (1) |
15.4.4 Subsampling or pooling layer | 368 | (1) |
15.4.5 Optimization techniques | 368 | (2) |
15.4.6 Benchmark datasets | 370 | (1) |
15.5 Energy efficiency in learning approaches | 370 | (1) |
| 370 | (4) |
15.7 Development kits and frameworks | 374 | (1) |
15.8 Discussions and future directions | 375 | (10) |
| 376 | (9) |
16 Machine learning methodology toward identification of mature citrus fruits | 385 | (54) |
| 385 | (3) |
| 386 | (1) |
| 386 | (1) |
| 386 | (1) |
| 387 | (1) |
| 388 | (2) |
| 390 | (26) |
16.3.1 Image acquisition and data collection | 392 | (1) |
| 392 | (8) |
16.3.3 Feature extraction | 400 | (9) |
16.3.4 Machine learning model and database formation | 409 | (2) |
| 411 | (1) |
16.3.6 Match with dataset or testing the model | 411 | (1) |
| 412 | (1) |
16.3.8 Application design | 413 | (3) |
16.4 Experiments and result | 416 | (18) |
16.4.1 Qualification measures | 416 | (1) |
| 417 | (4) |
| 421 | (13) |
| 434 | (5) |
| 435 | (1) |
| 435 | (4) |
17 Automated detection of defects and grading of cashew kernels using machine learning | 439 | (28) |
| 440 | (2) |
| 441 | (1) |
17.1.2 Proposed methodology | 442 | (1) |
17.2 Defects and grades of cashew kernels | 442 | (3) |
17.2.1 Cashew kernel manufacturing process | 443 | (1) |
17.2.2 Defects of cashew kernel | 443 | (1) |
17.2.3 Grades of cashew kernel | 444 | (1) |
17.3 Implementation of the methodology | 445 | (9) |
17.3.1 Image preprocessing and segmentation | 447 | (3) |
17.3.2 Feature extraction | 450 | (3) |
| 453 | (1) |
17.4 Results and discussions | 454 | (9) |
| 463 | (4) |
| 463 | (4) |
Index | 467 | |