
AI Computing Systems: An Application Driven Perspective [Paperback]

Author affiliations: Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Institute of Software, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China; State Key Lab of Processors, Institute o
  • Format: Paperback / softback, 600 pages, height x width: 235x191 mm, weight: 910 g
  • Publication date: 02-Feb-2023
  • Publisher: Morgan Kaufmann
  • ISBN-10: 0323953999
  • ISBN-13: 9780323953993
A complete intelligent computing system involves many layers, such as the processing chip, system architecture, programming environment, and software, which makes it a difficult topic to master in a short time. AI Computing Systems: An Application Driven Perspective therefore adopts the principle of "application-driven, full-stack penetration," using the concrete intelligent application of image style transfer to give students a sound starting place to learn. This approach enables readers to obtain a full view of the AI computing system.
  • Provides an in-depth analysis of the principles underlying how knowledge is applied in intelligent computing systems
  • Centers on an application-driven, full-stack approach, focusing on the knowledge required to complete this application at every level of the software and hardware technology stack
  • Supporting experimental tutorials, covering the key knowledge points of each chapter, provide practical guidance and tools for developing a simple AI computing system
Biographies xi
Preface for the English version xiii
Preface xv
Motivation for this book xv
Value of an AI computing systems course xvi
Content of the AI computing systems course xvii
Writing of this book xx
Chapter 1 Introduction 1(16)
1.1 Artificial intelligence 1(8)
1.1.1 What is artificial intelligence? 1(1)
1.1.2 The history of AI 1(3)
1.1.3 Mainstreams in AI 4(5)
1.2 AI computing systems 9(4)
1.2.1 What are AI computing systems? 9(1)
1.2.2 The necessity of AICSs 9(1)
1.2.3 Trends in AICSs 10(3)
1.3 A driving example 13(2)
1.4 Summary 15(2)
Exercises 16(1)
Chapter 2 Fundamentals of neural networks 17(36)
2.1 From machine learning to neural networks 17(12)
2.1.1 Basic concepts 17(1)
2.1.2 Linear regression 18(4)
2.1.3 Perceptron 22(2)
2.1.4 Two-layer neural network: multilayer perceptron 24(2)
2.1.5 Deep neural networks (deep learning) 26(1)
2.1.6 The history of neural networks 27(2)
2.2 Neural network training 29(5)
2.2.1 Forward propagation 30(2)
2.2.2 Backward propagation 32(2)
2.3 Neural network design: the principle 34(8)
2.3.1 Network topology 34(1)
2.3.2 Activation function 35(4)
2.3.3 Loss function 39(3)
2.4 Overfitting and regularization 42(6)
2.4.1 Overfitting 42(1)
2.4.2 Regularization 43(5)
2.5 Cross-validation 48(2)
2.6 Summary 50(3)
Exercises 50(3)
Chapter 3 Deep learning 53(70)
3.1 Convolutional neural networks for image processing 53(12)
3.1.1 CNN components 55(1)
3.1.2 Convolutional layer 55(7)
3.1.3 Pooling layer 62(1)
3.1.4 Fully connected layer 63(1)
3.1.5 Softmax layer 63(1)
3.1.6 CNN architecture 63(2)
3.2 CNN-based classification algorithms 65(20)
3.2.1 AlexNet 66(4)
3.2.2 VGG 70(3)
3.2.3 Inception 73(8)
3.2.4 ResNet 81(4)
3.3 CNN-based object detection algorithms 85(16)
3.3.1 Evaluation metrics 85(3)
3.3.2 R-CNN series 88(7)
3.3.3 YOLO 95(3)
3.3.4 SSD 98(2)
3.3.5 Summary 100(1)
3.4 Sequence models: recurrent neural networks 101(8)
3.4.1 RNNs 101(5)
3.4.2 LSTM 106(2)
3.4.3 GRU 108(1)
3.4.4 Summary 109(1)
3.5 Generative adversarial networks 109(6)
3.5.1 GAN modeling 110(1)
3.5.2 Training in GAN 110(3)
3.5.3 The GAN framework 113(2)
3.6 Driving example 115(6)
3.6.1 CNN-based image style transfer 116(3)
3.6.2 Real-time style transfer 119(2)
3.7 Summary 121(2)
Exercises 121(2)
Chapter 4 Fundamentals of programming frameworks 123(44)
4.1 Necessities of programming frameworks 124(1)
4.2 Fundamentals of programming frameworks 124(2)
4.2.1 Generic programming frameworks 124(1)
4.2.2 TensorFlow basics 125(1)
4.3 TensorFlow: model and tutorial 126(16)
4.3.1 Computational graph 126(2)
4.3.2 Operations 128(1)
4.3.3 Tensors 129(3)
4.3.4 Tensor session 132(5)
4.3.5 Variable 137(3)
4.3.6 Placeholders 140(2)
4.3.7 Queue 142(1)
4.4 Deep learning inference in TensorFlow 142(6)
4.4.1 Load input 143(1)
4.4.2 Define the basic operations 144(2)
4.4.3 Create neural network models 146(2)
4.4.4 Output prediction 148(1)
4.5 Deep learning training in TensorFlow 148(15)
4.5.1 Data loading 148(6)
4.5.2 Training models 154(7)
4.5.3 Model checkpoint 161(2)
4.5.4 Image style transfer training 163(1)
4.6 Summary 163(4)
Exercises 164(3)
Chapter 5 Programming framework principles 167(40)
5.1 TensorFlow design principles 167(1)
5.1.1 High performance 167(1)
5.1.2 Easy development 168(1)
5.1.3 Portability 168(1)
5.2 TensorFlow computational graph mechanism 168(16)
5.2.1 Computational graph 169(8)
5.2.2 Local execution of a computational graph 177(6)
5.2.3 Distributed execution of computational graphs 183(1)
5.3 TensorFlow system implementation 184(14)
5.3.1 Overall architecture 184(2)
5.3.2 Computational graph execution module 186(2)
5.3.3 Device abstraction and management 188(4)
5.3.4 Network and communication 192(3)
5.3.5 Operator definition 195(3)
5.4 Programming framework comparison 198(6)
5.4.1 TensorFlow 199(4)
5.4.2 PyTorch 203(1)
5.4.3 MXNet 204(1)
5.4.4 Caffe 204(1)
5.5 Summary 204(3)
Exercises 205(2)
Chapter 6 Deep learning processors 207(40)
6.1 Deep learning processors (DLPs) 207(5)
6.1.1 The purpose of DLPs 207(1)
6.1.2 The development history of DLPs 208(3)
6.1.3 The design motivation 211(1)
6.2 Deep learning algorithm analysis 212(9)
6.2.1 Computational characteristics 212(4)
6.2.2 Memory access patterns 216(5)
6.3 DLP architecture 221(11)
6.3.1 Instruction set architecture 222(3)
6.3.2 Pipeline 225(2)
6.3.3 Computing unit 227(3)
6.3.4 Memory access unit 230(1)
6.3.5 Mapping from algorithm to chip 231(1)
6.3.6 Summary 232(1)
6.4 Optimization design 232(7)
6.4.1 Scalar MAC-based computing unit 233(2)
6.4.2 Sparsity 235(2)
6.4.3 Low bit-width 237(2)
6.5 Performance evaluation 239(3)
6.5.1 Performance metrics 239(1)
6.5.2 Benchmarking 240(1)
6.5.3 Factors affecting performance 241(1)
6.6 Other accelerators 242(2)
6.6.1 The GPU architecture 242(1)
6.6.2 The FPGA architecture 243(1)
6.6.3 Comparison of DLPs, GPU, and FPGA 244(1)
6.7 Summary 244(3)
Exercises 245(2)
Chapter 7 Architecture for AI computing systems 247(24)
7.1 Single-core deep learning processor 247(10)
7.1.1 Overall architecture 248(1)
7.1.2 Control module 249(4)
7.1.3 Arithmetic module 253(3)
7.1.4 Storage unit 256(1)
7.1.5 Summary of single-core deep learning processor 256(1)
7.2 The multicore deep learning processor 257(10)
7.2.1 The DLP-M architecture 258(1)
7.2.2 The cluster architecture 258(6)
7.2.3 Interconnection architecture 264(2)
7.2.4 Summary of multicore deep learning processors 266(1)
7.3 Summary 267(4)
Exercises 268(3)
Chapter 8 AI programming language for AI computing systems 271(108)
8.1 Necessity of AI programming language 271(9)
8.1.1 Semantic gap 272(1)
8.1.2 Hardware gap 273(5)
8.1.3 Platform gap 278(1)
8.1.4 Summary 278(2)
8.2 Abstraction of AI programming language 280(5)
8.2.1 Abstract hardware architecture 280(1)
8.2.2 Typical AI computing system 280(2)
8.2.3 Control model 282(1)
8.2.4 Computation model 283(1)
8.2.5 Memory model 283(2)
8.3 Programming models 285(10)
8.3.1 Heterogeneous programming model 285(5)
8.3.2 General AI programming model 290(5)
8.4 Fundamentals of AI programming language 295(9)
8.4.1 Syntax overview 295(2)
8.4.2 Datatype 297(2)
8.4.3 Macros, constants, and built-in variables 299(1)
8.4.4 I/O operation 299(1)
8.4.5 Scalar computation 300(1)
8.4.6 Tensor computation 301(1)
8.4.7 Control flow 302(1)
8.4.8 Serial program example 302(1)
8.4.9 Parallel program example 303(1)
8.5 Programming interface of AI applications 304(8)
8.5.1 Kernel function interface 305(2)
8.5.2 Runtime interface 307(2)
8.5.3 Usage example 309(3)
8.6 Debugging AI applications 312(19)
8.6.1 Functional debugging method 313(5)
8.6.2 Function debugging interface 318(3)
8.6.3 Function debugging tool 321(3)
8.6.4 Precision debugging method 324(1)
8.6.5 Function debugging practice 325(6)
8.7 Optimizing AI applications 331(15)
8.7.1 Performance tuning method 332(3)
8.7.2 Performance tuning interface 335(2)
8.7.3 Performance tuning tools 337(3)
8.7.4 Performance tuning practice 340(6)
8.8 System development on AI programming language 346(28)
8.8.1 High-performance library operator development 347(4)
8.8.2 Programming framework operator development 351(6)
8.8.3 System development and optimization practice 357(17)
8.9 Exercises 374(5)
Chapter 9 Practice: AI computing systems 379(18)
9.1 Basic practice: image style transfer 379(11)
9.1.1 Operator implementation based on AI programming language 379(3)
9.1.2 Implementation of image style transfer 382(4)
9.1.3 Image style transfer practice 386(4)
9.2 Advanced practice: object detection 390(4)
9.2.1 Operator implementation based on AI programming language 390(3)
9.2.2 Implementation of object detection 393(1)
9.3 Extended practices 394(3)
APPENDIX A Fundamentals of computer architecture 397(6)
A.1 The instruction set of general-purpose CPUs 397(2)
A.2 Memory hierarchy in computing systems 399(4)
A.2.1 Cache 399(2)
A.2.2 Scratchpad memory 401(2)
APPENDIX B Experimental environment 403(4)
B.1 Cloud platform 403(2)
B.1.1 Login 403(1)
B.1.2 Changing password 403(1)
B.1.3 Set up SSH client 404(1)
B.1.4 Unzipping the file package 404(1)
B.1.5 Setting environment variables 404(1)
B.2 Development board 405(2)
References 407(10)
Final words 417(4)
Index 421