
ReRAM-based Machine Learning [Hardback]

Leibin Ni (Huawei Technologies, Shenzhen, China), Sai Manoj Pudukotai Dinakarrao (George Mason University (GMU), Department of Electrical and Computer Engineering, USA), Hao Yu (Southern University of Science and Technology (SUSTech), School of Microelectronics, China)
  • Format: Hardback, 261 pages, height x width: 234x156 mm
  • Series: Computing and Networks
  • Publication date: 30-Apr-2021
  • Publisher: Institution of Engineering and Technology
  • ISBN-10: 1839530812
  • ISBN-13: 9781839530814

The transition towards exascale computing has brought major transformations in computing paradigms. The need to analyze and respond to such large data sets has driven the adoption of machine learning (ML) and deep learning (DL) methods across a wide range of applications.

One of the major challenges is fetching data from memory and writing results back without running into the memory-wall bottleneck. To address this concern, in-memory computing (IMC) and supporting frameworks have been introduced. IMC methods offer ultra-low-power operation and high-density embedded storage. Resistive random-access memory (ReRAM) appears to be the most promising IMC technology owing to its minimal leakage power, low power consumption and small hardware footprint, as well as its compatibility with the CMOS processes widely used in industry.
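The core IMC idea behind a ReRAM crossbar can be illustrated numerically: each cell's conductance encodes a matrix weight, input voltages drive the rows, and the column currents sum according to Ohm's and Kirchhoff's laws, so a matrix-vector product happens in place rather than by shuttling operands to a processor. The sketch below is a minimal idealized model (no wire resistance, device variation or ADC quantization); the array sizes and voltage values are illustrative assumptions, not figures from the book.

```python
import numpy as np

# Idealized 4x3 ReRAM crossbar: G[i, j] is the conductance (siemens)
# programmed into the cell at word line i, bit line j.
rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # cell conductances
v = np.array([0.2, 0.1, 0.3, 0.05])        # word-line input voltages (V)

# Each column (bit line) sums its cell currents: i_j = sum_i G[i, j] * v[i].
# The whole analog array therefore evaluates y = G^T v in one step.
i_out = G.T @ v

# Cell-by-cell summation, to make the physics explicit.
i_check = np.array([sum(G[i, j] * v[i] for i in range(4)) for j in range(3)])
assert np.allclose(i_out, i_check)
print(i_out)  # three bit-line currents, one per output column
```

In a real accelerator the conductances would store ML weights, and DAC/ADC circuits would convert between digital activations and the analog voltages and currents, which is why the memory array itself performs the dominant compute of inference.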

In this book, the authors introduce ReRAM techniques for performing distributed computing with IMC accelerators, present ReRAM-based IMC architectures that can carry out the computations of ML and other data-intensive applications, and describe strategies for mapping ML designs onto hardware accelerators.

The book serves as a bridge between researchers in the computing domain (algorithm designers for ML and DL) and computing hardware designers.




Acronyms
Preface
About the authors

Part I: Introduction
1 Introduction
  1.1 Introduction
    1.1.1 Memory wall and power wall
    1.1.2 Semiconductor memory
    1.1.3 Nonvolatile IMC architecture
  1.2 Challenges and contributions
  1.3 Book organization
2 The need of in-memory computing
  2.1 Introduction
  2.2 Neuromorphic computing devices
    2.2.1 Resistive random-access memory
    2.2.2 Spin-transfer-torque magnetic random-access memory
    2.2.3 Phase change memory
  2.3 Characteristics of NVM devices for neuromorphic computing
  2.4 IMC architectures for machine learning
    2.4.1 Operating principles of IMC architectures
    2.4.2 Analog and digitized fashion of IMC
    2.4.3 Analog IMC
    2.4.4 Digitized IMC
    2.4.5 Literature review of IMC
  2.5 Analysis of IMC architectures
3 The background of ReRAM devices
  3.1 ReRAM device and SPICE model
    3.1.1 Drift-type ReRAM device
    3.1.2 Diffusive-type ReRAM device
  3.2 ReRAM-crossbar structure
    3.2.1 Analog and digitized ReRAM crossbar
    3.2.2 Connection of ReRAM crossbar
  3.3 ReRAM-based oscillator
  3.4 Write-in scheme for multibit ReRAM storage
    3.4.1 ReRAM data storage
    3.4.2 Multi-threshold resistance for data storage
    3.4.3 Write and read
    3.4.4 Validation
    3.4.5 Encoding and 3-bit storage
  3.5 Logic functional units with ReRAM
    3.5.1 OR gate
    3.5.2 AND gate
  3.6 ReRAM for logic operations
    3.6.1 Simulation settings
    3.6.2 ReRAM-based circuits
    3.6.3 ReRAM as a computational unit-cum-memory

Part II: Machine learning accelerators
4 The background of machine learning algorithms
  4.1 SVM-based machine learning
  4.2 Single-layer feedforward neural network-based machine learning
    4.2.1 Single-layer feedforward network
    4.2.2 L2-norm-gradient-based learning
  4.3 DCNN-based machine learning
    4.3.1 Deep learning for multilayer neural network
    4.3.2 Convolutional neural network
    4.3.3 Binary convolutional neural network
  4.4 TNN-based machine learning
    4.4.1 Tensor-train decomposition and compression
    4.4.2 Tensor-train-based neural network
    4.4.3 Training TNN
5 XIMA: the in-ReRAM machine learning architecture
  5.1 ReRAM network-based ML operations
    5.1.1 ReRAM-crossbar network
    5.1.2 Coupled ReRAM oscillator network
  5.2 ReRAM network-based in-memory ML accelerator
    5.2.1 Distributed ReRAM-crossbar in-memory architecture
    5.2.2 3D XIMA
6 The mapping of machine learning algorithms on XIMA
  6.1 Machine learning algorithms on XIMA
    6.1.1 SLFN-based learning and inference acceleration
    6.1.2 BCNN-based inference acceleration on passive array
    6.1.3 BCNN-based inference acceleration on 1S1R array
    6.1.4 L2-norm gradient-based learning and inference acceleration
    6.1.5 Experimental evaluation of machine learning algorithms on XIMA architecture
  6.2 Machine learning algorithms on 3D XIMA
    6.2.1 On-chip design for SLFN
    6.2.2 On-chip design for TNNs
    6.2.3 Experimental evaluation of machine learning algorithms on 3D CMOS-ReRAM

Part III: Case studies
7 Large-scale case study: accelerator for ResNet
  7.1 Introduction
  7.2 Deep neural network with quantization
    7.2.1 Basics of ResNet
    7.2.2 Quantized convolution and residual block
    7.2.3 Quantized BN
    7.2.4 Quantized activation function and pooling
    7.2.5 Quantized deep neural network overview
    7.2.6 Training strategy
  7.3 Device for in-memory computing
    7.3.1 ReRAM crossbar
    7.3.2 Customized DAC and ADC circuits
    7.3.3 In-memory computing architecture
  7.4 Quantized ResNet on ReRAM crossbar
    7.4.1 Mapping strategy
    7.4.2 Overall architecture
  7.5 Experiment result
    7.5.1 Experiment settings
    7.5.2 Device simulations
    7.5.3 Accuracy analysis
    7.5.4 Performance analysis
8 Large-scale case study: accelerator for compressive sensing
  8.1 Introduction
  8.2 Background
    8.2.1 Compressive sensing and isometric distortion
    8.2.2 Optimized near-isometric embedding
  8.3 Boolean embedding for signal acquisition front end
    8.3.1 CMOS-based Boolean embedding circuit
    8.3.2 ReRAM crossbar-based Boolean embedding circuit
    8.3.3 Problem formulation
  8.4 IH algorithm
    8.4.1 Orthogonal rotation
    8.4.2 Quantization
    8.4.3 Overall optimization algorithm
  8.5 Row generation algorithm
    8.5.1 Elimination of norm equality constraint
    8.5.2 Convex relaxation of orthogonal constraint
    8.5.3 Overall optimization algorithm
  8.6 Numerical results
    8.6.1 Experiment setup
    8.6.2 IH algorithm on high-D ECG signals
    8.6.3 Row generation algorithm on low-D image patches
    8.6.4 Hardware performance evaluation
9 Conclusions: wrap-up, open questions and challenges
  9.1 Conclusion
  9.2 Future work
References
Index
Hao Yu is a professor in the School of Microelectronics at the Southern University of Science and Technology (SUSTech), China. His main research interests cover energy-efficient IC design and mm-wave IC design. He is a senior member of IEEE and a member of ACM, has written several books and holds 20 granted patents. He is a distinguished lecturer of the IEEE Circuits and Systems Society and an associate editor of Elsevier Integration, the VLSI Journal, Elsevier Microelectronics Journal, Nature Scientific Reports, ACM Transactions on Embedded Computing Systems and IEEE Transactions on Biomedical Circuits and Systems. He is also a technical program committee member of several IC conferences, including IEEE CICC, BioCAS, A-SSCC, ACM DAC, DATE and ICCAD. He obtained his Ph.D. degree from the EE Department at UCLA, USA.



Leibin Ni is a Principal Engineer at Huawei Technologies, Shenzhen, China. His research interests include emerging nonvolatile memory platforms, in-memory computing architectures, machine learning applications and low-power design. He is a member of IEEE. He received his Ph.D. from Nanyang Technological University, Singapore.



Sai Manoj Pudukotai Dinakarrao is an assistant professor in the Department of Electrical and Computer Engineering at George Mason University (GMU), USA. His current research interests include hardware security, adversarial machine learning, Internet-of-Things networks, deep learning in resource-constrained environments, in-memory computing, accelerator design, algorithms, the design of self-aware many-core microprocessors and resource management in many-core microprocessors. He is a member of IEEE and ACM. He has served as a guest editor for IEEE Design & Test magazine and as a reviewer for multiple IEEE and ACM journals, and he is a technical program committee member of several CAD conferences, including ACM DAC, DATE, ICCAD, ASP-DAC, ESWEEK and others. He received his Ph.D. degree in Electrical and Electronic Engineering from Nanyang Technological University, Singapore.