|
|
1 | (16) |
|
1.1 The SoC-SDRAM Interface |
|
|
2 | (1) |
|
|
3 | (1) |
|
1.3 Cramming More Applications onto (Power-Constrained) SoCs |
|
|
4 | (2) |
|
|
6 | (3) |
|
1.4.1 Application Requirements |
|
|
6 | (1) |
|
|
7 | (1) |
|
1.4.3 Predictable Performance |
|
|
7 | (1) |
|
1.4.4 Composable Performance |
|
|
8 | (1) |
|
1.5 Requirements for SDRAM Controllers in Modern SoCs |
|
|
9 | (1) |
|
1.6 Problem Statement and Contributions |
|
|
10 | (3) |
|
1.6.1 Multi-generation Power-Aware Command Scheduling |
|
|
11 | (1) |
|
1.6.2 Improving Average-Case Performance Without Affecting Worst-Case Performance |
|
|
12 | (1) |
|
1.6.3 Reconfigurable Architecture |
|
|
12 | (1) |
|
|
13 | (4) |
|
|
14 | (3) |
|
2 Reconfigurable Real-Time Memory Controller Architecture |
|
|
17 | (40) |
|
|
18 | (6) |
|
|
19 | (3) |
|
2.1.2 Timings and Timing Constraints |
|
|
22 | (1) |
|
|
22 | (1) |
|
|
23 | (1) |
|
2.2 Pattern-Based SDRAM Controllers |
|
|
24 | (3) |
|
|
25 | (2) |
|
2.3 Controller Architecture |
|
|
27 | (8) |
|
|
28 | (3) |
|
|
31 | (3) |
|
|
34 | (1) |
|
2.3.4 Reconfiguration Infrastructure |
|
|
34 | (1) |
|
2.4 Worst-Case Performance Analysis |
|
|
35 | (12) |
|
2.4.1 Latency-Rate Servers |
|
|
35 | (1) |
|
2.4.2 Back-End Performance |
|
|
36 | (9) |
|
2.4.3 Front-End Performance |
|
|
45 | (1) |
|
2.4.4 Worst-Case Response Times |
|
|
46 | (1) |
|
2.5 CompSoC Controller Instance |
|
|
47 | (2) |
|
|
49 | (4) |
|
|
49 | (1) |
|
|
50 | (3) |
|
|
53 | (4) |
|
|
54 | (3) |
|
|
57 | (36) |
|
3.1 Generalized Command Scheduling Rules |
|
|
58 | (2) |
|
|
60 | (14) |
|
3.2.1 Pattern Generation with Variable Bank Interleaving |
|
|
63 | (4) |
|
3.2.2 BS PBGI Heuristic for DDR4 Pattern Generation |
|
|
67 | (2) |
|
|
69 | (1) |
|
3.2.4 ILP-Based Pattern Generation |
|
|
69 | (3) |
|
3.2.5 Memory Map Implications |
|
|
72 | (2) |
|
3.3 Composable Pattern Conversion |
|
|
74 | (4) |
|
3.3.1 Composable Memory Pattern Generation |
|
|
74 | (3) |
|
3.3.2 Impact on Memory Efficiency |
|
|
77 | (1) |
|
|
78 | (11) |
|
|
78 | (1) |
|
3.4.2 Evaluation of Pattern-Generation Heuristics |
|
|
79 | (3) |
|
3.4.3 Composable Patterns |
|
|
82 | (7) |
|
|
89 | (4) |
|
|
90 | (3) |
|
4 Cycle-Accurate SDRAM Power Modeling |
|
|
93 | (18) |
|
4.1 High-Level Description of the DRAMPower Model |
|
|
94 | (1) |
|
4.2 Background on SDRAM Currents |
|
|
94 | (2) |
|
4.3 SDRAM Power State Machine |
|
|
96 | (1) |
|
4.4 Determining the Energy Cost of a Command |
|
|
97 | (3) |
|
4.4.1 ACT, PRE, and PREA Commands |
|
|
98 | (1) |
|
|
99 | (1) |
|
|
100 | (1) |
|
4.5 Adaptation to LPDDR and Wide I/O Memories |
|
|
100 | (1) |
|
4.6 Trace-Level Energy and Power Calculation in DRAMPower |
|
|
101 | (1) |
|
|
102 | (3) |
|
|
103 | (1) |
|
|
104 | (1) |
|
|
105 | (3) |
|
|
105 | (1) |
|
|
106 | (2) |
|
|
108 | (3) |
|
|
108 | (3) |
|
5 Power/Performance Trade-Offs |
|
|
111 | (14) |
|
5.1 Worst-Case Bandwidth, Energy, and Power Metrics |
|
|
111 | (2) |
|
5.1.1 Calculating Worst-Case Power and Energy Efficiency |
|
|
112 | (1) |
|
5.2 Worst-Case Bandwidth/Power Trends |
|
|
113 | (6) |
|
5.2.1 Comparing Pattern Configurations of a Single Memory Device |
|
|
116 | (1) |
|
5.2.2 Comparing Multiple Speed Bins and SDRAM Types |
|
|
117 | (2) |
|
5.3 Worst-Case Response Time of an Atom |
|
|
119 | (2) |
|
|
121 | (2) |
|
|
123 | (2) |
|
|
123 | (2) |
|
6 Conservative Open-Page Policy |
|
|
125 | (20) |
|
6.1 Conservative Open-Page Policy |
|
|
126 | (3) |
|
6.2 Impact on Pattern-Based Controller |
|
|
129 | (2) |
|
6.3 Using Explicit Precharge Commands |
|
|
131 | (3) |
|
|
134 | (10) |
|
|
134 | (2) |
|
6.4.2 Stall Time Reduction |
|
|
136 | (8) |
|
|
144 | (1) |
|
|
144 | (1) |
|
|
145 | (22) |
|
7.1 Reconfiguration Options |
|
|
146 | (2) |
|
7.2 Performance Guarantees During a Use-Case Switch |
|
|
148 | (1) |
|
7.3 Delay Block/Arbiter Reconfiguration with Persistent Clients |
|
|
149 | (1) |
|
7.4 Reconfigurable TDM Arbiter |
|
|
150 | (9) |
|
7.4.1 Latency-Rate Parameters for TDM Arbiters |
|
|
151 | (1) |
|
7.4.2 Safe TDM Arbiter Reconfiguration Protocol |
|
|
152 | (1) |
|
7.4.3 Arbiter Architecture |
|
|
153 | (1) |
|
7.4.4 Latency-Rate Guarantees During Reconfiguration |
|
|
154 | (5) |
|
|
159 | (5) |
|
7.5.1 Predictable Performance During Reconfiguration |
|
|
160 | (2) |
|
7.5.2 Composable Performance During Reconfiguration |
|
|
162 | (2) |
|
|
164 | (3) |
|
|
165 | (2) |
|
|
167 | (16) |
|
|
167 | (11) |
|
8.1.1 Average-Case-Oriented Controllers |
|
|
167 | (1) |
|
8.1.2 Real-Time-Oriented Controllers |
|
|
168 | (10) |
|
8.2 SDRAM Performance Overviews |
|
|
178 | (1) |
|
|
179 | (4) |
|
|
180 | (3) |
|
9 Conclusions and Future Work |
|
|
183 | (6) |
|
|
183 | (3) |
|
|
186 | (3) |
|
|
187 | (2) |
Appendix A ILP Problem Formulation |
|
189 | (8) |
Appendix B Memory Specifications |
|
197 | (4) |
Appendix C Code Listings |
|
201 | |