Preface |
|
xi | |
|
|
1 | (184) |
|
1 Introduction to Parallel Processing |
|
|
3 | (28) |
|
|
4 | (3) |
|
|
5 | (1) |
|
|
6 | (1) |
|
|
7 | (1) |
|
1.2 Classification of Parallel Systems |
|
|
7 | (7) |
|
1.2.1 Parallel versus Distributed Systems |
|
|
13 | (1) |
|
1.3 The Relationship of Tasks and Data |
|
|
14 | (3) |
|
1.3.1 Inherent Difficulties |
|
|
15 | (1) |
|
|
16 | (1) |
|
|
16 | (1) |
|
1.4 Evaluating Parallel Implementations |
|
|
17 | (14) |
|
1.4.1 Realization Penalties |
|
|
18 | (2) |
|
1.4.2 Performance Metrics |
|
|
20 | (5) |
|
|
25 | (6) |
|
2 Task Scheduling and Data Management |
|
|
31 | (58) |
|
2.1 Problem Decomposition |
|
|
31 | (5) |
|
2.1.1 Algorithmic Decomposition |
|
|
32 | (1) |
|
2.1.2 Domain Decomposition |
|
|
32 | (2) |
|
2.1.3 Abstract Definition of a Task |
|
|
34 | (1) |
|
2.1.4 System Architecture |
|
|
34 | (2) |
|
|
36 | (10) |
|
|
36 | (5) |
|
2.2.2 Demand Driven Model |
|
|
41 | (5) |
|
2.2.3 Hybrid Computational Model |
|
|
46 | (1) |
|
|
46 | (7) |
|
2.3.1 Task Definition and Granularity |
|
|
46 | (2) |
|
2.3.2 Task Distribution and Control |
|
|
48 | (1) |
|
2.3.3 Algorithmic Dependencies |
|
|
49 | (4) |
|
2.4 Task Scheduling Strategies |
|
|
53 | (12) |
|
2.4.1 Data Driven Task Management Strategies |
|
|
53 | (1) |
|
2.4.2 Demand Driven Task Management Strategies |
|
|
54 | (5) |
|
2.4.3 Task Manager Process |
|
|
59 | (3) |
|
2.4.4 Distributed Task Management |
|
|
62 | (2) |
|
2.4.5 Preferred Bias Task Allocation |
|
|
64 | (1) |
|
|
65 | (24) |
|
2.5.1 World Model of the Data: No Data Management Required |
|
|
66 | (1) |
|
2.5.2 Virtual Shared Memory |
|
|
67 | (2) |
|
|
69 | (7) |
|
|
76 | (4) |
|
2.5.5 Minimizing the Impact of Remote Data Requests |
|
|
80 | (5) |
|
2.5.6 Data Management for Multistage Problems |
|
|
85 | (4) |
|
3 Parallel Global Illumination Algorithms |
|
|
89 | (44) |
|
|
90 | (2) |
|
|
92 | (2) |
|
|
94 | (2) |
|
|
96 | (13) |
|
3.4.1 Parallel Ray Tracing |
|
|
99 | (1) |
|
3.4.2 Demand Driven Ray Tracing |
|
|
99 | (5) |
|
3.4.3 Data Parallel Ray Tracing |
|
|
104 | (3) |
|
|
107 | (2) |
|
|
109 | (3) |
|
|
110 | (1) |
|
|
111 | (1) |
|
3.6 Full Matrix Radiosity |
|
|
112 | (4) |
|
3.6.1 Setting Up the Matrix of Form Factors |
|
|
113 | (2) |
|
3.6.2 Solving the Matrix of Form Factors |
|
|
115 | (1) |
|
3.6.3 Group Iterative Methods |
|
|
115 | (1) |
|
3.7 Progressive Refinement |
|
|
116 | (4) |
|
|
118 | (2) |
|
3.8 Hierarchical Radiosity |
|
|
120 | (2) |
|
3.8.1 Parallel Hierarchical Radiosity |
|
|
121 | (1) |
|
|
122 | (3) |
|
3.9.1 Parallel Particle Tracing |
|
|
123 | (1) |
|
|
124 | (1) |
|
3.10 Data Distribution and Data Locality |
|
|
125 | (6) |
|
|
126 | (1) |
|
3.10.2 Visibility Preprocessing |
|
|
127 | (1) |
|
3.10.3 Environment Mapping |
|
|
128 | (1) |
|
3.10.4 Geometric Simplification |
|
|
128 | (2) |
|
3.10.5 Directional Caching |
|
|
130 | (1) |
|
3.10.6 Reordering Computations |
|
|
130 | (1) |
|
|
131 | (2) |
|
4 Overview of Parallel Graphics Hardware |
|
|
133 | (20) |
|
|
133 | (3) |
|
4.2 Parallelism in Graphics Cards |
|
|
136 | (15) |
|
|
136 | (3) |
|
4.2.2 Hewlett-Packard Products |
|
|
139 | (3) |
|
4.2.3 SGI Products (Silicon Graphics, Inc.) |
|
|
142 | (4) |
|
|
146 | (3) |
|
4.2.5 Pomegranate Graphics Chip |
|
|
149 | (2) |
|
|
151 | (2) |
|
5 Coherence in Ray Tracing |
|
|
153 | (32) |
|
|
154 | (11) |
|
5.1.1 Distribution of Data Accesses |
|
|
155 | (3) |
|
5.1.2 Temporal Characteristics |
|
|
158 | (4) |
|
5.1.3 Temporal Behaviour per Ray Type |
|
|
162 | (2) |
|
|
164 | (1) |
|
|
165 | (20) |
|
|
166 | (1) |
|
|
167 | (2) |
|
5.2.3 Frame Coherence Algorithm |
|
|
169 | (5) |
|
5.2.4 Parallel Frame Coherence Algorithm |
|
|
174 | (2) |
|
|
176 | (8) |
|
|
184 | (1) |
|
|
185 | (152) |
|
6 Interactive Ray Tracing on a Supercomputer |
|
|
187 | (30) |
|
|
188 | (6) |
|
6.1.1 Conventional Operation |
|
|
189 | (3) |
|
6.1.2 Frameless Rendering |
|
|
192 | (1) |
|
|
193 | (1) |
|
6.2 Ray Tracing for Volume Visualization |
|
|
194 | (20) |
|
|
195 | (2) |
|
6.2.2 Traversal Optimizations |
|
|
197 | (4) |
|
|
201 | (4) |
|
|
205 | (6) |
|
|
211 | (3) |
|
6.3 Ray Tracing for Terrain Visualization |
|
|
214 | (1) |
|
|
215 | (2) |
|
7 Interactive Ray Tracing on PCs |
|
|
217 | (32) |
|
|
217 | (4) |
|
|
220 | (1) |
|
7.2 An Optimized Ray Tracing Implementation |
|
|
221 | (2) |
|
|
221 | (1) |
|
|
222 | (1) |
|
7.2.3 Coherence through Packets of Rays |
|
|
223 | (1) |
|
7.2.4 Parallelism through SIMD Extensions |
|
|
223 | (1) |
|
7.3 Ray Triangle Intersection Computation |
|
|
223 | (3) |
|
7.3.1 Optimized Barycentric Coordinate Test |
|
|
223 | (1) |
|
7.3.2 Evaluating Instruction Level Parallelism |
|
|
224 | (1) |
|
7.3.3 SIMD Barycentric Coordinate Test |
|
|
224 | (2) |
|
|
226 | (3) |
|
7.4.1 Traversal Algorithm |
|
|
226 | (2) |
|
7.4.2 Memory Layout for Better Caching |
|
|
228 | (1) |
|
|
229 | (1) |
|
|
229 | (2) |
|
7.6 Performance of the Ray Tracing Engine |
|
|
231 | (5) |
|
7.6.1 Comparison to Other Ray Tracers |
|
|
231 | (2) |
|
7.6.2 Reflection and Shadow Rays |
|
|
233 | (1) |
|
7.6.3 Comparison with Rasterization Hardware |
|
|
234 | (2) |
|
7.7 Interactive Ray Tracing on PC Clusters |
|
|
236 | (3) |
|
|
238 | (1) |
|
7.8 Distributed Data Management |
|
|
239 | (2) |
|
7.8.1 Explicit Data Management |
|
|
239 | (2) |
|
|
241 | (1) |
|
|
241 | (1) |
|
|
242 | (1) |
|
|
243 | (3) |
|
|
246 | (3) |
|
8 The "Kilauea" Massively Parallel Ray Tracer |
|
|
249 | (80) |
|
8.1 What Is the Kilauea Project? |
|
|
249 | (1) |
|
|
250 | (1) |
|
|
251 | (5) |
|
8.3.1 Hardware Environment |
|
|
251 | (1) |
|
|
252 | (1) |
|
|
252 | (1) |
|
|
253 | (1) |
|
|
253 | (1) |
|
8.3.6 Single Executable Binary |
|
|
254 | (1) |
|
8.3.7 Multiframe Rendering |
|
|
254 | (1) |
|
8.3.8 Global Illumination Renderer |
|
|
255 | (1) |
|
8.4 The ShotData File Format |
|
|
256 | (5) |
|
|
261 | (7) |
|
|
268 | (48) |
|
8.6.1 Low-Level Data Structure |
|
|
268 | (9) |
|
8.6.2 MPI (Message Passing Interface) Layer |
|
|
277 | (2) |
|
8.6.3 Tel Command Interface |
|
|
279 | (2) |
|
|
281 | (7) |
|
8.6.5 Details of Ray Tracing |
|
|
288 | (3) |
|
8.6.6 Shading Computation |
|
|
291 | (15) |
|
|
306 | (4) |
|
8.6.8 Things to Note in Shading Computation |
|
|
310 | (4) |
|
8.6.9 Development in General |
|
|
314 | (2) |
|
|
316 | (7) |
|
|
316 | (3) |
|
|
319 | (2) |
|
|
321 | (1) |
|
8.7.4 Consideration of Rendering Results |
|
|
322 | (1) |
|
|
323 | (2) |
|
8.9 Future Plans and Tasks |
|
|
325 | (4) |
|
9 Parallel Ray Tracing on a Chip |
|
|
329 | (8) |
|
9.1 The Smart Memories Chip |
|
|
329 | (2) |
|
|
331 | (2) |
|
|
333 | (3) |
|
|
334 | (1) |
|
9.3.2 Estimated Performance |
|
|
335 | (1) |
|
|
336 | (1) |
Bibliography |
|
337 | (26) |
Index |
|
363 | (6) |
Author Biographies |
|
369 | |