E-book: Multithreading for Visual Effects

  • Format: 255 pages
  • Publication date: 29-Jul-2014
  • Publisher: Apple Academic Press Inc.
  • Language: English
  • ISBN-13: 9781482243574
  • Format: PDF+DRM
  • Price: 81,89 €*
  • * The price is final, i.e. no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, so you must install dedicated software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

The team of software engineers and researchers was assembled in 2013 to offer a course on the use of multithreading techniques in visual effects, and decided to continue their collaboration in book form. Each taking a chapter, they cover Houdini: multithreading existing software; the Presto Execution System: designing for multithreading; LibEE: parallel evaluation of character rigs; simulating fluids on the CPU; simulating Bullet physics with OpenCL; and OpenSubdiv: interoperating GPU compute and drawing. Annotation ©2014 Ringgold, Inc., Portland, OR (protoview.com)

Tackle the Challenges of Parallel Programming in the Visual Effects Industry

In Multithreading for Visual Effects, developers from DreamWorks Animation, Pixar, Side Effects, Intel, and AMD share their successes and failures in the messy real-world application area of production software. They provide practical advice on multithreading techniques and visual effects used in popular visual effects libraries (such as Bullet, OpenVDB, and OpenSubdiv), one of the industry’s leading visual effects packages (Houdini), and proprietary animation systems. This information is valuable not just to those in the visual effects arena, but also to developers of high performance software looking to increase performance of their code.

Diverse Solutions to Solve Performance Problems

After an introductory chapter, each subsequent chapter presents a case study that illustrates how the authors used multithreading techniques to achieve better performance. The authors discuss the problems that occurred and explain how they solved them. The case studies encompass solutions for shaving milliseconds, solutions for optimizing longer running tasks, multithreading techniques for modern CPU architectures, and massive parallelism using GPUs. Some of the case studies include open source projects so you can try out these techniques for yourself and see how well they work.

Reviews

"Multithreading applications is hard, but for today's performance-critical codes, an absolute necessity. This book shows how the latest parallel programming technology can simplify the daunting challenge of producing fast and reliable software for multicore processors. Although the instructive case studies are drawn from visual effects applications, the authors cover the gamut of issues that developers face when parallelizing legacy applications from any domain." Charles Leiserson, MIT Computer Science and Artificial Intelligence Laboratory

"Multithreading graphics algorithms is a new and exciting area of research. It is crucial to computer graphics. This book will prove invaluable to researchers and practitioners alike. It will have a strong impact on movie visual effects and games." Jos Stam, Senior Principal Research Scientist, Autodesk, Inc.

"Visual effects programming is undergoing a renaissance as high-end videogame effects technology approaches the state-of-the-art defined by blockbuster Hollywood movies, empowered by the capabilities of multi-Teraflop GPU hardware. A wealth of graphics algorithms are now graduating into the realm of real-time rendering, yet today's programmers face a formidable challenge in structuring these algorithms to take full advantage of today's multi-core CPU architectures and deliver on their potential. This book, the collaborative result of many industry luminaries, wonderfully bridges the gap between the theory of multithreading and the practice of multithreading in advanced graphical applications. Join them on this journey to bring real-time visual effects technology to the next level!" Tim Sweeney, CEO and Founder of Epic Games

"valuable not just to those in the visual effects arena, but also to developers of high performance software looking to increase performance of their code." Scott R. Garrigus, NewTechReview

Preface
Acknowledgments
Authors
List of Figures
List of Tables
1 Introduction and Overview
James Reinders
1.1 Introduction
1.2 Overview of Case Studies
1.3 Motivation
1.3.1 Quickly Increasing Clock Speeds Ended by 2005
1.3.2 The Move to Multicore
1.3.3 SIMD Is Parallelism Too
1.3.4 Highly Threaded Hardware
1.4 Program in Tasks, Not Threads
1.5 Value of Abstraction
1.6 Scaling and Vectorization
1.7 Advancing Programming Languages for Parallel Programming
1.7.1 Abstraction
1.7.2 Parallel Programming Needs
1.7.3 Relaxed Sequential Semantics
1.8 Parallel Programming in C and C++
1.8.1 Brief Survey of Key Parallelism Options
1.8.1.1 TBB
1.8.1.2 Cilk Plus
1.8.1.3 OpenMP
1.8.1.4 OpenCL
1.8.1.5 GPU Specific Models
1.8.2 More on TBB: Intel Threading Building Blocks
1.8.2.1 Parallel for: parallel_for()
1.8.2.2 Parallel Reductions: parallel_reduce
1.8.2.3 Parallel Invocation of Functions: parallel_invoke
1.8.2.4 Learning More about TBB
1.9 Data Movement and Layout
1.10 Summary
1.11 Additional Reading
2 Houdini: Multithreading Existing Software
Jeff Lait
2.1 What Is Houdini?
2.2 Rewrite or Refactor
2.2.1 Cleaning Statics
2.2.2 Threading the Simple Cases
2.3 Patterns
2.3.1 Always Be Reentrant
2.3.2 Never Lock
2.3.3 Atomics Are Slow
2.3.4 Never Blindly Thread
2.3.5 Command Line Control
2.3.6 Constant Memory versus Number of Cores
2.3.7 Memory Allocation
2.4 Copy on Write
2.4.1 Const Correctness
2.4.2 Reader/Writer Locks
2.4.3 Ownership Is Important
2.4.4 Sole Ownership Is a Writer Lock
2.4.5 Failure Modes of This System
2.5 Dependencies
2.5.1 Task Locks
2.5.2 Mantra
2.6 OpenCL
3 The Presto Execution System: Designing for Multithreading
George ElKoura
3.1 Introduction
3.1.1 A Note about Interactivity
3.2 Presto
3.2.1 Presto Objects
3.2.2 Rigging in Presto
3.2.3 Animation in Presto
3.3 Presto's Execution System
3.3.1 Phases of Execution
3.3.1.1 Compilation
3.3.1.2 Scheduling
3.3.1.3 Evaluation
3.3.2 Engine Architecture
3.3.2.1 Network
3.3.2.2 Schedulers
3.3.2.3 Data Managers
3.3.2.4 Executors
3.3.2.5 Engine Architecture and Multithreading
3.4 User Extensions
3.4.1 Dependencies Declared a Priori
3.4.2 Client Callbacks Are Static Functions
3.4.3 Presto Singletons Are Protected
3.4.4 Iterators
3.4.5 And Then There's Python
3.4.5.1 Global Interpreter Lock
3.4.5.2 Performance
3.5 Memory Access Patterns
3.6 Flexibility to Experiment
3.6.1 Modular Design
3.6.2 Targeting Other Platforms
3.7 Multithreading Strategies
3.7.1 Per-Node Multithreading
3.7.2 Per-Branch Multithreading
3.7.3 Per-Model Multithreading
3.7.4 Per-Frame Multithreading
3.8 Background Execution
3.8.1 User Interaction
3.8.2 Frame Scheduling
3.8.3 Interruption
3.8.4 Constant Data
3.8.5 Problematic Data Structures
3.9 Other Multithreading Strategies
3.9.1 Strip Mining
3.9.2 Predictive Computations
3.10 Debugging and Profiling Tools
3.11 Summary
4 LibEE: Parallel Evaluation of Character Rigs
Martin Watt
4.1 Introduction
4.2 Motivation
4.3 Specific Requirements for Character Animation
4.3.1 Animation Graph Goals
4.3.2 Animation Graph Features
4.3.2.1 Few Unique Traversed Paths through Graph
4.3.2.2 Animation Rigs Have Implicit Parallelism
4.3.2.3 Expensive Nodes Which Can Be Internally Parallel
4.3.3 Animation Graph Constraints
4.3.3.1 No Graph Editing
4.3.3.2 No Scripting Languages in Operators
4.4 Graph
4.4.1 Threading Engine
4.4.2 Graph Evaluation Mechanism
4.5 Threadsafety
4.5.1 Node Threadsafety
4.5.1.1 API Layer
4.5.1.2 Parallel Unit Tests
4.5.1.3 Threading Checker Tools
4.5.1.4 Compiler Flags
4.5.1.5 LD_PRELOAD
4.5.1.6 The Kill Switch
4.5.2 Graph Threadsafety
4.6 Scalability: Software Considerations
4.6.1 Authoring Parallel Loops
4.6.2 Overthreading
4.6.3 Threading Fatigue
4.6.4 Thread-Friendly Memory Allocators
4.6.5 Oversubscription Due to Multiple Threading Models
4.6.6 Cache Reuse: Chains of Nodes
4.6.7 Cache Reuse: Scheduling Nodes to Maximize Sharing
4.6.8 Task Priorities
4.6.9 Graph Partitioning
4.6.10 Other Processes Running on System
4.6.11 The Memory Wall
4.6.12 Failed Approaches Discussion
4.7 Scalability: Hardware Considerations
4.7.1 CPU Power Modes
4.7.2 Turbo Clock
4.7.3 NUMA
4.7.4 Hyperthreading
4.7.5 CPU Affinity
4.7.6 Many-Core Architectures
4.8 Production Considerations
4.8.1 Character Systems Restructure
4.8.2 No More Scripted Nodes
4.8.3 Optimizing for Maximum Parallelism
4.9 Threading Visualization Tool
4.10 Rig Optimization Case Studies
4.10.1 Case Study 1: Quadruped Critical Path Optimization
4.10.2 Case Study 2: Hair Solver
4.10.3 Case Study 3: Free Clothes!
4.11 Overall Performance Results
4.12 Limits of Scalability
4.13 Summary
5 Fluids: Simulation on the CPU
Ronald Henderson
5.1 Motivation
5.2 Programming Models
5.2.1 Everything You Need to Get Started
5.2.2 Example: Over
5.2.3 Example: Dot Product
5.2.4 Example: Maximum Absolute Value
5.2.5 Platform Considerations
5.2.6 Performance
5.3 Fluid Simulation
5.3.1 Data Structures
5.3.2 Smoke, Fire, and Explosions
5.3.2.1 Advection Solvers
5.3.2.2 Elliptic Solvers
5.3.3 Liquids
5.3.3.1 Parallel Point Rasterization
5.4 Summary
6 Bullet Physics: Simulation with OpenCL
Erwin Coumans
6.1 Introduction
6.1.1 Rigid Body Dynamics Simulation
6.1.2 Refactoring before the Full Rewrite
6.2 Rewriting from Scratch Using OpenCL
6.2.1 Brief OpenCL Introduction
6.2.2 Exploiting the GPU
6.2.3 Dealing with Branchy Code/Thread Divergence
6.2.4 Serializing Data to Contiguous Memory
6.2.5 Sharing CPU and GPU Code
6.2.6 Precompiled Kernel Caching
6.3 GPU Spatial Acceleration Structures
6.3.1 Reference All Pairs Overlap Test
6.3.2 Uniform Grid
6.3.3 Parallel 1-Axis Sort and Sweep
6.3.4 Parallel 3-Axis Sweep and Prune
6.3.5 Hybrid Approaches
6.3.6 Static Local Space AABB Tree
6.4 GPU Contact Point Generation
6.4.1 Collision Shape Representation
6.4.2 Convex 3D Height Field Using Cube Maps
6.4.3 Separating Axis Test
6.4.4 Sutherland Hodgeman Clipping
6.4.5 Minkowski Portal Refinement
6.4.6 Contact Reduction
6.5 GPU Constraint Solving
6.5.1 Equations of Motion
6.5.2 Contact and Friction Constraint Setup
6.5.3 Parallel Projected Gauss-Seidel Method
6.5.4 Batch Creation and Two-Stage Batching
6.5.5 Non-Contact Constraints
6.5.6 GPU Deterministic Simulation
6.5.7 Conclusion and Future Work
7 OpenSubdiv: Interoperating GPU Compute and Drawing
Manuel Kraemer
7.1 Representing Shapes
7.1.1 Why Fast Subdivision?
7.1.2 Legacy
7.1.3 OpenSubdiv
7.2 The Control Cage
7.2.1 Patches and Arbitrary Topology
7.2.2 Topological Data Structures
7.2.3 Manifold Surfaces
7.2.4 The Limit Surface
7.3 Uniform Subdivision
7.3.1 Implementing Subdivision Schemata
7.4 Serializing the Mesh Representation
7.4.1 Case Study: Subdividing a Pyramid
7.4.2 Generating Indexing Tables
7.4.3 Preparing for Parallel Execution
7.5 Transition from Multicores to Many-Cores
7.5.1 Streaming Multiprocessors and SIMT
7.5.2 Practical Implementation with OpenCL
7.6 Reducing Branching Divergence
7.6.1 Sorting Vertices by Type
7.6.2 Further Vertex Sorting
7.7 Optimization Trade-Offs
7.7.1 Alternative Strategy: NVIDIA Dynamic Parallelism
7.7.2 Alternative Strategy: Vertex Stencils
7.7.3 Memory Bottlenecks
7.8 Evaluating Our Progress
7.9 Fundamental Limitations of Uniform Subdivision
7.9.1 Exponential Growth
7.9.2 Geometric Fidelity
7.9.3 Animating Subdivision Surfaces
7.9.4 Better, Faster, Different
7.10 Feature-Adaptive Subdivision
7.10.1 GPU Hardware Tessellation
7.10.2 Catmull-Clark Terminology
7.10.3 Bi-Cubic Patch Representation
7.10.4 Feature-Adaptive Subdivision
7.11 Implementing the GPU Rendering Engine
7.11.1 Bi-Cubic Bspline Patches with GLSL
7.11.1.1 Handling Surface Boundaries
7.11.1.2 Handling Patch Transitions
7.11.1.3 "End" Patches
7.11.2 Mitigating Drawing Overheads
7.12 Texturing
7.12.1 Displacement Mapping
7.13 Conclusion
Bibliography
Index
Martin Watt, Erwin Coumans, George ElKoura, Ronald Henderson, Manuel Kraemer, Jeff Lait, James Reinders