Preface |
|
xv | |
1 An Overview of Optimization |
|
1 | (14) |
|
Optimization Is Part of Software Development |
|
|
2 | (1) |
|
Optimization Is Effective |
|
|
3 | (1) |
|
|
3 | (3) |
|
A Nanosecond Here, a Nanosecond There |
|
|
6 | (1) |
|
Summary of Strategies for Optimizing C++ Code |
|
|
6 | (6) |
|
Use a Better Compiler, Use Your Compiler Better |
|
|
7 | (1) |
|
|
8 | (1) |
|
|
9 | (1) |
|
Reduce Memory Allocation and Copying |
|
|
10 | (1) |
|
|
11 | (1) |
|
Use Better Data Structures |
|
|
12 | (1) |
|
|
12 | (1) |
|
Optimize Memory Management |
|
|
12 | (1) |
|
|
12 | (3) |
2 Computer Behavior Affecting Optimization |
|
15 | (12) |
|
Lies C++ Believes About Computers |
|
|
16 | (1) |
|
The Truth About Computers |
|
|
17 | (7) |
|
|
17 | (1) |
|
Memory Is Not Accessed in Bytes |
|
|
18 | (1) |
|
Some Memory Accesses Are Slower than Others |
|
|
19 | (1) |
|
Memory Words Have a Big End and a Little End |
|
|
20 | (1) |
|
Memory Has Finite Capacity |
|
|
20 | (1) |
|
Instruction Execution Is Slow |
|
|
21 | (1) |
|
Making Decisions Is Hard for Computers |
|
|
22 | (1) |
|
There Are Multiple Streams of Program Execution |
|
|
22 | (2) |
|
Calling into the Operating System Is Expensive |
|
|
24 | (1) |
|
|
24 | (2) |
|
All Statements Are Not Equally Expensive |
|
|
24 | (1) |
|
Statements Are Not Executed in Order |
|
|
25 | (1) |
|
|
26 | (1) |
3 Measure Performance |
|
27 | (42) |
|
|
28 | (4) |
|
Performance Must Be Measured |
|
|
28 | (1) |
|
Optimizers Are Big Game Hunters |
|
|
29 | (1) |
|
|
29 | (2) |
|
|
31 | (1) |
|
|
32 | (5) |
|
|
34 | (1) |
|
Measure Baseline Performance and Set Goals |
|
|
35 | (2) |
|
You Can Improve Only What You Measure |
|
|
37 | (1) |
|
Profile Program Execution |
|
|
37 | (3) |
|
|
40 | (23) |
|
"A Little Learning" About Measuring Time |
|
|
40 | (6) |
|
Measuring Time with Computers |
|
|
46 | (8) |
|
Overcoming Measurement Obstacles |
|
|
54 | (4) |
|
|
58 | (4) |
|
Time Hot Functions in a Test Harness |
|
|
62 | (1) |
|
Estimate Code Cost to Find Hot Code |
|
|
63 | (3) |
|
Estimate the Cost of Individual C++ Statements |
|
|
63 | (1) |
|
Estimate the Cost of Loops |
|
|
64 | (2) |
|
Other Ways to Find Hot Spots |
|
|
66 | (1) |
|
|
67 | (2) |
4 Optimize String Use: A Case Study |
|
69 | (22) |
|
Why Strings Are a Problem |
|
|
69 | (3) |
|
Strings Are Dynamically Allocated |
|
|
70 | (1) |
|
|
70 | (1) |
|
Strings Do a Lot of Copying |
|
|
71 | (1) |
|
First Attempt at Optimizing Strings |
|
|
72 | (8) |
|
Use Mutating String Operations to Eliminate Temporaries |
|
|
74 | (1) |
|
Reduce Reallocation by Reserving Storage |
|
|
74 | (1) |
|
Eliminate Copying of String Arguments |
|
|
75 | (1) |
|
Eliminate Pointer Dereference Using Iterators |
|
|
76 | (1) |
|
Eliminate Copying of Returned String Values |
|
|
77 | (1) |
|
Use Character Arrays Instead of Strings |
|
|
78 | (2) |
|
Summary of First Optimization Attempt |
|
|
80 | (1) |
|
Second Attempt at Optimizing Strings |
|
|
80 | (8) |
|
|
80 | (2) |
|
|
82 | (1) |
|
Use a Better String Library |
|
|
83 | (4) |
|
|
87 | (1) |
|
Eliminate String Conversion |
|
|
88 | (2) |
|
Conversion from C String to std::string |
|
|
89 | (1) |
|
Converting Between Character Encodings |
|
|
89 | (1) |
|
|
90 | (1) |
5 Optimize Algorithms |
|
91 | (16) |
|
|
92 | (4) |
|
Best-Case, Average, and Worst-Case Time Cost |
|
|
95 | (1) |
|
|
95 | (1) |
|
|
96 | (1) |
|
Toolkit to Optimize Searching and Sorting |
|
|
96 | (1) |
|
Efficient Search Algorithms |
|
|
96 | (2) |
|
Time Cost of Searching Algorithms |
|
|
97 | (1) |
|
All Searches Are Equal When n Is Small |
|
|
98 | (1) |
|
Efficient Sort Algorithms |
|
|
98 | (2) |
|
Time Cost of Sorting Algorithms |
|
|
99 | (1) |
|
Replace Sorts Having Poor Worst-Case Performance |
|
|
99 | (1) |
|
Exploit Known Properties of the Input Data |
|
|
100 | (1) |
|
|
100 | (6) |
|
|
101 | (1) |
|
|
102 | (1) |
|
|
102 | (1) |
|
|
103 | (1) |
|
|
104 | (1) |
|
|
104 | (1) |
|
|
105 | (1) |
|
Optimizing the Expected Path |
|
|
105 | (1) |
|
|
105 | (1) |
|
|
105 | (1) |
|
|
106 | (1) |
6 Optimize Dynamically Allocated Variables |
|
107 | (40) |
|
|
108 | (5) |
|
Storage Duration of Variables |
|
|
108 | (3) |
|
|
111 | (1) |
|
Value Objects and Entity Objects |
|
|
112 | (1) |
|
C++ Dynamic Variable API Refresher |
|
|
113 | (6) |
|
Smart Pointers Automate Ownership of Dynamic Variables |
|
|
116 | (2) |
|
Dynamic Variables Have Runtime Cost |
|
|
118 | (1) |
|
Reduce Use of Dynamic Variables |
|
|
119 | (8) |
|
Create Class Instances Statically |
|
|
119 | (2) |
|
Use Static Data Structures |
|
|
121 | (3) |
|
Use std::make_shared Instead of new |
|
|
124 | (1) |
|
Don't Share Ownership Unnecessarily |
|
|
125 | (1) |
|
Use a "Master Pointer" to Own Dynamic Variables |
|
|
126 | (1) |
|
Reduce Reallocation of Dynamic Variables |
|
|
127 | (2) |
|
Preallocate Dynamic Variables to Prevent Reallocation |
|
|
127 | (1) |
|
Create Dynamic Variables Outside of Loops |
|
|
128 | (1) |
|
Eliminate Unneeded Copying |
|
|
129 | (8) |
|
Disable Unwanted Copying in the Class Definition |
|
|
130 | (1) |
|
Eliminate Copying on Function Call |
|
|
131 | (1) |
|
Eliminate Copying on Function Return |
|
|
132 | (2) |
|
|
134 | (2) |
|
Implement the "Copy on Write" Idiom |
|
|
136 | (1) |
|
|
137 | (1) |
|
|
137 | (8) |
|
Nonstandard Copy Semantics: A Painful Hack |
|
|
138 | (1) |
|
std::swap(): The Poor Man's Move Semantics |
|
|
138 | (1) |
|
Shared Ownership of Entities |
|
|
139 | (1) |
|
The Moving Parts of Move Semantics |
|
|
140 | (1) |
|
Update Code to Use Move Semantics |
|
|
141 | (1) |
|
Subtleties of Move Semantics |
|
|
142 | (3) |
|
|
145 | (1) |
|
|
146 | (1) |
7 Optimize Hot Statements |
|
147 | (40) |
|
|
148 | (12) |
|
|
149 | (1) |
|
Use More Efficient Loop Statements |
|
|
149 | (1) |
|
|
150 | (1) |
|
Remove Invariant Code from Loops |
|
|
151 | (1) |
|
Remove Unneeded Function Calls from Loops |
|
|
152 | (3) |
|
Remove Hidden Function Calls from Loops |
|
|
155 | (1) |
|
Remove Expensive, Slow-Changing Calls from Loops |
|
|
156 | (1) |
|
Push Loops Down into Functions to Reduce Call Overhead |
|
|
157 | (1) |
|
Do Some Actions Less Frequently |
|
|
158 | (2) |
|
What About Everything Else? |
|
|
160 | (1) |
|
Remove Code from Functions |
|
|
160 | (14) |
|
|
161 | (4) |
|
Declare Brief Functions Inline |
|
|
165 | (1) |
|
Define Functions Before First Use |
|
|
165 | (1) |
|
Eliminate Unused Polymorphism |
|
|
165 | (1) |
|
Discard Unused Interfaces |
|
|
166 | (4) |
|
Select Implementation at Compile Time with Templates |
|
|
170 | (1) |
|
Eliminate Uses of the PIMPL Idiom |
|
|
171 | (2) |
|
Eliminate Calls into DLLs |
|
|
173 | (1) |
|
Use Static Member Functions Instead of Member Functions |
|
|
173 | (1) |
|
Move Virtual Destructor to Base Class |
|
|
174 | (1) |
|
|
174 | (8) |
|
|
175 | (1) |
|
|
176 | (1) |
|
Use Less-Expensive Operators |
|
|
177 | (1) |
|
Use Integer Arithmetic Instead of Floating Arithmetic |
|
|
177 | (2) |
|
Double May Be Faster than Float |
|
|
179 | (1) |
|
Replace Iterative Computations with Closed Forms |
|
|
180 | (2) |
|
Optimize Control Flow Idioms |
|
|
182 | (3) |
|
Use switch Instead of if-elseif-else |
|
|
182 | (1) |
|
Use Virtual Functions Instead of switch or if |
|
|
182 | (1) |
|
Use No-Cost Exception Handling |
|
|
183 | (2) |
|
|
185 | (2) |
8 Use Better Libraries |
|
187 | (14) |
|
Optimize Standard Library Use |
|
|
187 | (4) |
|
Philosophy of the C++ Standard Library |
|
|
188 | (1) |
|
Issues in Use of the C++ Standard Library |
|
|
188 | (3) |
|
Optimize Existing Libraries |
|
|
191 | (1) |
|
Change as Little as Possible |
|
|
191 | (1) |
|
Add Functions Rather than Change Functionality |
|
|
192 | (1) |
|
Design Optimized Libraries |
|
|
192 | (8) |
|
Code in Haste, Repent at Leisure |
|
|
193 | (1) |
|
Parsimony Is a Virtue in Library Design |
|
|
194 | (1) |
|
Make Memory Allocation Decisions Outside the Library |
|
|
194 | (1) |
|
When in Doubt, Code Libraries for Speed |
|
|
195 | (1) |
|
Functions Are Easier to Optimize than Frameworks |
|
|
195 | (1) |
|
Flatten Inheritance Hierarchies |
|
|
196 | (1) |
|
|
196 | (1) |
|
|
196 | (2) |
|
|
198 | (1) |
|
Beware of 'God Functions' |
|
|
199 | (1) |
|
|
200 | (1) |
9 Optimize Searching and Sorting |
|
201 | (28) |
|
Key/Value Tables Using std::map and std::string |
|
|
202 | (1) |
|
Toolkit to Improve Search Performance |
|
|
203 | (5) |
|
Make a Baseline Measurement |
|
|
204 | (1) |
|
Identify the Activity to Be Optimized |
|
|
204 | (1) |
|
Decompose the Activity to Be Optimized |
|
|
205 | (1) |
|
Change or Replace Algorithms and Data Structures |
|
|
206 | (2) |
|
Using the Optimization Process on Custom Abstractions |
|
|
208 | (1) |
|
Optimize Search Using std::map |
|
|
208 | (5) |
|
Use Fixed-Size Character Array Keys with std::map |
|
|
208 | (2) |
|
Use C-Style String Keys with std::map |
|
|
210 | (2) |
|
Using Map's Cousin std::set When the Key Is in the Value |
|
|
212 | (1) |
|
Optimize Search Using the <algorithm> Header |
|
|
213 | (7) |
|
Key/Value Table for Search in Sequence Containers |
|
|
214 | (1) |
|
std::find(): Obvious Name, 0(n) Time Cost |
|
|
215 | (1) |
|
std::binary_search(): Does Not Return Values |
|
|
216 | (1) |
|
Binary Search Using std::equal_range() |
|
|
216 | (1) |
|
Binary Search Using std::lower_bound() |
|
|
217 | (1) |
|
|
218 | (1) |
|
Handcoded Binary Search using strcmp() |
|
|
219 | (1) |
|
Optimize Search in Hashed Key/Value Tables |
|
|
220 | (5) |
|
Hashing with a std::unordered_map |
|
|
221 | (1) |
|
Hashing with Fixed Character Array Keys |
|
|
221 | (1) |
|
Hashing with Null-Terminated String Keys |
|
|
222 | (2) |
|
Hashing with a Custom Hash Table |
|
|
224 | (1) |
|
Stepanov's Abstraction Penalty |
|
|
225 | (1) |
|
Optimize Sorting with the C++ Standard Library |
|
|
226 | (2) |
|
|
228 | (1) |
10 Optimize Data Structures |
|
229 | (36) |
|
Get to Know the Standard Library Containers |
|
|
229 | (7) |
|
|
230 | (1) |
|
|
230 | (1) |
|
Experimenting with the Standard Library Containers |
|
|
231 | (5) |
|
std::vector and std::string |
|
|
236 | (6) |
|
Performance Consequences of Reallocation |
|
|
237 | (1) |
|
Inserting and Deleting in std::vector |
|
|
238 | (2) |
|
|
240 | (1) |
|
|
241 | (1) |
|
|
241 | (1) |
|
|
242 | (3) |
|
Inserting and Deleting in std::deque |
|
|
243 | (2) |
|
|
245 | (1) |
|
|
245 | (1) |
|
|
245 | (1) |
|
|
245 | (4) |
|
Inserting and Deleting in std::list |
|
|
247 | (1) |
|
|
248 | (1) |
|
|
248 | (1) |
|
|
249 | (1) |
|
|
249 | (2) |
|
Inserting and Deleting in std::forward_list |
|
|
250 | (1) |
|
Iterating in std::forward_list |
|
|
251 | (1) |
|
Sorting std::forward_list |
|
|
251 | (1) |
|
Lookup in std::forward_list |
|
|
251 | (1) |
|
std::map and std::multimap |
|
|
251 | (4) |
|
Inserting and Deleting in std::map |
|
|
252 | (3) |
|
|
255 | (1) |
|
|
255 | (1) |
|
|
255 | (1) |
|
std::set and std::multiset |
|
|
255 | (1) |
|
std::unordered_map and std::unordered_multimap |
|
|
256 | (5) |
|
Inserting and Deleting in std::unordered_map |
|
|
260 | (1) |
|
Iterating in std::unordered_map |
|
|
260 | (1) |
|
Lookup with std::unordered_map |
|
|
261 | (1) |
|
|
261 | (2) |
|
|
263 | (2) |
11 Optimize I/O |
|
265 | (14) |
|
A Recipe for Reading Files |
|
|
265 | (11) |
|
Create a Parsimonious Function Signature |
|
|
267 | (2) |
|
|
269 | (1) |
|
|
269 | (3) |
|
Take Bigger Bites—Use a Bigger Input Buffer |
|
|
272 | (1) |
|
Take Bigger Bites—Read a Line at a Time |
|
|
272 | (2) |
|
Shorten Calling Chains Again |
|
|
274 | (1) |
|
|
275 | (1) |
|
|
276 | (1) |
|
Reading from std::cin and Writing to std::cout |
|
|
277 | (1) |
|
|
278 | (1) |
12 Optimize Concurrency |
|
279 | (44) |
|
|
280 | (11) |
|
A Walk Through the Concurrency Zoo |
|
|
281 | (4) |
|
|
285 | (1) |
|
|
286 | (1) |
|
|
287 | (1) |
|
|
288 | (1) |
|
|
289 | (2) |
|
C++ Concurrency Facilities Refresher |
|
|
291 | (14) |
|
|
291 | (1) |
|
|
292 | (3) |
|
|
295 | (1) |
|
|
296 | (1) |
|
|
297 | (1) |
|
|
298 | (3) |
|
Atomic Operations on Shared Variables |
|
|
301 | (3) |
|
On Deck: Future C++ Concurrency Features |
|
|
304 | (1) |
|
Optimize Threaded C++ Programs |
|
|
305 | (9) |
|
Prefer std::async to std::thread |
|
|
306 | (2) |
|
Create as Many Runnable Threads as Cores |
|
|
308 | (1) |
|
Implement a Task Queue and Thread Pool |
|
|
309 | (1) |
|
Perform I/O in a Separate Thread |
|
|
310 | (1) |
|
Program Without Synchronization |
|
|
310 | (3) |
|
Remove Code from Startup and Shutdown |
|
|
313 | (1) |
|
Make Synchronization More Efficient |
|
|
314 | (6) |
|
Reduce the Scope of Critical Sections |
|
|
314 | (1) |
|
Limit the Number of Concurrent Threads |
|
|
315 | (1) |
|
Avoid the Thundering Herd |
|
|
316 | (1) |
|
|
317 | (1) |
|
|
317 | (2) |
|
Don't Busy-Wait on a Single-Core System |
|
|
319 | (1) |
|
|
319 | (1) |
|
Rolling Your Own Mutex May Be Ineffective |
|
|
319 | (1) |
|
Limit Producer Output Queue Length |
|
|
320 | (1) |
|
|
320 | (2) |
|
|
322 | (1) |
13 Optimize Memory Management |
|
323 | (34) |
|
C++ Memory Management API Refresher |
|
|
324 | (9) |
|
The Life Cycle of Dynamic Variables |
|
|
324 | (1) |
|
Memory Management Functions Allocate and Free Memory |
|
|
325 | (3) |
|
New-Expressions Construct Dynamic Variables |
|
|
328 | (3) |
|
Delete-Expressions Dispose of Dynamic Variables |
|
|
331 | (1) |
|
Explicit Destructor Calls Destroy Dynamic Variables |
|
|
332 | (1) |
|
High-Performance Memory Managers |
|
|
333 | (2) |
|
Provide Class-Specific Memory Managers |
|
|
335 | (8) |
|
Fixed-Size-Block Memory Manager |
|
|
336 | (2) |
|
|
338 | (2) |
|
Adding a Class-Specific operator new() |
|
|
340 | (2) |
|
Performance of the Fixed-Block Memory Manager |
|
|
342 | (1) |
|
Variations on the Fixed-Block Memory Manager |
|
|
342 | (1) |
|
Non-Thread Safe Memory Managers Are Efficient |
|
|
343 | (1) |
|
Provide Custom Standard Library Allocators |
|
|
343 | (12) |
|
|
346 | (1) |
|
Additional Definitions for C++98 Allocator |
|
|
347 | (5) |
|
|
352 | (2) |
|
A Fixed-Block Allocator for Strings |
|
|
354 | (1) |
|
|
355 | (2) |
Index |
|
357 | |