About the Author |
|
xxi | |
Preamble |
|
1 | (2) |
|
1 Financial Machine Learning as a Distinct Subject |
|
|
3 | (18) |
|
|
3 | (1) |
|
1.2 The Main Reason Financial Machine Learning Projects Usually Fail |
|
|
4 | (2) |
|
1.2.1 The Sisyphus Paradigm |
|
|
4 | (1) |
|
1.2.2 The Meta-Strategy Paradigm |
|
|
5 | (1) |
|
|
6 | (6) |
|
1.3.1 Structure by Production Chain |
|
|
6 | (3) |
|
1.3.2 Structure by Strategy Component |
|
|
9 | (3) |
|
1.3.3 Structure by Common Pitfall |
|
|
12 | (1) |
|
|
12 | (1) |
|
|
13 | (1) |
|
|
14 | (4) |
|
|
18 | (3) |
|
|
19 | (1) |
|
|
20 | (1) |
|
|
20 | (1) |
|
|
21 | (70) |
|
2 Financial Data Structures |
|
|
23 | (20) |
|
|
23 | (1) |
|
2.2 Essential Types of Financial Data |
|
|
23 | (2) |
|
|
23 | (1) |
|
|
24 | (1) |
|
|
25 | (1) |
|
|
25 | (1) |
|
|
25 | (7) |
|
|
26 | (3) |
|
2.3.2 Information-Driven Bars |
|
|
29 | (3) |
|
2.4 Dealing with Multi-Product Series |
|
|
32 | (6) |
|
|
33 | (2) |
|
|
35 | (1) |
|
|
36 | (2) |
|
|
38 | (5) |
|
2.5.1 Sampling for Reduction |
|
|
38 | (1) |
|
2.5.2 Event-Based Sampling |
|
|
38 | (2) |
|
|
40 | (1) |
|
|
41 | (2) |
|
|
43 | (16) |
|
|
43 | (1) |
|
3.2 The Fixed-Time Horizon Method |
|
|
43 | (1) |
|
3.3 Computing Dynamic Thresholds |
|
|
44 | (1) |
|
3.4 The Triple-Barrier Method |
|
|
45 | (3) |
|
3.5 Learning Side and Size |
|
|
48 | (2) |
|
|
50 | (1) |
|
3.7 How to Use Meta-Labeling |
|
|
51 | (2) |
|
|
53 | (1) |
|
3.9 Dropping Unnecessary Labels |
|
|
54 | (5) |
|
|
55 | (1) |
|
|
56 | (3) |
|
|
59 | (16) |
|
|
59 | (1) |
|
|
59 | (1) |
|
4.3 Number of Concurrent Labels |
|
|
60 | (1) |
|
4.4 Average Uniqueness of a Label |
|
|
61 | (1) |
|
4.5 Bagging Classifiers and Uniqueness |
|
|
62 | (6) |
|
4.5.1 Sequential Bootstrap |
|
|
63 | (1) |
|
4.5.2 Implementation of Sequential Bootstrap |
|
|
64 | (1) |
|
4.5.3 A Numerical Example |
|
|
65 | (1) |
|
4.5.4 Monte Carlo Experiments |
|
|
66 | (2) |
|
|
68 | (2) |
|
|
70 | (1) |
|
|
71 | (4) |
|
|
72 | (1) |
|
|
73 | (1) |
|
|
73 | (2) |
|
5 Fractionally Differentiated Features |
|
|
75 | (16) |
|
|
75 | (1) |
|
5.2 The Stationarity vs. Memory Dilemma |
|
|
75 | (1) |
|
|
76 | (1) |
|
|
77 | (3) |
|
|
77 | (1) |
|
5.4.2 Iterative Estimation |
|
|
78 | (2) |
|
|
80 | (1) |
|
|
80 | (4) |
|
|
80 | (2) |
|
5.5.2 Fixed-Width Window Fracdiff |
|
|
82 | (2) |
|
5.6 Stationarity with Maximum Memory Preservation |
|
|
84 | (4) |
|
|
88 | (3) |
|
|
88 | (1) |
|
|
89 | (1) |
|
|
89 | (2) |
|
|
91 | (48) |
|
|
93 | (10) |
|
|
93 | (1) |
|
6.2 The Three Sources of Errors |
|
|
93 | (1) |
|
6.3 Bootstrap Aggregation |
|
|
94 | (4) |
|
|
94 | (2) |
|
|
96 | (1) |
|
6.3.3 Observation Redundancy |
|
|
97 | (1) |
|
|
98 | (1) |
|
|
99 | (1) |
|
6.6 Bagging vs. Boosting in Finance |
|
|
100 | (1) |
|
6.7 Bagging for Scalability |
|
|
101 | (2) |
|
|
101 | (1) |
|
|
102 | (1) |
|
|
102 | (1) |
|
7 Cross-Validation in Finance |
|
|
103 | (10) |
|
|
103 | (1) |
|
7.2 The Goal of Cross-Validation |
|
|
103 | (1) |
|
7.3 Why K-Fold CV Fails in Finance |
|
|
104 | (1) |
|
7.4 A Solution: Purged K-Fold CV |
|
|
105 | (4) |
|
7.4.1 Purging the Training Set |
|
|
105 | (2) |
|
|
107 | (1) |
|
7.4.3 The Purged K-Fold Class |
|
|
108 | (1) |
|
7.5 Bugs in Sklearn's Cross-Validation |
|
|
109 | (4) |
|
|
110 | (1) |
|
|
111 | (2) |
|
|
113 | (16) |
|
|
113 | (1) |
|
8.2 The Importance of Feature Importance |
|
|
113 | (1) |
|
8.3 Feature Importance with Substitution Effects |
|
|
114 | (3) |
|
8.3.1 Mean Decrease Impurity |
|
|
114 | (2) |
|
8.3.2 Mean Decrease Accuracy |
|
|
116 | (1) |
|
8.4 Feature Importance without Substitution Effects |
|
|
117 | (4) |
|
8.4.1 Single Feature Importance |
|
|
117 | (1) |
|
8.4.2 Orthogonal Features |
|
|
118 | (3) |
|
8.5 Parallelized vs. Stacked Feature Importance |
|
|
121 | (1) |
|
8.6 Experiments with Synthetic Data |
|
|
122 | (7) |
|
|
127 | (1) |
|
|
127 | (2) |
|
9 Hyper-Parameter Tuning with Cross-Validation |
|
|
129 | (10) |
|
|
129 | (1) |
|
9.2 Grid Search Cross-Validation |
|
|
129 | (2) |
|
9.3 Randomized Search Cross-Validation |
|
|
131 | (3) |
|
9.3.1 Log-Uniform Distribution |
|
|
132 | (2) |
|
9.4 Scoring and Hyper-Parameter Tuning |
|
|
134 | (5) |
|
|
135 | (1) |
|
|
136 | (1) |
|
|
137 | (2) |
|
|
139 | (108) |
|
|
141 | (10) |
|
|
141 | (1) |
|
10.2 Strategy-Independent Bet Sizing Approaches |
|
|
141 | (1) |
|
10.3 Bet Sizing from Predicted Probabilities |
|
|
142 | (2) |
|
10.4 Averaging Active Bets |
|
|
144 | (1) |
|
|
144 | (1) |
|
10.6 Dynamic Bet Sizes and Limit Prices |
|
|
145 | (6) |
|
|
148 | (1) |
|
|
149 | (1) |
|
|
149 | (2) |
|
11 The Dangers of Backtesting |
|
|
151 | (10) |
|
|
151 | (1) |
|
11.2 Mission Impossible: The Flawless Backtest |
|
|
151 | (1) |
|
11.3 Even If Your Backtest Is Flawless, It Is Probably Wrong |
|
|
152 | (1) |
|
11.4 Backtesting Is Not a Research Tool |
|
|
153 | (1) |
|
11.5 A Few General Recommendations |
|
|
153 | (2) |
|
|
155 | (6) |
|
|
158 | (1) |
|
|
158 | (1) |
|
|
159 | (2) |
|
12 Backtesting through Cross-Validation |
|
|
161 | (8) |
|
|
161 | (1) |
|
12.2 The Walk-Forward Method |
|
|
161 | (1) |
|
12.2.1 Pitfalls of the Walk-Forward Method |
|
|
162 | (1) |
|
12.3 The Cross-Validation Method |
|
|
162 | (1) |
|
12.4 The Combinatorial Purged Cross-Validation Method |
|
|
163 | (3) |
|
12.4.1 Combinatorial Splits |
|
|
164 | (1) |
|
12.4.2 The Combinatorial Purged Cross-Validation Backtesting Algorithm |
|
|
165 | (1) |
|
|
165 | (1) |
|
12.5 How Combinatorial Purged Cross-Validation Addresses Backtest Overfitting |
|
|
166 | (3) |
|
|
167 | (1) |
|
|
168 | (1) |
|
13 Backtesting on Synthetic Data |
|
|
169 | (26) |
|
|
169 | (1) |
|
|
169 | (1) |
|
|
170 | (2) |
|
|
172 | (1) |
|
13.5 Numerical Determination of Optimal Trading Rules |
|
|
173 | (3) |
|
|
173 | (1) |
|
|
174 | (2) |
|
13.6 Experimental Results |
|
|
176 | (16) |
|
13.6.1 Cases with Zero Long-Run Equilibrium |
|
|
177 | (3) |
|
13.6.2 Cases with Positive Long-Run Equilibrium |
|
|
180 | (2) |
|
13.6.3 Cases with Negative Long-Run Equilibrium |
|
|
182 | (10) |
|
|
192 | (3) |
|
|
192 | (1) |
|
|
193 | (2) |
|
|
195 | (16) |
|
|
195 | (1) |
|
14.2 Types of Backtest Statistics |
|
|
195 | (1) |
|
14.3 General Characteristics |
|
|
196 | (2) |
|
|
198 | (1) |
|
14.4.1 Time-Weighted Rate of Return |
|
|
198 | (1) |
|
|
199 | (3) |
|
14.5.1 Returns Concentration |
|
|
199 | (2) |
|
14.5.2 Drawdown and Time under Water |
|
|
201 | (1) |
|
14.5.3 Runs Statistics for Performance Evaluation |
|
|
201 | (1) |
|
14.6 Implementation Shortfall |
|
|
202 | (1) |
|
|
203 | (3) |
|
|
203 | (1) |
|
14.7.2 The Probabilistic Sharpe Ratio |
|
|
203 | (1) |
|
14.7.3 The Deflated Sharpe Ratio |
|
|
204 | (1) |
|
14.7.4 Efficiency Statistics |
|
|
205 | (1) |
|
14.8 Classification Scores |
|
|
206 | (1) |
|
|
207 | (4) |
|
|
208 | (1) |
|
|
209 | (1) |
|
|
209 | (2) |
|
15 Understanding Strategy Risk |
|
|
211 | (10) |
|
|
211 | (1) |
|
|
211 | (2) |
|
|
213 | (3) |
|
15.4 The Probability of Strategy Failure |
|
|
216 | (5) |
|
|
217 | (1) |
|
|
217 | (2) |
|
|
219 | (1) |
|
|
220 | (1) |
|
16 Machine Learning Asset Allocation |
|
|
221 | (26) |
|
|
221 | (1) |
|
16.2 The Problem with Convex Portfolio Optimization |
|
|
221 | (1) |
|
|
222 | (1) |
|
16.4 From Geometric to Hierarchical Relationships |
|
|
223 | (8) |
|
|
224 | (5) |
|
16.4.2 Quasi-Diagonalization |
|
|
229 | (1) |
|
16.4.3 Recursive Bisection |
|
|
229 | (2) |
|
|
231 | (3) |
|
16.6 Out-of-Sample Monte Carlo Simulations |
|
|
234 | (2) |
|
|
236 | (2) |
|
|
238 | (1) |
|
|
239 | (8) |
|
16.A.1 Correlation-Based Metric |
|
|
239 | (1) |
|
16.A.2 Inverse Variance Allocation |
|
|
239 | (1) |
|
16.A.3 Reproducing the Numerical Example |
|
|
240 | (2) |
|
16.A.4 Reproducing the Monte Carlo Experiment |
|
|
242 | (2) |
|
|
244 | (1) |
|
|
245 | (2) |
|
PART 4 USEFUL FINANCIAL FEATURES |
|
|
247 | (54) |
|
|
249 | (14) |
|
|
249 | (1) |
|
17.2 Types of Structural Break Tests |
|
|
249 | (1) |
|
|
250 | (1) |
|
17.3.1 Brown-Durbin-Evans CUSUM Test on Recursive Residuals |
|
|
250 | (1) |
|
17.3.2 Chu-Stinchcombe-White CUSUM Test on Levels |
|
|
251 | (1) |
|
|
251 | (12) |
|
17.4.1 Chow-Type Dickey-Fuller Test |
|
|
251 | (1) |
|
17.4.2 Supremum Augmented Dickey-Fuller |
|
|
252 | (7) |
|
17.4.3 Sub- and Super-Martingale Tests |
|
|
259 | (2) |
|
|
261 | (1) |
|
|
261 | (2) |
|
|
263 | (18) |
|
|
263 | (1) |
|
|
263 | (1) |
|
18.3 The Plug-in (or Maximum Likelihood) Estimator |
|
|
264 | (1) |
|
18.4 Lempel-Ziv Estimators |
|
|
265 | (4) |
|
|
269 | (2) |
|
|
270 | (1) |
|
|
270 | (1) |
|
|
270 | (1) |
|
18.6 Entropy of a Gaussian Process |
|
|
271 | (1) |
|
18.7 Entropy and the Generalized Mean |
|
|
271 | (4) |
|
18.8 A Few Financial Applications of Entropy |
|
|
275 | (6) |
|
|
275 | (1) |
|
18.8.2 Maximum Entropy Generation |
|
|
275 | (1) |
|
18.8.3 Portfolio Concentration |
|
|
275 | (1) |
|
18.8.4 Market Microstructure |
|
|
276 | (1) |
|
|
277 | (1) |
|
|
278 | (1) |
|
|
279 | (2) |
|
19 Microstructural Features |
|
|
281 | (20) |
|
|
281 | (1) |
|
19.2 Review of the Literature |
|
|
281 | (1) |
|
19.3 First Generation: Price Sequences |
|
|
282 | (4) |
|
|
282 | (1) |
|
|
282 | (1) |
|
19.3.3 High-Low Volatility Estimator |
|
|
283 | (1) |
|
19.3.4 Corwin and Schultz |
|
|
284 | (2) |
|
19.4 Second Generation: Strategic Trade Models |
|
|
286 | (4) |
|
|
286 | (2) |
|
|
288 | (1) |
|
19.4.3 Hasbrouck's Lambda |
|
|
289 | (1) |
|
19.5 Third Generation: Sequential Trade Models |
|
|
290 | (3) |
|
19.5.1 Probability of Information-Based Trading |
|
|
290 | (2) |
|
19.5.2 Volume-Synchronized Probability of Informed Trading |
|
|
292 | (1) |
|
19.6 Additional Features from Microstructural Datasets |
|
|
293 | (2) |
|
19.6.1 Distribution of Order Sizes |
|
|
293 | (1) |
|
19.6.2 Cancellation Rates, Limit Orders, Market Orders |
|
|
293 | (1) |
|
19.6.3 Time-Weighted Average Price Execution Algorithms |
|
|
294 | (1) |
|
|
295 | (1) |
|
19.6.5 Serial Correlation of Signed Order Flow |
|
|
295 | (1) |
|
19.7 What Is Microstructural Information? |
|
|
295 | (6) |
|
|
296 | (2) |
|
|
298 | (3) |
|
PART 5 HIGH-PERFORMANCE COMPUTING RECIPES |
|
|
301 | (52) |
|
20 Multiprocessing and Vectorization |
|
|
303 | (16) |
|
|
303 | (1) |
|
20.2 Vectorization Example |
|
|
303 | (1) |
|
20.3 Single-Thread vs. Multithreading vs. Multiprocessing |
|
|
304 | (2) |
|
|
306 | (3) |
|
|
306 | (1) |
|
20.4.2 Two-Nested Loops Partitions |
|
|
307 | (2) |
|
20.5 Multiprocessing Engines |
|
|
309 | (6) |
|
20.5.1 Preparing the Jobs |
|
|
309 | (2) |
|
20.5.2 Asynchronous Calls |
|
|
311 | (1) |
|
20.5.3 Unwrapping the Callback |
|
|
312 | (1) |
|
20.5.4 Pickle/Unpickle Objects |
|
|
313 | (1) |
|
|
313 | (2) |
|
20.6 Multiprocessing Example |
|
|
315 | (4) |
|
|
316 | (1) |
|
|
317 | (1) |
|
|
317 | (2) |
|
21 Brute Force and Quantum Computers |
|
|
319 | (10) |
|
|
319 | (1) |
|
21.2 Combinatorial Optimization |
|
|
319 | (1) |
|
21.3 The Objective Function |
|
|
320 | (1) |
|
|
321 | (1) |
|
21.5 An Integer Optimization Approach |
|
|
321 | (4) |
|
21.5.1 Pigeonhole Partitions |
|
|
321 | (2) |
|
21.5.2 Feasible Static Solutions |
|
|
323 | (1) |
|
21.5.3 Evaluating Trajectories |
|
|
323 | (2) |
|
|
325 | (4) |
|
|
325 | (1) |
|
|
326 | (1) |
|
|
327 | (1) |
|
|
327 | (1) |
|
|
328 | (1) |
|
22 High-Performance Computational Intelligence and Forecasting Technologies |
|
|
329 | (24) |
|
|
|
|
329 | (1) |
|
22.2 Regulatory Response to the Flash Crash of 2010 |
|
|
329 | (1) |
|
|
330 | (1) |
|
|
331 | (4) |
|
|
335 | (2) |
|
22.5.1 Message Passing Interface |
|
|
335 | (1) |
|
22.5.2 Hierarchical Data Format 5 |
|
|
336 | (1) |
|
22.5.3 In Situ Processing |
|
|
336 | (1) |
|
|
337 | (1) |
|
|
337 | (12) |
|
|
337 | (1) |
|
22.6.2 Blobs in Fusion Plasma |
|
|
338 | (2) |
|
22.6.3 Intraday Peak Electricity Usage |
|
|
340 | (1) |
|
22.6.4 The Flash Crash of 2010 |
|
|
341 | (5) |
|
22.6.5 Volume-Synchronized Probability of Informed Trading Calibration |
|
|
346 | (1) |
|
22.6.6 Revealing High Frequency Events with Non-Uniform Fast Fourier Transform |
|
|
347 | (2) |
|
22.7 Summary and Call for Participation |
|
|
349 | (1) |
|
|
350 | (3) |
|
|
350 | (3) |
Index |
|
353 | |