Preface | xi | |
| 1 | (10) |
1.1 Computational approach | 1 | (1) |
| 2 | (1) |
| 3 | (2) |
| 5 | (1) |
1.5 How to read this book | 6 | (1) |
1.6 Supplementary materials | 7 | (1) |
1.7 Formalisms and terminology | 7 | (2) |
| 9 | (2) |
| 11 | (32) |
| 11 | (2) |
2.2 Ordinary least squares | 13 | (2) |
| 15 | (2) |
2.4 Solving least squares with the singular value decomposition | 17 | (2) |
2.5 Directly solving the linear system | 19 | (3) |
2.6 (*) Solving linear models using the QR decomposition | 22 | (2) |
2.7 (*) Sensitivity analysis | 24 | (4) |
2.8 (*) Relationship between numerical and statistical error | 28 | (3) |
2.9 Implementation and notes | 31 | (1) |
2.10 Application: Cancer incidence rates | 32 | (8) |
| 40 | (3) |
3 Ridge Regression and Principal Component Analysis | 43 | (32) |
| 43 | (3) |
| 46 | (7) |
3.3 (*) A Bayesian perspective | 53 | (3) |
3.4 Principal component analysis | 56 | (7) |
3.5 Implementation and notes | 63 | (2) |
3.6 Application: NYC taxicab data | 65 | (7) |
| 72 | (3) |
| 75 | (48) |
| 75 | (1) |
| 76 | (5) |
| 81 | (4) |
| 85 | (4) |
| 89 | (6) |
4.6 (*) Smoothing splines | 95 | (5) |
| 100 | (4) |
4.8 Implementation and notes | 104 | (1) |
4.9 Application: U.S. census tract data | 105 | (15) |
| 120 | (3) |
5 Generalized Linear Models | 123 | (28) |
5.1 Classification with linear models | 123 | (5) |
| 128 | (3) |
5.3 Iteratively reweighted GLMs | 131 | (4) |
| 135 | (3) |
5.5 (*) Multi-class regression | 138 | (1) |
5.6 Implementation and notes | 139 | (1) |
5.7 Application: Chicago crime prediction | 140 | (8) |
| 148 | (3) |
| 151 | (28) |
6.1 Multivariate linear smoothers | 151 | (4) |
6.2 Curse of dimensionality | 155 | (3) |
| 158 | (5) |
6.4 (*) Additive models as linear models | 163 | (3) |
6.5 (*) Standard errors in additive models | 166 | (4) |
6.6 Implementation and notes | 170 | (2) |
6.7 Application: NYC flights data | 172 | (6) |
| 178 | (1) |
7 Penalized Regression Models | 179 | (28) |
| 179 | (1) |
7.2 Penalized regression with the l0- and l1-norms | 180 | (2) |
7.3 Orthogonal data matrix | 182 | (4) |
7.4 Convex optimization and the elastic net | 186 | (2) |
| 188 | (5) |
7.6 (*) Active set screening using the KKT conditions | 193 | (5) |
7.7 (*) The generalized elastic net model | 198 | (2) |
7.8 Implementation and notes | 200 | (1) |
7.9 Application: Amazon product reviews | 201 | (5) |
| 206 | (1) |
| 207 | (54) |
8.1 Dense neural network architecture | 207 | (4) |
8.2 Stochastic gradient descent | 211 | (2) |
8.3 Backward propagation of errors | 213 | (3) |
8.4 Implementing backpropagation | 216 | (8) |
8.5 Recognizing handwritten digits | 224 | (2) |
8.6 (*) Improving SGD and regularization | 226 | (6) |
8.7 (*) Classification with neural networks | 232 | (7) |
8.8 (*) Convolutional neural networks | 239 | (10) |
8.9 Implementation and notes | 249 | (1) |
8.10 Application: Image classification with EMNIST | 249 | (10) |
| 259 | (2) |
9 Dimensionality Reduction | 261 | (36) |
9.1 Unsupervised learning | 261 | (1) |
| 262 | (4) |
9.3 Kernel principal component analysis | 266 | (6) |
| 272 | (5) |
9.5 t-Distributed stochastic neighbor embedding (t-SNE) | 277 | (5) |
| 282 | (1) |
9.7 Implementation and notes | 283 | (1) |
9.8 Application: Classifying and visualizing fashion MNIST | 284 | (11) |
| 295 | (2) |
10 Computation in Practice | 297 | (34) |
10.1 Reference implementations | 297 | (1) |
| 298 | (6) |
10.3 Sparse generalized linear models | 304 | (3) |
10.4 Computation on row chunks | 307 | (4) |
| 311 | (7) |
| 318 | (2) |
10.7 Implementation and notes | 320 | (1) |
| 321 | (8) |
| 329 | (2) |
A Linear algebra and matrices | 331 | (6) |
| 331 | (2) |
| 333 | (4) |
B Floating Point Arithmetic and Numerical Computation | 337 | (6) |
B.1 Floating point arithmetic | 337 | (3) |
| 340 | (3) |
Bibliography | 343 | (16) |
Index | 359 | |