Notation and Code Examples    xi
Preface    xiii
Acknowledgments    xvii

1 …    1
  1.1 …    4

2 The Multi-Layer Perceptron Model    9
  2.1 The multi-layer perceptron (MLP)    9
  2.2 The first and second derivatives    12
  2.3 Additional hidden layers    14
  2.4 …    15
  2.5 Complements and exercises    16

3 Linear Discriminant Analysis    19
  3.1 An alternative method    21
  3.2 …    22
  3.3 Flexible and penalized LDA    23
  3.4 Relationship of MLP models to LDA    26
  3.5 …    27
  3.6 Complements and exercises    30

4 Activation and Penalty Functions    35
  4.1 …    35
  4.2 Interpreting outputs as probabilities    35
  4.3 The "universal approximator" and consistency    37
  4.4 …    38
  4.5 Binary variables and logistic regression    39
  4.6 MLP models and cross-entropy    40
  4.7 A derivation of the softmax activation function    43
  4.8 The "natural" pairing and Δq    45
  4.9 A comparison of least squares and cross-entropy    47
  4.10 …    48
  4.11 Complements and exercises    48

5 Model Fitting and Evaluation    53
  5.1 …    53
  5.2 Error rate estimation    54
  5.3 Model selection for MLP models    57
  5.4 …    62
  5.5 Complements and exercises    65

6 …    69
  6.1 …    69
  6.2 …    70
  6.3 …    71
  6.4 Interpreting and evaluating task-based MLP models    76
  6.5 Evaluating the models    87
  6.6 …    88
  6.7 Complements and exercises    89

7 Incorporating Spatial Information into an MLP Classifier    93
  7.1 Allocation and neighbor information    93
  7.2 …    98
  7.3 …    100
  7.4 …    101
  7.5 …    107
  7.6 Example - Martin's farm    109
  7.7 …    111
  7.8 Complements and exercises    114

8 Influence Curves for the Multi-layer Perceptron Classifier    121
  8.1 …    121
  8.2 …    122
  8.3 …    123
  8.4 …    124
  8.5 …    128
  8.6 Influence curves for pc    136
  8.7 Summary and Conclusion    139

9 The Sensitivity Curves of the MLP Classifier    143
  9.1 …    143
  9.2 The sensitivity curve    144
  9.3 …    145
  9.4 …    151
  9.5 …    157

10 A Robust Fitting Procedure for MLP Models    159
  10.1 …    159
  10.2 The effect of a hidden layer    160
  10.3 Comparison of MLP with robust logistic regression    162
  10.4 …    166
  10.5 …    172
  10.6 …    175
  10.7 Complements and exercises    176

11 Smoothed Weights    179
  11.1 …    179
  11.2 …    184
  11.3 …    187
  11.4 …    198
  11.5 Complements and exercises    200

12 Translation Invariance    203
  12.1 …    203
  12.2 …    205
  12.3 …    208
  12.4 …    209
  12.5 …    214

13 Fixed-slope Training    219
  13.1 …    219
  13.2 …    221
  13.3 …    222
  13.4 …    222
  13.5 …    223
  13.6 …    223

Bibliography    227

Appendix A: Function Minimization    245
  A.1 …    245
  A.2 …    246
  A.3 …    247
  A.4 The method of scoring    249
  A.5 …    250
  A.6 …    250
  A.7 Scaled conjugate gradients    252
  A.8 Variants on vanilla "back-propagation"    253
  A.9 …    254
  A.10 The simplex algorithm    254
  A.11 …    255
  A.12 …    255
  A.13 Discussion and Conclusion    256

Appendix B: Maximum Values of the Influence Curve    261

Topic Index    265