|
|
Introduction   1 | (10)
Artificial Neural Networks   2 | (1)
The Organisation of this Book   3 | (8)
|
Part I  Single Stream Networks

11 | (20)
11 | (2)
Quantification of Information   13 | (3)
Entropy and the Gaussian Distribution   14 | (2)
Principal Component Analysis   16 | (2)
Weight Decay in Hebbian Learning   18 | (3)
Principal Components and Weight Decay   19 | (2)
21 | (2)
21 | (1)
22 | (1)
Oja's Weighted Subspace Algorithm   22 | (1)
Sanger's Generalized Hebbian Algorithm   23 | (1)
23 | (2)
Independent Component Analysis   25 | (4)
A Restatement of the Problem   26 | (2)
28 | (1)
29 | (2)

The Negative Feedback Network   31 | (26)
31 | (10)
Equivalence to Oja's Subspace Algorithm   31 | (3)
34 | (1)
Plasticity and Continuity   35 | (1)
Speed of Learning and Information Content   36 | (1)
37 | (4)
41 | (5)
Properties of the VW Network   42 | (1)
43 | (3)
Using Distance Differences   46 | (2)
Equivalence to Sanger's Algorithm   47 | (1)
Minor Components Analysis   48 | (6)
49 | (1)
Use of Minor Components Analysis   50 | (2)
Robustness of Regression Solutions   52 | (1)
53 | (1)
54 | (3)

57 | (28)
Analysis of Differential Learning Rates   59 | (11)
68 | (1)
69 | (1)
Differential Activation Functions   70 | (12)
Model 1: Lateral Activation Functions   70 | (3)
Model 2: Lateral and Feedforward Activation Functions   73 | (2)
Model 3: Feedforward Activation Functions   75 | (2)
77 | (3)
80 | (2)
Emergent Properties of the Peer-Inhibition Network   82 | (1)
83 | (2)

85 | (26)
87 | (1)
88 | (4)
89 | (2)
91 | (1)
92 | (1)
92 | (16)
Principal Factor Analysis   93 | (1)
94 | (1)
Relation to Non-negativity   94 | (1)
95 | (1)
96 | (1)
97 | (2)
99 | (1)
100 | (1)
Dimensionality of the Output Space   101 | (1)
The Minimum Overcomplete Basis   102 | (2)
104 | (4)
108 | (3)

Exploratory Data Analysis   111 | (26)
Exploratory Projection Pursuit   112 | (1)
112 | (1)
113 | (1)
The Projection Pursuit Network   114 | (9)
114 | (2)
The Projection Pursuit Indices   116 | (1)
Principal Component Analysis   117 | (1)
Convergence of the Algorithm   117 | (2)
119 | (2)
Using Hyperbolic Functions   121 | (1)
122 | (1)
123 | (4)
Indices Based on Information Theory   123 | (2)
125 | (2)
127 | (1)
Using Exploratory Projection Pursuit   127 | (6)
Hierarchical Exploratory Projection Pursuit   128 | (3)
131 | (2)
Independent Component Analysis   133 | (3)
136 | (1)

137 | (32)
137 | (3)
138 | (1)
138 | (2)
The Classification Network   140 | (3)
141 | (2)
143 | (1)
143 | (6)
144 | (1)
Comparison with Kohonen Feature Maps   145 | (1)
145 | (3)
Self-Organisation on Voice Data   148 | (1)
148 | (1)
149 | (9)
Summary of the Training Algorithm   151 | (2)
153 | (4)
157 | (1)
The Negative Feedback Coding Network   158 | (9)
159 | (1)
160 | (2)
162 | (1)
163 | (1)
Approximate Topological Equivalence   164 | (1)
A Hierarchical Feature Map   165 | (1)
A Biological Implementation   166 | (1)
167 | (2)

Maximum Likelihood Hebbian Learning   169 | (22)
The Negative Feedback Network and Cost Functions   169 | (2)
171 | (1)
Insensitive Hebbian Learning   171 | (5)
Principal Component Analysis   171 | (3)
174 | (1)
Other Negative Feedback Networks   175 | (1)
The Maximum Likelihood EPP Algorithm   176 | (5)
Minimum Likelihood Hebbian Learning   177 | (1)
178 | (2)
180 | (1)
181 | (5)
182 | (1)
182 | (1)
Independent Component Analysis   183 | (3)
186 | (5)

Part II  Dual Stream Networks

Two Neural Networks for Canonical Correlation Analysis   191 | (18)
Statistical Canonical Correlation Analysis   191 | (1)
The First Canonical Correlation Network   192 | (2)
194 | (8)
195 | (1)
195 | (1)
196 | (2)
198 | (1)
199 | (1)
200 | (2)
A Second Neural Implementation of CCA   202 | (2)
204 | (2)
204 | (1)
205 | (1)
Linear Discriminant Analysis   206 | (1)
207 | (2)

Alternative Derivations of CCA Networks   209 | (8)
A Probabilistic Perspective   209 | (2)
Putting Priors on the Probabilities   210 | (1)
211 | (1)
A Model Derived from Becker's Model 1   212 | (3)
Who Is Telling the Truth?   213 | (1)
A Model Derived from Becker's Second Model   214 | (1)
215 | (2)

Kernel and Nonlinear Correlations   217 | (30)
217 | (4)
217 | (4)
The Search for Independence   221 | (4)
Using Minimum Correlation to Extract Independent Sources   222 | (1)
223 | (1)
223 | (2)
Kernel Canonical Correlation Analysis   225 | (9)
Kernel Principal Component Analysis   226 | (1)
Kernel Canonical Correlation Analysis   227 | (3)
230 | (1)
231 | (3)
Relevance Vector Regression   234 | (3)
237 | (1)
Appearance-Based Object Recognition   237 | (3)
Mixtures of Linear Correlations   240 | (7)
Many Locally Linear Correlations   240 | (1)
241 | (5)
246 | (1)

Exploratory Correlation Analysis   247 | (28)
Exploratory Correlation Analysis   247 | (4)
251 | (2)
251 | (1)
Dual Stream Blind Source Separation   252 | (1)
253 | (1)
254 | (2)
FastECA for Several Units   255 | (1)
Comparison of ECA and FastECA   256 | (1)
Local Filter Formation From Natural Stereo Images   256 | (10)
256 | (3)
Sparse Coding of Natural Images   259 | (1)
260 | (6)
Twinned Maximum Likelihood Learning   266 | (4)
Unmixing of Sound Signals   270 | (1)
271 | (4)

Multicollinearity and Partial Least Squares   275 | (16)
276 | (1)
276 | (4)
Relation to Partial Least Squares   279 | (1)
Extracting Multiple Canonical Correlations   280 | (1)
Experiments on Multicollinear Data   281 | (3)
281 | (1)
281 | (1)
281 | (3)
A Neural Implementation of Partial Least Squares   284 | (4)
Introducing Nonlinear Correlations   284 | (1)
285 | (1)
285 | (1)
Mixtures of Linear Neural PLS   286 | (1)
Nonlinear Neural PLS Regression   287 | (1)
288 | (3)
|
|
Twinned Principal Curves   291 | (18)
291 | (2)
Properties of Twinned Principal Curves   293 | (12)
Comparison with Single Principal Curves   293 | (2)
295 | (2)
297 | (2)
Termination Criteria: MSE   299 | (1)
Termination Criteria: Using Derivative Information   299 | (2)
Alternative Twinned Principal Curves   301 | (4)
Twinned Self-Organising Maps   305 | (2)
Predicting Student's Exam Marks   306 | (1)
307 | (2)

309 | (6)
309 | (3)
312 | (1)
312 | (3)

A Negative Feedback Artificial Neural Networks   315 | (8)
A.1 The Interneuron Model   315 | (2)
317 | (3)
318 | (1)
319 | (1)
A.3 Related Biological Models   320 | (3)

B Previous Factor Analysis Models   323 | (18)
B.1 Foldiak's Sixth Model   323 | (3)
B.1.1 Implementation Details   324 | (1)
325 | (1)
B.2 Competitive Hebbian Learning   326 | (1)
B.3 Multiple Cause Models   327 | (3)
328 | (2)
330 | (1)
B.4 Predictability Minimisation   330 | (2)
332 | (2)
334 | (1)
334 | (7)
B.6.1 Mixtures of Gaussians   335 | (2)
B.6.2 A Logistic Belief Network   337 | (1)
B.6.3 The Helmholtz Machine and the EM Algorithm   337 | (1)
B.6.4 The Wake-Sleep Algorithm   338 | (1)
B.6.5 Olshausen and Field's Sparse Coding Network   339 | (2)

341 | (12)
341 | (3)
C.1.1 An Example Separation   342 | (1)
C.1.2 Learning the Weights   343 | (1)
344 | (1)
C.2.1 Simulations and Discussion   345 | (1)
C.3 Information Maximisation   345 | (5)
C.3.1 The Learning Algorithm   347 | (3)
C.4 Penalised Minimum Reconstruction Error   350 | (1)
350 | (1)
351 | (2)
C.5.1 FastICA for One Unit   352 | (1)

D Previous Dual Stream Approaches   353 | (10)
353 | (2)
355 | (1)
356 | (2)
358 | (5)

363 | (8)
363 | (3)
363 | (1)
363 | (1)
364 | (1)
E.1.4 Random Dot Stereograms   365 | (1)
E.1.5 Nonlinear Manifolds   365 | (1)
366 | (5)
366 | (1)
366 | (1)
366 | (1)
366 | (1)
E.2.5 Children's Gait Data   367 | (1)
367 | (1)
368 | (1)
368 | (1)
368 | (1)
368 | (1)
368 | (3)

References   371 | (10)
Index   381