
E-book: Mathematical Aspects of Deep Learning

Edited by Philipp Grohs (Universität Wien, Austria) and Gitta Kutyniok (Ludwig-Maximilians-Universität München, Germany)
  • Format: PDF+DRM
  • Publication date: 22-Dec-2022
  • Publisher: Cambridge University Press
  • Language: eng
  • ISBN-13: 9781009035682
  • Price: 86.44 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions (this is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

In recent years the development of new classification and regression algorithms based on deep learning has led to a revolution in the fields of artificial intelligence, machine learning, and data analysis. The development of a theoretical foundation to guarantee the success of these algorithms constitutes one of the most active and exciting research topics in applied mathematics. This book presents the current mathematical understanding of deep learning methods from the point of view of the leading experts in the field. It serves both as a starting point for researchers and graduate students in computer science, mathematics, and statistics trying to get into the field and as an invaluable reference for future research.

The development of a theoretical foundation for deep learning methods constitutes one of the most active and exciting research topics in applied mathematics. Written by leading experts in the field, this book acts as a mathematical introduction to deep learning for researchers and graduate students trying to get into the field.

Other information

A mathematical introduction to deep learning, written by a group of leading experts in the field.
Contributors xiii
Preface xv
1 The Modern Mathematics of Deep Learning
1(111)
Julius Berner
Philipp Grohs
Gitta Kutyniok
Philipp Petersen
1.1 Introduction
1(30)
1.1.1 Notation
4(1)
1.1.2 Foundations of Learning Theory
5(18)
1.1.3 Do We Need a New Theory?
23(8)
1.2 Generalization of Large Neural Networks
31(10)
1.2.1 Kernel Regime
31(2)
1.2.2 Norm-Based Bounds and Margin Theory
33(2)
1.2.3 Optimization and Implicit Regularization
35(3)
1.2.4 Limits of Classical Theory and Double Descent
38(3)
1.3 The Role of Depth in the Expressivity of Neural Networks
41(8)
1.3.1 Approximation of Radial Functions
41(3)
1.3.2 Deep ReLU Networks
44(3)
1.3.3 Alternative Notions of Expressivity
47(2)
1.4 Deep Neural Networks Overcome the Curse of Dimensionality
49(8)
1.4.1 Manifold Assumption
49(2)
1.4.2 Random Sampling
51(2)
1.4.3 PDE Assumption
53(4)
1.5 Optimization of Deep Neural Networks
57(8)
1.5.1 Loss Landscape Analysis
57(4)
1.5.2 Lazy Training and Provable Convergence of Stochastic Gradient Descent
61(4)
1.6 Tangible Effects of Special Architectures
65(13)
1.6.1 Convolutional Neural Networks
66(2)
1.6.2 Residual Neural Networks
68(2)
1.6.3 Framelets and U-Nets
70(3)
1.6.4 Batch Normalization
73(3)
1.6.5 Sparse Neural Networks and Pruning
75
1.6.6 Recurrent Neural Networks
76(2)
1.7 Describing the Features that a Deep Neural Network Learns
78(3)
1.7.1 Invariances and the Scattering Transform
78(1)
1.7.2 Hierarchical Sparse Representations
79(2)
1.8 Effectiveness in Natural Sciences
81(31)
1.8.1 Deep Neural Networks Meet Inverse Problems
82(2)
1.8.2 PDE-Based Models
84(28)
2 Generalization in Deep Learning
112(21)
K. Kawaguchi
Y. Bengio
L. Kaelbling
2.1 Introduction
112(1)
2.2 Background
113(3)
2.3 Rethinking Generalization
116(5)
2.3.1 Consistency of Theory
118(1)
2.3.2 Differences in Assumptions and Problem Settings
119(2)
2.3.3 Practical Role of Generalization Theory
121(1)
2.4 Generalization Bounds via Validation
121(1)
2.5 Direct Analyses of Neural Networks
122(9)
2.5.1 Model Description via Deep Paths
123(2)
2.5.2 Theoretical Insights via Tight Theory for Every Pair (P, S)
125(2)
2.5.3 Probabilistic Bounds over Random Datasets
127(3)
2.5.4 Probabilistic Bound for 0-1 Loss with Multi-Labels
130(1)
2.6 Discussions and Open Problems
131(2)
Appendix A Additional Discussions
133(4)
A1 Simple Regularization Algorithm
133(2)
A2 Relationship to Other Fields
135(1)
A3 SGD Chooses Direction in Terms of w
135(1)
A4 Simple Implementation of Two-Phase Training Procedure
136(1)
A5 On Proposition 2.3
136(1)
A6 On Extensions
137(1)
Appendix B Experimental Details
137(1)
Appendix C Proofs
138
C1 Proof of Theorem 2.1
139(1)
C2 Proof of Corollary 2.2
139(1)
C3 Proof of Theorem 2.7
140(1)
C4 Proof of Theorem 2.9
141(1)
C5 Proof of Theorem 2.10
142(1)
C6 Proof of Proposition 2.5
143(6)
3 Expressivity of Deep Neural Networks
149(51)
Ingo Gühring
Mones Raslan
Gitta Kutyniok
3.1 Introduction
149(6)
3.1.1 Neural Networks
151(3)
3.1.2 Goal and Outline of this Chapter
154(1)
3.1.3 Notation
154(1)
3.2 Shallow Neural Networks
155(6)
3.2.1 Universality of Shallow Neural Networks
156(3)
3.2.2 Lower Complexity Bounds
159(1)
3.2.3 Upper Complexity Bounds
160(1)
3.3 Universality of Deep Neural Networks
161(2)
3.4 Approximation of Classes of Smooth Functions
163(4)
3.5 Approximation of Piecewise Smooth Functions
167(5)
3.6 Assuming More Structure
172(5)
3.6.1 Hierarchical Structure
172(2)
3.6.2 Assumptions on the Data Manifold
174(1)
3.6.3 Expressivity of Deep Neural Networks for Solutions of PDEs
175(2)
3.7 Deep Versus Shallow Neural Networks
177(3)
3.8 Special Neural Network Architectures and Activation Functions
180(20)
3.8.1 Convolutional Neural Networks
180(4)
3.8.2 Residual Neural Networks
184(1)
3.8.3 Recurrent Neural Networks
185(15)
4 Optimization Landscape of Neural Networks
200(29)
René Vidal
Zhihui Zhu
Benjamin D. Haeffele
4.1 Introduction
201(4)
4.2 Basics of Statistical Learning
205(1)
4.3 Optimization Landscape of Linear Networks
206(8)
4.3.1 Single-Hidden-Layer Linear Networks with Squared Loss and Fixed Size Regularization
207(5)
4.3.2 Deep Linear Networks with Squared Loss
212(2)
4.4 Optimization Landscape of Nonlinear Networks
214(11)
4.4.1 Motivating Example
215(6)
4.4.2 Positively Homogeneous Networks
221(4)
4.5 Conclusions
225(4)
5 Explaining the Decisions of Convolutional and Recurrent Neural Networks
229(38)
Wojciech Samek
Leila Arras
Ahmed Osman
Grégoire Montavon
Klaus-Robert Müller
5.1 Introduction
229(2)
5.2 Why Explainability?
231(2)
5.2.1 Practical Advantages of Explainability
231(1)
5.2.2 Social and Legal Role of Explainability
232(1)
5.2.3 Theoretical Insights Through Explainability
232(1)
5.3 From Explaining Linear Models to General Model Explainability
233(5)
5.3.1 Explainability of Linear Models
233(2)
5.3.2 Generalizing Explainability to Nonlinear Models
235(1)
5.3.3 Short Survey on Explanation Methods
236(2)
5.4 Layer-Wise Relevance Propagation
238(13)
5.4.1 LRP in Convolutional Neural Networks
239(3)
5.4.2 Theoretical Interpretation of the LRP Redistribution Process
242(6)
5.4.3 Extending LRP to LSTM Networks
248(3)
5.5 Explaining a Visual Question Answering Model
251(7)
5.6 Discussion
258(9)
6 Stochastic Feedforward Neural Networks: Universal Approximation
267(47)
Thomas Merkh
Guido Montúfar
6.1 Introduction
268(3)
6.2 Overview of Previous Works and Results
271(2)
6.3 Markov Kernels and Stochastic Networks
273(3)
6.3.1 Binary Probability Distributions and Markov Kernels
273(1)
6.3.2 Stochastic Feedforward Networks
274(2)
6.4 Results for Shallow Networks
276(2)
6.4.1 Fixed Weights in the Output Layer
277(1)
6.4.2 Trainable Weights in the Output Layer
278(1)
6.5 Proofs for Shallow Networks
278(8)
6.5.1 Fixed Weights in the Output Layer
279(4)
6.5.2 Trainable Weights in the Second Layer
283(2)
6.5.3 Discussion of the Proofs for Shallow Networks
285(1)
6.6 Results for Deep Networks
286(3)
6.6.1 Parameter Count
288(1)
6.6.2 Approximation with Finite Weights and Biases
288(1)
6.7 Proofs for Deep Networks
289(10)
6.7.1 Notation
289(1)
6.7.2 Probability Mass Sharing
290(3)
6.7.3 Universal Approximation
293(3)
6.7.4 Error Analysis for Finite Weights and Biases
296(2)
6.7.5 Discussion of the Proofs for Deep Networks
298(1)
6.8 Lower Bounds for Shallow and Deep Networks
299(3)
6.8.1 Parameter Counting Lower Bounds
299(2)
6.8.2 Minimum Width
301(1)
6.9 A Numerical Example
302(4)
6.10 Conclusion
306(1)
6.11 Open Problems
307(7)
7 Deep Learning as Sparsity-Enforcing Algorithms
314(24)
A. Aberdam
J. Sulam
7.1 Introduction
314(2)
7.2 Related Work
316(1)
7.3 Background
317(3)
7.4 Multilayer Sparse Coding
320(4)
7.4.1 ML-SC Pursuit and the Forward Pass
321(2)
7.4.2 ML-SC: A Projection Approach
323(1)
7.5 The Holistic Way
324(3)
7.6 Multilayer Iterative Shrinkage Algorithms
327(5)
7.6.1 Towards Principled Recurrent Neural Networks
329(3)
7.7 Final Remarks and Outlook
332(6)
8 The Scattering Transform
338(62)
Joan Bruna
8.1 Introduction
338(1)
8.2 Geometric Stability
339(7)
8.2.1 Euclidean Geometric Stability
340(1)
8.2.2 Representations with Euclidean Geometric Stability
341(1)
8.2.3 Non-Euclidean Geometric Stability
342(1)
8.2.4 Examples
343(3)
8.3 Scattering on the Translation Group
346(17)
8.3.1 Windowed Scattering Transform
346(3)
8.3.2 Scattering Metric and Energy Conservation
349(2)
8.3.3 Local Translation Invariance and Lipschitz Continuity with Respect to Deformations
351(3)
8.3.4 Algorithms
354(3)
8.3.5 Empirical Analysis of Scattering Properties
357(5)
8.3.6 Scattering in Modern Computer Vision
362(1)
8.4 Scattering Representations of Stochastic Processes
363(8)
8.4.1 Expected Scattering
363(4)
8.4.2 Analysis of Stationary Textures with Scattering
367(2)
8.4.3 Multifractal Analysis with Scattering Moments
369(2)
8.5 Non-Euclidean Scattering
371(13)
8.5.1 Joint versus Separable Scattering
372(1)
8.5.2 Scattering on Global Symmetry Groups
372(3)
8.5.3 Graph Scattering
375(8)
8.5.4 Manifold Scattering
383(1)
8.6 Generative Modeling with Scattering
384(9)
8.6.1 Sufficient Statistics
384(1)
8.6.2 Microcanonical Scattering Models
385(2)
8.6.3 Gradient Descent Scattering Reconstruction
387(2)
8.6.4 Regularising Inverse Problems with Scattering
389(2)
8.6.5 Texture Synthesis with Microcanonical Scattering
391(2)
8.7 Final Remarks
393(7)
9 Deep Generative Models and Inverse Problems
400(22)
Alexandros G. Dimakis
9.1 Introduction
400(1)
9.2 How to Tame High Dimensions
401(5)
9.2.1 Sparsity
401(1)
9.2.2 Conditional Independence
402(1)
9.2.3 Deep Generative Models
403(1)
9.2.4 GANs and VAEs
404(1)
9.2.5 Invertible Generative Models
405(1)
9.2.6 Untrained Generative Models
405(1)
9.3 Linear Inverse Problems Using Deep Generative Models
406(8)
9.3.1 Reconstruction from Gaussian Measurements
407(2)
9.3.2 Optimization Challenges
409(1)
9.3.3 Extending the Range of the Generator
410(1)
9.3.4 Non-Linear Inverse Problems
410(2)
9.3.5 Inverse Problems with Untrained Generative Priors
412(2)
9.4 Supervised Methods for Inverse Problems
414(8)
10 Dynamical Systems and Optimal Control Approach to Deep Learning
422(17)
Weinan E
Jiequn Han
Qianxiao Li
10.1 Introduction
422(2)
10.1.1 The Problem of Supervised Learning
423(1)
10.2 ODE Formulation
424(1)
10.3 Mean-Field Optimal Control and Pontryagin's Maximum Principle
425(3)
10.3.1 Pontryagin's Maximum Principle
426(2)
10.4 Method of Successive Approximations
428(7)
10.4.1 Extended Pontryagin Maximum Principle
428(1)
10.4.2 The Basic Method of Successive Approximation
428(3)
10.4.3 Extended Method of Successive Approximation
431(2)
10.4.4 Discrete PMP and Discrete MSA
433(2)
10.5 Future Work
435(4)
11 Bridging Many-Body Quantum Physics and Deep Learning via Tensor Networks
439
Yoav Levine
Or Sharir
Nadav Cohen
Amnon Shashua
11.1 Introduction
440(2)
11.2 Preliminaries - Many-Body Quantum Physics
442(8)
11.2.1 The Many-Body Quantum Wave Function
443(1)
11.2.2 Quantum Entanglement Measures
444(3)
11.2.3 Tensor Networks
447(3)
11.3 Quantum Wave Functions and Deep Learning Architectures
450(3)
11.3.1 Convolutional and Recurrent Networks as Wave Functions
450(3)
11.3.2 Tensor Network Representations of Convolutional and Recurrent Networks
453(1)
11.4 Deep Learning Architecture Design via Entanglement Measures
453(7)
11.4.1 Dependencies via Entanglement Measures
454(2)
11.4.2 Quantum-Physics-Inspired Control of Inductive Bias
456(4)
11.5 Power of Deep Learning for Wave Function Representations
460(7)
11.5.1 Entanglement Scaling of Deep Recurrent Networks
461(2)
11.5.2 Entanglement Scaling of Overlapping Convolutional Networks
463(4)
11.6 Discussion
467
Philipp Grohs is Professor of Applied Mathematics at the University of Vienna and Group Leader of Mathematical Data Science at the Austrian Academy of Sciences. Gitta Kutyniok is Bavarian AI Chair for Mathematical Foundations of Artificial Intelligence at Ludwig-Maximilians-Universität München and Adjunct Professor for Machine Learning at the University of Tromsø.