Muutke küpsiste eelistusi

Approximate Iterative Algorithms [Pehme köide]

  • Formaat: Paperback / softback, 372 pages, kõrgus x laius: 246x174 mm, kaal: 453 g
  • Ilmumisaeg: 10-Oct-2019
  • Kirjastus: CRC Press
  • ISBN-10: 0367378884
  • ISBN-13: 9780367378882
  • Formaat: Paperback / softback, 372 pages, kõrgus x laius: 246x174 mm, kaal: 453 g
  • Ilmumisaeg: 10-Oct-2019
  • Kirjastus: CRC Press
  • ISBN-10: 0367378884
  • ISBN-13: 9780367378882
Iterative algorithms often rely on approximate evaluation techniques, which may include statistical estimation, computer simulation or functional approximation. This volume presents methods for the study of approximate iterative algorithms, providing tools for the derivation of error bounds and convergence rates, and for the optimal design of such algorithms. Techniques of functional analysis are used to derive analytical relationships between approximation methods and convergence properties for general classes of algorithms. This work provides the necessary background in functional analysis and probability theory. Extensive applications to Markov decision processes are presented.





This volume is intended for mathematicians, engineers and computer scientists, who work on learning processes in numerical analysis and are involved with optimization, optimal control, decision analysis and machine learning.
1 Introduction
1(2)
Part I Mathematical background 3(186)
2 Real analysis and linear algebra
5(22)
2.1 Definitions and notation
5(11)
2.1.1 Numbers, sets and vectors
5(1)
2.1.2 Logical notation
6(1)
2.1.3 Set algebra
6(1)
2.1.4 The supremum and infimum
7(1)
2.1.5 Rounding off
7(1)
2.1.6 Functions
7(1)
2.1.7 Sequences and limits
8(1)
2.1.8 Infinite series
9(2)
2.1.9 Geometric series
11(1)
2.1.10 Classes of real valued functions
11(1)
2.1.11 Graphs
12(1)
2.1.12 The binomial coefficient
13(1)
2.1.13 Stirling's approximation of the factorial
13(1)
2.1.14 L'Hopital's rule
14(1)
2.1.15 Taylor's theorem
14(1)
2.1.16 The lp norm
14(1)
2.1.17 Power means
15(1)
2.2 Equivalence relationships
16(1)
2.3 Linear algebra
16(11)
2.3.1 Matrices
16(2)
2.3.2 Eigenvalues and spectral decomposition
18(3)
2.3.3 Symmetric, Hermitian and positive definite matrices
21(1)
2.3.4 Positive matrices
22(2)
2.3.5 Stochastic matrices
24(1)
2.3.6 Nonnegative matrices and graph structure
25(2)
3 Background - measure theory
27(24)
3.1 Topological spaces
27(3)
3.1.1 Bases of topologies
29(1)
3.1.2 Metric space topologies
29(1)
3.2 Measure spaces
30(11)
3.2.1 Formal construction of measures
31(2)
3.2.2 Completion of measures
33(1)
3.2.3 Outer measure
34(1)
3.2.4 Extension of measures
35(1)
3.2.5 Counting measure
36(1)
3.2.6 Lebesgue measure
36(1)
3.2.7 Borel sets
36(1)
3.2.8 Dynkin system theorem
37(1)
3.2.9 Signed measures
37(1)
3.2.10 Decomposition of measures
38(1)
3.2.11 Measurable functions
39(2)
3.3 Integration
41(3)
3.3.1 Convergence of integrals
42(1)
3.3.2 Lp spaces
43(1)
3.3.3 Radon-Nikodym derivative
43(1)
3.4 Product spaces
44(7)
3.4.1 Product topologies
44(1)
3.4.2 Product measures
45(4)
3.4.3 The Kolmogorov extension theorem
49(2)
4 Background - probability theory
51(46)
4.1 Probability measures - basic properties
52(7)
4.2 Moment generating functions (MGF) and cumulant generating functions (CGF)
59(4)
4.2.1 Moments and cumulants
61(1)
4.2.2 MGF and CGF of independent sums
62(1)
4.2.3 Relationship of the CGF to the normal distribution
62(1)
4.2.4 Probability generating functions
63(1)
4.3 Conditional distributions
63(3)
4.4 Martingales
66(2)
4.4.1 Stopping times
68(1)
4.5 Some important theorems
68(2)
4.6 Inequalities for tail probabilities
70(4)
4.6.1 Chernoff bounds
71(1)
4.6.2 Chernoff bound for the normal distribution
71(1)
4.6.3 Chernoff bound for the gamma distribution
72(1)
4.6.4 Sample means
72(1)
4.6.5 Some inequalities for bounded random variables
73(1)
4.7 Stochastic ordering
74(3)
4.7.1 MGF ordering of the gamma and exponential distribution
75(1)
4.7.2 Improved bounds based on hazard functions
76(1)
4.8 Theory of stochastic limits
77(5)
4.8.1 Covergence of random variables
77(1)
4.8.2 Convergence of measures
78(1)
4.8.3 Total variation norm
79(3)
4.9 Stochastic kernels
82(3)
4.9.1 Measurability of measure kernels
83(1)
4.9.2 Continuity of measure kernels
84(1)
4.10 Convergence of sums
85(1)
4.11 The law of large numbers
86(2)
4.12 Extreme value theory
88(1)
4.13 Maximum likelihood estimation
89(2)
4.14 Nonparametric estimates of distributions
91(1)
4.15 Total variation distance for discrete distributions
92(5)
5 Background - stochastic processes
97(28)
5.1 Counting processes
97(3)
5.1.1 Renewal processes
98(1)
5.1.2 Poisson process
99(1)
5.2 Markov processes
100(11)
5.2.1 Discrete state spaces
101(1)
5.2.2 Global properties of Markov chains
102(4)
5.2.3 General state spaces
106(3)
5.2.4 Geometric ergodicity
109(2)
5.2.5 Spectral properties of Markov chains
111(1)
5.3 Continuous-time Markov chains
111(3)
5.3.1 Birth and death processes
114(1)
5.4 Queueing systems
114(4)
5.4.1 Queueing systems as birth and death processes
115(1)
5.4.2 Utilization factor
116(1)
5.4.3 General queueing systems and embedded Markov chains
116(2)
5.5 Adapted counting processes
118(7)
5.5.1 Asymptotic behavior
119(4)
5.5.2 Relationship to adapted events
123(2)
6 Functional analysis
125(32)
6.1 Metric spaces
126(2)
6.1.1 Contractive mappings
126(2)
6.2 The Banach fixed point theorem
128(3)
6.2.1 Stopping rules for fixed point algorithms
130(1)
6.3 Vector spaces
131(3)
6.3.1 Quotient spaces
133(1)
6.3.2 Basis of a vector space
133(1)
6.3.3 Operators
133(1)
6.4 Banach spaces
134(5)
6.4.1 Banach spaces and completeness
136(1)
6.4.2 Linear operators
137(2)
6.5 Norms and norm equivalence
139(3)
6.5.1 Norm dominance
140(1)
6.5.2 Equivalence properties of norm equivalence classes
140(2)
6.6 Quotient spaces and seminorms
142(2)
6.7 Hilbert spaces
144(2)
6.8 Examples of Banach spaces
146(7)
6.8.1 Finite dimensional spaces
146(1)
6.8.2 Matrix norms and the submultiplicative property
147(1)
6.8.3 Weighted norms on function spaces
147(2)
6.8.4 Span seminorms
149(2)
6.8.5 Operators on span quotient spaces
151(2)
6.9 Measure kernels as linear operators
153(4)
6.9.1 The contraction property of stochastic kernels
153(1)
6.9.2 Stochastic kernels and the span seminorm
153(4)
7 Fixed point equations
157(16)
7.1 Contraction as a norm equivalence property
158(2)
7.2 Linear fixed point equations
160(1)
7.3 The geometric series theorem
161(2)
7.4 Invariant transformations of fixed point equations
163(1)
7.5 Fixed point algorithms and the span seminorm
164(4)
7.5.1 Approximations in the span seminorm
166(1)
7.5.2 Magnitude of fixed points in the span seminorm
167(1)
7.6 Stopping rules for fixed point algorithms
168(1)
7.6.1 Fixed point iteration in the span seminorm
169(1)
7.7 Perturbations of fixed point equations
169(4)
8 The distribution of a maximum
173(16)
8.1 General approach
174(1)
8.2 Bounds on M based on MGFs
174(4)
8.2.1 Sample means
176(1)
8.2.2 Gamma distribution
177(1)
8.3 Bounds for varying marginal distributions
178(3)
8.3.1 Example
180(1)
8.4 Tail probabilities of maxima
181(4)
8.4.1 Extreme value distributions
182(1)
8.4.2 Tail probabilities based on Boole's inequality
182(1)
8.4.3 The normal case
183(1)
8.4.4 The gamma(α, λ) case
184(1)
8.5 Variance mixtures based on random sample sizes
185(1)
8.6 Bounds for maxima based on the first two moments
186(5)
8.6.1 Stability
188(1)
Part II General theory of approximate iterative algorithms 189(58)
9 Background - linear convergence
191(8)
9.1 Linear convergence
191(3)
9.2 Construction of envelopes - the nonstochastic case
194(2)
9.3 Construction of envelopes - the stochastic case
196(1)
9.4 A version of l'Hopital's rule for series
196(3)
10 A general theory of approximate iterative algorithms (AIA)
199(40)
10.1 A general tolerance model
201(1)
10.2 Example: a preliminary model
201(1)
10.3 Model elements of an AIA
202(2)
10.3.1 Lipschitz kernels
202(1)
10.3.2 Lipschitz convolutions
203(1)
10.4 A classification system for AIAs
204(4)
10.4.1 Relative error model
206(2)
10.5 General inequalities
208(5)
10.5.1 Hilbert space models of AIAs
209(4)
10.6 Nonexpansive operators
213(7)
10.6.1 Application of general inequalities to nonexpansive AIAs
214(2)
10.6.2 Weakly contractive AIAs
216(1)
10.6.3 Examples
216(2)
10.6.4 Stochastic approximation (Robbins-Monro algorithm)
218(2)
10.7 Rates of convergence for AIAs
220(10)
10.7.1 Monotonicity of the Lipschitz kernel
221(1)
10.7.2 Case I - strongly contractive models with nonvanishing bounds
221(1)
10.7.3 Case II - rapidly vanishing approximation error
222(1)
10.7.4 Case III - approximation error decreasing at contraction rate
223(1)
10.7.5 Case IV - Approximation error greater than contraction rate
224(1)
10.7.6 Case V - Contraction rates approaching 1
224(3)
10.7.7 Adjustments for relative error models
227(1)
10.7.8 A comparison of Banach space and Hilbert space models
228(2)
10.8 Stochastic approximation as a weakly contractive algorithm
230(1)
10.9 Tightness of algorithm tolerance
231(1)
10.10 Finite bounds
232(3)
10.10.1 Numerical example
233(2)
10.11 Summary of convergence rates for strongly contractive models
235(4)
11 Selection of approximation schedules for coarse-to-fine AlAs
239(8)
11.1 Extending the tolerance model
239(3)
11.1.1 Comparison model for tolerance schedules
241(1)
11.1.2 Regularity conditions for the computation function
242(1)
11.2 Main result
242(1)
11.3 Examples of cost functions
243(2)
11.4 A general principle for AIAs
245(2)
Part III Application to Markov decision processes 247(104)
12 Markov decision processes (MDP) - background
249(30)
12.1 Model definition
250(3)
12.2 The optimal control problem
253(2)
12.2.1 Adaptive control policies
253(1)
12.2.2 Optimal control policies
253(2)
12.3 Dynamic programming and linear operators
255(6)
12.3.1 The dynamic programming operator (DPO)
256(1)
12.3.2 Finite horizon dynamic programming
257(1)
12.3.3 Infinite horizon problem
258(1)
12.3.4 Classes of MDP
259(1)
12.3.5 Measurability of the DPO
260(1)
12.4 Dynamic programming and value iteration
261(4)
12.4.1 Value iteration and optimality
262(3)
12.5 Regret and s-optimal solutions
265(2)
12.6 Banach space structure of dynamic programming
267(7)
12.6.1 The contraction property
269(1)
12.6.2 Contraction properties of the DPO
269(3)
12.6.3 The equivalence of uniform convergence and contraction for the DPO
272(2)
12.7 Average cost criterion for MDP
274(5)
13 Markov decision processes - value iteration
279(16)
13.1 Value iteration on quotient spaces
279(2)
13.2 Contraction in the span seminorm
281(2)
13.2.1 Contraction properties of the DPO
281(2)
13.3 Stopping rules for value iteration
283(1)
13.4 Value iteration in the span seminorm
283(1)
13.5 Example: M/D/1/K queueing system
284(4)
13.6 Efficient calculation of |||QP||| SP
288(3)
13.7 Example: M/D/1/K system with optimal control of service capacity
291(1)
13.8 Policy iteration
292(1)
13.9 Value iteration for the average cost optimization
293(2)
14 Model approximation in dynamic programming - general theory
295(14)
14.1 The general inequality for MDPs
295(3)
14.2 Model distance
298(2)
14.3 Regret
300(2)
14.4 A comment on the approximation of regret
302(2)
14.5 Example
304(5)
15 Sampling based approximation methods
309(18)
15.1 Modeling maxima
310(10)
15.1.1 Nonuniform sample allocation: Dependence on qmin, and the 'Curse of the Supremum Norm'
313(1)
15.1.2 Some queueing system examples
314(1)
15.1.3 Truncated geometric model
315(1)
15.1.4 M/G/1/K queueing model
316(2)
15.1.5 Restarting schemes
318(2)
15.2 Continuous state/action spaces
320(1)
15.3 Parametric estimation of MDP models
321(6)
16 Approximate value iteration by truncation
327(6)
16.1 Truncation algorithm
328(1)
16.2 Regularity conditions for tolerance-cost model
329(1)
16.2.1 Suboptimal orderings
329(1)
16.3 Example
330(3)
17 Grid approximations of MDPs with continuous state/action spaces
333(8)
17.1 Discretization methods
333(2)
17.2 Complexity analysis
335(1)
17.3 Application of approximation schedules
336(5)
18 Adaptive control of MDPs
341(10)
18.1 Regret bounds for adaptive policies
342(1)
18.2 Definition of an adaptive MDP
343(2)
18.3 Online parameter estimation
345(2)
18.4 Exploration schedule
347(4)
Bibliography 351(6)
Subject index 357
Dr. Almudevar was born in Halifax and raised in Ontario, Canada. He completed a PhD in Statistics at the University of Toronto, and is currently a faculty member in the Department of Biostatistics and Computational Biology at the University of Rochester. He has a wide range of interests, which include biological network modeling, analysis of genetic data, immunological modeling and clinical applications of technological home monitoring. He has a more general interest in optimization and control theory, with an emphasis on the computational issues associated with Markov decision processes.