Klienditugi: 7440010 (E-R 10-18)

Abi | Registreeri | Logi sisse

E-raamat: Approximate Iterative Algorithms [Taylor & Francis e-raamat]

Anthony Louis Almudevar

Formaat: 372 pages
Ilmumisaeg: 10-Oct-2019
Kirjastus: CRC Press
ISBN-13: 9780429210853

Teised raamatud teemal:

Taylor & Francis e-raamat
Hind: 170,80 €*
* hind, mis tagab piiramatu üheaegsete kasutajate arvuga ligipääsu piiramatuks ajaks
Tavahind: 244,00 €
Säästad 30%

Formaat: 372 pages
Ilmumisaeg: 10-Oct-2019
Kirjastus: CRC Press
ISBN-13: 9780429210853

Teised raamatud teemal:

Rohkem infot Taylor & Francis e-raamatute kohta

Raamatu kodulehekülg: https://www.taylorfrancis.com/books/9780429210853

Iterative algorithms often rely on approximate evaluation techniques, which may include statistical estimation, computer simulation or functional approximation. This volume presents methods for the study of approximate iterative algorithms, providing tools for the derivation of error bounds and convergence rates, and for the optimal design of such algorithms. Techniques of functional analysis are used to derive analytical relationships between approximation methods and convergence properties for general classes of algorithms. This work provides the necessary background in functional analysis and probability theory. Extensive applications to Markov decision processes are presented.

This volume is intended for mathematicians, engineers and computer scientists, who work on learning processes in numerical analysis and are involved with optimization, optimal control, decision analysis and machine learning.

1 Introduction

(2)

Part I Mathematical background

(186)

2 Real analysis and linear algebra

(22)

2.1 Definitions and notation

(11)

2.1.1 Numbers, sets and vectors

(1)

2.1.2 Logical notation

(1)

2.1.3 Set algebra

(1)

2.1.4 The supremum and infimum

(1)

2.1.5 Rounding off

(1)

2.1.6 Functions

(1)

2.1.7 Sequences and limits

(1)

2.1.8 Infinite series

(2)

2.1.9 Geometric series

(1)

2.1.10 Classes of real valued functions

(1)

2.1.11 Graphs

(1)

2.1.12 The binomial coefficient

(1)

2.1.13 Stirling's approximation of the factorial

(1)

2.1.14 L'Hopital's rule

(1)

2.1.15 Taylor's theorem

(1)

2.1.16 The lp norm

(1)

2.1.17 Power means

(1)

2.2 Equivalence relationships

(1)

2.3 Linear algebra

(11)

2.3.1 Matrices

(2)

2.3.2 Eigenvalues and spectral decomposition

(3)

2.3.3 Symmetric, Hermitian and positive definite matrices

(1)

2.3.4 Positive matrices

(2)

2.3.5 Stochastic matrices

(1)

2.3.6 Nonnegative matrices and graph structure

(2)

3 Background - measure theory

(24)

3.1 Topological spaces

(3)

3.1.1 Bases of topologies

(1)

3.1.2 Metric space topologies

(1)

3.2 Measure spaces

(11)

3.2.1 Formal construction of measures

(2)

3.2.2 Completion of measures

(1)

3.2.3 Outer measure

(1)

3.2.4 Extension of measures

(1)

3.2.5 Counting measure

(1)

3.2.6 Lebesgue measure

(1)

3.2.7 Borel sets

(1)

3.2.8 Dynkin system theorem

(1)

3.2.9 Signed measures

(1)

3.2.10 Decomposition of measures

(1)

3.2.11 Measurable functions

(2)

3.3 Integration

(3)

3.3.1 Convergence of integrals

(1)

3.3.2 Lp spaces

(1)

3.3.3 Radon-Nikodym derivative

(1)

3.4 Product spaces

(7)

3.4.1 Product topologies

(1)

3.4.2 Product measures

(4)

3.4.3 The Kolmogorov extension theorem

(2)

4 Background - probability theory

(46)

4.1 Probability measures - basic properties

(7)

4.2 Moment generating functions (MGF) and cumulant generating functions (CGF)

(4)

4.2.1 Moments and cumulants

(1)

4.2.2 MGF and CGF of independent sums

(1)

4.2.3 Relationship of the CGF to the normal distribution

(1)

4.2.4 Probability generating functions

(1)

4.3 Conditional distributions

(3)

4.4 Martingales

(2)

4.4.1 Stopping times

(1)

4.5 Some important theorems

(2)

4.6 Inequalities for tail probabilities

(4)

4.6.1 Chernoff bounds

(1)

4.6.2 Chernoff bound for the normal distribution

(1)

4.6.3 Chernoff bound for the gamma distribution

(1)

4.6.4 Sample means

(1)

4.6.5 Some inequalities for bounded random variables

(1)

4.7 Stochastic ordering

(3)

4.7.1 MGF ordering of the gamma and exponential distribution

(1)

4.7.2 Improved bounds based on hazard functions

(1)

4.8 Theory of stochastic limits

(5)

4.8.1 Covergence of random variables

(1)

4.8.2 Convergence of measures

(1)

4.8.3 Total variation norm

(3)

4.9 Stochastic kernels

(3)

4.9.1 Measurability of measure kernels

(1)

4.9.2 Continuity of measure kernels

(1)

4.10 Convergence of sums

(1)

4.11 The law of large numbers

(2)

4.12 Extreme value theory

(1)

4.13 Maximum likelihood estimation

(2)

4.14 Nonparametric estimates of distributions

(1)

4.15 Total variation distance for discrete distributions

(5)

5 Background - stochastic processes

(28)

5.1 Counting processes

(3)

5.1.1 Renewal processes

(1)

5.1.2 Poisson process

(1)

5.2 Markov processes

100

(11)

5.2.1 Discrete state spaces

101

(1)

5.2.2 Global properties of Markov chains

102

(4)

5.2.3 General state spaces

106

(3)

5.2.4 Geometric ergodicity

109

(2)

5.2.5 Spectral properties of Markov chains

111

(1)

5.3 Continuous-time Markov chains

111

(3)

5.3.1 Birth and death processes

114

(1)

5.4 Queueing systems

114

(4)

5.4.1 Queueing systems as birth and death processes

115

(1)

5.4.2 Utilization factor

116

(1)

5.4.3 General queueing systems and embedded Markov chains

116

(2)

5.5 Adapted counting processes

118

(7)

5.5.1 Asymptotic behavior

119

(4)

5.5.2 Relationship to adapted events

123

(2)

6 Functional analysis

125

(32)

6.1 Metric spaces

126

(2)

6.1.1 Contractive mappings

126

(2)

6.2 The Banach fixed point theorem

128

(3)

6.2.1 Stopping rules for fixed point algorithms

130

(1)

6.3 Vector spaces

131

(3)

6.3.1 Quotient spaces

133

(1)

6.3.2 Basis of a vector space

133

(1)

6.3.3 Operators

133

(1)

6.4 Banach spaces

134

(5)

6.4.1 Banach spaces and completeness

136

(1)

6.4.2 Linear operators

137

(2)

6.5 Norms and norm equivalence

139

(3)

6.5.1 Norm dominance

140

(1)

6.5.2 Equivalence properties of norm equivalence classes

140

(2)

6.6 Quotient spaces and seminorms

142

(2)

6.7 Hilbert spaces

144

(2)

6.8 Examples of Banach spaces

146

(7)

6.8.1 Finite dimensional spaces

146

(1)

6.8.2 Matrix norms and the submultiplicative property

147

(1)

6.8.3 Weighted norms on function spaces

147

(2)

6.8.4 Span seminorms

149

(2)

6.8.5 Operators on span quotient spaces

151

(2)

6.9 Measure kernels as linear operators

153

(4)

6.9.1 The contraction property of stochastic kernels

153

(1)

6.9.2 Stochastic kernels and the span seminorm

153

(4)

7 Fixed point equations

157

(16)

7.1 Contraction as a norm equivalence property

158

(2)

7.2 Linear fixed point equations

160

(1)

7.3 The geometric series theorem

161

(2)

7.4 Invariant transformations of fixed point equations

163

(1)

7.5 Fixed point algorithms and the span seminorm

164

(4)

7.5.1 Approximations in the span seminorm

166

(1)

7.5.2 Magnitude of fixed points in the span seminorm

167

(1)

7.6 Stopping rules for fixed point algorithms

168

(1)

7.6.1 Fixed point iteration in the span seminorm

169

(1)

7.7 Perturbations of fixed point equations

169

(4)

8 The distribution of a maximum

173

(16)

8.1 General approach

174

(1)

8.2 Bounds on M based on MGFs

174

(4)

8.2.1 Sample means

176

(1)

8.2.2 Gamma distribution

177

(1)

8.3 Bounds for varying marginal distributions

178

(3)

8.3.1 Example

180

(1)

8.4 Tail probabilities of maxima

181

(4)

8.4.1 Extreme value distributions

182

(1)

8.4.2 Tail probabilities based on Boole's inequality

182

(1)

8.4.3 The normal case

183

(1)

8.4.4 The gamma(α, λ) case

184

(1)

8.5 Variance mixtures based on random sample sizes

185

(1)

8.6 Bounds for maxima based on the first two moments

186

(5)

8.6.1 Stability

188

(1)

Part II General theory of approximate iterative algorithms

189

(58)

9 Background - linear convergence

191

(8)

9.1 Linear convergence

191

(3)

9.2 Construction of envelopes - the nonstochastic case

194

(2)

9.3 Construction of envelopes - the stochastic case

196

(1)

9.4 A version of l'Hopital's rule for series

196

(3)

10 A general theory of approximate iterative algorithms (AIA)

199

(40)

10.1 A general tolerance model

201

(1)

10.2 Example: a preliminary model

201

(1)

10.3 Model elements of an AIA

202

(2)

10.3.1 Lipschitz kernels

202

(1)

10.3.2 Lipschitz convolutions

203

(1)

10.4 A classification system for AIAs

204

(4)

10.4.1 Relative error model

206

(2)

10.5 General inequalities

208

(5)

10.5.1 Hilbert space models of AIAs

209

(4)

10.6 Nonexpansive operators

213

(7)

10.6.1 Application of general inequalities to nonexpansive AIAs

214

(2)

10.6.2 Weakly contractive AIAs

216

(1)

10.6.3 Examples

216

(2)

10.6.4 Stochastic approximation (Robbins-Monro algorithm)

218

(2)

10.7 Rates of convergence for AIAs

220

(10)

10.7.1 Monotonicity of the Lipschitz kernel

221

(1)

10.7.2 Case I - strongly contractive models with nonvanishing bounds

221

(1)

10.7.3 Case II - rapidly vanishing approximation error

222

(1)

10.7.4 Case III - approximation error decreasing at contraction rate

223

(1)

10.7.5 Case IV - Approximation error greater than contraction rate

224

(1)

10.7.6 Case V - Contraction rates approaching 1

224

(3)

10.7.7 Adjustments for relative error models

227

(1)

10.7.8 A comparison of Banach space and Hilbert space models

228

(2)

10.8 Stochastic approximation as a weakly contractive algorithm

230

(1)

10.9 Tightness of algorithm tolerance

231

(1)

10.10 Finite bounds

232

(3)

10.10.1 Numerical example

233

(2)

10.11 Summary of convergence rates for strongly contractive models

235

(4)

11 Selection of approximation schedules for coarse-to-fine AlAs

239

(8)

11.1 Extending the tolerance model

239

(3)

11.1.1 Comparison model for tolerance schedules

241

(1)

11.1.2 Regularity conditions for the computation function

242

(1)

11.2 Main result

242

(1)

11.3 Examples of cost functions

243

(2)

11.4 A general principle for AIAs

245

(2)

Part III Application to Markov decision processes

247

(104)

12 Markov decision processes (MDP) - background

249

(30)

12.1 Model definition

250

(3)

12.2 The optimal control problem

253

(2)

12.2.1 Adaptive control policies

253

(1)

12.2.2 Optimal control policies

253

(2)

12.3 Dynamic programming and linear operators

255

(6)

12.3.1 The dynamic programming operator (DPO)

256

(1)

12.3.2 Finite horizon dynamic programming

257

(1)

12.3.3 Infinite horizon problem

258

(1)

12.3.4 Classes of MDP

259

(1)

12.3.5 Measurability of the DPO

260

(1)

12.4 Dynamic programming and value iteration

261

(4)

12.4.1 Value iteration and optimality

262

(3)

12.5 Regret and s-optimal solutions

265

(2)

12.6 Banach space structure of dynamic programming

267

(7)

12.6.1 The contraction property

269

(1)

12.6.2 Contraction properties of the DPO

269

(3)

12.6.3 The equivalence of uniform convergence and contraction for the DPO

272

(2)

12.7 Average cost criterion for MDP

274

(5)

13 Markov decision processes - value iteration

279

(16)

13.1 Value iteration on quotient spaces

279

(2)

13.2 Contraction in the span seminorm

281

(2)

13.2.1 Contraction properties of the DPO

281

(2)

13.3 Stopping rules for value iteration

283

(1)

13.4 Value iteration in the span seminorm

283

(1)

13.5 Example: M/D/1/K queueing system

284

(4)

13.6 Efficient calculation of |||QP||| SP

288

(3)

13.7 Example: M/D/1/K system with optimal control of service capacity

291

(1)

13.8 Policy iteration

292

(1)

13.9 Value iteration for the average cost optimization

293

(2)

14 Model approximation in dynamic programming - general theory

295

(14)

14.1 The general inequality for MDPs

295

(3)

14.2 Model distance

298

(2)

14.3 Regret

300

(2)

14.4 A comment on the approximation of regret

302

(2)

14.5 Example

304

(5)

15 Sampling based approximation methods

309

(18)

15.1 Modeling maxima

310

(10)

15.1.1 Nonuniform sample allocation: Dependence on qmin, and the 'Curse of the Supremum Norm'

313

(1)

15.1.2 Some queueing system examples

314

(1)

15.1.3 Truncated geometric model

315

(1)

15.1.4 M/G/1/K queueing model

316

(2)

15.1.5 Restarting schemes

318

(2)

15.2 Continuous state/action spaces

320

(1)

15.3 Parametric estimation of MDP models

321

(6)

16 Approximate value iteration by truncation

327

(6)

16.1 Truncation algorithm

328

(1)

16.2 Regularity conditions for tolerance-cost model

329

(1)

16.2.1 Suboptimal orderings

329

(1)

16.3 Example

330

(3)

17 Grid approximations of MDPs with continuous state/action spaces

333

(8)

17.1 Discretization methods

333

(2)

17.2 Complexity analysis

335

(1)

17.3 Application of approximation schedules

336

(5)

18 Adaptive control of MDPs

341

(10)

18.1 Regret bounds for adaptive policies

342

(1)

18.2 Definition of an adaptive MDP

343

(2)

18.3 Online parameter estimation

345

(2)

18.4 Exploration schedule

347

(4)

Bibliography

351

(6)

Subject index

357

Dr. Almudevar was born in Halifax and raised in Ontario, Canada. He completed a PhD in Statistics at the University of Toronto, and is currently a faculty member in the Department of Biostatistics and Computational Biology at the University of Rochester. He has a wide range of interests, which include biological network modeling, analysis of genetic data, immunological modeling and clinical applications of technological home monitoring. He has a more general interest in optimization and control theory, with an emphasis on the computational issues associated with Markov decision processes.

Püsilink: https://www.kriso.ee/db/9780429210853_pe.html

Märksõnad:

E-raamat: Approximate Iterative Algorithms [Taylor & Francis e-raamat]

Konto & seaded

Otsing

Otsingu andmebaas

Filtreeri tulemusi

Teemad Kirjastuste teemad

Vali ostukorv