Klienditugi: 7440010 (E-R 10-18)

Abi | Registreeri | Logi sisse

Approximate Dynamic Programming: Solving the Curses of Dimensionality 2nd edition [Kõva köide]

3.73/5 (19 hinnangut Goodreads-ist)

Warren B. Powell (Princeton University, USA)

Formaat: Hardback, 656 pages, kõrgus x laius x paksus: 236x155x41 mm, kaal: 1043 g, Charts: 5 B&W, 0 Color; Photos: 5 B&W, 0 Color; Drawings: 63 B&W, 0 Color; Maps: 2 B&W, 0 Color; Tables: 0 B&W, 0 Color; Graphs: 39 B&W, 0 Color
Sari: Wiley Series in Probability and Statistics
Ilmumisaeg: 18-Nov-2011
Kirjastus: John Wiley & Sons Inc
ISBN-10: 047060445X
ISBN-13: 9780470604458

Teised raamatud teemal:

Applied mathematics - (Hetkel poes: 1 nimetust)

Kõva köide
Hind: 166,08 €
Raamatu kohalejõudmiseks kirjastusest kulub orienteeruvalt 2-4 nädalat
Kogus:
- - 1
  - 2
  - 3
  - 4
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
Lisa ostukorvi
Tasuta tarne
Tellimisaeg 2-4 nädalat
Lisa soovinimekirja
Raamatukogudele

Formaat: Hardback, 656 pages, kõrgus x laius x paksus: 236x155x41 mm, kaal: 1043 g, Charts: 5 B&W, 0 Color; Photos: 5 B&W, 0 Color; Drawings: 63 B&W, 0 Color; Maps: 2 B&W, 0 Color; Tables: 0 B&W, 0 Color; Graphs: 39 B&W, 0 Color
Sari: Wiley Series in Probability and Statistics
Ilmumisaeg: 18-Nov-2011
Kirjastus: John Wiley & Sons Inc
ISBN-10: 047060445X
ISBN-13: 9780470604458

Teised raamatud teemal:

Applied mathematics - (Hetkel poes: 1 nimetust)

Püsilink: https://www.kriso.ee/db/9780470604458.html

Märksõnad:

Praise for the First Edition "Finally, a book devoted to dynamic programming and written using the language of operations research (OR)! This beautiful book fills a gap in the libraries of OR specialists and practitioners." Computing Reviews

This new edition showcases a focus on modeling and computation for complex classes of approximate dynamic programming problems

Understanding approximate dynamic programming (ADP) is vital in order to develop practical and high-quality solutions to complex industrial problems, particularly when those problems involve making decisions in the presence of uncertainty. Approximate Dynamic Programming, Second Edition uniquely integrates four distinct disciplinesMarkov decision processes, mathematical programming, simulation, and statisticsto demonstrate how to successfully approach, model, and solve a wide range of real-life problems using ADP.

The book continues to bridge the gap between computer science, simulation, and operations research and now adopts the notation and vocabulary of reinforcement learning as well as stochastic search and simulation optimization. The author outlines the essential algorithms that serve as a starting point in the design of practical solutions for real problems. The three curses of dimensionality that impact complex problems are introduced and detailed coverage of implementation challenges is provided. The Second Edition also features:

A new chapter describing four fundamental classes of policies for working with diverse stochastic optimization problems: myopic policies, look-ahead policies, policy function approximations, and policies based on value function approximations

A new chapter on policy search that brings together stochastic search and simulation optimization concepts and introduces a new class of optimal learning strategies

Updated coverage of the exploration exploitation problem in ADP, now including a recently developed method for doing active learning in the presence of a physical state, using the concept of the knowledge gradient

A new sequence of chapters describing statistical methods for approximating value functions, estimating the value of a fixed policy, and value function approximation while searching for optimal policies

The presented coverage of ADP emphasizes models and algorithms, focusing on related applications and computation while also discussing the theoretical side of the topic that explores proofs of convergence and rate of convergence. A related website features an ongoing discussion of the evolving fields of approximation dynamic programming and reinforcement learning, along with additional readings, software, and datasets.

Requiring only a basic understanding of statistics and probability, Approximate Dynamic Programming, Second Edition is an excellent book for industrial engineering and operations research courses at the upper-undergraduate and graduate levels. It also serves as a valuable reference for researchers and professionals who utilize dynamic programming, stochastic programming, and control theory to solve problems in their everyday work.

Preface to the Second Edition

Preface to the First Edition

Acknowledgments

xvii

1 The Challenges of Dynamic Programming

(24)

1.1 A Dynamic Programming Example: A Shortest Path Problem

(1)

1.2 The Three Curses of Dimensionality

(3)

1.3 Some Real Applications

(5)

1.4 Problem Classes

(4)

1.5 The Many Dialects of Dynamic Programming

(2)

1.6 What Is New in This Book?

(2)

1.7 Pedagogy

(3)

1.8 Bibliographic Notes

(3)

2 Some Illustrative Models

(32)

2.1 Deterministic Problems

(5)

2.2 Stochastic Problems

(16)

2.3 Information Acquisition Problems

(3)

2.4 A Simple Modeling Framework for Dynamic Programs

(4)

2.5 Bibliographic Notes

(3)

Problems

(3)

3 Introduction to Markov Decision Processes

(54)

3.1 The Optimality Equations

(7)

3.2 Finite Horizon Problems

(1)

3.3 Infinite Horizon Problems

(2)

3.4 Value Iteration

(6)

3.5 Policy Iteration

(1)

3.6 Hybrid Value-Policy Iteration

(1)

3.7 Average Reward Dynamic Programming

(1)

3.8 The Linear Programming Method for Dynamic Programs

(1)

3.9 Monotone Policies

(6)

3.10 Why Does It Work?

(19)

3.11 Bibliographic Notes

103

(8)

Problems

103

(8)

4 Introduction to Approximate Dynamic Programming

111

(56)

4.1 The Three Curses of Dimensionality (Revisited)

112

(2)

4.2 The Basic Idea

114

(8)

4.3 Q-Learning and SARSA

122

(4)

4.4 Real-Time Dynamic Programming

126

(1)

4.5 Approximate Value Iteration

127

(2)

4.6 The Post-Decision State Variable

129

(15)

4.7 Low-Dimensional Representations of Value Functions

144

(2)

4.8 So Just What Is Approximate Dynamic Programming?

146

(3)

4.9 Experimental Issues

149

(6)

4.10 But Does It Work?

155

(1)

4.11 Bibliographic Notes

156

(11)

Problems

158

(9)

5 Modeling Dynamic Programs

167

(54)

5.1 Notational Style

169

(1)

5.2 Modeling Time

170

(4)

5.3 Modeling Resources

174

(4)

5.4 The States of Our System

178

(9)

5.5 Modeling Decisions

187

(2)

5.6 The Exogenous Information Process

189

(9)

5.7 The Transition Function

198

(8)

5.8 The Objective Function

206

(5)

5.9 A Measure-Theoretic View of Information

211

(2)

5.10 Bibliographic Notes

213

(8)

Problems

214

(7)

6 Policies

221

(28)

6.1 Myopic Policies

224

(1)

6.2 Lookahead Policies

224

(8)

6.3 Policy Function Approximations

232

(3)

6.4 Value Function Approximations

235

(4)

6.5 Hybrid Strategies

239

(3)

6.6 Randomized Policies

242

(2)

6.7 How to Choose a Policy?

244

(3)

6.8 Bibliographic Notes

247

(2)

Problems

247

(2)

7 Policy Search

249

(40)

7.1 Background

250

(3)

7.2 Gradient Search

253

(3)

7.3 Direct Policy Search for Finite Alternatives

256

(6)

7.4 The Knowledge Gradient Algorithm for Discrete Alternatives

262

(8)

7.5 Simulation Optimization

270

(4)

7.6 Why Does It Work?

274

(11)

7.7 Bibliographic Notes

285

(4)

Problems

286

(3)

8 Approximating Value Functions

289

(48)

8.1 Lookup Tables and Aggregation

290

(14)

8.2 Parametric Models

304

(10)

8.3 Regression Variations

314

(2)

8.4 Nonparametric Models

316

(9)

8.5 Approximations and the Curse of Dimensionality

325

(3)

8.6 Why Does It Work?

328

(5)

8.7 Bibliographic Notes

333

(4)

Problems

334

(3)

9 Learning Value Function Approximations

337

(46)

9.1 Sampling the Value of a Policy

337

(10)

9.2 Stochastic Approximation Methods

347

(2)

9.3 Recursive Least Squares for Linear Models

349

(7)

9.4 Temporal Difference Learning with a Linear Model

356

(2)

9.5 Bellman's Equation Using a Linear Model

358

(6)

9.6 Analysis of TD(0), LSTD, and LSPE Using a Single State

364

(2)

9.7 Gradient-Based Methods for Approximate Value Iteration

366

(5)

9.8 Least Squares Temporal Differencing with Kernel Regression

371

(2)

9.9 Value Function Approximations Based on Bayesian Learning

373

(3)

9.10 Why Does It Work

376

(3)

9.11 Bibliographic Notes

379

(4)

Problems

381

(2)

10 Optimizing While Learning

383

(36)

10.1 Overview of Algorithmic Strategies

385

(1)

10.2 Approximate Value Iteration and Q-Learning Using Lookup Tables

386

(11)

10.3 Statistical Bias in the Max Operator

397

(3)

10.4 Approximate Value Iteration and Q-Learning Using Linear Models

400

(2)

10.5 Approximate Policy Iteration

402

(6)

10.6 The Actor-Critic Paradigm

408

(2)

10.7 Policy Gradient Methods

410

(1)

10.8 The Linear Programming Method Using Basis Functions

411

(2)

10.9 Approximate Policy Iteration Using Kernel Regression

413

(2)

10.10 Finite Horizon Approximations for Steady-State Applications

415

(1)

10.11 Bibliographic Notes

416

(3)

Problems

418

(1)

11 Adaptive Estimation and Stepsizes

419

(38)

11.1 Learning Algorithms and Stepsizes

420

(5)

11.2 Deterministic Stepsize Recipes

425

(8)

11.3 Stochastic Stepsizes

433

(4)

11.4 Optimal Stepsizes for Nonstationary Time Series

437

(10)

11.5 Optimal Stepsizes for Approximate Value Iteration

447

(2)

11.6 Convergence

449

(2)

11.7 Guidelines for Choosing Stepsize Formulas

451

(1)

11.8 Bibliographic Notes

452

(5)

Problems

453

(4)

12 Exploration Versus Exploitation

457

(40)

12.1 A Learning Exercise: The Nomadic Trucker

457

(3)

12.2 An Introduction to Learning

460

(4)

12.3 Heuristic Learning Policies

464

(6)

12.4 Gittins Indexes for Online Learning

470

(7)

12.5 The Knowledge Gradient Policy

477

(5)

12.6 Learning with a Physical State

482

(10)

12.7 Bibliographic Notes

492

(5)

Problems

493

(4)

13 Value Function Approximations for Resource Allocation Problems

497

(44)

13.1 Value Functions versus Gradients

498

(1)

13.2 Linear Approximations

499

(2)

13.3 Piecewise-Linear Approximations

501

(4)

13.4 Solving a Resource Allocation Problem Using Piecewise-Linear Functions

505

(4)

13.5 The SHAPE Algorithm

509

(4)

13.6 Regression Methods

513

(3)

13.7 Cutting Planes

516

(12)

13.8 Why Does It Work?

528

(7)

13.9 Bibliographic Notes

535

(6)

Problems

536

(5)

14 Dynamic Resource Allocation Problems

541

(52)

14.1 An Asset Acquisition Problem

541

(6)

14.2 The Blood Management Problem

547

(10)

14.3 A Portfolio Optimization Problem

557

(3)

14.4 A General Resource Allocation Problem

560

(13)

14.5 A Fleet Management Problem

573

(7)

14.6 A Driver Management Problem

580

(5)

14.7 Bibliographic Notes

585

(8)

Problems

586

(7)

15 Implementation Challenges

593

(14)

15.1 Will ADP Work for Your Problem?

593

(1)

15.2 Designing an ADP Algorithm for Complex Problems

594

(2)

15.3 Debugging an ADP Algorithm

596

(1)

15.4 Practical Issues

597

(5)

15.5 Modeling Your Problem

602

(2)

15.6 Online versus Offline Models

604

(2)

15.7 If It Works, Patent It!

606

(1)

Bibliography

607

(16)

Index

623

WARREN B. POWELL, PhD, is Professor of Operations Research and Financial Engineering at Princeton University, where he is founder and Director of CASTLE Laboratory, a research unit that works with industrial partners to test new ideas found in operations research. The recipient of the 2004 INFORMS Fellow Award, Dr. Powell has authored more than 160 published articles on stochastic optimization, approximate dynamicprogramming, and dynamic resource management.

Approximate Dynamic Programming: Solving the Curses of Dimensionality 2nd edition [Kõva köide]

Konto & seaded

Otsing

Otsingu andmebaas

Filtreeri tulemusi

Teemad Ingliskeelsed raamatud

Vali ostukorv