| 1 | (19) |
Examples of constrained dynamic control problems | 1 | (2) |
On solution approaches for CMDPs with expected costs | 3 | (2) |
| 5 | (2) |
Cost criteria and assumptions | 7 | (1) |
The convex analytical approach and occupation measures | 8 | (2) |
Linear Programming and Lagrangian approach for CMDPs | 10 | (2) |
| 12 | (5) |
The structure of the book | 17 | (2) |
I Part One: Finite MDPs | 19 | (38) |
Markov decision processes | 21 | (6) |
| 21 | (2) |
Cost criteria and the constrained problem | 23 | (1) |
| 24 | (1) |
The dominance of Markov policies | 25 | (2) |
| 27 | (10) |
Occupation measure and the primal LP | 27 | (3) |
Dynamic programming and dual LP: the unconstrained case | 30 | (2) |
Constrained control: Lagrangian approach | 32 | (1) |
| 33 | (1) |
| 34 | (3) |
The expected average cost | 37 | (8) |
Occupation measure and the primal LP | 37 | (4) |
Equivalent Linear Program | 41 | (1) |
| 42 | (1) |
| 43 | (2) |
Flow and service control in a single-server queue | 45 | (12) |
| 45 | (2) |
| 47 | (6) |
The original constrained problem | 53 | (1) |
Structure of randomization and implementation issues | 53 | (1) |
On coordination between controllers | 54 | (1) |
| 55 | (2) |
II Part Two: Infinite MDPs | 57 | (124) |
MDPs with infinite state and action spaces | 59 | (16) |
| 59 | (2) |
| 61 | (1) |
Mixed policies and topologic structure* | 62 | (1) |
The dominance of Markov policies | 63 | (2) |
Aggregation of states* | 65 | (3) |
Extra randomization in the policies* | 68 | (2) |
Equivalent quasi-Markov model and quasi-Markov policies* | 70 | (5) |
The total cost: classification of MDPs | 75 | (26) |
Transient and Absorbing MDPs | 75 | (2) |
MDPs with uniform Lyapunov functions | 77 | (1) |
Equivalence of MDP with unbounded and bounded costs* | 78 | (6) |
Properties of MDPs with uniform Lyapunov functions* | 84 | (5) |
Properties for fixed initial distribution* | 89 | (4) |
Examples of uniform Lyapunov functions | 93 | (3) |
| 96 | (5) |
The total cost: occupation measures and the primal LP | 101 | (16) |
| 101 | (3) |
Continuity of occupation measures | 104 | (6) |
More properties of MDPs* | 110 | (1) |
Characterization of the sets of occupation measure | 110 | (2) |
Relation between cost and occupation measure | 112 | (2) |
Dominating classes of policies | 114 | (1) |
Equivalent Linear Program | 115 | (1) |
| 116 | (1) |
The total cost: Dynamic and Linear Programming | 117 | (20) |
Non-constrained control: Dynamic and Linear Programming | 118 | (1) |
Super-harmonic functions and Linear Programming | 118 | (9) |
| 127 | (1) |
Constrained control: Lagrangian approach | 128 | (3) |
| 131 | (1) |
| 132 | (1) |
A second LP approach for optimal mixed policies | 133 | (1) |
| 134 | (3) |
| 137 | (6) |
The equivalent total cost model | 137 | (1) |
Occupation measure and LP | 138 | (1) |
Non-negative immediate cost | 138 | (1) |
Weak contracting assumptions and Lyapunov functions | 139 | (1) |
Example: flow and service control | 140 | (3) |
The expected average cost | 143 | (22) |
| 143 | (4) |
Completeness properties of stationary policies | 147 | (3) |
Relation between cost and occupation measure | 150 | (4) |
Dominating classes of policies | 154 | (3) |
Equivalent Linear Program | 157 | (1) |
| 158 | (1) |
The contracting framework | 158 | (2) |
Other conditions for the uniform integrability | 160 | (1) |
The case of uniform Lyapunov conditions | 161 | (4) |
Expected average cost: Dynamic Programming and LP | 165 | (16) |
The non-constrained case: optimality inequality | 165 | (4) |
Non-constrained control: cost bounded below | 169 | (2) |
Dynamic programming and uniform Lyapunov function | 171 | (2) |
Superharmonic functions and linear programming | 173 | (3) |
| 176 | (1) |
Constrained control: Lagrangian approach | 176 | (2) |
| 178 | (1) |
A second LP approach for optimal mixed policies | 179 | (2) |
III Part Three: Asymptotic methods and approximations | 181 | (36) |
| 183 | (10) |
| 183 | (3) |
Approximation of the values | 186 | (4) |
Approximation and robustness of the policies | 190 | (3) |
Convergence of discounted constrained MDPs | 193 | (6) |
Convergence in the discount factor | 193 | (1) |
Convergence to the expected average cost | 194 | (1) |
The case of uniform Lyapunov function | 195 | (4) |
Convergence as the horizon tends to infinity | 199 | (6) |
| 199 | (1) |
The expected average cost: stationary policies | 200 | (1) |
The expected average cost: general policies | 201 | (4) |
State truncation and approximation | 205 | (12) |
The approximating sets of states | 206 | (2) |
| 208 | (3) |
Scheme II: the total cost | 211 | (3) |
Scheme III: the total cost | 214 | (1) |
The expected average cost | 214 | (1) |
Infinite MDPs: on the number of randomizations | 215 | (2) |
Appendix: Convergence of probability measures | 217 | (4) |
References | 221 | (14) |
List of Symbols and Notation | 235 | (4) |
Index | 239 |