Table of Contents:
  • v. 1. [no special title]
  • v. 2. Approximate dynamic programming
  • VOLUME 1 : 1 THE DYNAMIC PROGRAMMING ALGORITHM
  • 1.1. Introduction, p.2
  • 1.2. The basic problem, p.14
  • 1.3. The dynamic programming algorithm, p.20
  • 1.4. State augmentation and other reformulations, p.37
  • 1.5. Some mathematical issues, p.44
  • 1.6. Dynamic programming and minimax control, p.49
  • 1.7. Notes, sources, and exercises, p.53
  • 2. DETERMINISTIC SYSTEMS AND THE SHORTEST PATH PROBLEM
  • 2.1. Finite-state systems and shortest paths, p.69
  • 2.2. Some shortest path applications, p.72
  • 2.3. Shortest path algorithms, p.81
  • 2.4. Notes, sources, and exercises, p.101
  • 3. PROBLEMS WITH PERFECT STATE INFORMATION
  • 3.1. Linear systems and quadratic cost, p.110
  • 3.2. Inventory control, p.125
  • 3.3. Dynamic portfolio analysis, p.134
  • 3.4. Optimal stopping problems, p.140
  • 3.5. Scheduling and the interchange argument, p.150
  • 3.6. Set-membership description of uncertainty, p.154
  • 3.7. Notes, sources, and exercises, p.165
  • 4. PROBLEMS WITH IMPERFECT STATE INFORMATION
  • 4.1. Reduction to the perfect information case, p.184
  • 4.2. Linear systems and quadratic cost, p.195
  • 4.3. Sufficient statistics, p.202
  • 4.4. Notes, sources, and exercises, p.221
  • 5. INTRODUCTION TO INFINITE HORIZON PROBLEMS
  • 5.1. An overview, p.232
  • 5.2. Stochastic shortest path problems, p.236
  • 5.3. Computational methods, p.245
  • 5.4. Discounted problems, p.249
  • 5.5. Average cost per stage problems, p.253
  • 5.6. Semi-Markov problems, p.267
  • 5.7. Notes, sources, and exercises, p.277
  • 6. APPROXIMATE DYNAMIC PROGRAMMING
  • 6.1. Cost approximation and limited lookahead, p.296
  • 6.2. Problem approximation, p.307
  • 6.3. Parametric cost approximation, p.327
  • 6.4. On-line approximation and optimization, p.352
  • 6.5. Simulation-based cost-to-go approximation, p.389
  • 6.6. Approximation in policy space, p.395
  • 6.7. Adaptive control, p.397
  • 6.8. Discretization issues, p.405
  • 6.9. Notes, sources, and exercises, p.408
  • 7. DETERMINISTIC CONTINUOUS-TIME OPTIMAL CONTROL
  • 7.1. Continuous-time optimal control, p.426
  • 7.2. The Hamilton-Jacobi-Bellman equation, p.429
  • 7.3. The Pontryagin minimum principle, p.435
  • 7.4. Extensions of the minimum principle, p.451
  • 7.5. Notes, sources, and exercises, p.461
  • Appendix A: A MATHEMATICAL REVIEW
  • Appendix B: ON OPTIMIZATION THEORY
  • Appendix C: ON PROBABILITY THEORY
  • Appendix D: ON FINITE-STATE MARKOV CHAINS
  • Appendix E: LEAST SQUARES ESTIMATION AND KALMAN FILTERING
  • Appendix F: FORMULATING PROBLEMS OF DECISION UNDER UNCERTAINTY
  • References, p.533
  • Index, p.551
  • VOLUME 2 : Approximate Dynamic Programming
  • 1. DISCOUNTED PROBLEMS: THEORY
  • 1.1. Minimization of total cost: introduction, p.3
  • 1.2. Discounted problems: bounded cost per stage, p.14
  • 1.3. Scheduling and multiarmed bandit problems, p.22
  • 1.4. Discounted continuous-time problems, p.32
  • 1.5. The role of contraction mappings, p.45
  • 1.6. General forms of discounted dynamic programming, p.57
  • 1.7. Notes, sources, and exercises, p.71
  • 2. DISCOUNTED PROBLEMS: COMPUTATIONAL METHODS
  • 2.1. Markovian decision problems, p.82
  • 2.2. Value iteration, p.84
  • 2.3. Policy iteration, p.97
  • 2.4. Linear programming methods, p.112
  • 2.5. Methods for general discounted problems, p.115
  • 2.6. Asynchronous algorithms, p.138
  • 2.7. Notes, sources, and exercises, p.156
  • 3. STOCHASTIC SHORTEST PATH PROBLEMS
  • 3.1. Problem formulation, p.172
  • 3.2. Main results, p.175
  • 3.3. Underlying contraction properties, p.182
  • 3.4. Value iteration, p.184
  • 3.5. Policy iteration, p.189
  • 3.6. Countable-state problems, p.201
  • 3.7. Notes, sources, and exercises, p.204
  • 4. UNDISCOUNTED PROBLEMS
  • 4.1. Unbounded costs per stage, p.214
  • 4.2. Linear systems and quadratic cost, p.231
  • 4.3. Inventory control, p.233
  • 4.4. Optimal stopping, p.235
  • 4.5. Optimal gambling strategies, p.241
  • 4.6. Continuous-time problems: control of queues, p.248
  • 4.7. Nonstationary and periodic problems, p.256
  • 4.8. Notes, sources, and exercises, p.261
  • 5. AVERAGE COST PER STAGE PROBLEMS
  • 5.1. Finite-spaces average cost models, p.274
  • 5.2. Conditions for equal average cost for all initial states, p.298
  • 5.3. Value iteration, p.304
  • 5.4. Policy iteration, p.329
  • 5.5. Linear programming, p.339
  • 5.6. Infinite-spaces average cost models, p.345
  • 5.7. Notes, sources, and exercises, p.374
  • 6. APPROXIMATE DYNAMIC PROGRAMMING: DISCOUNTED MODELS
  • 6.1. General issues of simulation-based cost approximation, p.391
  • 6.2. Direct policy evaluation: gradient methods, p.418
  • 6.3. Projected equation methods for policy evaluation, p.423
  • 6.4. Policy iteration issues, p.451
  • 6.5. Aggregation methods, p.474
  • 6.6. Q-learning, p.493
  • 6.7. Notes, sources, and exercises, p.511
  • 7. APPROXIMATE DYNAMIC PROGRAMMING: NONDISCOUNTED MODELS AND GENERALIZATIONS
  • 7.1. Stochastic shortest path problems, p.532
  • 7.2. Average cost problems, p.537
  • 7.3. General problems and Monte Carlo linear algebra, p.552
  • 7.4. Approximation in policy space, p.620
  • 7.5. Notes, sources, and exercises, p.629
  • Appendix A: MEASURE-THEORETIC ISSUES IN DYNAMIC PROGRAMMING
  • References, p.657
  • Index, p.691