Dynamic Programming 9781400835386

This classic book is an introduction to dynamic programming, presented by the scientist who coined the term and develope

Language: English. Pages: 392 [372]. Year: 2021.

Author: Richard E. Bellman

DYNAMIC PROGRAMMING

BY RICHARD BELLMAN

With a new introduction by Stuart Dreyfus

PRINCETON UNIVERSITY PRESS
PRINCETON AND OXFORD

Copyright © 1957 by Princeton University Press
New introduction © 2010 by Princeton University Press

Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540
In the United Kingdom: Princeton University Press, 6 Oxford Street, Woodstock, Oxfordshire OX20 1TW

press.princeton.edu

First edition, 1957
First Princeton Landmarks in Mathematics edition, with a new introduction, 2010

Library of Congress Control Number 2009943155
ISBN 978-0-691-14668-3

Printed on acid-free paper.
Printed in the United States of America

1 3 5 7 9 10 8 6 4 2

To Betty-Jo whose decision processes defy analysis

Contents

INTRODUCTION TO THE 2010 EDITION

PREFACE

CHAPTER I
A MULTI-STAGE ALLOCATION PROCESS

1.1 Introduction
1.2 A multi-stage allocation process
1.3 Discussion
1.4 Functional equation approach
1.5 Discussion
1.6 A multi-dimensional maximization problem
1.7 A "smoothing" problem
1.8 Infinite stage approximation
1.9 Existence and uniqueness theorems
1.10 Successive approximations
1.11 Approximation in policy space
1.12 Properties of the solution—I: Convexity
1.13 Properties of the solution—II: Concavity
1.14 Properties of the solution—III: Concavity
1.15 An "ornery" example
1.16 A particular example—I
1.17 A particular example—II
1.18 Approximation and stability
1.19 Time-dependent processes
1.20 Multi-activity processes
1.21 Multi-dimensional structure theorems
1.22 Locating the unique maximum of a concave function
1.23 Continuity and memory
1.24 Stochastic allocation processes
1.25 Functional equations
1.26 Stieltjes integrals
Exercises and research problems
Bibliography and comments

CHAPTER II
A STOCHASTIC MULTI-STAGE DECISION PROCESS

2.1 Introduction
2.2 Stochastic gold-mining
2.3 Enumerative treatment
2.4 Functional equation approach
2.5 Infinite stage approximation
2.6 Existence and uniqueness
2.7 Approximation in policy space and monotone convergence
2.8 The solution
2.9 Discussion
2.10 Some generalizations
2.11 The form of f(x, y)
2.12 The problem for a finite number of stages
2.13 A three-choice problem
2.14 A stability theorem
Exercises and research problems
Bibliography and comments

CHAPTER III
THE STRUCTURE OF DYNAMIC PROGRAMMING PROCESSES

3.1 Introduction
3.2 Discussion of the two preceding processes
3.3 The principle of optimality
3.4 Mathematical formulation—I. A discrete deterministic process
3.5 Mathematical formulation—II. A discrete stochastic process
3.6 Mathematical formulation—III. A continuous deterministic process
3.7 Continuous stochastic processes
3.8 Generalizations
3.9 Causality and optimality
3.10 Approximation in policy space
Exercises and research problems
Bibliography and comments

CHAPTER IV
EXISTENCE AND UNIQUENESS THEOREMS

4.1 Introduction
4.2 A fundamental inequality
4.3 Equations of type one
4.4 Equations of type two
4.5 Monotone convergence
4.6 Stability theorems
4.7 Some directions of generalization
4.8 An equation of the third type
4.9 An "optimal inventory" equation
Exercises and research problems
Bibliography and comments

CHAPTER V
THE OPTIMAL INVENTORY EQUATION

5.1 Introduction
5.2 Formulation of the general problem
    A. Finite total time period
    B. Unbounded time period—discounted cost
    C. Unbounded time period—partially expendable items
    D. Unbounded time period—one period lag in supply
    E. Unbounded time period—two period lag
5.3 A simple observation
5.4 Constant stock level—preliminary discussion
5.5 Proportional cost—one-dimensional case
5.6 Proportional cost—multi-dimensional case
5.7 Finite time period
5.8 Finite time—multi-dimensional case
5.9 Non-proportional penalty cost—red tape
5.10 Particular cases
5.11 The form of the general solution
5.12 Fixed costs
5.13 Preliminaries to a discussion of more complicated policies
5.14 Unbounded process—one period time lag
5.15 Convex cost function—unbounded process
Exercises and research problems
Bibliography and comments

CHAPTER VI
BOTTLENECK PROBLEMS IN MULTI-STAGE PRODUCTION PROCESSES

6.1 Introduction
6.2 A general class of multi-stage production problems
6.3 Discussion of the preceding model
6.4 Functional equations
6.5 A continuous version
6.6 Notation
6.7 Dynamic programming formulation
6.8 The basic functional equation
6.9 The resultant nonlinear partial differential equation
6.10 Application of the partial differential equation
6.11 A particular example
6.12 A dual problem
6.13 Verification of the solution given in § 10
6.14 Computational solution
6.15 Nonlinear problems
Exercises and research problems
Bibliography and comments

CHAPTER VII
BOTTLENECK PROBLEMS: EXAMPLES

7.1 Introduction
7.2 Preliminaries
7.3 Delta-functions
7.4 The solution
7.5 The modified w solution
7.6 The equilibrium solution
7.7 A short-time w solution
7.8 Description of solution and proof
Bibliography and comments

CHAPTER VIII
A CONTINUOUS STOCHASTIC DECISION PROCESS

8.1 Introduction
8.2 Continuous versions—I: A differential approach
8.3 Continuous versions—II: An integral approach
8.4 Preliminary discussion
8.5 Mixing at a point
8.6 Reformulation of the gold-mining process
8.7 Derivation of the differential equations
8.8 The variational procedure
8.9 The behavior of Kt
8.10 The solution for T = ∞
8.11 Solution for finite total time
8.12 The three-choice problem
8.13 Some lemmas and preliminary results
8.14 Mixed policies
8.15 The solution for infinite time, D > 0
8.16 D P'. q.