This book concerns continuous-time controlled Markov chains, also known as continuous-time Markov decision processes. They form a class of stochastic control problems in which a single decision-maker wishes to optimize a given objective function. This book is also concerned with Markov games, where two decision-makers (or players) try to optimize their own objective function. Both decision-making processes appear in a large number of applications in economics, operations research, engineering, and computer science, among other areas. An extensive, self-contained, up-to-date analysis of basic optimality criteria (such as discounted and average reward), and advanced optimality criteria (e.g., bias, overtaking, sensitive discount, and Blackwell optimality) is presented. A particular emphasis is made on the application of the results herein: algorithmic and computational issues are discussed, and applications to population models and epidemic processes are shown. This book is addressed to students and researchers in the fields of stochastic control and stochastic games. Moreover, it could be of interest also to undergraduate and beginning graduate students because the reader is not supposed to have a high mathematical background: a working knowledge of calculus, linear algebra, probability, and continuous-time Markov chains should suffice to understand the contents of the book.

Description-Table Of Contents

1. Introduction. 1.1. Preliminary examples. 1.2. Overview of the book. 1.3. Contents. 1.4. Notation -- 2. Controlled Markov chains. 2.1. Introduction. 2.2. The control model. 2.3. Existence of controlled Markov chains. 2.4. Exponential ergodicity. 2.5. Proof of Theorem 2.11. 2.6. Conclusions -- 3. Basic optimality criteria. 3.1. Introduction. 3.2. The finite horizon case. 3.3. The infinite horizon discounted reward. 3.4. The long-run expected average reward. 3.5. The vanishing discount approach to average optimality. 3.6. Pathwise average optimality. 3.7. Canonical triplets and finite horizon control problems. 3.8. Conclusions -- 4. Policy iteration and approximation theorems. 4.1. Introduction. 4.2. The policy iteration algorithm. 4.3. Approximating discounted reward CMCs. 4.4. Approximating average reward CMCs. 4.5. Conclusions -- 5. Overtaking, bias, and variance optimality. 5.1. Introduction. 5.2. Bias and overtaking optimality. 5.3. Variance minimization. 5.4. Comparison of variance and overtaking optimality. 5.5. Conclusions -- 6. Sensitive discount optimality. 6.1. Introduction. 6.2. The Laurent series expansion. 6.3. The vanishing discount approach (revisited). 6.4. The average reward optimality equations. 6.5. Strong discount optimality. 6.6. Sensitive discount optimality in the class of stationary policies. 6.7. Conclusions -- 7. Blackwell optimality. 7.1. Introduction. 7.2. Blackwell optimality in the class of stationary policies. 7.3. Blackwell optimality in the class of all policies. 7.4. Conclusions -- 8. Constrained controlled Markov chains. 8.1. Introduction. 8.2. Discounted reward constrained CMCs. 8.3. Average reward constrained CMCs. 8.4. Pathwise constrained CMCs. 8.5. The vanishing discount approach to constrained CMCs. 8.6. Conclusions -- 9. Applications. 9.1. Introduction. 9.2. Controlled queueing systems. 9.3. A controlled birth-and-death process. 9.4. A population system with catastrophes. 9.5. Controlled epidemic processes. 9.6. Conclusions -- 10. Zero-sum Markov games. 10.1. Introduction. 10.2. The zero-sum Markov game model. 10.3. Discount optimality. 10.4. Average optimality. 10.5. The family of average optimal strategies. 10.6. Conclusions -- 11. Bias and overtaking equilibria for Markov games. 11.1. Introduction. 11.2. Bias optimality. 11.3. Overtaking optimality. 11.4. A counterexample on bias and overtaking optimality. 11.5. Conclusions.