Risk-averse dynamic programming for Markov decision processes

Research output: Contribution to journal › Article › peer-review

237 Scopus citations

Abstract

We introduce the concept of a Markov risk measure and we use it to formulate risk-averse control problems for two Markov decision models: a finite horizon model and a discounted infinite horizon model. For both models we derive risk-averse dynamic programming equations and a value iteration method. For the infinite horizon problem we develop a risk-averse policy iteration method and we prove its convergence. We also propose a version of the Newton method to solve a nonsmooth equation arising in the policy iteration method and we prove its global convergence. Finally, we discuss relations to min-max Markov decision models.
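
As a rough illustration of the value iteration idea mentioned in the abstract, the sketch below iterates a risk-averse Bellman-type update on a tiny discounted-cost model. It is not the paper's algorithm or notation: the choice of the first-order mean-upper-semideviation as the one-step risk mapping, the array layouts `cost[x, u]` and `trans[x, u, y]`, the example numbers, and every function and parameter name are assumptions made for illustration only.

```python
import numpy as np


def mean_semideviation(values, probs, kappa):
    """One-step risk mapping (assumed here): mean plus kappa times the
    expected upper deviation from the mean, with kappa in [0, 1]."""
    mean = probs @ values
    upper_dev = probs @ np.maximum(values - mean, 0.0)
    return mean + kappa * upper_dev


def q_values(v, cost, trans, gamma, kappa):
    """Risk-adjusted cost of each (state, action) pair for value vector v."""
    n_states, n_actions = cost.shape
    return np.array([
        [cost[x, u] + gamma * mean_semideviation(v, trans[x, u], kappa)
         for u in range(n_actions)]
        for x in range(n_states)
    ])


def risk_averse_value_iteration(cost, trans, gamma=0.9, kappa=0.5,
                                tol=1e-10, max_iter=10_000):
    """Repeat v <- min_u q(x, u) until successive iterates differ by < tol."""
    v = np.zeros(cost.shape[0])
    for _ in range(max_iter):
        v_new = q_values(v, cost, trans, gamma, kappa).min(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    policy = q_values(v, cost, trans, gamma, kappa).argmin(axis=1)
    return v, policy


# Hypothetical two-state, two-action model (numbers chosen arbitrarily).
cost = np.array([[1.0, 2.0],
                 [0.5, 3.0]])
trans = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.0, 1.0]]])

v_neutral, _ = risk_averse_value_iteration(cost, trans, gamma=0.9, kappa=0.0)
v_averse, policy = risk_averse_value_iteration(cost, trans, gamma=0.9, kappa=0.5)
```

With `kappa=0.0` the update reduces to ordinary expected-cost value iteration, so comparing `v_neutral` with `v_averse` shows how much the risk term raises the evaluated cost in this toy model.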

Original language: English (US)
Pages (from-to): 235-261
Number of pages: 27
Journal: Mathematical Programming
Volume: 125
Issue number: 2
State: Published - Oct 2010

All Science Journal Classification (ASJC) codes

  • Software
  • Mathematics (all)

Keywords

  • Dynamic risk measures
  • Markov risk measures
  • Min-max Markov models
  • Nonsmooth Newton's method
  • Policy iteration
  • Value iteration
