Abstract
We introduce the concept of a Markov risk measure and use it to formulate risk-averse control problems for two Markov decision models: a finite-horizon model and a discounted infinite-horizon model. For both models we derive risk-averse dynamic programming equations and a value iteration method. For the infinite-horizon problem we develop a risk-averse policy iteration method and prove its convergence. We also propose a version of the Newton method for solving a nonsmooth equation arising in the policy iteration method and prove its global convergence. Finally, we discuss relations to min-max Markov decision models.
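A minimal sketch of the risk-averse value iteration idea mentioned in the abstract, in which the expected cost-to-go in the Bellman recursion is replaced by a one-step coherent risk measure. The specific risk measure used here (mean plus upper semideviation), the function names, array shapes, and parameters `alpha`, `kappa` are illustrative assumptions, not taken from the paper, which treats general Markov risk measures.

```python
import numpy as np

def mean_semideviation(values, probs, kappa=0.5):
    # Illustrative one-step coherent risk measure:
    # mean + kappa * expected upper semideviation (coherent for 0 <= kappa <= 1).
    mean = probs @ values
    semidev = probs @ np.maximum(values - mean, 0.0)
    return mean + kappa * semidev

def risk_averse_value_iteration(costs, P, alpha=0.9, kappa=0.5,
                                tol=1e-8, max_iter=10_000):
    # costs: (n_states, n_actions) immediate costs c(x, a)
    # P:     (n_states, n_actions, n_states) transition probabilities
    # alpha: discount factor in (0, 1)
    n_states, n_actions = costs.shape
    v = np.zeros(n_states)
    q = np.zeros((n_states, n_actions))
    for _ in range(max_iter):
        for x in range(n_states):
            for a in range(n_actions):
                # Replace the expectation of v over the next state
                # by the one-step risk measure.
                q[x, a] = costs[x, a] + alpha * mean_semideviation(v, P[x, a], kappa)
        v_new = q.min(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new, q.argmin(axis=1)
```

With `kappa = 0` the recursion reduces to standard risk-neutral value iteration, which makes the role of the risk measure in the update easy to see.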
| Original language | English (US) |
|---|---|
| Pages (from-to) | 235-261 |
| Number of pages | 27 |
| Journal | Mathematical Programming |
| Volume | 125 |
| Issue number | 2 |
| DOIs | |
| State | Published - Oct 2010 |
All Science Journal Classification (ASJC) codes
- Software
- Mathematics (all)
Keywords
- Dynamic risk measures
- Markov risk measures
- Min-max Markov models
- Nonsmooth Newton's method
- Policy iteration
- Value iteration