Semi-Markov decision processes: Nonstandard criteria

M. Baykal-Gürsoy, K. Gürsoy

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Considered are semi-Markov decision processes (SMDPs) with finite state and action spaces. We study two criteria: the expected average reward per unit time subject to a sample path constraint on the average cost per unit time and the expected time-average variability. Under a certain condition, for communicating SMDPs, we construct (randomized) stationary policies that are ε-optimal for each criterion; the policy is optimal for the first criterion under the unichain assumption and the policy is optimal and pure for a specific variability function in the second criterion. For general multichain SMDPs, by using a state space decomposition approach, similar results are obtained.

Original language: English (US)
Pages (from-to): 635-657
Number of pages: 23
Journal: Probability in the Engineering and Informational Sciences
Volume: 21
Issue number: 4
State: Published - Oct 2007

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Management Science and Operations Research
  • Industrial and Manufacturing Engineering
