Conditional models for contextual human motion recognition

Cristian Sminchisescu, Atul Kanaujia, Dimitris Metaxas

Research output: Contribution to journal › Article › peer-review

167 Scopus citations


We describe algorithms for recognizing human motion in monocular video sequences, based on discriminative conditional random fields (CRFs) and maximum entropy Markov models (MEMMs). Existing approaches to this problem typically use generative structures like the hidden Markov model (HMM). Therefore, they have to make simplifying, often unrealistic assumptions on the conditional independence of observations given the motion class labels, and they cannot accommodate rich overlapping features of the observation or long-term contextual dependencies among observations at multiple timesteps. This makes them prone to myopic failures in recognizing many human motions, because even the transition between simple human activities naturally has temporal segments of ambiguity and overlap. The correct interpretation of these sequences requires more holistic, contextual decisions, where the estimate of an activity at a particular timestep could be constrained by longer windows of observations, prior and even posterior to that timestep. This would not be computationally feasible with an HMM, which requires the enumeration of a number of observation sequences exponential in the size of the context window. In this work we follow a different philosophy: instead of restrictively modeling the complex image generation process (the observation), we work with models that can unrestrictedly take it as an input, hence condition on it. Conditional models like the proposed CRFs seamlessly represent contextual dependencies and have computationally attractive properties: they support efficient, exact recognition using dynamic programming, and their parameters can be learned using convex optimization.
We introduce conditional graphical models as complementary tools for human motion recognition and present an extensive set of experiments that show not only how these can successfully classify diverse human activities like walking, jumping, running, picking or dancing, but also how they can discriminate among subtle motion styles like normal walks and wander walks.
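The two properties the abstract highlights, conditioning on a window of observations around each timestep and exact recognition by dynamic programming, can be illustrated with a minimal linear-chain sketch. This is not the authors' implementation: the function names `windowed_scores` and `viterbi_decode`, the scalar per-frame observation, and the hand-set weights are all assumptions made for illustration. Decoding costs O(T·K²) for T frames and K labels, regardless of how wide the observation window is, because the window only affects the local scores.

```python
def windowed_scores(obs, weights, half_window):
    """Score each label at each timestep from a window of observations.

    obs: list of floats, one toy scalar feature per frame (an assumption;
         real systems would use image descriptors).
    weights[k][d]: weight of label k for the frame at offset d - half_window.
    Frames outside the sequence are simply skipped.
    """
    T = len(obs)
    scores = []
    for t in range(T):
        row = []
        for w_k in weights:
            s = 0.0
            for d in range(-half_window, half_window + 1):
                if 0 <= t + d < T:
                    s += w_k[d + half_window] * obs[t + d]
            row.append(s)
        scores.append(row)
    return scores


def viterbi_decode(unary, transition):
    """Return the highest-scoring label sequence of a linear chain.

    unary[t][k]: local score of label k at timestep t.
    transition[i][j]: score of moving from label i to label j.
    """
    K = len(unary[0])
    best = [list(unary[0])]   # best[t][k]: best score of a path ending in k at t
    back = []                 # back[t-1][k]: predecessor label on that path
    for t in range(1, len(unary)):
        row, ptr = [], []
        for j in range(K):
            i_best = max(range(K), key=lambda i: best[-1][i] + transition[i][j])
            row.append(best[-1][i_best] + transition[i_best][j] + unary[t][j])
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)
    # Trace the best path backwards from the best final label.
    path = [max(range(K), key=lambda k: best[-1][k])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy usage: two labels, a one-frame window on each side of the current frame.
obs = [1.0, 1.0, -0.2, 1.0, -1.0, -1.0, -1.0]
scores = windowed_scores(obs, [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]], 1)
path = viterbi_decode(scores, [[0.5, -0.5], [-0.5, 0.5]])
print(path)
```

In this sketch the label score at each frame depends on observations both before and after that frame, which is the kind of contextual dependency a generative HMM cannot use without modeling the joint distribution of the whole window.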

Original language: English (US)
Pages (from-to): 210-220
Number of pages: 11
Journal: Computer Vision and Image Understanding
Issue number: 2-3 SPEC. ISS.
State: Published - Nov 2006

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition


Keywords

  • Conditional models
  • Discriminative models
  • Feature selection
  • Hidden Markov models
  • Human motion recognition
  • Markov random fields
  • Multiclass logistic regression
  • Optimization

