Situation recognition from multimodal data

Vivek K. Singh, Siripen Pongpaichet, Ramesh Jain

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Situation recognition is the problem of deriving actionable insights from heterogeneous, real-time, big multimedia data to benefit human lives and resources in different applications. This tutorial will discuss recent developments towards converting multitudes of data streams, including weather patterns, stock prices, social media, traffic information, and disease incidents, into actionable insights. For multiple decades, multimedia researchers have been building approaches like entity resolution, object detection, and scene recognition to understand different aspects of the observed world. Unlike in the past, though, we no longer need to undertake sensemaking based on data coming from a single media element, modality, time-frame, or location of media capture. Real-world phenomena are now being observed by multiple media streams, each complementing the others in terms of data characteristics, observed features, perspectives, and vantage points. Each of these multimedia streams can now be assumed to be available in real time, and an increasingly large portion of them comes inscribed with space and time semantics. The number of such media elements available (e.g. Tweets, Flickr posts, sensor updates) is already on the order of trillions, and the computing resources required for analyzing them are becoming increasingly available. We expect these trends to continue, and we expect one of the biggest challenges in multimedia computing in the near term to be that of concept recognition from such multimodal data. As shown in Figure 1, the challenges in situation recognition are fundamentally different from those in object or event recognition. They involve dealing with multiple media, each capturing real-world phenomena from multiple vantage locations spread over time.
Detecting situations in time to take appropriate actions for saving lives and resources can transform multiple aspects of human life, including health, natural disasters, traffic, the economy, social reforms, business decisions and so on. Examples of such relevant situations include beautiful-days/hurricanes/wildfires, traffic (jams/smooth/normal), economic recessions/booms, blockbusters, droughts/great-monsoons, seasons (early-fall/fall/late-fall), demonstrations/celebrations, social uprisings/happiness-index, flash-mobs, flocking and so on. This tutorial will provide the audience with a thorough theoretical and practical grounding in the field of situation recognition. It will bring together the work of multiple scholars working in the area of situation recognition both within and outside the multimedia research community. Attendees will be introduced to the different interpretations of situations across multiple fields, and to how situation recognition builds upon and extends the efforts on object detection, event detection, scene recognition and so on. The tutorial will provide a review of recent efforts within the multimedia community towards detecting real-time situations, and attendees will be introduced to multiple practical situation recognition approaches and applications. Specific attention will be paid to discussing the relevant open research challenges that the community must address to advance the state of the art in situation recognition.

Original language: English (US)
Title of host publication: ICMR 2016 - Proceedings of the 2016 ACM International Conference on Multimedia Retrieval
Publisher: Association for Computing Machinery, Inc
Number of pages: 2
ISBN (Electronic): 9781450343596
State: Published - Jun 6 2016
Event: 6th ACM International Conference on Multimedia Retrieval, ICMR 2016 - New York, United States
Duration: Jun 6 2016 to Jun 9 2016

Publication series

Name: ICMR 2016 - Proceedings of the 2016 ACM International Conference on Multimedia Retrieval

Other: 6th ACM International Conference on Multimedia Retrieval, ICMR 2016
Country/Territory: United States
City: New York

All Science Journal Classification (ASJC) codes

  • Computer Graphics and Computer-Aided Design
  • Human-Computer Interaction
  • Software


Keywords

  • Concept detection
  • Event detection
  • Events
  • Multimedia data fusion
  • Situation recognition

