Stream processing for near real-time scientific data analysis

Jong Youl Choi, Tahsin Kurc, Jeremy Logan, Matthew Wolf, Eric Suchyta, James Kress, David Pugmire, Norbert Podhorszki, Eun Kyu Byun, Mark Ainsworth, Manish Parashar, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

The demand for near real-time analysis of streaming data is increasing rapidly in scientific projects. This trend is driven by the fact that it is expensive and time-consuming to design and execute complex experiments and simulations. During an experiment, the research team and the team at the experiment facility will want to analyze data as it is generated, interpret it, and collaboratively make decisions to modify the experiment parameters or abort the experiment in order to prevent events that may damage experimental instruments or to avoid wasting resources if there is a problem. The increasing velocity and volume of streaming data and the multi-institutional nature of large-scale scientific projects present challenges to near real-time analysis of streaming data. In this work, we develop a framework to address these challenges. This framework provides an interface for applications to define and interact with named, self-describing streams, takes advantage of advanced network technologies, and implements support for the reduction and compression of data at the source. We describe this framework and demonstrate its use in three scientific applications.

Original language: English (US)
Title of host publication: 2016 New York Scientific Data Summit, NYSDS 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781467390514
State: Published - Nov 17 2016
Event: 2016 New York Scientific Data Summit, NYSDS 2016 - New York, United States
Duration: Aug 14 2016 to Aug 17 2016

Publication series

Name: 2016 New York Scientific Data Summit, NYSDS 2016 - Proceedings

Other

Other: 2016 New York Scientific Data Summit, NYSDS 2016
Country: United States
City: New York
Period: 8/14/16 to 8/17/16

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
  • Computer Science Applications
