Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure

Shantenu Jha, Daniel S. Katz, Andre Luckow, Neil Chue Hong, Omer Rana, Yogesh Simmhan

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Datasets are growing larger and becoming distributed; their location, availability, and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While “static” data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications.

Original language: English (US)
Article number: e4032
Journal: Concurrency Computation
Volume: 29
Issue number: 8
DOI: 10.1002/cpe.4032
State: Published - Apr 25 2017

Fingerprint

Infrastructure
Engineering Application
Software System
Availability
Scenarios
Requirements

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
  • Computer Networks and Communications
  • Computational Theory and Mathematics

Keywords

  • data intensive
  • distributed
  • dynamic
  • scientific applications

Cite this

Jha, Shantenu; Katz, Daniel S.; Luckow, Andre; Chue Hong, Neil; Rana, Omer; Simmhan, Yogesh. / Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure. In: Concurrency Computation. 2017; Vol. 29, No. 8.
@article{748e2e2fd9aa4d7c8295fb6fa97d66f5,
title = "Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure",
abstract = "A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Datasets are growing larger and becoming distributed; their location, availability, and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While “static” data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications.",
keywords = "data intensive, distributed, dynamic, scientific applications",
author = "Shantenu Jha and Katz, {Daniel S.} and Andre Luckow and {Chue Hong}, Neil and Omer Rana and Yogesh Simmhan",
year = "2017",
month = "4",
day = "25",
doi = "10.1002/cpe.4032",
language = "English (US)",
volume = "29",
journal = "Concurrency Computation Practice and Experience",
issn = "1532-0626",
publisher = "John Wiley and Sons Ltd",
number = "8",
}

Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure. / Jha, Shantenu; Katz, Daniel S.; Luckow, Andre; Chue Hong, Neil; Rana, Omer; Simmhan, Yogesh.

In: Concurrency Computation, Vol. 29, No. 8, e4032, 25.04.2017.

TY - JOUR
T1 - Introducing distributed dynamic data-intensive (D3) science
T2 - Understanding applications and infrastructure
AU - Jha, Shantenu
AU - Katz, Daniel S.
AU - Luckow, Andre
AU - Chue Hong, Neil
AU - Rana, Omer
AU - Simmhan, Yogesh
PY - 2017/4/25
Y1 - 2017/4/25

N2 - A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Datasets are growing larger and becoming distributed; their location, availability, and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While “static” data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications.

AB - A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Datasets are growing larger and becoming distributed; their location, availability, and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While “static” data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications.

KW - data intensive
KW - distributed
KW - dynamic
KW - scientific applications
UR - http://www.scopus.com/inward/record.url?scp=85011649864&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85011649864&partnerID=8YFLogxK
U2 - 10.1002/cpe.4032
DO - 10.1002/cpe.4032
M3 - Article
AN - SCOPUS:85011649864
VL - 29
JO - Concurrency Computation Practice and Experience
JF - Concurrency Computation Practice and Experience
SN - 1532-0626
IS - 8
M1 - e4032
ER -