TY - GEN
T1 - In-situ feature-based objects tracking for large-scale scientific simulations
AU - Zhang, Fan
AU - Lasluisa, Solomon
AU - Jin, Tong
AU - Rodero, Ivan
AU - Bui, Hoang
AU - Parashar, Manish
PY - 2012
Y1 - 2012
AB - Emerging scientific simulations on leadership-class systems are generating huge amounts of data. However, the increasing gap between computation and disk I/O speeds makes traditional data analytics pipelines based on post-processing cost-prohibitive and often infeasible. In this paper, we investigate an alternate approach that aims to bring the analytics closer to the data using data staging and the in-situ execution of data analysis operations. Specifically, we present the design, implementation and evaluation of a framework that can support in-situ feature-based object tracking on distributed scientific datasets. Central to this framework is the scalable decentralized and online clustering (DOC) and cluster tracking algorithm, which executes in-situ (on different cores) and in parallel with the simulation processes, and retrieves data from the simulations directly via on-chip shared memory. The results from our experimental evaluation demonstrate that the in-situ approach significantly reduces the cost of data movement, that the presented framework can support scalable feature-based object tracking, and that it can be effectively used for in-situ analytics for large-scale simulations.
KW - Scientific data analysis
KW - feature-based object tracking
KW - scalable in-situ data analytics
UR - http://www.scopus.com/inward/record.url?scp=84876554712&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876554712&partnerID=8YFLogxK
U2 - 10.1109/SC.Companion.2012.100
DO - 10.1109/SC.Companion.2012.100
M3 - Conference contribution
AN - SCOPUS:84876554712
SN - 9780769549569
T3 - Proceedings - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
SP - 736
EP - 740
BT - Proceedings - 2012 SC Companion
T2 - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
Y2 - 10 November 2012 through 16 November 2012
ER -