TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping

Jong Youl Choi, Jeremy Logan, Matthew Wolf, George Ostrouchov, Tahsin Kurc, Qing Liu, Norbert Podhorszki, Scott Klasky, Melissa Romanus, Qian Sun, Manish Parashar, Randy Michael Churchill, Cs Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Task mapping is an important problem in parallel and distributed computing. The goal in task mapping is to find an optimal layout of the processes of an application (or a task) onto a given network topology. We target this problem in the context of staging applications. A staging application consists of two or more parallel applications (also referred to as staging tasks) which run concurrently and exchange data over the course of computation. Task mapping becomes a more challenging problem in staging applications, because not only is data exchanged between the staging tasks, but the processes of a staging task may also exchange data with each other. We propose a novel method, called Task Graph Embedding (TGE), that harnesses the observable graph structures of parallel applications and network topologies. TGE employs a machine-learning-based algorithm to find the best representation of a graph, called an embedding, in a space in which the task-to-processor mapping problem can be solved. We evaluate and demonstrate the effectiveness of TGE experimentally with the communication patterns extracted from runs of XGC, a large-scale fusion simulation code, on Titan.
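To make the mapping objective concrete, the sketch below scores a process-to-node layout with the hop-bytes metric (bytes exchanged multiplied by network hops), a commonly used cost in topology mapping. It is not the TGE algorithm and is not taken from the paper; the 4-node ring, the traffic volumes, and the names `hop_distances` and `hop_bytes` are all hypothetical and chosen only for illustration.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def hop_bytes(traffic, mapping, adjacency):
    """Sum of (bytes * hops) over all communicating process pairs."""
    cache = {}  # BFS results keyed by source node, computed on demand
    total = 0
    for (src, dst), nbytes in traffic.items():
        a, b = mapping[src], mapping[dst]
        if a not in cache:
            cache[a] = hop_distances(adjacency, a)
        total += nbytes * cache[a][b]
    return total


# Hypothetical network: four nodes connected in a ring.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

# Hypothetical traffic in bytes between process pairs. Processes 0 and 1
# could belong to one staging task and 2 and 3 to another, with a lighter
# exchange between the two tasks.
traffic = {(0, 1): 100, (2, 3): 100, (1, 2): 10}

# Two candidate layouts (process -> node).
neighbors_together = {0: 0, 1: 1, 2: 2, 3: 3}
neighbors_apart = {0: 0, 1: 2, 2: 1, 3: 3}

cost_together = hop_bytes(traffic, neighbors_together, ring)
cost_apart = hop_bytes(traffic, neighbors_apart, ring)
```

Here the layout that places heavily communicating processes on adjacent nodes yields the lower cost. A mapping method searches for such layouts over far larger graphs, where exhaustive comparison is not feasible.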

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 587-591
Number of pages: 5
ISBN (Electronic): 9781538623268
DOIs
State: Published - Sep 22 2017
Event: 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017 - Honolulu, United States
Duration: Sep 5 2017 → Sep 8 2017

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2017-September
ISSN (Print): 1552-5244

Other

Other: 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
Country: United States
City: Honolulu
Period: 9/5/17 → 9/8/17

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Signal Processing
