X10 and APGAS at petascale
Olivier Tardieu, Benjamin Herta, et al.
PPoPP 2014
The X10 programming language is intended to ease the programming of scalable concurrent and distributed applications. X10 augments a familiar imperative object-oriented programming model with constructs to support lightweight asynchronous tasks as well as execution across multiple address spaces. A crucial aspect of X10's runtime system is the scheduling of concurrent tasks. Work-stealing schedulers have been shown to efficiently load-balance fine-grain divide-and-conquer task-parallel programs on SMPs and multicores. But X10 is not limited to shared-memory fork-join parallelism: it permits tasks to suspend and synchronize by means of conditional atomic blocks and remote task invocations. In this paper, we demonstrate that work-stealing scheduling principles are applicable to a rich programming language such as X10, achieving performance at scale without compromising expressivity, ease of use, or portability. We design and implement a portable work-stealing execution engine for X10. While this engine is biased toward the efficient execution of fork-join parallelism in shared memory, it handles the full X10 language, especially conditional atomic blocks and distribution. We show that this engine improves the run time of a series of benchmark programs by several orders of magnitude when used in combination with the C++ backend compiler and runtime for X10. It achieves scaling comparable to state-of-the-art work-stealing scheduler implementations (the Cilk++ compiler and the Java fork/join framework) despite the dramatic increase in generality. Copyright © 2012 ACM.
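To make the shared-memory fork-join pattern mentioned in the abstract concrete, here is a minimal sketch using the Java fork/join framework, which the abstract names as a point of comparison. It is not the X10 engine described in the paper and does not cover conditional atomic blocks or distribution. The class name, task name, and sequential cutoff are assumptions chosen for illustration only.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example class; not taken from the paper.
class ForkJoinFib {
    // Assumed cutoff below which a subproblem is solved sequentially,
    // so that task-creation overhead does not dominate.
    static final int SEQUENTIAL_CUTOFF = 12;

    static long sequentialFib(int n) {
        return n < 2 ? n : sequentialFib(n - 1) + sequentialFib(n - 2);
    }

    // A divide-and-conquer task. fork() pushes the task onto the current
    // worker's deque, where idle workers can steal it.
    static final class FibTask extends RecursiveTask<Long> {
        private final int n;

        FibTask(int n) {
            this.n = n;
        }

        @Override
        protected Long compute() {
            if (n <= SEQUENTIAL_CUTOFF) {
                return sequentialFib(n);
            }
            FibTask left = new FibTask(n - 1);
            left.fork();                                // make left half available to thieves
            long right = new FibTask(n - 2).compute();  // run right half in this worker
            return left.join() + right;                 // wait for the forked half
        }
    }

    static long parallelFib(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }

    public static void main(String[] args) {
        System.out.println(parallelFib(30));
    }
}
```

Forking one half and computing the other in place is the usual idiom with this framework: the calling worker stays busy while the forked half remains available for stealing.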