A first view of Exedra: A domain-specific language for large graph analytics workflows
Abstract
In recent years, many programming models, software libraries, and middleware have appeared for processing large graphs of various forms. However, a significant usability gap exists between graph analysis scientists and High Performance Computing (HPC) application programmers, owing to the complexity of HPC graph analysis software. In this paper we provide a basic view of Exedra, a domain-specific language (DSL) for large graph analysis with which we aim to eliminate this complexity. Exedra consists of high-level language constructs for specifying different graph analysis tasks in distributed environments. We implemented the Exedra DSL on a scalable graph analysis platform called Dipper. Dipper uses an Igraph/R interface for creating graph analysis workflows, which are in turn translated into Exedra statements. Exedra statements are interpreted by the Dipper interpreter and mapped to user-specified libraries or middleware. The Exedra DSL allows for the synthesis of graph algorithms that are more efficient than the bare use of graph libraries, while maintaining a standard interface that could accommodate even future graph analysis software. We evaluated Exedra's feasibility for expressing graph analysis tasks by running Dipper on a cluster of four nodes. We observed that Dipper reduced the time taken for graph analysis when the workflow was distributed across all four nodes, despite the communication and data format conversion overhead of the Dipper framework.