GraphBLAS: Handling performance concerns in large graph analytics - Invited paper
Abstract
Emerging applications in health-care, social media analytics, cybersecurity, homeland security, and marketing require large graph analytics. Attaining good performance on these applications on modern day hardware is challenging because of the complex pipelines and deep memory hierarchy of these machines. In this paper, we review the linear algebra formulation of graph-analytics and show that it effectively handles the separation of performance concerns, best handled by system developers, from application logic concerns. The linear algebra formulation leverages the community experience in optimizing both hardware and software for applications that have a substantial linear algebra component. We review the GraphBLAS API, a compact C API for linear algebra formulation of graph algorithms. The core semiring operations are described first, followed by the rest of the API. We then illustrate how commonly used graph algorithms are implemented using the main GraphBLAS API calls. Executing these algorithms on a highly optimized linear algebra run-time validates that the time spent in execution of the algorithm is indeed almost entirely in the library, thus delegating the performance concerns solely to the library developer. Furthermore, the linear algebra formulation consistently outperforms the textbook version of these algorithms by a factor of two to five. Vector and matrix multiplications consume the majority of the computational time, particularly as problem size increases, putting them in the cross hairs for performance optimization.