Publication
CLUSTER 2019
Conference paper

Diagnostic Analysis: Directional Relation Graph

Abstract

System administrators can employ various diagnostic tests to identify failures in high-performance computing systems, but manual analysis of the results can be time-consuming. Moreover, the execution of these tests can occupy system resources, and each individual diagnostic result represents only the instantaneous state of the system. In this paper, we propose the use of a directional relation graph to summarize and visualize diagnostic results over time. The graph is a visual representation of the frequency of different test failures and of the relations among failures within a specific time range. We demonstrate the directional relation graph using diagnostic results obtained during the execution of synthetic anomalies. Furthermore, we discuss how graph analysis of relations among failures can narrow the suite of tests to reduce overall test time.
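
The abstract does not define how the graph is built, so the following is only a minimal sketch of one possible reading, not the paper's method. It assumes each diagnostic run is recorded as a timestamp plus the set of tests that failed, weights each node by its failure count inside the time range, and weights a directed edge A -> B by the fraction of runs in which A failed and B also failed. The function names, the test names, and the edge-weight definition are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_relation_graph(runs, start, end):
    """Summarize failures in [start, end] (hypothetical data layout).

    runs: iterable of (timestamp, set_of_failed_test_names).
    Returns (node_freq, edges), where node_freq[test] is the number of
    runs in which the test failed, and edges[(a, b)] is the fraction of
    runs where `a` failed in which `b` failed as well.
    """
    node_freq = Counter()
    co_fail = Counter()
    for ts, failed in runs:
        if not (start <= ts <= end):
            continue  # outside the requested time range
        node_freq.update(failed)
        # Count every ordered pair of tests that failed in the same run.
        for a, b in permutations(sorted(failed), 2):
            co_fail[(a, b)] += 1
    edges = {(a, b): n / node_freq[a] for (a, b), n in co_fail.items()}
    return node_freq, edges


def implied_pairs(edges, threshold=1.0):
    """Pairs (a, b) where b failed in at least `threshold` of a's failures."""
    return sorted(pair for pair, w in edges.items() if w >= threshold)


# Made-up diagnostic results: (timestamp, failed tests).
runs = [
    (1, {"mem", "net"}),
    (2, {"mem"}),
    (3, {"mem", "net"}),
    (4, {"disk"}),
    (9, {"mem", "net"}),  # falls outside the range used below
]

node_freq, edges = build_relation_graph(runs, start=1, end=5)
redundant = implied_pairs(edges)
```

In this toy data every failure of the hypothetical `net` test coincides with a failure of `mem`, so the edge `net -> mem` has weight 1.0. Under this assumed weighting, such edges mark candidate pairs where one test's outcome adds little information once the other is known, which is one way the relations among failures could be used to narrow a test suite.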
