Bolt: Towards a scalable docker registry via hyperconvergence
Michael Littley, Ali Anwar, et al.
CLOUD 2019
IBM Spectrum Scale’s parallel file system, the General Parallel File System (GPFS), has a 20-year development history with over 100 contributing developers. Its ability to support strict POSIX semantics across more than 10K clients leads to a complex design with intricate interactions between the cluster nodes. Tracing has proven to be a vital tool for understanding the behavior and anomalies of such a complex software product. However, the necessary trace information is often buried in hundreds of gigabytes of by-product trace records. Further, the overhead of tracing can significantly impact running applications and file system performance, limiting the use of tracing in production systems. In this research article, we discuss the evolution of the mature and highly scalable GPFS tracing tool and present an exploratory study of GPFS’ new tracing interface, FlexTrace, which allows developers and users to specify precisely what to trace for the problem they are trying to solve. We evaluate our methodology and prototype, demonstrating that the proposed approach has negligible overhead, even under intensive I/O workloads and with low-latency storage devices.