Compressed linear algebra for large-scale machine learning
Abstract
Large-scale machine learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory. General-purpose, heavy- and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Hence, we initiate work on compressed linear algebra (CLA), in which lightweight database compression techniques are applied to matrices and then linear algebra operations such as matrix-vector multiplication are executed directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show that CLA achieves in-memory operations performance close to the uncompressed case and good compression ratios that allow us to fit larger datasets into available memory. We thereby obtain significant end-to-end performance improvements of up to 26x, or reduced memory requirements.
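To illustrate the core idea of operating directly on compressed data, the following is a minimal sketch (not the paper's CLA implementation): each column is dictionary-encoded as distinct values with row-offset lists, and a matrix-vector product is computed by scattering each value's contribution into the result vector without decompressing. The names compress_column, compress_matrix, and matvec_compressed, and the simple per-column offset-list encoding, are illustrative assumptions rather than the paper's exact formats.

```python
import numpy as np

def compress_column(col):
    """Dictionary-encode one column: distinct nonzero values plus row-offset lists.
    (Illustrative offset-list encoding, not the paper's exact column encodings.)"""
    groups = {}
    for row, v in enumerate(col):
        if v != 0.0:                      # zeros contribute nothing, so store no offsets
            groups.setdefault(v, []).append(row)
    return [(v, np.asarray(rows)) for v, rows in groups.items()]

def compress_matrix(M):
    """Compress each column of M independently."""
    return M.shape[0], [compress_column(M[:, j]) for j in range(M.shape[1])]

def matvec_compressed(compressed, x):
    """Compute y = M x directly on the compressed columns:
    each (value, offsets) pair scatters value * x[j] into y."""
    n_rows, columns = compressed
    y = np.zeros(n_rows)
    for j, groups in enumerate(columns):
        for v, rows in groups:
            y[rows] += v * x[j]
    return y

# Usage: the result matches the uncompressed product.
M = np.array([[1.0, 0.0], [1.0, 2.0], [0.0, 2.0]])
x = np.array([3.0, 4.0])
assert np.allclose(matvec_compressed(compress_matrix(M), x), M @ x)
```

Columns with few distinct values compress well under such encodings, and the multiplication touches each stored offset only once, which is why this style of operation can approach uncompressed performance while using far less memory.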