Frequent directions: Simple and deterministic matrix sketching
Abstract
We describe a new algorithm called FrequentDirections for deterministic matrix sketching in the row-update model. The algorithm is presented an arbitrary input matrix A ϵ Rnxd one row at a time. It performs O(dl) operations per row and maintains a sketch matrix B ϵ Rlxd such that for any {equation presented}.Here, Ak stands for the minimizer of {equation presented} over all rank k matrices (similarly for Bk) and πBk (A) is the rank k matrix resulting from projecting A on the row span of Bk. We show that both of these bounds are the best possible for the space allowed. The summary is mergeable and hence trivially parallelizable. Moreover, FrequentDirections outperforms exemplar implementations of existing streaming algorithms in the space-error tradeo . This paper combines, simplifies, and extends the results of Liberty [Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013], Ghashami and Phillips [Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, 2014], and Woodru [Proceedings of the 27th Annual Conference on Advances in Neural Information Processing Systems, 2014].