PaCK: Scalable parameter-free clustering on K-partite graphs
Abstract
Given an author-paper-conference graph, how can we automatically find groups of authors, papers, and conferences, respectively? Existing methods either (1) require fine-tuning of several parameters, or (2) apply only to bipartite graphs (e.g., an author-paper graph or a paper-conference graph). To address this problem, we propose PaCK for clustering such k-partite graphs. By optimizing an information-theoretic criterion, PaCK searches for the best number of clusters for each type of object and generates the corresponding clustering. What distinguishes PaCK from existing methods for clustering k-partite graphs is that it is parameter-free. Furthermore, it generalizes easily to cases where certain connectivity relations are expressed as tensors, e.g., time-evolving data. The proposed algorithm is scalable in the sense that it scales linearly with the total number of edges in the graphs. We present theoretical analysis as well as experimental evaluations to demonstrate both its effectiveness and its efficiency.
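To make the setting concrete, the sketch below stores a toy author-paper-conference graph as two binary relation matrices and scores a candidate clustering by a block-wise entropy cost. This is an illustration only: the abstract does not state PaCK's actual criterion, so the cost function, the helper names (`binary_entropy`, `block_cost`, `total_cost`), and the toy data are all assumptions. The sketch also omits any charge for describing the clusters themselves, which a complete information-theoretic criterion would need in order to choose the number of clusters.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source; 0 for a pure block."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def block_cost(matrix, row_labels, col_labels):
    """Bits needed to encode a binary matrix block by block (assumed cost)."""
    sizes, ones = {}, {}
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sizes[key] = sizes.get(key, 0) + 1
            ones[key] = ones.get(key, 0) + value
    return sum(n * binary_entropy(ones[k] / n) for k, n in sizes.items())


def total_cost(relations, labels):
    """Sum the block cost over every relation in the k-partite graph."""
    return sum(
        block_cost(m, labels[a], labels[b]) for (a, b), m in relations.items()
    )


# Hypothetical toy graph: 4 authors, 4 papers, 2 conferences.
relations = {
    ("author", "paper"): [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ],
    ("paper", "conference"): [
        [1, 0],
        [1, 0],
        [0, 1],
        [0, 1],
    ],
}

# Two clusters per object type, matching the planted structure.
good = {"author": [0, 0, 1, 1], "paper": [0, 0, 1, 1], "conference": [0, 1]}
# One cluster per object type.
trivial = {"author": [0, 0, 0, 0], "paper": [0, 0, 0, 0], "conference": [0, 0]}

print(total_cost(relations, good))
print(total_cost(relations, trivial))
```

Under this assumed cost, the clustering that matches the planted groups yields pure blocks and therefore a lower cost than placing every object in a single cluster. The paper cluster labels are shared by both relations, which is what couples the author-paper and paper-conference matrices into one k-partite problem.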