Scaling up link prediction with ensembles
Abstract
A network with n nodes contains O(n2) possible links. Even for networks of modest size, it is often difficult to evaluate all pair-wise possibilities for links in a meaningful way. Furthermore, even though link prediction is closely related to missing value estimation problems, such as collaborative filtering, it is often difficult to use sophisticated models such as latent factor methods because of their computational complexity over very large networks. Due to this computational complexity, most known link prediction methods are designed for evaluating the link propensity over a specified subset of links, rather than for performing a global search over the entire networks. In practice, however, it is essential to perform an exhaustive search over the entire networks. In this paper, we propose an ensemble enabled approach to scaling up link prediction, which is able to decompose traditional link prediction problems into sub-problems of smaller size. These subproblems are each solved with the use of latent factor models, which can be effectively implemented over networks of modest size. Furthermore, the ensemble enabled approach has several advantages in terms of performance. We show the advantage of using ensemble-based latent factor models with experiments on very large networks. Experimental results demonstrate the effectiveness and scalability of our approach.