A Divide-and-Conquer Approach for Large-Scale Multi-label Learning
Abstract
Multi-label learning has recently drawn considerable attention because of its many applications, including text classification, image annotation, and query/keyword suggestion. A number of methods have been proposed in recent years to address this challenging task. However, they are either tree-based methods, which have expensive training costs, or embedding-based methods, which have relatively lower accuracy because they rely on simple reduction techniques. This paper addresses the issue by developing an efficient divide-and-conquer approach. Specifically, it involves: a) using the feature vectors to partition the training data into several clusters, b) reformulating the multi-label problem as a recommendation problem by treating each label as an item to be recommended, and c) learning an advanced factorization model within each local cluster to recommend a subset of labels to each point. Extensive experiments on several real-world multi-label datasets demonstrate the efficiency of the proposed algorithm.
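The three steps above can be illustrated with a minimal sketch. It is not the model proposed in this paper: plain Lloyd's k-means stands in for the clustering step, and a truncated SVD of the local label matrix combined with ridge regression from features to point factors stands in for the "advanced factorization model", whose details the abstract does not give. All function names (`kmeans`, `fit_local`, `fit`, `predict_topk`, `make_toy_data`) and parameters (`k`, `rank`, `reg`, `top_k`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Step (a), assumed variant: plain Lloyd's k-means on the feature vectors."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for c in range(k):
            members = X[assign == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    # Recompute assignments so they match the final centroids.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)

def fit_local(X, Y, rank, reg):
    """Steps (b)+(c), stand-in model: rows of Y are 'users', columns are 'items'."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    r = min(rank, len(s))
    P = U[:, :r] * s[:r]          # point (user) factors, shape (n, r)
    Q = Vt[:r].T                  # label (item) factors, shape (L, r)
    # Ridge regression from features to point factors, so that unseen
    # points (cold-start "users") can still be scored against the labels.
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ P)
    return W, Q

def fit(X, Y, k=2, rank=2, reg=1.0):
    """Divide: cluster the data. Conquer: fit one local model per cluster."""
    centroids, assign = kmeans(X, k)
    kept, models = [], []
    for c in range(k):
        idx = assign == c
        if idx.any():             # skip clusters that ended up empty
            kept.append(centroids[c])
            models.append(fit_local(X[idx], Y[idx], rank, reg))
    return np.array(kept), models

def predict_topk(x, centroids, models, top_k=3):
    """Route x to its nearest cluster and recommend the top-scoring labels."""
    c = ((centroids - x) ** 2).sum(axis=1).argmin()
    W, Q = models[c]
    scores = (x @ W) @ Q.T
    return np.argsort(-scores)[:top_k]

def make_toy_data(n_per_group=50, d=4, seed=1):
    """Synthetic data: two well-separated groups with disjoint label sets."""
    rng = np.random.default_rng(seed)
    Xa = rng.normal(5.0, 1.0, size=(n_per_group, d))
    Xb = rng.normal(-5.0, 1.0, size=(n_per_group, d))
    Ya = np.tile([1.0, 1.0, 1.0, 0.0, 0.0, 0.0], (n_per_group, 1))
    Yb = np.tile([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], (n_per_group, 1))
    return np.vstack([Xa, Xb]), np.vstack([Ya, Yb])

X, Y = make_toy_data()
centroids, models = fit(X, Y, k=2, rank=2)
labels_for_a = predict_topk(np.full(4, 5.0), centroids, models)
labels_for_b = predict_topk(np.full(4, -5.0), centroids, models)
```

Because each local model only sees the points and labels of its own cluster, the factorization in every subproblem is much smaller than one fitted on the full label matrix, which is the intuition behind the divide-and-conquer design. The toy data is deliberately trivial and says nothing about accuracy on real datasets.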