DivGroup: A Diversified Approach to Divide Collection of Patterns into Uniform Groups
Abstract
Similarity-based grouping of patterns has been explored extensively under the well-established clustering paradigm in pattern recognition and machine learning. In clustering, objects in the same cluster are similar to each other, and objects belonging to different clusters are dissimilar in a corresponding sense. However, it is not rare to come across situations where, instead of a similarity-based grouping, groups of diverse objects are needed. Applications such as resource allocation across different parts of an organization, cross-validation splits of a dataset with class imbalance, and heterogeneous or mixed-ability partitioning of students require each group to contain a diverse set of patterns. Moreover, these applications also demand that different groups be similar to each other in some sense. In this work, we propose a generic framework for partitioning a collection of patterns into a set of groups such that the above two criteria are fulfilled. To the best of our knowledge, this is the first work to propose such a framework independent of any particular application. Finding an optimal solution to the problem we formulate turns out to be NP-hard, so we propose an approximate solution. We conduct experiments on both synthetic and real-world datasets to evaluate the performance of the proposed algorithm. We show the merit of the algorithm by comparing its results with related state-of-the-art baseline methods.
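To make the two criteria concrete, the following minimal sketch scores a partition on within-group diversity and between-group uniformity for one-dimensional data. It is an illustration only and not the algorithm proposed in this work: the serpentine ("snake draft") heuristic, the function names `serpentine_groups`, `within_group_diversity`, and `between_group_gap`, and the toy scores are all assumptions introduced here.

```python
from itertools import combinations
from statistics import mean


def serpentine_groups(scores, k):
    """Deal items, sorted by descending score, into k groups in snake order.

    Illustrative heuristic only (assumption), not the paper's method.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    groups = [[] for _ in range(k)]
    for rank, idx in enumerate(order):
        rnd, pos = divmod(rank, k)
        # Reverse the dealing direction on every other round.
        g = pos if rnd % 2 == 0 else k - 1 - pos
        groups[g].append(idx)
    return groups


def within_group_diversity(scores, group):
    """Mean absolute pairwise difference inside one group (higher = more diverse)."""
    pairs = list(combinations(group, 2))
    return mean(abs(scores[a] - scores[b]) for a, b in pairs) if pairs else 0.0


def between_group_gap(scores, groups):
    """Range of the group means (lower = groups more similar to each other)."""
    means = [mean(scores[i] for i in g) for g in groups]
    return max(means) - min(means)


# Hypothetical ability scores for eight students.
scores = [95, 90, 85, 80, 60, 55, 50, 45]

diverse = serpentine_groups(scores, 2)   # mixed-ability grouping
clustered = [[0, 1, 2, 3], [4, 5, 6, 7]]  # similarity-based grouping, for contrast

print(diverse, between_group_gap(scores, diverse))
print(clustered, between_group_gap(scores, clustered))
```

On this toy input the mixed-ability split yields two groups with identical means, whereas the similarity-based split separates high and low scorers, which is the contrast between clustering and the grouping problem described above.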