An iterative and re-weighting framework for rejection and uncertainty resolution in crowdsourcing
Abstract
In practical applications of crowdsourcing, labelers may be uncertain about a particular instance, or may refuse (reject) to label it, because of its inherent difficulty. In large-dataset applications, each labeler may also be given a different set of instances. Together, these issues lead to missing and uncertain labels, and existing crowdsourcing methods have limited capabilities when both problems are present. In this paper, we propose an Iterative Re-weighted Consensus Maximization framework to address the missing and uncertain label problem. The intuitive idea is to use an iterative framework to estimate each labeler's hidden competence and to formulate the problem as spectral clustering in the functional space, in order to minimize the overall loss given missing and uncertain information. One main advantage of the proposed method over state-of-the-art approaches based on Bayesian model averaging is that it uncovers the intrinsic consistency among different sets of answers and mines the best possible ground truth. Formal analysis demonstrates that the proposed framework has lower generalization error than the widely adopted majority voting techniques for crowdsourcing. Experimental studies show that the proposed framework outperforms state-of-the-art baselines on several benchmark datasets. Copyright © 2012 by the Society for Industrial and Applied Mathematics.
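To make the iterate-and-re-weight idea concrete, the following is a minimal sketch of a generic re-weighted voting loop that tolerates missing labels. It is not the spectral clustering formulation proposed in the paper; the function name `iterative_weighted_vote`, the use of `None` for a missing or rejected label, and the agreement-rate weight update are all assumptions made for illustration only.

```python
from collections import defaultdict

def iterative_weighted_vote(labels, n_iter=10):
    """Illustrative sketch (hypothetical helper, not the paper's algorithm).

    labels[i][j] is labeler j's label for instance i, or None when the
    labeler was not given the instance or rejected it.
    Returns (consensus labels, per-labeler weights).
    """
    n_labelers = len(labels[0])
    weights = [1.0] * n_labelers  # start from plain majority voting
    consensus = []
    for _ in range(n_iter):
        # Step 1: weighted vote per instance, skipping missing labels.
        consensus = []
        for row in labels:
            scores = defaultdict(float)
            for j, lab in enumerate(row):
                if lab is not None:
                    scores[lab] += weights[j]
            # Ties are broken in favor of the smallest label value.
            consensus.append(max(sorted(scores), key=scores.get) if scores else None)
        # Step 2: re-weight each labeler by how often they agree with the
        # current consensus, counting only instances they actually labeled.
        new_weights = []
        for j in range(n_labelers):
            seen = [(row[j], c) for row, c in zip(labels, consensus)
                    if row[j] is not None and c is not None]
            if seen:
                new_weights.append(sum(l == c for l, c in seen) / len(seen))
            else:
                new_weights.append(weights[j])  # no evidence: keep old weight
        if new_weights == weights:
            break  # weights stopped changing
        weights = new_weights
    return consensus, weights

# Toy example: three labelers, four instances, two missing labels.
toy = [
    [1, 1, 0],
    [0, 0, 1],
    [1, None, 0],
    [None, 1, 1],
]
consensus, weights = iterative_weighted_vote(toy)
print(consensus, weights)
```

In this toy example the third labeler disagrees with the other two on most instances, so the loop assigns that labeler a lower weight than plain majority voting would, and the tie on the third instance is resolved in favor of the more consistent labeler.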