Zhixian Yan, Dipanjan Chakraborty, et al.
EDBT 2011
In this paper, we present several techniques for improving natural language call routing. We first describe methods that improve a single classifier: boosting, discriminative training (DT), and automatic relevance feedback (ARF). A notable feature of some of these algorithms is their ability to re-weight the training data so that the classifier focuses on documents judged difficult to classify. We then explore ways of deriving and combining uncorrelated classifiers to improve accuracy, discussing in particular the linear interpolation and constrained minimization techniques. All of these approaches are probabilistic and are inspired by the information retrieval domain. They are evaluated using two similarity metrics: the common cosine measure from the vector space model, and a beta measure that gave good results in the similar task of e-mail steering. Compared with the baseline classifiers, we show improvements in classification accuracy on call routing for a banking task: up to 20% for the ARF method, up to 30% for the boosting technique, and more than 45% for the DT approach. A further relative improvement of 11% is obtained when we combine the classifiers with the constrained minimization approach using a confusion measure and DT. More importantly, DT has a synergistic effect on the boosting algorithm: more iterations were possible because DT reduced the classification error rate of individual classifiers trained on re-weighted data by an average of 72%. © 2003 Elsevier B.V. All rights reserved.
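
To make two of the ingredients named in the abstract concrete, the following is a minimal Python sketch of cosine-based routing in the vector space model, followed by linear interpolation of two classifiers' scores. It is an illustration under stated assumptions, not the authors' implementation: the route names, the toy training phrases, the function names (`cosine`, `route_scores`, `interpolate`, `route`), and the interpolation weight `lam` are all hypothetical, and the beta measure, boosting, DT, ARF, and constrained minimization are not modeled here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def route_scores(query, routes):
    """Score a caller utterance against every route's term vector."""
    q = Counter(query.lower().split())
    return {name: cosine(q, vec) for name, vec in routes.items()}


def interpolate(scores_a, scores_b, lam):
    """Linear interpolation: lam * A + (1 - lam) * B for each route."""
    return {r: lam * scores_a[r] + (1 - lam) * scores_b[r] for r in scores_a}


# Hypothetical toy data: two classifiers built from different training phrases.
ROUTES_A = {
    "balance": Counter("check my account balance".split()),
    "loans": Counter("apply for a car loan".split()),
}
ROUTES_B = {
    "balance": Counter("how much money is in my account".split()),
    "loans": Counter("loan mortgage interest rate".split()),
}


def route(query, lam=0.5):
    """Pick the route with the highest interpolated score."""
    combined = interpolate(
        route_scores(query, ROUTES_A), route_scores(query, ROUTES_B), lam
    )
    return max(combined, key=combined.get)
```

For example, `route("what is my account balance")` selects the hypothetical `balance` route, since both toy classifiers give the `loans` route a score of zero. In practice the weight `lam` would be tuned on held-out data; the fixed value of 0.5 is only a placeholder.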