Publication
EMNLP 2006
Conference paper
Empirical study on the performance stability of named entity recognition model across domains
Abstract
When a machine learning-based named entity recognition system is employed in a new domain, its performance usually degrades. In this paper, we provide an empirical study on the impact of training data size and domain information on the performance stability of named entity recognition models. We present an informative sample selection method for building high-quality, stable named entity recognition models across domains. Experimental results show that the performance of the named entity recognition model is enhanced significantly after being trained with these informative samples. © 2006 Association for Computational Linguistics.