Resolving entity morphs in censored data
Abstract
In some societies, internet users have to create information morphs (e.g., "Peace West King" to refer to "Bo Xilai") to avoid active censorship or achieve other communication goals. In this paper, we aim to solve a new problem: resolving entity morphs to their real targets. We exploit temporal constraints to collect cross-source comparable corpora relevant to any given morph query and to identify target candidates. We then propose various novel similarity measurements, including surface features, meta-path-based semantic features, and social correlation features, and combine them in a learning-to-rank framework. Experimental results on Chinese Sina Weibo data demonstrate that our approach is promising and significantly outperforms baseline methods.
© 2013 Association for Computational Linguistics.