Topic-specific parser design in an air travel natural language understanding application
Abstract
In this paper we contrast a traditional approach to semantic parsing for Natural Language Understanding applications, in which a single parser covers the whole application domain, with an alternative approach that uses a collection of smaller parsers, each able to handle only a portion of the domain. We implement this topic-specific parsing strategy by splitting the training corpus into subject-specific subsets and developing a corresponding subject parser from each subset. We demonstrate this procedure on the DARPA Communicator task and observe that, given an appropriate smoothing mechanism to overcome data sparseness, the set of subject-specific parsers performs as effectively (in terms of accuracy) as the original single parser. We present experiments under both supervised and unsupervised subject selection modes.