Encoding extraction as inferences
Abstract
The analysis of natural-language text involves many different kinds of processes, each of which can be described in several ways. One way to describe these processes is in terms of the semantics of their requirements and results. Such a description makes it possible to view these processes as analogous to inference rules in a theorem-proving system. This analogy is useful for metacognition because theory and infrastructure for manipulating inference rules already exist. This paper presents a representational framework for text analysis processes. We describe a taxonomy of text extraction tasks that we have represented as inference rules. We also describe a working system that encodes the behavior of text analysis components as a graph of inferences. This representation is currently used to present browsable explanations of text extraction; in future work, we expect to perform additional automated reasoning over this encoding of text analysis processes.
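
To make the analogy concrete, the following is a minimal sketch and not the system described in this paper. It assumes that each text analysis component can be summarized by the kinds of annotation it requires and the kind it produces. Every name in it (Rule, Inference, run, explain, and the three example components) is hypothetical and chosen only for illustration. Applying the rules by simple forward chaining records a graph of inferences, and walking that graph backwards yields a plain-text explanation of how a result was derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A text analysis process, described by its requirements and its result."""
    name: str
    requires: tuple  # kinds of annotation the process needs as input
    yields: str      # kind of annotation the process produces


@dataclass(frozen=True)
class Inference:
    """One recorded application of a rule: premises in, one conclusion out."""
    rule: str
    premises: tuple
    conclusion: str


def run(rules, given):
    """Forward-chain: fire every rule whose requirements are all available."""
    known = set(given)
    graph = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            ready = all(kind in known for kind in rule.requires)
            if ready and rule.yields not in known:
                known.add(rule.yields)
                graph.append(Inference(rule.name, rule.requires, rule.yields))
                changed = True
    return graph


def explain(kind, graph, depth=0):
    """Return indented lines tracing how `kind` was derived."""
    indent = "  " * depth
    producer = next((i for i in graph if i.conclusion == kind), None)
    if producer is None:
        # Nothing in the graph concluded this kind, so it was an input.
        return [f"{indent}{kind} (given)"]
    lines = [f"{indent}{kind} <- {producer.rule}"]
    for premise in producer.premises:
        lines.extend(explain(premise, graph, depth + 1))
    return lines


# Hypothetical components, used only to exercise the sketch.
RULES = [
    Rule("tokenizer", ("text",), "tokens"),
    Rule("entity_tagger", ("tokens",), "entities"),
    Rule("relation_extractor", ("tokens", "entities"), "relations"),
]

graph = run(RULES, given={"text"})
explanation = explain("relations", graph)
print("\n".join(explanation))
```

The sketch records only which kinds of annotation exist, not their values. A fuller encoding would attach the extracted spans and their provenance to each node of the graph, which is what a browsable explanation would need to display.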