Intelligent forms processing system
Abstract
This paper describes an intelligent forms processing system (IFPS) which provides capabilities for automatically indexing form documents for storage/retrieval to/from a document library and for capturing information from scanned form images using intelligent character recognition (ICR). The system also provides capabilities for efficiently storing form images. IFPS consists of five major processing components: (1) An interactive document analysis stage that analyzes a blank form in order to define a model of each type of form to be accepted by the system; the parameters of each model are stored in a form library. (2) A form recognition module that collects features of an input form in order to match it against one represented in the form library; the primary features used in this step are the pattern of lines defining data areas on the form. (3) A data extraction component that registers the selected model to the input form, locates data added to the form in fields of interest, and removes the data image to a separate image area. A simple mask defining the center of the data region suffices to initiate the extraction process; search routines are invoked to track data that extends beyond the masks. Other special processing is called on to detect lines that intersect the data image and to delete the lines with minimum distortion to the rest of the image. (4) An ICR unit that converts the extracted image data to symbol code for input to data base or other conventional processing systems. Three types of ICR logic have been implemented in order to accommodate monospace typing, proportionally spaced machine text, and handprinted alphanumerics. (5) A forms dropout module that removes the fixed part of a form and retains only the data filled in for storage. The stored data can be later combined with the fixed form to reconstruct the original form. This provides for extremely efficient storage of form images, thus making possible the storage of very large number of forms in the system. IFPS is implemented as part of a larger image management system called Image and Records Management system (IRM). It is being applied in forms data management in several state government applications. © 1992 Springer-Verlag New York Inc.