Conference paper

Study of human and machine discrete utterance recognition (DUR)


Performance evaluation of DUR systems has typically consisted of reporting percentage-correct recognition (PCR) for specific vocabularies. This measure is misleading because it presumes that 100% recognition is equally achievable for all vocabularies, which is not the case. This paper presents results from an experiment that compared human-listener performance with that of a particular recognition machine on three different vocabularies. Preliminary results for normalizing machine performance with respect to the difficulty of a test vocabulary are given. Relevant data from the experiment are included to demonstrate the problem and its potential solution.
