STORY SEGMENTATION AND TOPIC DETECTION FOR RECOGNIZED SPEECH
S. Dharanipragada, Martin Franz, et al.
INTERSPEECH - Eurospeech 1999
Previous work on the distribution of words in documents has shown that word repetitiveness is an important indicator of a word's content-bearing characteristics. In this paper we propose a simple method that uses a measure of the tendency of words to repeat within a document to separate words with similar document frequencies but different topic-discriminating characteristics. We describe the application of the new measure in query-document relevance scoring. Experiments on the TREC Ad Hoc and Spoken Document Retrieval tasks show useful performance improvements.
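The abstract does not state the measure itself, so the following is only a minimal sketch of the general idea, not the authors' formula. It assumes one common proxy for repetitiveness: the ratio of a word's collection frequency to its document frequency, that is, its average number of occurrences in the documents that contain it. The function name `repetition_ratio`, the whitespace tokenization, and the toy documents are all hypothetical.

```python
from collections import Counter


def repetition_ratio(docs):
    """Map each word to collection_frequency / document_frequency.

    This ratio is an assumed stand-in for "tendency to repeat within a
    document"; the paper's actual measure may differ.
    """
    cf = Counter()  # total occurrences across all documents
    df = Counter()  # number of documents containing the word
    for doc in docs:
        tokens = doc.lower().split()  # naive whitespace tokenization
        cf.update(tokens)
        df.update(set(tokens))
    return {word: cf[word] / df[word] for word in cf}


# Toy collection: "storm" and "today" both occur in two documents,
# but only "storm" repeats inside a single document.
docs = [
    "the storm hit the coast and the storm grew",
    "the market fell today",
    "a storm of protest met the ruling today",
]
ratios = repetition_ratio(docs)
print(ratios["storm"], ratios["today"])
```

In this toy collection the two words have the same document frequency, yet the ratio is higher for the word that repeats within a document, which is the kind of separation the abstract describes.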