Publication
SIGIR 2007
Conference paper
Story segmentation of broadcast news in Arabic, Chinese and English using multi-window features
Abstract
The paper describes a maximum-entropy-based story segmentation system for Arabic, Chinese and English. In experiments with broadcast news data from TDT-3, TDT-4, and corpora collected in the DARPA GALE project, we obtain a substantial performance gain by using multiple overlapping windows for text-based features.
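The abstract does not specify which text-based features or window sizes the system uses. As a rough illustration of the general idea only, the sketch below computes a lexical cosine similarity between the text to the left and to the right of a candidate story boundary, once per window size, so that the windows overlap around the same boundary. The function names, the choice of cosine similarity, and the window sizes are all assumptions made for this example and are not taken from the paper; in a system like the one described, such values would presumably serve as inputs to the maximum entropy classifier.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def multi_window_features(tokens, pos, window_sizes=(2, 4, 8)):
    """Hypothetical multi-window features for a candidate boundary at `pos`.

    For each window size w, compare the w tokens before `pos` with the
    w tokens starting at `pos`. The window sizes are illustrative only.
    """
    feats = {}
    for w in window_sizes:
        left = Counter(tokens[max(0, pos - w):pos])
        right = Counter(tokens[pos:pos + w])
        feats[f"cos_w{w}"] = cosine(left, right)
    return feats


# Toy example: the vocabulary changes completely at position 4.
tokens = ["a", "b", "a", "b", "c", "d", "c", "d"]
at_boundary = multi_window_features(tokens, 4, window_sizes=(2, 4))
inside_story = multi_window_features(tokens, 2, window_sizes=(2,))
```

In this toy input, low similarity across every window size at position 4 suggests a topic change there, while the high similarity at position 2 suggests the text on both sides belongs to the same story.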