Magic Conditions
Inderpal Singh Mumick, Sheldon J. Finkelstein, et al.
ACM TODS
This tutorial makes the case for developing a unified framework that manages information extraction from unstructured data (focusing in particular on text). We first survey research on information extraction in the database, AI, NLP, IR, and Web communities in recent years. Then we discuss why this is the right time for the database community to actively participate and address the problem of managing information extraction (including in particular the challenges of maintaining and querying the extracted information, and accounting for the imprecision and uncertainty inherent in the extraction process). Finally, we show how interested researchers can take the next step, by pointing to open problems, available datasets, applicable standards, and software tools. We do not assume prior knowledge of text management, NLP, extraction techniques, or machine learning. Copyright 2006 ACM.