Linking tabular columns to unseen ontologies
Abstract
We introduce a novel approach for linking table columns to types in an ontology that is unseen during training. Because the model has no access to the target ontology during training, this can be viewed as a zero-shot linking task at the ontological level. The task arises for businesses that wish to semantically enrich their tabular data with types from custom or industry-specific ontologies without the benefit of initial supervision. In this paper, we describe specific approaches and provide datasets for this new task: models are trained on open-domain tables using a broad source ontology and evaluated on increasingly difficult tables whose target ontologies have different levels of type granularity. Using pre-trained Transformer encoder models and a range of encoding strategies, we explore how to encode increasing amounts of ontological knowledge, such as type glossaries and taxonomies, to obtain better zero-shot performance. We evaluate these methods empirically through extensive experiments on three new public benchmark datasets.
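To make the task setup concrete, the following is a minimal sketch of zero-shot column-to-type linking, not the method evaluated in this paper. It assumes a toy target ontology (`ONTOLOGY`), hypothetical helper names (`serialize_type`, `serialize_column`, `link_column`), and a bag-of-words stand-in (`embed`) in place of a pre-trained Transformer encoder, so that the example runs without external dependencies. It illustrates how a type's label can be enriched with its glossary definition and taxonomy ancestors before being compared with a serialized column.

```python
from collections import Counter
from math import sqrt

# Hypothetical target ontology: type -> (glossary definition, parent type or None).
ONTOLOGY = {
    "Person": ("a human being", None),
    "Athlete": ("a person who competes in sports", "Person"),
    "City": ("a large populated settlement", "Place"),
    "Place": ("a geographic location", None),
}


def ancestors(type_name):
    """Follow parent links to collect the taxonomy path above a type."""
    path = []
    parent = ONTOLOGY[type_name][1]
    while parent is not None:
        path.append(parent)
        parent = ONTOLOGY[parent][1]
    return path


def serialize_type(type_name, use_glossary=True, use_taxonomy=True):
    """Encode a type as text: label, optional glossary, optional ancestors."""
    parts = [type_name]
    if use_glossary:
        parts.append(ONTOLOGY[type_name][0])
    if use_taxonomy:
        parts.extend(ancestors(type_name))
    return " ".join(parts)


def serialize_column(header, cells):
    """Encode a column as text: header followed by its cell values."""
    return " ".join([header, *cells])


def embed(text):
    """Stand-in for a pre-trained encoder (assumption): bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_column(header, cells, **encoding_options):
    """Score every candidate type against the column and return the best one."""
    column_vec = embed(serialize_column(header, cells))
    scores = {
        t: cosine(column_vec, embed(serialize_type(t, **encoding_options)))
        for t in ONTOLOGY
    }
    return max(scores, key=scores.get), scores


best, scores = link_column("settlement", ["Paris", "Berlin"])
_, scores_label_only = link_column(
    "settlement", ["Paris", "Berlin"], use_glossary=False, use_taxonomy=False
)
```

In this toy example the column matches `City` only through the word "settlement" in the glossary definition; with the label alone, no candidate type receives a non-zero score. Because the candidate types are supplied as text at inference time, a different ontology can be substituted without retraining, which is the sense in which the setting is zero-shot.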