Publication
PAKDD 2021
Workshop paper
Data Augmentation for Fairness in Personal Knowledge Graph Population
Abstract
Cold-start knowledge base population (KBP) is the problem of populating a knowledge base from unstructured documents. While neural networks have improved the individual tasks that make up KBP, the overall F1 of the end-to-end system remains quite low. The problem is more acute for personal knowledge bases, which present additional challenges with regard to data protection, fairness and privacy. In this work, we use data augmentation to populate a more complete personal knowledge base from the TACRED dataset. We then use explainability techniques and representative set sampling to show that the augmented knowledge base is also fairer and more diverse.