Tracking the impact of fact deletions on knowledge graph queries using provenance polynomials
Abstract
Critical business applications in domains ranging from technical support to healthcare increasingly rely on large-scale, automatically constructed knowledge graphs. These applications use the results of complex queries over knowledge graphs to help users make crucial decisions, such as which drug to administer or whether certain actions comply with all regulatory requirements. However, these knowledge graphs evolve constantly, and newer versions may adversely affect the query results on which earlier business decisions were based. We propose a framework based on provenance polynomials to track the impact of knowledge graph changes on arbitrary SPARQL query results. Focusing on the deletion of facts, we show how to efficiently determine which queries are impacted by a change, develop ways to incrementally maintain these polynomials, and present an efficient implementation on top of RDF graph databases. Our experimental evaluation over large-scale RDF/SPARQL benchmarks shows the effectiveness of our proposal.
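To make the core idea concrete, the following is a minimal sketch of how a provenance polynomial can reveal the effect of a fact deletion on a single query answer. It is an illustration under stated assumptions, not the framework's implementation: it assumes the usual semiring reading in which a product models facts used jointly (a join) and a sum models alternative derivations, and all names (`fact`, `times`, `plus`, `delete_fact`, `is_impacted`, and the fact identifiers `f1` to `f4`) are hypothetical.

```python
from collections import Counter

# Assumed representation (hypothetical, for illustration only):
# a monomial is a sorted tuple of fact identifiers (repeats encode exponents),
# and a polynomial maps each monomial to a natural-number coefficient.

def fact(name):
    """Polynomial consisting of a single fact variable."""
    return Counter({(name,): 1})

def times(p, q):
    """Product of two polynomials: facts used together, as in a join."""
    out = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return out

def plus(p, q):
    """Sum of two polynomials: alternative derivations of the same answer."""
    return Counter(p) + Counter(q)

def is_impacted(p, name):
    """An answer is impacted if some derivation mentions the deleted fact."""
    return any(name in monomial for monomial in p)

def delete_fact(p, name):
    """Set the variable `name` to zero: drop every monomial that contains it."""
    return Counter({m: c for m, c in p.items() if name not in m})

# One answer with two derivations: (f1 joined with f2) or f3 alone.
answer = plus(times(fact("f1"), fact("f2")), fact("f3"))

after_f1 = delete_fact(answer, "f1")    # the f3 derivation survives
after_both = delete_fact(after_f1, "f3")  # no derivation is left

print(is_impacted(answer, "f1"), is_impacted(answer, "f4"))
print(dict(after_f1), dict(after_both))
```

In this sketch, an answer whose polynomial becomes empty after a deletion no longer holds, while an answer that keeps at least one monomial is still supported by the remaining facts. Updating the stored polynomial in place, rather than re-running the query, is the simplest form of the incremental maintenance the abstract refers to.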