MGrid: A matching grid for querying XML streams
Abstract
Efficiently querying of XML streams has gained importance as XML streams have become popular. Many previous XML stream processors suffer from tracking the explosive number of matching paths. In this paper, we present techniques, based on notions of information passing and inheritance, which can be used to dramatically reduce the time and space complexity in querying XML streams from polynomial to linear. The main contributions of the paper are: (1) we present a qTree, where each query node is modeled as an autonomous unit which takes inputs from its children, executes its own logic, and generates output to its parent; (2) we introduce matching grid (mGrid) that creates matching units (mUnits) according to the templates in a qTree and implicitly tracks matching paths; (3) we leverage information passing and inheritance between mUnits in an mGrid; and (4) we theoretically prove that the time and space complexity for explicitly tracking matching paths is O(nk/k!) for a single-branch document of depth n and a single-branch query of depth k (k<<n). Our experiments show that mGrid performs well and is highly scalable with both the document size and its depth.