Anomaly detection in information streams without prior domain knowledge
Abstract
A key goal of information analytics is to identify patterns of anomalous behavior. Identifying anomalies is required in a variety of applications, such as systems management, sensor networks, and security. However, most state-of-the-art anomaly detection relies on a predefined knowledge base. This knowledge base may consist of a set of policies and rules, a set of templates representing known patterns in the data, or a description of the events that constitute anomalous behavior. In practice, a significant limitation of such approaches is the effort required to define and create the knowledge base, along with the need for prior information about the domain. In this paper, we present an approach that can identify anomalies in an information stream without requiring any prior domain knowledge. The proposed approach simultaneously monitors and analyzes the data stream at multiple temporal scales and learns the evolution of normal behavior over time at each scale. The approach is not sensitive to the choice of distance metric and hence is applicable across various domains and applications. We have studied the effectiveness of the approach using different data sets. © 2011 IBM.
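The idea of monitoring a stream at several temporal scales while learning normal behavior at each scale can be illustrated with a minimal sketch. This is not the method of the paper, whose details the abstract does not give. It is one simple instance under stated assumptions: each scale summarizes non-overlapping windows by their mean, "normal" is a bounded history of recent non-anomalous summaries, and a window is flagged when its distance from the center of that history exceeds a multiple of the typical distance. The names `ScaleModel` and `MultiScaleDetector`, the window sizes, and the `history`, `warmup`, and `threshold` parameters are all hypothetical.

```python
from collections import deque
from statistics import fmean
from typing import Callable, Dict, List, Sequence


def abs_diff(a: float, b: float) -> float:
    """Default distance; any other metric on window summaries can be swapped in."""
    return abs(a - b)


class ScaleModel:
    """Learns normal behavior at one temporal scale (hypothetical helper)."""

    def __init__(self, window: int, history: int = 50, warmup: int = 5,
                 threshold: float = 3.0,
                 distance: Callable[[float, float], float] = abs_diff):
        self.window = window          # number of observations per window
        self.warmup = warmup          # windows to observe before flagging anything
        self.threshold = threshold    # multiple of typical distance that counts as anomalous
        self.distance = distance
        self.buffer: List[float] = []
        self.normal = deque(maxlen=history)  # recent non-anomalous window summaries

    def update(self, value: float) -> bool:
        """Feed one observation; return True if a just-completed window is anomalous."""
        self.buffer.append(value)
        if len(self.buffer) < self.window:
            return False
        summary = fmean(self.buffer)
        self.buffer.clear()

        anomalous = False
        if len(self.normal) >= self.warmup:
            center = fmean(self.normal)
            spread = fmean([self.distance(s, center) for s in self.normal])
            # Small floor avoids a zero threshold on a perfectly regular stream.
            limit = self.threshold * max(spread, 1e-9)
            anomalous = self.distance(summary, center) > limit

        # Only non-anomalous windows update the model, so "normal" can drift
        # over time without absorbing outliers.
        if not anomalous:
            self.normal.append(summary)
        return anomalous


class MultiScaleDetector:
    """Runs one ScaleModel per window size over the same stream."""

    def __init__(self, windows: Sequence[int] = (1, 4, 16)):
        self.models = [ScaleModel(w) for w in windows]

    def update(self, value: float) -> List[int]:
        """Return the window sizes at which an anomaly was just detected."""
        return [m.window for m in self.models if m.update(value)]


# Usage: a regular alternating signal with a single injected spike.
stream = [10.0 if i % 2 == 0 else 11.0 for i in range(80)]
stream[40] = 100.0

detector = MultiScaleDetector(windows=(1, 4))
alerts: Dict[int, List[int]] = {}
for i, x in enumerate(stream):
    flagged = detector.update(x)
    if flagged:
        alerts[i] = flagged

print(alerts)
```

In this sketch no rules or templates are supplied in advance: the only inputs are the stream itself and a distance function, which is passed in as a parameter so that a different metric can be substituted without changing the rest of the logic. Coarser scales report an anomaly only when their window completes, so the same event may surface at different times at different scales.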