Publication
NeurIPS 2024
Workshop paper

Towards Unbiased Evaluation of Time-series Anomaly Detector

Abstract

Time series anomaly detection (TSAD) is an evolving area of research motivated by critical applications such as detecting seismic activity, identifying sensor failures in industrial plants, and predicting stock market crashes. Because anomalies are rare events, the F1-score is the most commonly adopted metric for anomaly detection. In time series, however, the standard F1-score suffers from a dissociation between ‘time points’ and ‘time events’. To accommodate this, anomaly predictions are adjusted before the F1-score is computed, a step called point adjustment (PA). These adjustments are heuristic and biased towards true positive detection, which results in over-estimated detector performance. Nevertheless, the current time-series foundation model literature continues to use PA for model evaluation, so the resulting assessments are not a true indication of model performance. This work proposes an alternative adjustment protocol called “Balanced point adjustment” (BA). It addresses the limitations of existing point adjustments and provides fairness guarantees backed by axiomatic definitions of TSAD evaluation.
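
The bias of PA towards true positives can be illustrated with a minimal sketch. It assumes the commonly used form of point adjustment, in which a whole ground-truth anomaly segment is marked as detected as soon as one of its points is flagged. The function names `point_adjust` and `f1_score` and the toy labels are hypothetical, chosen only for this illustration; the sketch does not implement the proposed BA protocol, which the abstract does not specify.

```python
import numpy as np


def point_adjust(y_true, y_pred):
    """Commonly used point adjustment (assumed form, not the paper's BA):
    if any point of a ground-truth anomaly segment is flagged,
    mark every point of that segment as flagged."""
    y_true = np.asarray(y_true, dtype=bool)
    adjusted = np.asarray(y_pred, dtype=bool).copy()
    n = len(y_true)
    i = 0
    while i < n:
        if y_true[i]:
            # Find the end (exclusive) of the current anomaly segment.
            j = i
            while j < n and y_true[j]:
                j += 1
            if adjusted[i:j].any():
                adjusted[i:j] = True
            i = j
        else:
            i += 1
    return adjusted


def f1_score(y_true, y_pred):
    """Point-wise F1 = 2TP / (2TP + FP + FN); returns 0.0 if undefined."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): one 5-point anomaly event, of which the
# detector hits a single point, plus one false alarm.
y_true = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

adjusted = point_adjust(y_true, y_pred)
raw_f1 = f1_score(y_true, y_pred)    # 2/7, about 0.29
pa_f1 = f1_score(y_true, adjusted)   # 10/11, about 0.91

print(f"F1 without PA: {raw_f1:.3f}")
print(f"F1 with PA:    {pa_f1:.3f}")
```

In this toy case a single correct point lifts the F1-score from roughly 0.29 to roughly 0.91, while the false alarm is left untouched. This asymmetry is the kind of true-positive bias the abstract refers to.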