Albert Atserias, Anuj Dawar, et al.
Journal of the ACM
Privacy preservation is becoming a critical issue for data-mining processes. In practice, a data transformation process is often needed to preserve privacy. However, data transformation introduces a data quality issue, so its impact on data quality should be estimated and made clear to the user of the transformation process. In this article, we consider the problem of k-anonymization transformation in associative classification. We address the privacy preservation and data quality issues in two ways. First, we propose a frequency-based data quality metric to represent the data quality for associative classification. Second, we propose a novel heuristic algorithm, namely minimum classification correction rate transformation. The algorithm is guided by the classification correction rate of the given datasets. We validate the proposed metric and algorithm with University of California-Irvine repository datasets. The experimental results show that the proposed metric can effectively demonstrate the data quality for associative classification, and that the proposed algorithm is both efficient and highly effective.
Zhikun Yuen, Paula Branco, et al.
DSAA 2023
Daniel Karl I. Weidele, Hendrik Strobelt, et al.
SysML 2019
Kenneth L. Clarkson, Elad Hazan, et al.
Journal of the ACM