Template-based privacy preservation in classification problems
Ke Wang
Abstract
In this paper, we present a template-based approach to privacy preservation that protects against threats arising from data mining. The problem has dual goals: preserve the information needed for a desired classification analysis, and limit the usefulness of unwanted sensitive inferences that may be derived from the data. Sensitive inferences are specified by a set of "privacy templates". Each template specifies the sensitive information to be protected, a set of identifying attributes, and the maximum association allowed between the two. We show that suppressing domain values is an effective way to eliminate sensitive inferences. For a large data set, finding an optimal suppression is hard, since it requires optimization over all possible suppressions. We present an approximate but scalable solution and demonstrate its effectiveness on real-life data sets. © 2005 IEEE.
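To make the notion of a privacy template concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that the "association" between identifying attributes and the sensitive information is measured as a confidence: the largest fraction of records sharing one combination of identifying values that also carry the sensitive value. The names `PrivacyTemplate`, `max_confidence`, `violates`, `suppress`, the `*` placeholder, and the toy records are all hypothetical and chosen for illustration only.

```python
from collections import Counter
from dataclasses import dataclass

SUPPRESSED = "*"  # hypothetical placeholder standing in for a suppressed value


@dataclass(frozen=True)
class PrivacyTemplate:
    identifying: tuple      # names of the identifying attributes
    sensitive_attr: str     # attribute holding the sensitive information
    sensitive_value: str    # the value to be protected
    max_confidence: float   # maximum association allowed (assumed: a confidence)


def max_confidence(records, template):
    """Largest fraction of records in any identifying group that carry the sensitive value."""
    group_sizes = Counter()
    sensitive_counts = Counter()
    for r in records:
        key = tuple(r[a] for a in template.identifying)
        group_sizes[key] += 1
        if r[template.sensitive_attr] == template.sensitive_value:
            sensitive_counts[key] += 1
    return max((sensitive_counts[k] / n for k, n in group_sizes.items()), default=0.0)


def violates(records, template):
    """True if some identifying group exceeds the template's threshold."""
    return max_confidence(records, template) > template.max_confidence


def suppress(records, attr, values):
    """Return copies of the records with the given values of `attr` replaced by the placeholder."""
    return [{**r, attr: SUPPRESSED} if r[attr] in values else dict(r) for r in records]


# Toy data, invented for this sketch.
records = [
    {"job": "engineer", "city": "A", "disease": "flu"},
    {"job": "engineer", "city": "A", "disease": "hiv"},
    {"job": "engineer", "city": "B", "disease": "hiv"},
    {"job": "lawyer",   "city": "B", "disease": "flu"},
]
template = PrivacyTemplate(("job", "city"), "disease", "hiv", 0.5)

print(violates(records, template))   # True: the group (engineer, B) has confidence 1.0
masked = suppress(records, "job", {"engineer", "lawyer"})
print(violates(masked, template))    # False: both remaining groups have confidence 0.5
```

In this toy case, suppressing the `job` values merges groups until no group exceeds the threshold. Choosing which values to suppress while keeping the data useful for classification is the hard optimization problem the abstract refers to; this sketch does not attempt it.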