Anonymization on relational data to protect privacy against re-identification attacks has been studied extensively in recent years. The problem has been shown to be NP-hard and several heuristic and approximation algorithms have been proposed. However, many published data exist in set-valued format (e.g., transactional, search query and recommendation data), but their anonymization techniques have not been well studied. Previous work  proposed borrowing suppression-based approximation algorithms on relational data and the concept of flipping to achieve O(k log k) k-anonymity approximation on set-valued data in two phases. In this work, we propose a new approach to anonymize set-valued data in one single phase. The proposed approach is based on direct estimation of minimal number of addition and deletion operations and without using the suppression technique. We show that the proposed approach can achieve O(log k)- approximation solution to k-anonymity on set-valued data. Experimental results also demonstrate that the proposed algorithms require less number of addition/deletion operations as well as information loss on both real world and synthetic data sets, compared with previous approach.
|Number of pages||15|
|Journal||International Journal of Innovative Computing, Information and Control|
|Publication status||Published - 2011 Dec 1|
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Information Systems
- Computational Theory and Mathematics