
Local Generalization and Bucketization Technique for Personalized Privacy Preservation (2008.11016v1)

Published 25 Aug 2020 in cs.CR

Abstract: Anonymization techniques have been extensively studied and widely applied for privacy-preserving data publishing. In most previous approaches, a microdata table consists of three categories of attributes: explicit identifiers, quasi-identifiers (QIs), and sensitive attributes. In practice, however, different individuals may have different views on the sensitivity of different attributes. There is therefore another type of attribute, the semi-sensitive attribute, which contains both QI values and sensitive values. Based on this observation, we propose a new anonymization technique, called local generalization and bucketization, to prevent identity disclosure and protect the sensitive values on each semi-sensitive attribute and sensitive attribute. The rationale is to use local generalization and local bucketization to divide the tuples into local equivalence groups and to partition the sensitive values into local buckets, respectively. The protections provided by local generalization and local bucketization are independent, so each can be implemented by an appropriate algorithm without weakening the other. Moreover, the protection that local bucketization provides for each semi-sensitive attribute and sensitive attribute is also independent, so local bucketization can comply with different principles on different attributes according to the actual anonymization requirements. Extensive experiments illustrate the effectiveness of the proposed approach.

Citations (9)

Summary

  • The paper proposes a hybrid anonymization technique combining local generalization and local bucketization to secure identities and sensitive values while preserving data utility.
  • It extends personalized privacy by allowing individuals to designate semi-sensitive attributes, thereby enhancing the granularity of privacy protection.
  • Experimental validation confirms the method achieves k-anonymity and l-diversity, demonstrating an effective balance between privacy risk reduction and data usability.

Local Generalization and Bucketization Technique for Personalized Privacy Preservation

The paper "Local Generalization and Bucketization Technique for Personalized Privacy Preservation" by Boyu Li, Kun He, and Geng Sun introduces an advanced approach to address personalized privacy requirements in data anonymization. Traditional privacy-preserving techniques categorize attributes into explicit identifiers, quasi-identifiers (QIs), and sensitive attributes without considering individual variations in sensitivity perception. The authors propose a new class of attributes termed semi-sensitive attributes, containing both QI and sensitive values, acknowledging the varied sensitivity levels individuals may assign to their data.

Summary of Contributions

The paper presents a hybrid anonymization strategy, Local Generalization and Bucketization (LGB), designed to safeguard identity and sensitive information by leveraging local equivalence groups and local bucket structures. The key contributions of this research include:

  1. Innovative Anonymization Technique: LGB combines local generalization and local bucketization to independently secure identities and sensitive values. This dual-layered approach allows for flexible implementation in various anonymization scenarios while maintaining high data utility.
  2. Extension of Personalized Anonymity: The paper extends the paradigm of personalized anonymity by permitting individuals to designate which of their values are sensitive, thereby enriching the granularity of privacy protection beyond conventional models.
  3. Formalization and Analysis: The paper demonstrates the effectiveness of LGB in adhering to the k-anonymity and l-diversity principles. The authors detail the theoretical underpinnings ensuring that the probabilities of identity disclosure and sensitive value exposure are bounded by acceptable thresholds (i.e., 1/k and 1/l, respectively).
  4. Efficient Implementation Algorithm: An algorithm is developed to partition data into local equivalence groups and local buckets, achieving k-anonymity and l-diversity compliance (a simplified sketch follows this list). The algorithm includes options for multi-dimensional partitioning and normalized certainty penalty (NCP) minimization, thus catering to varied data utility requirements.
  5. Experimental Validation: Through extensive experiments, the authors evaluate the technique's performance in terms of the discernibility metric, NCP, and query-answering accuracy. The results demonstrate the balance between privacy and utility, showcasing flexibility in addressing different application contexts.
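
As a rough illustration of how the two local mechanisms compose, the sketch below greedily forms equivalence groups of size at least k on the quasi-identifier attributes and then, within each group, collects a (semi-)sensitive attribute into a local bucket and checks l-diversity. The greedy sort-based split, the toy data, and the parameter values are assumptions made for brevity; the paper's actual algorithm performs multi-dimensional partitioning with NCP minimization, which is not reproduced here.

```python
K, L = 3, 2  # assumed privacy parameters for this toy example

def local_generalize(tuples, qi_attrs, k=K):
    """Greedily form equivalence groups of size >= k by sorting on the QI
    attributes (a naive stand-in for the paper's multi-dimensional split)."""
    ordered = sorted(tuples, key=lambda t: [t[a] for a in qi_attrs])
    groups, current = [], []
    for t in ordered:
        current.append(t)
        if len(current) >= k:
            groups.append(current)
            current = []
    if current:
        if groups:
            groups[-1].extend(current)   # fold leftovers into the last group
        else:
            groups.append(current)
    return groups

def local_bucketize(group, attr, l=L):
    """Within one equivalence group, collect the values of a (semi-)sensitive
    attribute into a local bucket and report whether it is l-diverse."""
    bucket = [t[attr] for t in group]
    return bucket, len(set(bucket)) >= l

# Hypothetical toy data; attribute names and values are assumptions.
tuples = [
    {"Age": 34, "ZIP": "47906", "Disease": "flu"},
    {"Age": 29, "ZIP": "47905", "Disease": "HIV"},
    {"Age": 41, "ZIP": "47302", "Disease": "cold"},
    {"Age": 35, "ZIP": "47906", "Disease": "flu"},
    {"Age": 30, "ZIP": "47905", "Disease": "cancer"},
    {"Age": 42, "ZIP": "47302", "Disease": "cold"},
]

for group in local_generalize(tuples, ["Age", "ZIP"]):
    bucket, diverse = local_bucketize(group, "Disease")
    print(len(group), bucket, "l-diverse" if diverse else "needs re-partitioning")
```

Because the bucket for each attribute is formed independently inside each group, a different diversity principle could in principle be enforced per attribute, which is the independence property the contributions above emphasize.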

Practical and Theoretical Implications

From a practical standpoint, LGB is tailored to address complex privacy challenges inherent in modern data publishing scenarios. As data becomes increasingly granular and personalized, the ability to cater to individual privacy preferences becomes invaluable. This method provides a customizable privacy-preserving framework applicable to real-world datasets, as evidenced by the experiments conducted using US Census data.

Theoretically, the introduction of semi-sensitive attributes and the notion of localization in generalization and bucketization enrich the privacy literature by offering mechanisms that are both robust in safeguarding privacy and adaptable to user-specific sensitivity levels.
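
Stated in the usual notation (a standard rendering of the thresholds quoted above, not a transcription of the paper's definitions), the guarantees amount to bounding an adversary's inference probability within each local structure:

```latex
% G: a local equivalence group produced by local generalization, with |G| >= k
% B: a local bucket on a (semi-)sensitive attribute, containing at least l distinct values
\Pr[\text{re-identifying a tuple within } G] \;\le\; \frac{1}{k},
\qquad
\Pr[\text{linking a tuple to its sensitive value in } B] \;\le\; \frac{1}{l}.
```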

Future Directions

The paper hints at potential advancements in the area of incremental data publishing and the broader application of LGB in collaborative and continuous data release environments. Leveraging machine learning to dynamically adjust privacy levels while maintaining data utility is a promising avenue for future research, potentially enriching the adaptive capabilities of privacy-preserving techniques.

In conclusion, this paper contributes significantly to personalized privacy preservation by introducing a versatile technique that addresses individual sensitivity preferences while maintaining the delicate balance between data utility and privacy.