Privacy Preserving Record Linkage via grams Projections (1208.2773v1)

Published 14 Aug 2012 in cs.DB

Abstract: Record linkage has been extensively used in data mining applications that involve data sharing. While the amount of available data is growing, the concern of disclosing sensitive information poses the problem of utility versus privacy. In this paper, we study the problem of private record linkage via secure data transformations. In contrast to the existing techniques in this area, we propose a novel approach that provides strong privacy guarantees under the formal framework of differential privacy. We develop an embedding strategy based on frequent variable-length grams mined in a private way from the original data. We also introduce a personalized threshold for matching individual records in the embedded space, which achieves better linkage accuracy than the existing global threshold approach. Compared with the state-of-the-art secure matching schema, our approach provides formal, provable privacy guarantees and achieves better scalability while providing comparable utility.
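
The abstract names three ingredients: grams mined privately, an embedding built from the frequent grams, and a per-record matching threshold. The Python sketch below is one way these pieces might fit together; it is not the authors' algorithm. Every name in it (`grams`, `noisy_frequent_grams`, `embed`, `link`) is hypothetical. It makes several simplifying assumptions: fixed-length character bigrams stand in for variable-length grams, Laplace noise is added to capped gram counts, records are embedded as binary presence vectors, and Hamming distance is compared against a threshold supplied per record. A rigorous differentially private mechanism would also have to account for the gram domain and the privacy budget, which this sketch does not attempt.

```python
import random
from collections import Counter


def grams(text, n=2):
    """Distinct character n-grams of a lowercased string padded with '_'."""
    padded = f"_{text.lower()}_"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_frequent_grams(records, k, epsilon, max_grams, rng):
    """Return the k grams with the largest noisy counts (illustrative only)."""
    counts = Counter()
    for record in records:
        # Cap each record's contribution so that a single record can
        # change at most max_grams of the counts.
        for g in sorted(grams(record))[:max_grams]:
            counts[g] += 1
    scale = max_grams / epsilon  # assumed noise scale for the capped counts
    noisy = {g: c + laplace_noise(scale, rng) for g, c in counts.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:k]


def embed(record, base):
    """Binary vector: 1 where the record contains the corresponding base gram."""
    present = grams(record)
    return [1 if g in present else 0 for g in base]


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def link(vecs_a, vecs_b, thresholds):
    """Pair (i, j) when the distance is within record i's own threshold."""
    return [
        (i, j)
        for i, va in enumerate(vecs_a)
        for j, vb in enumerate(vecs_b)
        if hamming(va, vb) <= thresholds[i]
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    party_a = ["smith", "johnson"]
    party_b = ["smyth", "jonson", "garcia"]
    base = noisy_frequent_grams(party_a + party_b, k=8, epsilon=1.0,
                                max_grams=6, rng=rng)
    vecs_a = [embed(r, base) for r in party_a]
    vecs_b = [embed(r, base) for r in party_b]
    # Hypothetical per-record rule: tolerate more mismatches for records
    # that set more bits in the embedded space.
    thresholds = [max(1, sum(v) // 2) for v in vecs_a]
    print(link(vecs_a, vecs_b, thresholds))
```

A single global threshold corresponds to passing the same value for every entry of `thresholds`; the per-record list is what lets sparse and dense embedded records be treated differently.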

Citations (12)
