AlignDP: Hybrid Differential Privacy

Updated 26 December 2025
  • AlignDP is a hybrid differential privacy mechanism that partitions user data into rare events shielded by PAC indistinguishability and non-rare events privatized using RAPPOR.
  • It applies effective zero-ε LDP to rare events and standard ε-LDP to frequent events, for which frequency estimates remain unbiased.
  • The framework balances privacy and utility through theoretical bounds, empirical metrics, and global aggregation, and is aimed at privacy-constrained LLM deployments.

AlignDP is a hybrid differential privacy (DP) mechanism developed to mitigate the risks posed by extraction, distillation, and unauthorized fine-tuning of LLMs. Distinct from post-hoc watermarking or monitoring strategies, AlignDP operates at the data interface by partitioning user data into rare and non-rare components, shielding rare events via PAC indistinguishability (effectively yielding zero-ε local DP) and privatizing non-rare events using RAPPOR. This two-tier framework enforces strong privacy guarantees while retaining statistical utility for frequent categories, with composition and budget constraints enforced by a global aggregator. The theoretical underpinnings establish limits on PAC extensions, tight bounds for RAPPOR estimation error, and utility trade-offs for each privacy regime (Gaikwad, 19 Dec 2025).

1. Two-Tier Architecture of AlignDP

Let each user record be $X = (X_1,\dots,X_d)$, with marginal distributions $\mu_i$ over their respective domains $\mathcal{D}_i$. Fixing a threshold $\alpha>0$, each field $i$ is partitioned as

$$R_i = \{x\in\mathcal{D}_i:\mu_i(x)<\alpha\},\quad N_i = \mathcal{D}_i\setminus R_i.$$

  • Rare events ($x\in R_i$) are processed by a PAC indistinguishability shield. The mechanism $M$ outputs the symbol $x$, but only aggregate counts are released, and an adversary's distinguishing advantage is bounded by a PAC-style indistinguishability parameter $\delta(n, \alpha)$.
  • Non-rare events ($x\in N_i$) are encoded via $k$-ary randomized response (RAPPOR). Each $x$ is mapped to a one-hot vector $v\in\{0,1\}^k$ whose bits are flipped independently with probability $p$, yielding a privatized vector $y$ that is sent to the aggregator.

This architecture ensures that rare events are hidden with “effective zero-$\epsilon$” LDP, while non-rare events support unbiased frequency estimation under standard LDP.
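The partition step can be sketched in a few lines. This is only an illustration under stated assumptions: the marginal is supplied as a dictionary of known probabilities, and the names `partition_domain`, `route`, and the example values are hypothetical, not taken from the paper.

```python
from typing import Dict, Hashable, Set, Tuple


def partition_domain(
    marginal: Dict[Hashable, float], alpha: float
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Split one field's domain into rare (mu < alpha) and non-rare values."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rare = {x for x, mu in marginal.items() if mu < alpha}
    non_rare = set(marginal) - rare
    return rare, non_rare


def route(value: Hashable, rare: Set[Hashable]) -> str:
    """Name the tier that would handle a value (labels are illustrative)."""
    return "pac_shield" if value in rare else "rappor"


# Hypothetical marginal for a single field; the numbers are made up.
marginal = {"a": 0.60, "b": 0.395, "c": 0.004, "d": 0.001}
rare, non_rare = partition_domain(marginal, alpha=0.01)
print(sorted(rare), sorted(non_rare))
```

In practice the marginals $\mu_i$ would have to be estimated or assumed, which this sketch does not address.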

2. Formal Privacy Guarantees

PAC-Indistinguishability (Rare Events)

Define mechanism $M_{\text{rare}}$ for rare categories. $M_{\text{rare}}$ is said to satisfy PAC-indistinguishability with parameter $\delta(n, \alpha)$ if, for any $x, x' \in R_i$ and any (possibly randomized) distinguisher $\mathcal{A}$ observing $n$ outputs,

$$|\Pr[\mathcal{A}\ \text{outputs “}x\text{”} \mid x] - \Pr[\mathcal{A}\ \text{outputs “}x'\text{”} \mid x']| \leq \delta(n, \alpha).$$

A Hoeffding-type bound yields

$$\delta(n,\alpha) = \exp(-2n(\alpha-\mu_i(x))^2),\quad x\in R_i.$$

As $\delta\to 0$, this approaches $(0,\delta)$-DP, i.e., “zero-$\epsilon$” LDP for rare events.
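To get a feel for the scale of this bound, the following sketch evaluates $\delta(n,\alpha)$ and inverts it for $n$. The function names and the example values ($\alpha = 0.01$, $\mu_i(x) = 0.002$, target $0.05$) are illustrative assumptions, not figures from the paper.

```python
import math


def pac_delta(n: int, alpha: float, mu: float) -> float:
    """Evaluate delta(n, alpha) = exp(-2 n (alpha - mu)^2) for a rare value."""
    if not (0.0 <= mu < alpha):
        raise ValueError("expects a rare value: 0 <= mu < alpha")
    return math.exp(-2.0 * n * (alpha - mu) ** 2)


def samples_for_delta(target: float, alpha: float, mu: float) -> int:
    """Smallest n for which the bound is at most `target`."""
    if not (0.0 < target < 1.0):
        raise ValueError("target must lie in (0, 1)")
    gap = alpha - mu
    return math.ceil(math.log(1.0 / target) / (2.0 * gap ** 2))


delta_1000 = pac_delta(1000, alpha=0.01, mu=0.002)
n_needed = samples_for_delta(0.05, alpha=0.01, mu=0.002)
print(delta_1000, n_needed)
```

Because the exponent scales with $n(\alpha-\mu_i(x))^2$, the bound only becomes small once that product is large, so values lying close to the threshold need many more samples.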

Local Differential Privacy for Non-Rare Events (RAPPOR)

For non-rare $x\in N_i$, the $k$-ary randomized response mechanism $M_{\text{rr}}: N_i \to \{0,1\}^k$ is $\epsilon$-LDP if

$$\Pr[M_{\text{rr}}(x) = y] \leq e^{\epsilon}\,\Pr[M_{\text{rr}}(x') = y] \quad \forall x, x'\in N_i,\ \forall y\in\{0,1\}^k.$$

RAPPOR with bit-flip probability $p$ achieves

$$\epsilon = \ln \frac{1-p}{p}.$$

An aggregate over $n$ users yields, for each category $j$,

$$q = 1-p+\frac{p}{k},\quad \hat{\mu}_i(j) = \frac{y_j - \frac{1}{k}(1-q)}{q-\frac{1}{k}(1-q)}.$$

The resulting estimates are unbiased, with variance $\mathrm{Var}[\hat{\mu}_i(j)] \leq \frac{p(1-p)}{n}$.
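A small simulation can illustrate one-hot encoding, independent bit flips, and aggregation. Note the assumption: the debiasing step below is the textbook inversion for independent symmetric bit flips, $(\bar{y}_j - p)/(1-2p)$, which is written differently from the $q$-based estimator above and is used here only as a stand-in. All function names and the example frequencies are hypothetical.

```python
import math
import random
from typing import List, Sequence


def rappor_encode(index: int, k: int, p: float, rng: random.Random) -> List[int]:
    """One-hot encode `index` into k bits, flipping each bit with probability p."""
    bits = []
    for j in range(k):
        bit = 1 if j == index else 0
        if rng.random() < p:
            bit = 1 - bit
        bits.append(bit)
    return bits


def debias(reports: Sequence[Sequence[int]], p: float) -> List[float]:
    """Invert E[mean bit j] = p + (1 - 2p) * mu_j (assumed textbook estimator)."""
    n = len(reports)
    k = len(reports[0])
    means = [sum(r[j] for r in reports) / n for j in range(k)]
    return [(m - p) / (1.0 - 2.0 * p) for m in means]


def epsilon_from_p(p: float) -> float:
    """Privacy parameter as given by the formula in the text."""
    return math.log((1.0 - p) / p)


rng = random.Random(0)
true_freq = [0.5, 0.3, 0.2]  # made-up frequencies for three categories
n, p = 20000, 0.25
values = rng.choices(range(3), weights=true_freq, k=n)
reports = [rappor_encode(v, 3, p, rng) for v in values]
est = debias(reports, p)
print([round(e, 3) for e in est], round(epsilon_from_p(p), 3))
```

The estimates can come out slightly negative or fail to sum to one; any clipping or renormalisation would be an extra post-processing choice.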

3. Fundamental Theoretical Results

Theorem 1: PAC Shielding of Rare Events

For $x \in R_i$ with $\mu_i(x)<\alpha$, $n$ i.i.d. samples yield:

$$\delta(n, \alpha) = \exp\big(-2n(\alpha-\mu_i(x))^2\big)$$

No adversary can distinguish $x$ from another rare value with advantage exceeding $\delta(n, \alpha)$. This bound follows from Hoeffding's inequality applied to empirical frequencies and thresholding at $\alpha$.

Theorem 2: $\epsilon$-LDP for RAPPOR

For non-rare categories, symmetric bit-flip RAPPOR with probability $p$ satisfies

$$\epsilon = \ln \frac{1-p}{p}.$$

Frequency estimators $\hat{\mu}_i(j)$ are unbiased, with variance at most $p(1-p)/n$.

Theorem 3: Global Composition

Aggregating up to $k$ RAPPOR reports (here $k$ counts reports, not the domain size), each with privacy loss $\epsilon$, yields

$$\epsilon_{\mathrm{tot}} \leq k\epsilon$$

(basic composition). For any $\delta > 0$,

$$\epsilon_{\mathrm{tot}} \leq \sqrt{2k\ln(1/\delta)}\,\epsilon + k\epsilon(e^\epsilon - 1)$$

(Pinsker-type advanced composition).

PAC shielding does not compose beyond the rare domain. If $\mu_i(x)\geq\alpha$, the adversary’s distinguishing probability increases with $n$, requiring DP to control leakage.
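The two composition bounds are easy to compare numerically. The sketch below simply evaluates both formulas; the function names and the example settings ($k = 100$, $\delta = 10^{-6}$) are illustrative assumptions.

```python
import math


def basic_composition(k: int, eps: float) -> float:
    """Basic composition: total privacy loss is at most k * eps."""
    return k * eps


def advanced_composition(k: int, eps: float, delta: float) -> float:
    """Advanced bound: sqrt(2 k ln(1/delta)) * eps + k * eps * (e^eps - 1)."""
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(2.0 * k * math.log(1.0 / delta)) * eps + k * eps * (
        math.exp(eps) - 1.0
    )


for eps in (0.1, math.log(3.0)):
    basic = basic_composition(100, eps)
    advanced = advanced_composition(100, eps, delta=1e-6)
    print(f"eps={eps:.3f} basic={basic:.2f} advanced={advanced:.2f}")
```

With these settings the advanced bound is smaller for $\epsilon = 0.1$ but larger for $\epsilon = \ln 3$, since the $k\epsilon(e^\epsilon - 1)$ term dominates once $\epsilon$ is not small.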

4. Analysis of Utility–Privacy Trade-offs

  • Non-Rare (RAPPOR): Mean-squared error per category:

$$\mathrm{MSE} \leq \frac{p(1-p)}{n}$$

With privacy budget $\epsilon$, set $p=(1+e^{\epsilon})^{-1}$; then $p(1-p) = e^{-\epsilon}/(1+e^{-\epsilon})^2$, yielding

$$\mathrm{MSE} \leq \frac{e^{-\epsilon}}{(1+e^{-\epsilon})^2}\frac{1}{n}$$

The bound decays roughly as $e^{-\epsilon}$ for large $\epsilon$ and as $1/n$ in the number of users.

  • Rare (PAC Shielding): Utility loss is the suppression of frequency estimation in $R_i$. Since $\sum_{x\in R_i}\mu_i(x)\le |R_i|\alpha$, the suppressed probability mass is at most $|R_i|\alpha$. For small $\alpha$ (e.g., $1\%$), overall impact is minimal.
  • Hybrid Choice: Reducing $\alpha$ lowers the suppressed mass but increases the proportion of categories privatized by RAPPOR, increasing estimation error. Typically, $\alpha$ is chosen small enough for $|R_i|$ to remain modest, balancing the risk of leaking low-frequency identifiers and the noise introduced to moderately frequent events.
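The trade-off formulas above can be checked with a short calculation. This sketch only evaluates the stated expressions; the helper names are hypothetical, and the example inputs ($\epsilon = \ln 3$, $n = 1000$, four rare categories at $\alpha = 0.01$) are chosen to line up with the simulation settings reported in Section 5.

```python
import math


def flip_probability(eps: float) -> float:
    """Flip probability p = 1 / (1 + e^eps) for a given privacy budget."""
    return 1.0 / (1.0 + math.exp(eps))


def mse_bound(eps: float, n: int) -> float:
    """Per-category MSE bound p(1 - p) / n."""
    p = flip_probability(eps)
    return p * (1.0 - p) / n


def suppressed_mass_bound(num_rare: int, alpha: float) -> float:
    """Upper bound |R_i| * alpha on the probability mass hidden by the shield."""
    return num_rare * alpha


eps = math.log(3.0)  # corresponds to p = 0.25
print(flip_probability(eps))
print(mse_bound(eps, n=1000))
print(suppressed_mass_bound(4, alpha=0.01))
```

Evaluating `mse_bound` over a range of `eps` values shows the bound shrinking as the budget grows, while `suppressed_mass_bound` depends only on how many categories fall below the threshold.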

5. Empirical Performance and Metrics

Simulations with $n=1000$ users, $d=10$ fields (each of size $k=20$), and threshold $\alpha=0.01$ yield:

| Metric | Rare ($R_i$) | Non-rare ($N_i$) |
|---|---|---|
| Categories per field | $\approx 4$ | $\approx 16$ |
| MAE (est. freq.) | $\approx 0.001$ | matches MSE bound |
| Top-5 accuracy ($n=10^4$) | n/a | $\approx 80\%$ |
| KL divergence ($n=10^4$) | n/a | $\approx 0.0013$ |
| Spearman's $\rho$ ($n=10^4$) | n/a | $\approx 0.798$ |

PAC shielding keeps rare-event estimates at the noise floor (MAE $\approx 0.001$), invariant to query repetition. Non-rare RAPPOR outputs (with $p=0.25$, $\epsilon\simeq 1.1$) are consistent with the theoretical MSE bounds, decaying as $1/n$. Under repeated querying (up to 100 repetitions), rare-category estimation remains at the noise floor and non-rare recovery saturates at a correlation coefficient $\rho\approx 0.99$. In these simulations, repetition did not allow the adversary to breach the shield or exceed the RAPPOR noise ceiling.
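For readers who want to compute comparable metrics on their own estimates, here is a minimal sketch of MAE, KL divergence, and top-k accuracy. The exact definitions used in the reported simulations are not given here, so these are assumptions: in particular, clipping and renormalising the estimate before the KL computation, and measuring top-k accuracy as set overlap, are choices made for this illustration.

```python
import math
from typing import List, Sequence


def mean_absolute_error(true: Sequence[float], est: Sequence[float]) -> float:
    """Average absolute gap between true and estimated frequencies."""
    return sum(abs(t - e) for t, e in zip(true, est)) / len(true)


def kl_divergence(
    true: Sequence[float], est: Sequence[float], floor: float = 1e-12
) -> float:
    """KL(true || est) after clipping est to `floor` and renormalising (assumed)."""
    clipped = [max(e, floor) for e in est]
    total = sum(clipped)
    q = [c / total for c in clipped]
    return sum(t * math.log(t / qi) for t, qi in zip(true, q) if t > 0)


def top_k_accuracy(true: Sequence[float], est: Sequence[float], k: int = 5) -> float:
    """Fraction of the true top-k categories that also appear in the estimated top-k."""
    def top(values: Sequence[float]) -> List[int]:
        return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]

    return len(set(top(true)) & set(top(est))) / k


true_freq = [0.4, 0.3, 0.2, 0.1]  # made-up example distribution
reversed_est = [0.1, 0.2, 0.3, 0.4]
print(mean_absolute_error(true_freq, reversed_est))
print(kl_divergence(true_freq, reversed_est))
print(top_k_accuracy(true_freq, reversed_est, k=2))
```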

6. Context and Significance in LLM Privacy

AlignDP introduces a principled interface-level defense for LLMs, contrasting with reactive watermarking or monitoring approaches. By enforcing PAC indistinguishability for rare values and LDP for frequent values, it mitigates low-frequency signal leakage (often the locus of identification risk) while supporting meaningful aggregate analytics. The systematic integration of two privacy regimes, composition-aware aggregation, and explicit utility analysis positions AlignDP as a candidate for data sharing and queryable LLM deployments under privacy constraints (Gaikwad, 19 Dec 2025).
