
Information Disclosure Rate (IDR)

Updated 11 September 2025
  • Information Disclosure Rate (IDR) is a measure of the amount of information revealed per unit (e.g., record, transaction) while balancing utility constraints and privacy requirements.
  • IDR is grounded in classical information theory using rate-distortion and conditional entropy to evaluate the trade-offs in data sanitization, synthetic data production, and secure computation.
  • In strategic and competitive contexts, IDR informs equilibrium disclosure strategies and mechanism designs, optimizing trade-offs between reaching market goals and controlling privacy risks.

Information Disclosure Rate (IDR) is a technical concept quantifying the rate at which information—whether public, sensitive, or strategic—is revealed by agents, platforms, or mechanisms in a variety of data-intensive, competitive, or privacy-critical environments. IDR measures the amount of information effectively disclosed per basic unit (e.g., per data record, transaction, communication, or agent signal) as a function of utility constraints, privacy requirements, strategic objectives, or equilibrium responses. The concept plays a central role in frameworks ranging from statistical disclosure control and secure computation to mechanism design and competitive information design, defining the trade-off between the utility of released information and the risk or inefficiency induced by its partial or strategic concealment.

1. Information-Theoretic Foundations of IDR

The quantification of IDR is rooted in classical information theory, specifically rate–distortion theory and conditional-entropy formulations. In the context of database sanitization, each database entry is modeled as a multidimensional vector with designated public (X_r) and private (X_h) attributes. The utility of a disclosure is controlled by a distortion constraint,

\Delta_\ell \equiv \mathbb{E} \left[ \frac{1}{n} \sum_{i=1}^n g\big( f_\ell(X_{r,i}), f_\ell(\widehat{X}_{r,i}) \big) \right] \leq D_\ell + \epsilon

where high utility corresponds to low distortion between the original and released data, for each utility function f_ℓ under the distortion metric g(·,·).

Privacy is governed by the equivocation (conditional entropy) of the private attributes conditioned on both the released data W and any user side information Z^n:

\Delta_p \equiv \frac{1}{n} H\big( X_h^n \mid W, Z^n \big) \geq E - \epsilon

Minimizing the rate R(D,E) under these constraints defines the rate–distortion–equivocation region:

\mathcal{R} = \left\{ (R, D, E) : D \geq 0,\ 0 \leq E \leq \Gamma(D),\ R \geq R(D, E) \right\}

This formalism tightly links utility (distortion threshold D), privacy (equivocation bound E), and IDR (minimal R required to meet both), demonstrating that data utility and privacy are coupled through minimal permissible information leakage (Sankar et al., 2010).
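To make the equivocation notion concrete, the following sketch computes the equivocation H(X_h | W) and the implied leakage I(X_h; W) exactly for a small joint distribution of a private bit X_h and a released symbol W; the distribution itself is invented for illustration.

```python
import math

# Toy joint distribution p(x_h, w) over a private bit X_h and a released
# symbol W; all probabilities here are invented for illustration.
joint = {
    (0, "a"): 0.30, (0, "b"): 0.20,
    (1, "a"): 0.10, (1, "b"): 0.40,
}

def entropy(probs):
    """Shannon entropy (bits) of a probability vector, skipping zeros."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """Equivocation H(X_h | W) = sum_w p(w) * H(X_h | W = w)."""
    h = 0.0
    for w in {w for (_, w) in joint}:
        p_w = sum(p for (_, ww), p in joint.items() if ww == w)
        cond = [p / p_w for (_, ww), p in joint.items() if ww == w]
        h += p_w * entropy(cond)
    return h

h_prior = entropy([0.5, 0.5])        # H(X_h): the marginal of X_h is uniform
h_post = conditional_entropy(joint)  # equivocation remaining after release
leakage = h_prior - h_post           # mutual information I(X_h; W)
print(f"H(X_h)={h_prior:.3f}  H(X_h|W)={h_post:.3f}  leakage={leakage:.3f} bits")
```

Higher equivocation (closer to the prior H(X_h)) corresponds to a lower effective disclosure rate about the private attribute.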

2. IDR in Statistical Disclosure Control and Microdata Privacy

IDR is a critical construct for assessing security and privacy risk in released microdata. For identity disclosure, risk is driven by the uniqueness of quasi-identifiers (QIDs) and an adversary's ability to match records. Recent frameworks calculate both identity and attribute disclosure risk, favoring a composite measure that sums over all 2^m possible partitions of the m observed attributes into known and unknown sets:

D(r) = \sum_{i=1}^{2^m} L_{KS_i}(r) \cdot a \cdot C_{UKS_i}(r)

with L (likelihood of identification) and C (consequence, i.e., sensitivity of the attribute values), refined through probabilistic modeling of adversary knowledge rather than fixed QID sets (Orooji et al., 2019). This allows direct calculation of IDR as the aggregate risk of disclosed information per record; decision-makers can then control IDR through targeted anonymization or perturbation.
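A minimal sketch of the composite risk sum above, with invented attribute names and sensitivity weights, and a match-count-based identification likelihood standing in for the paper's probabilistic adversary model:

```python
from itertools import combinations

# Illustrative attributes and assumed sensitivity weights (not from the paper).
ATTRS = ["age", "zip", "diagnosis"]
SENSITIVITY = {"age": 0.1, "zip": 0.2, "diagnosis": 0.9}

def likelihood_of_identification(known_set, record, population):
    """L_KS(r): 1 / (# records matching r on the known attributes)."""
    matches = sum(
        all(p[k] == record[k] for k in known_set) for p in population
    )
    return 1.0 / matches if matches else 0.0

def consequence(unknown_set):
    """C_UKS(r): total sensitivity of the attributes that would be revealed."""
    return sum(SENSITIVITY[u] for u in unknown_set)

def disclosure_risk(record, population, a=1.0):
    """D(r): sum L * a * C over every known/unknown partition of ATTRS."""
    risk = 0.0
    for k in range(len(ATTRS) + 1):
        for known in combinations(ATTRS, k):
            unknown = [x for x in ATTRS if x not in known]
            risk += (likelihood_of_identification(known, record, population)
                     * a * consequence(unknown))
    return risk

population = [
    {"age": 34, "zip": "40502", "diagnosis": "flu"},
    {"age": 34, "zip": "40502", "diagnosis": "asthma"},
    {"age": 51, "zip": "40503", "diagnosis": "flu"},  # unique on age alone
]
print(disclosure_risk(population[0], population))
print(disclosure_risk(population[2], population))  # riskier: more unique
```

Records that are more identifiable (unique on fewer attributes) receive a higher aggregate risk, which is the per-record quantity an anonymizer would target.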

For synthetic data, identity and attribute disclosure risk are quantified via Replicated Uniques (RepU) and Disclosive in Synthetic Correct Original (DiSCO):

RepU = 100\,\frac{\sum (s_{.q} \mid d_{.q}=1 \land s_{.q}=1)}{N_d}, \quad DiSCO = 100\,\frac{\sum_{q} \sum_{t} (d_{tq} \mid ps_{tq} = 1)}{N_d}

where RepU denotes the proportion of records that are unique in the original data and remain unique in the synthetic data, and DiSCO measures the fraction of records for which the synthetic data correctly reveals the target attribute given a matched key (Raab et al., 24 Jun 2024).
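The RepU component can be sketched directly: count the original records whose quasi-identifier key is unique in both the original and the synthetic data. Column names and records below are invented for illustration.

```python
from collections import Counter

def rep_u(original, synthetic, key):
    """RepU: % of original records unique on `key` whose key value is
    also unique in the synthetic data."""
    orig_counts = Counter(tuple(r[k] for k in key) for r in original)
    synth_counts = Counter(tuple(r[k] for k in key) for r in synthetic)
    replicated = sum(
        1 for r in original
        if orig_counts[tuple(r[k] for k in key)] == 1
        and synth_counts[tuple(r[k] for k in key)] == 1
    )
    return 100.0 * replicated / len(original)

original = [
    {"age": 34, "zip": "40502"},
    {"age": 34, "zip": "40502"},
    {"age": 51, "zip": "40503"},   # unique in the original
]
synthetic = [
    {"age": 34, "zip": "40502"},
    {"age": 51, "zip": "40503"},   # the unique key survives synthesis
    {"age": 60, "zip": "40509"},
]
print(rep_u(original, synthetic, key=["age", "zip"]))  # 1 of 3 records
```

A high RepU signals that synthesis has preserved exactly the records most exposed to identity disclosure.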

3. Strategic and Competitive Disclosure Dynamics

IDR extends beyond privacy to applications where agents disclose information strategically to maximize objectives—market share, platform revenue, or selection likelihood. In competitive markets, such as oligopolistic search or online platforms with seller–buyer interactions, firms balance the value of attracting customers through greater disclosure against the risk of excessive competition.

A robust equilibrium concept arising in these models is the "upper-censorship equilibrium" (UCE), in which signals below a threshold a are fully disclosed and outcomes above a are pooled:

a^M = c_F^{-1}\left( \frac{1}{h(c^M)} \right), \quad c_F(a) = \int_a^1 \big(1-F(t)\big)\,dt

IDR in these contexts is interpreted as the fraction of the distributional support over which full information is disclosed; a higher threshold a yields a higher IDR and greater market informativeness (Hwang et al., 7 Apr 2025).
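A numerical sketch of recovering the threshold a^M for F uniform on [0, 1], where c_F(a) = (1 - a)^2 / 2 in closed form. The target value 1/h(c^M) is treated as a given constant here, since h and c^M depend on market primitives not specified in this summary.

```python
def c_F(a):
    """c_F(a) = ∫_a^1 (1 - F(t)) dt = (1 - a)^2 / 2 for F uniform on [0, 1]."""
    return (1.0 - a) ** 2 / 2.0

def invert_c_F(target, lo=0.0, hi=1.0, tol=1e-10):
    """c_F is strictly decreasing on [0, 1], so invert it by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if c_F(mid) > target:
            lo = mid       # c_F too large: move the threshold right
        else:
            hi = mid
    return (lo + hi) / 2.0

target = 0.08                        # assumed value of 1/h(c^M)
a_M = invert_c_F(target)
print(f"threshold a^M = {a_M:.4f}")  # signals below a^M are fully disclosed
# IDR here is F(a^M) = a^M: the share of the support with full disclosure.
```

For this uniform case (1 - a)^2 / 2 = 0.08 gives a^M = 0.6, so full disclosure covers 60% of the support.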

Moreover, in competitive Bayesian persuasion, agents commit to signaling schemes that selectively reveal quality, and the realized social welfare is compared to the full-information benchmark. IDR is therefore implicitly characterized by price-of-anarchy (PoA) bounds, which quantify the societal efficiency loss attributable to strategic nondisclosure:

\text{PoA} = \frac{SW^{\max}}{\min_{\mathcal{Z} \in NE} SW(\mathcal{Z})}

A bounded PoA demonstrates that the equilibrium disclosure rate is sufficient for efficient selection, even under agent heterogeneity (Banerjee et al., 14 Apr 2025).

4. Mechanism Design with Endogenous Information Structures

Recent advances in mechanism design exploit endogenous information disclosure, moving away from the exogenous information structures of classical theory. By controlling which signals are disclosed to buyers or participants, designers can attain simple mechanisms (such as item pricing) that are competitive even in multi-dimensional settings.

For example, horizontal disclosure, where only an item's index is revealed, can be paired with posted pricing to extract substantial (up to full) surplus when valuation distributions are negatively correlated:

p_i = \mathbb{E}_F\big[v_i \mid i^*(v) = i\big]

When full surplus extraction is not possible, coarse horizontal disclosure and two-price schemes guarantee at least 50.17% of the optimal revenue for arbitrary distributions and correlation structures (Cai et al., 25 Feb 2025). This identifies IDR as a key lever for simplifying mechanisms and achieving near-optimal outcomes.
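The conditional-expectation price p_i = E_F[v_i | i*(v) = i] can be estimated by Monte Carlo; the perfectly negatively correlated two-item valuation distribution below is purely illustrative.

```python
import random

random.seed(0)

def sample_valuations():
    """Two items with negatively correlated valuations: v2 = 1 - v1."""
    v1 = random.random()
    return (v1, 1.0 - v1)

def posted_prices(n_samples=200_000):
    """Estimate p_i = E[v_i | item i is the buyer's favorite] by simulation."""
    totals = [0.0, 0.0]
    counts = [0, 0]
    for _ in range(n_samples):
        v = sample_valuations()
        i_star = 0 if v[0] >= v[1] else 1   # i*(v): the buyer's favorite item
        totals[i_star] += v[i_star]
        counts[i_star] += 1
    return [t / c for t, c in zip(totals, counts)]

p = posted_prices()
print(p)  # analytically, each price is E[v_i | v_i >= 1/2] = 0.75 here
```

With only the index of the favorite item disclosed, the seller posts p_i and captures the buyer's conditional expected value for it.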

5. IDR in Secure and Privacy-Preserving Computation

In secure multi-party computation (SMPC), IDR quantifies the leakage about participants’ private inputs resulting from function outputs, independent of protocol-level security. For average or sum functions (e.g., salary analysis),

I(\mathbf{X}_T; O) = H(\mathbf{X}_T) - H(\mathbf{X}_T \mid O)

where X_T denotes the target inputs and O the output. The corresponding entropy loss determines the information disclosed. With enough participants (e.g., 5 for <5% loss, 24 for <1% loss in salary computation), IDR can be tightly bounded and explicitly controlled via closed-form entropy calculations and by managing cohort overlap in repeated evaluations (Baccarini et al., 2022). These results highlight how IDR can be kept low even in cryptographically secure systems through careful statistical design.
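This output leakage can be computed exactly for small cases. The sketch below evaluates I(X_1; O) for the sum O of n i.i.d. inputs uniform on {0, 1, 2}, showing the leakage about one participant's input shrink as more participants join; the value set is illustrative, while the cited work analyzes salary-like domains in closed form.

```python
import math
from collections import Counter
from itertools import product

VALUES = [0, 1, 2]  # illustrative input domain, uniform and i.i.d.

def leakage(n):
    """Mutual information I(X_1; O) in bits, with O = X_1 + ... + X_n."""
    joint = Counter()
    p = (1.0 / len(VALUES)) ** n
    for xs in product(VALUES, repeat=n):
        joint[(xs[0], sum(xs))] += p   # joint law of (X_1, O)
    h_prior = math.log2(len(VALUES))   # H(X_1)
    h_post = 0.0                       # H(X_1 | O) = sum_o p(o) H(X_1 | O=o)
    for o in {o for (_, o) in joint}:
        p_o = sum(pr for (_, oo), pr in joint.items() if oo == o)
        cond = [pr / p_o for (_, oo), pr in joint.items() if oo == o]
        h_post += p_o * -sum(q * math.log2(q) for q in cond if q > 0)
    return h_prior - h_post

for n in (2, 3, 5, 8):
    print(f"n={n}: I(X_1; O) = {leakage(n):.4f} bits")
```

The monotone decrease mirrors the cited participant thresholds: adding contributors dilutes what the disclosed aggregate reveals about any one input.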

6. Contextual and Machine Learning-Based IDR Management

IDR considerations have practical implications for the design of context-aware and personalized information systems. For example, in email notification platforms, accidental disclosures are prevalent—53% of users report at least 10% of notifications pose a risk in work contexts, and 73% in personal settings. Machine learning classifiers that exploit user, content, and context features can predict discomfort and guide adaptive notification policies, effectively managing IDR in real time (Kim et al., 2018). Key insights include the importance of personalization, field-level control of disclosures, and context-aware suppression strategies.

In financial markets, quality assessment of information disclosure (using multidimensional annotations: question identification, relevance, answer readability, answer relevance) enables benchmarking NLP models and can be formalized as a composite score function—schematically,

\text{Quality Score} = \alpha \cdot \text{Q\_ID} + \beta \cdot \text{Q\_Rel} + \gamma \cdot \text{A\_Read} + \delta \cdot \text{A\_Rel}

Increasing IDR corresponds to higher scores in these quality dimensions, supporting transparency and regulatory goals (Xu et al., 17 Jun 2024).
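Schematically, the composite score is a weighted sum over the four annotated dimensions; the weights and dimension scores below are illustrative placeholders, not values from the cited benchmark.

```python
# Assumed weights (alpha, beta, gamma, delta); they sum to 1 so the
# composite stays in [0, 1] when each dimension score is in [0, 1].
WEIGHTS = {"q_id": 0.2, "q_rel": 0.3, "a_read": 0.2, "a_rel": 0.3}

def quality_score(scores):
    """Weighted composite of the four disclosure-quality dimensions."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

example = {"q_id": 1.0, "q_rel": 0.8, "a_read": 0.9, "a_rel": 0.7}
print(round(quality_score(example), 3))  # → 0.83
```

Each dimension would in practice be produced by an annotator or an NLP model, with the weights tuned to regulatory priorities.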

7. Limitations, Open Problems, and Directions for Future Research

Despite substantial progress in quantifying and managing IDR, several limitations and open problems remain. Real-world distributions may violate symmetry or independence, requiring new models. Communication constraints (e.g., limited message alphabets), estimation biases, and adversarial knowledge are complex and not fully captured in extant frameworks. Furthermore, optimal partitioning schemes, dynamic information design, and network effects (in social media or crowdsourcing) await systematic treatment.

A plausible implication is that future research will refine IDR metrics for increasingly complex, high-dimensional, and interactive environments, possibly incorporating active learning or online estimation for real-time control. Robust adversary models for privacy risk, dynamic mechanism adaptation in marketplaces, and automated monitoring of disclosure quality all represent active domains for advancing the theory and application of Information Disclosure Rate.
