Curie Policy Language (CPL)

Updated 22 February 2026

CPL is a formal policy description language that precisely defines share and acquire clauses using structured syntax and conditionals.
It supports secure, automated negotiation of data sharing agreements through mechanisms like secure multi-party computation and optional differential privacy.
Validated in healthcare consortia, CPL enables fine-grained, alliance-based data control to enhance predictive model performance.

The Curie Policy Language (CPL) is a formal policy description language central to the Curie framework for secure, policy-driven data exchange among members whose relationships may be governed by complex regulatory, political, trust, and strategic considerations. CPL enables each participant to precisely declare both the conditions under which it is willing to share data and the conditions under which it seeks to acquire data, supporting controlled collaboration without requiring mutual trust or centralized governance. The language serves as the foundation for automated negotiation of global data-sharing agreements, which are enforced using secure multi-party computation (MPC) and, optionally, differential privacy (DP). Curie and CPL were introduced and validated in the context of healthcare prediction consortia, notably for secure multi-institutional warfarin dosing model development (Celik et al., 2017).

1. CPL Syntax and Structure

CPL provides an expressive, structured syntax presented in Backus–Naur Form (BNF). The core elements are "share" and "acquire" clauses, with optional "sub" clauses for modular filtering and attributes for named variables. Clause components include members (to/from whom data is shared/acquired), conditionals (logical or data-dependent requirements), and selections (row-level filters). Below is a simplified representation of the CPL grammar:

<curie_policy>    ::= <statement> (';' <statement>)*
<statement>       ::= <share_clause> | <acquire_clause> | <sub_clause> | <attribute>
<share_clause>    ::= 'share' ':' [<members>] ':' [<conditionals>] '::' <selections>
<acquire_clause>  ::= 'acquire' ':' [<members>] ':' [<conditionals>] '::' <selections>
<sub_clause>      ::= <tag> ':' [<conditionals>] '::' <selections>
<attribute>       ::= <identifier> ':=' '<' <value> '>' | <identifier> ':=' '<' <value_list> '>'
<conditionals>    ::= <var> '=' <value> (',' <conditionals>)*
                   |  'evaluate' '(' <data_ref> ',' <alg_arg> ',' <threshold> ')' (',' <conditionals>)*
                   |  ε
<selections>      ::= <filters> | <tag>
<filters>         ::= <filter> (',' <filters>)*
<filter>          ::= <var> <operation> <value> | ε
<members>         ::= <member> (',' <members>)*
<member>          ::= <identifier>
<operation>       ::= '=' | '<' | '>' | '!=' | 'in' | 'like' | …
<value>           ::= string or number
<value_list>      ::= '{' <value> (',' <value>)* '}'

Conditionals can be straightforward Boolean predicates on member attributes or requester's metadata (e.g., "country=US") or privacy-preserving data-dependent evaluations (e.g., intersection size, Jaccard index, Pearson or cosine similarity) computed via secure two-party protocols.
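To make the clause layout concrete, the following is a minimal Python sketch that splits a single three-field share or acquire clause into its members, conditionals, and selections. It is an illustration of the grammar above, not the Curie implementation: the function names `split_top_level` and `parse_clause` are hypothetical, and the sketch does not handle sub-clauses, attributes, or clauses that omit a field.

```python
def split_top_level(text, sep=","):
    """Split on `sep`, ignoring separators inside (), {} or quoted strings."""
    parts, depth, quote, current = [], 0, None, []
    for ch in text:
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch == sep and depth == 0 and quote is None:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_clause(line):
    """Parse 'kind : members : conditionals :: selections ;' into a dict."""
    head, _, selections = line.strip().rstrip(";").partition("::")
    kind, members, conditionals = (p.strip() for p in head.split(":", 2))
    return {
        "kind": kind,
        "members": split_top_level(members),
        "conditionals": split_top_level(conditionals),
        "selections": split_top_level(selections),
    }


clause = parse_clause("share:   M1: NATO=true,EU=true :: country in {US,CA,UK} ;")
print(clause["conditionals"])  # ['NATO=true', 'EU=true']
print(clause["selections"])    # ['country in {US,CA,UK}']
```

The depth-aware split keeps `evaluate(...)` arguments and `{...}` value lists intact, which a plain `str.split(",")` would break apart.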

2. Policy Semantics and Evaluation

CPL semantics are clause-oriented and evaluated top-down. A "share" clause of the form

share : M_j : C_1, ..., C_k :: S_1, ..., S_m

is interpreted as: when member $M_j$ requests data, if all conditionals $C_1, \ldots, C_k$ evaluate to true, exactly the subset of rows meeting the selections $S_1, \ldots, S_m$ will be shared. Conversely, "acquire" clauses specify the data to request from a member, again gated by the declared conditionals.

Selections may reference simple row filters (e.g., "age > 65", "race in {Asian, White}"), or invoke named subclauses for modular filtering logic. Data-dependent conditionals such as

evaluate(local_column, 'Jaccard index', θ)

trigger secure two-party computation protocols to compute the relevant statistic, passing if the result meets the threshold $\theta$. Clause matching halts at the first clause whose conditionals succeed; if no clause matches, no data are shared or acquired.
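The first-match rule can be sketched in a few lines of Python. This is an illustration under stated assumptions: the clause representation (dicts with `members`, `conditionals`, `selection`) and the function names are hypothetical, "meets the threshold" is read as greater than or equal, and the Jaccard index is computed in cleartext as a stand-in for the secure two-party protocol the text describes.

```python
def jaccard(a, b):
    """Cleartext Jaccard index |A ∩ B| / |A ∪ B| (stand-in for the secure protocol)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def evaluate_share(clauses, requester, rows):
    """Return the rows released by the first clause whose conditionals all hold.

    Each clause is a dict with hypothetical keys:
      members      - set of member names the clause applies to
      conditionals - list of zero-argument callables returning bool
      selection    - callable row -> bool (row-level filter)
    """
    for clause in clauses:
        if requester not in clause["members"]:
            continue
        if all(cond() for cond in clause["conditionals"]):
            return [r for r in rows if clause["selection"](r)]
    return []  # no clause matched: nothing is shared


local_ages, remote_ages = [30, 41, 52, 67], [41, 52, 70]
rows = [{"age": 30, "weight": 140}, {"age": 67, "weight": 180}]
clauses = [
    # share: M1: evaluate(age, 'Jaccard index', 0.5) :: *
    {"members": {"M1"},
     "conditionals": [lambda: jaccard(local_ages, remote_ages) >= 0.5],
     "selection": lambda r: True},
    # share: M1: :: weight > 150
    {"members": {"M1"},
     "conditionals": [],
     "selection": lambda r: r["weight"] > 150},
]
shared = evaluate_share(clauses, "M1", rows)
print(shared)  # [{'age': 67, 'weight': 180}]
```

Here the Jaccard index is 2/5 = 0.4, so the first clause fails its gate and the unconditional second clause decides what is released.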

3. Local Policy Articulation

Each consortium member maintains a local CPL file, comprising two main types of statements:

Share-clauses: Specify what data the participant is willing to share, to whom, and under what conditions.
Acquire-clauses: Specify what data the participant intends to request, from whom, and under what conditions.

A minimal example for a three-member consortium is as follows:

@M1
acquire: M2:                   :: age > 25 ;
acquire: M3:  evaluate(col=age,'Jaccard',0.3):: race='Asian' ;
share:   M2:                  :: * ;
share:   M3:                  :: * ;

@M2
acquire: M1:                  :: * ;
share:   M1: NATO=true,EU=true :: country in {US,CA,UK} ;
share:   M1:                  :: race='White' ;

@M3
acquire: M1:                  :: genotype='A/A' ;
share:   M1: evaluate(col=genotype,'Intersection size',10) :: * ;
share:   M1:                  :: weight>150 ;

These policies permit intricate, multi-faceted control, such as alliance-restricted sharing, attribute-based filtering, and data similarity gating.

4. Negotiation and Global Agreement Formation

Pairwise negotiation is performed among all $n$ members:

Each participant $P_i$ sends its acquire-clause for $P_j$ directly to $P_j$.
On receipt, $P_j$ evaluates its share-clauses relevant to $P_i$, applies conditionals (including data-dependent privacy-preserving checks), and calculates the intersection between $P_i$'s requested subset and $P_j$'s allowable subset. This intersection forms the agreed subset $S_{j \to i}$.
$P_j$ returns a minimal negotiation response (pointer to the subset) to $P_i$.
After negotiation with all peers, $P_i$ holds $\{S_{j \to i} : j \neq i\}$, specifying permitted data slices.

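The agreed data for $P_i$ to acquire from $P_j$, that is, the agreed subset viewed from the acquiring side, is defined by

$S_{i \gets j} = A_{i \gets j} \cap SA_{j \gets i}, \quad \textrm{subject to} \quad C_{i \to j} \wedge C_{j \to i}$

where $A_{i \gets j}$ is the subset $P_i$ requests from $P_j$, $SA_{j \gets i}$ is the subset $P_j$'s share-clauses allow for $P_i$, and $C_{i \to j}$ and $C_{j \to i}$ are each side's respective conditionals. If either party's policy is not satisfied, no data are exchanged between that pair.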
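The intersection rule can be illustrated with a short Python sketch. The function name `negotiate` and its parameters are hypothetical, and the two conditional outcomes are passed in as already-evaluated booleans. The returned row indices play the role of the "pointer to the subset" mentioned in the negotiation steps.

```python
def negotiate(acquire_filter, share_filter, acquire_ok, share_ok, rows):
    """Indices of P_j's rows released to P_i, or [] if either side's conditionals fail."""
    if not (acquire_ok and share_ok):
        return []
    return [i for i, row in enumerate(rows)
            if acquire_filter(row) and share_filter(row)]


# Rows held by P_j (illustrative values only).
rows = [
    {"age": 30, "race": "White"},
    {"age": 20, "race": "White"},
    {"age": 40, "race": "Asian"},
]
# P_i: acquire ... :: age > 25      P_j: share ... :: race='White'
agreed = negotiate(lambda r: r["age"] > 25,
                   lambda r: r["race"] == "White",
                   acquire_ok=True, share_ok=True, rows=rows)
print(agreed)  # [0]
```

Only row 0 satisfies both the acquire filter and the share filter; setting either flag to `False` yields an empty agreement for the pair.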

5. Enforcement via Secure MPC and Differential Privacy

After policy negotiation, each member $P_i$ constructs local summary statistics over its permitted subset:

$O_i = X_i^T X_i \in \mathbb{R}^{d \times d}$ (feature Gram matrix, loosely the unnormalized feature covariance)
$V_i = X_i^T y_i \in \mathbb{R}^{d \times 1}$ (feature-outcome cross-term)

where $X_i$ is the matrix of selected feature rows and $y_i$ the outcome vector. With $d \ll |\text{rows}|$, these statistics are compact. The global linear model solution is computed by:

$\beta^* = \Big(\sum_i O_i\Big)^{-1}\Big(\sum_i V_i\Big)$
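The reason summing local statistics suffices is that $X^T X$ and $X^T y$ are sums over rows, so the pooled sums equal the statistics of the concatenated data. The Python sketch below checks this on a tiny two-site example. All helper names are hypothetical, the data are made up for illustration, and exact rational arithmetic (`fractions.Fraction`) is used only so the comparison is exact; it is not something the source prescribes.

```python
from fractions import Fraction


def gram(X):
    """O = X^T X as a d x d list of lists."""
    d = len(X[0])
    return [[sum(r[a] * r[b] for r in X) for b in range(d)] for a in range(d)]


def cross(X, y):
    """V = X^T y as a length-d list."""
    d = len(X[0])
    return [sum(r[a] * t for r, t in zip(X, y)) for a in range(d)]


def add_matrices(mats):
    """Element-wise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]


def add_vectors(vecs):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vecs)]


def solve(A, b):
    """Solve A beta = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[n] for row in M]


# Two sites, each with an intercept column and one feature.
X1, y1 = [[1, 1], [1, 2]], [2, 3]
X2, y2 = [[1, 3], [1, 5]], [5, 6]

O = add_matrices([gram(X1), gram(X2)])
V = add_vectors([cross(X1, y1), cross(X2, y2)])
beta_pooled = solve(O, V)
beta_joint = solve(gram(X1 + X2), cross(X1 + X2, y1 + y2))
print(beta_pooled == beta_joint)  # True
```

Each site contributes only a $d \times d$ matrix and a length-$d$ vector, regardless of how many rows it holds.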

Computation proceeds via a homomorphic encryption (HE) ring protocol:

The initiator generates a key pair (pk, sk) and broadcasts pk.
The initiator encrypts and forwards $(C_O, C_V) = (\text{Encrypt}_{pk}(O_{init}), \text{Encrypt}_{pk}(V_{init}))$.
Each party adds its local encrypted $O_i, V_i$ and forwards the result along the ring.
When the data returns to the initiator, decryption recovers $\sum_i O_i, \sum_i V_i$.
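The ring steps above can be sketched with any additively homomorphic scheme. The source does not name the scheme, so the Python example below assumes a toy Paillier cryptosystem purely for illustration: the key is built from two small fixed Mersenne primes and is not secure, and all function names are hypothetical. Statistics are treated as flat lists of non-negative integers; real-valued entries would need a fixed-point encoding that is not shown.

```python
import math
import random


def keygen():
    """Toy Paillier key pair (assumption: the source does not name the HE scheme)."""
    p, q = 2**31 - 1, 2**61 - 1              # known primes, far too small for real use
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                      # modular inverse, Python 3.8+
    return n, (lam, mu, n)                    # (pk, sk)


def encrypt(pk, m):
    """Encrypt integer m with generator g = n + 1."""
    n2 = pk * pk
    r = random.randrange(1, pk)
    while math.gcd(r, pk) != 1:
        r = random.randrange(1, pk)
    return (1 + m * pk) % n2 * pow(r, pk, n2) % n2


def add_encrypted(pk, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % (pk * pk)


def decrypt(sk, c):
    lam, mu, n = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n


def ring_sum(local_stats):
    """local_stats[0] is the initiator's vector; the others follow in ring order."""
    pk, sk = keygen()                                     # initiator keeps sk
    running = [encrypt(pk, v) for v in local_stats[0]]    # initiator starts the ring
    for stats in local_stats[1:]:                         # each party adds and forwards
        running = [add_encrypted(pk, c, encrypt(pk, v))
                   for c, v in zip(running, stats)]
    return [decrypt(sk, c) for c in running]              # initiator decrypts the sums


print(ring_sum([[3, 4], [10, 20], [5, 6]]))  # [18, 30]
```

Intermediate parties only ever handle ciphertexts under pk, and the initiator only decrypts the final totals.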

No party ever learns another member's raw data or cleartext statistics. Optionally, to achieve $\epsilon$-differential privacy, the functional mechanism (Zhang et al., VLDB '12) is applied by adding calibrated Laplace noise to the objective, based on pre-scaled columns mapped to $[-1,1]$. This step occurs post-aggregation, with no extra communication.
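As a rough illustration of the noise step, the Python sketch below adds Laplace noise with scale sensitivity / $\epsilon$ to a list of aggregated objective coefficients. It is a generic Laplace-perturbation sketch, not the functional mechanism as specified in the cited paper: the function names are hypothetical, and the sensitivity is taken as a caller-supplied parameter because its value depends on the objective and on the $[-1,1]$ scaling and is not derived here.

```python
import random


def laplace(scale, rng):
    """One Laplace(0, scale) draw, as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def perturb(coefficients, sensitivity, epsilon, rng=None):
    """Add independent Laplace(sensitivity / epsilon) noise to each coefficient.

    `sensitivity` is an assumption supplied by the caller; it is not computed here.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [c + laplace(scale, rng) for c in coefficients]


aggregated = [4.0, 11.0, 39.0]          # illustrative coefficients
noisy = perturb(aggregated, sensitivity=1.0, epsilon=0.5, rng=random.Random(0))
print(len(noisy))  # 3
```

Because the noise is applied once to the already aggregated values, it adds no messages to the ring protocol, in line with the "no extra communication" remark above.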

6. Case Study: Healthcare Consortia and Policy Expressiveness

CPL expressiveness and enforcement were validated in a warfarin dosing study covering 24 institutions in 9 countries. Five principal consortium architectures were included:

P.1: Single source (no cross-institution sharing)
P.2: Nation-wide (e.g., US-within-US)
P.3: Regional (e.g., North America, Europe, Asia grouped)
P.4: NATO–EU alliance-based
P.5: Global (all share with all)

Example policies for these settings:

Nation-wide:

acquire: country='US':: ;
share: country='US':: ;

Global:
acquire: *:: * ;
share: *:: * ;

In the experiments, for all configurations, acquire policies were fully satisfied by share policies; for the nation-wide (US) consortium, all 51 shares matched the 51 requests. The resulting global models reduced mean absolute percentage error (MAPE) by up to 25% over using only local models. Fine-grained policies (e.g., race-balanced, alliance-constrained) allowed participants to optimize data mixes for targeted accuracy gains.
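For reference, MAPE is the mean of the absolute relative errors, expressed as a percentage. A minimal Python sketch follows; the function name and the numbers are illustrative and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Illustrative doses only, not data from the warfarin study.
print(mape([4, 8], [3, 10]))  # 25.0
```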

7. Performance and Overhead Characteristics

Performance and scalability metrics indicated:

Negotiation overhead scales as $\mathcal{O}(n^2)$ in the worst case, but each interaction involves only minimal metadata (e.g., 13 members incur ~156 messages).
Non-cryptographic filters/conditionals execute in well under 1 ms per clause; data-dependent tests (e.g., Jaccard, intersection, Pearson, cosine) require 10–100 ms each, resulting in under 20 s for 25 × 25 pairs.
The predominant runtime cost is the HE-based MPC matrix aggregation (one $\mathcal{O}(d^2)$ matrix-add per member). For $d = 40$ and $n \leq 50$, total time is generally less than 60 seconds. Key generation incurs one-time $\mathcal{O}(d^2)$ overhead.
Differential privacy via the functional mechanism imposes negligible additional cost after statistic aggregation.
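The quoted figure of roughly 156 messages for 13 members is consistent with one message per ordered pair of members, $n(n-1)$. That reading is an inference from the number, not a statement in the source. A short Python check, with a hypothetical function name:

```python
def negotiation_messages(n):
    """Messages if each member sends one request to every other member."""
    return n * (n - 1)


print(negotiation_messages(13))  # 156
```

The quadratic growth is visible directly: doubling the consortium to 26 members gives 650 messages under the same assumption.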

CPL, therefore, achieves a tractable balance between expressiveness and computational overhead. Consortium partners can articulate intricate, enforceable data-sharing requirements—ranging from attribute-based and alliance-based rules to private data similarity gates—while maintaining scalable and principled secure computation (Celik et al., 2017).

References

Celik et al. (2017). Curie: Policy-based Secure Data Exchange.
