Curie Policy Language (CPL)

Updated 22 February 2026
  • CPL is a formal policy description language that precisely defines share and acquire clauses using structured syntax and conditionals.
  • It supports secure, automated negotiation of data sharing agreements through mechanisms like secure multi-party computation and optional differential privacy.
  • Validated in healthcare consortia, CPL enables fine-grained, alliance-based data control to enhance predictive model performance.

The Curie Policy Language (CPL) is a formal policy description language central to the Curie framework for secure, policy-driven data exchange among members whose relationships may be governed by complex regulatory, political, trust, and strategic considerations. CPL enables each participant to precisely declare both the conditions under which it is willing to share data and the conditions under which it seeks to acquire data, supporting controlled collaboration without requiring mutual trust or centralized governance. The language serves as the foundation for automated negotiation of global data-sharing agreements, which are enforced using secure multi-party computation (MPC) and, optionally, differential privacy (DP). Curie and CPL were introduced and validated in the context of healthcare prediction consortia, notably for secure multi-institutional warfarin dosing model development (Celik et al., 2017).

1. CPL Syntax and Structure

CPL provides an expressive, structured syntax presented in Backus–Naur Form (BNF). The core elements are "share" and "acquire" clauses, with optional "sub" clauses for modular filtering and attributes for named variables. Clause components include members (to/from whom data is shared/acquired), conditionals (logical or data-dependent requirements), and selections (row-level filters). Below is a simplified representation of the CPL grammar:

<curie_policy>    ::= <statement> (';' <statement> )*
<statement>       ::= <share_clause> | <acquire_clause> | <sub_clause> | <attribute>
<share_clause>    ::= 'share' ':' [<members>] ':' [<conditionals>] '::' <selections>
<acquire_clause>  ::= 'acquire' ':' [<members>] ':' [<conditionals>] '::' <selections>
<sub_clause>      ::= <tag> ':' [<conditionals>] '::' <selections>
<attribute>       ::= <identifier> ':=' '<' <value> '>' | <identifier> ':=' '<' <value_list> '>'
<conditionals>    ::= (<var> '=' <value> (',' <conditionals>)*) 
                   |  'evaluate' '(' <data_ref> ',' <alg_arg> ',' <threshold> ')' (',' <conditionals>)* |  ε
<selections>      ::= <filters> | <tag>
<filters>         ::= <filter> (',' <filters>)*
<filter>          ::= <var> <operation> <value> | ε
<members>         ::= <member> (',' <members>)*
<member>          ::= <identifier>
<operation>       ::= '=' | '<' | '>' | '!=' | 'in' | 'like' | …
<value>           ::= string or number
<value_list>      ::= '{' <value> (',' <value>)* '}'

Conditionals can be straightforward Boolean predicates on member attributes or requester's metadata (e.g., "country=US") or privacy-preserving data-dependent evaluations (e.g., intersection size, Jaccard index, Pearson or cosine similarity) computed via secure two-party protocols.
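To make the data-dependent conditionals concrete, here is a minimal plaintext sketch of three of the statistics named above. It is not the secure two-party protocol: both columns are visible to the same process here, which a real deployment would avoid. The function names, the `evaluate` signature, and the `>=` reading of "meets the threshold" are assumptions made for illustration.

```python
import math

def intersection_size(a, b):
    """Number of distinct values the two columns have in common."""
    return len(set(a) & set(b))

def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B| over the distinct values of two columns."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric columns."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0

# Hypothetical dispatch table; the metric labels are illustrative only.
METRICS = {
    "Intersection size": intersection_size,
    "Jaccard": jaccard_index,
    "Cosine": cosine_similarity,
}

def evaluate(local, remote, metric, threshold):
    """Plaintext stand-in for a CPL evaluate(...) conditional.

    Assumption: the conditional passes when the statistic is >= threshold.
    """
    return METRICS[metric](local, remote) >= threshold
```

For example, `evaluate([25, 30, 41], [30, 41, 52], "Jaccard", 0.3)` passes because the two columns share two of four distinct values.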

2. Policy Semantics and Evaluation

CPL semantics are clause-oriented and evaluated top-down. A "share" clause of the form

share : M_j : C_1, ..., C_k :: S_1, ..., S_m

is interpreted as: when member $M_j$ requests data, if all conditionals $C_1, \ldots, C_k$ evaluate to true, exactly the subset of rows meeting the selections $S_1, \ldots, S_m$ will be shared. Conversely, "acquire" clauses specify the data to request from a member, again gated by the declared conditionals.

Selections may reference simple row filters (e.g., "age > 65", "race in {Asian, White}"), or invoke named subclauses for modular filtering logic. Data-dependent conditionals such as

evaluate(local_column, 'Jaccard index', θ)

trigger secure two-party computation protocols to compute the relevant statistic, passing if the result meets the threshold $\theta$. If no clause matches, no data are shared or acquired. Clause matching halts at the first clause whose conditionals succeed.
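The top-down, first-match rule described above can be sketched as follows. This is an illustrative model rather than Curie's implementation: the `ShareClause` type and `rows_to_share` function are hypothetical, conditionals are modeled as predicates over the requester's metadata, and multiple selections are assumed to combine conjunctively.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Metadata = Dict[str, object]

@dataclass
class ShareClause:
    member: str                                                  # requester this clause applies to
    conditionals: List[Callable[[Metadata], bool]] = field(default_factory=list)
    selections: List[Callable[[Row], bool]] = field(default_factory=list)

def rows_to_share(clauses, requester, metadata, rows):
    """Return the rows released to `requester` under a first-match policy."""
    for clause in clauses:                                       # evaluated top-down
        if clause.member != requester:
            continue
        if all(cond(metadata) for cond in clause.conditionals):  # empty list -> always true
            # Matching halts here: later clauses are never consulted.
            return [r for r in rows if all(sel(r) for sel in clause.selections)]
    return []                                                    # no clause matched: share nothing

# Two share clauses for requester M1: an alliance-gated clause, then a fallback.
m2_policy = [
    ShareClause("M1",
                [lambda m: m.get("NATO") is True, lambda m: m.get("EU") is True],
                [lambda r: r["country"] in {"US", "CA", "UK"}]),
    ShareClause("M1", [], [lambda r: r["race"] == "White"]),
]
```

With this policy, a requester whose metadata satisfies both alliance conditionals receives the country-filtered rows; any other M1 request falls through to the second clause.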

3. Local Policy Articulation

Each consortium member maintains a local CPL file, comprising two main types of statements:

  • Share-clauses: Specify what data the participant is willing to share, to whom, and under what conditions.
  • Acquire-clauses: Specify what data the participant intends to request, from whom, and under what conditions.

A minimal example for a three-member consortium is as follows:

@M1
acquire: M2:                   :: age > 25 ;
acquire: M3:  evaluate(col=age,'Jaccard',0.3):: race='Asian' ;
share:   M2:                  :: * ;
share:   M3:                  :: * ;

@M2
acquire: M1:                  :: * ;
share:   M1: NATO=true,EU=true :: country in {US,CA,UK} ;
share:   M1:                  :: race='White' ;

@M3
acquire: M1:                  :: genotype='A/A' ;
share:   M1: evaluate(col=genotype,'Intersection size',10) :: * ;
share:   M1:                  :: weight>150 ;

These policies permit intricate, multi-faceted control, such as alliance-restricted sharing, attribute-based filtering, and data similarity gating.
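As a rough illustration of how clause lines like those above break down into the grammar's components, the following sketch splits a single `share` or `acquire` line into members, conditionals, and selections. It is a hypothetical helper, not the Curie parser: it keeps each conditional and filter as an uninterpreted string and assumes commas inside `(...)` or `{...}` belong to a single item.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Clause:
    kind: str                 # 'share' or 'acquire'
    members: List[str]
    conditionals: List[str]   # raw text, e.g. "NATO=true"
    selections: List[str]     # raw text, e.g. "age > 25" or "*"

def split_top_level(text: str) -> List[str]:
    """Split on commas that are not nested inside (...) or {...}."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]

def parse_clause(line: str) -> Clause:
    """Parse 'kind : members : conditionals :: selections ;' into a Clause."""
    body = line.strip().rstrip(";").strip()
    head, sep, selections = body.rpartition("::")
    if not sep:
        raise ValueError("missing '::' before selections")
    # Raises ValueError if the head does not contain both ':' separators.
    kind, members, conditionals = (part.strip() for part in head.split(":", 2))
    if kind not in ("share", "acquire"):
        raise ValueError(f"unknown clause kind: {kind!r}")
    return Clause(kind, split_top_level(members),
                  split_top_level(conditionals), split_top_level(selections))
```

Sub-clauses, attributes, and the `@M1` member headers are deliberately left out to keep the sketch short.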

4. Negotiation and Global Agreement Formation

Pairwise negotiation is performed among all $n$ members:

  1. Each participant $P_i$ sends its acquire-clause for $P_j$ directly to $P_j$.
  2. On receipt, $P_j$ evaluates its share-clauses relevant to $P_i$, applies conditionals (including data-dependent privacy-preserving checks), and calculates the intersection between $P_i$'s requested subset and $P_j$'s allowable subset. This intersection forms the agreed subset $S_{j \to i}$.
  3. $P_j$ returns a minimal negotiation response (pointer to the subset) to $P_i$.
  4. After negotiation with all peers, $P_i$ holds $\{S_{j \to i} : j \neq i\}$, specifying permitted data slices.

The agreed data for $P_i$ to acquire from $P_j$ is defined by

$$S_{i \gets j} = A_{i \gets j} \cap SA_{j \gets i}, \quad \text{subject to} \quad C_{i \to j} \wedge C_{j \to i}$$

where $S_{i \gets j}$ is the agreed subset produced in step 2 of the negotiation, $A_{i \gets j}$ is the subset $P_i$ requests from $P_j$, $SA_{j \gets i}$ is the subset $P_j$ allows $P_i$ to receive, and $C_{i \to j}$ and $C_{j \to i}$ are each side's respective conditionals. If either party's policy is not satisfied, no data are exchanged between that pair.
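The intersection rule can be sketched with plain sets of row identifiers. The data layout is an assumption made for illustration: `requested[(i, j)]` holds the rows of member j that member i asks for, `allowed[(j, i)]` holds the rows j is willing to release to i, and the two boolean flags stand in for each side's already-evaluated conditionals.

```python
from itertools import permutations
from typing import Dict, Hashable, Iterable, Set, Tuple

Pair = Tuple[Hashable, Hashable]

def agreed_subset(requested: Set[int], allowed: Set[int],
                  acquirer_ok: bool = True, sharer_ok: bool = True) -> Set[int]:
    """Requested rows intersected with allowed rows, gated by both sides' conditionals."""
    if not (acquirer_ok and sharer_ok):
        return set()                       # either policy unsatisfied: nothing is exchanged
    return requested & allowed

def negotiate_all(members: Iterable[Hashable],
                  requested: Dict[Pair, Set[int]],
                  allowed: Dict[Pair, Set[int]]) -> Dict[Pair, Set[int]]:
    """Run the pairwise rule for every ordered (acquirer, sharer) pair."""
    agreements = {}
    for i, j in permutations(list(members), 2):
        agreements[(i, j)] = agreed_subset(requested.get((i, j), set()),
                                           allowed.get((j, i), set()))
    return agreements
```

Because every ordered pair is visited once, the number of agreements grows as n(n - 1) for n members.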

5. Enforcement via Secure MPC and Differential Privacy

After policy negotiation, each member $P_i$ constructs local summary statistics over its permitted subset:

  • $O_i = X_i^T X_i \in \mathbb{R}^{d \times d}$ (feature covariance)
  • $V_i = X_i^T y_i \in \mathbb{R}^{d \times 1}$ (feature-outcome cross-term)

where $X_i$ is the matrix of selected feature rows and $y_i$ the outcome vector. With $d \ll |\text{rows}|$, these statistics are compact. The global linear model solution is computed by:

$$\beta^* = \Big(\sum_i O_i\Big)^{-1}\Big(\sum_i V_i\Big)$$
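The following sketch checks, in the clear and on toy data, that summing the per-member statistics and solving once yields the same coefficients as fitting on the pooled rows. It uses exact rational arithmetic and a small Gauss-Jordan solver purely to stay dependency-free; the helper names are hypothetical, and a non-singular summed matrix is assumed.

```python
from fractions import Fraction

def gram(X):
    """O_i = X_i^T X_i as a d x d list of lists."""
    d = len(X[0])
    return [[sum(row[a] * row[b] for row in X) for b in range(d)] for a in range(d)]

def cross(X, y):
    """V_i = X_i^T y_i as a length-d list."""
    d = len(X[0])
    return [sum(row[a] * t for row, t in zip(X, y)) for a in range(d)]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact fractions."""
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)  # fails if singular
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def global_beta(parts):
    """parts: list of (X_i, y_i). Only the summed statistics reach the solver."""
    stats = [(gram(X), cross(X, y)) for X, y in parts]
    d = len(stats[0][1])
    O = [[sum(o[a][b] for o, _ in stats) for b in range(d)] for a in range(d)]
    V = [sum(v[a] for _, v in stats) for a in range(d)]
    return solve(O, V)
```

For two members holding rows of the line y = 1 + 2x (with a leading constant column), `global_beta` recovers the intercept 1 and slope 2 without either member's rows being combined.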

Computation proceeds via a homomorphic encryption (HE) ring protocol:

  1. Initiator generates (pk, sk), broadcasts pk.
  2. The initiator encrypts and forwards $(C_O, C_V) = (\text{Encrypt}_{pk}(O_{init}), \text{Encrypt}_{pk}(V_{init}))$.
  3. Each party adds its local encrypted $O_i, V_i$ and forwards along the ring.
  4. When the data returns to the initiator, decryption recovers $\sum_i O_i, \sum_i V_i$.
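The ring steps can be imitated with a toy additively homomorphic scheme. The source does not name the cryptosystem, so Paillier is used here purely as an assumed stand-in, with deliberately tiny fixed primes that offer no security. Statistics are treated as flat lists of non-negative integers, so real-valued entries would need a fixed-point encoding that is not shown.

```python
import math
import random

# Toy parameters: two Mersenne primes, far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public key (with generator g = N + 1)
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # secret: lcm(p-1, q-1)
MU = pow(LAM, -1, N)           # secret: valid because g = N + 1 (Python 3.8+)

def encrypt(m: int) -> int:
    """Paillier encryption of a non-negative integer m < N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def add(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2

def decrypt(c: int) -> int:
    """Only the initiator, who holds LAM and MU, can run this."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N

def ring_sum(local_values):
    """local_values[0] is the initiator's flattened statistics, the rest follow the ring."""
    running = [encrypt(v) for v in local_values[0]]                       # step 2
    for values in local_values[1:]:                                       # step 3
        running = [add(c, encrypt(v)) for c, v in zip(running, values)]
    return [decrypt(c) for c in running]                                  # step 4
```

In this single-process sketch all functions share the module-level key; in the protocol as described, only the public part would be broadcast and intermediate parties would see ciphertexts alone.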

No party ever learns another member's raw data or cleartext statistics. Optionally, to achieve $\epsilon$-differential privacy, the functional mechanism (Zhang et al., VLDB '12) is applied by adding calibrated Laplace noise to the objective, based on pre-scaled columns mapped to $[-1, 1]$. This step occurs post-aggregation, with no extra communication.
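A minimal sketch of the noise-addition step is shown below. For a squared-error objective, the polynomial coefficients are built from the aggregated statistics, so the sketch perturbs a flat list of coefficients with Laplace noise of scale sensitivity / epsilon. The sensitivity value is left as a caller-supplied parameter because the source does not state it; it would come from the analysis in the cited functional-mechanism paper. The function names are hypothetical.

```python
import random

def laplace(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(coefficients, sensitivity: float, epsilon: float, rng=None):
    """Add independent Laplace(sensitivity / epsilon) noise to each coefficient.

    Assumptions: `coefficients` is a flattened list of the objective's
    coefficients, and `sensitivity` has been derived elsewhere.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [c + laplace(scale, rng) for c in coefficients]
```

A smaller epsilon gives a larger noise scale, which is the usual privacy and accuracy trade-off.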

6. Case Study: Healthcare Consortia and Policy Expressiveness

CPL expressiveness and enforcement were validated in a warfarin dosing study covering 24 institutions in 9 countries. Five principal consortium architectures were included:

  • P.1: Single source (no cross-institution sharing)
  • P.2: Nation-wide (e.g., US-within-US)
  • P.3: Regional (e.g., North America, Europe, Asia grouped)
  • P.4: NATO–EU alliance-based
  • P.5: Global (all share with all)

Example policies for these settings:

  • Nation-wide:
    acquire: * : country='US' :: * ;
    share:   * : country='US' :: * ;
  • Global:
    acquire: * : :: * ;
    share:   * : :: * ;

In the experiments, for all configurations, acquire policies were fully satisfied by share policies; for the nation-wide (US) consortium, all 51 shares matched the 51 requests. The resulting global models reduced mean absolute percentage error (MAPE) by up to 25% over using only local models. Fine-grained policies (e.g., race-balanced, alliance-constrained) allowed participants to optimize data mixes for targeted accuracy gains.
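For reference, the error metric quoted above can be computed as in the following sketch. The function name is hypothetical, and nonzero actual values are assumed because MAPE is undefined otherwise.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

Comparing `mape` for a member's local model against the global model on the same held-out rows gives the kind of reduction reported in the study.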

7. Performance and Overhead Characteristics

Performance and scalability metrics indicated:

  • Negotiation overhead scales as $\mathcal{O}(n^2)$ in the worst case, but each interaction involves only minimal metadata (e.g., 13 members incur ~156 messages).
  • Non-cryptographic filters/conditionals execute in $\ll 1$ ms per clause; data-dependent tests (e.g., Jaccard, intersection, Pearson, cosine) require 10–100 ms each, resulting in under 20 s for 25 × 25 pairs.
  • The predominant runtime cost is the HE-based MPC matrix aggregation (one $\mathcal{O}(d^2)$ matrix-add per member). For $d = 40$ and $n \leq 50$, total time is generally less than 60 seconds. Key generation incurs one-time $\mathcal{O}(d^2)$ overhead.
  • Differential privacy via the functional mechanism imposes negligible additional cost after statistic aggregation.
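As a quick check of the negotiation figure, assuming one message is counted per ordered pair of members (the source does not state how messages are counted), the 13-member example works out as:

$$n(n-1) = 13 \times 12 = 156$$

which is consistent with the $\mathcal{O}(n^2)$ worst-case scaling.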

CPL, therefore, achieves a tractable balance between expressiveness and computational overhead. Consortium partners can articulate intricate, enforceable data-sharing requirements—ranging from attribute-based and alliance-based rules to private data similarity gates—while maintaining scalable and principled secure computation (Celik et al., 2017).
