
Privacy-Preserving Formal Context Analysis

Updated 4 December 2025
  • Privacy-preserving Formal Context Analysis (PFCA) is a secure framework that integrates fully homomorphic encryption with formal concept analysis to extract precise concepts from sensitive data.
  • It uses bitwise encrypted data operations and torus-based FHE to compute concept lattices without exposing the underlying binary context, maintaining both accuracy and confidentiality.
  • The approach guarantees IND-CPA security while delivering exact FCA results, though at the cost of increased computational complexity and communication overhead.

Privacy-preserving Formal Context Analysis (PFCA) is a cryptographically secure framework for conducting Formal Concept Analysis (FCA) on large-scale, sensitive datasets, where the goal is to extract knowledge or discover cognitive concepts without exposing the underlying data to external services. PFCA combines binary data representation with fully homomorphic encryption (FHE), enabling secure concept construction on outsourced infrastructure while preserving the confidentiality of the formal context. The protocol yields exact FCA results and rigorous semantic security guarantees, at the cost of increased computational and communication overhead (Chen et al., 27 Nov 2025).

1. Formal Concept Analysis Foundations

A formal context is a triple $\mathcal{K}=(G, M, I)$, consisting of a finite set of objects $G = \{g_1, \dots, g_m\}$, a finite set of attributes $M = \{m_1, \dots, m_n\}$, and an incidence relation $I \subseteq G \times M$, where $(g,m)\in I$ denotes that object $g$ possesses attribute $m$.

FCA derives concepts as pairs $(A, B)$ where $A \subseteq G$ and $B \subseteq M$ satisfy $A' = B$ and $B' = A$ under the Galois connection:

  • $A' = \{ m \in M \mid \forall g \in A: (g,m)\in I \}$,
  • $B' = \{ g \in G \mid \forall m \in B: (g,m)\in I \}$.

Concepts are ordered via $(A_1, B_1) \le (A_2, B_2)$ iff $A_1 \subseteq A_2$ (equivalently, $B_2 \subseteq B_1$), producing a concept lattice.
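As a plaintext illustration of the two derivation operators, the following minimal sketch enumerates all concepts of a small context by closing every object subset. The toy context and the function names (`intent`, `extent`, `concepts`) are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations

def intent(objects, context, attributes):
    """A': attributes shared by every object in `objects`."""
    return frozenset(m for m in attributes
                     if all((g, m) in context for g in objects))

def extent(attrs, context, objects):
    """B': objects possessing every attribute in `attrs`."""
    return frozenset(g for g in objects
                     if all((g, m) in context for m in attrs))

def concepts(objects, attributes, context):
    """Enumerate all concepts by closing every object subset (exponential in |G|)."""
    found = set()
    for k in range(len(objects) + 1):
        for subset in combinations(sorted(objects), k):
            b = intent(subset, context, attributes)   # B = A'
            a = extent(b, context, objects)           # A'' = B'
            found.add((a, b))
    return found

# Hypothetical toy context (not from the paper).
G = {"g1", "g2", "g3"}
M = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m2"), ("g2", "m3"), ("g3", "m2")}

lattice = concepts(G, M, I)
```

Each pair in `lattice` satisfies both closure conditions, and the pairs can be ordered by inclusion of their extents to obtain the lattice.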

2. Data Encoding and Ciphertext Operations

PFCA represents the context as a $0$-$1$ matrix $I \in \{0,1\}^{m \times n}$. Each object and attribute is encoded as a bit-vector:

  • Object row: $\mathbf{r}_i = (r_{i1}, \ldots, r_{in}) \in \{0,1\}^n$;
  • Attribute column: $\mathbf{s}_j = (r_{1j}, \ldots, r_{mj})^T \in \{0,1\}^m$.

Encryption proceeds bitwise: for object $g_i$, the ciphertext vector is $\mathbf{o}_i = (c_{i1},\dots,c_{in})$ with $c_{ij} = \mathcal{E}(r_{ij})$. Attributes are encrypted similarly.

PFCA homomorphically evaluates:

  • Componentwise multiplication: $\mathbf{u} \tilde{\otimes} \mathbf{v}$, where $(\mathbf{u} \tilde{\otimes} \mathbf{v})_i = u_i \otimes v_i$;
  • Aggregate sum: $\tilde{S}(\mathbf{u}) = u_1 \oplus u_2 \oplus \dots \oplus u_n$.

For $k$ object vectors, $\tilde{P}(\mathbf{u}_1,\ldots,\mathbf{u}_k) = \mathbf{u}_1 \tilde{\otimes} \dots \tilde{\otimes} \mathbf{u}_k$ and $\tilde{F} = \tilde{S} \circ \tilde{P}$. Decryption yields $\mathcal{D}(\tilde{F}(\mathbf{o}_{i_1},\dots,\mathbf{o}_{i_k})) = \sum_{j=1}^n \prod_{\ell=1}^k r_{i_\ell, j}$, the count of attributes common to the $k$ objects.
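The composition $\tilde{F} = \tilde{S} \circ \tilde{P}$ can be sketched with a stand-in ciphertext type. The `MockCipher` class below is a hypothetical placeholder that offers no security and only mimics the shape of the interface. It treats $\oplus$ as integer addition, which is an assumption made so that the decrypted value is a count, as the text states.

```python
from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)
class MockCipher:
    """Placeholder for an FHE ciphertext. NOT secure; illustration only."""
    value: int

def enc(bit):
    return MockCipher(bit)

def dec(c):
    return c.value

def h_mul(x, y):
    # Stands in for the homomorphic product (AND on bits).
    return MockCipher(x.value * y.value)

def h_add(x, y):
    # Stands in for the homomorphic sum (assumed to be integer addition).
    return MockCipher(x.value + y.value)

def p_tilde(*vectors):
    """Componentwise product across k encrypted vectors."""
    return [reduce(h_mul, column) for column in zip(*vectors)]

def s_tilde(vector):
    """Aggregate sum of one encrypted vector."""
    return reduce(h_add, vector)

def f_tilde(*vectors):
    return s_tilde(p_tilde(*vectors))

# Hypothetical object rows (not from the paper).
rows = [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
enc_rows = [[enc(b) for b in r] for r in rows]
common = dec(f_tilde(*enc_rows))   # number of attributes shared by all three rows
```

Here only the first and last columns are set in every row, so `common` is 2.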

3. Torus-Based Fully Homomorphic Encryption

PFCA employs a torus-based FHE scheme, such as TFHE, configured for 128-bit security. Key generation selects a secret key $s\in\mathbb{Z}_q^n$ with an accompanying public evaluation key.

Encryption: $\textrm{Enc}_s(m)$ for $m\in\{0,1\}$ produces $(a,b)\in R_q \times R_q$ with

  • $a$ sampled from $R_q$, error $e$ drawn from a discrete Gaussian;
  • $b = \langle a,s \rangle + \frac{q}{2}m + e \pmod{q}$.

Decryption: computes $\mu = b - \langle a, s \rangle \pmod{q}$ and recovers $m = \textrm{round}(2\mu / q)$.

Supported homomorphic operations include bitwise XOR ($\oplus$) and AND ($\otimes$), enabling vectorized logical computations on encrypted data.
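The encryption and decryption equations above can be exercised with a toy scalar LWE sketch. Several simplifications here are assumptions and not the paper's construction: $a$ is a vector in $\mathbb{Z}_q^n$ instead of an element of $R_q$, the error is a small bounded uniform value instead of a discrete Gaussian, and the parameters `Q`, `N`, and `NOISE` are far too small to be secure. The rounded value is reduced modulo 2 so that a small negative error on an encryption of 0 still decodes to 0.

```python
import random

Q = 1 << 16   # toy ciphertext modulus (insecure)
N = 16        # toy secret dimension (insecure)
NOISE = 4     # error bound; stands in for a discrete Gaussian

def keygen(rng):
    return [rng.randrange(Q) for _ in range(N)]

def encrypt(bit, s, rng):
    a = [rng.randrange(Q) for _ in range(N)]
    e = rng.randint(-NOISE, NOISE)
    # b = <a, s> + (q/2) * m + e  (mod q)
    b = (sum(ai * si for ai, si in zip(a, s)) + (Q // 2) * bit + e) % Q
    return a, b

def decrypt(ct, s):
    a, b = ct
    mu = (b - sum(ai * si for ai, si in zip(a, s))) % Q
    # round(2 * mu / q), reduced mod 2 to handle wrap-around near q
    return round(2 * mu / Q) % 2

def h_xor(ct1, ct2):
    """Adding two ciphertexts adds the encoded bits modulo 2, i.e. XOR."""
    (a1, b1), (a2, b2) = ct1, ct2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q

rng = random.Random(0)
s = keygen(rng)
c1, c2 = encrypt(1, s, rng), encrypt(1, s, rng)
xor_bit = decrypt(h_xor(c1, c2), s)   # 1 XOR 1
```

Homomorphic AND is not a linear operation on such ciphertexts. In TFHE it is evaluated with bootstrapping, which this sketch omits.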

4. Protocol for Secure Concept Construction

The protocol consists of:

  1. Key Setup: Data owner (DO) generates FHE keys.
  2. Context Encryption: DO encrypts the incidence matrix entry-by-entry and uploads $I^*$ to the cloud server (CS).
  3. Homomorphic Evaluation: CS, given encrypted object subsets $X \subseteq G$, computes $\tilde{F}$ (attribute intersection cardinalities) using the homomorphic operators. Analogous computation applies to attribute subsets for intent calculation.
  4. Concept Enumeration: For each $X \subseteq G$, CS tests concept maximality by evaluating $\tilde{F}(X)$ and its extensions.
  5. Decryption: DO decrypts results, reconstructing the full set of concepts.

Algorithm 1 details enumeration of privacy concepts via $\tilde{f}$-induction; Algorithm 2 provides dual $\tilde{g}$-induction for attribute-centric concept discovery.
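The five steps can be walked through end to end with identity placeholders standing in for encryption. This is one possible reading of the flow, not the paper's Algorithm 1: it assumes the maximality test is "adding any outside object strictly lowers the shared-attribute count", and it assumes the data owner performs that comparison on decrypted counts. The names `cs_count` and `run_protocol` are hypothetical.

```python
from itertools import combinations

# Plain integers stand in for ciphertexts: identity placeholders with NO security.
def enc(bit):
    return bit

def dec(ct):
    return ct

def cs_count(enc_rows, n):
    """Cloud side: F~ over the given rows (count of attributes shared by all of them)."""
    total = 0
    for j in range(n):
        prod = 1
        for row in enc_rows:
            prod = prod * row[j]   # stands in for homomorphic AND
        total = total + prod       # stands in for homomorphic addition
    return total

def run_protocol(matrix):
    m, n = len(matrix), len(matrix[0])
    # Steps 1-2 (DO): key setup is omitted; encrypt the context entry by entry.
    enc_matrix = [[enc(b) for b in row] for row in matrix]
    # Step 3 (CS): evaluate F~ for every object subset (exponential in m).
    counts = {}
    for k in range(m + 1):
        for X in combinations(range(m), k):
            counts[frozenset(X)] = cs_count([enc_matrix[i] for i in X], n)
    # Steps 4-5: X is an extent iff every one-object extension lowers the count.
    extents = []
    for X, c in counts.items():
        size = dec(c)
        if all(dec(counts[X | {g}]) < size for g in range(m) if g not in X):
            extents.append(X)
    return extents

# Hypothetical context: rows are objects, columns are attributes.
extents = run_protocol([[1, 1, 0], [0, 1, 1], [0, 1, 0]])
```

The test is sound because the attributes shared by $X \cup \{g\}$ are a subset of those shared by $X$, so the count stays equal exactly when $g$ already belongs to the closure of $X$.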

5. Security Guarantees and Analysis

The protocol is situated in the honest-but-curious model: CS executes protocol steps but seeks to infer plaintext information.

PFCA relies on the semantic (IND-CPA) security of FHE: given encrypted vectors $\mathcal{E}(u)$ and $\mathcal{E}(v)$, no polynomial-time adversary can tell with non-negligible advantage which ciphertext encrypts $u$ and which encrypts $v$. All protocol interactions except final concept-size decryptions remain ciphertext-protected.

Correctness is formally established: PFCA recovers the FCA concept lattice exactly if computations proceed faithfully. Privacy is proved by reduction: protocol traces expose no information beyond concept-size aggregates due to FHE ciphertext indistinguishability and noise masking, as formalized in Theorem 2.

6. Computational Complexity and Performance Benchmarks

PFCA imposes significant overhead:

  • Encryption: $O(mn)$ ciphertexts for the context matrix.
  • Enumeration: For each subset $X \subseteq G$ (objects), $\tilde{f}(X)$ requires $|X|-1$ vector homomorphic ANDs and $n-1$ homomorphic XORs. Complete enumeration over $2^m$ subsets yields $O(2^m m n)$ homomorphic operations; analogously $O(2^n n m)$ for attribute subsets.

Communication involves upload of $m \cdot n$ ciphertexts (size $\lambda$ per ciphertext) and cloud-owner exchanges of result ciphertexts per query.
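To make the bound concrete, the per-subset costs stated above can be summed exactly over all non-empty object subsets. The function name `enumeration_ops` is hypothetical, and the sketch assumes each vector AND costs $n$ ciphertext-level ANDs.

```python
from math import comb

def enumeration_ops(m, n):
    """Exact homomorphic operation counts for object-side enumeration.

    Each non-empty subset X needs (|X| - 1) vector ANDs, assumed to cost
    n ciphertext ANDs each, plus (n - 1) additions for the aggregate sum.
    """
    ands = sum(comb(m, k) * (k - 1) * n for k in range(1, m + 1))
    adds = (2**m - 1) * (n - 1)
    return ands, adds

# Sizes of the toy context in Section 7: 4 objects, 5 attributes.
ands, adds = enumeration_ops(4, 5)
bound = 2**4 * 4 * 5   # the O(2^m * m * n) expression with constant 1
```

For $m = 4$ and $n = 5$ this gives 85 ANDs and 60 additions, below the value 320 of $2^m m n$. The AND count has the closed form $n\,(m 2^{m-1} - 2^m + 1)$.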

Empirical evaluation (AMD EPYC, 32-core, TFHE) reports generation times for UCI datasets (rows $\times$ columns):

| Dataset (Rows × Cols) | HECC (s) | TEM (s) | TIA (s) |
|---|---|---|---|
| 8,124 × 18 | 41.07 | 217.34 | 46.00 |
| 12,960 × 12 | 4.84 | 40.82 | 356.00 |
| 19,735 × 15 | 34.98 | 53.58 | 6486 |
| 20,000 × 22 | 1715.4 | 9525.7 | 2216.0 |
| 48,842 × 20 | 1227.6 | 3676.9 | 4495.0 |
| 53,413 × 14 | 44.54 | 77.08 | 45658 |
| 253,680 × 21 | 14925.1 | 75869.6 | 1296000 |

Parallelization yields up to $8\times$ speedup on concept enumeration.

7. Concrete Example: Toy Context Computation

For objects $U = \{o_1, o_2, o_3, o_4\}$ and attributes $A = \{a_1, \ldots, a_5\}$, consider context $I$ from the source table.

PFCA steps:

  1. Encrypt: $\mathbf{o}_2 = (\mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(0), \mathcal{E}(0), \mathcal{E}(0))$; $\mathbf{o}_4 = (\mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(0))$.
  2. Evaluate: $\tilde{P}(\mathbf{o}_2, \mathbf{o}_4) = (\mathcal{E}(1)\otimes\mathcal{E}(1), \ldots)$.
  3. Sum: $\tilde{F}(\mathbf{o}_2, \mathbf{o}_4) = \mathcal{E}(2)$.
  4. Decrypt: DO recovers 2, denoting two common attributes. Homomorphic tests confirm these as $\{a_1, a_2\}$.
  5. Extend: Maximality checks complete the privacy concept lattice recovery over all $X \subseteq U$.
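The count recovered in step 4 can be checked by hand from the plaintext rows $\mathbf{r}_2 = (1,1,0,0,0)$ and $\mathbf{r}_4 = (1,1,1,1,0)$ that underlie the two ciphertext vectors, using the decryption identity from Section 2:

```latex
\mathcal{D}\bigl(\tilde{F}(\mathbf{o}_2, \mathbf{o}_4)\bigr)
  = \sum_{j=1}^{5} r_{2,j}\, r_{4,j}
  = 1\cdot 1 + 1\cdot 1 + 0\cdot 1 + 0\cdot 1 + 0\cdot 0
  = 2
```

Only the first two products are nonzero, which matches the common attribute set $\{a_1, a_2\}$.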

8. Comparison with Alternative FCA Approaches

Comparison of privacy and efficiency across paradigms:

| Method | Privacy | Accuracy | Overhead |
|---|---|---|---|
| Classical FCA (In-Close, CbO) | None | Exact | Low |
| FedFCA (Sellami et al., DP) | Approx., DP | Approx. | Medium |
| PFCA (FHE, this paper) | Cryptographic (FHE), exact | Exact | High |
| PPARM (assoc. rules) | Masking, DP | Approx. | Variable |

Traditional FCA is fast but unprotected; federated differential privacy approaches provide approximate accuracy and moderate overhead. PFCA achieves provable cryptographic privacy (IND-CPA) with exact output, incurring high overhead due to homomorphic computations and exponential enumeration.

Current limitations include $O(2^m)$ (or $O(2^n)$) enumeration cost and substantial resource requirements for FHE. Future directions involve hybrid protocols with structural pruning (e.g., NextClosure), secure outsourced computation with sub-exponential complexity, and direct data-mining on privacy concepts bypassing full lattice reconstruction (Chen et al., 27 Nov 2025).
