Privacy-Preserving Formal Context Analysis

Updated 4 December 2025

Privacy-preserving Formal Context Analysis (PFCA) is a secure framework that integrates fully homomorphic encryption with formal concept analysis to extract precise concepts from sensitive data.
It uses bitwise encrypted data operations and torus-based FHE to compute concept lattices without exposing the underlying binary context, maintaining both accuracy and confidentiality.
The approach guarantees IND-CPA security while delivering exact FCA results, though at the cost of increased computational complexity and communication overhead.

Privacy-preserving Formal Context Analysis (PFCA) is a cryptographically secure framework for conducting Formal Concept Analysis (FCA) on large-scale, sensitive datasets, where the goal is to extract knowledge or discover cognitive concepts without exposing the underlying data to external services. PFCA combines a binary data representation with fully homomorphic encryption (FHE), enabling secure concept construction on outsourced infrastructure while preserving the confidentiality of the formal context. The protocol yields exact FCA results and rigorous semantic security guarantees, at the cost of increased computational and communication overhead (Chen et al., 27 Nov 2025).

1. Formal Concept Analysis Foundations

A formal context is a triple $\mathcal{K}=(G, M, I)$ , consisting of a finite set of objects $G = \{g_1, \dots, g_m\}$ , a finite set of attributes $M = \{m_1, \dots, m_n\}$ , and an incidence relation $I \subseteq G \times M$ , where $(g,m)\in I$ denotes object $g$ possesses attribute $m$ .

FCA derives concepts as pairs $(A, B)$ where $A \subseteq G$ , $B \subseteq M$ satisfy $A' = B$ and $B' = A$ under the Galois connection:

$A' = \{ m \in M \mid \forall g \in A: (g,m)\in I \}$ ,
$B' = \{ g \in G \mid \forall m \in B: (g,m)\in I \}$ .

Concepts are ordered via $(A_1, B_1) \le (A_2, B_2)$ iff $A_1 \subseteq A_2$ (equivalently, $B_2 \subseteq B_1$ ), producing a concept lattice.
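The derivation operators and the concept condition can be illustrated on plaintext data. The following is a minimal sketch, not part of the cited work: the three-object context, the names `CONTEXT`, `intent`, `extent`, `is_concept`, and `all_concepts` are all hypothetical, and the enumeration is a naive closure of every object subset.

```python
from itertools import combinations

# Hypothetical toy context: each object maps to the attributes it possesses.
CONTEXT = {
    "g1": {"m1", "m2"},
    "g2": {"m1", "m2", "m3"},
    "g3": {"m3"},
}
ATTRIBUTES = {"m1", "m2", "m3"}


def intent(objects):
    """A': attributes shared by every object in `objects`."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def extent(attributes):
    """B': objects that possess every attribute in `attributes`."""
    return {g for g, attrs in CONTEXT.items() if set(attributes) <= attrs}


def is_concept(A, B):
    """(A, B) is a concept iff A' = B and B' = A."""
    return intent(A) == set(B) and extent(B) == set(A)


def all_concepts():
    """Close every object subset; exponential in the number of objects."""
    found = set()
    objs = sorted(CONTEXT)
    for k in range(len(objs) + 1):
        for subset in combinations(objs, k):
            B = intent(subset)
            A = extent(B)
            found.add((frozenset(A), frozenset(B)))
    return found
```

For this toy context, `is_concept({"g1", "g2"}, {"m1", "m2"})` holds, whereas `({"g1"}, {"m1", "m2"})` is not a concept because `g2` also carries both attributes.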

2. Data Encoding and Ciphertext Operations

PFCA represents the context as a $0$-$1$ matrix $I \in \{0,1\}^{m \times n}$ . Each object and attribute is encoded as a bit-vector:

Object row: $\mathbf{r}_i = (r_{i1}, \ldots, r_{in}) \in \{0,1\}^n$ ;
Attribute column: $\mathbf{s}_j = (r_{1j}, \ldots, r_{mj})^T \in \{0,1\}^m$ .

Encryption proceeds bitwise: for object $g_i$ , ciphertext vector $\mathbf{o}_i = (c_{i1},\dots,c_{in})$ with $c_{ij} = \mathcal{E}(r_{ij})$ . Similarly for attributes.

PFCA homomorphically evaluates:

Componentwise multiplication: $\mathbf{u} \tilde{\otimes} \mathbf{v}$ , where $(\mathbf{u} \tilde{\otimes} \mathbf{v})_i = u_i \otimes v_i$ ;
Aggregate sum: $\tilde{S}(\mathbf{u}) = u_1 \oplus u_2 \oplus \dots \oplus u_n$ .

For $k$ object vectors, $\tilde{P}(\mathbf{u}_1,\ldots,\mathbf{u}_k) = \mathbf{u}_1 \tilde{\otimes} \dots \tilde{\otimes} \mathbf{u}_k$ ; $\tilde{F} = \tilde{S} \circ \tilde{P}$ . The decryption yields $\mathcal{D}(\tilde{F}(\mathbf{o}_{i_1},\dots,\mathbf{o}_{i_k})) = \sum_{j=1}^n \prod_{\ell=1}^k r_{i_\ell, j}$ , the count of common attributes among the objects.
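The decryption identity can be checked against a plaintext model of the three operators. The sketch below is an illustration under stated assumptions: it works on unencrypted bits, the function names `P`, `S`, `F`, and `xor_parity` are hypothetical, and the aggregate is taken to be integer addition, which is what the count on the right-hand side of the identity requires. A mod-2 XOR reduction is included only for contrast.

```python
from functools import reduce
from operator import xor


def componentwise_product(u, v):
    """Plaintext analogue of the componentwise homomorphic multiplication."""
    return [a & b for a, b in zip(u, v)]


def P(*vectors):
    """Product of k bit-vectors: 1 exactly where every vector has a 1."""
    return reduce(componentwise_product, vectors)


def S(u):
    """Aggregate sum, modelled here as integer addition."""
    return sum(u)


def F(*vectors):
    """F = S o P: number of attributes common to all given objects."""
    return S(P(*vectors))


def xor_parity(u):
    """For contrast: a mod-2 reduction keeps only the parity of the count."""
    return reduce(xor, u)


# Hypothetical object rows over four attributes.
r1 = [1, 1, 0, 1]
r2 = [1, 0, 0, 1]
r3 = [1, 1, 1, 1]
common = F(r1, r2, r3)            # attributes 1 and 4 are shared
parity = xor_parity(P(r1, r2, r3))
```

Here `common` is 2 while `parity` is 0, which shows why the aggregate must carry an integer count rather than a single bit.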

3. Torus-Based Fully Homomorphic Encryption

PFCA employs a torus-based FHE scheme, such as TFHE, configured for 128-bit security. Key generation selects secret key $s\in\mathbb{Z}_q^n$ with accompanying public evaluation key.

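Encryption: $\textrm{Enc}_s(m)$ for $m\in\{0,1\}$ produces $(a,b)\in \mathbb{Z}_q^n \times \mathbb{Z}_q$ (here $n$ denotes the dimension of the secret key, not the number of attributes) with

$a$ sampled uniformly from $\mathbb{Z}_q^n$ , error $e$ drawn from a discrete Gaussian;
$b = \langle a,s \rangle + \frac{q}{2}m + e \pmod{q}$ .

Decryption: computes $\mu = b - \langle a, s \rangle \pmod{q}$ and recovers $m = \textrm{round}(2\mu / q) \bmod 2$ , which is correct whenever $|e| < q/4$ .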

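Supported homomorphic operations include bitwise XOR $(\oplus)$ and AND $(\otimes)$ , enabling vectorized logical computations on encrypted data. Because XOR on its own yields only the parity of a sum of bits, the attribute count $\tilde{S}$ of Section 2 requires $\oplus$ to be realized there as homomorphic integer addition, for example through an adder circuit built from these gates.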
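The encryption and decryption equations can be exercised with a toy implementation. This is a sketch under loud assumptions and is not TFHE: the parameters `Q`, `N`, and `NOISE_BOUND` are hypothetical and far too small to be secure, $a$ is treated as a vector in $\mathbb{Z}_q^N$ so that the inner product is defined, and bounded uniform noise replaces the discrete Gaussian. It also shows that adding two such ciphertexts yields the XOR of the plaintext bits; homomorphic AND needs bootstrapping in TFHE and is omitted.

```python
import random

Q = 2 ** 15        # toy modulus (assumption; insecure)
N = 16             # toy secret-key dimension (assumption)
NOISE_BOUND = 64   # |e| <= 64, well below Q/4, so decryption never fails here


def inner(a, s):
    return sum(ai * si for ai, si in zip(a, s))


def keygen(rng):
    """Secret key s in Z_Q^N."""
    return [rng.randrange(Q) for _ in range(N)]


def encrypt(bit, s, rng):
    """b = <a, s> + (Q/2) * bit + e  (mod Q)."""
    a = [rng.randrange(Q) for _ in range(N)]
    e = rng.randint(-NOISE_BOUND, NOISE_BOUND)  # stand-in for Gaussian noise
    b = (inner(a, s) + (Q // 2) * bit + e) % Q
    return a, b


def decrypt(ct, s):
    """mu = b - <a, s> (mod Q); bit = round(2 * mu / Q) mod 2."""
    a, b = ct
    mu = (b - inner(a, s)) % Q
    return round(2 * mu / Q) % 2


def xor_ct(ct1, ct2):
    """Adding ciphertexts adds the encoded bits modulo 2, i.e. XOR."""
    (a1, b1), (a2, b2) = ct1, ct2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q
```

The final `mod 2` in `decrypt` matters: for a zero bit with slightly negative noise, $\mu$ wraps to just below $q$ and rounds to 2 before the reduction.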

4. Protocol for Secure Concept Construction

The protocol consists of:

Key Setup: Data owner (DO) generates FHE keys.
Context Encryption: DO encrypts incidence matrix entry-by-entry and uploads $I^*$ to the cloud server (CS).
Homomorphic Evaluation: CS, given encrypted object subsets $X \subseteq G$ , computes $\tilde{F}$ (attribute intersection cardinalities) using the homomorphic operators. Analogous computation applies to attribute subsets for intent calculation.
Concept Enumeration: For each $X \subseteq G$ , CS tests concept maximality by evaluating $\tilde{F}(X)$ and its extensions.
Decryption: DO decrypts results, reconstructing the full set of concepts.

Algorithm 1 details enumeration of privacy concepts via $\tilde{f}$ -induction; Algorithm 2 provides dual $\tilde{g}$ -induction for attribute-centric concept discovery.
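The division of work between the data owner and the cloud server can be sketched as follows. This is an illustrative mock, not Algorithm 1 or 2 of the cited work: `MockCiphertext` is a hypothetical placeholder that performs no encryption and only mimics the interface of homomorphic AND and addition, and the split in which the server returns encrypted counts while the owner decrypts and decides maximality is an assumption. The maximality test uses the fact that $X$ is an extent exactly when adding any further object strictly reduces the number of shared attributes.

```python
from itertools import combinations


class MockCiphertext:
    """Placeholder for an FHE ciphertext. NOT encryption; interface only."""

    def __init__(self, value):
        self._value = value

    def __mul__(self, other):   # stands in for homomorphic AND
        return MockCiphertext(self._value * other._value)

    def __add__(self, other):   # stands in for homomorphic addition
        return MockCiphertext(self._value + other._value)


def encrypt(bit):
    return MockCiphertext(bit)


def decrypt(ct):
    return ct._value


def server_count(enc_rows, subset):
    """CS side: encrypted count of attributes shared by a non-empty subset."""
    rows = [enc_rows[i] for i in subset]
    prod = rows[0]
    for row in rows[1:]:
        prod = [c1 * c2 for c1, c2 in zip(prod, row)]
    total = prod[0]
    for c in prod[1:]:
        total = total + c
    return total


def owner_enumerate_extents(context):
    """DO side: encrypt, query the server, decrypt counts, keep maximal subsets."""
    m, n = len(context), len(context[0])
    enc_rows = [[encrypt(bit) for bit in row] for row in context]  # uploaded

    def count(subset):
        if not subset:
            return n  # empty product: all attributes are shared vacuously
        return decrypt(server_count(enc_rows, subset))

    extents = []
    for k in range(m + 1):
        for X in combinations(range(m), k):
            base = count(X)
            others = [g for g in range(m) if g not in X]
            # X is an extent iff every extension loses at least one attribute.
            if all(count(X + (g,)) < base for g in others):
                extents.append(set(X))
    return extents
```

The sketch recovers extents only; the matching intents would be obtained by the dual, attribute-centric computation.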

5. Security Guarantees and Analysis

The protocol is situated in the honest-but-curious model: CS executes protocol steps but seeks to infer plaintext information.

PFCA relies on the semantic (IND-CPA) security of FHE: given a ciphertext of one of two equal-length vectors $u$ and $v$ chosen by the adversary, no polynomial-time adversary can tell whether it encrypts $u$ or $v$ with non-negligible advantage. All protocol interactions except final concept-size decryptions remain ciphertext-protected.

Correctness is formally established: PFCA recovers the FCA concept lattice exactly if computations proceed faithfully. Privacy is proved by reduction: protocol traces expose no information beyond concept-size aggregates due to FHE ciphertext indistinguishability and noise masking, as formalized in Theorem 2.

6. Computational Complexity and Performance Benchmarks

PFCA imposes significant overhead:

Encryption: $O(mn)$ ciphertexts for the context matrix.
Enumeration: For each subset $X \subseteq G$ (objects), $\tilde{f}(X)$ requires $|X|-1$ vector homomorphic ANDs and $n-1$ homomorphic XORs. Complete enumeration over $2^m$ subsets yields $O(2^m m n)$ homomorphic operations; analogously $O(2^n n m)$ for attribute subsets.

Communication involves upload of $m \cdot n$ ciphertexts (size $\lambda$ per ciphertext) and cloud-owner exchanges of result ciphertexts per query.
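The stated bounds can be checked numerically for small contexts. The sketch below is a back-of-the-envelope counter under one assumption not spelled out in the text: each vector AND is counted as $n$ gate-level operations, one per attribute. The function names are hypothetical.

```python
from math import comb


def upload_ciphertexts(m, n):
    """One ciphertext per entry of the m x n context matrix."""
    return m * n


def ops_for_subset(k, n):
    """Gate count for one subset of k >= 1 objects over n attributes."""
    and_gates = (k - 1) * n   # k-1 vector ANDs, n componentwise gates each
    additions = n - 1         # n-1 operations to aggregate the product vector
    return and_gates + additions


def total_ops(m, n):
    """Exact gate count over all non-empty object subsets."""
    return sum(comb(m, k) * ops_for_subset(k, n) for k in range(1, m + 1))


def upper_bound(m, n):
    """The O(2^m * m * n) bound with constant 1."""
    return 2 ** m * m * n
```

For a context with 4 objects and 5 attributes, `total_ops(4, 5)` gives 145 against an upper bound of 320. The count roughly doubles with every additional object, so exhaustive enumeration quickly becomes impractical as $m$ grows.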

Empirical evaluation (AMD EPYC, 32-core, TFHE) reveals generation times for UCI datasets (rows $\times$ columns):

Dataset (Rows×Cols)	HECC (s)	TEM (s)	TIA (s)
8,124 × 18	41.07	217.34	46.00
12,960 × 12	4.84	40.82	356.00
19,735 × 15	34.98	53.58	6486
20,000 × 22	1715.4	9525.7	2216.0
48,842 × 20	1227.6	3676.9	4495.0
53,413 × 14	44.54	77.08	45658
253,680 × 21	14925.1	75869.6	1296000

Parallelization yields up to $8\times$ speedup on concept enumeration.

7. Concrete Example: Toy Context Computation

For objects $U = \{o_1, o_2, o_3, o_4\}$ and attributes $A = \{a_1, \ldots, a_5\}$ , consider context $I$ from the source table.

PFCA steps:

Encrypt: $\mathbf{o}_2 = (\mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(0), \mathcal{E}(0), \mathcal{E}(0))$ ; $\mathbf{o}_4 = (\mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(1), \mathcal{E}(0))$ .
Evaluate: $\tilde{P}(\mathbf{o}_2, \mathbf{o}_4) = (\mathcal{E}(1)\otimes\mathcal{E}(1), \ldots)$ .
Sum: $\tilde{F}(\mathbf{o}_2, \mathbf{o}_4) = \mathcal{E}(2)$ .
Decrypt: DO recovers 2, denoting two common attributes. Homomorphic tests confirm these as $\{a_1, a_2\}$ .
Extend: Maximality checks complete the privacy concept lattice recovery over all $X \subseteq U$ .
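As a check on the Sum and Decrypt steps, the plaintext identity from Section 2 can be evaluated by hand for the two rows given above, reading the bits of $o_2$ as $(1,1,0,0,0)$ and those of $o_4$ as $(1,1,1,1,0)$ :

$\sum_{j=1}^{5} r_{2j}\, r_{4j} = 1\cdot 1 + 1\cdot 1 + 0\cdot 1 + 0\cdot 1 + 0\cdot 0 = 2$ .

Only the terms for $j=1$ and $j=2$ are nonzero, which matches the shared attributes $\{a_1, a_2\}$ .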

8. Comparison with Alternative FCA Approaches

Comparison of privacy and efficiency across paradigms:

Method	Privacy	Accuracy	Overhead
Classical FCA (In-Close, CbO)	None	Exact	Low
FedFCA (Sellami et al., DP)	Approx., DP	Approx.	Medium
PFCA (FHE, this paper)	Cryptographic (FHE), exact	Exact	High
PPARM (assoc. rules)	Masking, DP	Approx.	Variable

Traditional FCA is fast but unprotected; federated differential privacy approaches provide approximate accuracy and moderate overhead. PFCA achieves provable cryptographic privacy (IND-CPA) with exact output, incurring high overhead due to homomorphic computations and exponential enumeration.

Current limitations include $O(2^m)$ (or $O(2^n)$ ) enumeration cost and substantial resource requirements for FHE. Future directions involve hybrid protocols with structural pruning (e.g., NextClosure), secure outsourced computation with sub-exponential complexity, and direct data-mining on privacy concepts bypassing full lattice reconstruction (Chen et al., 27 Nov 2025).

References (1)

Privacy-preserving formal concept analysis: A homomorphic encryption-based concept construction (2025)
