Stochastic Condition Masking (SCM)
- SCM is a framework that designs dynamic, randomized masking policies to maximize uncertainty about a system’s sensitive final state.
- It employs a controlled Hidden Markov Model and a primal–dual policy-gradient algorithm to optimize conditional entropy under cost constraints.
- Empirical evaluations in seven-state HMMs and grid worlds demonstrate that SCM significantly increases opacity while adhering to masking budgets.
Stochastic Condition Masking (SCM) refers to the synthesis of dynamic, randomized masking policies in stochastic systems designed to limit information leakage to external observers. The core objective is to regulate the release of sensor output to maximize the observer’s uncertainty about whether a system’s trajectory ends in a sensitive or “secret” state. SCM addresses the quantitative notion of final-state opacity in stochastic settings, optimizing this measure under explicit constraints on masking resource usage, as recently formalized in information-theoretic terms (Udupa et al., 14 Feb 2025).
1. System Model and Secrecy Objective
SCM models the plant and its masking interface as a controlled Hidden Markov Model (HMM)
$$M = \langle S, P, O, \Sigma, \mu_0, \sigma_0, E \rangle,$$
where $S$ is a finite set of plant states; $P$ is the state transition kernel; $O$ is a finite alphabet of possible sensor observations; $\Sigma$ is a finite set of masking configurations (masking actions); $\mu_0$ is the initial state distribution; $\sigma_0 \in \Sigma$ is the initial mask; and $E(\cdot \mid s, \sigma)$ is the emission probability distribution over $O$ conditioned on the plant state $s$ and mask $\sigma$.
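A dynamic mask is a randomized, memoryless masking policy $\pi : S \times \Sigma \to \Delta(\Sigma)$, where $\Delta(\Sigma)$ denotes the probability distributions over masks. The policy determines the next masking configuration $\sigma_{t+1}$ from the current system state $s_t$ and current mask $\sigma_t$. Executing this policy produces a trajectory $(s_0, \sigma_0, o_0, s_1, \sigma_1, o_1, \ldots)$ over states, masks, and observations.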
A designated subset $G \subseteq S$ identifies secret (goal) states. At terminal time $T$, the secret-indicator variable $Z_T$ equals $1$ if $s_T \in G$ and $0$ otherwise. The operational opacity goal is to maximize the observer's uncertainty about $Z_T$ given access only to the public observation sequence $Y = (o_0, \ldots, o_T)$.
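As an illustration of the model, the following minimal Python sketch samples one trajectory of a masked plant. It is not the authors' code: the two-state plant, the `"off"`/`"on"` masks, the null reading `"n"`, and every probability are assumptions chosen only to make the roles of state, mask, observation, and secret indicator concrete.

```python
import random

# Hypothetical toy model: all names and numbers are illustrative assumptions.
STATES = [0, 1]                      # plant states
MASKS = ["off", "on"]                # masking configurations
SECRET = {1}                         # secret (goal) states
P = {0: [0.7, 0.3], 1: [0.4, 0.6]}   # P[s][s2] = plant transition probability

def emit(s, mask, rng):
    """Sample an observation: a masked sensor reports the null symbol 'n'."""
    if mask == "on":
        return "n"
    # An unmasked sensor reports the true state with probability 0.9.
    return s if rng.random() < 0.9 else 1 - s

def policy(s, mask, rng):
    """Randomized, memoryless policy: mask more often in the secret state."""
    p_on = 0.8 if s in SECRET else 0.2
    return "on" if rng.random() < p_on else "off"

def rollout(T, rng):
    """Return (observations o_0..o_T, secret indicator Z_T)."""
    s, mask = 0, "off"               # point-mass initial state, initial mask
    obs = []
    for t in range(T + 1):
        obs.append(emit(s, mask, rng))
        if t < T:
            mask = policy(s, mask, rng)               # next mask from (s_t, mask_t)
            s = rng.choices(STATES, weights=P[s])[0]  # plant moves to s_{t+1}
    return obs, int(s in SECRET)

obs, z = rollout(T=5, rng=random.Random(0))
```

The observer would see only `obs`; the indicator `z` is what the masking policy tries to keep uncertain.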
2. Quantifying Opacity with Conditional Entropy
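Opacity in SCM is measured as the conditional Shannon entropy of the secret indicator $Z_T$ given the observation sequence $Y$ under policy $\pi$,
$$H(Z_T \mid Y; \pi) = -\sum_{y} \sum_{z \in \{0,1\}} P^{\pi}(Z_T = z, Y = y)\, \log_2 P^{\pi}(Z_T = z \mid Y = y).$$
This entropy quantifies information leakage: higher conditional entropy implies greater observer uncertainty, and it reaches its maximum of one bit as $P^{\pi}(Z_T = 1 \mid Y = y)$ and $P^{\pi}(Z_T = 0 \mid Y = y)$ approach $1/2$ for all possible observation sequences $y$. This moves opacity analysis from qualitative notions to a quantitative, information-theoretic framework.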
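To make the measure concrete, the sketch below computes a conditional entropy in bits from a small joint distribution over the secret indicator and the observation. The function name `conditional_entropy_bits` and the two toy distributions are illustrative assumptions, not taken from the source.

```python
import math

def conditional_entropy_bits(joint):
    """H(Z | Y) in bits from a joint pmf given as {(z, y): probability}."""
    p_y = {}
    for (z, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p          # marginal P(Y = y)
    h = 0.0
    for (z, y), p in joint.items():
        if p > 0.0:
            h -= p * math.log2(p / p_y[y])    # -P(z, y) * log2 P(z | y)
    return h

# The observation reveals the secret: no uncertainty remains.
revealing = {(0, "a"): 0.5, (1, "b"): 0.5}
# The observation is independent of the secret: uncertainty is maximal.
masked = {(0, "n"): 0.5, (1, "n"): 0.5}

print(conditional_entropy_bits(revealing))  # 0.0
print(conditional_entropy_bits(masked))     # 1.0
```

The two extremes bracket what a masking policy can achieve for a binary secret: between zero bits (observer certain) and one bit (observer no better than a coin flip).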
3. Cost-Constrained Optimization of Masking Policies
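Masking actions are associated with resource or privacy costs. For each state transition and mask change, an immediate cost $c(s_t, \sigma_t, \sigma_{t+1})$ is incurred, and the expected, possibly discounted, total cost along a trajectory is
$$C(\pi) = \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{T-1} \gamma^{t}\, c(s_t, \sigma_t, \sigma_{t+1})\right],$$
where $\gamma \in (0, 1]$ is a discount factor. SCM poses the mask-synthesis problem as the constrained optimization
$$\max_{\pi}\; H(Z_T \mid Y; \pi) \quad \text{subject to} \quad C(\pi) \le \epsilon,$$
where $\epsilon$ is a cost budget. This formulation keeps masking resource usage within the budget while maximizing final-state opacity.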
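The cost side of the problem can be illustrated with a short sketch that estimates the expected discounted cost from a batch of sampled trajectories and compares it with a budget. The helper names and the per-step costs are hypothetical; in practice the costs would come from the model's cost function.

```python
def discounted_cost(step_costs, gamma=1.0):
    """Sum of gamma**t * c_t over one trajectory's per-step masking costs."""
    return sum(gamma ** t * c for t, c in enumerate(step_costs))

def within_budget(trajectories, budget, gamma=1.0):
    """Monte Carlo check of the constraint: mean discounted cost <= budget."""
    total = sum(discounted_cost(costs, gamma) for costs in trajectories)
    mean_cost = total / len(trajectories)
    return mean_cost, mean_cost <= budget

# Two hypothetical trajectories; each entry is the cost paid at one step.
batch = [[1, 0, 1, 1], [0, 0, 1, 0]]
mean_cost, ok = within_budget(batch, budget=1.0, gamma=0.5)
print(mean_cost, ok)  # 0.8125 True
```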
4. Primal–Dual Policy-Gradient Solution
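Masking policies are parameterized as a smooth family $\{\pi_\theta\}$ (e.g., softmax), with $\theta \in \mathbb{R}^d$ the parameter vector. Define
- $H(\theta) := H(Z_T \mid Y; \pi_\theta)$ as the opacity objective,
- $C(\theta) := C(\pi_\theta)$ as the associated cost.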
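One possible softmax parameterization is sketched below. The tabular layout, with one logit per (state, mask, next mask) triple stored in a dictionary, is an assumption made for illustration; the source only states that the family is smooth and gives softmax as an example.

```python
import math

def softmax_policy(theta, s, mask, masks):
    """Return pi_theta(. | s, mask) as {next_mask: probability}.

    theta maps (s, mask, next_mask) to a real-valued logit; missing
    entries default to 0, so an empty theta gives the uniform policy.
    """
    logits = [theta.get((s, mask, m), 0.0) for m in masks]
    top = max(logits)                      # subtract max for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return {m: w / total for m, w in zip(masks, weights)}

masks = ["off", "on"]
theta = {(1, "off", "on"): math.log(3.0)}        # favor masking in state 1
probs = softmax_policy(theta, 1, "off", masks)   # roughly {"off": 0.25, "on": 0.75}
uniform = softmax_policy({}, 0, "off", masks)    # {"off": 0.5, "on": 0.5}
```

Because every probability is a smooth, strictly positive function of the logits, gradients with respect to `theta` are well defined everywhere, which is what a policy-gradient method needs.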
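The Lagrangian formulation, with multiplier $\lambda \ge 0$ on the cost constraint $C(\theta) \le \epsilon$, is
$$L(\theta, \lambda) = H(\theta) - \lambda \left( C(\theta) - \epsilon \right).$$
The solution seeks the saddle point $(\theta^{*}, \lambda^{*})$ that maximizes $L$ over $\theta$ and minimizes it over $\lambda$:
$$\min_{\lambda \ge 0}\; \max_{\theta}\; L(\theta, \lambda).$$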
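Simultaneous gradient updates take the form:
- $\theta_{k+1} = \theta_k + \eta \left( \nabla_\theta H(\theta_k) - \lambda_k \nabla_\theta C(\theta_k) \right)$
- $\lambda_{k+1} = \left[ \lambda_k + \kappa \left( C(\theta_k) - \epsilon \right) \right]_{+}$

where $\eta, \kappa > 0$ are step sizes and $[\cdot]_{+}$ denotes projection onto $\mathbb{R}_{\ge 0}$.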
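Pseudocode:
- Initialize the policy parameters and a nonnegative Lagrange multiplier.
- Repeat until convergence:
  - Sample a batch of trajectories under the current policy.
  - Estimate the entropy gradient, the expected cost, and the cost gradient from the batch.
  - Take a gradient ascent step on the policy parameters using the Lagrangian gradient.
  - Take a step on the multiplier in the direction of the constraint violation (cost minus budget) and project it onto the nonnegative reals.
- Return the final policy parameters.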
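The primal-dual iteration can be sketched on a toy scalar problem. This is only an illustration of the update pattern: the concave function `f` stands in for the entropy objective, `g` for the cost minus the budget, and exact gradients replace the Monte Carlo estimates that SCM would use. All names and step sizes are assumptions.

```python
def primal_dual(grad_f, g, grad_g, theta=0.0, lam=0.0,
                eta=0.05, kappa=0.05, iters=5000):
    """Maximize f(theta) subject to g(theta) <= 0 by gradient ascent-descent."""
    for _ in range(iters):
        # Primal ascent step on the Lagrangian L = f - lam * g.
        theta_next = theta + eta * (grad_f(theta) - lam * grad_g(theta))
        # Dual step on the constraint violation, projected onto lam >= 0.
        lam = max(0.0, lam + kappa * g(theta))
        theta = theta_next
    return theta, lam

# Toy stand-ins: f plays the role of the entropy, g of (cost - budget).
theta, lam = primal_dual(
    grad_f=lambda th: -2.0 * (th - 2.0),   # f(theta) = -(theta - 2)^2
    g=lambda th: th - 1.0,                 # constraint theta <= 1
    grad_g=lambda th: 1.0,
)
```

For this toy problem the unconstrained maximizer is 2, the constraint binds at 1, and the iterates settle near `theta = 1` with multiplier `lam = 2`, the value at which the objective gradient and the scaled constraint gradient cancel.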
5. Gradient Computation via Observable Operators
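The non-additive structure of $H(\theta)$ precludes standard temporal-difference methods. SCM instead computes $\nabla_\theta H(\theta)$ analytically using the observable-operator formalism for controlled HMMs.
Let $y = (o_0, \ldots, o_T)$ denote an observation sequence. Over augmented states $x = (s, \sigma)$, define the controlled transition matrix $T_\theta$ with entries $T_\theta[(s, \sigma), (s', \sigma')] = P(s' \mid s)\, \pi_\theta(\sigma' \mid s, \sigma)$ and the diagonal emission matrices $E_o = \operatorname{diag}\big(E(o \mid s, \sigma)\big)$. For each $o \in O$:
$$A^{\theta}_{o} = T_\theta^{\top} E_o,$$
with $\bar{\mu}_0(s, \sigma) = \mu_0(s)\, \mathbb{1}[\sigma = \sigma_0]$ the initial distribution over augmented states. The total observation likelihood is
$$P_\theta(y) = \mathbf{1}^{\top} E_{o_T} A^{\theta}_{o_{T-1}} \cdots A^{\theta}_{o_0}\, \bar{\mu}_0.$$
Gradients are:
- $\nabla_\theta H(\theta) = -\mathbb{E}_{y \sim P_\theta}\Big[ \sum_{z \in \{0,1\}} \big( \log_2 P_\theta(z \mid y)\, \nabla_\theta P_\theta(z \mid y) + P_\theta(z \mid y) \log_2 P_\theta(z \mid y)\, \nabla_\theta \log P_\theta(y) \big) \Big]$
- For $z = 1$ (secret reached): $P_\theta(Z_T = 1 \mid y) = P_\theta(Z_T = 1, y) / P_\theta(y)$, where $P_\theta(Z_T = 1, y) = \mathbf{1}_G^{\top} E_{o_T} A^{\theta}_{o_{T-1}} \cdots A^{\theta}_{o_0}\, \bar{\mu}_0$ and $\mathbf{1}_G$ is the indicator vector of augmented states whose plant state lies in the secret set, with gradient
$$\nabla_\theta P_\theta(Z_T = 1 \mid y) = \frac{\nabla_\theta P_\theta(Z_T = 1, y)}{P_\theta(y)} - \frac{P_\theta(Z_T = 1, y)\, \nabla_\theta P_\theta(y)}{P_\theta(y)^{2}},$$
where the numerators use the same observable-operator products, differentiated by the product rule with $\nabla_\theta A^{\theta}_{o} = (\nabla_\theta T_\theta)^{\top} E_o$. For $z = 0$, $P_\theta(Z_T = 0 \mid y) = 1 - P_\theta(Z_T = 1 \mid y)$. The analytic expressions permit Monte Carlo estimation using batches of sampled observation sequences $y$.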
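The forward part of this computation can be sketched in plain Python: each observation weights the current belief by its emission probability and then pushes it through the controlled transition, and the final belief yields both the likelihood of the sequence and the joint probability of ending in a secret state. The two-state plant, the binary mask, the null reading `"n"`, and all probabilities are assumptions for illustration, and the policy here depends only on the plant state. The exact entropy is obtained by brute-force enumeration, which is feasible only for tiny models.

```python
import math
from itertools import product

# Hypothetical toy model; every number below is an illustrative assumption.
S = [0, 1]                        # plant states; state 1 is the secret
M = [0, 1]                        # masks: 0 = sensor visible, 1 = sensor masked
O = [0, 1, "n"]                   # observations; "n" is the masked (null) reading
P = [[0.7, 0.3], [0.4, 0.6]]      # P[s][s2] = plant transition probability
X = list(product(S, M))           # augmented states x = (s, mask)

def emission(o, s, m):
    """E(o | s, m): a masked sensor emits "n"; otherwise correct w.p. 0.9."""
    if m == 1:
        return 1.0 if o == "n" else 0.0
    if o == "n":
        return 0.0
    return 0.9 if o == s else 0.1

def step(alpha, o, p_on):
    """Apply one operator: weight by the emission of o, then transition."""
    new = dict.fromkeys(X, 0.0)
    for (s, m), a in alpha.items():
        w = a * emission(o, s, m)
        for s2, m2 in X:
            pi = p_on[s] if m2 == 1 else 1.0 - p_on[s]   # pi(m2 | s, m)
            new[(s2, m2)] += w * P[s][s2] * pi
    return new

def joint(y, p_on):
    """Return (P(y), P(Z_T = 1, y)) for y = (o_0, ..., o_T)."""
    alpha = dict.fromkeys(X, 0.0)
    alpha[(0, 0)] = 1.0                       # start in state 0, sensor visible
    for o in y[:-1]:
        alpha = step(alpha, o, p_on)
    final = {x: a * emission(y[-1], *x) for x, a in alpha.items()}
    p_y = sum(final.values())
    p_secret = sum(p for (s, _), p in final.items() if s == 1)
    return p_y, p_secret

def entropy_bits(p_on, T=2):
    """Exact H(Z_T | Y) in bits by enumerating all observation sequences."""
    h = 0.0
    for y in product(O, repeat=T + 1):
        p_y, p1 = joint(y, p_on)
        for pz in (p1, p_y - p1):
            if pz > 0.0:
                h -= pz * math.log2(pz / p_y)
    return h

h_never = entropy_bits({0: 0.0, 1: 0.0})    # sensor never masked
h_always = entropy_bits({0: 1.0, 1: 1.0})   # sensor masked after time 0
```

In this toy model, always masking leaves the observer with only the prior over the final state, so `h_always` equals the binary entropy of that prior, while `h_never` is lower because the noisy readings leak information. A budgeted policy would sit between the two.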
6. Empirical Evaluation
SCM was empirically validated on two models:
| Model | Masking budget ($\epsilon$) or baseline | Observed $H(Z_T \mid Y)$ | Average cost |
|---|---|---|---|
| Seven-state HMM | None (no masking) | 0.0895 | N/A |
| Seven-state HMM | 60 | ≈0.7132 | ≈42.6 (≤ 60) |
| Seven-state HMM | 20 | ≈0.6580 | ≈18.9 (≤ 20) |
| Grid world | None (no masking) | ≈0.168 | N/A |
| Grid world | Final-state mask | ≈0.1763 | ≈14–15 |
| Grid world | 70 | ≈0.6539 | ≈61.4 (≤ 70) |
| Grid world | 35 | ≈0.5274 | ≈34.1 (≤ 35) |
In the seven-state HMM example, the absence of masking results in low entropy (0.0895), indicating near-certain observer inference. SCM policies under cost budgets achieved higher entropies (e.g., ≈0.7132 for a budget of 60), confirming the ability to reduce information leakage while respecting resource constraints.
For a grid world with a mobile robot and spatial sensors, SCM significantly increased final-state opacity compared to naive full or final-state masking at various sensor reliabilities. Under a budget of 70, SCM achieved an entropy of ≈0.6539 (average cost ≈61.4), while final-state masks yielded ≈0.1763; cost was maintained within the specified budgets.
These results indicate that SCM produces nontrivial, state-dependent masking strategies that trade off masking overhead against information leakage, outperforming the no-masking and final-state masking baselines in these experiments (Udupa et al., 14 Feb 2025).
7. Context and Significance
SCM formalizes the synthesis of dynamic masks for stochastic plants in a rigorous, information-theoretic fashion, advancing prior approaches that focused on qualitative or deterministic opacity criteria. The primal–dual policy-gradient algorithm, combined with closed-form conditional entropy gradients via observable operators, addresses the unique challenges of masking policy optimization in HMMs with secrecy goals. This development enables practitioners to tailor privacy and secrecy guarantees in stochastic control systems by directly optimizing observer uncertainty under explicit resource constraints.
The formalism and algorithms of SCM are directly applicable to privacy-preserving sensing, secure robotics, and supervisory control in cyber-physical systems where plausible deniability of final states is essential and masking costs are non-negligible.