Privacy-Aware Machine Unlearning with SISA for Reinforcement Learning-Based Ransomware Detection

Published 18 Apr 2026 in cs.CR | (2604.16760v1)

Abstract: Ransomware detection systems increasingly rely on behavior-based machine learning to address evolving attack strategies. However, emerging privacy compliance, data governance, and responsible AI deployment demand not only accurate detection but also the ability to efficiently remove the influence of specific training samples without retraining the models from scratch. In this study, we present a privacy-aware machine unlearning evaluation framework for reinforcement learning (RL)-based ransomware detection built on Sharded, Isolated, Sliced, and Aggregated (SISA) training. The framework enables efficient data deletion by retraining only the affected model shards rather than the entire detector, reducing the retraining cost while preserving detection performance. We conduct a controlled comparative study using value-based RL agents, including Deep Q-Network (DQN) and Double Deep Q-Network (DDQN), under identical experimental settings with a cost-sensitive reward design and 5-fold cross-validation on Windows 11 ransomware dataset. Detection confidence is evaluated using a continuous Q-score margin, enabling ROC-AUC analysis beyond binary predictions. For unlearning, the dataset is partitioned into five shards with majority-vote aggregation, and a fast-unlearning path is evaluated by deleting 5% of the samples from a single shard and retraining only that shard. Results show that SISA-based unlearning incurs negligible utility degradation (<= 0.05 percent F1 drop) while substantially reducing retraining time relative to full SISA retraining. DDQN exhibits slightly improved stability and lower utility loss than DQN, while both agents maintain near identical in-distribution performance after unlearning. These findings indicate that SISA provides an efficient unlearning mechanism for RL-based ransomware detection, supporting privacy-aware deployment without compromising security effectiveness.

Abstract PDF Upgrade to Chat

Authors (3)

Summary

The paper introduces a SISA-based machine unlearning framework in reinforcement learning for ransomware detection, enabling targeted removal of training samples with negligible performance loss.
The paper demonstrates that DQN and DDQN agents achieve near-perfect detection (AUC > 0.998) while maintaining robust performance post-unlearning.
The paper shows that one-shard retraining yields a 5x speedup over full retraining, ensuring efficient and practical privacy-aware updates.

Privacy-Aware Machine Unlearning with SISA for RL-Based Ransomware Detection

Introduction

The confluence of privacy mandates in AI deployment and the operational demand for robust ransomware detection systems necessitates methods for efficient and accountable machine unlearning. This work proposes a privacy-aware unlearning framework for reinforcement learning (RL)-based ransomware detection, leveraging the Sharded, Isolated, Sliced, and Aggregated (SISA) training paradigm. The framework allows for targeted removal of the influence of individual training samples by retraining only affected shards, obviating the need for exhaustive model retraining. The proposed approach is systematically compared using Deep Q-Network (DQN) and Double Deep Q-Network (DDQN) agents on state-of-the-art behavioral telemetry from Windows 11 ransomware and benign samples.

SISA-Based RL Framework for Ransomware Detection

The deployment workflow is succinctly captured in the SISA methodology. The dataset is partitioned into disjoint shards, with independent value-based RL agents trained on each shard. Ensemble aggregation of Q-network outputs at inference employs majority voting to enhance robustness and confine the impact of targeted sample deletion to specific shards. Unlearning is operationalized via retraining of the affected shard(s) only, thus maintaining computational tractability.

Figure 1: Workflow of the SISA-based training and aggregation framework.

The underlying RL formulation incorporates a cost-sensitive reward function with pronounced asymmetry to reflect the operational risk landscape of ransomware detection; false negatives (missed ransomware) incur a higher penalty than false positives.

Experimental Design and Dataset

The experimental regime utilizes a carefully curated dataset containing 2000 executables encompassing both benign files and 30 ransomware families, with rigorous validation using multi-engine consensus labeling. Feature engineering results in 103-dimensional behavioral vectors capturing filesystem activity, registry interactions, network events, and cryptographic operations.

Value-based RL (DQN, DDQN) agents are instantiated for each shard and trained with hyperparameters tuned for stability and efficiency (e.g., Adam optimizer, $\gamma=0.1$ , Huber loss, $\epsilon$ -greedy exploration). Evaluation employs 5-fold stratified cross-validation, with performance measured via F1-score and continuous Q-score–based ROC-AUC, where Q-score is computed as

$Q_{\theta}(\mathcal{X}, 1) - Q_{\theta}(\mathcal{X}, 0)$

allowing robust performance assessment beyond standard accuracy.

Detection Performance of Value-Based RL Agents

Both baseline DQN and DDQN configurations yield high discriminative performance, with DDQN demonstrating improved stability as reflected by lower worst-fold F1 variance and reduced utility degradation post-unlearning.

Figure 2: Confusion matrix for DDQN baseline detection, evidencing low false-positive and false-negative rates.

Figure 3: Q-Score ROC curve for DDQN, confirming near-perfect ranking performance with AUC $>0.998$ .

The empirical findings confirm that value-based RL is highly effective for malware detection on structured behavioral telemetry. The Q-score ROC analysis—enabled by the explicit state-action value function—provides well-calibrated detection confidence, essential for security operations.

Utility Preservation and Efficient Unlearning

Post-training, a privacy deletion request is simulated by removing 5% of samples from a single shard and retraining only the affected RL agent. Across both DQN and DDQN, the SISA-based unlearning process incurs negligible detection performance loss ( $\Delta\text{F1} \leq 0.05\%$ for DQN, $0.00\%$ for DDQN) compared to the pre-unlearning state. This result establishes that the majority-vote SISA ensemble can absorb modest, targeted deletion without operationally relevant degradation.

Moreover, the computational cost of one-shard retraining matches baseline (single-agent) training time, representing a 5x speedup over full SISA (multi-shard) retraining. This demonstrates that SISA unlearning can be seamlessly incorporated into production RL detection pipelines with minimal retraining overhead.

Figure 4: Runtime breakdown per fold for DDQN, illustrating that one-shard SISA unlearning incurs negligible additional cost compared to full SISA retraining.

Theoretical and Practical Implications

The study substantiates that SISA, originally designed for supervised model unlearning, is highly compatible with value-based RL. The explicit action-value function in DQN/DDQN supports robust Q-score confidence estimation, which is preserved post-unlearning. The decoupling of training into shard-isolated agents ensures that privacy deletions are auditable and computationally efficient.

Practically, this positions the SISA RL framework as a viable architecture for responsible AI deployment in security monitoring, particularly where regulatory compliance (e.g., GDPR 'right to be forgotten'), user data correction requests, and dynamic operational environments demand adaptive and privacy-aware behaviors.

Theoretically, the findings open several directions for extension:

Extending SISA to support multi-shard and sequential deletion scenarios,
Investigating formal verifiable unlearning and membership inference resistance,
Incorporating actor-critic or policy-gradient paradigms subject to aggregation and unlearning (requiring new confidence/ranking metrics beyond Q-score),
Assessing resilience under adversarial data removal attacks.

Conclusion

This work demonstrates that SISA-based machine unlearning provides a practical mechanism for integrating privacy-aware data removal into RL-based ransomware detectors without compromising detection efficacy or operational efficiency. The DDQN agent, leveraging SISA, achieves near-perfect detection (F1 $>0.99$ , AUC $>0.998$ ) and exhibits robust utility preservation post-unlearning, with retraining costs reduced to baseline levels. These results directly address the critical intersection of data privacy and effective, deployable AI-driven cybersecurity for dynamic threat landscapes.

Future research will extend these findings to more complex RL algorithms, compositional unlearning scenarios, and formal privacy audits, fostering broad adoption of privacy-preserving AI in security-critical domains.

Markdown Report Issue