
Improve ASTRA’s computational efficiency

Develop algorithmic and implementation techniques to reduce the memory footprint and runtime of ASTRA, an attention-based prompt injection attack that currently requires extracting and backpropagating through large attention matrices and runs approximately twice as slow as Greedy Coordinate Gradient (GCG) under equal forward-pass budgets.


Background

ASTRA relies on computing and differentiating through attention matrices, whose size scales quadratically with the context length. The authors note that this leads to significant memory consumption and reduced batch sizes, resulting in roughly a 2× slowdown relative to GCG on their 48 GB GPU setup.
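To make the memory pressure concrete, the following is a minimal PyTorch sketch, not the authors' implementation, of an attention-matrix objective: a toy single-head layer materializes the full n × n attention weights, a hypothetical loss measures the attention mass flowing to an assumed injected-token span, and backpropagation returns gradients with respect to the input embeddings that could guide a GCG-style token search. The dimensions, the `inj_slice` span, and the loss definition are all illustrative assumptions.

```python
# Minimal sketch (not ASTRA's code): backpropagating through a full attention matrix.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d_model = 2048, 256                            # context length, embedding width (assumed)
x = torch.randn(n, d_model, requires_grad=True)   # input embeddings to attack

Wq = torch.randn(d_model, d_model) / d_model**0.5
Wk = torch.randn(d_model, d_model) / d_model**0.5

q, k = x @ Wq, x @ Wk
scores = q @ k.T / d_model**0.5   # n x n score matrix: memory scales as O(n^2)
attn = F.softmax(scores, dim=-1)  # full attention matrix is kept for the backward pass

# Hypothetical objective: increase attention directed at an injected-token span,
# here assumed to occupy the last 64 positions of the context.
inj_slice = slice(n - 64, n)
loss = -attn[:, inj_slice].sum()
loss.backward()                   # gradients w.r.t. x can guide token substitutions

print(x.grad.shape)               # (n, d_model): signal for a GCG-style discrete search
```

Because the softmax output (and its n × n gradient buffers) must be retained for the backward pass, peak memory grows quadratically with the context length, which is what forces the smaller batch sizes and the roughly 2× slowdown relative to GCG noted above.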

The paper focuses on demonstrating attack effectiveness, leaving efficiency improvements for future work. Addressing efficiency is crucial for practical deployment, larger-scale evaluations, and adaptation to longer context windows.

References

"We leave the question of efficiency of our attacks to future work."

Pandya et al., "May I have your Attention? Breaking Fine-Tuning based Prompt Injection Defenses using Architecture-Aware Attacks," arXiv:2507.07417, 10 Jul 2025, Section 7.3 (Discussion: Limitations of ASTRA).