Effect of augmentation order in audio Best-of-N Jailbreaking

Investigate how the order in which the six audio augmentations (speed, pitch, speech, noise, volume, and music) are applied to vocalized harmful requests affects the resulting audio and the attack success rate of Best-of-N Jailbreaking against audio language models, including whether alternative orders improve or degrade efficacy.

Background

The authors’ audio attack pipeline composes six augmentations in a fixed order: [speed, pitch, speech, noise, volume, music]. They note that changing the order would alter how the resulting audio sounds, implying potential impact on model behavior.
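
A minimal sketch of why order can matter, using toy stand-ins rather than the authors' implementation. It assumes that volume is a constant gain and that noise is purely additive; the functions `apply_volume`, `add_noise`, and `compose`, the 220 Hz test tone, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def apply_volume(wave, gain=0.5):
    # Assumed model of a volume augmentation: constant amplitude gain.
    return wave * gain

def compose(wave, steps):
    # Apply each augmentation in the given order.
    for step in steps:
        wave = step(wave)
    return wave

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
speech = np.sin(2 * np.pi * 220.0 * t)      # toy stand-in for a vocalized request
noise = 0.1 * rng.standard_normal(16000)    # toy stand-in for background noise

def add_noise(wave):
    # Assumed model of a noise augmentation: additive mixing.
    return wave + noise

noise_then_volume = compose(speech, [add_noise, apply_volume])
volume_then_noise = compose(speech, [apply_volume, add_noise])

# The two orders differ by exactly half of the noise signal:
# 0.5 * (speech + noise) versus 0.5 * speech + noise.
difference = volume_then_noise - noise_then_volume
print(np.allclose(difference, 0.5 * noise))
```

Under these assumptions, an overlay added before the gain is rescaled together with the request, while an overlay added after the gain keeps its original level. In the fixed order above, noise precedes volume and music follows it, so the two overlays would be treated differently by a volume change if the real augmentations behave like these toy versions.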

They state explicitly that they did not test alternative orders and leave that investigation for future work, so the effect of augmentation order on attack effectiveness remains an open methodological question.
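
One possible way to scope such a study, sketched here as an assumption rather than a protocol from the paper: six augmentations admit 6! = 720 orderings, so an experiment could compare the baseline order against a reproducible random sample of alternatives. The helper names `all_orders` and `sample_orders` are hypothetical.

```python
import itertools
import random

# Baseline order as stated in the paper.
BASELINE_ORDER = ("speed", "pitch", "speech", "noise", "volume", "music")

def all_orders(names=BASELINE_ORDER):
    # Every permutation of the augmentation names (720 for six names).
    return list(itertools.permutations(names))

def sample_orders(k, seed=0, names=BASELINE_ORDER):
    # Return the baseline first, followed by k - 1 distinct alternative orders.
    rng = random.Random(seed)
    baseline = tuple(names)
    alternatives = [order for order in all_orders(names) if order != baseline]
    return [baseline] + rng.sample(alternatives, k - 1)

orders = sample_orders(10)
print(len(all_orders()), len(orders))
```

Fixing the seed keeps the sampled orders identical across runs, so attack success rates measured for each order can be compared against the baseline on the same set of requests.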

References

We use the same order: [speed, pitch, speech, noise, volume, music] throughout experiments in the paper. We did not run experiments changing the order in which these are applied and leave that for future work.

Best-of-N Jailbreaking (2412.03556 - Hughes et al., 2024) in Appendix: Augmentation Details, Section “Audio Augmentations,” subsection “Composing augmentations”