LCANets++: Robust Audio Classification using Multi-layer Neural Networks with Lateral Competition (2308.12882v2)

Published 23 Aug 2023 in cs.SD, cs.CR, cs.LG, and eess.AS

Abstract: Audio classification aims to recognize audio signals, including speech commands and sound events. However, current audio classifiers are susceptible to perturbations and adversarial attacks. In addition, real-world audio classification tasks often suffer from limited labeled data. To help bridge these gaps, previous work developed neuro-inspired convolutional neural networks (CNNs) with sparse coding via the Locally Competitive Algorithm (LCA) in the first layer (i.e., LCANets) for computer vision. LCANets learn through a combination of supervised and unsupervised learning, reducing the dependency on labeled samples. Motivated by the fact that the auditory cortex also exhibits sparse coding, we extend LCANets to audio recognition tasks and introduce LCANets++, which are CNNs that perform sparse coding in multiple layers via LCA. We demonstrate that LCANets++ are more robust than standard CNNs and LCANets against perturbations, e.g., background noise, as well as black-box and white-box attacks, e.g., evasion and fast gradient sign method (FGSM) attacks.
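
The core mechanism behind LCANets++ is a layer that infers a sparse code through lateral competition among feature neurons. The sketch below illustrates the general LCA dynamics (a driving input, lateral inhibition via the dictionary's Gram matrix, and soft thresholding) in a minimal fully connected PyTorch layer. The class name `LCALayer` and all hyperparameter values here are illustrative assumptions, not the paper's implementation; the actual LCANets++ stack convolutional LCA layers inside a CNN for audio spectrogram inputs.

```python
# Minimal sketch of a Locally Competitive Algorithm (LCA) sparse-coding layer,
# using a fully connected dictionary for clarity (an assumption; the paper's
# LCANets++ use convolutional LCA layers within a CNN).
import torch
import torch.nn as nn


class LCALayer(nn.Module):
    """Infers a sparse code for the input by running LCA dynamics to (near) fixed point."""

    def __init__(self, input_dim, num_neurons, threshold=0.1, tau=10.0, n_steps=200):
        super().__init__()
        # Dictionary Phi: each column is a learned feature (normalized at use time).
        self.phi = nn.Parameter(torch.randn(input_dim, num_neurons) * 0.1)
        self.threshold = threshold   # lambda: sparsity threshold
        self.tau = tau               # time constant of the membrane dynamics
        self.n_steps = n_steps       # number of integration steps

    def soft_threshold(self, u):
        # Shrinkage nonlinearity: small potentials are silenced, producing sparsity.
        return torch.sign(u) * torch.clamp(u.abs() - self.threshold, min=0.0)

    def forward(self, x):
        phi = self.phi / (self.phi.norm(dim=0, keepdim=True) + 1e-8)
        b = x @ phi                                    # driving input Phi^T x
        # Lateral competition: Gram matrix with self-interactions removed.
        gram = phi.T @ phi - torch.eye(phi.shape[1], device=x.device)
        u = torch.zeros_like(b)                        # membrane potentials
        for _ in range(self.n_steps):
            a = self.soft_threshold(u)                 # current sparse activations
            du = b - u - a @ gram                      # LCA ODE right-hand side
            u = u + du / self.tau
        return self.soft_threshold(u)                  # final sparse code


# Toy usage: encode a batch of 16 flattened audio frames into sparse codes.
if __name__ == "__main__":
    layer = LCALayer(input_dim=256, num_neurons=512)
    x = torch.randn(16, 256)
    codes = layer(x)
    print(codes.shape, (codes != 0).float().mean().item())  # code shape and active fraction
```

Stacking several such layers, so that each stage's sparse code becomes the next stage's input, is the multi-layer sparse coding idea the abstract describes; the lateral inhibition term is what the paper argues confers robustness to noise and adversarial perturbations.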
