SoK: Comprehensive Security Overview, Challenges, and Future Directions of Voice-Controlled Systems (2405.17100v1)
Abstract: The integration of Voice Control Systems (VCS) into smart devices and their growing presence in daily life accentuate the importance of their security. Current research has uncovered numerous vulnerabilities in VCS, presenting significant risks to user privacy and security. However, a cohesive and systematic examination of these vulnerabilities and the corresponding solutions is still absent. This lack of comprehensive analysis presents a challenge for VCS designers in fully understanding and mitigating the security issues within these systems. Addressing this gap, our study introduces a hierarchical model structure for VCS, providing a novel lens for categorizing and analyzing existing literature in a systematic manner. We classify attacks based on their technical principles and thoroughly evaluate various attributes, such as their methods, targets, vectors, and behaviors. Furthermore, we consolidate and assess the defense mechanisms proposed in current research, offering actionable recommendations for enhancing VCS security. Our work makes a significant contribution by simplifying the complexity inherent in VCS security, aiding designers in effectively identifying and countering potential threats, and setting a foundation for future advancements in VCS security research.