
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization (2401.06980v1)

Published 13 Jan 2024 in cs.CL, cs.LG, and stat.ML

Abstract: In this paper, we present a novel bilevel optimization-based approach to training acoustic models for automatic speech recognition (ASR) that we term bi-level joint unsupervised and supervised training (BL-JUST). BL-JUST employs lower- and upper-level optimization with an unsupervised loss and a supervised loss, respectively, leveraging recent advances in penalty-based bilevel optimization to solve this challenging ASR problem with affordable complexity and rigorous convergence guarantees. To evaluate BL-JUST, extensive experiments were conducted on the LibriSpeech and TED-LIUM v2 datasets. BL-JUST achieves superior performance over the commonly used strategy of pre-training followed by fine-tuning.
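
The abstract describes a two-level structure: an unsupervised loss at the lower level, a supervised loss at the upper level, and a penalty-based method to solve the resulting problem. As a reference point, the template below is the generic penalty-based bilevel form from the optimization literature, not the paper's exact formulation; the symbols $f$, $g$, $x$, $y$, and the penalty weight $\gamma$ are our notation, and how BL-JUST partitions the acoustic-model parameters between the two levels is left to the paper.

$$
\min_{x}\; f\big(x, y^{*}(x)\big)
\quad \text{s.t.} \quad
y^{*}(x) \in \arg\min_{y}\; g(x, y),
$$

which penalty-based methods relax to the single-level problem

$$
\min_{x,\, y}\; f(x, y) + \gamma \Big( g(x, y) - \min_{y'}\, g(x, y') \Big), \qquad \gamma > 0,
$$

with $f$ in the role of the supervised loss and $g$ in the role of the unsupervised loss.

To make the alternating structure concrete, here is a minimal, self-contained PyTorch sketch of a penalized joint training loop in the spirit of BL-JUST. Everything in it (the toy linear model, the stand-in losses, the batch shapes, and the penalty weight `gamma`) is an illustrative assumption, not the paper's acoustic model or algorithm. With a single shared parameter block, the value-function offset $\min_{y'} g$ above is a constant, so it is dropped from the gradient computation.

```python
# Minimal sketch of penalty-based joint unsupervised/supervised training,
# loosely following the structure described in the BL-JUST abstract.
# The model, losses, data, and hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(16, 8)                      # stand-in for an acoustic model
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
gamma = 1.0                                   # penalty weight coupling the levels

def unsup_loss(x):
    # Toy stand-in for a self-supervised objective (e.g., reconstruction
    # or contrastive prediction in a real ASR encoder).
    return model(x).pow(2).mean()

def sup_loss(x, y):
    # Toy stand-in for a supervised ASR objective (e.g., CTC).
    return F.mse_loss(model(x), y)

for step in range(100):
    x_unlab = torch.randn(32, 16)                        # "unlabeled" batch
    x_lab, y = torch.randn(32, 16), torch.randn(32, 8)   # "labeled" batch

    # Lower level: a gradient step on the unsupervised loss alone.
    opt.zero_grad()
    unsup_loss(x_unlab).backward()
    opt.step()

    # Upper level: a gradient step on the supervised loss plus the penalty
    # term gamma * unsup_loss, the single-level surrogate of the bilevel
    # problem.
    opt.zero_grad()
    (sup_loss(x_lab, y) + gamma * unsup_loss(x_unlab)).backward()
    opt.step()
```

Unlike pre-training followed by fine-tuning, the two objectives are interleaved at every iteration, so the penalty keeps the model anchored to the unsupervised objective while the supervised objective is minimized.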
