BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition (2310.02629v2)
Abstract: Mixture-of-experts based models, which use language experts to extract language-specific representations effectively, have been widely applied to code-switching automatic speech recognition. However, there is still substantial room for improvement, as similar pronunciations across languages may result in ineffective multilingual modeling and inaccurate language boundary estimation. To address these drawbacks, we propose a cross-layer language adapter and a boundary-aware training method, namely Boundary-Aware Mixture-of-Experts (BA-MoE). Specifically, we first introduce language-specific adapters to separate language-specific representations and a unified gating layer to fuse representations within each encoder layer. Second, we compute a language adaptation loss on the mean output of each language-specific adapter to improve the adapter module's language-specific representation learning. In addition, we use a boundary-aware predictor to learn boundary representations that address language boundary confusion. Our approach achieves significant performance improvement, reducing the mixture error rate by 16.55\% compared to the baseline on the ASRU 2019 Mandarin-English code-switching challenge dataset.
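The abstract describes the per-layer design only at a high level: language-specific adapters produce separate representations, and a unified gating layer fuses them inside each encoder layer. The snippet below is a minimal PyTorch-style sketch of that idea, not the authors' implementation; the class names (`LanguageAdapter`, `GatedAdapterBlock`), the bottleneck width, the two-language setting, and the residual connection are all illustrative assumptions.

```python
# Minimal sketch (assumed structure, not the paper's code) of one encoder layer's
# language-specific adapters fused by a unified gating layer.
import torch
import torch.nn as nn


class LanguageAdapter(nn.Module):
    """Bottleneck adapter intended to extract a language-specific representation."""

    def __init__(self, d_model: int, d_bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(torch.relu(self.down(x)))


class GatedAdapterBlock(nn.Module):
    """Fuses per-language adapter outputs with a shared (unified) gating layer."""

    def __init__(self, d_model: int, num_languages: int = 2):
        super().__init__()
        self.adapters = nn.ModuleList(
            LanguageAdapter(d_model) for _ in range(num_languages)
        )
        # The gate predicts per-frame mixing weights over the language adapters.
        self.gate = nn.Linear(d_model, num_languages)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model) hidden states from one encoder layer.
        expert_outs = torch.stack([a(x) for a in self.adapters], dim=-2)  # (B, T, L, D)
        weights = torch.softmax(self.gate(x), dim=-1).unsqueeze(-1)       # (B, T, L, 1)
        fused = (weights * expert_outs).sum(dim=-2)                       # (B, T, D)
        return x + fused  # residual connection around the fused adapter outputs


# Usage example with hypothetical dimensions: 2 utterances, 50 frames, 256-dim features.
block = GatedAdapterBlock(d_model=256)
y = block(torch.randn(2, 50, 256))  # -> shape (2, 50, 256)
```

The language adaptation loss mentioned in the abstract (computed on the mean output of each adapter) and the boundary-aware predictor are not sketched here, since their exact formulations are not given in this summary.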
- Peikun Chen
- Fan Yu
- Yuhao Lian
- Hongfei Xue
- Xucheng Wan
- Naijun Zheng
- Huan Zhou
- Lei Xie