Cross-linguistic generalization of S3M linguistic encodings
Determine to what extent the encoding of linguistic information at multiple structural levels learned by self-supervised speech models (S3Ms) generalizes to languages other than English.
References
Additionally, since most studies focus on the encoding of English linguistic information in models pre-trained on English speech recordings [with some notable exceptions], it is an open question to what extent S3M encoding of linguistic information at various structural levels generalizes to other languages.
— Tracking the emergence of linguistic structure in self-supervised models learning from speech
(2604.02043 - Kloots et al., 2 Apr 2026) in Section 2.1 (Related work: Layerwise hierarchies and linguistic structure in S3Ms)