Exploring bat song syllable representations in self-supervised audio encoders (2409.12634v1)
Abstract: How well can deep learning models trained on human-generated sounds distinguish between another species' vocalization types? We analyze the encoding of bat song syllables in several self-supervised audio encoders, and find that models pre-trained on human speech generate the most distinctive representations of different syllable types. These findings are a first step toward applying cross-species transfer learning in bat bioacoustics, and toward a better understanding of out-of-distribution signal processing in audio encoder models.