
Phonetic and Lexical Discovery of a Canine Language using HuBERT (2402.15985v1)

Published 25 Feb 2024 in cs.SD, cs.CL, cs.LG, and eess.AS

Abstract: This paper explores potential communication patterns within dog vocalizations and moves beyond the barriers of traditional linguistic analysis, which relies heavily on human a priori knowledge and limited datasets to find sound units in dog vocalizations. We present a self-supervised approach with HuBERT, enabling the accurate classification of phoneme labels and the identification of vocal patterns that suggest a rudimentary vocabulary within dog vocalizations. Our findings indicate significant acoustic consistency in this identified canine vocabulary, which covers the entirety of the observed dog vocalization sequences. We further develop a web-based dog vocalization labeling system that highlights phoneme n-grams present in the vocabulary in dog audio uploaded by users.
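
To make the last step of the abstract concrete, the following is a minimal sketch of how vocabulary n-grams might be located in a sequence of phoneme labels. It is not the authors' implementation: the function name `find_vocab_ngrams`, the integer label encoding, and the toy vocabulary are all assumptions made for illustration, and the sketch assumes the audio has already been converted to one discrete label per segment.

```python
from typing import Iterable, List, Sequence, Tuple

Match = Tuple[int, int, Tuple[int, ...]]


def find_vocab_ngrams(
    labels: Sequence[int],
    vocab: Iterable[Sequence[int]],
) -> List[Match]:
    """Return (start, end, ngram) for every vocabulary n-gram found in labels.

    `labels` is a hypothetical sequence of discrete phoneme labels;
    `vocab` is a hypothetical collection of n-grams over those labels.
    `end` is exclusive, so labels[start:end] equals the matched n-gram.
    """
    vocab_set = {tuple(v) for v in vocab}
    # Try longer n-grams first at each position so matches are listed longest-first.
    lengths = sorted({len(v) for v in vocab_set}, reverse=True)
    matches: List[Match] = []
    for start in range(len(labels)):
        for n in lengths:
            window = tuple(labels[start:start + n])
            if len(window) == n and window in vocab_set:
                matches.append((start, start + n, window))
    return matches


# Toy example with made-up labels and a made-up two-entry vocabulary.
toy_labels = [3, 7, 7, 2, 5, 3, 7]
toy_vocab = [(3, 7), (7, 2, 5)]
toy_matches = find_vocab_ngrams(toy_labels, toy_vocab)
```

Each returned span could then be mapped back to timestamps to highlight the corresponding region of the audio, assuming the segment boundaries for each label are known.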

