Word-specific tonal realizations in Mandarin (2405.07006v1)

Published 11 May 2024 in cs.CL

Abstract: The pitch contours of Mandarin two-character words are generally understood as being shaped by the underlying tones of the constituent single-character words, in interaction with articulatory constraints imposed by factors such as speech rate, co-articulation with adjacent tones, segmental make-up, and predictability. This study shows that tonal realization is also partially determined by words' meanings. We first show, on the basis of a Taiwan corpus of spontaneous conversations, using a generalized additive regression model, and focusing on the rise-fall tone pattern, that after controlling for effects of speaker and context, word type is a stronger predictor of pitch realization than all the previously established word-form related predictors combined. Importantly, the addition of information about meaning in context improves prediction accuracy even further. We then proceed to show, using computational modeling with context-specific word embeddings, that token-specific pitch contours predict word type with 50% accuracy on held-out data, and that context-sensitive, token-specific embeddings can predict the shape of pitch contours with 30% accuracy. These accuracies, which are an order of magnitude above chance level, suggest that the relation between words' pitch contours and their meanings is sufficiently strong to be functional for language users. The theoretical implications of these empirical findings are discussed.
