CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation (2402.13145v2)
Abstract: Metaphor is a prominent linguistic device in human language and literature: it adds color, imagery, and emphasis that make communication more effective. This paper introduces a large-scale, high-quality annotated Chinese Metaphor Corpus comprising around 28K sentences drawn from a diverse range of Chinese literary sources, such as poems, prose, and song lyrics. To ensure the accuracy and consistency of our annotations, we introduce a comprehensive set of guidelines. These guidelines address the facets of metaphor annotation, from identifying tenors, vehicles, and grounds to handling the complexities of similes, personifications, juxtapositions, and hyperboles. Breaking with tradition, our approach to metaphor generation emphasizes grounds and their distinct features rather than the conventional combination of tenors and vehicles. By integrating "ground" as a CoT (Chain-of-Thought) input, we are able to generate metaphors that resonate more with real-world intuition. We test generative models such as Belle, Baichuan, and Chinese-alpaca-33B using our annotated corpus. When prompted with selected samples from our dataset, these models generate creative and fluent metaphor sentences more frequently, demonstrating the value of our corpus for Chinese metaphor research. The code is available at https://github.com/JasonShao55/Chinese_Metaphor_Explanation.
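To make the idea of using the ground as a chain-of-thought step concrete, the following is a minimal sketch of how a few-shot prompt might place the ground before the vehicle and the final sentence. It is an illustration only, not the authors' prompt format: the class `MetaphorAnnotation`, the function `build_ground_cot_prompt`, the field labels, and the English example are all assumptions invented here, and the example sentence is not taken from the corpus.

```python
from dataclasses import dataclass


@dataclass
class MetaphorAnnotation:
    """One annotated sentence. Field names are hypothetical, not the corpus schema."""
    sentence: str
    tenor: str    # the thing being described
    vehicle: str  # the thing it is compared to
    ground: str   # the shared property that justifies the comparison


def build_ground_cot_prompt(examples, tenor):
    """Build a few-shot prompt in which the ground is stated before the metaphor."""
    parts = []
    for ex in examples:
        parts.append(
            f"Tenor: {ex.tenor}\n"
            f"Ground: {ex.ground}\n"
            f"Vehicle: {ex.vehicle}\n"
            f"Metaphor: {ex.sentence}\n"
        )
    # The query ends at "Ground:" so the model is asked to reason about
    # the shared property first, then pick a vehicle and write the sentence.
    parts.append(f"Tenor: {tenor}\nGround:")
    return "\n".join(parts)


# Invented illustrative annotation (not from the dataset).
example = MetaphorAnnotation(
    sentence="Time is a river that never flows back.",
    tenor="time",
    vehicle="a river",
    ground="moves onward and never returns",
)
prompt = build_ground_cot_prompt([example], tenor="love")
print(prompt)
```

The resulting string could be passed to any text-generation model. Ending the query at the ground field is one possible design choice for eliciting the intermediate reasoning step; the paper's actual prompts may differ.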
Authors: Yujie Shao, Xinrong Yao, Xingwei Qu, Chenghua Lin, Shi Wang, Stephen W. Huang, Ge Zhang, Jie Fu