Efficient-Empathy: Towards Efficient and Effective Selection of Empathy Data (2407.01937v2)

Published 2 Jul 2024 in cs.CL

Abstract: In recent years, with the rapid advancements in LLMs, achieving excellent empathetic response capability has become a crucial prerequisite. Consequently, managing and understanding large-scale empathetic datasets has gained increasing importance. However, empathetic data are typically used for training without any quality selection, leading to inefficient data usage and wasted computational resources. Additionally, training on raw data can result in low performance in empathetic dialogues. In this work, we present Efficient-Empathy, a sensibility and rationality score-based data selection algorithm that automatically selects sensibility and rationality data while discarding low-quality data. With only the sensibility data (59% of the full dataset), our trained sensibility model efficiently achieves state-of-the-art (SoTA) performance. Furthermore, the sensibility model demonstrates SoTA performance across multiple data selection hyperparameters, showcasing the robustness of our method. By integrating sensibility and rationality data with a MoE structure, we achieve even higher performance, demonstrating the effectiveness of our Efficient-Empathy algorithm.
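
To make the selection step described in the abstract concrete, below is a minimal Python sketch of score-based data partitioning. It assumes each dialogue sample has already been assigned sensibility and rationality scores (e.g., by an LLM judge); the field names, thresholds, and routing rule are hypothetical illustrations, not values or logic taken from the paper.

```python
# Minimal sketch of score-based empathy-data selection.
# Assumption: each sample carries precomputed sensibility and
# rationality scores in [0, 1]; thresholds below are illustrative.
from dataclasses import dataclass


@dataclass
class Sample:
    dialogue: str
    sensibility: float  # affective/emotional quality score
    rationality: float  # cognitive/reasoning quality score


def select(samples, s_thresh=0.7, r_thresh=0.7):
    """Partition samples into sensibility data, rationality data,
    and discarded low-quality data."""
    sensibility_data, rationality_data, discarded = [], [], []
    for s in samples:
        if s.sensibility >= s_thresh:
            sensibility_data.append(s)
        elif s.rationality >= r_thresh:
            rationality_data.append(s)
        else:
            # Low on both axes: drop before training to save compute.
            discarded.append(s)
    return sensibility_data, rationality_data, discarded
```

On this reading, the reported 59% figure would correspond to the fraction of the full dataset whose sensibility score clears the threshold, with the sensibility and rationality subsets then usable to train separate experts combined in a MoE model.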

Authors (7)
  1. Linzhuang Sun (18 papers)
  2. Hao Liang (137 papers)
  3. Jingxuan Wei (21 papers)
  4. Linkun Sun (2 papers)
  5. Bihui Yu (16 papers)
  6. Bin Cui (165 papers)
  7. Wentao Zhang (261 papers)
Citations (1)
