NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly (2210.08604v2)

Published 16 Oct 2022 in cs.CL and cs.AI

Abstract: Norm discovery is important for understanding and reasoning about the acceptable behaviors and potential violations in human communication and interactions. We introduce NormSAGE, a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery, based on LLM prompting and self-verification. NormSAGE leverages the expressiveness and implicit knowledge of the pretrained GPT-3 LLM backbone to elicit knowledge about norms through directed questions representing the norm discovery task and conversation context. It further addresses the risk of LLM hallucination with a self-verification mechanism ensuring that the norms discovered are correct and substantially grounded in their source conversations. Evaluation results show that our approach discovers significantly more relevant and insightful norms for conversations on-the-fly compared to baselines (>10% in Likert scale rating). The norms discovered from Chinese conversations are also comparable to the norms discovered from English conversations in terms of insightfulness and correctness (<3% difference). In addition, the culture-specific norms are of promising quality, allowing for 80% accuracy in human identification of culture pairs. Finally, our grounding process in norm discovery self-verification can be extended to instantiate the adherence to and violation of any norm for a given conversation on-the-fly, with explainability and transparency. NormSAGE achieves an AUC of 95.4% in grounding, with natural language explanations matching human-written quality.
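
The two-stage idea described in the abstract (a directed question to elicit candidate norms, then a self-verification pass that filters them) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the function names `discover_norms`, `verify_norm`, and `norm_pipeline`, and the `fake_llm` stub are all hypothetical, and `llm` stands for any callable that maps a prompt string to a completion string.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out


def discover_norms(conversation: str, llm: LLM, culture: str = "") -> List[str]:
    """Ask a directed question about norms; expect one norm per output line."""
    scope = f"{culture} " if culture else ""
    prompt = (
        f"Conversation:\n{conversation}\n\n"
        f"Question: What {scope}social norms are relevant to this conversation? "
        "List one per line."
    )
    # Strip list markers and whitespace, and drop empty lines.
    candidates = (line.strip("-• \t") for line in llm(prompt).splitlines())
    return [c for c in candidates if c]


def verify_norm(norm: str, conversation: str, llm: LLM) -> bool:
    """Self-verification: keep a norm only if the model answers 'yes'."""
    prompt = (
        f"Conversation:\n{conversation}\n\n"
        f"Norm: {norm}\n"
        "Is this norm correct and grounded in the conversation? "
        "Answer yes or no, then explain."
    )
    return llm(prompt).strip().lower().startswith("yes")


def norm_pipeline(conversation: str, llm: LLM, culture: str = "") -> List[str]:
    """Discover candidate norms, then filter them through self-verification."""
    return [
        norm
        for norm in discover_norms(conversation, llm, culture)
        if verify_norm(norm, conversation, llm)
    ]


def fake_llm(prompt: str) -> str:
    """Canned stand-in for a real model, used only to make the sketch runnable."""
    if "Norm:" in prompt:
        candidate = prompt.split("Norm:")[1]
        return "yes, it is grounded." if "greet" in candidate else "no, unsupported."
    return "- It is polite to greet elders first.\n- People should always wear hats indoors."


if __name__ == "__main__":
    print(norm_pipeline("A: Good morning, Grandpa!", fake_llm))
```

In practice `fake_llm` would be replaced by a call to an actual language model, and the yes/no check would likely need more robust parsing than a prefix match.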

Authors (6)
  1. Yi R. Fung (31 papers)
  2. Tuhin Chakraborty (1 paper)
  3. Hao Guo (172 papers)
  4. Owen Rambow (26 papers)
  5. Smaranda Muresan (47 papers)
  6. Heng Ji (266 papers)
Citations (36)