
IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations (2306.13865v1)

Published 24 Jun 2023 in cs.CL

Abstract: LLMs encode the meanings of words in the form of distributed semantics. Distributed semantics capture common statistical patterns among language tokens (words, phrases, and sentences) from large amounts of data. LLMs perform exceedingly well across the General Language Understanding Evaluation (GLUE) tasks designed to test a model's understanding of the meanings of input tokens. However, recent studies have shown that LLMs tend to generate unintended, inconsistent, or incorrect outputs when processing inputs that were seen rarely during training or that are associated with diverse contexts (e.g., the well-known hallucination phenomenon in language generation tasks). Crowdsourced and expert-curated knowledge graphs such as ConceptNet are designed to capture the meaning of words from a compact set of well-defined contexts. Thus, LLMs may benefit from leveraging such knowledge contexts to reduce inconsistencies in their outputs. We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations of input tokens. IERL has the distinct advantage over state-of-the-art (SOTA) methods of being interpretable by design (when was the LLM context used vs. when was the knowledge context used?), allowing the inputs to be scrutinized in conjunction with the model's parameters and facilitating the analysis of inconsistent or irrelevant outputs. Although IERL is agnostic to the choice of LLM and crowdsourced knowledge source, we demonstrate our approach using BERT and ConceptNet. We report improved or competitive results with IERL across GLUE tasks over current SOTA methods, along with significantly enhanced model interpretability.
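To make the ensemble idea concrete, the sketch below shows one way an IERL-style interpretable combination of the two representation sources could look. This is a minimal illustration, not the paper's actual architecture: the `InterpretableEnsemble` class, its gating network, and all dimensions are hypothetical assumptions, with BERT-sized hidden states (768-d) and ConceptNet Numberbatch-sized vectors (300-d) used only as plausible stand-ins. The property it demonstrates is the one the abstract emphasizes: a per-token gate `alpha` is exposed alongside the fused representation, so one can inspect when the LLM context vs. the knowledge context was used.

```python
# Hypothetical sketch of an IERL-style interpretable ensemble.
# Assumption: the paper's actual method may combine representations
# differently; this only illustrates a learned, inspectable gate
# between LLM embeddings and knowledge-graph embeddings.
import torch
import torch.nn as nn

class InterpretableEnsemble(nn.Module):
    """Fuse LLM token embeddings with knowledge-graph embeddings via a
    per-token gate whose value can be inspected for interpretability."""

    def __init__(self, llm_dim: int, kg_dim: int, hidden_dim: int):
        super().__init__()
        # Project both sources into a shared space.
        self.llm_proj = nn.Linear(llm_dim, hidden_dim)
        self.kg_proj = nn.Linear(kg_dim, hidden_dim)
        # Gate network: decides, per token, how much of each context to use.
        self.gate = nn.Sequential(nn.Linear(2 * hidden_dim, 1), nn.Sigmoid())

    def forward(self, llm_emb: torch.Tensor, kg_emb: torch.Tensor):
        # llm_emb: (batch, seq, llm_dim), e.g. BERT hidden states
        # kg_emb:  (batch, seq, kg_dim),  e.g. ConceptNet Numberbatch vectors
        h_llm = self.llm_proj(llm_emb)
        h_kg = self.kg_proj(kg_emb)
        alpha = self.gate(torch.cat([h_llm, h_kg], dim=-1))  # (batch, seq, 1)
        # alpha near 1 -> LLM context dominates; near 0 -> knowledge context.
        fused = alpha * h_llm + (1.0 - alpha) * h_kg
        return fused, alpha  # expose alpha so the choice is auditable

# Usage with placeholder tensors (batch of 2, sequence length 16).
model = InterpretableEnsemble(llm_dim=768, kg_dim=300, hidden_dim=256)
llm_emb = torch.randn(2, 16, 768)
kg_emb = torch.randn(2, 16, 300)
fused, alpha = model(llm_emb, kg_emb)
print(fused.shape, alpha.shape)  # (2, 16, 256) and (2, 16, 1)
```

Returning `alpha` with the fused representation is what makes the sketch "interpretable by design" in the abstract's sense: the gate values can be read off per token and analyzed together with the model's parameters when outputs look inconsistent.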

Authors (5)
  1. Yuxin Zi (8 papers)
  2. Kaushik Roy (265 papers)
  3. Vignesh Narayanan (20 papers)
  4. Manas Gaur (59 papers)
  5. Amit Sheth (127 papers)
Citations (7)