Infusing Knowledge into Large Language Models with Contextual Prompts

Published 3 Mar 2024 in cs.CL (arXiv:2403.01481v1)

Abstract: Knowledge infusion is a promising method for enhancing LLMs for domain-specific NLP tasks, as an alternative to pre-training models from scratch on large datasets. These augmented LLMs typically depend on additional pre-training or on knowledge prompts derived from an existing knowledge graph, which is impractical in many applications. In contrast, knowledge infusion directly from relevant documents is more generalisable and alleviates the need for structured knowledge graphs, while also being useful for entities that are usually not found in any knowledge graph. With this motivation, we propose a simple yet generalisable approach for knowledge infusion by generating prompts from the context in the input text. Our experiments show the effectiveness of our approach, which we evaluate by probing the fine-tuned LLMs.
