
Do LLMs Dream of Ontologies?

Published 26 Jan 2024 in cs.CL and cs.AI | arXiv:2401.14931v2

Abstract: LLMs have demonstrated remarkable performance across diverse natural language processing tasks, yet their ability to memorize structured knowledge remains underexplored. In this paper, we investigate the extent to which general-purpose pre-trained LLMs retain and correctly reproduce concept identifier (ID)-label associations from publicly available ontologies. We conduct a systematic evaluation across multiple ontological resources, including the Gene Ontology, Uberon, Wikidata, and ICD-10, using LLMs such as Pythia-12B, Gemini-1.5-Flash, GPT-3.5, and GPT-4. Our findings reveal that only a small fraction of ontological concepts is accurately memorized, with GPT-4 demonstrating the highest performance. To understand why certain concepts are memorized more effectively than others, we analyze the relationship between memorization accuracy and concept popularity on the Web. Our results indicate a strong correlation between the frequency of a concept's occurrence online and the likelihood of accurately retrieving its ID from the label. This suggests that LLMs primarily acquire such knowledge through indirect textual exposure rather than directly from structured ontological resources. Furthermore, we introduce new metrics to quantify prediction invariance, demonstrating that the stability of model responses across variations in prompt language and temperature settings can serve as a proxy for estimating memorization robustness.


Summary

  • The paper shows that LLMs partially memorize ontological concepts, with higher retention for frequently mentioned terms on the web.
  • The paper introduces prediction-invariance metrics, validated through experiments on the Gene Ontology and Uberon.
  • The paper finds that error patterns retain syntactic proximity to correct entries, indicating structured domain knowledge learning.

Introduction

The capabilities of LLMs in text understanding and generation have garnered significant attention in both commercial and academic spheres. These models rely heavily on their expansive neural architectures, enabling memorization of substantial amounts of data encountered during training. A key inquiry pertains to the models' ability to retain and recall information from established ontologies, a critical aspect given the role ontologies play in structuring domain-specific knowledge.

Memorization Evaluation

Recent examinations of LLMs, such as GPT and Pythia-12B, seek to determine to what extent these models have internalized ontological concepts without additional training. Through a series of experiments leveraging the Gene Ontology and Uberon, it has been demonstrated that LLMs possess a partial repository of ontological knowledge. However, memorization of concepts is inconsistent and appears to be influenced by how frequently those concepts appear across the Web. The findings suggest that popular concepts are more likely to be memorized, indicating that LLMs likely learn such associations from textual material mentioning these concepts rather than from the ontological resources themselves.
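The core probe described above — ask the model for a concept's ID given its label and score exact matches — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `query_llm` callable and the prompt phrasing are assumptions.

```python
def memorization_accuracy(query_llm, pairs):
    """Fraction of (label, concept ID) pairs whose ID the model reproduces exactly.

    query_llm: callable mapping a prompt string to the model's raw text answer
               (hypothetical interface, e.g. a wrapper around an API client)
    pairs:     iterable of (label, concept_id) tuples,
               e.g. ("mitochondrion", "GO:0005739")
    """
    hits, total = 0, 0
    for label, concept_id in pairs:
        # Illustrative prompt; the paper's exact wording may differ.
        prompt = f"What is the ontology ID for the concept '{label}'? Answer with the ID only."
        answer = query_llm(prompt).strip()
        hits += int(answer == concept_id)  # exact-match scoring
        total += 1
    return hits / total if total else 0.0
```

In practice one would run this over thousands of concepts per ontology and stratify the accuracy by each concept's web popularity.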

Analysis of Error Patterns

An exploration of error patterns shows that the LLMs' incorrect predictions still exhibit syntactic similarity to the correct ontology entries: wrongly predicted concept IDs tend to maintain a measurable syntactic proximity to the correct IDs or their associated labels. This effect becomes more pronounced when the incorrect predictions concern concepts with a higher web presence.
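One simple way to quantify this kind of syntactic proximity is a normalized string-similarity ratio. The sketch below uses Python's standard-library `difflib`; this is an illustrative choice, and the paper's exact similarity measure may differ.

```python
from difflib import SequenceMatcher

def syntactic_proximity(predicted_id, correct_id):
    """Similarity ratio in [0, 1]; 1.0 means the strings are identical.

    SequenceMatcher.ratio() = 2 * (matching characters) / (total characters),
    so near-miss IDs that share most digits score close to 1.0.
    """
    return SequenceMatcher(None, predicted_id, correct_id).ratio()
```

Averaging this score over all wrong predictions, grouped by concept popularity, would reproduce the kind of analysis the section describes.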

Prediction Invariance as a Memorization Metric

The study introduces novel metrics to quantify an LLM's retention of ontological data, with particular emphasis on the uniformity of outputs across variable prompts. By varying prompt repetitions, query languages, and temperature settings, the authors find a distinct correlation between prediction invariance and concept frequency on the web. This correlation suggests that consistent model outputs, irrespective of prompt perturbations, can serve as strong indicators of concept memorization within LLMs.
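A plausible way to compute such an invariance score — an illustrative sketch, not necessarily the paper's exact metric — is to collect the model's answers to the same query under varied prompts, languages, or temperatures and measure agreement with the modal answer.

```python
from collections import Counter

def prediction_invariance(responses):
    """Fraction of repeated responses that agree with the most frequent answer.

    responses: list of answer strings from repeated queries of the same concept
               under perturbed conditions (prompt wording, language, temperature).
    Returns 1.0 if the model is fully invariant, approaching 1/len(responses)
    when every repetition produced a different answer.
    """
    counts = Counter(responses)
    return counts.most_common(1)[0][1] / len(responses)
```

Under the paper's hypothesis, concepts with high web frequency should yield scores near 1.0, while rare concepts should show unstable, low-invariance answers.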

Conclusion

LLMs exhibit partial memorization of established ontologies, with retention directly influenced by the prevalence of concepts in web-based materials. While these models have internalized a subset of ontological concepts, complete and uniform memorization remains unattained. The methodologies proposed and implemented in this research to measure ontological memorization may pave the way for future in-depth explorations of how LLMs interact with structured domain knowledge.
