Agent-OM: Leveraging LLM Agents for Ontology Matching

Published 1 Dec 2023 in cs.AI, cs.CL, and cs.IR | (2312.00326v10)

Abstract: Ontology matching (OM) enables semantic interoperability between different ontologies and resolves their conceptual heterogeneity by aligning related entities. OM systems currently have two prevailing design paradigms: conventional knowledge-based expert systems and newer machine learning-based predictive systems. While LLMs and LLM agents have revolutionised data engineering and have been applied creatively in many domains, their potential for OM remains underexplored. This study introduces a novel agent-powered LLM-based design paradigm for OM systems. With consideration of several specific challenges in leveraging LLM agents for OM, we propose a generic framework, namely Agent-OM (Agent for Ontology Matching), consisting of two Siamese agents for retrieval and matching, with a set of OM tools. Our framework is implemented in a proof-of-concept system. Evaluations of three Ontology Alignment Evaluation Initiative (OAEI) tracks over state-of-the-art OM systems show that our system can achieve results very close to the long-standing best performance on simple OM tasks and can significantly improve the performance on complex and few-shot OM tasks.

Authors (3)

Citations (5)

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a consolidated list of what remains missing, uncertain, or unexplored in the paper, phrased to guide actionable follow-up research.

Lack of support for complex OWL axioms: the framework only verbalizes triple-based relations; integration of axiom-level verbalization (e.g., class expressions, restrictions) and its impact on matching performance remains unstudied.
Relation types beyond equivalence are not handled: the system focuses on equivalence; subsumption, part-of, disjointness, and other alignment relations are not modeled, detected, or evaluated.
No global alignment coherence/repair: there is no logical consistency checking or repair of alignments using OWL reasoning; effects of adding repair on precision/recall are unknown.
Individuals and property alignment coverage unclear: handling of individuals, object vs datatype properties, and property reification is not detailed; coverage and accuracy on these element types are unassessed.
Mapping cardinality constraints not addressed: one-to-one vs many-to-many alignment strategies and conflict resolution are absent; current “intersection-only” merging may depress recall without analysis.
Dependence on prompt-based tools: all tools are prompt-driven; the benefits and drawbacks of replacing them with programmatic/algorithmic extractors (e.g., robust RDF parsers, reasoners, rule-based modules) are not evaluated.
LLM-based “graphical” verbalization quality unvalidated: the faithfulness and consistency of natural language verbalizations of triples (and their effect on downstream matching) are not measured.
Naming convention normalization assumptions: the approach relies on high-quality rdfs:label/comment to replace codes/URIs; behavior when labels are missing, noisy, ambiguous, or multilingual is not addressed.
Automatic context determination: the system relies on a manually specified context (e.g., “conference”); how to infer or learn context automatically from ontologies or metadata is not explored.
Sensitivity to hyperparameters is untested: similarity threshold (e.g., 0.8), top-k (e.g., 5), and RRF settings (e.g., choice of k) lack justification and sensitivity/robustness analysis.
Embedding/model dependence: only OpenAI embeddings (1536-d) and GPT-3.5 are used; comparisons with alternative embedding models (e.g., SBERT, E5, VertexAI), cross-encoders, and LLMs (GPT-4, Llama, Claude) are not reported.
No ablation studies: contributions of planning (CoT), shared memory, individual matchers (initial/lexical/graphical), RRF fusion, and validator are not disentangled via ablations.
Scalability and efficiency unquantified: latency, throughput, and memory footprint for large ontologies (e.g., with millions of entities) are not measured; no use or evaluation of ANN indexes (e.g., HNSW) for vector search.
Cost analysis absent: API call counts, token usage, embedding costs, and overall monetary cost per alignment are not documented; cost–accuracy trade-offs remain unclear.
Reliability of the LLM validator is unknown: the yes/no equivalence check lacks calibration, confidence estimation, majority voting, or entailment-based verification; false positive/negative rates are not analyzed.
Handling of ontology evolution: incremental updates, re-embedding strategies, and memory/database maintenance for changing ontologies are not addressed.
Robustness to sparse or noisy ontologies: performance when ontologies lack descriptive labels/comments or contain inconsistent annotations is not evaluated.
Generalization beyond three OAEI tracks: domain transferability (e.g., biomedical, geospatial, industrial ontologies), and performance on larger or more heterogeneous benchmarks are not demonstrated.
Cross-lingual OM not considered: matching across different natural languages and integration of multilingual embeddings/translation pipelines remain open issues.
Explainability and user trust: how to present match rationales, enable human verification, or support interactive correction is not discussed.
Privacy and compliance: reliance on proprietary APIs (OpenAI for LLMs and embeddings) raises data governance questions; on-prem or open-source alternatives and their performance are not studied.
Reproducibility and artifacts: complete prompt templates, agent configurations, and code/data release are unclear (artifact URL placeholder); reproducibility across LLM versions/providers is not assessed.
Parameterized candidate generation strategy: the impact of different candidate retrieval strategies (e.g., union vs intersection merging, dual-encoder vs cross-encoder reranking) on recall/precision is not explored.
Mathematical/formal completeness: the cosine similarity and RRF formulations are not rigorously parameterized or justified in the OM context; comparative evaluation against other fusion and similarity measures is missing.
Integration with symbolic reasoning: combining LLM-driven retrieval with OWL reasoners, rule systems, or graph algorithms for candidate generation/validation remains an open design and evaluation question.
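
Two of the gaps above, the choice of the RRF constant k and the "intersection-only" merging of candidates, can be made concrete with a small sketch. The code below is not the paper's implementation: the function names, the three toy rankings, and the entity names are hypothetical, and k=60 is only a commonly used default for reciprocal rank fusion, not a value taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(c) = sum over lists of 1 / (k + rank(c)).

    `rankings` is a list of ranked candidate lists (best first).
    `k` is the smoothing constant whose sensitivity is untested in the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate in enumerate(ranking, start=1):
            scores[candidate] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def merge_candidates(rankings, mode="intersection"):
    """Merge candidate sets from several matchers.

    "intersection" keeps only candidates proposed by every matcher
    (favours precision); "union" keeps any proposed candidate
    (favours recall).
    """
    sets = [set(ranking) for ranking in rankings]
    if not sets:
        return set()
    if mode == "intersection":
        return set.intersection(*sets)
    if mode == "union":
        return set.union(*sets)
    raise ValueError(f"unknown merge mode: {mode!r}")


# Hypothetical ranked candidates for one source entity from three matchers.
toy_rankings = [
    ["Paper", "Article", "Document"],
    ["Article", "Paper", "Review"],
    ["Paper", "Submission", "Article"],
]

if __name__ == "__main__":
    for name, score in reciprocal_rank_fusion(toy_rankings):
        print(f"{name}: {score:.5f}")
    print(sorted(merge_candidates(toy_rankings, "intersection")))
    print(sorted(merge_candidates(toy_rankings, "union")))
```

On this toy input, intersection merging keeps two candidates while union merging keeps five, which illustrates why intersection-only merging may depress recall. Rerunning the fusion over a range of k values is one simple way to start the missing sensitivity analysis.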

Glossary

AgreementMakerLight (AML): A knowledge-based ontology matching system that automates aligning entities across ontologies. "AgreementMakerLight (AML)"
Agent-OM: The paper’s proposed agent-powered, LLM-based framework for ontology matching, using retrieval and matching agents with shared memory and tools. "Agent-OM"
BERT: A transformer-based pre-trained language model often used to generate text embeddings for downstream tasks. "BERT"
BERTMap: An ontology matching system that leverages BERT to improve matching performance. "BERTMap"
chain of thought (CoT): A prompting technique that guides LLMs to plan or reason via step-by-step decomposition. "chain of thought (CoT)"
cosine similarity: A metric that measures the similarity between two vectors based on the cosine of the angle between them. "cosine similarity"
CRUD: The basic database operations—create, read, update, and delete—used for managing stored data. "CRUD (create, read, update, and delete)"
DeepOnto: A toolkit supporting ontology-related tasks, including verbalisation of logical axioms. "DeepOnto"
embedding model: A model that maps text into vector representations to enable similarity computations. "embedding model"
fine-tuning: Adapting a pre-trained model to a specific task or domain by further training on labeled examples. "fine-tuning"
few-shot OM tasks: Ontology matching scenarios with very few labeled examples for learning. "few-shot OM tasks"
hybrid database: A data storage design that combines a relational database with a vector database for metadata and embeddings. "hybrid database"
in-context learning (ICL): Supplying examples or context in the prompt so an LLM can perform a task without parameter updates. "in-context learning (ICL)"
knowledge bases (KBs): Structured repositories of facts used to provide general or domain-specific information. "knowledge bases (KBs)"
knowledge graph (KG): A graph-structured representation of entities and their relationships, often used for reasoning and retrieval. "knowledge graph (KG)"
LangChain: A library for building LLM-driven agents with planning, memory, and tool use. "LangChain"
LLMs: Large language models, i.e., large neural language models pre-trained on massive corpora and used for generation and reasoning. "LLMs"
LLM agents: Systems that use an LLM as a controller to plan, use tools, and manage memory for complex tasks. "LLM agents"
LogMap: A traditional knowledge-based ontology matching system known for precision and effectiveness. "LogMap"
LogMap-ML: A machine learning-augmented variant of LogMap that integrates predictive techniques. "LogMap-ML"
Matching EvaLuation Toolkit (MELT): A toolkit for evaluating and benchmarking ontology matching methods. "Matching EvaLuation Toolkit (MELT)"
Model as a Service: The paradigm of invoking a model as an external service rather than integrating or retraining it. "Model as a Service"
Ontology Alignment Evaluation Initiative (OAEI): A community benchmark and set of tracks for evaluating ontology matching systems. "Ontology Alignment Evaluation Initiative (OAEI)"
ontology engineering: The process of designing, building, and maintaining ontologies. "ontology engineering"
Ontology matching (OM): The task of finding correspondences between entities in different ontologies to achieve semantic interoperability. "Ontology matching (OM)"
OWL Verbaliser: A tool that converts OWL axioms into natural language descriptions. "OWL Verbaliser"
pgvector: A PostgreSQL extension that enables storage and similarity search over vector embeddings. "pgvector"
Prefix:Name: An ontology entity naming convention using a namespace prefix and a local name. "Prefix:Name"
reciprocal rank fusion (RRF): A method to combine multiple ranked lists by summing reciprocal ranks to produce a fused ranking. "reciprocal rank fusion (RRF)"
retrieval-augmented generation (RAG): A technique that augments LLM prompts with retrieved documents to ground responses. "retrieval-augmented generation (RAG)"
rdfs:comment: An RDF Schema property used to attach human-readable descriptions to resources. "rdfs:comment"
rdfs:label: An RDF Schema property used to attach a human-readable name to a resource. "rdfs:label"
Sentence-BERT: A model that produces sentence-level embeddings suitable for semantic similarity tasks. "Sentence-BERT"
Self-RAG: A method where an LLM self-evaluates and refines retrieval-augmented outputs to reduce hallucinations. "Self-RAG"
SelfCheckGPT: An approach where an LLM checks its own outputs for consistency to detect hallucinations. "SelfCheckGPT"
Siamese agents: A pair of coordinated agents that operate separately (e.g., retrieval and matching) but share memory. "Siamese agents"
similarity search: Retrieving items most similar to a query in embedding space, typically via vector distance measures. "similarity search"
Sydney OWL Syntax: A controlled natural language/syntax for representing OWL axioms in a readable form. "Sydney OWL Syntax"
URI#Name: An ontology entity naming pattern using a full URI with a fragment identifier. "URI#Name"
vector database: A database that stores and indexes vector embeddings to support efficient similarity queries. "vector database"
VersaMatch: A machine learning-based ontology matching system. "VersaMatch"
Wikidata: A collaboratively edited knowledge base often used as an external source for general meanings. "Wikidata"
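
As an illustration of the "cosine similarity" and "similarity search" entries above, the following minimal sketch ranks candidate entities by cosine similarity to a query embedding and keeps the top-k above a threshold. It is an assumption-laden toy, not the paper's code: the function names and the 3-dimensional vectors are hypothetical (real embeddings are much larger), and the defaults of 0.8 and 5 simply echo the example values cited in the knowledge-gaps list.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def top_k_similar(query, candidates, k=5, threshold=0.8):
    """Return up to k (name, score) pairs with score >= threshold.

    `candidates` maps an entity name to its embedding vector.
    Results are sorted by descending similarity.
    """
    scored = [
        (name, cosine_similarity(query, vector))
        for name, vector in candidates.items()
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:k]


# Hypothetical toy embeddings for target-ontology entities.
toy_candidates = {
    "Paper": [1.0, 0.0, 0.0],
    "Article": [0.9, 0.1, 0.0],
    "Person": [0.0, 1.0, 0.0],
}
toy_query = [1.0, 0.0, 0.0]

if __name__ == "__main__":
    for name, score in top_k_similar(toy_query, toy_candidates):
        print(f"{name}: {score:.4f}")
```

A vector database such as pgvector performs the same kind of ranking with indexes rather than the exhaustive scan used in this sketch.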
