OntoEKG: LLM-Driven Enterprise Ontology
- OntoEKG is an LLM-driven pipeline that extracts candidate ontological elements from unstructured enterprise text and organizes them hierarchically via entailment inference.
- The system uses a two-stage process with a structured extraction module (via a Pydantic schema) followed by subsumption-based hierarchical structuring, culminating in RDF/OWL ontology serialization.
- Evaluated on domain-specific datasets, OntoEKG establishes benchmarks using fuzzy-match F1 metrics while highlighting challenges in abstraction, hierarchical reasoning, and logical consistency.
OntoEKG is an LLM-driven pipeline for the automated construction of domain-specific ontologies from unstructured enterprise text. It addresses the challenge of resource-intensive manual ontology modeling for Enterprise Knowledge Graphs by decomposing the task into two distinct LLM-mediated phases: extraction of candidate ontological elements and entailment-based hierarchical structuring. The pipeline culminates in an RDF/OWL ontology serialization. OntoEKG is evaluated on newly introduced, domain-specific datasets and establishes a benchmark for end-to-end ontology construction, highlighting both the promise and limitations of LLM-based approaches (Oyewale et al., 1 Feb 2026).
1. System Architecture and Workflow
OntoEKG consists of two primary modules, each serving a distinct ontological modeling function:
1. Extraction Module: This module ingests free-form, unstructured enterprise text and, using a strongly typed Pydantic schema as a system prompt, constrains the LLM's output to valid JSON objects delineating 'classes' (entity types) and 'object properties' (binary relations). Each output includes, respectively, labels and textual descriptions for classes, and labels, domains, and ranges for properties. Zero-shot LLM invocations scan the corpus in overlapping windows, extracting all conceptual candidate types and binary relationships.
2. Entailment Module: The set of extracted classes is logically structured through automated subsumption inference. For every distinct pair of classes (C_i, C_j), the system queries a second LLM: "Based on these textual definitions, does every instance of C_i also satisfy the definition of C_j?" Affirmative answers yield candidate subclass triples (C_i, rdfs:subClassOf, C_j), i.e., C_i ⊑ C_j. The resulting set of subclass assertions is closed under reflexivity and transitivity, producing a logically coherent (though not necessarily cycle-free) subclass hierarchy.
3. RDF/OWL Serialization: The merged results—classes, properties, their domains/ranges, and all subclass links—are programmatically mapped to standard RDF using rdflib and emitted in Turtle syntax.
2. Extraction Module: Methods and Constraints
OntoEKG's extraction mechanism enforces output structure via a Pydantic-based system prompt, mandating compliance with a JSON schema:
```
Class    { label: string; description: string }
Property { label: string; domain: string; range: string }
```
Zero-shot LLM calls (e.g., Gemini 3 Flash) sweep the input text, extracting mentions of entity types and binary relations between concepts. There is no explicit downstream frequency-based scoring at the extraction stage; instead, the schema constraint and the LLM’s internal confidence filter potential false positives. The extraction output is a structured JSON array of all classes and properties considered for subsequent hierarchical structuring.
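A minimal Pydantic sketch of such a schema is shown below. The class and field names here are illustrative, not OntoEKG's actual definitions, and the sample JSON is a toy stand-in for an LLM response:

```python
import json
from pydantic import BaseModel

class OntologyClass(BaseModel):
    """A candidate class: a named entity type with a textual definition."""
    label: str
    description: str

class ObjectProperty(BaseModel):
    """A candidate binary relation between two class labels."""
    label: str
    domain: str
    range: str

class ExtractionResult(BaseModel):
    """Top-level container the LLM is constrained to emit as JSON."""
    classes: list[OntologyClass]
    properties: list[ObjectProperty]

# Validating a (toy) LLM response against the schema; malformed output
# would raise a ValidationError instead of silently passing downstream.
raw = (
    '{"classes": [{"label": "Invoice", "description": "A billing document."}],'
    ' "properties": [{"label": "issuedBy", "domain": "Invoice",'
    ' "range": "Organization"}]}'
)
result = ExtractionResult(**json.loads(raw))
```

Rejecting schema-invalid output at parse time is what lets the pipeline treat the LLM as a structured extractor rather than a free-text generator.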
Evaluation of extraction performance applies fuzzy matching: predicted triples are aligned to gold standards using cosine similarity in a sentence-embedding space, with a domain-specific similarity threshold for each of the Data, Finance, and Logistics corpora.
3. Entailment Module: Hierarchical Structuring and Inference
The entailment phase considers the set of extracted classes and determines subsumption via LLM-based pairwise entailment checks. The binary relation ⊑ ("is subclass of") is defined by:

C_i ⊑ C_j ⟺ every instance of C_i is also an instance of C_j.

For each distinct pair (C_i, C_j) (with i ≠ j), the Hierarchy Construction LLM is prompted as described above. Subsumption pairs confirmed by the LLM are added to the relation set R. Logical closure is enforced:
- Reflexivity: C ⊑ C for every class C
- Transitivity: if A ⊑ B and B ⊑ C, then A ⊑ C
Pseudocode:
```
Input: classes C, descriptions D
R = {}                                   # subsumption relations
for each (c_i, c_j) in C×C where i ≠ j:
    answer = EntailmentLLM("Is every '" + c_i + "' instance also a '" + c_j + "'?")
    if answer == "Yes":
        R.add((c_i, c_j))
for each c in C:                         # reflexive closure
    R.add((c, c))
repeat until R stops growing:            # transitive closure (fixpoint)
    for (a, b) in R:
        for (b, c) in R:
            R.add((a, c))
return R
```
This process produces a flat set of rdfs:subClassOf links, which are mapped directly into the final ontology.
4. RDF Serialization: Mapping and Implementation
OntoEKG serializes the result of its extraction and entailment phases into a standard OWL/RDF ontology, implemented via rdflib with Turtle output. The mapping functions are as follows:
| Source Element | RDF Triple |
|---|---|
| Class `C` | `(C, rdf:type, owl:Class)` |
| Property `P` | `(P, rdf:type, owl:ObjectProperty)` |
| Domain of `P` | `(P, rdfs:domain, D)` |
| Range of `P` | `(P, rdfs:range, R)` |
| Subsumption `C_i ⊑ C_j` | `(C_i, rdfs:subClassOf, C_j)` |
Each triple directly reflects the LLM-extracted or deduced structure, with no post-hoc logical validation or cycle breaking.
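The mapping can be illustrated with a minimal, dependency-free sketch that emits Turtle lines directly (OntoEKG itself uses rdflib; the `ex:` namespace and sample data below are hypothetical):

```python
def to_turtle(classes, properties, subclass_links):
    """Emit Turtle triples for classes, object properties, and subclass links."""
    lines = [
        "@prefix ex: <http://example.org/onto#> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    ]
    # Class -> (C, rdf:type, owl:Class), description kept as rdfs:comment
    for c in classes:
        lines.append(f'ex:{c["label"]} a owl:Class ; rdfs:comment "{c["description"]}" .')
    # Property -> owl:ObjectProperty with rdfs:domain / rdfs:range
    for p in properties:
        lines.append(
            f'ex:{p["label"]} a owl:ObjectProperty ; '
            f'rdfs:domain ex:{p["domain"]} ; rdfs:range ex:{p["range"]} .'
        )
    # Subsumption -> rdfs:subClassOf
    for sub, sup in subclass_links:
        lines.append(f"ex:{sub} rdfs:subClassOf ex:{sup} .")
    return "\n".join(lines)

ttl = to_turtle(
    [{"label": "Invoice", "description": "A billing document."}],
    [{"label": "issuedBy", "domain": "Invoice", "range": "Organization"}],
    [("Invoice", "Document")],
)
```

In the real pipeline, rdflib handles prefix binding, literal escaping, and serialization; the point here is only the one-to-one shape of the mapping table.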
5. Evaluation Datasets and Performance Metrics
OntoEKG is evaluated using three corpora of unstructured policy documents from Data, Finance, and Logistics domains. Each dataset includes a human-curated "gold" ontology comprising classes, properties, and subsumption hierarchy.
Performance is assessed using fuzzy-match F1: a predicted element counts as a true positive (TP) when its embedding similarity to a gold element exceeds the domain-specific threshold; unmatched predictions are false positives (FP) and unmatched gold elements are false negatives (FN). The standard definitions apply:

Precision = TP / (TP + FP), Recall = TP / (TP + FN), F1 = 2 · Precision · Recall / (Precision + Recall)
Reported fuzzy-match precision, recall, and F1 are highest for the Data domain under its domain-specific similarity threshold.
Exact-match F1 (requiring textual identity rather than embedding proximity) is substantially lower: Data (0.102), Finance (0.000), Logistics (0.048). This indicates extracted triples are frequently near-correct semantically but not lexically.
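One plausible realization of the fuzzy-matching procedure is a greedy one-to-one alignment; the paper does not specify the exact alignment algorithm, and `toy_sim` below stands in for cosine similarity over sentence embeddings:

```python
def fuzzy_prf(predicted, gold, sim, threshold):
    """Greedy fuzzy matching: each gold item can satisfy at most one prediction."""
    unmatched = list(gold)
    tp = 0
    for p in predicted:
        best = max(unmatched, key=lambda g: sim(p, g), default=None)
        if best is not None and sim(p, best) >= threshold:
            tp += 1
            unmatched.remove(best)   # a gold item cannot be matched twice
    fp = len(predicted) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1

# Toy similarity: 1.0 for identical strings, 0.9 if first words agree, else 0.0
def toy_sim(a, b):
    if a == b:
        return 1.0
    return 0.9 if a.split()[0] == b.split()[0] else 0.0
```

With a threshold of 0.8, "invoice record" would match the gold item "invoice document" (near-correct semantically, not lexically), which is exactly the gap between fuzzy-match and exact-match F1 noted above.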
6. Empirical Findings and System Limitations
Performance varies by domain, with the Data corpus yielding higher fuzzy F1 than Logistics and Finance. This pattern aligns with the relative clarity and consistency of domain terminology and scope.
Limitations include:
- Scope Definition: LLMs may omit borderline concepts or include out-of-scope terms due to implicit or ambiguous boundaries in unstructured text.
- Abstraction Level: The extraction module is occasionally prone to outputting individuals (instances) instead of classes (types), a mismatch for T-Box schema requirements.
- Hierarchical Reasoning: Entailment LLMs sometimes invert the direction of subsumption and may generate inconsistent cycles (e.g., asserting both A ⊑ B and B ⊑ A).
- Evaluation Bias: The reliance on embedding similarity for fuzzy matching raises questions about alignment with real-world ontological correctness when deviations are paraphrastic rather than structural.
A plausible implication is that while LLM-driven extraction provides substantial semantic coverage, enforcing ontological coherence and abstraction remains an open challenge.
7. Significance, Impact, and Open Challenges
OntoEKG's two-stage LLM-centric design is a first demonstrator of end-to-end, T-Box-focused ontology construction from enterprise text, leveraging strong output scaffolding for extraction and entailment. The pipeline architecture, together with the introduced multi-domain evaluation benchmark, constitutes an advance for automated ontology generation, and creates opportunities for benchmarking and comparison (Oyewale et al., 1 Feb 2026).
Notable open research challenges highlighted by OntoEKG include:
- Prompt Engineering for Scope: Refining prompt or model architectures to reliably enforce domain boundaries and abstraction granularity.
- Integration of Named Individuals: Extending the architecture to handle A-Box assertions (individuals) and provenance metadata.
- Logical Consistency: Incorporating symbolic validation or cycle detection to reinforce subclass graph correctness post-hoc.
- Ontology Evolution: Enabling incremental, context-aware evolution as new textual input arrives.
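The cycle-detection step suggested above could be realized as a standard depth-first search over the subclass graph; this is a sketch under that assumption, not OntoEKG's implementation:

```python
from collections import defaultdict

def has_subclass_cycle(links):
    """Return True if the rdfs:subClassOf graph contains a non-reflexive cycle."""
    graph = defaultdict(list)
    nodes = set()
    for sub, sup in links:
        nodes.update((sub, sup))
        if sub != sup:  # reflexive links C ⊑ C are harmless; ignore them
            graph[sub].append(sup)

    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def dfs(n):
        color[n] = GREY            # n is on the current DFS path
        for m in graph[n]:
            if color[m] == GREY:   # back edge to the current path: cycle
                return True
            if color[m] == WHITE and dfs(m):
                return True
        color[n] = BLACK           # fully explored, no cycle through n
        return False

    return any(color[n] == WHITE and dfs(n) for n in nodes)
```

Running such a check before serialization would catch the inverted-subsumption cycles described in Section 6 (e.g., A ⊑ B together with B ⊑ A) before they reach the emitted ontology.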
This suggests the next phase of research will focus on integrating LLM-based generative capabilities with formal reasoning and curation workflows, aiming for mature deployment in enterprise semantic governance.