SimX-HCOME+: Hybrid LLM Ontology Engineering
- SimX-HCOME+ is a hybrid ontology engineering framework that integrates iterative LLM generation with simulated expert feedback to improve concept completeness and domain relevance.
- It employs structured micro-iterations, combining LLM-generated ontology drafts with systematic validation by simulated Knowledge Worker, Domain Expert, and Knowledge Engineer roles.
- The methodology also transforms natural-language rules into executable SWRL artifacts and achieves measurable F1 improvements over static and one-shot approaches.
SimX-HCOME+ (Simulated eXtended Human-Centered Ontology Engineering Methodology, enhanced) is a hybrid, LLM-driven methodology for ontology engineering that integrates structured, iterative human supervision with the natural-language and reasoning capabilities of LLMs. It systematically addresses challenges in automated and collaborative ontology construction by layering micro-iterations of LLM generation with simulated roles for concept completeness, domain relevance, and formal correctness. Originally applied in the domain of Parkinson’s disease (PD) monitoring and alerting, the SimX-HCOME+ framework aims to bridge gaps in LLM-generated ontologies by emulating collaborative expert workflows and supporting executable rule derivation alongside traditional ontology structures (Bouchouras et al., 16 Dec 2025).
1. Conceptual Foundation and Objectives
SimX-HCOME+ is predicated on three key objectives:
- Exploiting LLMs for expansive, context-rich knowledge extraction from source data and prompts.
- Maintaining domain and logical fidelity via continuous, structured human input simulated through three differentiated roles: Knowledge Worker (KW), Domain Expert (DE), and Knowledge Engineer (KE).
- Supporting not only taxonomy (classes) and relations (object properties) but also systematic conversion of natural-language domain rules into actionable SWRL (Semantic Web Rule Language) artifacts.
At every cycle, LLMs propose candidate ontology components, which are validated and enhanced by these three simulated human agents, yielding a dynamic, feedback-rich development process. Outputs are generated after each micro-iteration, enabling incremental validation and early detection of logical or semantic gaps.
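For illustration, the three simulated roles can be realized as distinct review prompts issued to the same LLM. The wording below is a hypothetical Python sketch of such role prompts, not the prompt text from the original study.

```python
# Hypothetical review prompts for the three simulated roles; the exact wording
# used in the original study may differ.
SIMULATED_ROLES = {
    "Knowledge Worker": (
        "Review the draft ontology for concept completeness: list domain concepts "
        "implied by the aim, scope, and competency questions that are still missing."
    ),
    "Domain Expert": (
        "Review the draft ontology for accuracy and relevance to Parkinson's disease "
        "monitoring and alerting: flag incorrect or clinically irrelevant concepts."
    ),
    "Knowledge Engineer": (
        "Review the draft ontology for logical consistency and modeling best practices: "
        "flag redundant classes, misplaced subclass axioms, and malformed properties."
    ),
}
```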
2. Workflow, Algorithm, and Procedural Architecture
The SimX-HCOME+ process is organized as an iterative loop, which can be summarized by the following formalized pseudocode:
```latex
\begin{algorithm}[H]
\caption{SimX-HCOME⁺ Ontology Engineering Loop}
\begin{algorithmic}[1]
\State \textbf{Input:} Aim\_Scope, CompetencyQuestions, SourceData, NL\_Rules
\State \textbf{Output:} FinalOntology
\State Initialize Ontology $\gets \emptyset$
\State Set iteration $\gets 1$
\Repeat
    \State NewConcepts $\gets$ \textbf{LLM\_Propose}(Aim\_Scope, CompetencyQuestions, SourceData, Ontology) \Comment{candidate classes and properties}
    \State NewRules $\gets$ \textbf{LLM\_TranslateToSWRL}(NL\_Rules, Ontology)
    \State KW\_Feedback $\gets$ \textbf{KW\_Review}(NewConcepts) \Comment{concept completeness}
    \State DE\_Feedback $\gets$ \textbf{DE\_Review}(NewConcepts, NewRules) \Comment{accuracy and domain relevance}
    \State KE\_Feedback $\gets$ \textbf{KE\_Review}(NewConcepts, NewRules) \Comment{consistency and best practices}
    \State NewConcepts, NewRules $\gets$ \textbf{LLM\_Revise}(NewConcepts, NewRules, KW\_Feedback, DE\_Feedback, KE\_Feedback)
    \State iteration $\gets$ iteration $+ 1$
    \State Ontology $\gets$ Ontology $\cup$ \{\,NewConcepts, NewRules\,\}
    \State \textbf{validate}(Ontology) \Comment{Pellet consistency, OOPS! pitfalls}
\Until{convergence or iteration $>$ max\_iterations}
\State \textbf{return} Ontology
\end{algorithmic}
\end{algorithm}
```
Simulated human feedback is injected at each iteration, with specialized focus: the KW attends to concept completeness, the DE to accuracy and domain relevance, and the KE to consistency and modeling best practices. The LLM not only incorporates these interventions into class and property definitions but also transforms natural-language rules into SWRL, yielding executable semantic constraints.
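To make the loop concrete, a minimal Python sketch is given below. The `llm` object, its methods, and the `validate`/`converged` callables are hypothetical stand-ins for the LLM prompts, simulated-role reviews, and Pellet/OOPS! checks described above, not an implementation from the original work.

```python
# Minimal sketch of the SimX-HCOME+ loop. The `llm` object and the helper
# callables are hypothetical stand-ins for LLM prompts, simulated-role reviews,
# and Pellet/OOPS! validation; they are not part of the published framework.

def simx_hcome_plus(aim_scope, competency_questions, source_data, nl_rules,
                    llm, validate, converged, max_iterations=5):
    ontology = {"classes": set(), "properties": set(), "swrl_rules": set()}

    for iteration in range(1, max_iterations + 1):
        # LLM proposes candidate classes/properties and translates NL rules to SWRL.
        draft = llm.propose(aim_scope, competency_questions, source_data, ontology)
        rules = llm.translate_to_swrl(nl_rules, ontology)

        # Simulated roles review the draft from their respective perspectives.
        feedback = [
            llm.review(draft, rules, role="Knowledge Worker"),    # concept completeness
            llm.review(draft, rules, role="Domain Expert"),       # accuracy, relevance
            llm.review(draft, rules, role="Knowledge Engineer"),  # consistency, best practice
        ]

        # LLM revises the draft in light of the feedback; the ontology is then updated.
        draft, rules = llm.revise(draft, rules, feedback)
        ontology["classes"] |= draft["classes"]
        ontology["properties"] |= draft["properties"]
        ontology["swrl_rules"] |= set(rules)

        # Validate every micro-iteration (Pellet consistency, OOPS! pitfall scan).
        validate(ontology)
        if converged(ontology, draft, rules):
            break

    return ontology
```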
3. LLM Prompt Engineering and Micro-Iteration
Prompting in SimX-HCOME+ is structured in a multi-stage, conversational manner. Each micro-iteration comprises:
- Ontology Sketching: the LLM generates a class-property skeleton from the aim, scope, and competency questions, typically in Turtle syntax.
- Detail Enrichment: the LLM refines the output by integrating domain-specific source data, including observational data, event patterns, and state modeling.
- SWRL Rule Conversion: natural-language (NL) rules, e.g., rules about bradykinesia detection in PD, are translated into precise SWRL expressions.
This “micro-Chain-of-Thought” (micro-CoT) prompt chaining ensures every refinement by the LLM is immediately and systematically contextualized by both data and prior feedback, differentiating SimX-HCOME+ from both one-shot and basic CoT approaches (Bouchouras et al., 16 Dec 2025).
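The chain can be sketched as three dependent LLM calls, each conditioned on the previous output. In the hypothetical Python sketch below, `call_llm` stands in for any chat-completion client, and the prompt wording, ontology terms, and example SWRL target are illustrative rather than taken from the original study.

```python
# Schematic micro-iteration prompt chain. call_llm() stands in for any
# chat-completion client; prompt wording and ontology terms are illustrative.
def micro_iteration(call_llm, aim_scope, competency_questions, source_data,
                    nl_rules, prior_feedback=""):
    # Stage 1: ontology sketching -- class/property skeleton in Turtle.
    sketch = call_llm(
        f"Aim and scope: {aim_scope}\n"
        f"Competency questions: {competency_questions}\n"
        f"Prior feedback: {prior_feedback}\n"
        "Produce a class and object-property skeleton for this ontology in Turtle."
    )

    # Stage 2: detail enrichment -- fold in observations, event patterns, states.
    enriched = call_llm(
        f"Current draft (Turtle):\n{sketch}\n"
        f"Source data:\n{source_data}\n"
        "Extend the draft with classes and properties covering the observational "
        "data, event patterns, and states described in the source data."
    )

    # Stage 3: NL -> SWRL conversion, e.g. a bradykinesia-detection rule.
    swrl_rules = call_llm(
        f"Current draft (Turtle):\n{enriched}\n"
        f"Natural-language rules:\n{nl_rules}\n"
        "Translate each rule into SWRL using only terms from the draft, e.g.:\n"
        "Patient(?p) ^ hasBradykinesiaScore(?p, ?s) ^ swrlb:greaterThan(?s, 2) "
        "-> triggersAlert(?p, BradykinesiaAlert)"
    )
    return enriched, swrl_rules
```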
4. Quantitative Evaluation and Comparative Performance
Ontology construction results are benchmarked against a domain gold standard comprising 41 classes, using standard metrics:
- Precision: $\mathrm{Precision} = \dfrac{TP}{TP + FP}$
- Recall: $\mathrm{Recall} = \dfrac{TP}{TP + FN}$
- F1: $F_1 = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
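As a quick sanity check, these metrics can be recomputed from the TP/FP/FN counts reported in the table below; a minimal sketch:

```python
# Recompute precision, recall, and F1 from TP/FP/FN counts (cf. the table below).
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: the Gemini row (TP=15, FP=7, FN=26).
print([f"{100 * m:.1f}%" for m in prf1(15, 7, 26)])
# ['68.2%', '36.6%', '47.6%'] -- the table rounds these to 68 %, 36 %, 48 %.
```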
Comparative class-extraction results against the 41-class gold standard:
| Method | #Classes | TP | FP | FN | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| ChatGPT-4 | 17 | 9 | 8 | 32 | 52 % | 21 % | 31 % |
| ChatGPT-3.5 | 21 | 14 | 7 | 27 | 66 % | 34 % | 45 % |
| Gemini | 22 | 15 | 7 | 26 | 68 % | 36 % | 48 % |
| Claude | 24 | 12 | 12 | 29 | 50 % | 29 % | 37 % |
X-HCOME (Gemini) achieves F1 = 42 %, while SimX-HCOME+ with Gemini attains 48 %, exhibiting measurable gains over both one-shot and static hybrid methodologies.
For NL→SWRL translation tasks, performance remains limited (F1 ≤ 20 %), with Claude achieving the best logical-atom match (F1 = 20 %). This suggests significant headroom for further prompt engineering and model tuning.
5. Qualitative Analysis, Error Sources, and Limitations
Ontologies produced with SimX-HCOME+ are consistently syntactically valid (as verified with Pellet and OOPS!), with only minor errors (e.g., rare Gemini anomalies); a minimal consistency-check sketch follows the list below. Notable improvements include:
- Comprehensiveness: LLMs, aided by iterative prompting, can introduce relevant novel classes (e.g., subclassification of alerts) absent from initial expert ontologies, sometimes exceeding X-HCOME recall.
- Property Modeling: Remains suboptimal, with F1 for properties typically below 15 %, indicating LLMs’ current limitations in relation modeling.
- SWRL Rule Translation: LLMs correctly identify few logical atoms in NL rules, confirming that this remains a bottleneck.
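For the Pellet-based part of this validation, one option is owlready2's bundled Pellet reasoner; the sketch below assumes the draft has been exported to an OWL file (the file name is illustrative), while OOPS! pitfall scanning is performed separately through its web service.

```python
# Consistency check of an exported ontology draft with owlready2's bundled Pellet.
from owlready2 import default_world, get_ontology, sync_reasoner_pellet

onto = get_ontology("file://pd_monitoring.owl").load()  # illustrative file name
with onto:
    # Raises OwlReadyInconsistentOntologyError if the ontology is globally inconsistent.
    sync_reasoner_pellet(infer_property_values=True)

# Unsatisfiable classes end up equivalent to owl:Nothing and are listed here.
print("Unsatisfiable classes:", list(default_world.inconsistent_classes()))
```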
Key limitations:
- Residual LLM hallucinations and a tendency to repeat mistakes across generation cycles.
- Under-specification in modeling data properties, complex axioms, and advanced constraints.
- Artificial “simulation” of human roles does not capture the cognitive and workflow variability of real-world collaborative ontology teams.
6. Future Directions and Applications
The SimX-HCOME+ paradigm points to several avenues for advancement (Bouchouras et al., 16 Dec 2025):
- Training domain-adapted LLMs (“Ontology-GPT”) with fine-tuned capacity for SWRL and complex property/axiom modeling.
- Extension of the iterative loop to address data property extraction and constraint specification.
- Investigation of asynchronous, multi-expert feedback mechanisms for more realistic simulation of expert teams.
- Empirical analysis of human-effort reduction, cost, and quality trade-offs under varying levels of supervision, and adaptation to industrial or additional healthcare domains.
A plausible implication is that hybrid micro-iteration workflows, like SimX-HCOME+, will be central to future state-of-the-art LLM ontology engineering pipelines, especially where domain accuracy and formal correctness are required.
7. Comparative Context: Systematization in Hybrid Engineering
Compared to sequential or batch LLM prompting (e.g., one-shot and CoT), SimX-HCOME+ establishes a role-driven, continuous loop for ontology refinement. In contrast to X-HCOME—which integrates human interventions at discrete, predefined stages—SimX-HCOME+ operationalizes feedback at each micro-iteration, thereby reducing error propagation and enhancing the granularity of both correction and enrichment. This methodology outlines a blueprint for integrating advanced human-AI collaborative practices into formal knowledge engineering and is supported by empirical evidence from its application in the PD monitoring and alerting domain (Bouchouras et al., 16 Dec 2025).