Conversational Ontology Expansion
- Conversational Ontology Expansion is the systematic process of extending and refining dialogue ontologies by incorporating novel intents, slots, and slot values.
- Techniques include zero-shot induction, neural clustering, and LLM-driven constrained Chain-of-Thought decoding, achieving high accuracy and improved entity extraction.
- Integrating human-in-the-loop workflows and domain-specific protocols enhances the adaptability, transparency, and efficiency of conversational AI systems.
Conversational ontology expansion is the systematic process of extending, refining, or adapting the ontological structures that undergird conversational agents and dialogue systems. This involves not only the discovery and formalization of new intents, slots, domains, or values relevant to dialogue, but also the operationalization of qualitative conversational features into quantitative or logical forms. Techniques in this domain span zero-shot ontology induction from raw transcripts, LLM-guided term and relation extraction, decision logic formalization, and end-to-end conversational workflows for ontology engineering. The goal is to equip conversational AI with enhanced adaptability, transparency, and controllability in dynamic, real-world settings.
1. Formal Foundations and Problem Scope
Conversational ontology expansion (OnExp) is formalized as the augmentation of a base dialogue ontology O = (N, E), where N encodes intents (I), slots (S), and slot values (V), and E denotes typed edges (e.g., "intent→slot", "slot→value"), by discovering and incorporating previously unknown items N_new. Given labeled data D_L and unlabeled utterances D_U, the expansion operator yields an expanded ontology O' ⊇ O, typically via a mapping f: D_U → N ∪ N_new, where the resulting intent, slot, and value sets each contain both known and novel ontology items (Liang et al., 2024).
Three core OnExp settings are delineated:
- New Intent Discovery (NID): identification of novel intent classes
- New Slot-Value Discovery (NSVD): detection of new slots/values
- Joint OnExp: simultaneous assignment of intent, slot, and value triples.
Evaluation uses metrics such as accuracy (Hungarian alignment), Adjusted Rand Index (ARI), Normalized Mutual Information, and entity span-F1 (Liang et al., 2024).
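The Hungarian-alignment accuracy metric maps each discovered cluster to a gold intent label so as to maximize agreement, then scores as plain accuracy. A minimal sketch, using exhaustive permutation search as a stand-in for the Hungarian algorithm (equivalent for small label sets):

```python
from itertools import permutations

def hungarian_accuracy(y_true, y_pred):
    """Clustering accuracy: find the cluster-to-label mapping that maximizes
    agreement with the gold labels, then score as plain accuracy."""
    labels = sorted(set(y_true) | set(y_pred))
    best = 0
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))
        best = max(best, sum(mapping[p] == t for t, p in zip(y_true, y_pred)))
    return best / len(y_true)

# Predicted clusters are a pure relabeling of the gold intents, so accuracy is 1.0.
print(hungarian_accuracy([0, 0, 1, 1, 2], [2, 2, 0, 0, 1]))  # → 1.0
```

At realistic label-set sizes the permutation search is replaced by the O(n³) Hungarian algorithm (e.g., `scipy.optimize.linear_sum_assignment`) on the cluster-label co-occurrence matrix.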
2. Methods for Conversational Ontology Expansion
2.1. Unsupervised and Zero-Shot Methods
Zero-shot approaches do not require annotated ontologies or labels. The WOAH (Weighted Ontology Approximation Heuristic) framework exemplifies this category, extracting high-level intent (verb-driven) and entity (noun-driven) types from raw dialogue by leveraging dependency parsing, TF-IDF filtering, Gini sparsity metrics, and cosine similarity clustering. The pipeline produces intent-entity mappings parameterized for coarse or fine granularity via similarity thresholds and prototype selection (Buyo, 2017). It constructs distinct intent and entity matrices, enabling configurable abstraction levels.
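A toy sketch of the verb-driven/noun-driven split and the resulting intent-entity co-occurrence matrix; the hand-written verb/noun lexicons stand in for dependency parsing, and the TF-IDF/Gini weighting and clustering stages are omitted:

```python
from collections import Counter

# Tiny verb/noun lexicons standing in for dependency parsing
# (the actual WOAH pipeline derives these from parse trees).
VERBS = {"book", "cancel", "find"}
NOUNS = {"flight", "hotel", "room", "ticket"}

def verb_noun_pairs(utterance):
    tokens = utterance.lower().split()
    verbs = [t for t in tokens if t in VERBS]
    nouns = [t for t in tokens if t in NOUNS]
    return [(v, n) for v in verbs for n in nouns]

def intent_entity_matrix(corpus):
    """Count intent (verb) / entity (noun) co-occurrences; WOAH further
    weights such counts by TF-IDF and clusters them by cosine similarity."""
    return Counter(pair for utt in corpus for pair in verb_noun_pairs(utt))

corpus = ["book a flight to Paris", "cancel my hotel room", "book a hotel"]
matrix = intent_entity_matrix(corpus)
print(matrix[("book", "flight")])  # → 1
```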
2.2. Neural and Clustering-Based OnExp
Neural cluster-based OnExp methods assign utterances to either known or new classes, using encoding architectures such as Deep Embedded Clustering (DEC), Deep Clustering Network (DCN), and contrastive or prototypical losses (SCCL, IDAs). These approaches are robust to high-dimensional dialogue data, enabling both unsupervised and semi-supervised NID with accuracy on benchmarks like CLINC150 up to 94.93% for semi-supervised methods (Liang et al., 2024).
Zero-shot NID exploits RNN-based capsule networks (IntentCapsNet), transformer encoders (LABAN, PIE), and LLM prompting. Performance for novel intent F1 typically ranges from 60–70% in zero-shot settings.
For slot-value discovery, unsupervised methods (DistFrame-Sem, Inter-Slot, Merge-Select) apply frame-semantic parsing and graph clustering, while partially supervised approaches leverage sequence tagging, adapter modules, or prototypical contrastive learning (HiCL, GZPL) depending on what prior knowledge is assumed (Liang et al., 2024).
2.3. LLM-Driven and Dialogue-Based Expansion
LLMs are directly leveraged for ontology construction and relation extraction using tailored prompting strategies. Constrained Chain-of-Thought (CoT) decoding is a prominent methodology: LLMs generate multi-branch outputs enumerating candidate ontology triplets, with hard constraints restricting head and tail tokens to known term sets and relation labels. Confidence is scored via the mean log-probability disparity between the top-ranked and runner-up tokens along each branch; only the branch with maximal confidence is retained (Vukovic et al., 2024). This reduces hallucinations and yields a relative F1 improvement of roughly 25–26% for relation extraction over unconstrained or baseline approaches (e.g., F1 of 13.7 in the transfer setting vs. a 10.9 baseline).
The constrained CoT decoding procedure thus combines multi-branch generation over several decoding branches, aggregation of per-token probability disparities into a branch-level confidence score, and constraint enforcement on the generated bracketed content.
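The branch-selection logic can be sketched as follows; the term/relation inventories and the top-2 log-probabilities are hypothetical stand-ins for real LLM decoding output:

```python
# Hypothetical term and relation inventories (illustrative, not from the paper).
KNOWN_TERMS = {"hotel", "price_range", "area"}
RELATIONS = {"has_slot", "has_value"}

def branch_confidence(top2_logprobs):
    """Mean disparity between top-1 and top-2 token log-probabilities
    across the branch: higher means the model decoded more decisively."""
    return sum(a - b for a, b in top2_logprobs) / len(top2_logprobs)

def select_branch(branches):
    """branches: list of ((head, relation, tail), top2_logprobs).
    Enforce the hard constraints, then keep the most confident branch."""
    valid = [
        (triplet, lps) for triplet, lps in branches
        if triplet[0] in KNOWN_TERMS
        and triplet[2] in KNOWN_TERMS
        and triplet[1] in RELATIONS
    ]
    if not valid:
        return None  # every branch violated the constraints
    return max(valid, key=lambda b: branch_confidence(b[1]))[0]

branches = [
    (("hotel", "has_slot", "price_range"), [(-0.1, -2.3), (-0.2, -1.9)]),
    (("hotel", "has_slot", "wifi"), [(-0.05, -3.0), (-0.1, -2.5)]),  # unknown tail: pruned
]
print(select_branch(branches))  # → ('hotel', 'has_slot', 'price_range')
```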
3. Integration of Qualitative Conversational Features
A significant frontier in OnExp is the formal quantification of qualitative conversational features, such as proficiency level, engagement, politeness, and user satisfaction. The process involves:
- Selection of Measurable Linguistic Descriptors: choose descriptors d_1, ..., d_n, e.g., readability scores, word length, pronoun density.
- Empirical Range Extraction: for each class c and descriptor d_i, compute the empirical interval [min_i^c, max_i^c] of d_i over labeled examples of c.
- Quantitative Criterion: assign an utterance u to c if d_i(u) ∈ [min_i^c, max_i^c] for every descriptor (Gendron et al., 5 Sep 2025).
- Decision Tree Extraction: shallow trees (small maximum depth) yield interpretable conjunctions of feature thresholds, which are mapped to formal ranges in the ontology.
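The range-extraction and assignment steps can be sketched as follows; the descriptor names and example values are hypothetical:

```python
def empirical_ranges(examples):
    """examples: {class_label: [{descriptor: value, ...}, ...]}.
    Returns per-class [min, max] intervals for each descriptor."""
    return {
        cls: {
            d: (min(r[d] for r in rows), max(r[d] for r in rows))
            for d in rows[0]
        }
        for cls, rows in examples.items()
    }

def assign(features, ranges):
    """Assign an utterance (a descriptor->value dict) to every class
    whose interval it satisfies for all descriptors."""
    return [
        cls for cls, rng in ranges.items()
        if all(lo <= features[d] <= hi for d, (lo, hi) in rng.items())
    ]

# Hypothetical labeled examples for a B1 proficiency class.
examples = {"B1": [{"flesch_kincaid": 6.5, "gunning_fog": 8.0},
                   {"flesch_kincaid": 8.9, "gunning_fog": 10.2}]}
ranges = empirical_ranges(examples)
print(assign({"flesch_kincaid": 7.0, "gunning_fog": 9.0}, ranges))  # → ['B1']
```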
These definitions are formalized in Description Logic (Manchester syntax), e.g.,
Class: B1LevelUtterance
  EquivalentTo: Utterance
    and hasFleschKincaidIndex min 6.0
    and hasFleschKincaidIndex max 9.0
    and hasGunningFogIndex min 7.0
    and hasGunningFogIndex max 10.5
    and hasPronounDensity min 0.09
    and hasPronounDensity max 0.28
OWL-DL reasoners ensure range consistency and derive utterance assignment via data property instantiation.
4. Human-in-the-Loop and Conversational Engineering Pipelines
Contemporary frameworks such as OntoChat institutionalize ontology expansion as a conversational, multi-turn, human-in-the-loop workflow. The architecture orchestrates:
- Requirement elicitation and use-case drafting (persona modules)
- Generation and abstraction of competency questions (CQs)
- Gap testing by SPARQL and mask-based coverage checking against the current OWL ontology
- Slot-filling dialogue to clarify unmet information needs
- Automated proposal of new ontology classes/properties through LLM-driven label generation and embedding-based similarity filtering with a cosine-similarity threshold
- User confirmation, refinement, and OWL axiom integration
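The embedding-based similarity filter in the proposal step can be sketched as follows; the labels, toy 2-d embeddings, and the 0.85 threshold are illustrative:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def filter_proposals(candidates, existing, threshold=0.85):
    """Keep a proposed label only if its embedding is not too close to any
    existing ontology term (near-duplicates are discarded)."""
    return [
        label for label, emb in candidates
        if all(cosine(emb, e) < threshold for _, e in existing)
    ]

# Toy 2-d embeddings; a real system would use sentence-embedding vectors.
existing = [("Concert", (1.0, 0.0))]
candidates = [("LiveGig", (0.99, 0.14)), ("Venue", (0.0, 1.0))]
print(filter_proposals(candidates, existing))  # → ['Venue']
```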
Such systems demonstrate improved CQ coverage (from 87.5% to 95%), high proposal precision (0.82), and reduced modeling time (−35%) (Zhang et al., 2024).
5. Domain-Specific and Biomedical Ontology Expansion
Semi-automated ontology expansion in specialized domains leverages conversational LLM prompts for both concept and relation extraction. Using structured prompts anchored in expert guidelines, candidate concept/relation triples are extracted via repeated LLM runs, filtered by their frequency across runs against a threshold, and mapped to a controlled relation vocabulary (e.g., UMLS types). Precision and recall in such pipelines reach 0.63 and 0.58, with errors originating from boundary ambiguities, misassigned relations, or hallucinated concepts (Zaitoun et al., 2023).
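The frequency-filtering step over repeated extraction runs can be sketched as follows; the triples and threshold are illustrative, and the mapping to UMLS types is omitted:

```python
from collections import Counter

def frequency_filter(runs, min_count=3):
    """runs: one set of extracted (head, relation, tail) triples per LLM pass.
    Keep only triples proposed in at least min_count passes."""
    counts = Counter(t for run in runs for t in set(run))
    return {t for t, c in counts.items() if c >= min_count}

runs = [
    {("aspirin", "treats", "headache"), ("aspirin", "isa", "nsaid")},
    {("aspirin", "treats", "headache"), ("aspirin", "causes", "nausea")},
    {("aspirin", "treats", "headache"), ("aspirin", "isa", "nsaid")},
]
print(frequency_filter(runs, min_count=3))  # → {('aspirin', 'treats', 'headache')}
```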
All integration is mediated via translation of extracted triples to OWL axioms, with manual validation yielding improved ontology coverage.
6. Challenges, Evaluation, and Emerging Directions
Key challenges in conversational ontology expansion include:
- Semantic Overlap and Ambiguity: Boundary issues between novel and known intents/slots.
- Label Sparsity and Few-Shot Settings: Many techniques must extrapolate from few or no annotations.
- Noise and Robustness: Dialogues are noisy and often contain polysemous or ambiguous utterances.
- Inter-task Interference: Joint learning of intent, slot, and value expansion can yield cross-task degradation.
- Evaluation: Scarcity of standardized joint OnExp benchmarks and comprehensive downstream metrics (Liang et al., 2024).
Emerging research frontiers encompass few-shot and continual OnExp, multi-modal augmentation, holistic integration into end-to-end pipelines, explainable ontology expansion, and cross-lingual adaptability. The field increasingly targets adaptable, transparent, and robust ontology engineering protocols that operate at the intersection of statistical learning, formal reasoning, and structured dialogue.
References:
- Towards Ontology-Based Descriptions of Conversations with Qualitatively-Defined Concepts (Gendron et al., 5 Sep 2025)
- Dialogue Ontology Relation Extraction via Constrained Chain-of-Thought Decoding (Vukovic et al., 2024)
- WOAH: Preliminaries to Zero-shot Ontology Learning for Conversational Agents (Buyo, 2017)
- A Survey of Ontology Expansion for Conversational Understanding (Liang et al., 2024)
- Can LLMs Augment a Biomedical Ontology with missing Concepts and Relations? (Zaitoun et al., 2023)
- OntoChat: a Framework for Conversational Ontology Engineering using LLMs (Zhang et al., 2024)