Domain-Action Taxonomy: Concepts & Applications

Updated 23 October 2025

Domain-action taxonomy is a framework that organizes knowledge by integrating hierarchical concepts with dynamic actions and processes.
It supports research impact evaluation and autonomous systems planning by mapping contributions and reducing computational complexity.
Methodologies include supervised classifiers, semantic similarity metrics, and multilingual pretraining to ensure accurate and adaptable categorization.

A domain-action taxonomy is a structured framework that organizes knowledge within a domain by explicitly encoding the hierarchical relations among conceptual entities and the actions or processes that define, transform, or relate them. Domain-action taxonomies serve as foundational tools for qualitative research evaluation, improved planning and learning in autonomous systems, semantic understanding across heterogeneous sources, and efficient continual adaptation in both open and specialized domains.

1. Formal Structure and Definition

The domain-action taxonomy extends the classical concept of domain taxonomy—where entities or concepts are connected in a hierarchy defined by an "is part of" or "is-a" relation—by embedding actions, processes, or operational links within or between nodes. In research impact evaluation, a domain taxonomy is defined as a hierarchical structure in which each node represents a domain, subdomain, or specific concept, ordered such that higher-level nodes capture broad areas and lower-level nodes capture more specialized subjects (Murtagh et al., 2016). Action nodes may represent processes such as "analyze," "classify," "cluster," or "detect," and their integration facilitates a mapping between static conceptual organization and dynamic operational transformations.

For example, in the ACM Computing Classification System, a data analysis taxonomy may have a root node "Data and Information," subdivided down to "Theory and Algorithms," "Probability and Statistics," "Machine Learning," and further into "Clustering" or "Topic Modeling." In a domain-action taxonomy, researchers or algorithms are mapped to nodes not only by conceptual innovation but also by the processes they introduce or substantially reform.
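As a minimal sketch of the structure described above, a taxonomy node can carry both its child concepts and the actions attached to it. The `Node` class, the helper functions, the exact nesting of the example nodes, and the actions assigned to each node are illustrative assumptions, not part of the cited taxonomy.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A concept node that also records the actions associated with it."""
    name: str
    actions: set[str] = field(default_factory=set)
    children: list["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def find_path(root, target, path=None):
    """Return the list of node names from the root down to `target`, or None."""
    path = (path or []) + [root.name]
    if root.name == target:
        return path
    for child in root.children:
        found = find_path(child, target, path)
        if found:
            return found
    return None


def nodes_with_action(root, action):
    """Collect names of all nodes (depth-first) tagged with `action`."""
    hits = [root.name] if action in root.actions else []
    for child in root.children:
        hits.extend(nodes_with_action(child, action))
    return hits


# Assumed layout: three second-level areas, with two leaves under "Machine Learning".
root = Node("Data and Information")
root.add(Node("Theory and Algorithms"))
root.add(Node("Probability and Statistics"))
ml = root.add(Node("Machine Learning", actions={"classify"}))
ml.add(Node("Clustering", actions={"cluster"}))        # hypothetical action tag
ml.add(Node("Topic Modeling", actions={"analyze"}))    # hypothetical action tag

print(find_path(root, "Clustering"))
print(nodes_with_action(root, "cluster"))
```

Mapping a researcher to a node then amounts to recording either the concept name or one of the node's action tags against that researcher.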

2. Methodologies for Taxonomy Construction and Maintenance

Induction and refinement of domain-action taxonomies require systematic methodologies. Three main phases are often deployed (Chu et al., 2019):

Category Cleaning: Supervised classifiers, integrating lexical (plurality, capitalization, pattern matching) and graph-based features (instance counts, path length, subgraph size), filter out non-domain categories and noise.
Edge Cleaning: Syntactic (head word matching), semantic (embedding similarity, WordNet synset linking), and graph-based features identify valid hierarchical relations (subclass, process-enables, or action-performs relations). Semantic closeness is measured with the Wu–Palmer similarity and the HyperVec score.
Top-level Construction: Classes and actions are aligned with an external generalized ontology (e.g., WordNet), compressing and enriching the taxonomy for broad interoperability and comparison.
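To make the semantic feature used in edge cleaning concrete, the Wu–Palmer similarity between two nodes is twice the depth of their deepest common ancestor divided by the sum of their depths. The sketch below computes it over a toy hierarchy stored as a child-to-parent dictionary; the hierarchy and the function names are assumptions for illustration, and a real pipeline would compute the score over WordNet synsets instead.

```python
def ancestors(parent, node):
    """Path from `node` up to the root, inclusive (node first, root last)."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def wu_palmer(parent, a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)).

    Depth counts nodes on the path to the root, so the root has depth 1.
    """
    path_a, path_b = ancestors(parent, a), ancestors(parent, b)
    in_b = set(path_b)
    # The first shared node when walking up from `a` is the deepest common ancestor.
    lcs = next(n for n in path_a if n in in_b)
    return 2 * len(ancestors(parent, lcs)) / (len(path_a) + len(path_b))


# Toy hierarchy (hypothetical), child -> parent.
parent = {
    "wizard": "character",
    "elf": "character",
    "character": "entity",
    "sword": "artifact",
    "artifact": "entity",
}

print(round(wu_palmer(parent, "wizard", "elf"), 3))    # 0.667
print(round(wu_palmer(parent, "wizard", "sword"), 3))  # 0.333
```

A score like this would typically be one feature among several fed to the edge classifier, rather than a decision rule on its own.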

These modular steps yield taxonomies with high precision and recall, even in noisy domains such as fan wikis or specialized verticals. Precision scores in such settings often approach 83–85%, and proper-name edges can reach up to 96% (Chu et al., 2019).

3. Role in Research Impact and Evaluation

In research assessment, domain-action taxonomies support qualitative measurement by mapping scholars to those nodes they have created or significantly altered (Murtagh et al., 2016). The taxonomic rank (TR) is determined by the minimum rank of affected nodes, penalized for repeated/deep subdomains, and normalized via a linear transformation to the range [0, 100]. This metric captures the depth and originality of influence, contrasting with popularity measures such as citation counts, which may fail to detect paradigm-shifting or conceptual contributions.

Aggregate measures combining taxonomy-based evaluation with citation or merit-based scores can be computed via convex combinations:

$f_i = \sum_{j=1}^M w_j x_{i,j} \quad \text{where} \quad \sum_{j} w_j = 1, \quad w_j \geq 0$

This multidimensional approach can expose qualitative differences undervalued by traditional metrics.
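The convex combination above can be sketched in a few lines. The function name, the scholars, their scores, and the weights are all hypothetical; the only constraints taken from the formula are that the weights are non-negative and sum to one.

```python
def convex_combination(scores, weights, tol=1e-9):
    """Return sum_j w_j * x_j after checking w_j >= 0 and sum_j w_j == 1."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, scores))


# Hypothetical scholars scored as (taxonomic rank, citation-based score), both on [0, 100].
scholars = {"A": (90.0, 20.0), "B": (30.0, 70.0)}
weights = (0.5, 0.5)

for name, scores in scholars.items():
    print(name, convex_combination(scores, weights))
```

With equal weights, scholar A (high taxonomic rank, few citations) scores 55.0 and scholar B scores 50.0; shifting weight toward the citation score reverses that order, which is the kind of sensitivity the aggregate is meant to expose.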

4. Applications in Autonomous Agents and Planning

Action-category representations (ACR) exemplify operational domain-action taxonomies in agent planning and reinforcement learning (Nair et al., 2018). Here, agents construct bipartite graphs linking objects to actions via "action codes"—tuples of objects and actions:

$((o_1,o_2,\ldots,o_j), (a_1,a_2,\ldots,a_k))$

Action categories are defined as sets of actions sharing a unique set of object associations:

$A^c = \{ a_j : \bigcup \hat{O}_{a_j} = \bigcap \hat{O}_{a_j} \}$

ACR reduces the action space by abstracting low-level state to higher-level operational groupings, improving computational efficiency (e.g., reducing agent-object interactions from thousands to hundreds in StarCraft) and enabling faster convergence in domains like Lightworld.
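One way to read the definition of action categories is as a grouping of actions by the exact set of objects they are associated with. The sketch below follows that reading; the function name and the toy action codes are assumptions, and this is not the implementation from the cited work.

```python
from collections import defaultdict


def action_categories(action_codes):
    """Group actions that share exactly the same set of associated objects.

    `action_codes` is an iterable of (objects, actions) tuples, mirroring
    the ((o_1, ..., o_j), (a_1, ..., a_k)) form of an action code.
    """
    # Collect every object each action has been observed with.
    objects_of = defaultdict(set)
    for objects, actions in action_codes:
        for action in actions:
            objects_of[action].update(objects)

    # Actions with identical object sets fall into the same category.
    categories = defaultdict(set)
    for action, objects in objects_of.items():
        categories[frozenset(objects)].add(action)
    return dict(categories)


# Hypothetical action codes for a small grid-world.
codes = [
    (("door", "chest"), ("open", "close")),
    (("key",), ("pick_up",)),
    (("chest", "door"), ("lock",)),
]

for objects, actions in action_categories(codes).items():
    print(sorted(objects), sorted(actions))
```

Here four low-level actions collapse into two categories, which illustrates on a small scale how the abstraction shrinks the space an agent has to search.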

5. Multidimensional Classification in Robotics

Robotic action representations are organized via a multidimensional domain-action taxonomy (Zech et al., 2018). The taxonomy consists of two modules:

Action Model: Perception, abstraction/granularity, learning methods, and effect representation (whether categorical/continuous, grounded or not, supporting forward or inverse modeling).
Computational Model: Mathematical vs. biomimetic formulation, feature types, training methods, and evaluation environments.

This framework enables systematic literature review, highlighting gaps (e.g., lack of grounding, forward/inverse modeling integration) and guiding future developments in robust manipulation and autonomous planning.

6. Integration with Modern LLMs and Continual Pretraining

Recent advances use taxonomies for supervised document-level classification and domain-adaptive pretraining (Nandy et al., 2023; Zhang et al., 2023). FastDoc attaches hierarchical classification heads that follow a domain taxonomy and adds contrastive triplet losses generated from document metadata. This methodology replaces token-level masked language modeling (MLM) and next-sentence prediction (NSP) with document-level supervision:

$L_{\text{hier}} = \sum_{i=1}^{N} \sum_{j=1}^{H} \text{CELoss}(x_{ij}, y_{ij})$
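The hierarchical loss sums a cross-entropy term over every document and every taxonomy level. The following is a minimal pure-Python sketch under the assumption that $x_{ij}$ is a vector of class logits for document $i$ at level $j$ and $y_{ij}$ is the index of the correct class; the function names and the example values are hypothetical, and a real system would use a deep-learning framework's batched loss instead.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one prediction, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def hierarchical_loss(logits, labels):
    """Sum of cross-entropy over N documents and H taxonomy levels.

    logits[i][j] is the list of class scores for document i at level j;
    labels[i][j] is the index of the correct class at that level.
    """
    return sum(
        cross_entropy(logits[i][j], labels[i][j])
        for i in range(len(logits))
        for j in range(len(logits[i]))
    )


# One document, two levels: 2 top-level classes and 3 second-level classes.
uninformed = [[[0.0, 0.0], [0.0, 0.0, 0.0]]]
confident = [[[4.0, 0.0], [0.0, 0.0, 4.0]]]
labels = [[0, 2]]

print(hierarchical_loss(uninformed, labels) > hierarchical_loss(confident, labels))  # True
```

Because each level contributes its own term, a document that is classified correctly at the top level but incorrectly further down is still penalized less than one that is wrong at every level.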

The result is a more than 1000x reduction in pretraining compute without significant deterioration in downstream task performance. ESCOXLM-R further incorporates the ESCO taxonomy into its pretraining objectives, combining masked language modeling with relation prediction among skills and occupations in 27 languages:

$L = L_{MLM} + L_{ERP}, \qquad L_{ERP} = -\log p(r \mid h_{[CLS]})$

These approaches enable multilingual, taxonomy-aware representations crucial for complex job-ad, scientific, and customer-support tasks.

7. Taxonomy-Structured Domain Adaptation and Semantic Segmentation

Taxonomy-structured domain adaptation introduces the concept of hierarchical similarity among domains, treated as nodes in a taxonomy rather than flat categories (Liu et al., 2023). The methodology incorporates a "taxonomist" network that predicts pairwise domain distances:

$L_T(T,E) = \mathbb{E}_{x_1,u_1,x_2,u_2}[ \ell_2( T(E(x_1,u_1,A), E(x_2,u_2,A)), A(u_1,u_2) ) ]$

This encourages encoded representations to align with the hierarchical similarities, outperforming classic adversarial domain adaptation (DANN), especially in transfer tasks where related domains exist.
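The regression target of the taxonomist loss can be illustrated with a small sketch. It assumes that the domain distance $A(u_1, u_2)$ is the number of edges between two domains in the taxonomy tree, and it stands in for the learned taxonomist $T$ with a plain Euclidean distance between per-domain embeddings. Both choices, along with the toy taxonomy and all names, are simplifying assumptions; the cited method trains neural networks for $E$ and $T$ over data samples.

```python
import math
from itertools import combinations


def tree_distance(parent, u, v):
    """Number of edges on the path between u and v in a child -> parent tree."""
    def chain(node):
        out = [node]
        while node in parent:
            node = parent[node]
            out.append(node)
        return out

    chain_u, chain_v = chain(u), chain(v)
    steps_from_v = {node: i for i, node in enumerate(chain_v)}
    for i, node in enumerate(chain_u):
        if node in steps_from_v:  # first shared ancestor walking up from u
            return i + steps_from_v[node]
    raise ValueError("nodes are not in the same tree")


def taxonomist_loss(embeddings, parent):
    """Mean squared gap between embedding distance and taxonomy distance."""
    pairs = list(combinations(sorted(embeddings), 2))
    total = 0.0
    for u, v in pairs:
        predicted = math.dist(embeddings[u], embeddings[v])  # stand-in for T
        total += (predicted - tree_distance(parent, u, v)) ** 2
    return total / len(pairs)


# Hypothetical domain taxonomy, child -> parent.
parent = {"husky": "dog", "poodle": "dog", "dog": "animal",
          "tabby": "cat", "cat": "animal"}

aligned = {"husky": (-1.0, 0.0), "poodle": (1.0, 0.0), "tabby": (0.0, math.sqrt(15))}
collapsed = {"husky": (0.0, 0.0), "poodle": (0.0, 0.0), "tabby": (0.0, 0.0)}

print(taxonomist_loss(aligned, parent) < taxonomist_loss(collapsed, parent))  # True
```

Embeddings whose pairwise distances match the tree (sibling domains close, distant branches far apart) drive this quantity toward zero, while embeddings that ignore the taxonomy do not.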

In semantic segmentation, cross-domain adaptation with inconsistent taxonomies is addressed by integrating vision-language models (VLMs) such as OWL-ViT and CLIP for automatic pseudo-label relabeling (Lim et al., 2024). This decouples segment reasoning from semantic label reasoning, enabling adaptation to new or refined classes without ground-truth labels in the target domain. Evaluated by mean intersection-over-union (mIoU) on extended class spaces, the method supports adaptation in practical settings where taxonomies evolve or do not align.

Conclusion

The domain-action taxonomy paradigm underpins qualitative assessment, efficient planning and learning, robust semantic understanding, and adaptive continual training across a wide spectrum of domains—from academic research impact evaluation to agent-based task learning, knowledge base induction, and transfer learning in complex environments. Its core utility lies in encoding both hierarchical conceptual structures and the processes that transform or relate those structures, supporting transparency, comparability, and adaptability in contemporary scientific and engineering disciplines.