Recursive Semantic Anchoring in ISO 639:2023: A Structural Extension to ISO/TC 37 Frameworks (2506.06870v1)
Abstract: ISO 639:2023 unifies the ISO language-code family and introduces contextual metadata, but it lacks a machine-native mechanism for handling dialectal drift and creole mixtures. We propose a formalisation of recursive semantic anchoring, attaching to every language entity $\chi$ a family of fixed-point operators $\phi_{n,m}$ that model bounded semantic drift via the relation $\phi_{n,m}(\chi) = \chi \oplus \Delta(\chi)$, where $\Delta(\chi)$ is a drift vector in a latent semantic manifold. The base anchor $\phi_{0,0}$ recovers the canonical ISO 639:2023 identity, whereas $\phi_{99,9}$ marks the maximal drift state that triggers a deterministic fallback. Using category theory, we treat the operators $\phi_{n,m}$ as morphisms and drift vectors as arrows in a category $\mathrm{DriftLang}$. A functor $\Phi: \mathrm{DriftLang} \to \mathrm{AnchorLang}$ maps every drifted object to its unique anchor and proves convergence. We provide an RDF/Turtle schema (\texttt{BaseLanguage}, \texttt{DriftedLanguage}, \texttt{ResolvedAnchor}) and worked examples -- e.g., $\phi_{8,4}$ (Standard Mandarin) versus $\phi_{8,7}$ (a colloquial variant), and $\phi_{1,7}$ for Nigerian Pidgin anchored to English. Experiments with transformer models show higher accuracy in language identification and translation on noisy or code-switched input when the $\phi$-indices are used to guide fallback routing. The framework is compatible with ISO/TC 37 and provides an AI-tractable, drift-aware semantic layer for future standards.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.