- The paper demonstrates that altering linguistic primitives via Umwelt engineering profoundly restructures LLM cognition and reasoning.
- It employs systematic experiments across diverse tasks, revealing that specific linguistic constraints yield both task improvements and model-specific effects.
- The framework promotes cognitive diversification, where ensemble designs leveraging varied Umwelten achieve complementary performance gains.
Umwelt Engineering for Linguistic Agents: A Technical Analysis
Conceptual Framework
The paper "Umwelt Engineering: Designing the Cognitive Worlds of Linguistic Agents" (2603.27626) introduces the notion of linguistic Umwelt engineering: the explicit design and manipulation of the linguistic substrate available to LLM-based agents. Drawing on Uexküll’s "Umwelt" from ethology, the author contends that for LLMs, language constitutes not only the interface but the entire space of cognition—unlike humans, for whom language is only one cognitive modality. Therefore, altering the available linguistic primitives fundamentally changes agent cognition, not merely its externalization.
The proposed framework differentiates among three engineering layers for agent design:
- Prompt engineering: Formulating what the agent is asked to do.
- Context engineering: Controlling what the agent can access and retrieve at inference.
- Umwelt engineering: Defining the cognitive world the agent can inhabit via manipulation of available linguistic structures (vocabulary, grammar, conceptual distinctions).
Importantly, these layers are posited to be orthogonal and hierarchically ordered: Umwelt engineering sits upstream of, and invisible to, the other layers, imposing foundational cognitive constraints on every downstream process.
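The separation of the three layers can be made concrete as an agent configuration in which each layer occupies its own field. A minimal sketch, assuming a hypothetical `AgentConfig` (the names and message layout are illustrative, not from the paper):

```python
from dataclasses import dataclass, field

@dataclass
class AgentConfig:
    # Prompt engineering: what the agent is asked to do.
    prompt: str
    # Context engineering: what the agent can access at inference time.
    context: list = field(default_factory=list)
    # Umwelt engineering: linguistic rules defining the cognitive world.
    umwelt_rules: list = field(default_factory=list)

    def system_message(self) -> str:
        # Umwelt rules come first: they sit upstream and constrain how every
        # downstream instruction and document is processed.
        rules = "\n".join(f"- {r}" for r in self.umwelt_rules)
        docs = "\n".join(self.context)
        return (f"Linguistic constraints:\n{rules}\n\n"
                f"Context:\n{docs}\n\n"
                f"Task:\n{self.prompt}")

cfg = AgentConfig(
    prompt="Classify the following statement.",
    context=["background: domain notes"],
    umwelt_rules=["Never use any form of the verb 'to be' (E-Prime)."],
)
msg = cfg.system_message()
```

The ordering encodes the hierarchy claim: the Umwelt layer frames everything the agent subsequently reads or does.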
Theoretical and Empirical Foundations
The core theoretical assertion, supported by empirical precedents, is that linguistic constraints operationalized as Umwelt engineering are not mere output filters; they restructure the cognitive operations of LLMs. This is distinct from prompting, which delivers instructions within a static Umwelt. The argument is reinforced by:
- The strong form of linguistic relativity in LLMs (e.g., [wang2025], [ray2025]): models trained in different natural languages show durable reasoning divergences that track the structures of their training languages rather than surface-level artifacts alone.
- Prior work on designed reasoning languages, where synthetic or task-specific “reasoning dialects” substantially alter—and often improve—model inference efficiency and accuracy ([tanmay2025], [sketch2025]).
The author constructs a taxonomy of cognitive-linguistic constraints, drawing from diverse intellectual traditions (E-Prime, General Semantics, Rheomode, Operationalism, constructed languages like Lojban and Toki Pona, evidentiality, tetralemma nonbinary logic, Nonviolent Communication), each targeting distinct axes of cognitive bias (e.g., identity claims, over-generalization, entity bias).
Experimental Methodology
Experiment 1: Task-Level Cognitive Effects
Two linguistic constraints—E-Prime (elimination of all forms of "to be") and No-Have (elimination of possessive "to have" as main verb)—were implemented as system-level prompts. Their effects were evaluated on three cost-efficient LLMs (Claude Haiku 4.5, GPT-4o-mini, Gemini 2.5 Flash Lite), across seven tasks (syllogisms, causal reasoning, analogical reasoning, classification, epistemic calibration, ethical dilemmas, math word problems) with multiple-choice formats.
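Both constraints lend themselves to a simple automated compliance check. A minimal sketch, assuming regex word lists that only approximate the constraints (the paper's actual checker is not specified; note in particular that No-Have targets only possessive main-verb uses of "to have", which this crude pattern over-counts by also flagging auxiliaries):

```python
import re

# Assumed word lists; not the paper's implementation.
BE_FORMS = r"\b(am|is|are|was|were|be|been|being)\b"
HAVE_FORMS = r"\b(have|has|had|having)\b"

def violates_e_prime(text: str) -> bool:
    """True if the text uses any form of 'to be'."""
    return re.search(BE_FORMS, text, flags=re.IGNORECASE) is not None

def violates_no_have(text: str) -> bool:
    """True if the text uses any form of 'to have'.
    Simplification: also flags auxiliary uses ('has seen'), which the
    paper's No-Have constraint permits."""
    return re.search(HAVE_FORMS, text, flags=re.IGNORECASE) is not None

def compliance_rate(responses: list, violates) -> float:
    """Fraction of responses satisfying the constraint."""
    ok = sum(1 for r in responses if not violates(r))
    return ok / len(responses)
```

A checker like this is what makes compliance figures such as 92.8% vs. 48.1% measurable at scale.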
Key results:
- No-Have yielded broad and consistent improvements: +19.1pp on ethical dilemmas (p<0.001), +6.5pp on classification (p<0.001), and +7.4pp on epistemic calibration. Compliance was high (92.8%), which makes the observed effects straightforward to attribute to the constraint.
- E-Prime effects were volatile and highly model-dependent: large positive shifts (e.g., +14.1pp for causal reasoning, +15.5pp for ethical dilemmas), but severe impairments elsewhere (e.g., -3.4pp for syllogisms, -27.5pp in epistemic calibration for GPT-4o-mini). Compliance was low (48.1%), reflecting the pervasiveness of the copula.
- Constraints universally compressed output verbosity (16–33% reduction in non-mathematical tasks), indicating a robust cognitive restructuring effect.
Task improvement was non-monotonic and exhibited significant cross-model interaction: the same constraint improved one model's performance on a given task while degrading another's, with inter-model correlations as low as r = −0.75 for E-Prime.
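The negative inter-model correlation can be reproduced in miniature by correlating the per-task accuracy deltas (constraint minus control) of two models. The delta vectors below are illustrative, not the paper's data:

```python
import math

def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Per-task deltas (pp) for one constraint under two models (made-up values
# chosen to mimic the reported anti-correlation pattern):
model_a = [14.1, -3.4, 15.5, 2.0, -27.5, 6.0, 1.0]
model_b = [-10.0, 5.0, -12.0, -1.0, 20.0, -4.0, 0.5]
r = pearson_r(model_a, model_b)
# r is strongly negative: the constraint helps one model on exactly the
# tasks where it hurts the other.
```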
Experiment 2: Ensemble Cognitive Orthogonality
Sixteen agents, each operating under a distinct Umwelt-defining linguistic constraint, were evaluated on software debugging problems. Although no constrained agent individually outperformed the control (88.2% accuracy), ensembles composed for maximal linguistic diversity (particularly those including the counterfactual agent) achieved 100% ground-truth coverage, a result unlikely under random agent selection (only 8% of 3-agent subsets achieved it). Every ensemble with perfect coverage required specific modes (e.g., counterfactual), consistent with a mechanism of cognitive diversification.
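The coverage analysis reduces to set unions over each agent's solved-problem set: an ensemble covers a problem if any member solves it. A toy sketch with four agents and eight problems (the solved sets are made up, but arranged so that one agent uniquely surfaces a problem, mirroring the counterfactual agent's role):

```python
from itertools import combinations

# Illustrative solved-problem sets, not the paper's data.
solved = {
    "control":        {1, 2, 3, 4, 6, 7},
    "e_prime":        {1, 2, 4, 5, 6},
    "counterfactual": {2, 3, 5, 8},   # uniquely surfaces problem 8
    "no_have":        {1, 3, 4, 6, 7},
}
all_problems = set().union(*solved.values())

def coverage(members) -> float:
    """Fraction of all problems solved by at least one member."""
    hit = set().union(*(solved[m] for m in members))
    return len(hit) / len(all_problems)

# Enumerate every 3-agent subset and find those with perfect coverage.
subsets = list(combinations(solved, 3))
perfect = [s for s in subsets if coverage(s) == 1.0]
```

In this toy, every perfect-coverage subset must include the counterfactual agent, because it alone solves problem 8; this is the set-theoretic skeleton of the paper's finding.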
Mechanisms and Interpretations
Two critical mechanisms are identified:
- Cognitive restructuring: Constraints force models to adapt by deploying more explicit (or otherwise altered) reasoning, e.g., re-articulating relational and operational structure in the absence of possessives or the copula.
- Cognitive diversification: Constraint diversity in ensembles yields complementary perspectives; certain problem features are only surfaced in some Umwelten (e.g., the counterfactual agent revealing a specification ambiguity undetected by any other agent).
These mechanisms are empirically dissociated from mere prompt complexity, a critical methodological control: the observed effects are divergent, constraint-specific, and model-specific in ways that metalinguistic self-monitoring load alone does not predict.
Implications
Model-Dependent Umwelten
The strong interaction between linguistic constraint and model architecture/training corpus argues for characterizing each model's "native Umwelt." This carries implications for transfer and generalization: not only must prompts be tailored to a model, but Umwelt engineering itself must be sensitive to model idiosyncrasies.
Agent Ensemble Design
The demonstration that cognitive diversity derived from Umwelt orthogonality yields superadditive ensemble performance has direct consequences for ensemble design: ensembles of structurally similar agents (e.g., sharing the same constraints) maximize redundancy, whereas maximal coverage requires maximal Umwelt diversity.
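One way to operationalize "compose for maximal Umwelt diversity" is standard greedy max-coverage selection; this is an assumption for illustration, not the paper's procedure. At each step, add the agent that solves the most problems not yet covered:

```python
def greedy_ensemble(solved: dict, k: int) -> list:
    """Greedy max-coverage: pick k agents, each maximizing marginal coverage."""
    chosen, covered = [], set()
    pool = dict(solved)
    for _ in range(k):
        best = max(pool, key=lambda a: len(pool[a] - covered))
        chosen.append(best)
        covered |= pool.pop(best)
    return chosen

# Illustrative solved-problem sets, not the paper's data.
solved = {
    "control":        {1, 2, 3, 4, 6, 7},
    "e_prime":        {1, 2, 4, 5, 6},
    "counterfactual": {2, 3, 5, 8},   # only agent that solves problem 8
    "no_have":        {1, 3, 4, 6, 7},
}
team = greedy_ensemble(solved, 3)
```

Greedy selection naturally pulls in the "odd" agent early, since an agent whose Umwelt overlaps heavily with those already chosen contributes little marginal coverage; similarity maximizes redundancy, exactly as the paper's result suggests.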
Constraint Taxonomy and Future Directions
The empirical mapping of constraint axes (as in the Table of Traditions/Failures) provides a candidate geometry for future research. Questions include:
- Composition: Can constraints be composed or blended, and with what interaction properties—additive, antagonistic, or non-linear?
- Native Umwelt mapping: Which cognitive operations are natively accessible to which model families, and how does this modulate constraint efficacy?
- Metric development: Beyond accuracy, how should cognitive diversity and complementary coverage be measured and operationalized?
- Active controls: further work is needed to disentangle specific restructuring mechanisms from the confound of metalinguistic self-monitoring.
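For the metric-development question, one candidate measure (an assumption here, not a proposal from the paper) scores cognitive diversity as mean pairwise disagreement: the fraction of shared items on which two agents give different answers, averaged over all agent pairs:

```python
from itertools import combinations

def pairwise_disagreement(answers: dict) -> float:
    """Mean fraction of items on which pairs of agents disagree."""
    agents = list(answers)
    n_items = len(answers[agents[0]])
    pairs = list(combinations(agents, 2))
    total = 0.0
    for a, b in pairs:
        total += sum(x != y for x, y in zip(answers[a], answers[b])) / n_items
    return total / len(pairs)

# Illustrative answer sheets for three agents on four shared items.
answers = {
    "control":        ["A", "B", "A", "C"],
    "e_prime":        ["A", "C", "A", "C"],
    "counterfactual": ["B", "C", "A", "D"],
}
score = pairwise_disagreement(answers)
```

Unlike accuracy, a metric of this kind rewards complementary error profiles, which is precisely what the ensemble results indicate matters for coverage.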
Theoretical Consequences
The conclusions fortify the theoretical view that, for LLMs, the cognitive world is exhaustively specified by the available linguistic world—a radically strong form of the Sapir-Whorf hypothesis in artificial cognition. By making linguistic substrate a first-class design variable, the paper provides a rationale and empirical foundation for structured Umwelt construction in agent engineering.
Conclusion
Umwelt engineering repositions the design of the available cognitive substrate as upstream of prompt and context manipulations in language agents. Through robust experimentation, the paper demonstrates that altering available linguistic forms can produce both substantial and model-specific effects on agent reasoning: some constraints produce task-general improvements, some induce volatile or orthogonal capacities, and diversity in constraints enables ensemble phenomena unachievable by single agents. The field is thus prompted to systematically chart the space of cognitive-linguistic interventions—both as a theoretical enterprise and as a practical mechanism for improving and diversifying artificial agent cognition.