
Uniform Meaning Representation (UMR)

Updated 15 December 2025
  • Uniform Meaning Representation (UMR) is a graph-based semantic formalism that represents predicate-argument structure, modality, aspect, and coreference in both sentence- and document-level contexts.
  • UMR extends AMR by incorporating document-level representations and multilingual flexibility, enabling annotation for low-resource and typologically diverse languages.
  • UMR parsing leverages fine-tuning of AMR parsers and UD-based conversions, yielding robust performance; UMR-augmented prompting brings significant metric improvements on translation and text generation tasks.

Uniform Meaning Representation (UMR) is a graph-based semantic formalism in the Abstract Meaning Representation (AMR) family, engineered to capture the predicate-argument structure and core discourse phenomena of texts across the full typological diversity of the world's languages, including extremely low-resource and indigenous languages. UMR adopts a rooted, labeled, directed graph schema capable of representing both sentence- and document-level phenomena such as modality, aspect, coreference, and temporal relations. By providing a flexible, lattice-style annotation schema and universal inventories of semantic roles and concepts, UMR enables cross-lingual comparison and interpretable semantics, and provides a robust foundation for downstream multilingual NLP applications such as translation and generation (Wein, 13 Feb 2025, Markle et al., 17 Feb 2025, Markle et al., 8 Dec 2025).

1. Formal Structure and Multilingual Extensibility

At the sentence level, a UMR graph is defined as G = (V, E, ℓ_V, ℓ_E), where V is a finite set of nodes representing semantic units (predicates, constants, operators), E ⊆ V × V is the set of labeled, directed edges encoding semantic relations, ℓ_V maps nodes to concept labels from a universal inventory, and ℓ_E assigns edge labels from the set of argument and modifier roles. Graphs are typically linearized in PENMAN notation for both human annotation and downstream modeling. Document-level UMR extends the sentence schema with explicit models for discourse nodes, coreference, temporal and modal structure, and cross-sentence alignments (Markle et al., 8 Dec 2025, Markle et al., 17 Feb 2025).
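
The definition above can be sketched concretely. The following minimal Python rendering of G = (V, E, ℓ_V, ℓ_E) with a naive PENMAN-style linearizer is illustrative only: the `taste-01` frame and the `performance` aspect value follow PropBank/UMR conventions, but the example sentence and the `UMRGraph` class are not from any released UMR tooling.

```python
from dataclasses import dataclass

@dataclass
class UMRGraph:
    """Sketch of a sentence-level UMR graph G = (V, E, l_V, l_E)."""
    top: str
    concepts: dict   # l_V: variable -> concept label
    edges: list      # E with l_E applied: (source, role, target) triples

    def linearize(self, var=None, seen=None):
        """Emit a PENMAN-style string via depth-first traversal."""
        var = var or self.top
        seen = seen if seen is not None else set()
        if var in seen or var not in self.concepts:
            return var  # re-entrant variable or constant: bare token
        seen.add(var)
        parts = [f"({var} / {self.concepts[var]}"]
        for src, role, tgt in self.edges:
            if src == var:
                parts.append(f" :{role} {self.linearize(tgt, seen)}")
        return "".join(parts) + ")"

g = UMRGraph(
    top="t",
    concepts={"t": "taste-01", "c": "child", "s": "soup"},
    edges=[("t", "ARG0", "c"), ("t", "ARG1", "s"), ("t", "aspect", "performance")],
)
print(g.linearize())
# -> (t / taste-01 :ARG0 (c / child) :ARG1 (s / soup) :aspect performance)
```

The `seen` set handles re-entrancy: a variable already emitted is printed bare rather than expanded again, which is how PENMAN notation encodes graph (rather than tree) structure.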

Multilingual extensibility is achieved through:

  • Lattice-style annotation schema permitting variable granularity (from coarse semantic frames to fine-grained sense distinctions) and dynamic creation of Stage 0 rolesets for languages lacking established predicate inventories.
  • Unified relation inventories: roles (e.g., ARG0, ARG1), aspect/modality edges, and referent features remain constant across languages.
  • Low-resource bootstrapping: rapid annotation is possible from interlinear glosses or lexical resources even in the absence of language-specific PropBank-style rolesets (Wein, 13 Feb 2025, Markle et al., 8 Dec 2025).
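
A Stage 0 roleset, as mentioned above, can be sketched as an ad-hoc predicate entry created on the fly. This is only an illustration: the `-00` placeholder sense suffix, the field names, and the example lemma are assumptions, not the official UMR annotation convention.

```python
# Sketch: bootstrapping a predicate entry for a language without an
# established PropBank-style frame lexicon. The "-00" sense placeholder
# and the gloss-based role descriptions are illustrative assumptions.
def stage0_roleset(lemma, arg_glosses):
    """Create an ad-hoc roleset when no frame file exists yet."""
    return {
        "roleset": f"{lemma}-00",
        "roles": {f"ARG{i}": gloss for i, gloss in enumerate(arg_glosses)},
    }

entry = stage0_roleset("speak", ["speaker", "utterance"])
print(entry)
# -> {'roleset': 'speak-00', 'roles': {'ARG0': 'speaker', 'ARG1': 'utterance'}}
```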

2. Key Differences between UMR and AMR

UMR extends and generalizes AMR in two principal ways:

  • Document-level representation: Whereas AMR is strictly sentence-bound, UMR links sentence graphs via explicit coreference and discourse relations, as well as alignments across sentence boundaries. This supports the representation of phenomena like coreference and discourse connectives at the document level (Markle et al., 17 Feb 2025).
  • Typological and multilingual flexibility: AMR’s fixed English-centric rolesets are replaced by a schema that annotators can adapt to morphosyntactic, aspectual, and discourse distinctions relevant to diverse languages. UMR supports annotation at the morpheme, word, or phrase level as appropriate, and allows consistent comparison across typologically distant languages, e.g., analytic and polysynthetic languages (Wein, 13 Feb 2025, Markle et al., 17 Feb 2025, Markle et al., 8 Dec 2025).
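
The cross-sentence links described above can be processed mechanically. The sketch below groups pairwise same-entity links into coreference chains with a small union-find; the `s1p`-style variable names follow UMR's convention of prefixing node variables with a sentence index, but the triples themselves are invented for illustration.

```python
# Sketch: recovering document-level coreference chains from pairwise
# cross-sentence links (triples are illustrative, not corpus data).
from collections import defaultdict

coref_triples = [
    ("s1p", "same-entity", "s2p"),
    ("s2p", "same-entity", "s4p"),
    ("s1c", "same-entity", "s3c"),
]

def coref_chains(triples):
    """Union-find over same-entity links to recover entity chains."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, _, b in triples:
        parent[find(a)] = find(b)
    chains = defaultdict(set)
    for v in parent:
        chains[find(v)].add(v)
    return sorted(sorted(chain) for chain in chains.values())

print(coref_chains(coref_triples))
# -> [['s1c', 's3c'], ['s1p', 's2p', 's4p']]
```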

3. UMR Parsing: Models and Methodologies

UMR parsing remains an emerging field, with the leading models pursuing two complementary strategies:

  • Fine-tuning AMR parsers: Encoder–decoder Transformer architectures (e.g., BiBL, AMRBART, SPRING) are adapted from AMR parsing by retraining on sentence–UMR pairs. This approach, exemplified by the SETUP system, achieves state-of-the-art sentence-level English UMR parsing (SMATCH++ ≈ 91) (Markle et al., 8 Dec 2025).
  • UD-driven conversion and completion: Universal Dependencies (UD) dependency trees are mapped via rule-based transformations to produce partial UMR graphs, which are then completed to full UMRs using sequence-to-sequence models (e.g., T5) (Markle et al., 8 Dec 2025).
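
The rule-based first step of the UD-driven strategy can be sketched as a relation-mapping pass. The rule table below is a toy illustration (real converters handle many more relations plus morphology); unmapped material is exactly what the seq2seq completion model must fill in.

```python
# Sketch of the rule-based UD -> partial-UMR step. The mapping table is
# a toy illustration of the idea, not an actual converter's rule set.
UD_TO_UMR = {"nsubj": "ARG0", "obj": "ARG1", "obl": "ARG2", "advmod": "manner"}

def ud_to_partial_umr(head_lemma, deps):
    """deps: (UD relation, dependent lemma) pairs for one predicate."""
    triples = []
    for rel, dep in deps:
        role = UD_TO_UMR.get(rel)
        if role is not None:  # unmapped relations are left for completion
            triples.append((head_lemma, role, dep))
    return triples

print(ud_to_partial_umr("taste", [("nsubj", "child"), ("obj", "soup"), ("punct", ".")]))
# -> [('taste', 'ARG0', 'child'), ('taste', 'ARG1', 'soup')]
```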

Performance metrics include SMATCH++ (extending SMATCH for fuzzy alignments and document-level scoring), AnCast, and AnCast++. The SETUP model achieves an AnCast score of 84 and demonstrates robustness across both news and dialogue data, significantly outperforming AMR-to-UMR pipeline baselines (Markle et al., 8 Dec 2025).
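
At their core, these metrics score overlap between gold and predicted graph triples. The sketch below shows only that scoring core, under a fixed variable alignment; real SMATCH++/AnCast implementations additionally search over variable alignments and support fuzzy matching, which this simplification omits.

```python
# Simplified Smatch-style score: F1 over graph triples under a fixed
# (assumed) variable alignment. Real metrics search over alignments.
def triple_f1(gold, pred):
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = {("t", "instance", "taste-01"), ("t", "ARG0", "c"), ("c", "instance", "child")}
pred = {("t", "instance", "taste-01"), ("t", "ARG0", "c"), ("c", "instance", "person")}
print(round(triple_f1(gold, pred), 2))
# -> 0.67
```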

4. Text Generation and Downstream Applications

Recent work investigates converting UMR graphs into natural language text:

  • Pipeline UMR→AMR→Text: Rule-based conversion maps UMR graphs to AMR, enabling the use of mature AMR-to-text generation models such as BiBL and SPRING2. However, this pipeline can lose UMR-specific information, e.g., aspect/modality features (Markle et al., 17 Feb 2025).
  • Direct fine-tuning of LLMs and AMR-to-text models on UMR data: Direct training or fine-tuning on UMR targets (in both sentence and document scope) achieves stronger results. Fine-tuned AMR-to-text models outperform LLMs lacking explicit graph structure.
  • Evaluation: Metrics include BERTScore, BLEU, METEOR, and human adequacy/fluency judgments, with the best models reaching BERTScore up to 0.825/0.882 for English/Chinese (Markle et al., 17 Feb 2025).
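
The information loss in the UMR→AMR pipeline step can be made concrete: UMR-specific attributes have no AMR counterpart and are simply dropped. The role names below approximate the UMR inventory (`:aspect`, `:modstr`, `:refer-person`, `:refer-number`) but the rule set and example triples are illustrative, not a real converter.

```python
# Sketch of the lossy UMR -> AMR conversion: attributes with no AMR
# counterpart are dropped. Role names approximate the UMR inventory;
# the example triples are invented for illustration.
UMR_ONLY_ROLES = {"aspect", "modstr", "refer-person", "refer-number"}

def umr_to_amr_triples(triples):
    """Keep only triples expressible in plain AMR."""
    return [t for t in triples if t[1] not in UMR_ONLY_ROLES]

umr = [("t", "instance", "taste-01"), ("t", "ARG0", "c"),
       ("t", "aspect", "performance"), ("t", "modstr", "full-affirmative")]
print(umr_to_amr_triples(umr))
# -> [('t', 'instance', 'taste-01'), ('t', 'ARG0', 'c')]
```

Anything filtered here (aspect, modal strength, referent features) is unrecoverable by the downstream AMR-to-text model, which is precisely the pipeline weakness noted above.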

UMR-augmented prompts (supplying the UMR graph alongside the source) improve LLM-based translation for low-resource languages, yielding significant gains according to paired t-tests on chrF and BERTScore (Wein, 13 Feb 2025).
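
A UMR-augmented prompt of this kind might be assembled as below. The exact prompt wording, field labels, and placeholder text are assumptions for illustration, not the prompt template used in the cited work.

```python
# Sketch of a UMR-augmented translation prompt: the source sentence is
# paired with its UMR graph, optionally preceded by few-shot demos.
# Wording and field labels are illustrative assumptions.
def build_umr_prompt(src_lang, source, umr, demos=()):
    """demos: (source, umr, translation) triples for few-shot prompting."""
    blocks = [f"{src_lang}: {s}\nUMR: {u}\nEnglish: {t}" for s, u, t in demos]
    blocks.append(f"{src_lang}: {source}\nUMR: {umr}\nEnglish:")
    return "\n\n".join(blocks)

prompt = build_umr_prompt("Kukama", "<source sentence>",
                          "(t / taste-01 :aspect performance)")
print(prompt)
```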

5. Empirical Evaluation: Translation and Generation with UMR

UMR provides statistically significant benefits in low-resource language tasks. For machine translation from Navajo, Arápaho, and Kukama, integrating the UMR graph into GPT-4 prompts yielded systematic gains:

  • Five-shot demonstrations with UMR consistently outperformed five-shot without UMR (e.g., Kukama chrF: 43.54 vs. 40.82).
  • Zero-shot with UMR improved results for most metrics and languages.
  • BERTScore and chrF increments were often significant at p < 0.05, as determined by paired t-tests.
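
The paired t-test in the last bullet compares per-item metric scores with and without UMR. The sketch below computes the paired t statistic on illustrative numbers (not the paper's actual scores); a real analysis would then look up the p-value against the t distribution with n-1 degrees of freedom.

```python
# Sketch of the paired t statistic for with/without-UMR metric scores.
# The score lists are illustrative, not data from the cited paper.
from math import sqrt
from statistics import mean, stdev

def paired_t(scores_a, scores_b):
    """t = mean(d) / (stdev(d) / sqrt(n)) over per-item differences d."""
    d = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

with_umr = [43.5, 41.2, 44.0, 42.8]   # e.g. per-document chrF, with UMR
without_umr = [40.8, 40.1, 41.5, 41.9]  # same documents, without UMR
print(round(paired_t(with_umr, without_umr), 2))
```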

Qualitative analysis demonstrates that UMR-augmented outputs better capture fine-grained predicate–argument structure and verb sense disambiguation, especially in morphologically rich, low-resource languages (Wein, 13 Feb 2025).

Text generation from UMR has shown that pipeline and fine-tuned models outperform LLMs without structured graph input. Monolingual fine-tuning consistently surpasses multilingual training, reflecting the "curse of multilinguality" (Markle et al., 17 Feb 2025).

6. Datasets, Annotation Workflow, and Evaluation Protocols

The main annotated corpora for UMR research include:

  • UMR v1.0: 1,472 training / 213 dev / 435 test sentences across English, Chinese, and four indigenous languages (Arápaho, Navajo, Sanapaná, Kukama) (Markle et al., 17 Feb 2025).
  • UMR v2.0: 210,237 sentence-level UMRs; 29,912 for English extracted from diverse domains including Minecraft dialogues and news (Markle et al., 8 Dec 2025).
  • Annotation: Stage 0 bootstrapping followed by manual validation enables rapid production of high-quality gold-standard graphs for low-resource languages (Wein, 13 Feb 2025).

Evaluation relies on automatic metrics (SMATCH++, AnCast, BERTScore, chrF, BLEU, METEOR) and human judgments of adequacy/fluency. For low-resource languages, automatic metrics are less reliable, necessitating careful qualitative and human evaluation (Markle et al., 17 Feb 2025). Adaptive selection of in-context demonstrations based on source-side chrF is shown to boost translation performance (Wein, 13 Feb 2025).
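
The adaptive demonstration selection mentioned above can be sketched as ranking a candidate pool by source-side character n-gram overlap. Two simplifying assumptions: a plain character-bigram F-score stands in for full chrF, and the pool strings are arbitrary stand-ins rather than corpus sentences.

```python
# Sketch: pick in-context demos whose source side is most similar to
# the test source, using a character-bigram F-score as a cheap stand-in
# for chrF. Pool strings are illustrative stand-ins.
from collections import Counter

def char_ngram_f(a, b, n=2):
    """Symmetric-ish F-score over character n-gram multisets."""
    ga = Counter(a[i:i + n] for i in range(len(a) - n + 1))
    gb = Counter(b[i:i + n] for i in range(len(b) - n + 1))
    overlap = sum((ga & gb).values())
    if overlap == 0:
        return 0.0
    p, r = overlap / sum(gb.values()), overlap / sum(ga.values())
    return 2 * p * r / (p + r)

def pick_demos(source, pool, k=2):
    """Return the k pool entries most similar to the source sentence."""
    return sorted(pool, key=lambda d: char_ngram_f(source, d), reverse=True)[:k]

pool = ["shiyáázh", "shimá", "naaltsoos"]
print(pick_demos("shimá sání", pool, k=1))
# -> ['shimá']
```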

7. Applications, Limitations, and Future Research Directions

UMR supports a range of downstream applications:

  • Semantic parsing in both high- and low-resource languages.
  • Machine translation, particularly for under-served languages, via symbolic scaffolding for LLMs.
  • Explainable NLP, leveraging graph structures for summarization, question answering, and information extraction.
  • Language documentation and typology, through uniform annotation schemas (Markle et al., 8 Dec 2025, Wein, 13 Feb 2025).

Limitations include scarce UMR annotations for many languages, challenges in document-level parsing (handling coreference and discourse links), and metric reliability for highly inflected or low-resource languages. Future research objectives involve:

  • Extending UMR annotation to new domains and modalities.
  • Developing robust, end-to-end document-level parsers.
  • Improving neuro-symbolic conversion between UMR and AMR to minimize information loss.
  • Creating more accurate evaluation methodologies for low-resource outputs.
  • Multilingual joint training to realize the vision of an interlingual semantic scaffold (Markle et al., 8 Dec 2025, Markle et al., 17 Feb 2025).

UMR thus constitutes a flexible, extensible foundation for semantic representation in multilingual, low-resource, and cross-lingual NLP research, with demonstrated empirical benefits in both parsing and generation tasks (Wein, 13 Feb 2025, Markle et al., 17 Feb 2025, Markle et al., 8 Dec 2025).
