Investigate the reasons and intent behind missing or contradictory language metadata

Investigate why language metadata in Crossref journal-article records is missing, contradictory, or ambiguously recorded, and determine whether these issues reflect workflow limitations, platform constraints, or deliberate choices.

Background

The paper documents several problems that compound one another: missing language attributes, contradictions between declared and detected languages, ambiguity when multiple abstracts are merged into a single field, and affiliations that are not reliably available. Together, these problems hinder accurate language detection and attribution.
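To make the first two problem types concrete, the following minimal sketch labels a record as having missing, consistent, or contradictory language metadata. It is an illustration under stated assumptions, not the authors' method: the function name `classify_language_metadata`, the `detected` key, and the sample records are hypothetical, and the detected language is assumed to come from a separate language-identification step that is not shown here.

```python
from typing import Optional


def classify_language_metadata(declared: Optional[str], detected: Optional[str]) -> str:
    """Label one record's language metadata (hypothetical helper, for illustration)."""
    if declared is None or not declared.strip():
        return "missing"
    if detected is None or not detected.strip():
        return "undetected"
    # Compare primary subtags only, so "en-US" and "en" count as the same language.
    declared_primary = declared.strip().lower().replace("_", "-").split("-")[0]
    detected_primary = detected.strip().lower().replace("_", "-").split("-")[0]
    return "consistent" if declared_primary == detected_primary else "contradictory"


# Hypothetical records: "language" is the declared value, "detected" is the
# assumed output of an external language-identification tool.
records = [
    {"language": "en", "detected": "en"},
    {"language": None, "detected": "pt"},
    {"language": "en", "detected": "es"},
]
labels = [classify_language_metadata(r["language"], r["detected"]) for r in records]
```

A check like this only surfaces where the issues occur. It says nothing about why they occur, which is the open question here.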

Recognizing the difficulty of drawing conclusions about intent, the authors explicitly ask why these issues happen, motivating a focused investigation into causal factors.

References

This, in turn, leads to a long series of open questions: Why?

Evaluating Multilingual Metadata Quality in Crossref (2503.11853 - II et al., 14 Mar 2025) in Discussion, paragraph beginning “These problems have a compounding effect on one another”