
ModePoem Dataset for Computational Poetry Research

Updated 3 November 2025
  • ModePoem Dataset is a multilingual resource comprising over 100,000 annotated poems with detailed meter, language, and metadata information.
  • It supports diverse research applications including automatic versification, cross-modal poem generation, and benchmarking LLM-generated poetry detection.
  • The dataset is rigorously cleaned and annotated using both automated and human validation techniques, ensuring high-quality data for computational poetics.

ModePoem Dataset refers to several large-scale resources released in recent years for computational analysis and synthesis of poetry, with an emphasis on poetic meter, cross-modal inspiration, and generative benchmarking. Collections under the "ModePoem" terminology support diverse applications including automatic versification, image-inspired poem generation, and LLM-originated poetry detection.

1. Dataset Definition and Structure

ModePoem datasets are designed to facilitate empirical research in poetry analysis and generation. The primary release described in (Yousef et al., 2019) comprises over 100,000 annotated poems, each supplied with:

  • Raw poem text
  • Meter annotation (central feature)
  • Language indicator
  • Metadata (such as author, title when available)

A formal abstraction for meter classification appears as:

$$M: P \rightarrow \mathcal{C}$$

with $P$ the set of poems and $\mathcal{C}$ the set of language-specific meter classes.

Dataset statistics aggregate over multiple languages:

$$N = \sum_{l \in L} n_l$$

where $N$ is the total number of poems, $L$ is the language set, and $n_l$ is the count for language $l$.
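As a minimal sketch of the two abstractions above, the record layout and the per-language tally can be written out in Python. The field names (`text`, `meter`, `language`, `author`, `title`), the label strings, and the three sample records are illustrative assumptions, not the released schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoemRecord:
    """One annotated poem. Field names are assumed, not the official schema."""
    text: str
    meter: str            # element of the meter class set C
    language: str         # element of the language set L
    author: Optional[str] = None
    title: Optional[str] = None


def meter_of(poem: PoemRecord) -> str:
    """The mapping M: P -> C, read directly from the gold annotation."""
    return poem.meter


def language_counts(poems) -> Counter:
    """Per-language counts n_l for every language l in L."""
    return Counter(p.language for p in poems)


# Hypothetical sample records, for illustration only.
sample = [
    PoemRecord("Shall I compare thee to a summer's day?", "iambic", "en",
               author="William Shakespeare"),
    PoemRecord("Tyger Tyger, burning bright,", "trochaic", "en",
               author="William Blake"),
    PoemRecord("قفا نبك من ذكرى حبيب ومنزل", "tawil", "ar",
               author="Imru' al-Qais"),
]

counts = language_counts(sample)
total = sum(counts.values())   # N = sum over l in L of n_l
print(dict(counts), total)
```

A trained classifier would replace `meter_of` with a function that predicts the class from `text` alone; the annotation then serves as the reference label.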

2. Multilingual and Cross-Modal Coverage

ModePoem spans five languages (English, Arabic, Spanish, Russian, and Hindi), enabling comparative studies in versification. The English and Arabic subcorpora feature detailed meter annotations: English poems are labeled with classes such as iambic and trochaic, and Arabic poems with the classical bahr system.

An expanded use of the "ModePoem" name appears in (Liu et al., 2018), where it sometimes refers to "MultiM-Poem," a multimodal dataset pairing images with English poems. That release comprises:

  • MultiM-Poem: 8,292 human-curated image-poem pairs, each line-linked by experienced literature majors for cross-modal relevance
  • MultiM-Poem (Ex): 26,161 pairs generated by a deep coupled visual-poetic embedding (topological similarity in embedding space)
  • UniM-Poem: 93,265 standalone English poems filtered for length and language consistency
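To make the embedding-based pairing idea behind MultiM-Poem (Ex) concrete, the sketch below matches each image to its nearest poem in a shared vector space. It is a simplified stand-in, not the deep coupled visual-poetic embedding model itself: it assumes precomputed vectors, uses plain cosine similarity, and the `threshold` parameter and all identifiers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_by_similarity(image_vecs, poem_vecs, threshold=0.5):
    """Pair each image with its most similar poem, if similar enough.

    image_vecs and poem_vecs map ids to vectors in the same embedding space.
    The threshold is an assumed cut-off, not a value from the source.
    """
    pairs = []
    for img_id, img in image_vecs.items():
        best_id, best_score = None, -1.0
        for poem_id, poem in poem_vecs.items():
            score = cosine(img, poem)
            if score > best_score:
                best_id, best_score = poem_id, score
        if best_id is not None and best_score >= threshold:
            pairs.append((img_id, best_id, best_score))
    return pairs


# Toy two-dimensional embeddings, for illustration only.
images = {"img_sea": [1.0, 0.0], "img_forest": [0.0, 1.0]}
poems = {
    "poem_waves": [0.9, 0.1],
    "poem_pines": [0.2, 0.8],
    "poem_city": [-1.0, -1.0],
}

pairs = pair_by_similarity(images, poems)
for img_id, poem_id, score in pairs:
    print(img_id, "->", poem_id, round(score, 3))
```

In practice the vectors would come from trained image and text encoders, and an exhaustive double loop would give way to an approximate nearest-neighbor index at this scale.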

All variant datasets emphasize heterogeneous sources and text normalization.

3. Collection, Cleaning, and Annotation Methodology

ModePoem datasets aggregate poems from public literary archives, digitized anthologies, and crawling reputable literary sites. The cleaning pipeline includes:

  • Automated removal of duplicates and prose using heuristics and length-based filtering
  • Exclusion of forms judged not amenable to metrical annotation (e.g., haikus, limericks)
  • Language normalization for character encoding and structure
  • Human verification of a sample for annotation quality, especially in meter assignment
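The automated cleaning steps listed above can be sketched as a small pipeline. This is an assumption-laden illustration rather than the released tooling: the function names, the exact-match deduplication key, and the `min_lines` and `max_avg_line_len` thresholds are all hypothetical choices.

```python
import re
import unicodedata


def normalize(text):
    """Unicode-normalize, collapse whitespace, and drop blank lines."""
    text = unicodedata.normalize("NFC", text)
    lines = [re.sub(r"\s+", " ", ln).strip() for ln in text.splitlines()]
    return "\n".join(ln for ln in lines if ln)


def looks_like_verse(text, min_lines=4, max_avg_line_len=80):
    """Crude length-based heuristic separating verse from prose.

    Both thresholds are assumed values, not taken from the source.
    """
    lines = text.split("\n")
    if len(lines) < min_lines:
        return False
    avg_len = sum(len(ln) for ln in lines) / len(lines)
    return avg_len <= max_avg_line_len


def clean(raw_poems):
    """Normalize, then drop exact duplicates and texts failing the heuristic."""
    seen = set()
    kept = []
    for raw in raw_poems:
        text = normalize(raw)
        key = text.casefold()   # case-insensitive duplicate key
        if key in seen or not looks_like_verse(text):
            continue
        seen.add(key)
        kept.append(text)
    return kept


raw = [
    "Line one\nLine two\nLine three\nLine four",
    "Line one  \nLine two\n\nLine three\nLine four",   # duplicate after normalization
    "A single long paragraph of prose with no line breaks at all.",
    "Short one\nShort two\nShort three",                # too few lines
]
cleaned = clean(raw)
print(len(cleaned))
```

Exact-match deduplication misses near-duplicates such as variant editions of the same poem; catching those would need fuzzy matching, which this sketch does not attempt.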

Meter annotations combine automatic metrical analysis with manual review, with the aim of keeping annotation quality consistent across highly heterogeneous poetic forms.

For the multimodal variants (Liu et al., 2018), image-poem relevance is decided by human voting, with a preference for free verse and a clear inspiration link (objects, emotions, scenes).

4. Benchmarking and Technical Use Cases

ModePoem acts as a foundational benchmark for several computational poetry tasks:

  • Meter Classification: Models, notably RNNs operating on raw character sequences, achieve 96.38% accuracy for 16 Arabic meters and 82.31% for 4 English meters (Yousef et al., 2019).
  • Cross-modal Generation: Datasets enable adversarial training architectures to ensure poeticness and relevance in image-inspired poem generation; paired data supports discriminators for style/relevance and fine-tuning CNNs for poetic object detection (Liu et al., 2018).
  • LLM-Generated Poetry Detection: In the context of modern Chinese poetry, ModePoem (aka AIGenPoetry) contains both professional human-written and LLM-generated texts (42,400 poems total: 800 human-written, 41,600 LLM-generated) (Wang et al., 1 Sep 2025). The benchmark task is binary classification, $Y(P): P \mapsto \{0,1\}$ (human/AI source), and it reveals how hard it is to detect advanced LLMs that emulate human style.
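The 800 versus 41,600 split reported for the detection corpus has a practical consequence worth working through. The sketch below computes accuracy and per-class F1 for a trivial detector that labels every poem as AI-generated. It assumes the label convention 0 = human and 1 = AI, and it assumes evaluation on the full collection as-is; the source does not state the benchmark's actual evaluation split.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy plus precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts from the source; label convention (0 = human, 1 = AI) is assumed.
N_HUMAN, N_LLM = 800, 41_600
y_true = [0] * N_HUMAN + [1] * N_LLM
always_ai = [1] * len(y_true)       # trivial baseline: predict "AI" everywhere

ai_view = binary_metrics(y_true, always_ai, positive=1)
human_view = binary_metrics(y_true, always_ai, positive=0)

print(round(ai_view["accuracy"], 4))   # share of LLM poems: 41,600 / 42,400
print(round(ai_view["f1"], 4))
print(human_view["f1"])
```

Under these assumptions the trivial baseline is right on about 98% of poems while never identifying a single human-written one, so accuracy alone would be misleading and per-class or macro-averaged F1 is the more informative measure.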

5. Public Availability and Documentation

All ModePoem corpora are publicly released, hosted at designated repositories under open academic licenses.

Supporting documentation comprises dataset schema, annotation and usage guidelines, sample code for parsing and classification, as well as meter classification theory for each language. Distributions and frequency analyses are included for comprehensive dataset understanding.

6. Impact and Novel Research Enablement

ModePoem marks several research firsts:

  • Unified Multilingual Meter Data: First consistently annotated large-scale meter corpus across five languages, bridging significant resource gaps in computational verse analysis (Yousef et al., 2019).
  • Human-Annotated Multimodal Inspiration: First large-scale image-poem pair resource for benchmarking cross-modal poetic generation and evaluation of deep coupled visual-poetic embedding models (Liu et al., 2018).
  • LLM Benchmarking: The benchmark for distinguishing LLM-generated modern Chinese poetry sets a new challenge for stylometric detectors, revealing the unreliability of existing algorithms on poems generated to imitate human style. Style-based imitation is the hardest to detect, even for fine-tuned RoBERTa, which otherwise achieves up to 91.17% F1 (baseline) but drops sharply in out-of-domain detection and style-matched cases (Wang et al., 1 Sep 2025).

ModePoem resources enable:

  • Statistical and neural modeling of meter and style
  • Large-scale benchmarking for generative poetry models and multimodal inspiration tasks
  • Linguistic and literary study of versification systems, historical trends, and cross-lingual stylistic differences
  • Data-driven research into distinguishing human vs. AI poetic creativity

7. Summary Table of ModePoem Variants

| Dataset Variant | Num. Poems | Unique Features |
|---|---|---|
| ModePoem (meter corpus) | >100,000 | Multilingual, annotated meters, cleaned format |
| MultiM-Poem | 8,292 | Human-judged image-poem pairs |
| MultiM-Poem (Ex) | 26,161 | Embedding-based image-poem semantic pairs |
| UniM-Poem | 93,265 | Standalone English poems, poeticness-filtered |
| AIGenPoetry/ModePoem | 42,400 | Human+LLM Chinese poetry for origin detection |

ModePoem datasets underpin advances in computational poetics, meter classification, transfer learning across verse genres, and robust evaluation of AI-generated poetry and stylistic mimetics. The cross-lingual and multimodal structure, open availability, and rigorous documentation position ModePoem as a central resource for empirical and applied research in the digital humanities and natural language processing.
