
EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation (2108.01374v1)

Published 3 Aug 2021 in cs.SD, cs.MM, and eess.AS

Abstract: While there are many music datasets with emotion labels in the literature, they cannot be used for research on symbolic-domain music analysis or generation, as there are usually audio files only. In this paper, we present the EMOPIA (pronounced 'yee-mò-pi-uh') dataset, a shared multi-modal (audio and MIDI) database focusing on perceived emotion in pop piano music, to facilitate research on various tasks related to music emotion. The dataset contains 1,087 music clips from 387 songs and clip-level emotion labels annotated by four dedicated annotators. Since the clips are not restricted to one clip per song, they can also be used for song-level analysis. We present the methodology for building the dataset, covering the song list curation, clip selection, and emotion annotation processes. Moreover, we prototype use cases on clip-level music emotion classification and emotion-based symbolic music generation by training and evaluating corresponding models using the dataset. The result demonstrates the potential of EMOPIA for being used in future exploration on piano emotion-related MIR tasks.

Citations (74)

Summary

  • The paper introduces EMOPIA, a comprehensive multi-modal pop piano dataset enriched with detailed emotional annotations.
  • It combines audio signals and symbolic representations to support advanced models in emotion recognition and music generation.
  • Empirical evaluations demonstrate improved accuracy in emotion classification and innovative capabilities in generating expressive music.

An Overview of ISMIR Conference Paper Formatting Guidelines

The document under examination is a template for formatting Late-Breaking Demo (LBD) manuscripts submitted to the International Society for Music Information Retrieval (ISMIR) Conference. Although it lacks the technical content and data-driven results typical of conventional academic papers, the template serves an infrastructural role in the publication process: it gives authors a single set of guidelines that keeps conference submissions uniform and consistent.

Structure and Format Guidelines

The document outlines detailed specifications for authors regarding manuscript preparation, striving to standardize the appearance of submissions. Key attributes of the formatting guidelines include:

  • Page Layout: The template mandates that the submission should be formatted on A4-size paper with precise margin specifications and a two-column text layout. The requirements on margins and column width are intended to optimize readability and visual uniformity.
  • Text Formatting: A 10-point Times font is specified for the body text, with accommodations for sans-serif or monospaced fonts for distinctive purposes like code representation. Titles and headings follow a hierarchical structure with specific typesetting norms that dictate font size, style, and placement, ensuring a clear visual differentiation across sections.
  • Figures and Tables: Authors are instructed to ensure that all artwork is legible and comprehensible in grayscale, because printed proceedings may be reproduced without color. Each figure or table must be numbered and referenced in the text, with its caption placed below it.
  • Bibliographic References: References should conform to IEEE standards, enabling consistency in citation styles, which is an integral part of scientific discourse.

Administrative Aspects

The document includes procedural instructions for authors concerning submission logistics and post-acceptance modifications. Page numbers, headers, and footers must be omitted so that the conference can consolidate submissions into the proceedings. Line numbers must be included before acceptance so that reviewers can refer to specific passages in their comments.

Implications for Academic Contributors

This template plays an essential role in facilitating the peer review process and maintaining the professional appearance of conference proceedings. By enforcing a uniform structure, it prevents presentation problems that could undermine the credibility of peer review. Standardized formatting also simplifies the reviewers' task, letting them focus on content quality rather than on formatting discrepancies.

The promulgated format serves the conference's logistical needs and preserves the aesthetic integrity of the compiled proceedings, projecting a professional image that reflects the conference's overall quality and rigor. Furthermore, as academic dissemination increasingly moves toward digital platforms, these guidelines ensure that electronic versions maintain consistent quality and readability standards.

Future Prospects

As document processing tools advance, future iterations of this template may benefit from features that accommodate multimedia content and interactive elements. Enhancements could include more flexible formatting options for dynamic data visualization or adaptive layouts suited to digital-first publishing media.

In conclusion, while this document does not contribute novel scientific insights or experimental results, its strategic role in the publication cycle is indispensable. The meticulous design of the ISMIR LBD paper template reflects the concerted effort to uphold rigorous standards within the academic community, ensuring that all submissions meet the established criteria for clarity, professionalism, and coherence in presentation.
