
MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection (1909.09347v1)

Published 20 Sep 2019 in cs.SD, cs.LG, eess.AS, and stat.ML

Abstract: Factory machinery is prone to failure or breakdown, resulting in significant expenses for companies. Hence, there is a rising interest in machine monitoring using different sensors including microphones. In the scientific community, the emergence of public datasets has led to advancements in acoustic detection and classification of scenes and events, but there are no public datasets that focus on the sound of industrial machines under normal and anomalous operating conditions in real factory environments. In this paper, we present a new dataset of industrial machine sounds that we call a sound dataset for malfunctioning industrial machine investigation and inspection (MIMII dataset). Normal sounds were recorded for different types of industrial machines (i.e., valves, pumps, fans, and slide rails), and to resemble a real-life scenario, various anomalous sounds were recorded (e.g., contamination, leakage, rotating unbalance, and rail damage). The purpose of releasing the MIMII dataset is to assist the machine-learning and signal-processing community with their development of automated facility maintenance. The MIMII dataset is freely available for download at: https://zenodo.org/record/3384388

Citations (282)

Summary

  • The paper introduces the MIMII dataset, which fills a gap in industrial machine sound analysis by providing recordings of machines under both normal and malfunctioning conditions.
  • The dataset includes over 32,000 files from four types of machines, enabling robust benchmarking of unsupervised anomaly detection methods.
  • Its real-world recording approach and diverse range of anomalies offer practical insights for advancing predictive maintenance and industrial IoT applications.

An Expert Review on the MIMII Dataset for Industrial Machine Sound Analysis

The paper "MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection" presents an essential contribution to the field of machine anomaly detection through acoustic signals, filling a notable gap in publicly available datasets for industrial environments. The research team from Hitachi, Ltd., comprising Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, introduces the MIMII dataset, which caters specifically to the domain of machine sound analysis under both normal and anomalous conditions.

Highlights of the MIMII Dataset

The MIMII dataset provides sound recordings of four types of industrial machines: valves, pumps, fans, and slide rails. Each machine type includes seven distinct product models, reflecting common variations encountered in industrial settings. In total, the dataset comprises 26,092 sound files recorded under normal conditions and 6,065 files covering a range of anomalies such as contamination, leakage, rotating unbalance, and rail damage. The recordings were captured with a circular microphone array in real factory environments, so the audio reflects realistic operating conditions.

Methodological Approach and Experimentation

In machine anomaly detection, unsupervised learning methods are of particular interest. The authors built an autoencoder-based benchmark model that operates in an unsupervised setting, using only normal sound data during training. The model was evaluated on its ability to separate anomalous recordings from normal ones, with the Area Under the Curve (AUC) metric serving as the primary measure of success. The experiment highlighted the challenges posed by non-stationary sound signals and noise: detection performance was notably lower for valves than for industrial fans, whose acoustic patterns are more predictable.
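To make the AUC metric concrete, the following is a minimal sketch, assuming each recording has already been reduced to a single anomaly score (for an autoencoder, typically the reconstruction error). The function name `roc_auc` and all score values are illustrative assumptions, not taken from the paper. The sketch uses the pairwise interpretation of AUC: the probability that a randomly chosen anomalous clip scores higher than a randomly chosen normal clip.

```python
def roc_auc(normal_scores, anomalous_scores):
    """Pairwise AUC: fraction of (anomalous, normal) pairs in which the
    anomalous clip has the higher score; ties count as half a win."""
    wins = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous_scores) * len(normal_scores))


# Hypothetical reconstruction errors (illustrative values, not from the paper).
normal_errors = [0.10, 0.12, 0.15, 0.30]
anomalous_errors = [0.14, 0.35, 0.40]

auc = roc_auc(normal_errors, anomalous_errors)
print(f"AUC = {auc:.3f}")  # prints "AUC = 0.833"
```

An AUC of 1.0 means every anomalous clip scores above every normal clip, while 0.5 corresponds to chance. Because the metric depends only on the ranking of scores, no detection threshold has to be chosen in order to compare models.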

Implications and Future Directions

The introduction of the MIMII dataset holds both practical and theoretical significance. Practically, it provides an invaluable resource for developers of audio-based diagnostic systems aimed at predictive maintenance, thereby enhancing operational efficiency in industrial settings. Theoretically, it paves the way for further exploration into sound-based anomaly detection methodologies and the integration of multimodal sensor data for comprehensive machine condition monitoring.

As researchers work with the MIMII dataset, there is room to refine anomaly detection algorithms, particularly in handling noise and non-stationarity. Future iterations of the dataset could incorporate metadata to support anomaly classification and domain adaptation studies. Open access to the dataset via Zenodo facilitates widespread adoption and collaborative research, which may in turn lead to advances in industrial IoT applications.

Overall, the work presented in this paper establishes a foundational step towards advancing the field of acoustic machine monitoring, offering a robust platform for future research and development efforts.