- The paper presents a comprehensive TSER archive that benchmarks extrinsic regression challenges using 19 diverse datasets.
- It evaluates multiple algorithms, with ROCKET achieving the best average rank when scored by RMSE.
- The archive spans various real-world domains, offering a framework to catalyze advanced research in TSER.
Monash University, UEA, UCR Time Series Extrinsic Regression Archive
This paper presents the Monash University, UEA, UCR Time Series Extrinsic Regression Archive, a benchmarking archive intended to advance research on time series extrinsic regression (TSER). It addresses a notable gap in time series research: extrinsic regression problems have received far less attention than time series classification (TSC) and time series forecasting (TSF).
Introduction and Motivation
The authors argue that progress in machine learning depends heavily on robust benchmark datasets. The UCI Machine Learning Repository, together with the time series archives from UCR and UEA, contributed significantly to the TSC and TSF communities. However, no comparable resource existed for TSER, a challenging setting in which the goal is to predict a single continuous value that need not be a continuation of the time series itself.
TSER differs from both TSC and TSF. TSC predicts a discrete label for a series, and TSF forecasts future values of the series. TSER instead covers the more general regression problem of predicting a continuous value from a series, where that value is not necessarily tied to the most recent observations or to any future value of the series.
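The distinction between the three tasks can be illustrated with a toy example. The sketch below is purely illustrative: the series values, the threshold, the scaling factor, and the three functions are hypothetical stand-ins rather than models from the paper. What matters is the type of output each task produces.

```python
from statistics import mean

# A made-up univariate series, e.g. eight consecutive sensor readings.
series = [1.0, 1.2, 0.9, 1.4, 1.1, 1.3, 1.6, 1.5]

def classify(x):
    """TSC: map the whole series to a discrete label (hypothetical threshold)."""
    return "high" if mean(x) > 1.2 else "low"

def forecast(x, horizon=1):
    """TSF: predict future values of the same series (naive last-value forecast)."""
    return [x[-1]] * horizon

def regress_extrinsic(x):
    """TSER: map the whole series to one continuous value that is not a
    future point of the series (hypothetical scaled total)."""
    return 0.5 * sum(x)

label = classify(series)            # a category
next_values = forecast(series)      # more points of the same series
target = regress_extrinsic(series)  # a single external scalar
```

In TSC the output is a category, in TSF it is more of the same series, and in TSER it is a scalar that lives outside the series.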
TSER Dataset Archive
This paper introduces the first TSER benchmark archive, comprising 19 datasets across diverse domains: energy monitoring, environment monitoring, health monitoring, sentiment analysis, and forecasting. The datasets differ in their characteristics: some are multivariate, some contain series of unequal length, some have missing values, and the number of dimensions varies. The datasets represent practical, real-world TSER scenarios, such as predicting daily appliance energy usage from hourly sensor data.
The datasets are derived from reputable sources, including the UCI machine learning repository and organizations such as the World Health Organization, as well as purpose-built data such as synthetic flood models from Monash University researchers.
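Series with missing entries or unequal lengths usually need some preparation before a standard regressor can consume them. The sketch below shows one simple approach, assuming missing values are marked as `None`: linear interpolation followed by padding to a common length. The function names and the choice of zero padding are assumptions for illustration, not necessarily the preprocessing used for the archive.

```python
def fill_missing(series):
    """Linearly interpolate None entries; gaps at the edges copy the
    nearest observed value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled

def pad_to_length(series, length, pad_value=0.0):
    """Truncate or right-pad a series so that it has exactly `length` points."""
    return list(series[:length]) + [pad_value] * max(0, length - len(series))

# Two made-up series: one with gaps, one shorter than the other.
raw = [[1.0, None, 3.0, None], [2.0, 4.0]]
target_length = max(len(s) for s in raw)
cleaned = [pad_to_length(fill_missing(s), target_length) for s in raw]
```

After this step every series has the same length and no gaps, so the collection can be treated as a regular array.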
Baseline Evaluation and Results
The paper sets a baseline for TSER by evaluating a variety of algorithms, including:
- Functional principal component regression (FPCR) with/without B-spline.
- Support vector regression (SVR).
- Ensemble methods like random forests (RF) and XGBoost.
- Neural network-based time series classifiers adapted for regression, such as FCN, ResNet, and Inception Network, alongside ROCKET, a highly efficient classifier based on random convolutional kernels.
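To make the random convolutional kernel idea behind ROCKET more concrete, here is a minimal, simplified sketch of such a feature transform for a univariate series. It is not the reference implementation: it omits padding, handles only one dimension, and uses a small number of kernels. The function names are hypothetical. Each kernel contributes two features, the maximum output and the proportion of positive values (PPV). The resulting feature vector would typically be passed to a linear model such as ridge regression.

```python
import math
import random

def random_kernel(rng, series_length):
    """Draw one random kernel: (weights, bias, dilation)."""
    if series_length < 11:
        raise ValueError("this sketch assumes series of length >= 11")
    length = rng.choice([7, 9, 11])
    weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mean_w = sum(weights) / length
    weights = [w - mean_w for w in weights]  # centre the weights on zero
    bias = rng.uniform(-1.0, 1.0)
    # Dilation is 2**x, capped so the dilated kernel fits inside the series.
    max_exp = math.log2((series_length - 1) / (length - 1))
    dilation = int(2 ** rng.uniform(0.0, max_exp))
    return weights, bias, dilation

def kernel_features(series, kernel):
    """Slide one kernel over the series (no padding); return (max, PPV)."""
    weights, bias, dilation = kernel
    span = (len(weights) - 1) * dilation
    outputs = [
        bias + sum(w * series[start + j * dilation] for j, w in enumerate(weights))
        for start in range(len(series) - span)
    ]
    ppv = sum(1 for v in outputs if v > 0) / len(outputs)
    return max(outputs), ppv

def transform(series, kernels):
    """Concatenate the (max, PPV) pair of every kernel into one feature vector."""
    features = []
    for kernel in kernels:
        features.extend(kernel_features(series, kernel))
    return features

rng = random.Random(0)
series = [math.sin(0.3 * t) for t in range(50)]  # made-up input series
kernels = [random_kernel(rng, len(series)) for _ in range(100)]
features = transform(series, kernels)  # 2 features per kernel
```

Because the kernels are random and never trained, the only fitted component is the final linear model, which is the main reason the approach is fast.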
All of the listed methods were benchmarked using root mean squared error (RMSE), relying on standard implementations from Python libraries (e.g., Scikit-Learn). ROCKET achieved the best average rank among the algorithms. However, traditional machine learning approaches remained competitive with the adapted time series classifiers, which suggests that methods designed specifically for TSER are needed to improve performance further.
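The evaluation protocol, RMSE per dataset followed by an average rank per algorithm, can be sketched as follows. The algorithm names `A`, `B`, `C` and all scores are placeholders invented for illustration, not results from the paper. The sketch assumes every algorithm has a score on every dataset and gives tied algorithms the average of the ranks they span.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def average_ranks(scores):
    """scores: {dataset: {algorithm: rmse}}, lower is better.
    Returns {algorithm: mean rank across datasets}."""
    totals = {}
    for per_algorithm in scores.values():
        ordered = sorted(per_algorithm.items(), key=lambda item: item[1])
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over any run of algorithms tied with position i.
            while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
                j += 1
            shared_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
            for name, _ in ordered[i:j + 1]:
                totals[name] = totals.get(name, 0.0) + shared_rank
            i = j + 1
    return {name: total / len(scores) for name, total in totals.items()}

# Placeholder RMSE values for three hypothetical algorithms on three datasets.
scores = {
    "dataset_1": {"A": 1.0, "B": 2.0, "C": 3.0},
    "dataset_2": {"A": 2.0, "B": 1.0, "C": 3.0},
    "dataset_3": {"A": 1.0, "B": 1.0, "C": 2.0},
}
ranks = average_ranks(scores)
```

Ranking per dataset before averaging keeps datasets with large target scales from dominating the comparison, which a plain average of RMSE values would not.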
Implications and Future Directions
This work gives TSER research its first structured benchmark, one that reflects the complexities of real-world data in the domain. The archive supports the development of new algorithms and a better understanding of extrinsic regression problems, including those complicated by irregular series and high dimensionality.
Future directions include expanding the archive with more varied datasets, keeping the benchmark up to date with new algorithms, and improving coverage of domain-specific challenges. The authors expect the archive to stimulate growth in TSER research, with possible benefits for predictive modeling and machine learning more broadly.
The authors acknowledge contributions from various institutions and invite further data donations so that the archive can continue to grow. As TSER gains importance within time series analysis, the archive is likely to become a key resource for researchers and industry practitioners.