
Darts: User-Friendly Modern Machine Learning for Time Series (2110.03224v3)

Published 7 Oct 2021 in cs.LG and stat.CO

Abstract: We present Darts, a Python machine learning library for time series, with a focus on forecasting. Darts offers a variety of models, from classics such as ARIMA to state-of-the-art deep neural networks. The emphasis of the library is on offering modern machine learning functionalities, such as supporting multidimensional series, meta-learning on multiple series, training on large datasets, incorporating external data, ensembling models, and providing a rich support for probabilistic forecasting. At the same time, great care goes into the API design to make it user-friendly and easy to use. For instance, all models can be used using fit()/predict(), similar to scikit-learn.

Citations (188)

Summary

  • The paper introduces the Darts library, unifying classical and modern ML forecasting with a streamlined API.
  • The paper details a dedicated TimeSeries data structure that enables efficient, numpy-based operations and scalable model training.
  • The paper demonstrates robust covariate and probabilistic forecasting support, enhancing risk assessment and transfer learning potential.

An Expert Analysis of "Darts: User-Friendly Modern Machine Learning for Time Series"

The paper "Darts: User-Friendly Modern Machine Learning for Time Series" introduces a Python library aimed at fostering the adoption of ML methodologies in the domain of time series forecasting. Authored by Julien Herzen and colleagues, the paper elucidates both the concept and implementation of Darts, targeting researchers and practitioners interested in leveraging advanced ML techniques alongside classical forecasting methods.

The central merit of Darts is its integration of classical and machine-learning-based forecasting models within a unified, high-level API, making advanced techniques more accessible to practitioners. The library comprises an extensive array of models, from traditional ones like ARIMA to deep learning architectures such as RNNs, N-BEATS, and TCNs. This versatility caters to a broad spectrum of task-specific forecasting requirements, and the library additionally provides model-ensembling capabilities.

Core Contributions and Technical Specifics

One of the paper's substantial contributions is the introduction of a dedicated TimeSeries data structure. TimeSeries serves as a container for time series data, utilizing a three-dimensional xarray format to encapsulate the time index, component dimensions, and stochastic samples. This structure ensures data integrity and supports efficient, numpy-based operations, thereby streamlining model training and prediction processes.
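The three-dimensional layout can be illustrated with plain NumPy. This is a sketch of the concept only, not Darts' actual internals; the dimension names follow the paper's (time, component, sample) convention:

```python
import numpy as np

# A (time, component, sample) array: 24 time steps, 2 components,
# 100 stochastic samples -- the layout the paper describes for TimeSeries.
rng = np.random.default_rng(42)
values = rng.normal(loc=10.0, scale=2.0, size=(24, 2, 100))

# A deterministic series is the special case with a single sample.
deterministic = values.mean(axis=2, keepdims=True)
assert deterministic.shape == (24, 2, 1)

# Probabilistic series support quantile extraction across the sample
# axis, e.g. a 90% prediction interval per component and time step:
lo, hi = np.quantile(values, [0.05, 0.95], axis=2)
print(lo.shape, hi.shape)  # each is (time, component) = (24, 2)
```

Keeping samples as a third axis means deterministic and probabilistic series share one representation, and quantiles or expectations reduce to vectorized operations over a single axis.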

Key features of Darts include:

  1. Unified Forecasting API: All models expose the familiar fit and predict methods, in the style of scikit-learn. The API also allows models to be fit on multiple series, enabling comparative analyses without delving into model-specific intricacies.
  2. Scalability: Darts supports training on extensive datasets and multiple time series, addressing key scalability issues inherent in classical techniques. The library exploits GPU-accelerated computation via PyTorch to handle large data efficiently.
  3. Covariate Support: The library offers explicit functionality to incorporate past and future covariates, improving model performance by integrating additional data sources like weather forecasts or economic indicators.
  4. Probabilistic Forecasting: Darts accommodates probabilistic modeling through Monte Carlo sampling and supports a range of distributions. This capability enhances its utility in risk assessment and uncertainty management.
  5. Additional Tools: Beyond forecasting, Darts includes tools for preprocessing, hyperparameter optimization, backtesting, and metric evaluations, thus covering the workflow from model training to application.
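The API style of point 1 and the Monte Carlo flavor of point 4 can be sketched with a toy forecaster. The class below is hypothetical and not part of Darts; it only mirrors the fit()/predict() shape that Darts models share, with `num_samples > 1` producing sample paths as in probabilistic prediction:

```python
import numpy as np

class NaiveDriftForecaster:
    """Toy forecaster with a scikit-learn-style fit()/predict() API.

    Extrapolates the mean first difference of the training series;
    num_samples > 1 yields Monte Carlo sample paths by adding Gaussian
    noise scaled to the spread of the training differences.
    """

    def fit(self, series: np.ndarray) -> "NaiveDriftForecaster":
        diffs = np.diff(series)
        self.last_ = series[-1]
        self.drift_ = diffs.mean()
        self.sigma_ = diffs.std()
        return self

    def predict(self, n: int, num_samples: int = 1) -> np.ndarray:
        steps = np.arange(1, n + 1)
        mean_path = self.last_ + self.drift_ * steps  # shape (n,)
        if num_samples == 1:
            return mean_path
        rng = np.random.default_rng(0)
        noise = rng.normal(0.0, self.sigma_, size=(num_samples, n))
        return mean_path + noise  # shape (num_samples, n)

model = NaiveDriftForecaster().fit(np.array([1.0, 2.0, 3.0, 4.0]))
print(model.predict(3))                         # [5. 6. 7.]
print(model.predict(3, num_samples=200).shape)  # (200, 3)
```

Because every model answers to the same two calls, swapping ARIMA for a deep network, or a point forecast for a distribution of sample paths, requires no change to the surrounding training or evaluation code.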

Implications and Future Prospects

The inception of Darts represents a pivotal step toward democratizing advanced forecasting practices. The library's user-friendly approach and comprehensive feature set lower the barrier for practitioners to adopt sophisticated ML methodologies. By unifying diverse models under a cohesive framework, Darts provides a platform for evaluating the efficacy of machine learning versus classical models in various domains such as energy, finance, and healthcare.

The architecture of Darts also has implications for research in meta-learning and transfer learning. The possibility of reusing pretrained models across domains could significantly reduce computational costs and accelerate adoption in industry.

Conclusion

In summary, the paper effectively outlines the benefits and applications of Darts, emphasizing its role in bridging classical and modern ML forecasting techniques. By advancing the usability and functionality of time series forecasting tools, the authors pave the way for enhanced analytical capabilities across multiple sectors. Future developments, as suggested by the authors, may include support for static covariates and an expansion into collections of pretrained models, akin to initiatives seen in other ML domains such as NLP and computer vision. As such, Darts positions itself as a valuable asset in modern forecasting research and practice.