- The paper introduces Dragonfly, an open-source library that scales Bayesian optimisation for hyperparameter tuning using additive models and multi-fidelity evaluation.
- It employs a dynamic portfolio of acquisition functions and GP hyperparameter exploration to robustly navigate diverse optimisation challenges.
- Extensive experiments on synthetic benchmarks and real-world tasks, including neural architecture search and astrophysical inference, validate its efficiency in high-dimensional settings.
Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly
The paper "Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly" introduces Dragonfly, an open-source Python library designed to address the challenges associated with Bayesian Optimisation (BO) in modern settings. BO techniques are pivotal for optimising expensive black-box functions, typically used in domains where evaluations are costly and derivative information is unavailable, such as hyperparameter tuning in machine learning models.
Innovations in Dragonfly
Dragonfly provides several advancements that enhance the scalability and robustness of BO:
- Scalability:
- High-Dimensional Optimisation: The library incorporates additive models that make BO tractable in high dimensions. This mitigates the statistical and computational challenges of high-dimensional modelling by assuming the objective decomposes into a sum of lower-dimensional functions (see the additive-kernel sketch after this list).
- Multi-Fidelity Evaluation: Dragonfly can exploit cheaper approximations of an expensive black-box function, probing promising regions at low fidelity before committing significant resources to high-fidelity evaluations (a usage sketch also follows the list).
- Neural Architecture Search (NAS): The library includes tools for optimising neural network architectures by employing optimal transport metrics, allowing for efficient exploration of the architecture space.
- Robustness:
- Portfolio of Acquisition Functions: Instead of relying on a single acquisition function, Dragonfly samples from a portfolio (including commonly used acquisitions such as GP-UCB, EI, and Thompson sampling), allowing it to adapt dynamically to different optimisation landscapes (see the portfolio sketch below).
- GP Hyperparameter Exploration: The library can explore Gaussian Process (GP) hyperparameters via posterior sampling rather than committing to a single, potentially overfit maximum-likelihood estimate (also illustrated in the portfolio sketch below).
- Parallel and Asynchronous Optimisation: The library supports parallel evaluations, using a hallucination technique to account for evaluations that have been dispatched but not yet completed, thus making efficient use of computational resources (see the final sketch below).
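To make the additive decomposition concrete, here is a minimal NumPy sketch of such a kernel. It illustrates the general technique rather than Dragonfly's internal implementation, and the coordinate grouping and bandwidth are arbitrary assumptions for the example.

```python
import numpy as np

def rbf(x, y, bandwidth=1.0):
    """Squared-exponential kernel on a low-dimensional slice of the input."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * bandwidth ** 2))

def additive_kernel(x, y, groups):
    """k(x, y) = sum_j k_j(x[G_j], y[G_j]) over disjoint coordinate groups.

    Because each summand sees only a few coordinates, the GP model (and the
    acquisition optimisation) decomposes into low-dimensional pieces, which
    is what keeps high-dimensional BO statistically and computationally
    tractable under the additivity assumption.
    """
    return sum(rbf(x[g], y[g]) for g in groups)

# A 6-dimensional objective modelled as a sum of three 2-dimensional pieces.
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
x, y = np.random.randn(6), np.random.randn(6)
print(additive_kernel(x, y, groups))
```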
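For multi-fidelity optimisation, the sketch below mirrors the style of Dragonfly's documented `maximise_multifidelity_function` interface (fidelity space, domain, the fidelity at which the optimum is reported, a cost function, and a budget). The objective and cost are toy assumptions, and the exact argument order and return values should be checked against the installed version.

```python
from dragonfly import maximise_multifidelity_function

# Hypothetical problem: x is a hyperparameter; the fidelity z might be the
# fraction of the training set used for a cheap proxy evaluation.
def mf_objective(z, x):
    # In practice: train on a z-sized subset and return the validation score.
    return -((x[0] - 2.0) ** 2) * (0.5 + 0.5 * z[0])

def mf_cost(z):
    # Cost grows with fidelity, so the optimiser spends most of its budget
    # on cheap low-fidelity probes and evaluates sparingly at z = 1.
    return 0.1 + 0.9 * z[0]

fidel_space = [[0.1, 1.0]]   # domain of the fidelity parameter
domain = [[-5.0, 5.0]]       # domain of the hyperparameter
fidel_to_opt = [1.0]         # fidelity at which the optimum is reported

opt_val, opt_pt, history = maximise_multifidelity_function(
    mf_objective, fidel_space, domain, fidel_to_opt, mf_cost, 20)
print(opt_val, opt_pt)
```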
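The next self-contained sketch illustrates both robustness mechanisms on a one-dimensional toy problem: each BO iteration draws a kernel bandwidth (a crude stand-in for posterior sampling of GP hyperparameters) and an acquisition from a portfolio of GP-UCB, EI, and Thompson sampling. This is a schematic illustration of the idea, not Dragonfly's implementation, which adapts its sampling rather than choosing uniformly.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def gp_posterior(X, y, Xq, bandwidth, noise=1e-4):
    """Posterior mean/std of a zero-mean GP with a unit-amplitude RBF kernel."""
    k = lambda A, B: np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * bandwidth ** 2))
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(X, Xq)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ij->j', Ks, np.linalg.solve(K, Ks))
    return mu, np.sqrt(np.maximum(var, 1e-12))

def ucb(mu, sigma, best):
    return mu + 2.0 * sigma                          # GP-UCB with fixed beta

def ei(mu, sigma, best):
    z = (mu - best) / sigma
    return sigma * (z * norm.cdf(z) + norm.pdf(z))   # expected improvement

def thompson(mu, sigma, best):
    # Crude stand-in: independent marginal draws rather than a joint sample.
    return rng.normal(mu, sigma)

def objective(x):                                    # toy black box
    return -np.sin(3.0 * x) - x ** 2 + 0.7 * x

X = rng.uniform(-1.0, 2.0, size=3)
y = objective(X)
Xq = np.linspace(-1.0, 2.0, 200)                     # acquisition grid

for _ in range(15):
    # Sample GP hyperparameters and an acquisition instead of fixing them.
    bandwidth = rng.choice([0.1, 0.3, 1.0])          # stand-in for posterior sampling
    acquire = [ucb, ei, thompson][rng.integers(3)]
    mu, sigma = gp_posterior(X, y, Xq, bandwidth)
    x_next = Xq[np.argmax(acquire(mu, sigma, y.max()))]
    X, y = np.append(X, x_next), np.append(y, objective(x_next))

print("best x:", X[np.argmax(y)], "best value:", y.max())
```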
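Finally, a minimal sketch of the hallucination idea for asynchronous parallelism, again illustrative rather than Dragonfly's code: points that have been dispatched but not yet evaluated are temporarily imputed with the GP posterior mean, which shrinks the posterior variance around them and steers the next worker away from regions already in flight.

```python
import numpy as np

def posterior(X, y, Xq, bandwidth=0.3, noise=1e-4):
    """Posterior mean/std of a zero-mean GP with a unit-amplitude RBF kernel."""
    k = lambda A, B: np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * bandwidth ** 2))
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(X, Xq)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ij->j', Ks, np.linalg.solve(K, Ks))
    return mu, np.sqrt(np.maximum(var, 1e-12))

def next_point(X, y, pending, Xq):
    """Pick the next query while `pending` points are still being evaluated.

    Hallucination: pretend each pending point already returned its posterior
    mean, refit on the augmented data, and maximise a UCB acquisition.
    """
    if len(pending) > 0:
        mu_pending, _ = posterior(X, y, pending)
        X = np.concatenate([X, pending])
        y = np.concatenate([y, mu_pending])    # hallucinated observations
    mu, sigma = posterior(X, y, Xq)
    return Xq[np.argmax(mu + 2.0 * sigma)]

X = np.array([0.0, 0.5, 1.2])
y = np.sin(3.0 * X)
pending = np.array([0.8])                      # dispatched, result not back yet
print(next_point(X, y, pending, np.linspace(0.0, 2.0, 200)))
```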
Experimental Validation
The performance of Dragonfly is validated through extensive experimentation:
- Synthetic Benchmarks: Dragonfly shows competitive performance across a range of synthetic benchmarks, including high-dimensional extensions and noisy settings, consistently performing as well as or better than other BO and non-BO optimisation methods.
- Applied Problems: The application of Dragonfly in astrophysical Bayesian inference and hyperparameter tuning tasks for machine learning models demonstrates its efficacy in real-world scenarios. The experiments underscore Dragonfly's ability to adapt to various fidelity levels and constraints.
- Neural Architecture Search: Dragonfly's NAS capabilities are also evaluated, efficiently optimising neural architectures in constrained search spaces and outperforming baseline methods thanks to its integrated multi-fidelity approach.
Implications and Future Directions
The development of Dragonfly has significant implications for the field of optimisation as it pertains to hyperparameter tuning and neural architecture design. Its methodologies for scalable multi-fidelity optimisation are particularly relevant in the era of deep learning, where model selection and parameter optimisation are critical for deploying performant models.
Moving forward, the strategies and insights from Dragonfly could have broader applications in other domains that require optimising complex functions under computational constraints. Future work could explore the integration of multi-objective optimisation frameworks and enhanced probabilistic models to further broaden the library's applicability.
In summary, Dragonfly represents a pragmatic and powerful approach to overcoming the limitations traditionally associated with BO, providing a flexible, scalable, and robust toolkit for a diverse array of optimisation tasks in modern scientific and engineering applications.