- The paper surveys hyper-parameter optimization (HPO) strategies such as AutoNE and JITuNE that improve the scalability of graph model optimization.
- It details neural architecture search (NAS) methods tailored to GNNs, categorizing their search spaces, search strategies (including reinforcement learning and differentiable approaches), and performance estimation techniques.
- It reviews practical libraries such as AutoGL that streamline AutoML on graphs and pave the way for future research.
Automated Machine Learning on Graphs: A Survey
The paper "Automated Machine Learning on Graphs: A Survey" presents a comprehensive examination of the landscape of automated machine learning (AutoML) applied to graph-based data. Given the increasing complexity and diversity of graph-related tasks in both academic and industrial settings, this paper endeavors to systematize the burgeoning research occurring at the intersection of graph machine learning and AutoML. The survey specifically emphasizes two prevalent areas: hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph learning models.
Key Contributions
- HPO for Graph Machine Learning: The paper discusses strategies for improving the scalability of HPO in the graph learning domain, a crucial concern given the resource-intensive nature of processing massive graphs. Notable methods include AutoNE, which evaluates configurations on sampled subgraph proxies for efficiency, and JITuNE, which uses graph coarsening to enable a hierarchical approach to HPO (a minimal sketch of the proxy-based idea follows this list). The survey highlights how these approaches mitigate the computational cost traditionally associated with extensive HPO.
- NAS for Graph Machine Learning: The distinctive structure of graph neural networks (GNNs) necessitates NAS considerations that differ from those for other neural architectures. The paper categorizes NAS efforts along three axes: search space, search strategy, and performance estimation strategy. It covers search strategies based on reinforcement learning, differentiable architecture search (akin to DARTS) for gradient-based optimization, and evolutionary algorithms (a DARTS-style sketch for GNN layers follows this list). The search space is further divided into micro, macro, pooling-method, and hyper-parameter components, offering a fine-grained taxonomy for researchers.
- Libraries for AutoML on Graphs: Addressing the need for accessible, standardized tooling, the paper reviews libraries such as PyTorch Geometric and AutoGL, highlighting the latter for its comprehensive support of both HPO and NAS on graph data. AutoGL, the first dedicated library for automated graph learning, provides an important resource for deploying and testing new algorithms efficiently (a hedged usage sketch follows this list).
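To make the proxy-based HPO idea concrete, the following is a minimal, hypothetical sketch in the spirit of AutoNE: each candidate configuration is scored on a small sampled subgraph rather than the full graph, and only the winning configuration is re-trained at full scale. The sampler, the toy surrogate objective, and all function names are illustrative assumptions, not AutoNE's actual implementation.

```python
# Hypothetical sketch: random-search HPO evaluated on a subgraph proxy.
# train_and_score() is a stand-in for training a real embedding/GNN model.
import random
import networkx as nx

def sample_subgraph(G: nx.Graph, n_nodes: int, seed: int = 0) -> nx.Graph:
    """Induce a small proxy subgraph via BFS from a random start node."""
    random.seed(seed)
    start = random.choice(list(G.nodes))
    nodes = list(nx.bfs_tree(G, start))[:n_nodes]
    return G.subgraph(nodes).copy()

def train_and_score(G: nx.Graph, config: dict) -> float:
    """Placeholder objective: in practice, train a model on G with these
    hyper-parameters and return validation performance."""
    # Toy surrogate so the sketch runs end to end.
    return 1.0 / (1.0 + abs(config["lr"] - 0.01)) + 0.001 * config["dim"]

def proxy_random_search(G: nx.Graph, n_trials: int = 20, proxy_size: int = 50) -> dict:
    proxy = sample_subgraph(G, proxy_size)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {"lr": 10 ** random.uniform(-4, -1), "dim": random.choice([32, 64, 128])}
        score = train_and_score(proxy, cfg)   # cheap evaluation on the proxy
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg                           # re-train on the full graph afterwards

if __name__ == "__main__":
    G = nx.barabasi_albert_graph(1000, 3)
    print(proxy_random_search(G))
```

AutoNE additionally transfers knowledge learned on the proxies to the full graph; the sketch above omits that step and shows only the cheap-evaluation idea.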
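For the micro search space, a common differentiable formulation mixes candidate message-passing operators with softmax-weighted architecture parameters, in the style of DARTS. The sketch below illustrates this with standard PyTorch Geometric layers; the class names and the two-layer setup are illustrative assumptions, not taken from any specific NAS paper surveyed.

```python
# DARTS-style mixed operation over candidate GNN aggregation layers.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, SAGEConv, GATConv

class MixedGNNLayer(nn.Module):
    """One searchable layer: a soft mixture over candidate message-passing ops."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.ops = nn.ModuleList([
            GCNConv(in_dim, out_dim),
            SAGEConv(in_dim, out_dim),
            GATConv(in_dim, out_dim, heads=1),
        ])
        # One architecture parameter per candidate operator.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x, edge_index):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x, edge_index) for w, op in zip(weights, self.ops))

class SearchableGNN(nn.Module):
    """Tiny supernet with two searchable layers for node classification."""
    def __init__(self, in_dim: int, hidden: int, n_classes: int):
        super().__init__()
        self.layer1 = MixedGNNLayer(in_dim, hidden)
        self.layer2 = MixedGNNLayer(hidden, n_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.layer1(x, edge_index))
        return self.layer2(x, edge_index)

# After (typically bi-level) training, the discrete architecture is read off by
# taking the argmax over each layer's alpha, e.g. model.layer1.alpha.argmax().
```

In full DARTS-style search, the alpha parameters are optimized on validation data while the operator weights are optimized on training data; the sketch shows only the mixed-operation mechanism.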
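For context, the following is a hedged sketch of AutoGL's solver workflow as recalled from the AutoGL README; the module paths, class names, and parameters are assumptions and may not match the current release of the library.

```python
# Hedged sketch of an AutoGL node-classification run (names are assumptions).
from autogl.datasets import build_dataset_from_name
from autogl.solver import AutoNodeClassifier

cora = build_dataset_from_name("cora")

# The solver bundles candidate GNN backbones, HPO, and ensembling.
solver = AutoNodeClassifier(
    graph_models=["gcn", "gat"],   # candidate architectures
    hpo_module="anneal",           # hyper-parameter search strategy
    ensemble_module="voting",
)
solver.fit(cora, time_limit=3600)  # wall-clock budget in seconds
solver.get_leaderboard().show()    # per-trial results
```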
Implications and Future Directions
This survey identifies a clear need for continued research and innovation in AutoML on graphs, given the unique challenges and opportunities that graph data presents. Practically, this line of work promises more efficient deployment of graph models in real-world applications, from social networks to bioinformatics, where optimizing the learning process is both computationally demanding and critical for performance.
The survey also outlines future directions, suggesting that hardware-aware models could substantially improve scalability on large-scale graph datasets, and calling for more robust evaluation protocols and benchmark datasets to improve the comparability and generalizability of results across graph learning tasks and methodologies.
In summary, this survey consolidates existing research on AutoML for graph machine learning and charts directions for more sophisticated, efficient, and widely applicable methods. The insights presented are well positioned to fuel further research toward fully realizing the potential of automated graph learning.