
Open Graph Benchmark: Datasets for Machine Learning on Graphs (2005.00687v7)

Published 2 May 2020 in cs.LG, cs.SI, and stat.ML

Abstract: We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph ML research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics. In addition to building the datasets, we also perform extensive benchmark experiments for each dataset. Our experiments suggest that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future research. Finally, OGB provides an automated end-to-end graph ML pipeline that simplifies and standardizes the process of graph data loading, experimental setup, and model evaluation. OGB will be regularly updated and welcomes inputs from the community. OGB datasets as well as data loaders, evaluation scripts, baseline code, and leaderboards are publicly available at https://ogb.stanford.edu .

Authors (8)
  1. Weihua Hu (24 papers)
  2. Matthias Fey (21 papers)
  3. Marinka Zitnik (79 papers)
  4. Yuxiao Dong (119 papers)
  5. Hongyu Ren (31 papers)
  6. Bowen Liu (63 papers)
  7. Michele Catasta (9 papers)
  8. Jure Leskovec (233 papers)
Citations (2,431)

Summary

The paper presents the Open Graph Benchmark (OGB), a collection of graph datasets designed to support rigorous and reproducible research in graph ML. The datasets in OGB are curated to represent various domains and ML tasks, and each dataset includes a unified evaluation protocol with realistic data splits and meaningful evaluation metrics. OGB aims to address several issues in current graph benchmarks, such as limited dataset sizes, lack of standard splitting procedures, and the gap between academic benchmarks and real-world applications.

Core Contributions

Diversity and Scale of Datasets:

OGB comprises a diverse set of datasets covering a range of domains, including social networks, information networks, biological networks, molecular graphs, source-code ASTs, and knowledge graphs. The datasets also span scales, from small molecular graphs up to graphs with over 100 million nodes (ogbn-papers100M). This diversity allows researchers to develop and evaluate models that generalize across different types of graph data.

Unified Evaluation Protocols:

For each dataset, OGB provides pre-defined training, validation, and test splits along with standardized evaluation metrics. This standardization addresses a common challenge in graph ML research, where inconsistent dataset splits and evaluation protocols make it difficult to compare results across studies. By adhering to realistic application-specific splitting procedures, OGB ensures that reported performance metrics more closely reflect real-world scenarios.
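
As a concrete illustration, the sketch below loads a node-level dataset and retrieves its fixed splits via the `ogb` package; `ogbn-arxiv` is used here as an example, and exact tensor shapes depend on the dataset.

```python
# Minimal sketch: loading an OGB node-level dataset and its pre-defined splits.
# Assumes `pip install ogb torch-geometric`; ogbn-arxiv is used as an example.
from ogb.nodeproppred import PygNodePropPredDataset

dataset = PygNodePropPredDataset(name="ogbn-arxiv")
data = dataset[0]  # a single PyTorch Geometric Data object

# Splits are fixed and application-specific (time-based for ogbn-arxiv),
# so every study evaluates on exactly the same nodes.
split_idx = dataset.get_idx_split()
train_idx, valid_idx, test_idx = (
    split_idx["train"], split_idx["valid"], split_idx["test"]
)
```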

Automated ML Pipeline:

OGB presents an end-to-end graph ML pipeline that simplifies dataset handling, experimental setup, and model evaluation. The pipeline includes automated data loaders, evaluators, and leaderboards. The OGB data loaders and evaluators are compatible with popular graph ML frameworks such as PyTorch Geometric and Deep Graph Library, which facilitates seamless integration into existing workflows.
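
The evaluator side of the pipeline works similarly. Below is a hedged sketch, assuming `ogbn-arxiv` (whose standardized metric is accuracy) and using dummy predictions purely for illustration:

```python
# Sketch of OGB's per-dataset Evaluator; the metric is fixed by the dataset.
import torch
from ogb.nodeproppred import Evaluator

evaluator = Evaluator(name="ogbn-arxiv")
print(evaluator.expected_input_format)  # documents the required input dict

# Dummy labels/predictions purely for illustration (ogbn-arxiv has 40 classes).
y_true = torch.randint(0, 40, (100, 1))
y_pred = torch.randint(0, 40, (100, 1))
print(evaluator.eval({"y_true": y_true, "y_pred": y_pred}))  # e.g. {'acc': ...}
```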

Dataset Description and Benchmark Results

Node Property Prediction:

The paper includes five datasets for node property prediction, such as ogbn-products (Amazon product co-purchasing network) and ogbn-arxiv (paper citation network). The datasets vary in scale and domain, and represent different challenges in terms of generalization and scalability. Benchmark results demonstrate that traditional GNN models like GCN and GraphSAGE perform well, but there is a significant gap between training and test performance, particularly under the realistic time-based splits.
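
For reference, a minimal two-layer GCN of the kind used in these baselines might look as follows in PyTorch Geometric; the hidden size and dropout rate here are illustrative, not the paper's exact configuration.

```python
# Minimal 2-layer GCN baseline sketch (illustrative hyperparameters).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)
        return self.conv2(x, edge_index)

# ogbn-arxiv: 128-dimensional node features, 40 subject-area classes.
model = GCN(in_dim=128, hidden_dim=256, num_classes=40)
```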

Link Property Prediction:

Six datasets are included for link property prediction, such as ogbl-ppa (protein-protein association network) and ogbl-citation2 (paper citation network). These datasets pose challenges related to dense graphs and out-of-distribution (OOD) generalization. Results show that methods incorporating positional information, such as Matrix Factorization and Node2Vec, often outperform GNNs in terms of generalization to unseen links.
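
Link-level datasets expose the same evaluator interface. The sketch below assumes `ogbl-ppa`, whose standardized metric is Hits@100, and uses random scores purely for illustration:

```python
# Sketch of link-level evaluation: scores for positive vs. negative edges.
import torch
from ogb.linkproppred import Evaluator

evaluator = Evaluator(name="ogbl-ppa")  # metric for ogbl-ppa: Hits@100

# Random scores purely for illustration; a real model scores each edge.
y_pred_pos = torch.rand(1000)  # scores of held-out true edges
y_pred_neg = torch.rand(1000)  # scores of sampled non-edges
print(evaluator.eval({"y_pred_pos": y_pred_pos, "y_pred_neg": y_pred_neg}))
```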

Graph Property Prediction:

Four datasets are provided for graph property prediction, including ogbg-molhiv (molecular graphs) and ogbg-code2 (ASTs of source code). These datasets require models to predict properties at the graph level, which is essential for applications in chemistry and software engineering. Benchmark results indicate that models leveraging additional node and edge features, like GIN with virtual nodes, perform best.
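
Graph-level datasets plug into standard mini-batching. A short sketch, assuming `ogbg-molhiv` and PyTorch Geometric's `DataLoader` (the batch size is illustrative):

```python
# Sketch: batching molecular graphs from ogbg-molhiv for graph-level training.
from ogb.graphproppred import PygGraphPropPredDataset
from torch_geometric.loader import DataLoader

dataset = PygGraphPropPredDataset(name="ogbg-molhiv")
split_idx = dataset.get_idx_split()  # scaffold split for molecules

train_loader = DataLoader(dataset[split_idx["train"]],
                          batch_size=32, shuffle=True)
for batch in train_loader:
    # batch.x: atom features, batch.edge_attr: bond features,
    # batch.y: one binary HIV-activity label per molecule.
    break
```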

Implications and Future Directions

Scalability:

OGB provides datasets that push the limits of graph ML models in terms of scalability. The inclusion of large-scale graphs such as ogbn-papers100M encourages the development of scalable GNNs and mini-batch training techniques capable of handling web-scale data.
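
One common way to meet this scalability challenge is neighbor-sampled mini-batch training. The sketch below uses PyTorch Geometric's `NeighborLoader` (a PyG utility, not part of OGB itself) on `ogbn-products` as a smaller stand-in, with illustrative fan-outs:

```python
# Sketch of neighbor-sampled mini-batch training for large OGB graphs.
from ogb.nodeproppred import PygNodePropPredDataset
from torch_geometric.loader import NeighborLoader

dataset = PygNodePropPredDataset(name="ogbn-products")
data = dataset[0]
split_idx = dataset.get_idx_split()

loader = NeighborLoader(
    data,
    num_neighbors=[15, 10],          # sampling fan-out per GNN layer
    batch_size=1024,
    input_nodes=split_idx["train"],  # sample subgraphs around training nodes
)
for batch in loader:
    # Each batch is a sampled subgraph small enough for a single GPU.
    break
```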

Generalization:

The realistic data splits present a significant challenge for out-of-distribution generalization. This is exemplified by the large generalization gaps observed in node and link prediction tasks. Future research can explore new architectures and training methods that improve OOD performance.

Heterogeneous Graphs:

OGB includes heterogeneous graphs like ogbn-mag, which contain multiple types of nodes and edges. This necessitates the development of more sophisticated GNN models that can handle heterogeneous graph structures.
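
As a quick look at what "heterogeneous" means in practice, the sketch below loads `ogbn-mag`, where attributes are dictionaries keyed by node and edge type; attribute names follow the OGB loader's conventions, so treat this as a hedged sketch.

```python
# Sketch: the heterogeneous structure of ogbn-mag as exposed by the loader.
from ogb.nodeproppred import PygNodePropPredDataset

dataset = PygNodePropPredDataset(name="ogbn-mag")
data = dataset[0]

print(data.num_nodes_dict)         # node counts per type, e.g. 'paper', 'author'
print(list(data.edge_index_dict))  # typed relations such as
                                   # ('author', 'writes', 'paper')
```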

Conclusion

The Open Graph Benchmark provides a comprehensive suite of datasets that address key limitations in current graph ML research. By offering diverse, realistic benchmarks with standardized evaluation protocols, OGB aims to facilitate reproducible and impactful research. Its open-source release, coupled with the automated ML pipeline, lowers the barrier to entry and accelerates progress in the field. As OGB evolves with community input, it is expected to keep driving advances in scalable and robust graph machine learning methods.