- The paper introduces a framework that leverages multi-process, multi-GPU, and distributed parallelism for scalable knowledge graph embedding training.
- The paper presents optimization techniques like METIS data partitioning and joint negative sampling to minimize communication overhead and enhance efficiency.
- The paper demonstrates significant speedups, computing embeddings for the 86-million-node Freebase graph in 100 minutes on a single 8-GPU machine and in 30 minutes on a 4-machine cluster.
DGL-KE: Training Knowledge Graph Embeddings at Scale
The paper presents DGL-KE, an open-source package designed for efficient computation of Knowledge Graph Embeddings (KGEs) on large-scale knowledge graphs. This work addresses the computational challenges posed by growing knowledge graphs, which can encompass millions of nodes and billions of edges. Through a series of optimizations, DGL-KE improves data locality and operation efficiency, reduces communication overhead, and overlaps computation with data movement.
Technical Contributions
- Scalability Enhancements: DGL-KE offers multi-process, multi-GPU, and distributed parallelism, allowing it to scale to graphs with tens of millions of nodes and hundreds of millions of edges. The package harnesses the computational power of CPUs and GPUs effectively, yielding substantial speedups over existing methods.
- Optimization Techniques (minimal Python sketches of each idea follow this list):
- Data Partitioning: METIS partitioning assigns entities to machines so that most triples fall entirely within one partition, minimizing cross-machine data transfers and improving the efficiency of distributed training.
- Negative Sampling Strategies: Joint negative sampling corrupts a whole chunk of positive triples with one shared set of negative entities, reducing the number of unique entity embeddings accessed per batch, turning scoring into dense tensor operations, and decreasing CPU-GPU data movement.
- Relation Partitioning: For multi-GPU training, relations are partitioned across GPUs so that each relation's embedding stays pinned on a single GPU, sharply reducing CPU-GPU communication; this is especially beneficial for models such as TransR, whose relation-specific projection matrices are large.
- Asynchronous Updates: Gradient updates to the CPU-resident entity embeddings are overlapped with the GPU's computation of the next mini-batch, keeping GPUs busy and increasing training throughput.
- Performance Evaluation: Benchmarks on knowledge graphs with up to 86 million nodes are impressive. DGL-KE computes embeddings for the full Freebase graph in 100 minutes on a single EC2 instance with 8 GPUs and in 30 minutes on a 4-machine EC2 cluster, a 2x to 5x speedup over competing tools such as GraphVite and PyTorch-BigGraph.
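To make the data-partitioning point concrete, here is a minimal Python sketch of the quantity a partitioner is trying to minimize: the fraction of triples whose head and tail land on different machines, each of which forces a cross-machine embedding fetch. The graph sizes, array names, and random baseline are illustrative assumptions rather than DGL-KE code; a METIS assignment computed on the real graph would simply replace `random_parts`.

```python
import numpy as np

def cross_partition_fraction(heads, tails, part_of_entity):
    """Fraction of triples whose head and tail sit on different machines.

    Each such triple forces at least one cross-machine embedding fetch in
    distributed training, so a good partitioner (METIS in DGL-KE) drives
    this number down relative to random assignment.
    """
    return float(np.mean(part_of_entity[heads] != part_of_entity[tails]))

# Hypothetical setup: 100k entities, 500k triples, 4 machines.
rng = np.random.default_rng(0)
num_entities, num_triples, num_parts = 100_000, 500_000, 4
heads = rng.integers(0, num_entities, num_triples)
tails = rng.integers(0, num_entities, num_triples)

# Random assignment leaves roughly (num_parts - 1) / num_parts of the
# triples crossing machines; a METIS assignment would replace this array.
random_parts = rng.integers(0, num_parts, num_entities)
print("cut fraction (random):", cross_partition_fraction(heads, tails, random_parts))
```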
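The joint-negative-sampling idea can be sketched in a few lines of NumPy. This is not DGL-KE's implementation: the TransE-style score, the chunk and negative-pool sizes, and all names are assumptions chosen for illustration. The key point is that each chunk reads roughly `chunk_size + neg_size` unique entity rows instead of `chunk_size * neg_size`, and scoring becomes one dense block computation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, dim = 10_000, 16
entity_emb = rng.normal(size=(num_entities, dim)).astype(np.float32)

def joint_negative_scores(head_ids, rel_emb, chunk_size=256, neg_size=256):
    """Score chunks of positive triples against a *shared* pool of negatives.

    Independent sampling would touch up to len(head_ids) * neg_size unique
    entity embeddings; joint sampling touches ~chunk_size + neg_size rows
    per chunk and computes a dense (chunk_size, neg_size) score block.
    """
    scores = []
    for start in range(0, len(head_ids), chunk_size):
        h = entity_emb[head_ids[start:start + chunk_size]]   # (b, d)
        r = rel_emb[start:start + chunk_size]                 # (b, d)
        neg_ids = rng.integers(0, num_entities, size=neg_size)
        neg_t = entity_emb[neg_ids]                           # (k, d) shared pool
        # TransE-style score -||h + r - t'|| for every (positive, negative) pair.
        diff = (h + r)[:, None, :] - neg_t[None, :, :]        # (b, k, d)
        scores.append(-np.linalg.norm(diff, axis=-1))         # (b, k)
    return np.concatenate(scores)
```

Called with, for example, `joint_negative_scores(triple_heads, relation_emb[triple_rels])` (hypothetical arrays), the whole negative block for a chunk is produced from a single shared pool of `neg_size` entity rows.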
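For relation partitioning, one plausible scheme is a greedy load-balanced mapping of relations to GPUs, so each relation's parameters live on exactly one device. This is a sketch under assumptions; DGL-KE's actual assignment logic may differ in detail.

```python
import heapq
from collections import Counter

def partition_relations(relation_ids, num_gpus):
    """Greedily assign each relation to the currently least-loaded GPU.

    Pinning a relation to a single GPU means its embedding (or, for TransR,
    its projection matrix) never moves between CPU and GPU; only the entity
    embeddings needed by the current batch are transferred.
    """
    load = [(0, gpu) for gpu in range(num_gpus)]   # (triples assigned, gpu id)
    heapq.heapify(load)
    assignment = {}
    for rel, cnt in Counter(relation_ids).most_common():
        n, gpu = heapq.heappop(load)
        assignment[rel] = gpu
        heapq.heappush(load, (n + cnt, gpu))
    return assignment

# Hypothetical usage: relation column of a triple array, 8 GPUs.
# gpu_of_relation = partition_relations(triples[:, 1], num_gpus=8)
```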
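The asynchronous-update idea is easiest to see as a toy producer/consumer loop: the main loop hands off sparse gradients and immediately starts the next batch, while a background thread applies them to the CPU-resident entity table. This is an illustrative sketch, not DGL-KE's code; the SGD rule, sizes, and names are assumptions, and real asynchronous updates trade slightly stale embeddings for throughput.

```python
import queue
import threading
import numpy as np

entity_emb = np.zeros((10_000, 16), dtype=np.float32)   # CPU-resident table
update_q: queue.Queue = queue.Queue()

def updater(lr=0.1):
    """Apply sparse gradient updates in the background."""
    while True:
        item = update_q.get()
        if item is None:                               # sentinel: training finished
            break
        ids, grads = item
        np.subtract.at(entity_emb, ids, lr * grads)    # sparse SGD step
        update_q.task_done()

worker = threading.Thread(target=updater, daemon=True)
worker.start()

rng = np.random.default_rng(0)
for step in range(100):
    # ... forward/backward for the current batch (on GPU in DGL-KE) ...
    ids = rng.integers(0, 10_000, size=256)
    grads = rng.normal(size=(256, 16)).astype(np.float32)
    update_q.put((ids, grads))   # hand off the write, don't wait for it
    # the next batch starts immediately, overlapping with the update above

update_q.put(None)
worker.join()
```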
Empirical Validation
Experiments demonstrate that DGL-KE reaches embedding quality comparable to existing approaches in a fraction of the time. The use of joint negative sampling and improved graph partitioning yields significant efficiency gains without compromising accuracy. The evaluations span a range of hardware setups, showing substantial scalability improvements, especially in distributed environments.
Implications and Future Directions
DGL-KE demonstrates a design that effectively leverages modern computing architectures, highlighting the necessity of scalable solutions for handling increasingly large knowledge graphs.
Future work could explore further optimizations in negative sampling techniques or adapt these optimizations to emerging hardware architectures. Additionally, expanding the library's model support could make it a more versatile tool in the knowledge graph embedding ecosystem.
Conclusion
DGL-KE represents a well-engineered solution for training knowledge graph embeddings efficiently at scale. Its improvements to training efficiency and scalability make it a valuable resource for researchers and practitioners working with large-scale graph-based data applications.