- The paper introduces GraphLab, a framework that utilizes a graph-based data model to effectively capture computational dependencies in iterative ML tasks.
- The paper presents multiple consistency models and sophisticated scheduling primitives that enable significant parallel speedups on multi-core systems.
- The paper validates its approach through experiments on ML algorithms like belief propagation and Gibbs sampling, achieving speedups of up to 15x.
GraphLab: A New Framework For Parallel Machine Learning
Introduction
The paper "GraphLab: A New Framework For Parallel Machine Learning" introduces GraphLab, a novel framework designed to address limitations in existing parallel computing models such as MapReduce, especially in the context of ML algorithms. The paper highlights the critical challenges posed by low-level tools like MPI and Pthreads, which require complex synchronization, and high-level abstractions like MapReduce, which are often inexpressive for ML tasks that include computational dependencies and iterative computations. GraphLab is proposed as a solution that strikes a balance between expressiveness and ease of use while ensuring data consistency and achieving high parallel performance.
Key Contributions
The major contributions of the paper include:
- Graph-Based Data Model: GraphLab uses a directed data graph to represent both the computational structure and the program state, allowing the encapsulation of sparse computational dependencies common in ML algorithms.
- Data Consistency Models: The framework provides several consistency models—full consistency, edge consistency, and vertex consistency—each offering a different trade-off between parallelism and correctness guarantees.
- Sophisticated Scheduling Mechanisms: GraphLab provides a range of schedulers, including synchronous, round-robin, and prioritized schedules, and introduces a novel set scheduler that optimizes parallel execution by exploiting graph colorings.
- Aggregation Framework: A sync mechanism is included for efficient management of global state through operations analogous to fold and reduce in functional programming.
- Experimental Evaluation: The paper demonstrates GraphLab's practical value on five real-world ML algorithms—belief propagation, Gibbs sampling, Co-EM, Lasso, and compressed sensing—showcasing its performance advantages on multi-core systems.
Detailed Framework
Data Model: GraphLab's data model comprises a directed data graph G and a shared data table (SDT), allowing arbitrary user-defined data to be associated with vertices, edges, and global entries. This structure is particularly well suited to representing the pairwise Markov Random Fields (MRFs) used in belief propagation (BP).
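As a concrete illustration, the following Python sketch mimics this data model with a dictionary-backed graph and a shared table; the names (DataGraph, shared_data) and the specific entries are illustrative stand-ins, not the actual GraphLab C++ API.

```python
# Minimal sketch of GraphLab's data model: a directed data graph whose vertices and
# edges carry arbitrary user data, plus a shared data table (SDT) holding global
# quantities. Names are illustrative, not the real GraphLab C++ API.
class DataGraph:
    def __init__(self):
        self.vertex_data = {}     # vertex id -> arbitrary data (e.g., beliefs, labels)
        self.edge_data = {}       # (src, dst) -> arbitrary data (e.g., BP messages)
        self.out_neighbors = {}   # vertex id -> list of successor ids
        self.in_neighbors = {}    # vertex id -> list of predecessor ids

    def add_vertex(self, v, data):
        self.vertex_data[v] = data
        self.out_neighbors.setdefault(v, [])
        self.in_neighbors.setdefault(v, [])

    def add_edge(self, src, dst, data):
        self.edge_data[(src, dst)] = data
        self.out_neighbors[src].append(dst)
        self.in_neighbors[dst].append(src)

# Shared data table: read by update functions, written by sync operations.
shared_data = {"damping": 0.1, "termination_bound": 1e-5}
```

In the BP case study, for example, each vertex would carry a node potential and a belief, while each directed edge would carry a message.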
Update Functions and Sync Mechanism: Update functions in GraphLab read and modify data within a localized scope of the graph (a vertex, its adjacent edges, and its neighbors), enabling fine-grained parallelism. The sync mechanism aggregates global statistics, supporting tasks such as convergence assessment and global parameter updates.
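A rough Python sketch of these two primitives, reusing the DataGraph above; the simple numeric vertex data and the averaging computation are assumptions made purely for illustration.

```python
# Sketch of the two computation primitives. An update function touches only its scope
# (the vertex, its adjacent edges, and its neighbors) and may schedule further work;
# a sync operation folds over all vertices to maintain a global value in the SDT.
def update_vertex(v, graph, shared, scheduler):
    """Recompute v's data from its in-neighbors; reschedule out-neighbors if v changed a lot."""
    old = graph.vertex_data[v]
    incoming = [graph.vertex_data[u] for u in graph.in_neighbors[v]]
    new = sum(incoming) / len(incoming) if incoming else old   # stand-in local computation
    graph.vertex_data[v] = new
    if abs(new - old) > shared["termination_bound"]:
        for u in graph.out_neighbors[v]:
            scheduler.add_task((u, update_vertex))              # dynamic scheduling

def sync(graph, shared, key, fold, apply_fn, zero):
    """Fold over all vertex data, then apply the result and store it in the shared table."""
    acc = zero
    for data in graph.vertex_data.values():
        acc = fold(acc, data)
    shared[key] = apply_fn(acc)

# Example: track the mean vertex value, e.g. for a convergence check.
# sync(graph, shared_data, "mean", lambda a, d: a + d,
#      lambda a: a / max(len(graph.vertex_data), 1), 0.0)
```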
Consistency Models: The three data consistency models—full consistency, edge consistency, and vertex consistency—trade execution parallelism against correctness guarantees. For example, full consistency restricts parallelism so that overlapping scopes can never race, while vertex consistency maximizes parallelism at the risk of race conditions.
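One way to picture the difference is in terms of which parts of an update's scope must be held exclusively versus read-only. The sketch below expresses this as (exclusive, read-only) vertex sets; it is an interpretation of the models for illustration, not GraphLab's actual locking protocol.

```python
def consistency_scope(v, graph, model):
    """Return (exclusive, read_only) vertex sets an update of v must respect under each model.
    Illustrative interpretation only; GraphLab enforces these scopes internally."""
    neighbors = set(graph.in_neighbors[v]) | set(graph.out_neighbors[v])
    if model == "full":     # exclusive access to v, its adjacent edges, and its neighbors
        return {v} | neighbors, set()
    if model == "edge":     # exclusive access to v and adjacent edges; neighbors readable only
        return {v}, neighbors
    if model == "vertex":   # exclusive access to v only; neighbor reads may race
        return {v}, set()
    raise ValueError(f"unknown consistency model: {model}")
```

Acquiring the required locks in a fixed global order (e.g., sorted by vertex id) is one standard way to avoid deadlock when the scopes of concurrently executing updates overlap.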
Scheduling: The framework includes flexible scheduling primitives to manage dynamic iterative computations, along with special-purpose schedulers for specific algorithmic needs. For instance, the splash scheduler accelerates loopy BP by growing spanning trees (splashes) of vertices and scheduling each splash as a unit.
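A stripped-down Python sketch of a FIFO task scheduler and worker loop in this spirit; the duplicate-suppression detail and the naive termination check are simplifications for illustration, not GraphLab's actual engine.

```python
import threading
from collections import deque

class FifoScheduler:
    """Tasks are (vertex, update_function) pairs; updates may add new tasks as they run."""
    def __init__(self, initial_tasks):
        initial_tasks = list(initial_tasks)
        self.queue = deque(initial_tasks)
        self.pending = {v for v, _ in initial_tasks}   # suppress duplicate tasks per vertex
        self.lock = threading.Lock()

    def add_task(self, task):
        v, _ = task
        with self.lock:
            if v not in self.pending:
                self.pending.add(v)
                self.queue.append(task)

    def next_task(self):
        with self.lock:
            if not self.queue:
                return None
            v, fn = self.queue.popleft()
            self.pending.discard(v)
            return v, fn

def worker(scheduler, graph, shared):
    """Each worker thread repeatedly pulls a task and runs it inside its consistency scope."""
    while True:
        task = scheduler.next_task()
        if task is None:
            break   # simplification: real termination detection must coordinate across workers
        v, fn = task
        fn(v, graph, shared, scheduler)
```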
Experimental Results
MRF Parameter Learning: The paper first demonstrates GraphLab’s utility in a retinal image denoising task. The task involves parameter learning for a 3D grid pairwise MRF using BP and a gradient descent procedure applied via the sync mechanism. Various scheduling strategies demonstrate significant speedups (up to 15x with splash scheduling) on a 16-core system.
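To make the BP workload concrete, here is a rough numpy sketch of the per-vertex computation such an update function performs on a pairwise MRF (node potentials and beliefs on vertices, messages on edges); the data layout and names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def bp_update(v, node_pot, edge_pot, messages, neighbors, damping=0.1):
    """Recompute v's belief and its outgoing messages; return the largest message residual,
    which residual-based schedulers (e.g., splash) can use to prioritize further work."""
    belief = node_pot[v].copy()
    for u in neighbors[v]:
        belief *= messages[(u, v)]           # node potential times all incoming messages
    belief /= belief.sum()

    max_residual = 0.0
    for u in neighbors[v]:
        # "Cavity" distribution: divide out u's own message before sending back to u.
        cavity = belief / np.clip(messages[(u, v)], 1e-12, None)
        # edge_pot[(v, u)][x_v, x_u]: pass the cavity through the pairwise potential.
        new_msg = edge_pot[(v, u)].T @ cavity
        new_msg /= new_msg.sum()
        new_msg = damping * messages[(v, u)] + (1 - damping) * new_msg
        max_residual = max(max_residual, float(np.abs(new_msg - messages[(v, u)]).sum()))
        messages[(v, u)] = new_msg
    return max_residual
```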
Gibbs Sampling: By implementing a parallel Gibbs sampler for a protein-protein interaction network, the paper shows that GraphLab can achieve speedups of up to 10x using optimized scheduling while maintaining the efficacy of the sampling process.
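A single-site Gibbs step maps naturally onto a GraphLab-style update function: resample one vertex conditioned on its neighbors' current labels. The sketch below assumes discrete labels and potentials stored as in the BP sketch, and that adjacent vertices are never resampled concurrently (the guarantee the consistency model and scheduler are used to provide).

```python
import numpy as np

def gibbs_update(v, assignment, node_pot, edge_pot, neighbors, rng):
    """Resample vertex v's label from its full conditional given its neighbors' labels."""
    log_p = np.log(node_pot[v])
    for u in neighbors[v]:
        # edge_pot[(v, u)][x_v, x_u]: compatibility of each candidate label with u's label
        log_p += np.log(edge_pot[(v, u)][:, assignment[u]])
    p = np.exp(log_p - log_p.max())          # stabilize before normalizing
    p /= p.sum()
    assignment[v] = rng.choice(len(p), p=p)

# e.g. rng = np.random.default_rng(0); sweep vertices via the scheduler, one task per vertex.
```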
Co-EM: The semi-supervised Co-EM algorithm for named entity recognition (NER) scales effectively under GraphLab. Experiments indicate that dynamic scheduling (multiqueue FIFO) does not significantly outperform simple round-robin scheduling, though both deliver notable speedups over single-threaded execution.
Lasso Regression: Two parallel optimization algorithms for the Lasso problem are implemented. The shooting algorithm displays respectable scaling, particularly when relaxed consistency guarantees are applied, achieving speedup factors of 4x and 2x on sparse and dense datasets, respectively.
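For intuition, one coordinate step of the shooting algorithm (coordinate descent for Lasso) looks roughly as follows, with one vertex per coordinate; this is a generic textbook update with hypothetical names, not the paper's code. Under relaxed (vertex) consistency, concurrently updated coordinates may read slightly stale residuals, which is the trade-off the experiments explore.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def shooting_update(j, X, y, w, lam):
    """One pass of coordinate descent on 0.5*||y - Xw||^2 + lam*||w||_1 for coordinate j."""
    col = X[:, j]
    r_j = y - X @ w + col * w[j]              # residual with coordinate j's contribution removed
    z = col @ r_j
    w[j] = soft_threshold(z, lam) / (col @ col)
```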
Compressed Sensing: A compressed sensing algorithm based on an interior-point method uses GraphLab as its inner iterative solver, demonstrating how GraphLab can be integrated into a larger sequential algorithm and achieving an 8x speedup on 16 processors.
Implications and Future Directions
GraphLab’s framework offers theoretical and practical advances for the design and implementation of parallel ML algorithms. By providing a balanced abstraction that accommodates both the expressive needs of structured models and the performance demands of parallel computation, GraphLab paves the way for more scalable ML on multi-core systems.
The planned future work includes extending GraphLab to distributed settings, presenting new challenges like efficient graph partitioning, load balancing, and fault tolerance. These extensions could further amplify GraphLab’s applicability to even larger datasets and more complex ML tasks.
GraphLab represents a significant step forward in enabling efficient and scalable parallel machine learning, attuned to the computational realities and data dependencies inherent in modern ML algorithms.