GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding

Zhaocheng Zhu; Shizhen Xu; Meng Qu; Jian Tang

Abstract

Learning continuous representations of nodes has recently attracted growing interest in both academia and industry, owing to their simplicity and effectiveness in a variety of applications. Most existing node embedding algorithms and systems can process networks with hundreds of thousands or a few million nodes. However, scaling them to networks with tens of millions or even hundreds of millions of nodes remains a challenging problem. In this paper, we propose GraphVite, a high-performance CPU-GPU hybrid system for training node embeddings that co-optimizes the algorithm and the system. On the CPU end, augmented edge samples are generated in parallel by online random walks on the network and serve as the training data. On the GPU end, a novel parallel negative sampling scheme is proposed to leverage multiple GPUs to train node embeddings simultaneously, without much data transfer and synchronization. Moreover, an efficient collaboration strategy is proposed to further reduce the synchronization cost between CPUs and GPUs. Experiments on multiple real-world networks show that GraphVite is highly efficient. It takes only about one minute for a network with 1 million nodes and 5 million edges on a single machine with 4 GPUs, and around 20 hours for a network with 66 million nodes and 1.8 billion edges. Compared to the current fastest system, GraphVite is about 50 times faster without any sacrifice in performance.
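
The CPU-side step described in the abstract, generating augmented edge samples from random walks, can be illustrated with a minimal single-threaded sketch. The abstract does not give the exact procedure, so the details here are assumptions: the helper names `random_walk` and `augmented_edge_samples`, the uniform choice of the next neighbor, and the rule that pairs each node with the nodes at most `distance` steps later in the same walk. The parallel, online generation across CPU threads is not modeled.

```python
import random


def random_walk(adj, start, length, rng):
    """Walk up to `length` steps from `start`, picking a uniform random neighbor each step."""
    walk = [start]
    for _ in range(length):
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def augmented_edge_samples(walk, distance):
    """Pair every node in the walk with the nodes at most `distance` steps after it.

    Assumption: this windowed pairing stands in for the paper's augmentation rule.
    """
    samples = []
    for i, u in enumerate(walk):
        for v in walk[i + 1 : i + 1 + distance]:
            samples.append((u, v))
    return samples


# Toy undirected graph as adjacency lists (hypothetical data).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
walk = random_walk(adj, start=0, length=5, rng=rng)
samples = augmented_edge_samples(walk, distance=2)
```

With `distance=1` the samples are just the edges traversed by the walk; larger values add pairs of nodes that are close on the walk but not necessarily adjacent in the graph, which is what makes the samples "augmented".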

Code Repositories

DeepGraphLearning/graphvite
pytorch
Mentioned in GitHub

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| link-prediction-on-fb15k | SimplE | Hits@1: 0.721; Hits@3: 0.818; Hits@10: 0.876; MR: 74; MRR: 0.779; training time (s): 2105 |
| link-prediction-on-fb15k-237 | RotatE | Hits@1: 0.217; Hits@3: 0.347; Hits@10: 0.511; MR: 176; MRR: 0.314; training time (s): 857 |
| link-prediction-on-wn18 | SimplE | Hits@1: 0.944; Hits@3: 0.950; Hits@10: 0.954; MR: 412; MRR: 0.948; training time (s): 1042 |
| node-classification-on-youtube | LINE | Macro-F1@2%: 33.69; Micro-F1@2%: 40.61; runtime (s): 70.09 |
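
For readers unfamiliar with the link-prediction metrics listed above, the following sketch shows how MR (mean rank), MRR (mean reciprocal rank), and Hits@k are conventionally computed from the rank assigned to each correct entity. It assumes the standard definitions; the function name `ranking_metrics` and the example ranks are hypothetical, and this page does not state whether the reported numbers use raw or filtered ranks.

```python
def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MR, MRR, and Hits@k from 1-based ranks of the correct entities."""
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                      # mean rank (lower is better)
        "MRR": sum(1.0 / r for r in ranks) / n,    # mean reciprocal rank (higher is better)
    }
    for k in ks:
        # Fraction of test cases whose correct entity is ranked in the top k.
        metrics[f"Hits@{k}"] = sum(r <= k for r in ranks) / n
    return metrics


# Hypothetical ranks for four test triplets.
ranks = [1, 2, 5, 20]
m = ranking_metrics(ranks)
```

MR is sensitive to a few badly ranked cases, while MRR and Hits@k emphasize the top of the ranking, which is why a model can score well on MRR and still show a large MR.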
