SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

Abstract

We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method, SparseGPT, designed specifically to work efficiently and accurately on massive GPT-family models. We can prune the largest available open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, reaching 60% unstructured sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights in these models can be ignored at inference time. SparseGPT generalizes to semi-structured patterns (such as 2:4 and 4:8) and is compatible with weight-quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.
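The 2:4 semi-structured pattern mentioned above requires that in every group of four consecutive weights, at most two are nonzero. The sketch below illustrates only this sparsity pattern using simple magnitude selection; it is not the SparseGPT algorithm itself, which additionally reconstructs the remaining weights using second-order (Hessian-based) information.

```python
import numpy as np

def prune_2_4(w):
    """Apply a 2:4 semi-structured sparsity mask by magnitude:
    in every contiguous group of 4 weights, keep the 2 largest
    in absolute value and zero the other 2. Illustrative only --
    SparseGPT chooses and updates weights more carefully."""
    w = np.asarray(w, dtype=float)
    groups = w.reshape(-1, 4)
    # indices of the 2 largest-magnitude entries in each group of 4
    keep = np.argsort(np.abs(groups), axis=1)[:, 2:]
    mask = np.zeros(groups.shape, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=1)
    return np.where(mask, groups, 0.0).reshape(w.shape)

w = np.array([0.9, -0.1, 0.05, -1.2, 0.3, 0.2, -0.25, 0.01])
# keeps 0.9 and -1.2 in the first group, 0.3 and -0.25 in the second
print(prune_2_4(w))
```

Hardware such as NVIDIA Ampere GPUs can exploit exactly this 2:4 layout for accelerated sparse matrix multiplication, which is why the paper evaluates these patterns alongside unstructured 50%–60% sparsity.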

Code Repositories

baithebest/sparsellm (PyTorch) — mentioned on GitHub
baithebest/adagp (PyTorch) — mentioned on GitHub
eth-easl/deltazip (PyTorch) — mentioned on GitHub
nvlabs/maskllm (PyTorch) — mentioned on GitHub
ist-daslab/sparsegpt (official, PyTorch) — mentioned on GitHub
nvidia/tensorrt-model-optimizer (PyTorch) — mentioned on GitHub

Benchmarks

Common-Sense Reasoning on ARC-Challenge (Accuracy)
  OPT-175B (dense)                 43.94
  OPT-175B (50% Sparsity)          25.6
  SparseGPT (175B, 50% Sparsity)   41.3
  SparseGPT (175B, 4:8 Sparsity)   39.85
  SparseGPT (175B, 2:4 Sparsity)   38.99

Common-Sense Reasoning on ARC-Easy (Accuracy)
  OPT-175B (dense)                 71.04
  OPT-175B (50% Sparsity)          28.03
  SparseGPT (175B, 50% Sparsity)   69.65
  SparseGPT (175B, 4:8 Sparsity)   68.35
  SparseGPT (175B, 2:4 Sparsity)   67.08

Language Modelling on LAMBADA (Accuracy)
  OPT-175B (dense)                 75.59
  OPT-175B (50% Sparsity)          0.02
  SparseGPT (175B, 50% Sparsity)   76.51
  SparseGPT (175B, 4:8 Sparsity)   78.77
  SparseGPT (175B, 2:4 Sparsity)   79.47

Language Modelling on WikiText-2 (Test perplexity, lower is better)
  OPT-175B (dense)                 8.34
  OPT-175B (50% Sparsity)          234.77
  SparseGPT (175B, 50% Sparsity)   8.21
  SparseGPT (175B, 4:8 Sparsity)   8.45
  SparseGPT (175B, 2:4 Sparsity)   8.73

Question Answering on PIQA (Accuracy)
  OPT-175B (dense)                 81.07
  OPT-175B (50% Sparsity)          54.73
  SparseGPT (175B, 50% Sparsity)   80.63
  SparseGPT (175B, 4:8 Sparsity)   79.54
  SparseGPT (175B, 2:4 Sparsity)   79.54

Question Answering on StoryCloze (Accuracy)
  OPT-175B (dense)                 79.82
  OPT-175B (50% Sparsity)          47.10
  SparseGPT (175B, 50% Sparsity)   78.87
  SparseGPT (175B, 4:8 Sparsity)   77.02
  SparseGPT (175B, 2:4 Sparsity)   76.19

