Generative Low-bitwidth Data Free Quantization

Xu Shoukai; Li Haokun; Zhuang Bohan; Liu Jing; Cao Jiezhang; Liang Chuangrun; Tan Mingkui

Abstract

Neural network quantization is an effective way to compress deep models and improve their execution latency and energy efficiency, so that they can be deployed on mobile or embedded devices. Existing quantization methods require original data for calibration or fine-tuning to get better performance. However, in many real-world scenarios, the data may not be available due to confidentiality or privacy concerns, thereby making existing quantization methods not applicable. Moreover, due to the absence of original data, the recently developed generative adversarial networks (GANs) cannot be applied to generate data. Although the full-precision model may contain rich data information, such information alone is hard to exploit for recovering the original data or generating new meaningful data. In this paper, we investigate a simple-yet-effective method called Generative Low-bitwidth Data Free Quantization (GDFQ) to remove the data dependence burden. Specifically, we propose a knowledge matching generator to produce meaningful fake data by exploiting classification boundary knowledge and distribution information in the pre-trained model. With the help of generated data, we can quantize a model by learning knowledge from the pre-trained model. Extensive experiments on three data sets demonstrate the effectiveness of our method. More critically, our method achieves much higher accuracy on 4-bit quantization than the existing data free quantization method. Code is available at https://github.com/xushoukai/GDFQ.
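The abstract describes the generator's training signal only at a high level, so the following is a minimal sketch of one way the two named signals could be combined, not the authors' implementation. It assumes that "classification boundary knowledge" is captured by a cross-entropy term on the pre-trained model's logits for generated samples, and that "distribution information" is captured by matching batch statistics against statistics stored in the pre-trained model. All function names, the single-channel simplification, and the weight `beta` are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def stats_mismatch(batch_features, stored_mean, stored_var):
    """Squared gap between batch statistics and stored statistics (one channel)."""
    n = len(batch_features)
    mean = sum(batch_features) / n
    var = sum((x - mean) ** 2 for x in batch_features) / n
    return (mean - stored_mean) ** 2 + (var - stored_var) ** 2


def generator_loss(logits_batch, labels, batch_features,
                   stored_mean, stored_var, beta=0.1):
    """Hypothetical combined objective for the generator.

    logits_batch: pre-trained model outputs on generated samples
    labels: class labels the generator was conditioned on
    beta: assumed weight for the statistics term (not from the source)
    """
    ce = sum(cross_entropy(l, y) for l, y in zip(logits_batch, labels)) / len(labels)
    return ce + beta * stats_mismatch(batch_features, stored_mean, stored_var)
```

Under these assumptions, the loss is low when the pre-trained model confidently assigns each generated sample to the label it was generated for and when the generated batch reproduces the stored feature statistics.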

Code Repositories

ricky40403/GDFQ (PyTorch)
xushoukai/GDFQ (official, PyTorch)
iamkanghyunchoi/ait (PyTorch)

Benchmarks

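Benchmark                            Methodology          Metric                          Value
data-free-quantization-on-cifar-100  ResNet-20 CIFAR-100  CIFAR-100 W4A4 Top-1 Accuracy   43.12
data-free-quantization-on-cifar-100  ResNet-20 CIFAR-100  CIFAR-100 W5A5 Top-1 Accuracy   64.03
data-free-quantization-on-cifar-100  ResNet-20 CIFAR-100  CIFAR-100 W6A6 Top-1 Accuracy   68.63
data-free-quantization-on-cifar-100  ResNet-20 CIFAR-100  CIFAR-100 W8A8 Top-1 Accuracy   70.29
data-free-quantization-on-cifar10    ResNet-20 CIFAR-10   CIFAR-10 W4A4 Top-1 Accuracy    85.20
data-free-quantization-on-cifar10    ResNet-20 CIFAR-10   CIFAR-10 W5A5 Top-1 Accuracy    92.39
data-free-quantization-on-cifar10    ResNet-20 CIFAR-10   CIFAR-10 W6A6 Top-1 Accuracy    93.38
data-free-quantization-on-cifar10    ResNet-20 CIFAR-10   CIFAR-10 W8A8 Top-1 Accuracy    93.92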
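
In the metric names, WxAy follows the common convention of x-bit weights and y-bit activations, so W4A4 means both are restricted to 2^4 = 16 levels. As a rough illustration of why accuracy drops as the bitwidth shrinks, here is a minimal sketch of min-max uniform quantization. This page does not say which quantizer the paper uses, so the scheme and the name `quantize_uniform` are assumptions made for illustration.

```python
def quantize_uniform(values, bits):
    """Snap floats to 2**bits evenly spaced levels spanning [min, max].

    Illustrative min-max uniform quantizer; the quantizer used in the
    paper is not specified on this page.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # constant input: nothing to quantize
    steps = 2 ** bits - 1          # number of intervals between levels
    scale = (hi - lo) / steps      # width of one interval
    return [lo + round((v - lo) / scale) * scale for v in values]


weights = [i / 100 for i in range(101)]
w4 = quantize_uniform(weights, 4)  # at most 16 distinct values
w8 = quantize_uniform(weights, 8)  # at most 256 distinct values
```

With 4 bits, every value is moved by up to half of a step of size (max - min) / 15, which is a much coarser approximation than the 8-bit case and is consistent with the larger accuracy gap reported for the W4A4 rows.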
