Yu Liu, Haihang You, Guanglu Song, Shenghan Zhang, Boxiao Liu

Abstract
Knowledge distillation is a representative technique for model compression and acceleration, which is important for deploying neural networks on resource-limited devices. The knowledge transferred from teacher to student is the mapping of the teacher model, represented by all input-output pairs. In practice, however, the student model only learns from the data pairs of the dataset, which may be biased, and we argue that this limits the performance of knowledge distillation. In this paper, we first quantitatively define the uniformity of the sampled training data, providing a unified view of methods that learn from biased data. We then evaluate the uniformity on a real-world dataset and show that existing methods actually improve the uniformity of the data. We further introduce two uniformity-oriented methods for rectifying the data bias in knowledge distillation. Extensive experiments conducted on Face Recognition and Person Re-identification demonstrate the effectiveness of our method. Moreover, we analyze the sampled data on Face Recognition and show that a better balance is achieved between races and between easy and hard samples. This effect can also be confirmed when training the student model from scratch, yielding performance comparable to standard knowledge distillation.
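To make the setting concrete, the sketch below shows standard logit-matching knowledge distillation with an optional per-sample weight that could implement an importance-sampling-style rebalancing of biased training data. This is not the paper's method: the temperature `T`, weight `alpha`, and the `sample_weights` tensor are illustrative assumptions, not the authors' formulation.

```python
# Minimal sketch (assumed, not the authors' code): logit distillation with an
# optional per-sample reweighting term that could counteract a biased
# sampling distribution.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, targets,
                      sample_weights=None, T=4.0, alpha=0.9):
    """alpha * KL(teacher || student) + (1 - alpha) * CE(student, y),
    with an optional per-sample weight to rebalance biased training data."""
    # Soft-target term: KL divergence between temperature-scaled distributions.
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    kd = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=1) * (T * T)

    # Hard-target term: ordinary cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets, reduction="none")

    per_sample = alpha * kd + (1.0 - alpha) * ce
    if sample_weights is not None:
        # Reweight each example, e.g. inversely to how over-represented its
        # region of the input space is under the biased sampling distribution.
        per_sample = per_sample * sample_weights
    return per_sample.mean()


if __name__ == "__main__":
    # Toy usage with random logits for a batch of 8 samples and 10 classes.
    student_logits = torch.randn(8, 10)
    teacher_logits = torch.randn(8, 10)
    targets = torch.randint(0, 10, (8,))
    weights = torch.ones(8)  # uniform weights reduce to standard distillation
    print(distillation_loss(student_logits, teacher_logits, targets, weights))
```

With uniform weights the loss reduces to standard knowledge distillation; non-uniform weights illustrate one simple way a uniformity-oriented sampling scheme could be plugged into the training objective.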
Benchmarks
| Benchmark | Methodology | Model | Training dataset | TAR @ FAR=1e-3 | TAR @ FAR=1e-4 | TAR @ FAR=1e-5 |
|---|---|---|---|---|---|---|
| face-verification-on-ijb-c | L2E+IS-sampling | MobileFaceNet | MS1M V3 | 97.05% | 95.49% | 93.25% |