TAN Without a Burn: Scaling Laws of DP-SGD

Tom Sander; Pierre Stock; Alexandre Sablayrolles

Abstract

Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently, in particular with the use of massive batches and aggregated data augmentations for a large number of training steps. These techniques require much more computing resources than their non-private counterparts, shifting the traditional privacy-accuracy trade-off to a privacy-accuracy-compute trade-off and making hyper-parameter search virtually impossible for realistic scenarios. In this work, we decouple the privacy analysis and the experimental behavior of noisy training to explore the trade-off with minimal computational requirements. We first use the tools of Rényi Differential Privacy (RDP) to highlight that the privacy budget, when not overcharged, only depends on the total amount of noise (TAN) injected throughout training. We then derive scaling laws for training models with DP-SGD to optimize hyper-parameters with more than a $100\times$ reduction in computational budget. We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy for a privacy budget of $\epsilon = 8$.
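
To make the TAN claim concrete, the following is a minimal sketch (not the authors' code) of why the privacy budget collapses onto a single quantity. It assumes the standard small-sampling-rate approximation of Rényi DP for the Poisson-subsampled Gaussian mechanism, RDP$(\alpha) \approx S\,\alpha\,q^2 / (2\sigma^2)$ with $q = B/N$, together with the classical RDP-to-$(\epsilon, \delta)$ conversion; the function name, dataset size, step count, batch sizes and noise multipliers below are illustrative placeholders, not values from the paper.

```python
# Minimal sketch (not the authors' code) of the TAN observation.
# Assumptions: Poisson sampling with small rate q = B / N, and a noise
# multiplier sigma large enough that the quadratic RDP approximation of the
# subsampled Gaussian mechanism, RDP(alpha) ~ S * alpha * q^2 / (2 sigma^2),
# is valid. All numeric values below are illustrative placeholders.
import numpy as np

def approx_epsilon(batch_size, sigma, steps, dataset_size, delta=1e-6):
    """Approximate (eps, delta) budget of DP-SGD via Renyi DP."""
    q = batch_size / dataset_size                      # Poisson sampling rate
    alphas = np.linspace(1.01, 256.0, 10_000)          # RDP orders to search over
    rdp = steps * alphas * q ** 2 / (2 * sigma ** 2)   # total RDP of training
    eps = rdp + np.log(1.0 / delta) / (alphas - 1.0)   # classical RDP -> (eps, delta)
    return eps.min()

N, S = 1_281_167, 18_000  # e.g. an ImageNet-scale dataset, fixed number of steps
for B, sigma in [(32_768, 2.5), (16_384, 1.25), (8_192, 0.625)]:
    # Same ratio q / sigma in every configuration, hence the same total amount
    # of noise (TAN) and, under the approximation, the same epsilon.
    print(f"B={B:6d}  sigma={sigma:6.3f}  eps={approx_epsilon(B, sigma, S, N):.2f}")
```

Under this approximation, $\epsilon$ depends on $(B, \sigma)$ only through $q/\sigma$, so dividing both the batch size and the noise multiplier by the same factor leaves the approximate budget unchanged while reducing per-step compute. Outside the regime where the approximation holds (small $\sigma$), a real accountant such as Opacus' RDP accountant gives a larger budget, which is plausibly what the abstract's "when not overcharged" caveat refers to.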

Code Repositories

facebookresearch/tan (official implementation, PyTorch)

Benchmarks

Benchmark: image-classification-with-dp-on-imagenet
Methodology: NFResnet-50
Metrics: Top-1 Accuracy: 39.2
