Improving Generalization in Federated Learning by Seeking Flat Minima

Debora Caldarola, Barbara Caputo, Marco Ciccone

Abstract

Models trained in federated settings often suffer from degraded performance and fail to generalize, especially in heterogeneous scenarios. In this work, we investigate this behavior through the lens of the geometry of the loss surface and the Hessian eigenspectrum, linking the model's lack of generalization capacity to the sharpness of the solution it converges to. Motivated by prior studies connecting the sharpness of the loss surface to the generalization gap, we show that i) training clients locally with Sharpness-Aware Minimization (SAM) or its adaptive version (ASAM) and ii) averaging stochastic weights (SWA) on the server side can substantially improve generalization in Federated Learning and help bridge the gap with centralized models. By seeking parameters in neighborhoods of uniformly low loss, the model converges towards flatter minima and its generalization improves significantly in both homogeneous and heterogeneous scenarios. Empirical results demonstrate the effectiveness of these optimizers across a variety of benchmark vision datasets (e.g., CIFAR-10/100, Landmarks-User-160k, IDDA) and tasks (large-scale classification, semantic segmentation, domain generalization).
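To make the first ingredient concrete: SAM replaces each local SGD step with an ascent step to a worst-case perturbation of the weights inside a small ball, followed by a descent step using the gradient computed at that perturbed point. Below is a minimal PyTorch sketch of one such client-side step, assuming a standard supervised batch; the function name `sam_step` and the default `rho=0.05` are illustrative choices, not the official repository's API.

```python
import torch

def sam_step(model, loss_fn, batch, base_optimizer, rho=0.05):
    """One Sharpness-Aware Minimization (SAM) step: ascend to a
    worst-case perturbation in a rho-ball, then descend from there."""
    inputs, targets = batch

    # 1) First forward/backward pass: gradient g at the current weights w.
    loss_fn(model(inputs), targets).backward()

    # 2) Perturb the weights by eps = rho * g / ||g|| (the "ascent" step).
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))
        eps = []
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)

    model.zero_grad()

    # 3) Second forward/backward pass: gradient at the perturbed point w + eps.
    loss_fn(model(inputs), targets).backward()

    # 4) Undo the perturbation, then update the original weights w using
    #    the gradient from the perturbed point.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    model.zero_grad()
```

ASAM follows the same two-pass scheme but rescales the perturbation per parameter, making the sharpness measure invariant to weight magnitude.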

Code Repositories

debcaldarola/fedsam (official PyTorch implementation)
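For the second ingredient, the server complements FedAvg-style aggregation with stochastic weight averaging (SWA): from some round onward it maintains a running average of the aggregated models it produces. A minimal sketch of that running average follows; the helper name `swa_update` and the in-place state-dict arithmetic are illustrative assumptions, not code from debcaldarola/fedsam.

```python
import copy

def swa_update(swa_state, round_model, n_averaged):
    """Running average of server weights across rounds:
    w_swa <- w_swa + (w_round - w_swa) / (n_averaged + 1)."""
    round_state = round_model.state_dict()
    if swa_state is None:
        # First averaged round: start from an independent copy.
        return copy.deepcopy(round_state), 1
    for k in swa_state:
        if swa_state[k].is_floating_point():
            swa_state[k] += (round_state[k] - swa_state[k]) / (n_averaged + 1)
        else:
            # Non-float buffers (e.g. BatchNorm step counters) are copied.
            swa_state[k] = round_state[k]
    return swa_state, n_averaged + 1
```

As is standard for SWA, averaging would be switched on only after an initial burn-in phase, so that early, high-variance rounds do not pollute the averaged model.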

Benchmarks

Results are grouped by benchmark; each line lists a methodology and its score on the benchmark's metric.

federated-learning-on-cifar-100-alpha-0-5 (ACC@1-100Clients)
FedAvg: 30.25
FedSAM: 31.04
FedSAM + SWA: 39.3
FedASAM: 36.04
FedASAM + SWA: 42.01

federated-learning-on-cifar-100-alpha-0-10 (ACC@1-100Clients)
FedAvg: 36.74
FedSAM: 36.93
FedSAM + SWA: 39.51
FedASAM: 39.76
FedASAM + SWA: 42.64

federated-learning-on-cifar-100-alpha-0-20 (ACC@1-100Clients)
FedAvg: 38.59
FedSAM: 38.56
FedSAM + SWA: 39.24
FedASAM: 40.81
FedASAM + SWA: 41.62

federated-learning-on-cifar-100-alpha-0-5-5 (ACC@1-100Clients)
FedAvg: 40.43
FedSAM: 44.73
FedSAM + SWA: 47.96
FedASAM: 45.61
FedASAM + SWA: 49.17

federated-learning-on-cifar-100-alpha-0-5-10 (ACC@1-100Clients)
FedAvg: 41.27
FedSAM: 44.84
FedSAM + SWA: 46.76
FedASAM: 46.58
FedASAM + SWA: 48.72

federated-learning-on-cifar-100-alpha-0-5-20 (ACC@1-100Clients)
FedAvg: 42.17
FedSAM: 46.05
FedSAM + SWA: 46.47
FedASAM: 47.78
FedASAM + SWA: 48.27

federated-learning-on-cifar-100-alpha-1000-5 (ACC@1-100Clients)
FedAvg: 49.92
FedSAM: 54.01
FedSAM + SWA: 53.9
FedASAM: 54.81
FedASAM + SWA: 53.86

federated-learning-on-cifar-100-alpha-1000-10 (ACC@1-100Clients)
FedAvg: 50.25
FedSAM: 53.39
FedSAM + SWA: 53.67
FedASAM: 54.97
FedASAM + SWA: 54.79

federated-learning-on-cifar-100-alpha-1000-20 (ACC@1-100Clients)
FedAvg: 50.66
FedSAM: 53.97
FedSAM + SWA: 54.36
FedASAM: 54.5
FedASAM + SWA: 54.1

federated-learning-on-cityscapes (mIoU)
FedAvg: 38.65
FedAvg + SWA: 42.48
FedSAM: 41.22
FedSAM + SWA: 43.42
FedASAM: 42.27
FedASAM + SWA: 43.02
SiloBN: 45.96
SiloBN + SAM: 49.1
SiloBN + ASAM: 49.75

federated-learning-on-landmarks-user-160k (Acc@1-1262Clients)
FedAvg: 61.91
FedAvg + SWA: 67.52
FedSAM: 63.72
FedSAM + SWA: 68.12
FedASAM: 64.23
FedASAM + SWA: 68.32

image-classification-on-cifar-100-alpha-0-20 (ACC@1-100Clients)
FedAvgM + ASAM + SWA: 51.58
