Efficient ViTs on ImageNet-1K with DeiT-T

Evaluation Metrics

GFLOPs
Top-1 Accuracy
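For reference, Top-1 accuracy is the fraction of images whose highest-scoring predicted class equals the ground-truth label. A minimal sketch (plain Python, not the benchmark's official evaluation code; the logits and labels below are made-up illustrative values):

```python
# Top-1 accuracy: share of samples whose argmax prediction matches the label.
def top1_accuracy(logits, labels):
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy example: 3 samples, 3 classes.
logits = [[0.1, 0.7, 0.2],
          [0.8, 0.1, 0.1],
          [0.3, 0.3, 0.4]]
labels = [1, 0, 2]
print(top1_accuracy(logits, labels))  # 1.0
```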

Evaluation Results

Performance of each model on this benchmark:
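The GFLOPs column can be sanity-checked with a rough analytic count. The sketch below (my own approximation, not the benchmark's counting tool) counts only the dominant matrix multiplications in a ViT, treating one multiply-accumulate as one FLOP as is conventional in this literature, and ignoring norms, softmax, biases, and activations. With DeiT-T's configuration (12 blocks, embedding dim 192, 196 patches + 1 class token) it lands near the 1.2 GFLOPs reported for the baseline row:

```python
# Approximate FLOP count for a plain ViT (multiply-accumulates only).
def vit_flops(depth=12, dim=192, tokens=197, patch=16, in_ch=3):
    # Patch embedding: one linear projection per image patch.
    patch_embed = (tokens - 1) * dim * (patch * patch * in_ch)
    # Self-attention: QKV + output projections, plus the two N x N attention matmuls.
    attn = 4 * tokens * dim**2 + 2 * tokens**2 * dim
    # MLP: two linear layers with the standard 4x hidden expansion.
    mlp = 8 * tokens * dim**2
    return patch_embed + depth * (attn + mlp)

# DeiT-T: ~1.25e9 FLOPs, consistent with the 1.2 GFLOPs baseline row.
print(vit_flops() / 1e9)
```

Token pruning/merging methods in this table reduce the `tokens` term as layers go deeper, which is why their GFLOPs fall below the baseline at the same architecture.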

| Model | GFLOPs | Top-1 Acc. (%) | Paper Title |
|---|---|---|---|
| Base (DeiT-T) | 1.2 | 72.2 | Training data-efficient image transformers & distillation through attention |
| SPViT (1.0G) | 1.0 | 72.2 | SPViT: Enabling Faster Vision Transformers via Soft Token Pruning |
| MCTF ($r=8$) | 1.0 | 72.9 | Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers |
| SPViT | 1.0 | 70.7 | Pruning Self-attentions into Convolutional Layers in Single Path |
| LTMP (80%) | 1.0 | 72.0 | Learned Thresholds Token Merging and Pruning for Vision Transformers |
| SPViT (0.9G) | 0.9 | 72.1 | SPViT: Enabling Faster Vision Transformers via Soft Token Pruning |
| S$^2$ViTE | 0.9 | 70.1 | Chasing Sparsity in Vision Transformers: An End-to-End Exploration |
| ToMe ($r=8$) | 0.9 | 71.7 | Token Merging: Your ViT But Faster |
| LTMP (60%) | 0.8 | 71.5 | Learned Thresholds Token Merging and Pruning for Vision Transformers |
| BAT | 0.8 | 72.3 | Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers |
| ToMe ($r=12$) | 0.8 | 71.4 | Token Merging: Your ViT But Faster |
| EvoViT | 0.8 | 72.0 | Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer |
| eTPS | 0.8 | 72.3 | Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers |
| PPT | 0.8 | 72.1 | PPT: Token Pruning and Pooling for Efficient Vision Transformers |
| dTPS | 0.8 | 72.9 | Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers |
| PS-ViT | 0.7 | 72.0 | Patch Slimming for Efficient Vision Transformers |
| LTMP (45%) | 0.7 | 69.8 | Learned Thresholds Token Merging and Pruning for Vision Transformers |
| MCTF ($r=16$) | 0.7 | 72.7 | Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers |
| MCTF ($r=20$) | 0.6 | 71.4 | Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers |
| ToMe ($r=16$) | 0.6 | 70.7 | Token Merging: Your ViT But Faster |