5 hours ago

Zunhai Su, Hengyuan Zhang, Wei Wu, Yifan Zhang, Yaxiu Liu, He Xiao, Qingyao Yang, Yuxuan Sun, Rui Yang, Chao Zhang, and 10 more

Abstract

As the foundational architecture of modern machine learning, Transformers have driven remarkable progress across diverse AI domains. Despite their transformative impact, a persistent challenge across various Transformers is Attention Sink (AS), in which a disproportionate amount of attention is focused on a small subset of specific yet uninformative tokens. AS complicates interpretability, significantly affects training and inference dynamics, and exacerbates issues such as hallucinations. In recent years, substantial research has been dedicated to understanding and harnessing AS. However, a comprehensive survey that systematically consolidates AS-related research and offers guidance for future advancements is still lacking. To address this gap, we present the first survey on AS, structured around three key dimensions that define the current research landscape: Fundamental Utilization, Mechanistic Interpretation, and Strategic Mitigation. Our work clarifies key concepts and guides researchers through the evolution and trends of the field. We envision this survey as a definitive resource, empowering researchers and practitioners to effectively manage AS within the current Transformer paradigm, while simultaneously inspiring innovative advancements for the next generation of Transformers. The paper list of this work is available at https://github.com/ZunhaiSu/Awesome-Attention-Sink.

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
