A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation (GALI)
Li Yan, Zhang Tianyi, Li Zechuan, Han Soyeon Caren

Abstract
Transformer-based Large Language Models (LLMs) struggle with inputs exceeding their training context window due to positional out-of-distribution (O.O.D.) issues that disrupt attention. Existing solutions, including fine-tuning and training-free methods, face challenges such as inefficiency, redundant interpolation, logit outliers, or loss of local positional information. We propose Greedy Attention Logit Interpolation (GALI), a training-free method that improves length extrapolation by greedily reusing pretrained positional intervals and interpolating attention logits to eliminate outliers. GALI achieves stable and superior performance across a wide range of long-context tasks without requiring input-length-specific tuning. Our analysis further reveals that LLMs interpret positional intervals unevenly and that restricting interpolation to narrower ranges improves performance, even on short-context tasks. GALI represents a step toward more robust and generalizable long-text processing in LLMs. Our implementation of GALI, along with the experiments from our paper, is open-sourced at https://github.com/adlnlp/Gali.
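As a rough illustration of what "interpolating attention logits" could mean, the sketch below computes rotary-embedding (RoPE) logits at the two integer relative distances that bracket a fractional one, then blends them linearly. This is one plausible reading of the abstract and not the authors' implementation. The function names (`rope_rotate`, `rope_logit`, `interpolated_logit`), the use of RoPE, and the linear blend are all assumptions made for illustration. The greedy reuse of pretrained positional intervals is not modeled here; see the linked repository for the actual method.

```python
import math


def rope_rotate(vec, pos, base=10000.0):
    """Apply a rotary position embedding to an even-length vector (hypothetical helper)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        # Each consecutive pair of dimensions rotates at its own frequency.
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out.extend([vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c])
    return out


def rope_logit(q, k, rel_dist):
    """Attention logit for a query placed rel_dist positions after a key at position 0."""
    q_rot = rope_rotate(q, rel_dist)
    k_rot = rope_rotate(k, 0)
    return sum(a * b for a, b in zip(q_rot, k_rot))


def interpolated_logit(q, k, rel_dist):
    """Blend the logits at the two nearest integer distances (assumed linear scheme)."""
    lo, hi = math.floor(rel_dist), math.ceil(rel_dist)
    logit_lo = rope_logit(q, k, lo)
    if hi == lo:
        # Integer distance: nothing to interpolate.
        return logit_lo
    logit_hi = rope_logit(q, k, hi)
    frac = rel_dist - lo
    return (1.0 - frac) * logit_lo + frac * logit_hi


q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]
blended = interpolated_logit(q, k, 2.5)
```

Because the blended value is a convex combination of two logits the model could have produced at integer distances, it cannot fall outside the range those two logits span. Feeding a fractional position ID directly into the rotation carries no such guarantee.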
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| long-context-understanding-on-l-eval | GALI(Llama3-8b-ins-8k-to-32k) | Average Score: 42.79 |
| long-context-understanding-on-l-eval | GALI(Llama3-8b-ins-4k-to-16k) | Average Score: 59.21 |
| long-context-understanding-on-l-eval | GALI(Llama3-8b-ins-8k-to-16k) | Average Score: 42.32 |
| long-context-understanding-on-l-eval | GALI(Llama3-8b-ins-4k-to-32k) | Average Score: 59.10 |
| long-context-understanding-on-longbench | GALI(Llama3-8b-ins-8k-to-16k) | Average Score: 45.17 |
| long-context-understanding-on-longbench | GALI(Llama3-8b-ins-4k-to-16k) | Average Score: 46.22 |
| long-context-understanding-on-longbench | GALI(Llama3-8b-ins-8k-to-32k) | Average Score: 45.38 |