7 months ago

Method/Architecture

Max Belitsky, Dawid J. Kopiczko, Michael Dorkenwald, M. Jehanzeb Mirza, Cees G. M. Snoek, Yuki M. Asano

Abstract

We propose cache steering, a lightweight method for implicit steering of language models via a one-shot intervention applied directly to the key-value cache. To validate its effectiveness, we apply cache steering to induce chain-of-thought reasoning in small language models. Our approach leverages GPT-4o-generated reasoning traces to construct steering vectors that shift model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modifications. Experimental evaluations on diverse reasoning benchmarks demonstrate that cache steering improves both the qualitative structure of model reasoning and quantitative task performance. Compared to prior activation steering techniques that require continuous interventions, our one-shot cache steering offers substantial advantages in terms of hyperparameter stability, inference-time efficiency, and ease of integration, making it a more robust and practical solution for controlled generation.
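
The abstract does not say how the steering vectors are computed or where in the cache they are applied, so the following is only a toy sketch of the general idea under stated assumptions. It assumes the vectors are the mean difference between cached keys (or values) extracted from paired prompts with and without a reasoning trace, and that they are added once to a single cache position before generation starts. The function names, the scaling coefficients `c_k` and `c_v`, and the single-layer, single-head list representation are all hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def mean_difference(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Average of (positive - negative) over paired examples.

    Assumption: `positive` holds cached vectors from prompts that include a
    reasoning trace, `negative` holds the matching vectors without one.
    """
    if len(positive) != len(negative) or not positive:
        raise ValueError("need a non-empty list of matched pairs")
    n = len(positive)
    dim = len(positive[0])
    return [
        sum(p[i] - q[i] for p, q in zip(positive, negative)) / n
        for i in range(dim)
    ]


def steer_cache(
    keys: List[Vector],
    values: List[Vector],
    key_vec: Vector,
    value_vec: Vector,
    position: int = -1,
    c_k: float = 1.0,
    c_v: float = 1.0,
) -> Tuple[List[Vector], List[Vector]]:
    """Return copies of the cache with the steering vectors added once.

    Only the entry at `position` is modified (hypothetical choice: the last
    prompt token). Nothing is changed during later decoding steps.
    """
    new_keys = [row[:] for row in keys]
    new_values = [row[:] for row in values]
    new_keys[position] = [k + c_k * s for k, s in zip(new_keys[position], key_vec)]
    new_values[position] = [v + c_v * s for v, s in zip(new_values[position], value_vec)]
    return new_keys, new_values


# Toy data: two paired examples, hidden size 2, cache with two token positions.
key_vec = mean_difference([[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 2.0]])
value_vec = mean_difference([[2.0, 2.0]], [[0.5, 0.5]])
steered_keys, steered_values = steer_cache(
    keys=[[0.0, 0.0], [1.0, 1.0]],
    values=[[0.0, 0.0], [2.0, 2.0]],
    key_vec=key_vec,
    value_vec=value_vec,
    c_k=2.0,
    c_v=1.0,
)
```

Because the cache is edited a single time before decoding, no hook has to run at every generated token, which is the contrast the abstract draws with activation steering methods that intervene continuously.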

