4 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Max Belitsky Dawid J. Kopiczko Michael Dorkenwald M. Jehanzeb Mirza Cees G. M. Snoek Yuki M. Asano

Abstract

We propose cache steering, a lightweight method for implicit steering of language models via a one-shot intervention applied directly to the key-value cache. To validate its effectiveness, we apply cache steering to induce chain-of-thought reasoning in small language models. Our approach leverages GPT-4o-generated reasoning traces to construct steering vectors that shift model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modifications. Experimental evaluations on diverse reasoning benchmarks demonstrate that cache steering improves both the qualitative structure of model reasoning and quantitative task performance. Compared to prior activation steering techniques that require continuous interventions, our one-shot cache steering offers substantial advantages in terms of hyperparameter stability, inference-time efficiency, and ease of integration, making it a more robust and practical solution for controlled generation.
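
The abstract does not spell out how the steering vectors are built or where they are applied, so the following is only a toy sketch under stated assumptions, not the authors' implementation. It assumes a steering vector is the difference between the mean cached vector of examples that contain reasoning traces and the mean of examples that do not (a common choice in activation steering), and that the one-shot intervention adds this vector, scaled by a strength coefficient, to the cached entry at the last prompt position before generation starts. All names (`steering_vector`, `steer_cache`, `strength`, `position`) are hypothetical, and plain Python lists stand in for real key or value tensors.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive: List[Vector], negative: List[Vector]) -> Vector:
    """ASSUMPTION: difference of means between cached vectors taken from
    examples with reasoning traces (positive) and without (negative)."""
    mean_pos = mean_vector(positive)
    mean_neg = mean_vector(negative)
    return [p - q for p, q in zip(mean_pos, mean_neg)]


def steer_cache(cache: List[Vector], steer: Vector,
                strength: float, position: int = -1) -> List[Vector]:
    """ASSUMPTION: one-shot intervention that adds the scaled steering
    vector to a single cached position; returns a copy, input untouched."""
    steered = [row[:] for row in cache]
    steered[position] = [x + strength * s
                         for x, s in zip(steered[position], steer)]
    return steered


# Toy data: 2-dimensional cached vectors for a 3-token prompt.
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 1.0], [2.0, 1.0]]
cache = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]

steer = steering_vector(positive, negative)
steered = steer_cache(cache, steer, strength=0.5)
print(steer)        # [1.0, 2.0]
print(steered[-1])  # [2.5, 3.0]
```

In this sketch the cache is modified once, before decoding, and then reused unchanged for every generated token. That is the contrast the abstract draws with activation steering methods that intervene at each generation step.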
