2 months ago

Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Mohammad Zbeeb, Hasan Abed Al Kader Hammoud, Bernard Ghanem

Abstract

Large language models often require costly optimization, such as reinforcement learning, to master complex reasoning tasks. This work demonstrates that reasoning ability, once learned, can be extracted and transferred between models as a compact task vector. We source two publicly available, identically initialized Qwen2.5 models, one fine-tuned with supervised fine-tuning (SFT) and the other with group relative policy optimization (GRPO) on the same dataset. From these, we extract a reasoning vector: $v_{\text{reason}} = \theta_{\text{GRPO}} - \theta_{\text{SFT}}$. We hypothesize that this vector captures the reasoning capability instilled by reinforcement learning while factoring out shared knowledge from the SFT process. When added to compatible instruction-tuned models through simple arithmetic, this vector consistently improves performance across diverse reasoning benchmarks: GSM8K (+4.9%), HumanEval (+4.3%), SciQ (+1.7%), and BigBenchHard (+12.3% for the 1.5B model). The performance improvements persist under adversarial conditions. Conversely, subtracting the vector causes significant performance degradation (-11.8% on GSM8K), demonstrating the vector's strong contribution to the model's reasoning abilities. This work shows how reasoning capabilities, typically developed through expensive training, can be extracted from existing open-source models and reused through simple tensor arithmetic, offering a practical way to enhance models by recycling prior computational investments.
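
The arithmetic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: parameters are modeled here as plain Python lists keyed by name (a hypothetical stand-in for real model weight tensors), the function names are invented for illustration, and the `scale` argument is an assumption added so that one helper covers both adding (`scale=1.0`) and subtracting (`scale=-1.0`) the vector.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's named parameters (real models use tensors).
StateDict = Dict[str, List[float]]


def _check_compatible(a: StateDict, b: StateDict) -> None:
    """Raise if the two parameter sets differ in names or sizes."""
    if a.keys() != b.keys():
        raise ValueError("models must share the same parameter names")
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")


def extract_reasoning_vector(theta_grpo: StateDict, theta_sft: StateDict) -> StateDict:
    """Compute v_reason = theta_GRPO - theta_SFT, parameter by parameter."""
    _check_compatible(theta_grpo, theta_sft)
    return {
        name: [g - s for g, s in zip(theta_grpo[name], theta_sft[name])]
        for name in theta_grpo
    }


def apply_vector(theta_target: StateDict, v_reason: StateDict, scale: float = 1.0) -> StateDict:
    """Compute theta_target + scale * v_reason (scale is an illustrative assumption)."""
    _check_compatible(theta_target, v_reason)
    return {
        name: [t + scale * v for t, v in zip(theta_target[name], v_reason[name])]
        for name in theta_target
    }


# Toy usage with made-up numbers; none of these values come from the paper.
theta_sft = {"w": [1.0, 2.0]}
theta_grpo = {"w": [1.5, 1.0]}
theta_target = {"w": [0.0, 4.0]}

v_reason = extract_reasoning_vector(theta_grpo, theta_sft)
enhanced = apply_vector(theta_target, v_reason)             # add the vector
ablated = apply_vector(theta_target, v_reason, scale=-1.0)  # subtract the vector
```

The compatibility check reflects the abstract's requirement that the vector be added to "compatible" models: element-wise arithmetic is only meaningful when both sides have the same parameter names and sizes.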
