5 months ago

Haozhe Wang Haoran Que Qixin Xu Minghao Liu Wangchunshu Zhou Jiazhan Feng Wanjun Zhong Wei Ye Tong Yang Wenhao Huang

Abstract

While the "deep reasoning" paradigm has spurred significant advances in verifiable domains like mathematics, its application to open-ended, creative generation remains a critical challenge. The two dominant methods for instilling reasoning, reinforcement learning (RL) and instruction distillation, falter in this area: RL struggles with the absence of clear reward signals and high-quality reward models, while distillation is prohibitively expensive and capped by the teacher model's capabilities. To overcome these limitations, we introduce REverse-Engineered Reasoning (REER), a new paradigm that fundamentally shifts the approach. Instead of building a reasoning process "forwards" through trial-and-error or imitation, REER works "backwards" from known-good solutions to computationally discover the latent, step-by-step deep reasoning process that could have produced them. Using this scalable, gradient-free approach, we curate and open-source DeepWriting-20K, a large-scale dataset of 20,000 deep reasoning trajectories for open-ended tasks. Our model, DeepWriter-8B, trained on this data, not only surpasses strong open-source baselines but also achieves performance competitive with, and at times superior to, leading proprietary models like GPT-4o and Claude 3.5.
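The "backwards" idea can be illustrated with a toy, gradient-free local search. This is only a sketch under stated assumptions: the abstract does not specify how candidate reasoning steps are proposed or scored, so `toy_score`, `toy_candidates`, and `reverse_engineer` are hypothetical names, and the word-overlap score is a stand-in for whatever model-based signal the authors actually use. The sketch starts from a known-good solution and a placeholder trajectory, then swaps in one step at a time whenever the swap makes the solution better "explained".

```python
from typing import Callable, List, Sequence, Tuple


def toy_score(query: str, steps: Sequence[str], solution: str) -> float:
    """Hypothetical score (lower is better): the fraction of solution words
    that appear in neither the query nor the reasoning steps."""
    context = set((query + " " + " ".join(steps)).lower().split())
    words = solution.lower().split()
    missing = sum(1 for w in words if w not in context)
    return missing / max(len(words), 1)


def reverse_engineer(
    query: str,
    solution: str,
    initial_steps: Sequence[str],
    candidates_for: Callable[[int, List[str]], List[str]],
    score: Callable[[str, Sequence[str], str], float] = toy_score,
    max_rounds: int = 5,
) -> Tuple[List[str], float]:
    """Gradient-free local search: refine one step at a time, keeping a
    candidate only if it strictly lowers the score for the known solution."""
    steps = list(initial_steps)
    best = score(query, steps, solution)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(steps)):
            for cand in candidates_for(i, steps):
                trial = steps[:i] + [cand] + steps[i + 1:]
                s = score(query, trial, solution)
                if s < best:
                    steps, best, improved = trial, s, True
        if not improved:
            break  # no single-step swap helps any more
    return steps, best


def toy_candidates(i: int, steps: List[str]) -> List[str]:
    """Hypothetical proposal function; a real system would presumably ask
    a language model to propose refinements of step i."""
    return [
        "plan a lighthouse setting",
        "introduce a keeper who is lonely",
        "end with a storm and a rescue",
    ]


query = "write a short story"
solution = "a lonely keeper in a lighthouse survives a storm"
initial = ["think", "think more"]
steps, final = reverse_engineer(query, solution, initial, toy_candidates)
```

No gradients or reward model are involved: the only signal is how well a candidate trajectory accounts for a solution that is already known to be good, which is the property the abstract emphasizes.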

Reverse-Engineered Reasoning for Open-Ended Generation