8 months ago

Supervised Fine-Tuning

Method/Architecture

Yu Sun, Xingyu Qian, Weiwen Xu, Hao Zhang, Chenghao Xiao, Long Li, Yu Rong, Wenbing Huang, Qifeng Bai, Tingyang Xu

Abstract

Though reasoning-based large language models (LLMs) have excelled in mathematics and programming, their capabilities in knowledge-intensive medical question answering remain underexplored. To address this, we introduce ReasonMed, the largest medical reasoning dataset, comprising 370k high-quality examples distilled from 1.7 million initial reasoning paths generated by various LLMs. ReasonMed is constructed through a multi-agent verification and refinement process, where we design an Error Refiner to enhance the reasoning paths by identifying and correcting error-prone steps flagged by a verifier. Leveraging ReasonMed, we systematically investigate best practices for training medical reasoning models and find that combining detailed Chain-of-Thought (CoT) reasoning with concise answer summaries yields the most effective fine-tuning strategy. Based on this strategy, we train ReasonMed-7B, which sets a new benchmark for sub-10B models, outperforming the prior best by 4.17% and even exceeding LLaMA3.1-70B on PubMedQA by 4.60%.
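As a rough illustration of the fine-tuning strategy described above, the sketch below pairs a detailed CoT with a concise answer summary in a single supervised target. The abstract does not specify a template, so the class `ReasoningExample`, the functions `build_sft_target` and `to_sft_record`, the `Reasoning:`/`Summary:` labels, and the `prompt`/`completion` keys are all hypothetical names chosen for this example, not the authors' format.

```python
from dataclasses import dataclass


@dataclass
class ReasoningExample:
    """One training example (hypothetical schema, not from the paper)."""
    question: str
    cot: str       # detailed chain-of-thought reasoning path
    summary: str   # concise answer summary


def build_sft_target(example: ReasoningExample) -> str:
    """Join the detailed reasoning and the concise summary into one target string."""
    # The section labels are assumptions; the abstract gives no template.
    return (
        f"Reasoning:\n{example.cot.strip()}\n\n"
        f"Summary:\n{example.summary.strip()}"
    )


def to_sft_record(example: ReasoningExample) -> dict:
    """Build a prompt/completion pair of the kind many SFT trainers accept."""
    return {
        "prompt": example.question.strip(),
        "completion": build_sft_target(example),
    }


if __name__ == "__main__":
    demo = ReasoningExample(
        question="Which vitamin deficiency causes scurvy?",
        cot="Scurvy results from impaired collagen synthesis, which requires vitamin C.",
        summary="Vitamin C.",
    )
    print(to_sft_record(demo)["completion"])
```

Placing the summary after the reasoning keeps the final answer easy to extract at evaluation time; that ordering is a design choice of this sketch rather than something the abstract states.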

