AceMath Instruct Training Data Mathematical Reasoning Dataset
AceMath Instruct Training Data is a dataset released by NVIDIA in 2025 for training the AceMath models, with the goal of improving model performance on mathematical reasoning tasks. The accompanying paper is "AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling".
The dataset contains fine-tuning data for multiple training stages. general_sft_stage1 contains 2,261,687 samples, mainly instruction fine-tuning samples in the code and mathematics domains; general_sft_stage2 contains 1,634,573 samples, extending instruction fine-tuning to code, mathematics, and general domains; and math_sft, which is specialized for the mathematics domain, contains 1,661,094 samples focused on improving mathematical reasoning ability. The data was generated using a combination of the Qwen2.5-Math-72B-Instruct and GPT-4o-mini models, with the aim of keeping it diverse and high quality.
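To put the three subset sizes in proportion, the counts above can be tallied in a short Python sketch. The names `SPLIT_SIZES` and `split_shares` are illustrative and not part of any official tooling, and the sketch assumes the three subsets are counted independently of one another.

```python
# Sample counts as stated in the dataset description above.
SPLIT_SIZES = {
    "general_sft_stage1": 2_261_687,
    "general_sft_stage2": 1_634_573,
    "math_sft": 1_661_094,
}


def split_shares(sizes):
    """Return each subset's fraction of the combined sample count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# Combined size, assuming the subsets are simply added together.
TOTAL_SAMPLES = sum(SPLIT_SIZES.values())
SHARES = split_shares(SPLIT_SIZES)

if __name__ == "__main__":
    print(f"total samples: {TOTAL_SAMPLES:,}")
    for name, share in SHARES.items():
        print(f"{name}: {share:.1%}")
```

Under that assumption, the three subsets sum to 5,557,354 samples, with general_sft_stage1 the largest.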