Date

a year ago

Organization

Paper URL

arxiv.org

Tags

AceReason-1.1-SFT is a diverse, high-quality supervised fine-tuning (SFT) dataset for mathematical and code reasoning, released by NVIDIA in 2025. It accompanies the paper "AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy" and serves as the SFT training data for the math and code reasoning model AceReason-Nemotron-1.1-7B. All answers in the dataset were generated by DeepSeek-R1. The dataset contains 2,668,741 math samples and 1,301,591 code samples (3,970,332 in total), covering data from OpenMathReasoning, NuminaMath-CoT, OpenCodeReasoning, MagicoderEvolInstruct, opc-sft-stage2, leetcode, TACO, and apps. The data has been cleaned, and any sample that shares a 9-gram with a test sample from the math and coding benchmarks has been filtered out.
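The 9-gram overlap filter mentioned above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the description does not say how text is tokenized or normalized, so the lowercased, whitespace-split tokens used here are an assumption, and the function names (`ngrams`, `build_benchmark_index`, `decontaminate`) and the sample strings are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

NGRAM_SIZE = 9  # the 9-gram window named in the dataset description

NGram = Tuple[str, ...]


def ngrams(text: str, n: int = NGRAM_SIZE) -> Set[NGram]:
    """Return the set of word-level n-grams in `text`.

    Assumption: tokens are lowercased and split on whitespace.
    Texts shorter than n tokens yield an empty set.
    """
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_benchmark_index(test_samples: Iterable[str], n: int = NGRAM_SIZE) -> Set[NGram]:
    """Collect every n-gram that appears in any benchmark test sample."""
    index: Set[NGram] = set()
    for sample in test_samples:
        index |= ngrams(sample, n)
    return index


def decontaminate(train_samples: Iterable[str],
                  benchmark_index: Set[NGram],
                  n: int = NGRAM_SIZE) -> List[str]:
    """Keep only training samples that share no n-gram with the benchmark index."""
    return [s for s in train_samples if ngrams(s, n).isdisjoint(benchmark_index)]


if __name__ == "__main__":
    # Hypothetical benchmark and training prompts, for illustration only.
    benchmark = [
        "Find the sum of all positive integers n such that n squared plus one is prime",
    ]
    train = [
        "Problem: find the sum of all positive integers n such that n is even",
        "Write a function that reverses a linked list in place",
        "Compute 2 + 2",
    ]
    index = build_benchmark_index(benchmark)
    for sample in decontaminate(train, index):
        print(sample)
```

In this sketch the first training prompt is dropped because it shares the nine-word run "find the sum of all positive integers n such" with the benchmark prompt, while the other two are kept. A production filter would likely normalize punctuation and whitespace more carefully and hash the n-grams to save memory.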

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at support@hyper.ai for prompt review and removal.

AceReason-1.1-SFT Mathematical Code Reasoning Dataset

Related Datasets

CHOCLO Latin American Cultural Benchmark Dataset

COCO-2017-Vietnamese Vietnamese Image Detection Dataset

DRACO Cross-Disciplinary Deep Research Benchmark Dataset

Nemotron Personas France (French Synthetic Personas Dataset)

zh-meme-sft-8k Chinese Internet Meme Culture Dataset

CHIMERA General Inference Synthetic Dataset

THINGS-MEG Magnetoencephalography Dataset

THINGS-fMRI Functional Magnetic Resonance Imaging Dataset

Nemotron-Personas-Brazil Brazilian Synthetic Character Dataset

Diabetes Mexico (Mexico Diabetes Dataset)

Nemotron-Math-v2 Mathematical Inference Dataset

GroundingME Complex Scene Understanding Evaluation Dataset

MCIF Multimodal Cross-Language Instruction Following Dataset

TxT360-3efforts Multi-Task Inference Dataset
