SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

Abstract
We present a scientific reasoning foundation model that aligns natural language with heterogeneous scientific representations. The model is pretrained on a 206B-token corpus spanning scientific text, pure sequences, and sequence-text pairs, then aligned via SFT on 40M instructions, annealed cold-start bootstrapping to elicit long-form chain-of-thought, and reinforcement learning with task-specific reward shaping, which instills deliberate scientific reasoning. It supports five capability families, covering up to 103 tasks across workflows: (i) faithful translation between text and scientific formats, (ii) text/knowledge extraction, (iii) property prediction, (iv) property classification, and (v) unconditional and conditional sequence generation and design. Compared with specialist systems, our approach broadens instruction coverage, improves cross-domain generalization, and enhances fidelity. We detail data curation and training and show that cross-discipline learning strengthens transfer and downstream reliability. The model, instruction-tuning datasets, and evaluation code are open-sourced at https://huggingface.co/SciReason and https://github.com/open-sciencelab/SciReason.