On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models