Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation