Stabilizing Reinforcement Learning with LLMs: Formulation and Practices