A Critical Review of Causal Reasoning Benchmarks for Large Language Models