Q. "What is the number to the left of 1000?"
Step 0 — think. "… checking traffic lights … license plates … bus route numbers … none match 1000 … maybe the question is a trick. The number to the left of 1000 in numerical order is 999 … Let me try image_zoom_in … [reasoning continues without ever emitting a tool call] … Given that this is a common math question, the answer is likely 999."
No tool call was emitted.
Answer. 999
Step 0 — think. "… checking phone numbers and license plates … under SUNBEAM THEATRE there's a section with 'Tel: 2563 1000' … Let me confirm with image_zoom_in."
Step 0 — tool call.
image_zoom_in({"bbox_2d": [870, 450, 945, 517],
"label": "Telephone number sign", "img_idx": 0})
Step 1 — think. "The cropped image clearly shows 'Tel: 2563 1000'. The number to the left of 1000 is 2563."
Answer. 2563
Both prefixes mention image_zoom_in as a candidate,
but only AXPO commits. GRPO talks itself into a numerical-order
guess and never invokes the tool — same pattern across all four
GRPO rollouts.