Teaching Physics to AI: When Simulators Become Teachers

This presentation explores Sim2Reason, an approach that turns physics simulators into scalable question-answer generators for training large language models. By automatically creating diverse, physically grounded problems and applying reinforcement learning with verifiable rewards, the researchers report gains of 5 to 10 percentage points on International Physics Olympiad mechanics problems, all without human annotation. The talk shows how synthetic data from classical mechanics simulations can teach AI physical reasoning that transfers to real-world problems.
Script
Large language models excel at math but struggle with physics, because physics problems are rare in training data—less than 1% of available question-answer pairs. The researchers behind Sim2Reason asked: what if we could turn physics simulators into automatic teachers, generating thousands of rigorous problems on demand?
The Sim2Reason framework starts with a domain-specific language that composes physical scenes—springs, masses, inclines—in combinatorially diverse ways. Each scene runs in MuJoCo, a physics simulator, generating precise trajectories. From these traces, the system automatically extracts numeric questions, reverse-engineering challenges, and symbolic reasoning problems, then filters out any shortcuts where answers don't depend on the full system.
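To make that pipeline concrete, here is a minimal sketch of the idea, not the Sim2Reason code itself. It assumes a toy scene vocabulary (`Mass`, `Spring`, `Scene`) and a hand-written integrator standing in for MuJoCo. All names are hypothetical, and the question template is invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Mass:
    name: str
    kg: float


@dataclass(frozen=True)
class Spring:
    name: str
    stiffness: float  # N/m


@dataclass(frozen=True)
class Scene:
    """Toy scene: one block on a frictionless floor, attached to one spring."""
    mass: Mass
    spring: Spring
    x0: float  # initial displacement in metres; the block is released from rest


def simulate(scene, t_end, dt=1e-4):
    """Stand-in for the simulator: semi-implicit Euler, returns (t, x) samples."""
    x, v = scene.x0, 0.0
    steps = round(t_end / dt)
    trace = [(0.0, x)]
    for i in range(1, steps + 1):
        a = -scene.spring.stiffness * x / scene.mass.kg
        v += a * dt  # update velocity first, then position
        x += v * dt
        trace.append((i * dt, x))
    return trace


def numeric_question(scene, t_query):
    """Turn a simulated trace into a (question text, numeric answer) pair."""
    _, x = simulate(scene, t_query)[-1]
    text = (
        f"A {scene.mass.kg} kg block on a frictionless floor is attached to a "
        f"spring of stiffness {scene.spring.stiffness} N/m and released from rest "
        f"at x = {scene.x0} m. What is its displacement at t = {t_query} s?"
    )
    return text, round(x, 3)


scene = Scene(Mass("block", 2.0), Spring("k", 8.0), x0=0.5)
question, answer = numeric_question(scene, t_query=1.0)
exact = 0.5 * math.cos(math.sqrt(8.0 / 2.0) * 1.0)  # closed-form sanity check
```

The answer comes from the simulated trace, not from a formula baked into the question template, which is what lets the same recipe scale to scenes with no convenient closed form.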
Here's the key insight: not all simulation questions are created equal. If you can remove a mass or spring from a scene and still get the same answer, that question teaches nothing about interaction. The filtering pipeline ablates each scene element and discards questions where answers remain invariant, ensuring the model must reason about genuine multi-body dynamics.
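The ablation check can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' filter: the scene is a dictionary of block masses pushed together across a frictionless floor, each question is a plain function, and a question is discarded if removing any single element leaves its answer unchanged. The function names and the tolerance are hypothetical.

```python
G = 9.81  # gravitational acceleration, m/s^2


def common_acceleration(masses, force):
    """Acceleration of blocks pushed together: depends on every block."""
    return force / sum(masses.values())


def floor_normal_force_on_a(masses, force):
    """Normal force from the floor on block A: ignores every other block."""
    return masses["A"] * G


def depends_on_every_element(question, masses, force, tol=1e-9):
    """Return True only if removing any single element changes the answer."""
    baseline = question(masses, force)
    for name in masses:
        ablated = {k: v for k, v in masses.items() if k != name}
        try:
            changed = abs(question(ablated, force) - baseline) > tol
        except (KeyError, ZeroDivisionError):
            changed = True  # the answer is undefined without this element
        if not changed:
            return False  # shortcut found: this element never mattered
    return True


scene_masses = {"A": 2.0, "B": 3.0}
candidates = [common_acceleration, floor_normal_force_on_a]
kept = [q for q in candidates if depends_on_every_element(q, scene_masses, 10.0)]
```

Only `common_acceleration` survives here, because the normal force on block A is the same whether or not block B is in the scene.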
When post-trained with reinforcement learning on just 6,400 synthetic questions, models at every scale tested, from 3 billion to 32 billion parameters, gain 5 to 10 percentage points on International Physics Olympiad mechanics problems. The 32-billion-parameter model jumps nearly 18 points on JEE Bench. Remarkably, these physics-trained models also improve on pure math benchmarks, which suggests deeper quantitative reasoning rather than memorization.
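A verifiable reward works here because the simulator supplies a ground-truth number for every question. The sketch below shows one plausible shape for such a reward. The `Answer:` output convention, the 1% relative tolerance, and the binary 0/1 scoring are assumptions made for illustration, not details confirmed by the talk.

```python
import math
import re

# Assumed convention: the model ends its response with "Answer: <number>".
_ANSWER = re.compile(r"Answer:\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)")


def extract_final_answer(response):
    """Return the last number following 'Answer:', or None if there is none."""
    matches = _ANSWER.findall(response)
    return float(matches[-1]) if matches else None


def verifiable_reward(response, reference, rel_tol=0.01):
    """Binary reward: 1.0 if the stated answer matches the simulated value."""
    predicted = extract_final_answer(response)
    if predicted is None:
        return 0.0  # no parseable answer, so no reward
    close = math.isclose(predicted, reference, rel_tol=rel_tol, abs_tol=1e-9)
    return 1.0 if close else 0.0


good = verifiable_reward("The block oscillates... Answer: -0.208 m", -0.20807)
bad = verifiable_reward("Answer: 0.5 m", -0.20807)
missing = verifiable_reward("I am not sure.", -0.20807)
```

No human grader is involved: the reward needs only the response text and the number the simulator produced.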
The domain-specific language isn't just a data generator; it's a semantic layer that scales. When researchers need to simulate novel competition problems, they task a language model with extending the entity vocabulary. This abstraction succeeds where direct code generation fails, and the same entities port cleanly from MuJoCo to Omniverse, which suggests the approach generalizes across physics engines.
Sim2Reason removes the data bottleneck for scientific reasoning by transforming simulators into tireless teachers. The result isn't simulation memorization—it's genuine transfer to Olympiad-level problems the models have never seen. If you're curious how synthetic physics can teach real reasoning, visit EmergentMind.com to explore this work and generate your own deep dives into the research shaping AI's next frontier.