G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration (2506.09272v1)

Published 10 Jun 2025 in cs.LG and stat.ML

Abstract: Constructing robust simulators is essential for asking "what if?" questions and guiding policy in critical domains like healthcare and logistics. However, existing methods often struggle, either failing to generalize beyond historical data or, when using LLMs, suffering from inaccuracies and poor empirical alignment. We introduce G-Sim, a hybrid framework that automates simulator construction by synergizing LLM-driven structural design with rigorous empirical calibration. G-Sim employs an LLM in an iterative loop to propose and refine a simulator's core components and causal relationships, guided by domain knowledge. This structure is then grounded in reality by estimating its parameters using flexible calibration techniques. Specifically, G-Sim can leverage methods that are both likelihood-free and gradient-free with respect to the simulator, such as gradient-free optimization for direct parameter estimation or simulation-based inference for obtaining a posterior distribution over parameters. This allows it to handle non-differentiable and stochastic simulators. By integrating domain priors with empirical evidence, G-Sim produces reliable, causally-informed simulators, mitigating data-inefficiency and enabling robust system-level interventions for complex decision-making.

Summary

The paper introduces G-Sim, a hybrid framework that automates simulator construction by combining LLM-driven structural design with gradient-free empirical calibration.
It estimates parameters with gradient-free optimization or simulation-based inference, which lets it handle non-differentiable and stochastic simulators.
In experiments, G-Sim achieves lower Wasserstein distances than baseline methods, supporting its use for decision support in domains such as healthcare and logistics.

Overview of G-Sim: Generative Simulations with LLMs and Gradient-Free Calibration

This paper introduces G-Sim, a framework designed to facilitate automatic simulator construction by integrating the structural design capabilities of LLMs with rigorous parameter calibration techniques that do not rely on gradients. The framework's primary aim is to create robust simulators that can support decision-making and policy guidance in complex domains such as healthcare and logistics.

Key Contributions

Hybrid Framework: G-Sim combines LLM-driven structural reasoning with empirical data calibration. The process begins with LLMs proposing possible simulator architectures based on domain knowledge, followed by calibration of these structures using techniques adept at handling non-differentiable and stochastic simulations. This iterative cycle refines both the structure and parameters of the simulators, ensuring they align well with real-world data.
Gradient-Free Calibration: The framework uses gradient-free optimization (GFO) and simulation-based inference (SBI) to calibrate the numerical parameters of the models proposed by the LLM. GFO estimates parameters directly without requiring derivatives of the simulator, while SBI yields a posterior distribution over parameters and so quantifies the uncertainty in the estimates. Both are likelihood-free and gradient-free with respect to the simulator.
General-Purpose Simulation: The paper emphasizes the creation of simulators that are capable of generalization beyond historical data, facilitating the exploration of "what if" scenarios and supporting system-wide experimentation. These simulators aim to integrate diverse data sources, manage uncertainty, and remain consistent with real-world observations.
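
The gradient-free calibration step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy arrival simulator, the names `simulate`, `discrepancy`, and `calibrate`, and the choice of plain random search as the optimizer. This summary does not specify which optimizers or simulators the paper actually uses.

```python
import random


def simulate(rate, n_steps=50, seed=0):
    """Toy stochastic simulator (hypothetical): arrivals per time step.

    Each step runs 20 Bernoulli trials, so the output is a discrete count
    and has no usable gradient with respect to `rate`.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < rate for _ in range(20)) for _ in range(n_steps)]


def discrepancy(simulated, observed):
    """Absolute difference between mean counts (a simple summary-statistic loss)."""
    return abs(sum(simulated) / len(simulated) - sum(observed) / len(observed))


def calibrate(observed, n_trials=200, seed=1):
    """Random search: sample candidate rates and keep the lowest-loss one."""
    rng = random.Random(seed)
    best_rate, best_loss = None, float("inf")
    for trial in range(n_trials):
        rate = rng.uniform(0.0, 1.0)
        # Only forward simulations are needed; no derivatives are computed.
        loss = discrepancy(simulate(rate, seed=trial), observed)
        if loss < best_loss:
            best_rate, best_loss = rate, loss
    return best_rate, best_loss


# "Observed" data generated from a known rate so the estimate can be checked.
observed = simulate(0.3, seed=123)
estimated_rate, final_loss = calibrate(observed)
print(f"estimated rate: {estimated_rate:.3f}, loss: {final_loss:.3f}")
```

Random search is used here only because it is the simplest gradient-free method; evolutionary strategies or Bayesian optimization would slot into `calibrate` the same way, since the simulator is treated as a black box.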

Implementation and Results

The paper outlines the application of G-Sim on several benchmark environments, highlighting its efficacy in creating simulators that closely match ground truth dynamics. By comparing G-Sim with various baseline methods, the authors demonstrate that it achieves lower Wasserstein distances, indicating more accurate distributional predictions.
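
To make the metric concrete, here is a small sketch of the Wasserstein-1 distance between two one-dimensional empirical samples of equal size, where it reduces to the mean absolute difference between sorted values. The function name `wasserstein_1d` is hypothetical, and this summary does not say how the paper computes the distance (for example, over which state variables or with what sample sizes), so treat it as an illustration of the quantity rather than the authors' evaluation code.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal-size samples with uniform weights, the optimal transport plan
    matches the i-th smallest value of one sample to the i-th smallest value
    of the other.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    if not xs:
        raise ValueError("samples must be non-empty")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# A sample shifted by one unit is at distance 1 from the original.
real = [0.0, 1.0, 2.0]
simulated = [3.0, 1.0, 2.0]
print(wasserstein_1d(real, simulated))  # 1.0
```

A lower value means the simulated outcomes are distributed more like the real ones, which is the sense in which the comparison with baselines is reported.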

The iterative process allows for continuous refinement until the model's empirical alignment is satisfactory, as evidenced by experiments involving COVID-19 simulations, supply chain management, and hospital bed scheduling scenarios. The approach shows promise in handling scenarios unseen in training, such as varied infection rates or logistics lead times.
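
The propose, calibrate, and check cycle can be sketched as follows. This is a simplified illustration under stated assumptions: a fixed list of candidate structures stands in for the LLM's proposals, calibration is a grid search, and the acceptance test is a mean-absolute-error threshold. The names `linear_growth`, `proportional_growth`, `calibrate`, and `refine` are hypothetical and do not come from the paper.

```python
def linear_growth(a, x0=100.0, steps=10):
    """Candidate structure 1 (hypothetical): add a constant each step."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + a)
    return xs


def proportional_growth(a, x0=100.0, steps=10):
    """Candidate structure 2 (hypothetical): grow by a fixed fraction each step."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] * (1 + a))
    return xs


def mean_abs_error(simulated, observed):
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(observed)


def calibrate(structure, grid, observed):
    """Gradient-free grid search: return (loss, parameter) with the lowest loss."""
    return min((mean_abs_error(structure(a), observed), a) for a in grid)


def refine(proposals, observed, tolerance=1.0):
    """Try proposals in order until one fits the data within `tolerance`."""
    history = []
    for name, structure, grid in proposals:
        loss, param = calibrate(structure, grid, observed)
        # In an LLM-driven loop, this record would be fed back as critique.
        history.append((name, param, loss))
        if loss <= tolerance:
            return name, param, history
    return None, None, history


# Data generated from the proportional model with a known parameter.
observed = proportional_growth(0.1)
proposals = [
    ("linear_growth", linear_growth, [i * 0.5 for i in range(61)]),
    ("proportional_growth", proportional_growth, [i / 100 for i in range(51)]),
]
accepted, param, history = refine(proposals, observed)
print(accepted, param)
```

In this toy run the linear structure is calibrated first and rejected because even its best parameter misses the tolerance, after which the proportional structure is accepted. The point of the sketch is the division of labor: structure is proposed, parameters are fitted without gradients, and the empirical fit decides whether another round is needed.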

Implications and Future Directions

From a theoretical standpoint, G-Sim presents a novel approach to combining LLMs with empirical calibration, which could be particularly valuable in domains where traditional model-based approaches struggle with complexities such as stochasticity and partial observability. Practically, it suggests a method for leveraging machine learning to simulate vast and intricate real-world systems without extensive manual setup.

Potential future developments could explore extending G-Sim to support multi-scale simulations or integrating explicit fairness constraints to mitigate biases from historical data or the LLM's inherent biases. Furthermore, exploring active learning strategies within the calibration process could enhance the identification of key parameters and submodule interactions, leading to even more robust simulations.

In conclusion, G-Sim offers a pathway to automated, data-integrated simulator design. Its combination of LLM-proposed structure and empirical calibration could advance decision-support systems in high-stakes domains such as healthcare and logistics.