Thor: Multidisciplinary Scientific Frameworks
- Thor is a collection of advanced scientific frameworks spanning astrophysics, exoplanet atmospheric modeling, radiative transfer, formal theorem proving, and RL-based mathematical reasoning.
- It underpins high-resolution radio surveys for supernova remnant discovery, non-hydrostatic climate simulations, GPU-accelerated Monte Carlo radiative transfer, and neuro-symbolic proof search in formal logic.
- The modular and reproducible methodologies enhance computational efficiency and drive scientific discoveries across diverse fields by integrating empirical data with advanced numerical and symbolic techniques.
Thor refers to several advanced scientific frameworks and technologies across astrophysics, atmospheric sciences, mathematical reasoning with LLMs, formal theorem proving, and high-performance computational radiative transfer. The term encompasses internationally recognized survey programs, open-source simulation codes, hybrid machine learning/theorem-proving architectures, and tool-augmented reasoning systems.
1. Thor in Galactic Survey Science and Supernova Remnant Discovery
The HI/OH/Recombination-line survey of the Milky Way (THOR) is a high-resolution radio survey undertaken with the VLA in C-configuration, delivering spectral-line imaging of the 21 cm HI line, ground-state OH lines near 1.6 GHz, and Hnα radio recombination lines in the L-band (1–2 GHz) (Dokara et al., 2018). This survey achieves spatial resolutions of ≃20″ per pointing (continuum) and ≃25″ when combined with VGPS continuum data (1.4 GHz).
A critical application of THOR is the identification and confirmation of supernova remnants (SNRs) within the Galactic plane. The primary discriminants between SNRs and H II regions in the THOR context are:
- Radio spectral index (α): distinguishes thermal (H II, α ≃ 0) vs. non-thermal (SNR shell, α ∼ −0.5 to −0.7, even −1.0) emission. Broad-band maps spanning 150–1400 MHz enable direct pixel-by-pixel measurement and spatial correlation with shell morphologies.
- Fractional linear polarization (p): usually suppressed in both SNRs and H II regions in the inner Galaxy by Faraday depolarization and diffuse polarized backgrounds, and therefore of little diagnostic power here.
- Shell morphology: Limb-brightened structures in the THOR+VGPS continuum, in spatial alignment with negative spectral index regions.
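The spectral-index discriminant can be illustrated with a two-point estimate. The sketch below assumes the convention S ∝ ν^α; the flux densities are made-up illustrative values rather than survey measurements, and the `thermal_cut` threshold is a hypothetical choice, not a value from Dokara et al.

```python
import math

def spectral_index(s1, nu1, s2, nu2):
    """Two-point spectral index alpha, using the convention S ∝ nu**alpha."""
    if min(s1, s2, nu1, nu2) <= 0 or nu1 == nu2:
        raise ValueError("fluxes and frequencies must be positive, with nu1 != nu2")
    return math.log(s1 / s2) / math.log(nu1 / nu2)

def classify(alpha, thermal_cut=-0.25):
    """Crude label (assumed threshold): flat spectrum -> thermal, steep -> non-thermal."""
    return "thermal (H II-like)" if alpha > thermal_cut else "non-thermal (SNR-like)"

# Illustrative flux densities (Jy) at 150 MHz and 1400 MHz; not survey data.
alpha_snr = spectral_index(3.82, 150.0, 1.0, 1400.0)   # steep, about -0.6
alpha_hii = spectral_index(1.0, 150.0, 1.0, 1400.0)    # flat, 0
print(round(alpha_snr, 2), classify(alpha_snr))
print(round(alpha_hii, 2), classify(alpha_hii))
```

In practice the same formula is applied pixel by pixel to maps smoothed to a common resolution, so that steep-spectrum pixels can be compared with the shell morphology.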
Dokara et al. confirmed two SNR candidates—G27.06+0.04 (single 6′ partial shell, α ≲ −0.5) and G51.26+0.11 (11.3′ shell, α ≃ −0.6)—by joint use of THOR radio continuum morphology and spectral index maps. Polarization-based separation was ineffective due to contamination by diffuse Galactic synchrotron emission (Dokara et al., 2018).
Pulsar–SNR associations are geometrically plausible for several candidates but remain tentative because distances and characteristic ages are unknown.
2. Thor in Exoplanet and Atmospheric Modeling: The THOR General Circulation Model
THOR 2.0 is a first-principles, open-source general circulation model (GCM) constructed for exoplanet atmospheric studies, explicitly avoiding Earth- or Solar System-specific empirical tunings (Deitrick et al., 2019). Unlike earlier platforms, THOR solves the full non-hydrostatic Euler equations on a sphere, which comprise:
- Mass continuity
- Momentum conservation
- An energy equation, written in terms of potential temperature
- An equation of state
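For orientation, one standard flux-form statement of these four equations for a dry, adiabatic, inviscid atmosphere is given below. The symbols are generic assumptions (density ρ, velocity v, pressure p, temperature T, potential temperature θ, gravitational acceleration g, rotation vector Ω, gas constant R_d, specific heat c_p, reference pressure p_0), and THOR's exact formulation, including its forcing and dissipation terms, may differ in detail.

```latex
\begin{align}
\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}) &= 0, \\
\frac{\partial (\rho\,\mathbf{v})}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}\otimes\mathbf{v})
  &= -\nabla p - \rho g\,\hat{\mathbf{r}} - 2\rho\,\boldsymbol{\Omega}\times\mathbf{v}, \\
\frac{\partial (\rho\,\theta)}{\partial t} + \nabla\cdot(\rho\,\theta\,\mathbf{v}) &= 0, \\
p &= \rho R_d T, \qquad \theta = T\left(\frac{p_0}{p}\right)^{R_d/c_p}.
\end{align}
```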
THOR uses an icosahedral-geodesic grid, which gives nearly uniform angular resolution and avoids pole singularities.
Time integration follows the Horizontally Explicit Vertically Implicit (HEVI) paradigm, combining explicit stepping for horizontal dynamics with implicit solvers for vertically stiff acoustic/gravity modes. The core numerics retain full vertical accelerations (non-hydrostatic deep, NHD), but can be specialized to quasi-hydrostatic (QHD) and hydrostatic shallow (HSS) modes by selectively removing acceleration terms.
Physics modules included are two-stream, double-grey radiative transfer and dry convective adjustment. For stability, a Rayleigh-drag “sponge layer” at the model top damps spurious wave reflections, with a damping coefficient that is a function of fractional height and is applied only above a threshold height.
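A minimal sketch of such a sponge layer follows. The sin² ramp, the threshold `eta_sp`, the peak rate `k_max`, and the backward-Euler update are all illustrative assumptions, not values or code taken from THOR.

```python
import math

def sponge_coefficient(eta, eta_sp=0.75, k_max=1.0e-4):
    """Rayleigh damping rate (1/s) versus fractional height eta = z / z_top.

    Zero below the threshold eta_sp, then a smooth sin^2 ramp up to k_max at
    the model top. Ramp shape and default values are assumptions.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    if eta < eta_sp:
        return 0.0
    return k_max * math.sin(0.5 * math.pi * (eta - eta_sp) / (1.0 - eta_sp)) ** 2

def damp(value, target, eta, dt):
    """One backward-Euler step of d(value)/dt = -k(eta) * (value - target)."""
    k = sponge_coefficient(eta)
    return (value + k * dt * target) / (1.0 + k * dt)

# A 10 m/s wind is untouched below the sponge and relaxed toward 0 at the top.
print(damp(10.0, 0.0, eta=0.5, dt=100.0))
print(damp(10.0, 0.0, eta=1.0, dt=100.0))
```

The implicit update is used here because it stays stable for any time step, which is convenient when the damping rate is large near the model top.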
Benchmarks include tidally-locked Earth, deep hot Jupiter atmospheres, acoustic and gravity wave propagation. For giant planets, non-hydrostatic treatments are necessary to capture vertical wave–mean-flow coupling and angular-momentum transport.
The benchmarks track conservation of total mass and total energy over 10³–10⁴ simulated days. The modular code is CUDA/Python-based and validated against these benchmarks (Deitrick et al., 2019).
3. Thor in High-Performance Radiative Transfer
THOR is a state-of-the-art, distributed-memory, multi-target Monte Carlo radiative transfer (MCRT) code focused on resonant emission lines, notably Lyman-α and Mg II (Byrohl et al., 15 Jul 2025). The code is written in C++17 with hybrid MPI and SYCL support, enabling execution on CPUs, GPUs, and APUs with 10–50× speedups relative to CPU-only codes across use cases.
Key abstractions include:
- Datasets: Uniform grids (with SPH and Voronoi-to-grid support), meshless shells, infinite slabs.
- Drivers: Resonant-line MCRT and ray-tracing for spectra/surface-brightness maps/volume rendering.
- Interactors: Physical processes—resonant line scattering (with full Voigt-profile and velocity sampling), dust, doublet lines.
- Generators: Photon sources tied to global or particle-based emission models.
- OutputProcessors: Support for surface-brightness maps, spectra, integral field cubes.
The radiative transfer algorithm uses photon packages emitted according to astrophysical emissivities (e.g., case-B recombination, collisional excitation), propagating until a target optical depth τ_target (drawn from Exp(1)) is reached, at which point they interact via scattering or absorption. Resonant line interactions employ velocity and frequency redistribution based on atomic and environmental properties. Acceleration techniques such as core-skipping and peeling-off optimize computational performance and observer-directed statistics.
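The optical-depth sampling step can be sketched in a few lines. The example below is not THOR code: it assumes photons travel straight through a uniform slab along its normal with no scattering, so the fraction that crosses without interacting should approach exp(−τ_slab). The function names are hypothetical.

```python
import math
import random

def sample_target_tau(rng):
    """Draw tau_target from Exp(1) by inverse-transform sampling."""
    return -math.log(1.0 - rng.random())  # 1 - u lies in (0, 1], so log is finite

def escape_fraction(tau_slab, n_photons=100_000, seed=1):
    """Fraction of photons that cross a uniform slab without interacting."""
    rng = random.Random(seed)
    escaped = sum(sample_target_tau(rng) > tau_slab for _ in range(n_photons))
    return escaped / n_photons

f_esc = escape_fraction(1.0)
print(f_esc, math.exp(-1.0))  # Monte Carlo estimate next to the analytic value
```

A full resonant-line code repeats this draw after every scattering and also updates the photon's direction and frequency, which this sketch omits.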
Performance validation on CPUs and GPUs demonstrates strong and weak scaling up to hundreds of accelerators; synthetic astrophysical applications cover shell models, z ≃ 6 galaxy Lyα post-processing, CGM halos, cosmic web Lyα emission, and Lyα forest spectra. Results match analytic and published Monte Carlo solutions (e.g., Neufeld sphere, dust slab escape fractions, Mg II P Cygni profiles).
A YAML-based configuration manages runs, specifying grid types, physics, accelerations, sources, and output. Forthcoming features include AMR/Voronoi mesh support, coupled ionization/thermal updates, additional physics, and radiative-hydrodynamics feedback (Byrohl et al., 15 Jul 2025).
4. Thor in Hybrid Theorem Proving: Neuro-symbolic Integration with Hammers
Thor is a general framework for integrating LLMs with automated theorem provers (ATPs) through “hammers,” targeting efficient premise selection and proof automation in interactive theorem provers (ITPs) like Isabelle, Coq, and HOL4 (Jiang et al., 2022).
In this paradigm:
- Proof Search: The prover’s state $s_t$ is rendered as text, and the LLM proposes a proof step $a_t$, which is applied in the ITP to yield the next state $s_{t+1}$.
- Task Delegation: All premise selection decisions are handled by the hammer—a subroutine that executes ATPs on the current goal and a large pool of candidate facts, scoring and reconstructing successful proofs in the ITP kernel. The LLM handles all other proof reasoning maneuvers.
- Inference Flow: Given a proof state, the LLM outputs either a routine step or a `<hammer>` token, invoking the hammer. On success, the hammer injects an ATP subproof using the minimal set of premises.
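The division of labor described in this list can be sketched as a simple loop. Everything here is a hypothetical stand-in: `propose_step`, `run_hammer`, and `apply_step` are placeholder callables for the LLM, the hammer, and the ITP, the `"no goals"` sentinel is invented for the toy, and the search is a greedy single path rather than whatever search strategy the authors use.

```python
from typing import Callable, List, Optional

HAMMER = "<hammer>"

def prove(initial_state: str,
          propose_step: Callable[[str], str],
          run_hammer: Callable[[str], Optional[str]],
          apply_step: Callable[[str, str], Optional[str]],
          max_steps: int = 16) -> Optional[List[str]]:
    """Greedy single-path search; returns the proof steps, or None on failure."""
    state, proof = initial_state, []
    for _ in range(max_steps):
        step = propose_step(state)
        if step == HAMMER:
            step = run_hammer(state)      # the hammer selects premises itself
            if step is None:
                return None
        next_state = apply_step(state, step)
        if next_state is None:            # the ITP rejected the step
            return None
        proof.append(step)
        if next_state == "no goals":
            return proof
        state = next_state
    return None

# Toy stand-ins so the sketch runs; a real system would call an LLM and an ITP.
def toy_llm(state): return "induct n" if state == "goal" else HAMMER
def toy_hammer(state): return "by (metis add.commute)"
def toy_itp(state, step): return "subgoal" if step == "induct n" else "no goals"

print(prove("goal", toy_llm, toy_hammer, toy_itp))
```

The point of the design is that the language model never has to name a premise: it only decides where the hammer should be tried.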
The generic hammer protocol involves format translation, ATP calls (E, Vampire, SPASS, Z3, etc.), and reconstruction into the ITP. In Isabelle, for example, higher-order goals are first-order reduced (Meng–Paulson translation).
Empirical evaluation on the PISA dataset (3,000 held-out theorems):
| Method | Success Rate (%) |
|---|---|
| LLM alone | 39.0 |
| Sledgehammer alone | 25.7 |
| LLM ∪ Sledgehammer | 48.8 |
| Thor | 57.0 |
Thor proves 8.2% of problems that neither subsystem proves alone (57.0% versus 48.8% for their union). On MiniF2F (488 problems), Thor achieves 29.9%, exceeding expert-iteration baselines at reduced computational cost.
Instantiating Thor follows a fixed protocol: the training data is augmented with hammer success/failure outcomes, the LLM is fine-tuned on the processed dataset, and at inference the hammer is invoked whenever the model emits the hammer token. Ablation studies confirm that learning “when” (but not “how”) to use the hammer critically improves performance, and that even minimal local proof context enhances learning (Jiang et al., 2022).
5. Thor in Tool-Augmented Mathematical Reasoning with RL Optimization
THOR (“Tool-Integrated Hierarchical Optimization via RL”) is an RL-based framework for mathematical reasoning in LLMs, designed to address failures in high-precision computation and formal symbolic manipulation by tightly integrating external tools (e.g., Python, SymPy, NumPy) (Chang et al., 17 Sep 2025). The architecture comprises:
- TIRGen: A two-agent (actor-critic) pipeline generates policy-aligned tool-integrated reasoning data. The actor (typically an autoregressive LLM) constructs stepwise verbal reasoning; the critic judges if a step should be delegated to external code, leading to generation and execution of tool calls.
- Hierarchical Reinforcement Learning: Two reward streams are optimized:
  - Trajectory-level RL maximizes solution correctness for entire problems using a PPO variant (GRPO).
  - Step-level RL retroactively corrects failed code calls, resampling and retraining the LLM to emit executable code in ambiguous cases.
- Self-Correction: During inference, failed tool calls trigger local backtracking and regeneration of the minimally necessary suffix of the failed reasoning step, iterated up to a fixed maximum number of attempts, reducing error propagation.
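The self-correction behavior can be sketched as a bounded retry loop. This is an illustration under assumptions, not THOR's implementation: `generate_suffix` is a hypothetical stand-in for the LLM, `max_attempts=3` is an arbitrary choice, and `exec` is used only for brevity (untrusted model output should run in a sandbox).

```python
def run_tool(code):
    """Execute a code snippet; return (ok, result_or_error_message)."""
    scope = {}
    try:
        exec(code, scope)  # illustration only; sandbox real model output
        return True, scope.get("result")
    except Exception as err:
        return False, f"{type(err).__name__}: {err}"

def solve_step(prefix, generate_suffix, max_attempts=3):
    """Regenerate only the suffix of a step until its tool call succeeds."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = generate_suffix(prefix, feedback)
        ok, out = run_tool(code)
        if ok:
            return out, attempt
        feedback = out                    # pass the error to the next attempt
    return None, max_attempts

def toy_generator(prefix, feedback):
    # First try has a bug (undefined name); the retry reacts to the feedback.
    return "result = 2 ** 10" if feedback else "result = two ** 10"

print(solve_step("Compute 2^10.", toy_generator))  # (1024, 2)
```

Keeping the prefix fixed and regenerating only the failed suffix is what limits the cost of each correction to a single step rather than the whole solution.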
Empirical results on mathematical and code benchmarks show that, for non-reasoning models, THOR-7B reaches 61.2% average accuracy (vs. 35.7% for tool-augmented baselines), while THOR-Thinking-8B achieves 79.8% vs. 74.5% for chain-of-thought LLMs. Best-of-N self-rewarded selection raises performance further (e.g., to 83.2% for 8B models). Code benchmarks (HumanEval⁺, MBPP⁺, LiveCodeBench) show improvements of 2–4 points in pass@1 without code-specific tuning. RL ablations document cumulative gains from hierarchical objectives and self-correction (Chang et al., 17 Sep 2025).
The core optimization quantities are the trajectory likelihood and the hierarchical RL objective, which combines the trajectory-level and step-level terms.
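The trajectory-level term is described as using GRPO, which replaces a learned value baseline with rewards normalized within a group of sampled solutions to the same problem. The sketch below shows only that normalization step; the binary rewards and the `eps` constant are illustrative assumptions, and implementations differ on whether the population or sample standard deviation is used.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each sampled trajectory's reward by its group's mean and std."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)      # population std; a choice, not a rule
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled solutions to one problem: reward 1.0 if correct, else 0.0.
adv = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(adv)
```

When every sample in a group receives the same reward, all advantages are zero and that group contributes no gradient signal.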
6. Comparative Architecture and Scientific Domains
| Name/Context | Domain | Core Function |
|---|---|---|
| THOR (survey) | Galactic astronomy | VLA+VGPS radio mapping, SNR discovery |
| THOR (GCM) | Exoplanet science | 3D non-hydrostatic atmospheric modeling |
| THOR (MCRT) | Computational astro | GPU-accelerated radiative transfer for emission lines |
| Thor (theorem proving) | ML+formal logic | LLM–hammered ATP neuro-symbolic proof search |
| THOR (RL reasoning) | ML/math reasoning | Tool-integrated, RL-optimized LLM framework |
Each THOR instantiation is distinguished by rigorously implemented numerical/symbolic algorithms, support for open reproducibility, and high domain impact through integration (or replacement) of legacy pipelines.
7. Outlook and Future Developments
The surveyed THOR frameworks are under ongoing extension:
- Astrophysical THOR codes are broadening geometric mesh support (AMR, unstructured), coupling with non-equilibrium chemistry, and exploring joint radiative–hydrodynamic feedback via on-the-fly energy/momentum injection (Byrohl et al., 15 Jul 2025).
- General circulation THOR is targeting moist and hydrologic cycles, advanced cloud microphysics, tracer chemistry, and magnetohydrodynamics—all within the original non-hydrostatic formulation (Deitrick et al., 2019).
- Tool-integrated THOR for reasoning is generalizing to diverse code tools, with focus on self-improving reasoning abilities and code/data co-training to enhance both mathematical and programming generalization (Chang et al., 17 Sep 2025).
- Neuro-symbolic Thor presents a blueprint for modular hybrid theorem proving, with demonstrated efficiency and extensibility to most mainstream interactive proof environments (Jiang et al., 2022).
The coherence across all uses of Thor lies in modular, scalable, and high-accuracy methods, spanning empirical discovery, physical simulation, and formal symbolic reasoning.