KLong: Particle Physics, Detectors & LLM Agent
- KLong is a multi-faceted term describing the long-lived neutral kaon, specialized experimental detectors, and an open-source LLM agent for extensive task automation.
- In high-energy experiments like Belle II and Jefferson Lab's KLong Facility, optimized detector systems enhance CP violation studies and strange-quark spectroscopy.
- The KLong LLM agent employs trajectory-splitting and progressive reinforcement learning to effectively manage and replicate extremely long-horizon tasks.
KLong refers to three distinct entities in scientific research: (1) KLong ($K_L^0$), the long-lived neutral kaon, a key particle in high-energy and nuclear physics; (2) specialized KLong detectors and experimental facilities, such as in the Belle II experiment and at the KLong Facility at Jefferson Lab, which study rare hadronic and hyperon processes; and (3) KLong, an open-source LLM agent engineered specifically for solving extremely long-horizon tasks in machine learning research and software engineering. This article covers each of these meanings in its scientific and technical context.
1. $K_L^0$: The Long-Lived Neutral Kaon
The $K_L^0$ (KLong) is a neutral kaon eigenstate with a long lifetime ($\tau \approx 5.1 \times 10^{-8}$ s), arising from a superposition of $K^0$ and $\bar{K}^0$. It plays a crucial role in studies of CP violation, hadron spectroscopy, and rare decay processes. At modern accelerator facilities, precise production, tagging, and detection of $K_L^0$ beams underpin comprehensive programs in both Standard Model precision tests and hadronic structure searches (Dobbs, 2022).
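To give a sense of what the long lifetime means in the laboratory, the sketch below evaluates the mean decay length $L = (p/m)\,c\tau$ for a few momenta. The constants are rounded PDG values entered by hand (kaon mass, $c\tau$ for $K_L^0$ and $K_S^0$), and the function name is illustrative only.

```python
# Mean lab-frame decay length of a neutral kaon: L = beta*gamma * c*tau = (p/m) * c*tau.
# Constants are rounded PDG values typed in for illustration, not read from a database.
K_MASS_GEV = 0.497611   # neutral kaon mass in GeV/c^2
C_TAU_KL_M = 15.34      # c*tau of the long-lived state, in metres
C_TAU_KS_M = 0.026844   # c*tau of the short-lived state, in metres


def mean_decay_length_m(p_gev: float, c_tau_m: float,
                        mass_gev: float = K_MASS_GEV) -> float:
    """Return the mean decay length in metres for momentum p_gev (GeV/c)."""
    if p_gev < 0 or mass_gev <= 0:
        raise ValueError("momentum must be >= 0 and mass must be > 0")
    beta_gamma = p_gev / mass_gev
    return beta_gamma * c_tau_m


if __name__ == "__main__":
    for p in (0.5, 1.0, 5.0):
        long_len = mean_decay_length_m(p, C_TAU_KL_M)
        short_len = mean_decay_length_m(p, C_TAU_KS_M)
        print(f"p = {p:3.1f} GeV/c: K_L ~ {long_len:7.2f} m, K_S ~ {short_len:6.3f} m")
```

At GeV-scale momenta the long-lived state travels tens of metres on average, which is why it can be transported as a secondary beam and why many $K_L^0$ leave a collider detector before decaying.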
2. Detection in High-Energy Experiments
2.1 Belle II Scintillator-Based KLong Detector
The Belle II experiment employs a highly segmented, scintillator-based $K_L^0$ and muon (KLM) detector in both the endcap and inner barrel regions, designed to cope with the increased luminosity and background at the SuperKEKB collider (Aushev et al., 2014).
Detector Geometry and Optical Chain
- Scintillator Strips: Extruded polystyrene doped with PTP/PPO and POPOP, 40 mm × 10 mm cross-section, up to 2.8 m long. Each strip features a central groove (1.2 mm) housing a wavelength-shifting (WLS) fiber.
- WLS Fiber: Kuraray Y-11(200)MSJ, emission peak 500 nm, matched to silicon photomultiplier (SiPM) PDE.
- Segmentation: Sectors (14 per endcap, 2 endcaps) contain 4 superlayers, each with two orthogonal planes (75 strips/plane), totaling 16,800 strips in the endcap detector.
- Coupling Optimizations: Optical glue index-matched to scintillator and fiber cladding, rounded groove profile (+25% light yield), fiber protrusion into SiPM resin (+37% light yield), aggregate 70% light yield improvement.
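The segmentation and light-yield figures above can be cross-checked with simple arithmetic. The sketch assumes that the two coupling improvements combine multiplicatively, which the text does not state explicitly; the variable names are illustrative.

```python
# Cross-check of the endcap strip count quoted in the list above.
sectors_per_endcap = 14
endcaps = 2
superlayers_per_sector = 4
planes_per_superlayer = 2
strips_per_plane = 75

total_strips = (sectors_per_endcap * endcaps * superlayers_per_sector
                * planes_per_superlayer * strips_per_plane)
print(total_strips)  # 16800

# Cross-check of the aggregate light-yield gain.
# Assumption: the two gains are independent and multiply.
gain_rounded_groove = 1.25    # +25% from the rounded groove profile
gain_fiber_protrusion = 1.37  # +37% from fiber protrusion into the SiPM resin
combined_gain = gain_rounded_groove * gain_fiber_protrusion
print(f"{combined_gain:.4f}")  # 1.7125, i.e. roughly a 70% improvement
```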
Photodetector and Signal Processing
- SiPMs: Hamamatsu MPPC S10362-13-050 and others, gains $0.6$–, PDE $30$–$40$\% at 500 nm, dark rate Hz for mm, optical cross-talk $10$–$20$\%.
- Radiation Hardness: After 40 Sv (10 yr), dark current up to 12 A; performance on minimum-ionizing particle (MIP) detection and light yield remains unaffected. Noise rate at 7.5 p.e. threshold stays below neutron backgrounds.
- Timing and Spatial Resolution: ns from TDC differences enables cm longitudinal localization ( cm/ns). Integration gate: 100 ns; SiPM dark noise suppressed by 7.5 p.e threshold.
Performance
- Muon Efficiency: for GeV/c, with pion mis-ID.
- Cluster Finding: "Tight" (≥2 superlayers): increases from 15% at low to 60% at GeV/c; "loose" (≥1 superlayer): efficiency at the cost of 0.2 fake clusters/event. Angular resolution mrad.
- Background Rejection: Fake rate /event; SiPM noise is negligible at operational threshold.
- Comparison to Belle RPC KLM: Scintillator+SiPM maintains full efficiency MHz/ch, provides 0.7 ns timing (vs 10–20 ns for RPC), and matches or exceeds predecessor in efficiency and granularity, drastically reducing fake hits.
Calibration and Monitoring
- Alignment: Mechanical registration, cosmic-ray and laser alignment checks.
- Light-Yield Calibration: MIP/cosmic-ray scans, truncated-Landau fit for .
- Bias and Temperature Compensation: Auto-adjustment for SiPM breakdown voltage drift (60 mV/K).
- Aging/Radiation: Periodic dark-current monitoring and calibration fibers or Sr sources; light-yield drop over a decade (Aushev et al., 2014).
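The bias compensation item can be illustrated with a minimal sketch. It assumes the breakdown voltage drifts linearly with temperature at the quoted 60 mV/K and that the goal is to hold the overvoltage constant; the reference bias and temperatures are hypothetical values chosen for the example, not detector settings.

```python
# Linear temperature compensation of an SiPM bias voltage.
# The 60 mV/K coefficient is taken from the text; everything else is hypothetical.
BREAKDOWN_DRIFT_V_PER_K = 0.060


def compensated_bias(v_bias_ref: float, t_ref_c: float, t_now_c: float,
                     drift_v_per_k: float = BREAKDOWN_DRIFT_V_PER_K) -> float:
    """Shift the bias by the breakdown-voltage drift so the overvoltage stays fixed."""
    return v_bias_ref + drift_v_per_k * (t_now_c - t_ref_c)


if __name__ == "__main__":
    v_ref = 70.0   # hypothetical operating bias at the reference temperature, volts
    t_ref = 25.0   # hypothetical reference temperature, degrees Celsius
    for t_now in (20.0, 25.0, 30.0):
        print(f"T = {t_now:4.1f} C -> bias = {compensated_bias(v_ref, t_ref, t_now):.2f} V")
```

A 5 K temperature change therefore calls for a 0.3 V bias shift under these assumptions.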
3. KLong Facility (KLF) at Jefferson Lab
The KLong Facility at Jefferson Lab represents a major advance for strange-quark hadron spectroscopy via high-flux, high-precision $K_L^0$ beams in Hall D, instrumented with the GlueX spectrometer (Dobbs, 2022).
Beam Production and Properties
- Production Chain: CEBAF 12 GeV electron beam → copper radiator (10% $X_0$) → bremsstrahlung photons → beryllium target (where the $K_L^0$ are produced) → dipole sweeps charged secondaries.
- Flux: Up to /s at GeV/; kaons on target per channel over – events in 100 days.
- Momentum Spread: 0.5–5 GeV/; energy resolution , MeV.
- Spot and Divergence: mrad, cm, well-matched to GlueX acceptance.
Experimental Apparatus
- GlueX Spectrometer: 2 T solenoid, central straw-tube chamber, forward drift chambers, barrel and forward calorimeters.
- Particle ID: TOF walls (up to 4 GeV/$c$ separation), Barrel DIRC (proposed), high-granularity tracking.
Physics Program and Methodologies
- Strange Baryon Production: Channels such as , , allow simultaneous measurement on proton/neutron targets, differential cross sections extracted with normalization to flux and target density.
- Partial-Wave Analysis: Coupled-channel PWA enables full amplitude extraction and resonance parameter determination with mass uncertainties at the 10–40 MeV level for benchmark states.
- Spectroscopy: Associated production with sensitivity to branching fractions of 1–2%.
- Kaon Spectroscopy and Scattering: S/P-wave decompositions; S-wave phase shifts measured to across –1.6 GeV, surpassing LASS/BNL datasets.
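The normalization of a differential cross section to beam flux and target density, mentioned above, follows the textbook relation $d\sigma/d\Omega = N / (\Phi \, n_t \, \varepsilon \, \Delta\Omega)$. The sketch below is a generic illustration of that relation, not the KLF analysis code; all numerical inputs in the usage section are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23   # particles per mole
CM2_PER_MB = 1e-27         # one millibarn expressed in cm^2


def areal_density_per_cm2(density_g_cm3: float, length_cm: float,
                          molar_mass_g_mol: float) -> float:
    """Target nuclei per cm^2 for a uniform target of the given length."""
    return density_g_cm3 * length_cm * AVOGADRO / molar_mass_g_mol


def dsigma_domega_mb_sr(n_events: float, beam_kaons: float, target_per_cm2: float,
                        acceptance: float, d_omega_sr: float):
    """Return (cross section, statistical error) in mb/sr for one angular bin."""
    if min(beam_kaons, target_per_cm2, acceptance, d_omega_sr) <= 0:
        raise ValueError("flux, target density, acceptance and bin width must be > 0")
    xsec_cm2 = n_events / (beam_kaons * target_per_cm2 * acceptance * d_omega_sr)
    value = xsec_cm2 / CM2_PER_MB
    # Poisson counting error only; flux and acceptance systematics are ignored here.
    stat_err = value / math.sqrt(n_events) if n_events > 0 else float("nan")
    return value, stat_err


if __name__ == "__main__":
    # Hypothetical inputs: a liquid-hydrogen target and one angular bin.
    n_t = areal_density_per_cm2(density_g_cm3=0.0708, length_cm=40.0,
                                molar_mass_g_mol=1.008)
    value, err = dsigma_domega_mb_sr(n_events=1000, beam_kaons=1e10,
                                     target_per_cm2=n_t, acceptance=0.5,
                                     d_omega_sr=0.1)
    print(f"n_t = {n_t:.3e} /cm^2, dsigma/dOmega = {value:.3e} +/- {err:.1e} mb/sr")
```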
Systematics and Legacy
- Backgrounds: Neutron/photon backgrounds suppressed by sweeping dipole and TOF; accidental backgrounds limited by bunching and detector granularity; flux monitored to .
- Comparative Yield: KLF flux is -fold higher than prior SLAC/BNL runs; events expected vs prior ; hyperon polarization events (cf. world data).
- Resolution and Acceptance: Acceptance 40–80%; statistical precision on per 20 MeV/; polarization to 0.01–0.02.
- Significance: KLF will establish a definitive database for strange-hadron and kaon spectroscopy, supporting unambiguous partial-wave and pole-parameter analyses (Dobbs, 2022).
4. KLong LLM Agent for Extremely Long-Horizon Tasks
KLong also designates an open-source LLM agent specifically constructed to handle extremely long-horizon tasks characterized by procedures that exceed the model’s context window and involve hundreds to thousands of decision or tool-use turns (Liu et al., 19 Feb 2026).
Objectives and Model Architecture
- Goal: Enable an LLM (base: GLM-4.5-Air-Base, 40K token window, 106B parameters) to replicate research papers (PaperBench) and generalize to other long-horizon benchmarks (SWE-bench Verified, MLE-bench, Terminal-Bench Hard, SEC-bench).
- Architecture Stages:
  - Stage 1: Cold-start supervised fine-tuning (SFT) on standard agentic data.
  - Stage 2: Trajectory-splitting SFT on extremely long (>40K token) tool-use demonstrations.
  - Stage 3: Progressive reinforcement learning (RL) curriculum, expanding wall-clock timeouts.
Trajectory-Splitting SFT
- Method: Long demonstration trajectories are split into overlapping sub-trajectories of bounded length. The prefix (task/paper context) is repeated at the start of every chunk, and consecutive chunks overlap to preserve continuity.
- Loss: Standard teacher-forcing over all actions in every chunk.
- Effect: Allows the model to internalize long-horizon agentic behavior under context window constraints, e.g., raising average assistant turns from 115 to 733.
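The splitting scheme described above can be sketched as a sliding window with a repeated prefix. This is a minimal illustration under stated assumptions, not the authors' implementation: lengths are counted in turns rather than tokens, and the function name and the `max_len` and `overlap` parameters are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_trajectory(prefix: Sequence[T], turns: Sequence[T],
                     max_len: int, overlap: int) -> List[List[T]]:
    """Split `turns` into overlapping windows, each preceded by the shared prefix.

    Every chunk holds at most `max_len` items (prefix included), and consecutive
    windows share `overlap` turns.
    """
    window = max_len - len(prefix)
    if window <= 0:
        raise ValueError("max_len must exceed the prefix length")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len - len(prefix)")
    stride = window - overlap
    chunks: List[List[T]] = []
    start = 0
    while True:
        chunks.append(list(prefix) + list(turns[start:start + window]))
        if start + window >= len(turns):
            break  # the last window reached the end of the trajectory
        start += stride
    return chunks


if __name__ == "__main__":
    demo = split_trajectory(prefix=["task"], turns=list(range(10)),
                            max_len=5, overlap=1)
    for chunk in demo:
        print(chunk)
```

In a token-based variant the window would be sized by tokenized length, and the training code would decide which positions of each chunk contribute to the loss.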
Research-Factory Data Pipeline
- Automated Data Generation: Two-agent pipeline consisting of a search agent (crawling ICML, NeurIPS, and ICLR; filtering by impact/novelty; converting PDF to Markdown; blacklisting official GitHub repositories to prevent cheating) and an evaluation agent (constructing rubrics from the paper plus code).
- Distillation Source: Claude 4.5 Sonnet "Thinking" model generates K extremely long rollouts.
- Quality Control: Trajectories are accepted only if the score assigned by the rubric judge (GPT-OSS-120B) meets the 80% threshold.
Progressive Reinforcement Learning
- Curriculum: RL proceeds in stages with increasing wall-clock timeouts (h, h, h). Rollouts exceeding the context window are split as above.
- Optimization: Clipped PPO objective, rewards derived from rubric-judge model evaluations.
- Splitting Recapitulation: Even rollouts truncated at each stage's timeout require trajectory splitting for efficient RL.
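For reference, the standard clipped PPO surrogate can be written in a few lines. The sketch below is the generic textbook form, not the KLong training code: the clip range of 0.2 is a common default assumed here, and how advantages are derived from rubric-judge scores is not specified in the text, so they are passed in as plain numbers.

```python
import math
from typing import Sequence


def clipped_ppo_loss(logp_new: Sequence[float], logp_old: Sequence[float],
                     advantages: Sequence[float], clip_eps: float = 0.2) -> float:
    """Negative clipped surrogate objective, averaged over samples.

    For each sample: ratio = exp(logp_new - logp_old), and the objective term is
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).
    """
    if not (len(logp_new) == len(logp_old) == len(advantages)) or not advantages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped_ratio * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    # Hypothetical log-probabilities and advantages for three sampled actions.
    print(clipped_ppo_loss([0.5, -0.1, 0.0], [0.0, 0.0, 0.0], [1.0, -0.5, 0.3]))
```

The clipping removes the incentive to move the policy ratio far from 1 in the direction that would otherwise increase the objective, which keeps each update close to the policy that generated the rollouts.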
Empirical Performance
| Model | Avg. PaperBench (%) |
|---|---|
| Claude 4.5 Sonnet (Thinking) | 69.75 |
| GPT-5 Thinking (High) | 52.31 |
| Kimi K2 (1T) | 51.31 |
| KLong (106B) | 62.59 |
- PaperBench: KLong (106B) leads open-source models by 11.28 pp (vs Kimi K2 1T), with the largest gains on sustained-reasoning tasks (test-time-model-adaptation: 80.09% vs 65.64%; all-in-one: 70.14% vs 28.10%).
- Other Benchmarks: SWE-bench Verified bug-fixing (62.80% vs baseline 60.80%), MLE-bench (higher rates of medals), Terminal-Bench Hard (16.67% vs 14.58%), SEC-bench (7.67% vs 5.00%).
- Ablations: Gains follow the addition of splitting SFT (+17.29 pp vs cold-start SFT) and progressive RL (+6.67 pp vs SFT-only).
- Infrastructure: Kubernetes sandbox, prompt caching, async rollouts, priority queueing for judge concurrency optimize experimentation throughput (Liu et al., 19 Feb 2026).
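The headline margins can be recomputed directly from the table above. The snippet only restates the reported averages; differences are in percentage points.

```python
# Average PaperBench scores (%) copied from the table above.
paperbench_avg = {
    "Claude 4.5 Sonnet (Thinking)": 69.75,
    "GPT-5 Thinking (High)": 52.31,
    "Kimi K2 (1T)": 51.31,
    "KLong (106B)": 62.59,
}

klong = paperbench_avg["KLong (106B)"]
lead_over_kimi = round(klong - paperbench_avg["Kimi K2 (1T)"], 2)
gap_to_claude = round(paperbench_avg["Claude 4.5 Sonnet (Thinking)"] - klong, 2)

print(f"Lead over Kimi K2 (1T): {lead_over_kimi} pp")           # 11.28 pp
print(f"Gap to Claude 4.5 Sonnet (Thinking): {gap_to_claude} pp")  # 7.16 pp
```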
5. Contextual Significance and Cross-Domain Relevance
$K_L^0$/KLong and its associated experimental and computational frameworks highlight the convergence of experimental particle physics, detector technology, and large-scale AI-driven research automation.
- In particle and hadron physics, $K_L^0$-based studies continue to drive advances in CP violation, spectroscopy, and rare decay characterization, with new detectors (Belle II) and facilities (KLF) raising both precision and event rates by 1–3 orders of magnitude over past efforts (Aushev et al., 2014, Dobbs, 2022).
- In large-scale AI/LLMs, KLong exemplifies a path to enabling LLMs as agents for research replication, complex experiment automation, and sustained task completion beyond fixed context window limitations, leveraging both advanced data distillation and RL curricula targeting agentic compositional skills over many hours (Liu et al., 19 Feb 2026).
- A plausible implication is that methods from the KLong LLM agent paradigm (trajectory-splitting, automated curriculum RL, judge-based feedback) may inform automated scientific assistants in other long-horizon, compositional, or tool-rich domains.
KLong thus occupies a distinctive space at the intersection of high-precision experimental science and automated long-horizon reasoning, with each instantiation advancing the state of the art in its respective field.