Turbo-Muon: Beam Compression & ML Optimization

Updated 7 December 2025
  • Turbo-Muon is a family of techniques that compress muon beam phase space by 10¹⁰, producing ultra-cold, high-brightness beams for precision particle physics applications.
  • It also introduces a machine learning preconditioning method that accelerates Newton-Schulz orthogonalization, reducing computational iterations and speeding up convergence.
  • Additionally, Turbo-Muon methods extend to FPGA-based muon trigger systems, achieving sub-microsecond latency and high spatial resolution for real-time track reconstruction.

Turbo-Muon refers to a family of high-efficiency, high-brightness muon manipulation concepts, divided into two principal research streams: (1) an advanced suite of phase-space compression and extraction technologies to produce ultra-cold, ultrabright low-energy muon beams for particle physics applications, building on the muCool methodology; and (2) recent developments in large-scale orthogonality-based optimization for machine learning, specifically the Turbo-Muon optimizer, which accelerates Newton-Schulz orthogonalization via a matrix preconditioning scheme. Both lines draw on the core principle of efficient transformation of input distributions—physical or algorithmic—into tightly controlled, application-optimized states.

1. Turbo-Muon in Physical Muon Beam Compression

Turbo-Muon sources are based on an integrated series of physical processes that compress the six-dimensional phase space of conventional surface muon beams by $\sim10^{10}$ while maintaining a net transmission efficiency of $10^{-3}$, as realized in the muCool program at PSI (Antognini et al., 2018, Belosevic et al., 2019, Bao et al., 2014). The process can be decomposed into several stages:

  • Stopping: Surface muons ($p \approx 11$–$13$ MeV/c, intensity $O(10^7\,\mu^+/\mathrm{s})$) are injected into a helium gas cell inside a $5\,\mathrm{T}$ solenoidal magnet. Approximately 1% of muons stop in the active region, yielding a stopping efficiency of $O(10^{-2})$.
  • Transverse Compression: A cryogenic helium cell ($T=4$–$12$ K, $p\sim1$–$10$ mbar) with a vertical density gradient is subjected to crossed electric ($E_x=E_y\approx 1$ kV/cm) and magnetic ($B=5$ T, along $\hat{z}$) fields. The position-dependent collision frequency $\nu(y)$ produces a drift of the muon swarm that collapses its $y$-extent from $\pm15$ mm to a few mm in $\lesssim 2\,\mu\mathrm{s}$.
  • Longitudinal Compression: In a subsequent room-temperature, low-pressure ($\sim5$ mbar) He cell, an axial electric field ($E_z\approx\pm50$ V/cm) focuses muons along $z$ to within $\lesssim 1$ mm from an initial distribution of $20$ cm, also in $\lesssim2\,\mu\mathrm{s}$.
  • Extraction and Re-acceleration: The fully compressed muon packet is extracted via an orifice (diameter $\sim1$ mm) into vacuum using an $E_y\times B$ drift. Muons are then re-accelerated to keV energies for downstream use (Antognini et al., 2018, Belosevic et al., 2019).

The net result is a beam with $10^3$–$10^4$ bunched $\mu^+$ per pulse, normalized emittance $\sim0.1$ mm mrad, and energy spread $\sim1$ eV at kHz rates, suitable for high-precision spectroscopy, $\mu$SR, or as a muon collider front end.

2. Physical Principles and Design Parameters

The phase-space compression mechanism exploits the drift velocity of muons in crossed electric and magnetic fields, in the presence of high-frequency $\mu$–He collisions:

$$\vec v_D = \frac{\mu |\vec E|}{1+\omega^2/\nu^2} \left[\hat{E} + \frac{\omega}{\nu}\,\hat{E}\times\hat{B} + \frac{\omega^2}{\nu^2}\,(\hat{E}\cdot\hat{B})\,\hat{B} \right],$$

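where $\mu = \frac{e}{m\,\nu}$ is the mobility (here $\mu$ denotes the mobility, not the muon, and $m$ is the muon mass), $\omega = \frac{eB}{m}$ is the cyclotron frequency, and $\nu$ is the $\mu$–He collision frequency. In the regime $\omega \gg \nu$ the drift is predominantly along $\hat{B}$ when $\hat{E}\cdot\hat{B}\neq 0$ (the $(\hat{E}\cdot\hat{B})\,\hat{B}$ term, used for longitudinal compression), and along $\hat{E}\times\hat{B}$ when the fields are crossed (transverse steering).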
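As an illustration, the drift formula above can be evaluated numerically. The following is a minimal sketch in SI units, not code from the muCool collaboration; the function name `drift_velocity`, the helper functions, and the sample collision frequencies are assumptions chosen for the example.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
MUON_MASS = 1.883531627e-28  # muon mass, kg


def _norm(v):
    return math.sqrt(sum(c * c for c in v))


def _unit(v):
    n = _norm(v)
    return tuple(c / n for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def drift_velocity(E, B, nu):
    """Drift velocity (m/s) for E in V/m, B in T, collision frequency nu in 1/s.

    Implements v_D = mu|E| / (1 + r^2) * [E^ + r E^ x B^ + r^2 (E^ . B^) B^],
    with r = omega / nu. E and B must be non-zero 3-vectors.
    """
    e_hat, b_hat = _unit(E), _unit(B)
    omega = E_CHARGE * _norm(B) / MUON_MASS   # cyclotron frequency
    mobility = E_CHARGE / (MUON_MASS * nu)    # mu = e / (m nu)
    r = omega / nu
    prefactor = mobility * _norm(E) / (1.0 + r * r)
    exb = _cross(e_hat, b_hat)
    edb = _dot(e_hat, b_hat)
    return tuple(prefactor * (e_hat[i] + r * exb[i] + r * r * edb * b_hat[i])
                 for i in range(3))


# Crossed fields: E = 1 kV/cm along x, B = 5 T along z.
# The collision frequencies below are illustrative values only.
low_density = drift_velocity((1e5, 0.0, 0.0), (0.0, 0.0, 5.0), nu=1e7)
high_density = drift_velocity((1e5, 0.0, 0.0), (0.0, 0.0, 5.0), nu=1e12)
print("omega >> nu:", low_density)
print("omega << nu:", high_density)
```

For crossed fields with $\omega \gg \nu$ the result approaches the familiar $|\vec E|/|\vec B|$ drift along $\hat{E}\times\hat{B}$, while for $\omega \ll \nu$ the muons drift mainly along $\hat{E}$. This dependence of the drift direction on $\nu$ is what lets a density gradient steer the swarm.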

  • Transverse Stage: Gas-density gradients ($\nu(y)$) and field control ensure compression trajectories converging into a sub-mm cross-section. The duration and magnitude of field gradients are tuned so that the majority of muons survive until extraction, given $\tau_{\mu^+}=2.2\,\mu\mathrm{s}$.
  • Longitudinal Stage: The V-shaped potential $V(z)$, implemented by stepped electrodes, generates a focusing drift along $z$; simulation and experiment verify compression from $L_i\approx200$ mm to $L_f \lesssim 6$ mm within $t_{\mathrm{comp}}\sim2$–$4\,\mu\mathrm{s}$ (Bao et al., 2014).
  • Extraction: Electrostatic acceleration and high-vacuum orifices minimize charge-exchange and enable beam transport into precision experiments.

The table below summarizes the core parameters:

| Stage | Gas / Temperature / Pressure | Fields | Compression Effect |
|---|---|---|---|
| Transverse | He, 4–12 K, 1–10 mbar (gradient) | $E_x=E_y\approx1$ kV/cm, $B=5$ T | $\Delta y$: $\sim20$ mm $\to$ 1 mm |
| Longitudinal | He, 300 K, 5 mbar | $E_z\approx\pm50$ V/cm, $B=5$ T | $\Delta z$: 200 mm $\to$ $<6$ mm |
| Extraction | He→vac., orifice 1–2 mm | $E_y\times B$ drift, HV columns | Spot $\sigma_{x,y}<1$ mm |

3. Experimental Demonstration and Performance

Transverse and longitudinal compression, as well as extraction, have been demonstrated using surface-muon beams at Paul Scherrer Institute:

  • Transverse compression: $\Delta y$ reduction by $>10\times$ within a drift length of 35 mm and a time of $\lesssim2\,\mu\mathrm{s}$ (Belosevic et al., 2019).
  • Longitudinal compression: A swarm originally $L_i\approx200$ mm long is focused to $L_f\lesssim6$ mm with $>90\%$ survival (neglecting post-extraction losses) (Bao et al., 2014). Timing and spatial profiles match full GEANT4 simulations for the relevant field and gas parameters.
  • Extraction demonstration: Muon packets are observed to drift perpendicularly to $B$ and pass through a 1 mm orifice with eV energy spreads; they are then re-accelerated in a way that preserves phase-space quality. Projected beams achieve normalized transverse emittance of $0.1$–$1$ mm mrad and energy spread $<1$ eV (Antognini et al., 2018).

The total phase-space reduction is $10^{10}$; overall transmission is $10^{-3}$, primarily limited by stopping efficiency and in-gas survival.
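A rough order-of-magnitude check of the quoted transmission can be made from numbers already given in this article: the $O(10^{-2})$ stopping efficiency, the roughly $2\,\mu\mathrm{s}$ duration of each compression stage, and the $2.2\,\mu\mathrm{s}$ muon lifetime. The sketch below is not the collaboration's efficiency accounting. It assumes muon decay is the only in-gas loss and ignores extraction losses, and the variable names are illustrative.

```python
import math

MUON_LIFETIME_US = 2.2  # lifetime value quoted in the text, microseconds


def decay_survival(t_us, tau_us=MUON_LIFETIME_US):
    """Fraction of muons that have not decayed after t_us microseconds."""
    return math.exp(-t_us / tau_us)


stopping_efficiency = 1e-2    # ~1% of muons stop in the active region
transverse_time_us = 2.0      # assumed: upper end of the quoted <~2 us
longitudinal_time_us = 2.0    # assumed: upper end of the quoted <~2 us

survival = decay_survival(transverse_time_us + longitudinal_time_us)
transmission = stopping_efficiency * survival

print(f"decay survival over both stages: {survival:.3f}")
print(f"estimated net transmission:      {transmission:.2e}")
```

Under these assumptions the product comes out at the $10^{-3}$ level, which is consistent with the stated overall transmission being dominated by stopping efficiency and in-gas survival.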

4. Engineering and Operational Challenges

Turbo-Muon sources at scale require solutions to several technical challenges (Antognini et al., 2018):

  • High-voltage, fast-rise pulsing in high magnetic fields and gaseous environments.
  • Continuous, ppm-level helium purity to suppress Mu formation and contamination.
  • High-throughput differential pumping to protect ultra-high vacuum acceleration stages.
  • Sub-100 μm field alignment precision to minimize aberrations.
  • Thermal management and voltage standoff at cryogenic temperatures.
  • Scaling from $10^4\,\mu^+$/s (single-cell tests) to $10^7$–$10^8\,\mu^+$/s for routine operation, avoiding sparking and space-charge effects.

These aspects are critical for integration into large-scale facilities, future high-luminosity muon colliders, and next-generation precision experiments needing ultra-cold beams.

5. Turbo-Muon Algorithms in Orthogonality-Based Optimization

Turbo-Muon also denotes a matrix preconditioning technique that accelerates the convergence of orthogonality-based optimizers (e.g., Muon) in large-scale machine learning (Boissin et al., 4 Dec 2025). Orthogonality constraints on gradients or weight updates (requiring $Q^\top Q=I$) are enforced by projecting onto the polar factor via Newton-Schulz (NS) iterations. The standard NS scheme is computationally expensive for large $n$ ($\mathcal{O}(n^3)$ per iteration) and typically requires $5$–$9$ iterations to reach an error of $\varepsilon\sim10^{-3}$.

Turbo-Muon introduces an “almost-orthogonal layer” (AOL) diagonal preconditioner:

$$P = \mathrm{diag}\Bigl(\sum_j \bigl|X_0^\top X_0\bigr|_{ij}\Bigr)^{-1/2},\qquad X_1 = X_0 P,$$

where $X_0$ is the input matrix. Preconditioning reduces the Gram matrix condition number, leading to a smaller initial error for the NS iterations and effectively allowing one full NS iteration to be dropped at constant accuracy. Empirical results demonstrate a $2.2$–$2.8\times$ speedup in the NS subroutine for $n=8192$, with a $5$–$10\%$ net improvement in end-to-end model training runtime, without any need for hyperparameter adjustment. The core descent property is preserved, and integration into PyTorch workflows requires only minimal modification to the optimizer's step logic. The method is available at https://github.com/thib-s/flash-newton-schulz.
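The effect of the preconditioner can be illustrated on a small matrix. The sketch below is dependency-free Python and is not the API of the linked repository. It substitutes the classic cubic Newton-Schulz update $X \leftarrow \tfrac{3}{2}X - \tfrac{1}{2}XX^\top X$ for the tuned polynomial used in Muon, compares against plain Frobenius normalization as an assumed baseline, and uses hypothetical function names and a small `eps` guard that are not part of the published formula. With preconditioning, the iteration converges toward the orthogonal polar factor of $X_1 = X_0 P$.

```python
import math


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def frobenius_normalize(X):
    """Baseline scaling: divide by the Frobenius norm."""
    norm = math.sqrt(sum(x * x for row in X for x in row))
    return [[x / norm for x in row] for row in X]


def aol_precondition(X0, eps=1e-12):
    """Column rescaling X1 = X0 P with P = diag(sum_j |X0^T X0|_ij)^(-1/2).

    eps is an assumed guard against all-zero columns.
    """
    gram = matmul(transpose(X0), X0)
    scale = [(sum(abs(g) for g in row) + eps) ** -0.5 for row in gram]
    return [[x * s for x, s in zip(row, scale)] for row in X0]


def newton_schulz(X, steps):
    """Classic cubic Newton-Schulz: X <- 1.5 X - 0.5 X (X^T X)."""
    for _ in range(steps):
        XXtX = matmul(X, matmul(transpose(X), X))
        X = [[1.5 * x - 0.5 * y for x, y in zip(rx, ry)]
             for rx, ry in zip(X, XXtX)]
    return X


def orthogonality_error(X):
    """Largest absolute entry of X^T X - I."""
    G = matmul(transpose(X), X)
    n = len(G)
    return max(abs(G[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


# Illustrative 3x3 input with columns of very different scale.
X0 = [[3.0, 0.2, 0.0],
      [0.1, 1.0, 0.1],
      [0.0, 0.2, 0.5]]

for steps in (1, 3, 5):
    err_frob = orthogonality_error(newton_schulz(frobenius_normalize(X0), steps))
    err_aol = orthogonality_error(newton_schulz(aol_precondition(X0), steps))
    print(f"steps={steps}: frobenius={err_frob:.3e}  aol={err_aol:.3e}")
```

On this example the preconditioned start reaches a given orthogonality error in fewer iterations than the Frobenius-normalized start, which is the mechanism by which an NS iteration can be dropped. A practical implementation would use batched GPU matrix products rather than Python lists.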

6. Turbo-Muon in Real-Time Muon Trigger and Track Reconstruction

A further usage of Turbo-Muon is in FPGA-based muon trigger demonstrators for high-energy physics detectors (Migliorini et al., 2021). In this context, Turbo-Muon refers to a low-latency, pipelined processing chain that integrates compact neural networks and analytical methods for online drift-tube muon track reconstruction:

  1. Hit grouping and de-multiplexing: Buffering and unique assembly of macro-cell hits from raw digitized data streams.
  2. Filtering Neural Network (F-NN): 16 input coarse timestamps processed to select the best spatially separated hits; implemented as a small (20-unit) quantized NN.
  3. Disambiguation Neural Network (D-NN): Assigns left/right unambiguous association to each selected hit; also a compact quantized NN.
  4. Time pedestal/mean-timer stage: Resolves the global $t_0$ and track inclination $\varphi$ analytically using precomputed combinatorial formulae.
  5. Track parameter extraction: Fixed-point least-squares fitting for local track slope and intercept.

The entire pipeline offers $<1\,\mu\mathrm{s}$ end-to-end latency at a $40$ MHz clock, per-macrocell spatial resolution of $\lesssim250\,\mu\mathrm{m}$, efficiency $>99\%$, and a minimal FPGA resource footprint. In both simulation and cosmic-ray data, the ghost rate is $<1\%$ and the timing resolution is $3$–$4.1$ ns, vastly outperforming pure analytical implementations at high rates. Extensions to multi-chamber tiling and vertical integration are under way for high-luminosity LHC upgrades.
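The final pipeline step, a fixed-point least-squares fit for slope and intercept, can be sketched in software. This is not the demonstrator's firmware. The 10-bit fractional width, the function names, and the layer positions in the usage example are assumptions made for illustration; the sketch only shows how a straight-line fit can be carried out with integer arithmetic of the kind that maps onto FPGA logic.

```python
FRAC_BITS = 10  # assumed fixed-point fractional width (hypothetical)


def to_fixed(x):
    """Convert a float to an integer with FRAC_BITS fractional bits."""
    return int(round(x * (1 << FRAC_BITS)))


def to_float(q):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


def fit_track(z_layers, x_hits):
    """Least-squares fit of x = slope * z + intercept using integer sums.

    z_layers: layer positions, x_hits: reconstructed hit positions
    (same length unit). Requires at least two distinct z values.
    """
    n = len(z_layers)
    zq = [to_fixed(z) for z in z_layers]
    xq = [to_fixed(x) for x in x_hits]

    sz = sum(zq)
    sx = sum(xq)
    szz = sum(z * z for z in zq)
    szx = sum(z * x for z, x in zip(zq, xq))

    den = n * szz - sz * sz
    # The slope is a ratio of two equally scaled sums, so the numerator is
    # shifted left to keep FRAC_BITS of fractional precision in the result.
    slope_q = ((n * szx - sz * sx) << FRAC_BITS) // den
    # The intercept numerator carries one extra scale factor relative to den,
    # so the integer quotient is already in fixed-point units.
    intercept_q = (sx * szz - sz * szx) // den
    return to_float(slope_q), to_float(intercept_q)


# Four hypothetical layers 13 mm apart, hits lying on x = 0.5 * z + 2.
slope, intercept = fit_track([0.0, 13.0, 26.0, 39.0], [2.0, 8.5, 15.0, 21.5])
print(slope, intercept)
```

Because the layer geometry is fixed, a hardware implementation could precompute the denominator and replace the division with a multiplication by a constant; that optimization is left out here for clarity.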

7. Impact and Prospects

Turbo-Muon methodologies have advanced the state of the art in beam physics, algorithmic optimization, and real-time signal processing:

  • In particle physics, Turbo-Muon sources will enable new regimes in $\mu^+$ beam brightness, collimation, and timing, directly impacting muon $g-2$, EDM, muonium spectroscopy, and planned muon collider injectors. The $10^{10}$ phase-space compression at $O(10^{-3})$ efficiency unlocks orders-of-magnitude gains in usable cold muon flux, providing $\sim10^5$–$10^6\,\mu^+$/s at sub-eV energies (Antognini et al., 2018, Belosevic et al., 2019). The methodology sets a template for gas-based cooling and high-field extraction in other particle species.
  • In optimization, Turbo-Muon’s preconditioning accelerates orthogonality-based training of vision models and LLMs, serving as a computational “drop-in” for fast NS-like matrix operations and stimulating further research in generalized preconditioning and matrix manifold optimization (Boissin et al., 4 Dec 2025).
  • In real-time signal processing, Turbo-Muon FPGA demonstrators validate hybrid neural/analytic pipelines for sub-microsecond, high-fidelity event characterization in high-background environments, with planned deployment in major detector upgrades (Migliorini et al., 2021).

Collectively, Turbo-Muon approaches exemplify the fusion of advanced mathematical, experimental, and computing techniques in pursuit of precision and efficiency at both the physical and algorithmic frontiers.
