Turbo-Muon: Beam Compression & ML Optimization

Updated 7 December 2025
  • Turbo-Muon is a family of techniques that compress muon beam phase space by 10¹⁰, producing ultra-cold, high-brightness beams for precision particle physics applications.
  • It also introduces a machine learning preconditioning method that accelerates Newton-Schulz orthogonalization, reducing computational iterations and speeding up convergence.
  • Additionally, Turbo-Muon methods extend to FPGA-based muon trigger systems, achieving sub-microsecond latency and high spatial resolution for real-time track reconstruction.

Turbo-Muon refers to a family of high-efficiency, high-brightness muon manipulation concepts, divided into two principal research streams: (1) an advanced suite of phase-space compression and extraction technologies to produce ultra-cold, ultrabright low-energy muon beams for particle physics applications, building on the muCool methodology; and (2) recent developments in large-scale orthogonality-based optimization for machine learning, specifically the Turbo-Muon optimizer, which accelerates Newton-Schulz orthogonalization via a matrix preconditioning scheme. Both lines draw on the core principle of efficient transformation of input distributions—physical or algorithmic—into tightly controlled, application-optimized states.

1. Turbo-Muon in Physical Muon Beam Compression

Turbo-Muon sources are based on an integrated series of physical processes that compress the six-dimensional phase space of conventional surface muon beams by $\sim10^{10}$ while maintaining a net transmission efficiency of $10^{-3}$, as realized in the muCool program at PSI (Antognini et al., 2018, Belosevic et al., 2019, Bao et al., 2014). The process can be decomposed into several stages:

  • Stopping: Surface muons ($p \approx 11$–13 MeV/c, intensity $O(10^7\,\mu^+/\mathrm{s})$) are injected into a helium gas cell inside a $5\,\mathrm{T}$ solenoidal magnet. Approximately 1% of muons stop in the active region, yielding a stopping efficiency of $O(10^{-2})$.
  • Transverse Compression: A cryogenic helium cell ($T=4$–$12\,$K, $p\sim1$–$10$ mbar) with a vertical density gradient is subjected to crossed electric ($E_x=E_y\approx 1$ kV/cm) and magnetic ($B=5$ T) fields. The position-dependent collision frequency produces a drift of the muon swarm that collapses its transverse extent from […] mm to a few mm in […] μs.
  • Longitudinal Compression: In a subsequent room-temperature, low-pressure (5 mbar) He cell, an axial electric field ([…] V/cm) focuses muons along the axial direction to within […] mm from an initial distribution of […] cm, also in […] μs.
  • Extraction and Re-acceleration: The fully compressed muon packet is extracted through an orifice (diameter 1–2 mm) into vacuum using an $\vec{E}\times\vec{B}$ drift. Muons are then re-accelerated to keV energies for downstream use (Antognini et al., 2018, Belosevic et al., 2019).

The net result is a beam with […]–[…] bunched $\mu^+$ per pulse, normalized emittance of order 0.1 mm mrad, and energy spread of order 1 eV at kHz rates, suitable for high-precision spectroscopy, $\mu$SR, or as a muon collider front end.

2. Physical Principles and Design Parameters

The phase-space compression mechanism exploits the drift velocity of muons in crossed electric and magnetic fields, in the presence of high-frequency $\mu^+$–He collisions. The drift velocity is set by the muon mobility, the cyclotron frequency, and the collision frequency, and the operating regime is chosen so that the drift runs predominantly along the direction required for longitudinal compression or for transverse steering.
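For orientation, the drift velocity of a charged particle in a gas with electric and magnetic fields is commonly written in the form below. This is a sketch under the assumption that the standard textbook expression applies here; the symbols are assumptions of this sketch: $\mu$ is the mobility, $\omega = eB/m_\mu$ the cyclotron frequency, $\nu$ the average $\mu^+$–He collision frequency, and hats denote unit vectors.

```latex
\vec{v}_D \;=\; \frac{\mu\,|\vec{E}|}{1+\omega^2/\nu^2}
\left[\,\hat{E}
\;+\; \frac{\omega}{\nu}\,\hat{E}\times\hat{B}
\;+\; \frac{\omega^2}{\nu^2}\,\bigl(\hat{E}\cdot\hat{B}\bigr)\,\hat{B}\,\right]
```

In this form, $\nu \gg \omega$ (dense gas) gives a drift mainly along $\hat{E}$, $\nu \ll \omega$ (dilute gas) gives a drift mainly along $\hat{B}$, and intermediate densities give a sizeable $\hat{E}\times\hat{B}$ component, which is why a density gradient can steer the swarm.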

  • Transverse Stage: Gas-density gradients and field control ensure compression trajectories converging into a sub-mm cross-section. The duration and magnitude of field gradients are tuned so that the majority of muons survive until extraction, given the muon lifetime ($\tau_\mu \approx 2.2\,\mu$s).
  • Longitudinal Stage: The V-shaped potential, implemented by stepped electrodes, generates a focusing drift along the axis; simulation and experiment verify compression from […] mm to […] mm within […] μs (Bao et al., 2014).
  • Extraction: Electrostatic acceleration and high-vacuum orifices minimize charge-exchange and enable beam transport into precision experiments.
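The survival requirement mentioned in the stages above can be made concrete with the exponential decay law. The sketch below assumes only the muon mean lifetime at rest (about 2.197 μs); the elapsed times in the demo loop are illustrative values, not stage durations taken from the cited experiments.

```python
import math

# Mean muon lifetime at rest, in microseconds (well-established value).
MUON_LIFETIME_US = 2.197


def survival_fraction(elapsed_us: float) -> float:
    """Fraction of muons that have not yet decayed after elapsed_us microseconds."""
    if elapsed_us < 0:
        raise ValueError("elapsed time must be non-negative")
    return math.exp(-elapsed_us / MUON_LIFETIME_US)


if __name__ == "__main__":
    # Illustrative total in-gas times (hypothetical, for orientation only).
    for t_us in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"{t_us:5.1f} us in gas -> {survival_fraction(t_us):.3f} surviving")
```

Because the loss is exponential in time, every microsecond saved across the compression and extraction stages raises the surviving fraction multiplicatively.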

The table below summarizes the core parameters:

Stage | Gas/Temp./Pressure | Fields | Compression Effect (approx.)
Transverse | He, 4–12 K, 1–10 mbar (gradient) | $E \approx 1$ kV/cm, $B = 5$ T | transverse extent: 20 mm → 1 mm
Longitudinal | He, 300 K, 5 mbar | $E$ = […] V/cm, $B = 5$ T | axial extent: 200 mm → 6 mm
Extraction | He→vac., orifice 1–2 mm | $\vec{E}\times\vec{B}$ drift, HV columns | Spot […] mm

3. Experimental Demonstration and Performance

Transverse and longitudinal compression, as well as extraction, have been demonstrated using surface-muon beams at Paul Scherrer Institute:

  • Transverse compression: The transverse extent is reduced by a factor of […] within a drift length of 35 mm and a time of […] μs (Belosevic et al., 2019).
  • Longitudinal compression: A swarm originally […] mm long is focused to […] mm with […] survival (neglecting post-extraction losses) (Bao et al., 2014). Timing and spatial profiles match full GEANT4 simulations for the relevant field and gas parameters.
  • Extraction demonstration: Muon packets are observed to drift perpendicularly to the magnetic field and pass through a 1 mm orifice with eV energy spreads; they are then re-accelerated in a way that preserves phase-space quality. Projected beams achieve normalized transverse emittance of order 0.1 mm mrad and energy spread of order 1 eV (Antognini et al., 2018).

The total phase-space reduction is $\sim10^{10}$; overall transmission is $10^{-3}$, primarily limited by stopping efficiency and in-gas survival.
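As a rough worked estimate, the two headline figures from Section 1 (phase-space compression of about $10^{10}$ and transmission of about $10^{-3}$) can be combined into a brightness gain. The sketch assumes brightness is defined as the number of particles per unit six-dimensional phase-space volume, $\mathcal{B} = N/V_{6D}$; that definition is an assumption of this example.

```latex
\frac{\mathcal{B}_{\text{out}}}{\mathcal{B}_{\text{in}}}
  = \frac{N_{\text{out}}}{N_{\text{in}}}\cdot\frac{V_{6D,\text{in}}}{V_{6D,\text{out}}}
  \approx 10^{-3} \times 10^{10}
  = 10^{7}
```

Under that definition, losing all but one muon in a thousand still leaves a net gain of about seven orders of magnitude in brightness.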

4. Engineering and Operational Challenges

Turbo-Muon sources at scale require solutions to several technical challenges (Antognini et al., 2018):

  • High-voltage, fast-rise pulsing in high magnetic fields and gaseous environments.
  • Continuous, ppm-level helium purity to suppress Mu formation and contamination.
  • High-throughput differential pumping to protect ultra-high vacuum acceleration stages.
  • Sub-100 μm field alignment precision to minimize aberrations.
  • Thermal management and voltage standoff at cryogenic temperatures.
  • Scaling from […] $\mu^+$/s (single-cell tests) to […]–[…] $\mu^+$/s for routine operation, while avoiding sparking and space-charge effects.

These aspects are critical for integration into large-scale facilities, future high-luminosity muon colliders, and next-generation precision experiments needing ultra-cold beams.

5. Turbo-Muon Algorithms in Orthogonality-Based Optimization

Turbo-Muon also denotes a matrix preconditioning technique that accelerates the convergence of orthogonality-based optimizers (e.g., Muon) in large-scale machine learning (Boissin et al., 4 Dec 2025). Orthogonality constraints on gradients or weight updates are enforced by projecting onto the polar factor via Newton-Schulz (NS) iterations. The standard NS scheme is computationally expensive for large matrices ([…] per iteration) and typically requires […]–[…] iterations to reach an error of […].

Turbo-Muon introduces an “almost-orthogonal layer” (AOL) diagonal preconditioner, computed from the input matrix. Preconditioning reduces the Gram matrix condition number, leading to a smaller initial error for the NS iterations and effectively enabling one full NS iteration to be dropped at constant accuracy. Empirical results demonstrate a […]–[…] speedup in the NS subroutine for […], with a […]–[…] net improvement in end-to-end model training runtime, without any need for hyperparameter adjustment. The core descent property is preserved, and integration into PyTorch workflows requires only minimal modification to the optimizer's step logic. The method is available at https://github.com/thib-s/flash-newton-schulz.
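The effect of diagonal preconditioning on the NS starting point can be illustrated with a small NumPy sketch. It is not the authors' implementation: it uses the classic cubic Newton-Schulz update rather than the tuned polynomial used by Muon, and it assumes the AOL rescaling takes the form $d_j = \bigl(\sum_i |X^\top X|_{ji}\bigr)^{-1/2}$ applied to the columns of $X$. All function names are hypothetical. Note that rescaling columns changes the matrix, so the preconditioned iteration converges to the polar factor of the rescaled matrix rather than of the original one.

```python
import numpy as np


def newton_schulz(x: np.ndarray, num_iters: int) -> np.ndarray:
    """Classic cubic Newton-Schulz iteration toward an orthonormal-column matrix.

    Converges when all singular values of x lie in (0, sqrt(3)).
    """
    y = x.copy()
    for _ in range(num_iters):
        gram = y.T @ y                     # n x n Gram matrix
        y = 1.5 * y - 0.5 * (y @ gram)     # y <- y (3I - y^T y) / 2
    return y


def frobenius_normalize(x: np.ndarray) -> np.ndarray:
    """Baseline scaling: divide by the Frobenius norm (spectral norm <= 1)."""
    return x / np.linalg.norm(x)


def aol_precondition(x: np.ndarray) -> np.ndarray:
    """Assumed AOL-style rescaling: column j is divided by sqrt(sum_i |G_ji|)."""
    gram = x.T @ x
    abs_row_sums = np.maximum(np.abs(gram).sum(axis=1), 1e-12)  # guard zero columns
    return x * (1.0 / np.sqrt(abs_row_sums))                    # broadcasts over columns


def orthogonality_error(y: np.ndarray) -> float:
    """Frobenius distance between y^T y and the identity."""
    return float(np.linalg.norm(y.T @ y - np.eye(y.shape[1])))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((64, 32))      # illustrative random "gradient" matrix
    for k in (3, 4, 5):
        err_plain = orthogonality_error(newton_schulz(frobenius_normalize(x), k))
        err_aol = orthogonality_error(newton_schulz(aol_precondition(x), k))
        print(f"iters={k}  frobenius-start error={err_plain:.3e}  aol-start error={err_aol:.3e}")
```

Both scalings keep the spectral norm at or below 1, but the AOL-style scaling is much less conservative than the Frobenius norm, so the singular values start closer to 1 and the iteration needs fewer steps to reach a given orthogonality error.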

6. Turbo-Muon in Real-Time Muon Trigger and Track Reconstruction

A further usage of Turbo-Muon is in FPGA-based muon trigger demonstrators for high-energy physics detectors (Migliorini et al., 2021). In this context, Turbo-Muon refers to a low-latency pipelined architecture that integrates compact neural networks and analytical methods for online drift-tube muon track reconstruction:

  1. Hit grouping and de-multiplexing: Buffering and unique assembly of macro-cell hits from raw digitized data streams.
  2. Filtering Neural Network (F-NN): 16 input coarse timestamps processed to select the best spatially separated hits; implemented as a small (20-unit) quantized NN.
  3. Disambiguation Neural Network (D-NN): Assigns left/right unambiguous association to each selected hit; also a compact quantized NN.
  4. Time pedestal/mean-timer stage: Resolves the global time pedestal and the track inclination analytically using precomputed combinatorial formulae.
  5. Track parameter extraction: Fixed-point least-squares fitting for local track slope and intercept.
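The final step of the pipeline above, a fixed-point least-squares fit, can be sketched with plain integer arithmetic. This is an illustration only, not the demonstrator's firmware: the 10-bit fractional precision (`FRAC_BITS`), the function names, and the 13 mm layer spacing in the usage example are all hypothetical choices made for this sketch.

```python
FRAC_BITS = 10                 # hypothetical number of fractional bits
SCALE = 1 << FRAC_BITS


def to_fixed(value: float) -> int:
    """Quantize a real number to a fixed-point integer with FRAC_BITS fractional bits."""
    return int(round(value * SCALE))


def to_float(fixed: int) -> float:
    """Convert a fixed-point integer back to a float for inspection."""
    return fixed / SCALE


def fit_line_fixed(z_fx: list[int], x_fx: list[int]) -> tuple[int, int]:
    """Least-squares fit x = slope * z + intercept using integer arithmetic only.

    Inputs and outputs are fixed-point integers scaled by SCALE.
    """
    n = len(z_fx)
    if n != len(x_fx) or n < 2:
        raise ValueError("need at least two (z, x) pairs of equal length")
    sum_z = sum(z_fx)
    sum_x = sum(x_fx)
    sum_zz = sum(z * z for z in z_fx)                   # scaled by SCALE**2
    sum_zx = sum(z * x for z, x in zip(z_fx, x_fx))     # scaled by SCALE**2
    denom = n * sum_zz - sum_z * sum_z
    if denom == 0:
        raise ValueError("all z values are identical; slope is undefined")
    numer = n * sum_zx - sum_z * sum_x
    slope_fx = (numer * SCALE) // denom                 # slope scaled by SCALE
    intercept_fx = (sum_x * SCALE - slope_fx * sum_z) // (n * SCALE)
    return slope_fx, intercept_fx


if __name__ == "__main__":
    # Four hits on layers spaced 13 mm apart (hypothetical geometry).
    layers_mm = [0.0, 13.0, 26.0, 39.0]
    hits_mm = [0.25 * z + 5.0 for z in layers_mm]
    slope_fx, intercept_fx = fit_line_fixed(
        [to_fixed(z) for z in layers_mm], [to_fixed(x) for x in hits_mm]
    )
    print("slope:", to_float(slope_fx), "intercept (mm):", to_float(intercept_fx))
```

Because the layer positions are fixed by the detector geometry, the sums over `z` and the denominator are constants; in hardware they could be precomputed so that the fit reduces to a few multiply-accumulate operations, which is one plausible reason a closed-form fit suits a low-latency pipeline.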

The entire pipeline offers sub-microsecond end-to-end latency at a […] MHz clock, per-macrocell spatial resolution of […] μm, efficiency of […], and a minimal FPGA resource footprint. In both simulation and cosmic-ray data, the ghost rate is […] and the timing resolution is […]–[…] ns, outperforming pure analytical implementations at high rates. Extensions to multi-chamber tiling and vertical integration are under way for high-luminosity LHC upgrades.

7. Impact and Prospects

Turbo-Muon methodologies, in both beam physics and algorithmic optimization, have substantially advanced the state of the art in their respective domains:

  • In particle physics, Turbo-Muon sources will enable new regimes in $\mu^+$ beam brightness, collimation, and timing, directly impacting muon $g-2$, EDM, muonium spectroscopy, and planned muon collider injectors. The $10^{10}$ phase-space compression at $10^{-3}$ efficiency unlocks orders-of-magnitude gains in usable cold muon flux, providing […]–[…] $\mu^+$/s at sub-eV energies (Antognini et al., 2018, Belosevic et al., 2019). The methodology sets a template for gas-based cooling and high-field extraction in other particle species.
  • In optimization, Turbo-Muon’s preconditioning accelerates orthogonality-based training of vision models and LLMs by a consistent factor, establishing a computational “drop-in” for fast NS-like matrix operations and stimulating further research in generalized preconditioning and matrix manifold optimization (Boissin et al., 4 Dec 2025).
  • In real-time signal processing, Turbo-Muon FPGA demonstrators validate hybrid neural/analytic pipelines for sub-microsecond, high-fidelity event characterization in high-background environments, with planned deployment in major detector upgrades (Migliorini et al., 2021).

Collectively, Turbo-Muon approaches exemplify the fusion of advanced mathematical, experimental, and computing techniques in pursuit of precision and efficiency at both the physical and algorithmic frontiers.
