Permutation Scanning Mamba

Updated 30 November 2025
  • Permutation Scanning Mamba is a technique that serializes multidimensional data into a 1D sequence using permutation operators, enabling efficient long-range dependency capture.
  • It applies selective structured state space models (SSMs) with dynamic permutation orders—such as Hilbert and DA-HMG—to enhance performance in super-resolution, MRI reconstruction, and other domains.
  • The approach reduces computational complexity from quadratic to linear scaling, achieving notable speedups and resource savings while preserving spatial locality and global context.

Permutation scanning Mamba refers to a broad family of techniques for serializing multidimensional (primarily image or tensor) data into 1D sequences via permutation operators, subsequently applying selective structured state space models (SSMs) such as Mamba in the permuted order. This paradigm enables efficient, flexible, and expressive context modeling with linear computational complexity, while capturing richer long-range dependencies than fixed raster or multi-direction scans. Permutation scanning underpins a wide range of recent visual Mamba architectures, providing critical efficiency and representational improvements in tasks including super-resolution, low-light enhancement, segmentation, MRI reconstruction, and high-dimensional data modeling.

1. Formal Definitions and Core Mechanisms

Let $X \in \mathbb{R}^{C \times H \times W}$ represent an image tensor (or $X \in \mathbb{R}^{C \times H \times W \times \dots}$ for higher dimensions). Permutation scanning is the process by which $X$ is serialized into a 1D sequence $x \in \mathbb{R}^{L \times D}$ (where $L = H \cdot W$ and $D = C$ or a feature dimension) via a permutation operator $\pi : \{1, \dots, L\} \to \{1, \dots, L\}$. The SSM block then applies its recurrence relations in the order prescribed by $\pi$:

$$h_t = A h_{t-1} + B_{\pi(t)} x_{\pi(t)}, \qquad y_t = C_{\pi(t)} h_t,$$

where $A$ is the (possibly input-adaptive) recurrent matrix, $B_{\pi(t)}$ and $C_{\pi(t)}$ are input-conditioned dynamic projections, and $h_t$ is the hidden state.

After the SSM scan is completed, the inverse permutation $\pi^{-1}$ reorders the outputs back to the native spatial configuration. Permutations may be deterministic (e.g., raster, Hilbert, Peano, serpentine, radial, Morton) or learned, and may vary across network depth or by functional module (Xu et al., 29 Apr 2024).
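To make the permute/scan/unpermute pipeline concrete, the following is a minimal NumPy sketch of the recurrence above. The fixed $A$, the dense per-position projections, and the function name are illustrative simplifications of ours, not drawn from any cited implementation; real Mamba blocks make $A$ input-adaptive and use hardware-aware parallel scans.

```python
import numpy as np

def permutation_scan(x, perm, A, B, C):
    """Run a simplified SSM recurrence over a permuted sequence.

    x    : (L, D) serialized feature map, L = H * W
    perm : (L,) permutation pi, mapping scan step t -> spatial index pi(t)
    A    : (m, m) recurrent matrix (fixed here; input-adaptive in Mamba)
    B    : (L, m, D) per-position input projections B_{pi(t)}
    C    : (L, D, m) per-position output projections C_{pi(t)}
    """
    L, _ = x.shape
    h = np.zeros(A.shape[0])
    y = np.zeros_like(x)
    for t in range(L):
        p = perm[t]                 # visit positions in the order given by pi
        h = A @ h + B[p] @ x[p]     # h_t = A h_{t-1} + B_{pi(t)} x_{pi(t)}
        y[p] = C[p] @ h             # y_t = C_{pi(t)} h_t
    return y

# Example: scan an 8x8 single-channel map in column-major order.
H = W = 8; D = 1; m = 4
x = np.random.randn(H * W, D)
perm = np.arange(H * W).reshape(H, W).T.ravel()   # column-major permutation
A = 0.9 * np.eye(m)
B = 0.1 * np.random.randn(H * W, m, D)
C = 0.1 * np.random.randn(H * W, D, m)
y = permutation_scan(x, perm, A, B, C)            # y is in native spatial order
```

Because each output is written back at index perm[t], the inverse permutation $\pi^{-1}$ is applied implicitly and no separate reordering pass is needed.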

2. Permutation Types and Scanning Strategies

Permutation scanning encompasses a spectrum of strategies for sequence ordering, often optimized for contextual coverage, spatial locality, or domain-specific topology.

  • Raster and Axis-Aligned Scans: Traditional row-major, column-major, row/column-reversed, or bidirectional axis scans are widely used as baselines (Zhu et al., 14 May 2024, Meng et al., 14 Jan 2025). Each corresponds to a simple lexicographic permutation.
  • Multi-Directional / Diagonal / Serpentine Scans: Extended to include diagonal, anti-diagonal, or alternating (serpentine) row/column patterns, which provide moderate improvements in spatial adjacency preservation at the cost of some implementation complexity (Zhu et al., 14 May 2024).
  • Hierarchical Direction Alternation (DA-HMG): Hi-Mamba proposes modular permutation: a set of local or regional SSMs operate along a single direction per block, with the overall group alternating (permuting) directions across layers. For example, one block scans horizontally, the next vertically, the third reverse-horizontally, and so forth. This avoids multi-direction scans within each block (which would multiply computational costs), achieving global 2D contextual coverage over several layers (Qiao et al., 14 Oct 2024).
  • Space-Filling Fractal Scans (High Hausdorff Dimension): Permutations derived from space-filling curves (Hilbert, Peano) greatly increase the Hausdorff dimension ($D_H = 2$), preserving both spatial locality and global coverage in a single traversal. Empirically, use of Hilbert or Peano scans in Mamba leads to improved PSNR and SSIM, lower LPIPS, and reduced computational resource consumption (Wang et al., 29 Oct 2025). The orderings are constructed via recursive, self-similar procedures exploiting the fractal structure of the curve (see the sketch after this list).
  • Multi-Scale and Domain-Specific Permutations: In MRI, both image- and k-space representations are scanned with axial, radial, or hierarchical multi-scale permutations, e.g., concentric circular traversal in k-space to align with spectral structure (Meng et al., 14 Jan 2025). In 4D light field models, permutation scanning is applied along spatial, angular, and EPI (epipolar) subspaces to reduce the quadratic scaling inherent in conventional Transformer attention (Gao et al., 23 Jun 2024).
  • Learned and Adaptive Permutations: Techniques such as sequence reordering (SR-Mamba) and dynamically adapted permutations per input (e.g., per-whole-slide-image in pathology (Xu et al., 29 Apr 2024)) further diversify the class of permutation scans.
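As an illustration of how such orderings are generated, below is the standard iterative "d2xy" construction of the Hilbert curve index map. The function names are ours, and the routine assumes a power-of-two grid side.

```python
def hilbert_d2xy(n, d):
    """Map distance d along the Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. At each scale s, two bits of d select a
    quadrant, and the coordinates are rotated/reflected to match the
    curve's self-similar orientation (classic iterative formulation).
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_permutation(n):
    """perm[t] = flat row-major index of the t-th pixel the curve visits."""
    coords = [hilbert_d2xy(n, d) for d in range(n * n)]
    return [yy * n + xx for xx, yy in coords]
```

Such a permutation can be passed directly to a scan routine like the one sketched in Section 1, replacing the raster order while leaving the SSM itself unchanged.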

3. Computational Complexity and Efficiency

For a feature map of $N$ elements and block depth $B$, the runtime depends on the scan permutation:

  • Multi-Directional per Block: $T_\text{block} = D \cdot O(N)$, $T_\text{total} = B \cdot D \cdot O(N)$, where $D$ is the number of directions (Qiao et al., 14 Oct 2024).
  • Single Permuted Direction per Block (Hi-Mamba, DA-HMG): $T_\text{block} = O(N)$, $T_\text{total} = B \cdot O(N)$. This eliminates the multiplicative overhead of simultaneous multi-directional scans, yielding a near-halving of FLOPs/runtime in super-resolution, e.g., 274 vs. 568 GFLOPs and a 2.6$\times$ runtime speedup on Urban100 (Hi-Mamba-S vs. MambaIR-4) (Qiao et al., 14 Oct 2024); a toy cost comparison follows this list.
  • Space-Filling Permutations: The per-block complexity is $O(N \cdot m)$ for hidden state dimension $m$, equivalent to a raster scan, with additional savings from reduced memory fragmentation and improved cache locality (Wang et al., 29 Oct 2025).
  • High-D Models and Subspace Scanning: In 4D light field applications, permuting along multiple subspaces (spatial, angular, EPI) converts the quadratic $O(N^2)$ Transformer cost to linear $O(N)$ in Mamba, with each sub-sequence much shorter than the flattened volume (Gao et al., 23 Jun 2024).
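As a sanity check on the accounting above, here is a back-of-envelope cost model; the unit per-element cost and the concrete numbers are deliberate simplifications of ours (real FLOPs also scale with state size and channel width).

```python
def scan_cost(n_elems, n_blocks, dirs_per_block, unit_cost=1.0):
    """Total scan cost of n_blocks blocks, each traversing n_elems
    elements in dirs_per_block directions (linear in every factor)."""
    return n_blocks * dirs_per_block * n_elems * unit_cost

N, B = 256 * 256, 12                             # hypothetical map and depth
multi_dir  = scan_cost(N, B, dirs_per_block=2)   # e.g. bidirectional per block
alternated = scan_cost(N, B, dirs_per_block=1)   # DA-HMG: one direction/block
print(multi_dir / alternated)                    # 2.0: the D-fold overhead
                                                 # matches the near-halving
```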

4. Locality, Long-Range Dependency, and Theoretical Properties

The choice of permutation has direct consequences for spatial locality, long-range information propagation, and approximation error:

  • Dispersion and Approximation: Scanning with a higher Hausdorff dimension (space-filling order) minimizes the dispersion $\varepsilon(P, \Omega)$, the maximal gap between consecutive visited points, and thus reduces worst-case interpolation error for Hölder-continuous image functions (Wang et al., 29 Oct 2025).
  • Locality: Raster scans incur large discontinuities at row junctions, whereas Hilbert and Peano maintain average jump lengths of $O(1)$, substantially improving the preservation of local context needed for low-level tasks, especially in low-light enhancement and denoising (Wang et al., 29 Oct 2025); a quick numerical check follows this list.
  • Global Receptive Field: Regardless of the permutation, the SSM core with selective recurrence provides a full receptive field; however, orderings that enhance spatial mixing (e.g., alternation in DA-HMG or the Hilbert scan) empirically facilitate feature fusion (Qiao et al., 14 Oct 2024, Gao et al., 23 Jun 2024).
  • Hardware Considerations: Arbitrarily permuted scans may reduce memory access efficiency, while fractal or structured permutations (Hilbert, Morton) retain good cache coherence (Wang et al., 29 Oct 2025, Xu et al., 29 Apr 2024).
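The locality claim is easy to check numerically. The snippet below, reusing the hilbert_d2xy helper from the Section 2 sketch, compares the mean Euclidean jump between consecutively visited pixels under raster and Hilbert orderings.

```python
import numpy as np

def mean_jump(coords):
    """Average Euclidean distance between consecutively visited pixels."""
    c = np.asarray(coords, dtype=float)
    return np.linalg.norm(np.diff(c, axis=0), axis=1).mean()

n = 64
raster  = [(d % n, d // n) for d in range(n * n)]     # row-major traversal
hilbert = [hilbert_d2xy(n, d) for d in range(n * n)]  # Section 2 sketch

print(mean_jump(raster))    # ~1.95: row-end jumps of length ~n inflate the mean
print(mean_jump(hilbert))   # 1.0: every Hilbert step moves to an adjacent cell
```

The average understates the raster penalty: its worst-case jump grows with the grid side $n$, which is precisely the dispersion effect described above.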

5. Empirical Results and Application Domains

Permutation scanning Mamba has been evaluated across vision domains, consistently demonstrating efficiency and, in key settings, state-of-the-art representational gains:

| Task / Domain | Permutation Type | Key Empirical Outcomes | Reference |
|---|---|---|---|
| Image SR | DA-HMG (direction alternation) | +0.29 dB PSNR on Manga109; 2.6× faster | (Qiao et al., 14 Oct 2024) |
| Low-light Enhancement | Hilbert/Peano scan | +1.5–2 dB PSNR, −15% runtime, −0.3 GB memory | (Wang et al., 29 Oct 2025) |
| MRI Reconstruction | Circular, multi-scale DA scan | +0.2 dB PSNR, −45% FLOPs | (Meng et al., 14 Jan 2025) |
| Light Field SR | Subspace bi-directional scan | 3–4× faster; feasible beyond 384×384 LFs | (Gao et al., 23 Jun 2024) |
| Segmentation (remote sensing) | All scan types | No significant difference; D1 (raster) suffices | (Zhu et al., 14 May 2024) |
| Point Cloud / WSI | Morton/octree permutation | 1–2% accuracy lift (classification) | (Xu et al., 29 Apr 2024) |

In super-resolution, DA-HMG alternation recovers 2D dependencies while halving resource overhead. Hilbert scanning yields superior perceptual metrics and smoother outputs under low-light conditions. Multi-scale and dual-domain permutations jointly optimize MRI reconstruction, leveraging spectral priors in k-space. In semantic segmentation of high-resolution aerial imagery, scan choice has negligible impact because the global SSM core of Vision Mamba is insensitive to ordering (Zhu et al., 14 May 2024).

6. Architectural and Methodological Generalizations

Permutation scanning is architecture-agnostic and has been generalized in Visual Mamba research:

  • Multi-Scale Hierarchy: Combining local and regional SSMs (e.g., in Hi-Mamba HMB) allows aggregation of both patch-level and image-level context in a permutation-efficient manner (Qiao et al., 14 Oct 2024).
  • Hybrid Systems: Permutation scanning integrates with CNN, Transformer, or hybrid SSM architectures; scan schedules can be adapted or learned at different resolution stages (Qiao et al., 14 Oct 2024).
  • Learning Permutations: Recent studies report gains from adaptively learning the scan order per sample or per layer (SR-Mamba, sequence reordering in digital pathology) (Xu et al., 29 Apr 2024); a conceptual sketch follows this list.
  • High-Dimensional and Multi-Modal Data: Subspace and fractal permutation scanning has been extended to 4D light fields, hyperspectral cubes, and even graph-structured or sparse-attention neural networks (Gao et al., 23 Jun 2024, Wang et al., 29 Oct 2025).
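To illustrate the learned-permutation idea, below is a conceptual PyTorch-style sketch of one possible mechanism: score each position, scan in score order, then unsort. This is our illustration, not the SR-Mamba or pathology reordering mechanism itself, and the hard argsort would need a soft-sorting relaxation or a straight-through estimator to be trained end-to-end.

```python
import torch
import torch.nn as nn

class LearnedScanOrder(nn.Module):
    """Scan a sequence model over positions sorted by a learned score.

    Data flow only: score -> sort -> scan -> unsort. argsort is not
    differentiable, so a practical version needs soft sorting or a
    straight-through estimator (omitted here for brevity).
    """
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, x, ssm):
        # x: (batch, L, dim); ssm: any sequence model applied in scan order
        scores = self.scorer(x).squeeze(-1)              # (batch, L)
        perm = torch.argsort(scores, dim=1)              # per-sample scan order
        inv = torch.argsort(perm, dim=1)                 # inverse permutation
        idx = perm.unsqueeze(-1).expand_as(x)
        y_perm = ssm(torch.gather(x, 1, idx))            # scan in learned order
        return torch.gather(y_perm, 1, inv.unsqueeze(-1).expand_as(y_perm))
```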

7. Future Directions and Limitations

Experimental findings indicate that for some domains (remote sensing segmentation) scan choice has no statistically significant effect, while for others (SR, low-light enhancement, MRI, classification, reconstruction) space-filling or permuted scans drive substantial improvements. Open questions include:

  • For which data modalities does high Hausdorff dimension traversal yield consistently better inductive bias?
  • Can entirely learned or data-conditioned permutations exceed the performance of fixed fractal or hierarchical orders?
  • What are the optimal trade-offs between permutation-induced parallelization overhead and efficiency gains in emerging hardware?
  • How can permutation scanning be further leveraged in non-vision settings (e.g., GNNs, sparse transformers) (Wang et al., 29 Oct 2025)?

A plausible implication is that diversified permutation strategies (including learned, hierarchical, and space-filling orders) will remain central to efficient, scalable state space modeling in high-resolution and multi-dimensional data settings.
