Dual Selective SSM Projector
- The dual selective SSM projector is an architectural paradigm that integrates two data-adaptive SSM operations along complementary axes to improve model expressivity and efficiency.
- It employs token and channel (or orthogonal spatial) selectivity via MLP-based gating and linear recurrences, ensuring scalable computation.
- Applications span deep learning, computer vision, image restoration, and optical imaging, achieving notable performance improvements and reduced computational costs.
A Dual Selective SSM (State Space Model) Projector refers to an architectural paradigm that exploits two selective SSM operations—typically along distinct axes, modalities, or functional branches—to maximize expressive capacity, adaptive computation, and efficiency in neural, optical, or physical systems. This article surveys its emergence, mathematical underpinnings, algorithmic structure, design variants, and principal applications, drawing on canonical implementations in deep learning, computer vision, and optical projection hardware.
1. Concept and Technical Foundations
The dual selective SSM projector paradigm generalizes basic SSM layers by instantiating two distinct, content- or task-dependent SSM operators, typically addressing orthogonal or complementary processing axes. The approach leverages data-dependent parameterization and gating to adapt state evolution—in both recurrent and feedforward settings—while maintaining linear computation and memory cost.
In the original neural context, this is realized as dual “selection” mechanisms: one along the token (or spatial/temporal) dimension and another along the channel (feature or modality) dimension. Each SSM projector utilizes small linear/MLP-based parameter generators for time- or location-varying system kernels (B, C, Δ), which then drive a discretized SSM recurrence with additional gating, typically in parallel with MLP or convolutional blocks (Behrouz et al., 2024, Deng et al., 2024). In hardware or signal processing settings (e.g., structured-light, metasurface optics), this dual selectivity is matched by parallel or cascaded projection elements with orthogonal control parameters or physical axes (Ou et al., 2024, Je et al., 2015).
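For reference, the discretized selective recurrence that such parameter generators drive can be sketched as follows. The notation is the common S6 (Mamba-style) form and is an assumption here, not a transcription from the cited papers: $A$ is a learned state matrix, $h_t$ is the hidden state, $W_B$, $W_C$, $W_\Delta$ are hypothetical linear generators, and the simplified rule $\bar{B}_t = \Delta_t B_t$ is one widely used discretization choice.

```latex
\begin{aligned}
B_t &= W_B x_t, \qquad C_t = W_C x_t, \qquad \Delta_t = \mathrm{softplus}(W_\Delta x_t), \\
\bar{A}_t &= \exp(\Delta_t A), \qquad \bar{B}_t = \Delta_t B_t, \\
h_t &= \bar{A}_t \, h_{t-1} + \bar{B}_t \, x_t, \qquad y_t = C_t \, h_t .
\end{aligned}
```

Because $B_t$, $C_t$, and $\Delta_t$ depend on the input at each step, the recurrence is time-varying, which is what distinguishes a selective SSM from a fixed-kernel one.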
2. Mathematical and Algorithmic Structure
The general dual selective SSM projector consists of two serial (sometimes parallel) SSM pipelines, each with its own data-adaptive parameterization and gating:
A. Neural Architectures
Given input $x \in \mathbb{R}^{L \times D}$ ($L$ tokens, $D$ channels):
- Token-Selective SSM Projector:
For each token position $t$, compute parameter functions $B_t$, $C_t$, $\Delta_t$.
Compute the gated input $\bar{x}$ and an MLP channel $x_{\mathrm{MLP}}$. Propagate $\bar{x}$ through a discretized SSM recurrence, with final output elementwise-gated: $y_{\mathrm{token}} = \mathrm{SSM}_{A,B,C,\Delta}(\bar{x}) \odot x_{\mathrm{MLP}}$.
- Channel-Selective SSM Projector:
Transpose to $x^{\top} \in \mathbb{R}^{D \times L}$, perform a bidirectional SSM scan (forward and backward) along the channel axis, again with dynamic parameterization as above. Combine, gate, and re-transpose to the original shape (Behrouz et al., 2024, Deng et al., 2024).
Both stages utilize associative scan algorithms (e.g., S6 parallel scan) and are strictly linear in sequence or feature-dimension length.
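The two stages above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation from the cited papers: the scan is a plain sequential loop rather than an associative parallel scan, the batch dimension is omitted, the gate is a single linear layer followed by SiLU, and all names (`selective_scan`, `token_projector`, `channel_projector`, `init_params`) are hypothetical.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)

def silu(z):
    return z / (1.0 + np.exp(-z))

def selective_scan(x, A, W_B, W_C, w_delta):
    """Sequential selective SSM scan over the first axis of x.

    x: (T, F) array, T scan steps with F features each.
    A: (F, N) negative state matrix (one diagonal state row per feature).
    W_B, W_C: (F, N) projections producing step-dependent B_t and C_t.
    w_delta: (F, F) projection producing the step-dependent step size.
    """
    T, F = x.shape
    h = np.zeros((F, A.shape[1]))
    y = np.empty_like(x)
    for t in range(T):
        B_t = x[t] @ W_B                        # (N,)
        C_t = x[t] @ W_C                        # (N,)
        delta = softplus(x[t] @ w_delta)        # (F,), always positive
        A_bar = np.exp(delta[:, None] * A)      # (F, N), entries in (0, 1)
        B_bar = delta[:, None] * B_t[None, :]   # (F, N)
        h = A_bar * h + B_bar * x[t][:, None]   # state update
        y[t] = h @ C_t                          # (F,) readout
    return y

def init_params(F, N, rng):
    # Hypothetical initialization, chosen only to keep the recurrence stable.
    return dict(
        A=-np.exp(rng.normal(size=(F, N))),
        W_B=rng.normal(size=(F, N)) / np.sqrt(F),
        W_C=rng.normal(size=(F, N)) / np.sqrt(F),
        w_delta=rng.normal(size=(F, F)) / np.sqrt(F),
    )

def token_projector(x, p, W_gate):
    # Scan along tokens; a per-token gate branch modulates the SSM output.
    return selective_scan(x, **p) * silu(x @ W_gate)

def channel_projector(x, p_fwd, p_bwd, W_gate):
    xt = x.T                                        # (D, L): scan along channels
    fwd = selective_scan(xt, **p_fwd)
    bwd = selective_scan(xt[::-1], **p_bwd)[::-1]   # backward scan, re-aligned
    return ((fwd + bwd) * silu(xt @ W_gate)).T      # back to (L, D)

def dual_projector(x, params):
    y = token_projector(x, params["tok"], params["tok_gate"])
    return channel_projector(y, params["ch_f"], params["ch_b"], params["ch_gate"])

rng = np.random.default_rng(0)
L, D, N = 6, 4, 3
params = {
    "tok": init_params(D, N, rng),
    "tok_gate": rng.normal(size=(D, D)) / np.sqrt(D),
    "ch_f": init_params(L, N, rng),
    "ch_b": init_params(L, N, rng),
    "ch_gate": rng.normal(size=(L, L)) / np.sqrt(L),
}
x = rng.normal(size=(L, D))
y = dual_projector(x, params)
print(y.shape)
```

In this sketch the channel-branch parameters are sized by the sequence length `L`, which keeps the example short but ties the module to a fixed length; a practical design would choose the channel-branch parameterization differently.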
B. Physical and Optical Systems
- The dual-branch LCoS SSM projector combines a bottom polarization-converting metasurface with a top electrically tunable subwavelength grating/liquid crystal (LC) layer. The metasurface handles polarization selectivity (acting as a reflective half-wave plate), and the LC layer enables electro-optical amplitude switching; together they achieve polarization-independent modulation and dense addressability (Ou et al., 2024).
- Dual-projector structured-light vision employs two spatial projectors with orthogonal color-multiplexed stripe orientations; orientation-multiplexed patterns enable derivative-based code separation for superposed, spatially entangled patterns (Je et al., 2015).
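The idea behind derivative-based separation of orthogonally oriented patterns can be illustrated with a toy example. This is a simplified sketch, not the method of Je et al.: it assumes grayscale sinusoidal stripes on a flat, fronto-parallel surface, so the stripes stay exactly axis-aligned, whereas a real system uses color-multiplexed stripes deformed by scene geometry.

```python
import numpy as np

H, W = 64, 64
yy, xx = np.mgrid[0:H, 0:W]

# Projector 1: vertical stripes (intensity varies only along x).
# Projector 2: horizontal stripes (intensity varies only along y).
vertical = 0.5 + 0.5 * np.sin(2 * np.pi * xx / 8.0)
horizontal = 0.5 + 0.5 * np.sin(2 * np.pi * yy / 16.0)

# The camera observes the sum of both projected patterns.
superposed = vertical + horizontal

# np.gradient returns derivatives along axis 0 (y) and then axis 1 (x).
d_dy, d_dx = np.gradient(superposed)

# Reference derivatives of each pattern on its own.
_, ref_dx_vertical = np.gradient(vertical)
ref_dy_horizontal, _ = np.gradient(horizontal)

# The x-derivative responds only to the vertical stripes, and the
# y-derivative only to the horizontal stripes, so each code is isolated.
print(np.allclose(d_dx, ref_dx_vertical), np.allclose(d_dy, ref_dy_horizontal))
```

Each directional derivative cancels the pattern that is constant along that direction, which is why orthogonal stripe orientations allow two superposed codes to be read independently in this idealized setting.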
3. Key Design Variants
Distinct dual selective SSM projector designs have emerged, each reflecting domain-specific constraints and objectives.
| Variant Context | Dual Axes or Branches | Selective Mechanism (Per Branch) |
|---|---|---|
| MambaMixer (vision/time-series) | Token (L, sequence) / Channel (D, feature) | Data-dependent (B, C, Δ); MLP gating |
| CU-Mamba (image restoration) | Spatial (flattened H×W) / Channel (C, per-pixel) | Data-dependent (B, C, Δ); SiLU gating |
| Mamba-FSCIL (incremental learning) | Base / incremental branches (class disjunction) | Sample-conditioned SSM parameterization |
| FMOcc (3D occupancy) | Tri-perspective planes; “air”/“non-air” token gating | TPV SSM + selective gate per plane |
| Meta-LCoS (hardware) | Metasurface/polarization + LC/amplitude modulation | Physical structure, voltage control |
| Structured-light vision | Two projectors, orthogonal orientation multiplexer | Chromatic/gradient code separation |
Projector types extend from purely algorithmic modules to physical systems, unified by dual selective, content-influenced parameterization.
4. Applications and Empirical Results
Machine Learning Architectures
- Vision and Time Series: ViM2 and TSM2, based on MambaMixer dual selective SSMs, achieve state-of-the-art or competitive performance on ImageNet, COCO, and multiple time-series benchmarks, providing strong evidence that dual mixing along the token and channel axes supplies complementary global context and feature selectivity (Behrouz et al., 2024).
- Image Restoration: CU-Mamba, with spatial and channel SSM projectors integrated into a U-Net, yields up to +1 dB PSNR improvements over baselines, reaching 33.53 dB on GoPro deblurring and outperforming quadratic-complexity transformers in both PSNR and runtime (Deng et al., 2024).
- Incremental and Few-Shot Learning: Mamba-FSCIL uses a selective SSM projector with dual base and incremental branches, achieving state-of-the-art results on miniImageNet, CUB-200, and CIFAR-100 and minimizing catastrophic forgetting through explicit branch separation and content-adaptive selective scanning (Li et al., 2024).
- 3D Occupancy Prediction: FMOcc, using dual (tri-perspective) selective SSM projection, surpasses prior methods on Occ3D-nuScenes (43.1% RayIoU with two-frame input), demonstrating effective missing-data hallucination and linear computation (Chen et al., 3 Jul 2025).
- Dynamic SSM/Attention Routing: AMOR deploys a dual-selective “Ghost KV” SSM projector (projecting SSM state for sparse-attention use) with a metacognitive (entropy-based) gate, achieving perfect retrieval with 22% attention usage and significant efficiency gains over standard transformers (Zheng, 22 Jan 2026).
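An entropy-based gate of the kind mentioned for dynamic routing can be sketched generically as follows. This is an illustrative assumption about how such a gate might work, not AMOR's actual mechanism: the threshold, the use of predictive entropy over output logits, and the names `entropy_gate`, `route`, and `attention_fn` are all hypothetical.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_gate(ssm_logits, threshold):
    """Flag positions where the SSM's predictive distribution is uncertain."""
    p = softmax(ssm_logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return entropy > threshold, entropy

def route(ssm_logits, attention_fn, threshold):
    """Keep SSM outputs where confident; call attention only where flagged."""
    use_attn, _ = entropy_gate(ssm_logits, threshold)
    out = ssm_logits.copy()
    if use_attn.any():
        out[use_attn] = attention_fn(np.flatnonzero(use_attn))
    return out, use_attn.mean()

logits = np.array([
    [10.0, 0.0, 0.0, 0.0],  # confident prediction: low entropy
    [0.0, 0.0, 0.0, 0.0],   # uniform prediction: entropy = ln 4
    [8.0, 0.0, 0.0, 0.0],   # confident prediction: low entropy
    [1.0, 1.0, 1.0, 1.0],   # uniform prediction: entropy = ln 4
])

# Stand-in for a sparse-attention call; returns placeholder logits.
fake_attention = lambda idx: np.full((len(idx), logits.shape[1]), -1.0)

mask, entropy = entropy_gate(logits, threshold=0.5)
routed, attention_fraction = route(logits, fake_attention, threshold=0.5)
print(mask, attention_fraction)
```

The fraction of positions sent to attention is the quantity such a gate trades off against accuracy; in this toy input, only the two uniform rows exceed the threshold.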
Optical and Imaging Hardware
- Meta-LCoS Projection: Dual metasurface SSM projector achieves 81:1 contrast (532 nm), 70% fill factor, polarization-insensitive amplitude modulation, and 50–60% optical efficiency in a CMOS-compatible architecture (Ou et al., 2024).
- Structured-Light Vision: Dual-projector system employing orientation-multiplexed color patterns enables single-shot, large-area 3D reconstruction with chromatic-derivative code separation (Je et al., 2015).
5. Computational and Physical Efficiency Analysis
A defining feature of the dual selective SSM projector is its ability to unify expressivity across axes without incurring quadratic computational burdens:
- Algorithmic Complexity:
- MambaMixer and CU-Mamba dual SSMs have a per-block cost that is linear in the token and channel dimensions, compared with the quadratic-in-sequence-length cost of attention-based transformers and the kernel-size-dependent cost of large-kernel CNNs (Behrouz et al., 2024, Deng et al., 2024).
- FMOcc’s TPV SSM achieves linear complexity along reduced 2D planes, addressing 3D volumetric efficiency.
- Hardware Implementations:
- Meta-LCoS stacks two tunable, physically orthogonal SSM elements, eliminating polarization losses and high-voltage requirements of conventional LCoS.
- Structured-light dual projectors increase spatial coverage by simultaneously encoding separable features in orientation/chromaticity, sidestepping typical occlusion limits.
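Returning to the algorithmic complexity point, the gap between linear and quadratic scaling can be made concrete with a rough operation count. The cost formulas below are leading-order assumptions chosen for illustration (constants and projection layers are ignored) and are not figures from the cited papers; `attention_mixing_cost` and `dual_scan_cost` are hypothetical helpers.

```python
def attention_mixing_cost(L, D):
    # Assumed leading-order cost: QK^T scores plus the weighted value sum.
    return 2 * L * L * D

def dual_scan_cost(L, D, N):
    # Assumed leading-order cost: one token scan plus a bidirectional channel
    # scan, each updating an N-dimensional state per (token, channel) entry.
    return 3 * L * D * N

D, N = 64, 16
for L in (1024, 4096, 16384):
    ratio = attention_mixing_cost(L, D) / dual_scan_cost(L, D, N)
    print(f"L={L}: attention / dual-scan cost ratio = {ratio:.1f}")
```

Under these assumptions the ratio grows in proportion to the sequence length, so the advantage of the scan-based design widens as inputs get longer.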
6. Practical Implementation, Training, and Tuning
Key practices and hyperparameters for dual selective SSM projector designs:
- Parameterization:
- Typically all SSM parameters (B, C, Δ) are data- and position-dependent, generated via lightweight MLPs or convolutions.
- Gating:
- Output gating (SiLU, sigmoid, or MLP) multiplexes or modulates the SSM output along each axis/direction.
- Dense Layer-to-Layer Connections:
- Weighted sums of prior token/channel SSM outputs allow information “shortcuts” and improved feature aggregation (Behrouz et al., 2024).
- Branch Decoupling:
- For incremental learning, freezing base SSM branches preserves foundational representations, while new branches serve adaptive needs (Li et al., 2024).
- Losses:
- Selective scan losses—suppression and separation—balance stability and plasticity in continual learning (Li et al., 2024).
- Hyperparameters:
- The channel state dimension (SSM) is typically 64, and 2D tasks use several scan directions (e.g., 4); learning rates, weight decays, and loss-balancing weights are reported to be empirically robust over the specified ranges (Deng et al., 2024, Li et al., 2024).
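The dense layer-to-layer connections listed above can be sketched as a weighted sum over the outputs of all earlier token and channel mixers. This is a minimal illustration under assumptions: the weights `alpha` and `beta` would be learned scalars in practice, and the function name `dense_input` is hypothetical.

```python
import numpy as np

def dense_input(prev_token_outs, prev_channel_outs, alpha, beta):
    """Build a layer's input as a weighted sum of all earlier mixer outputs.

    prev_token_outs, prev_channel_outs: lists of (L, D) arrays from earlier layers.
    alpha, beta: one scalar weight per earlier token / channel output.
    """
    acc = np.zeros_like(prev_token_outs[0])
    for a, y in zip(alpha, prev_token_outs):
        acc += a * y
    for b, y in zip(beta, prev_channel_outs):
        acc += b * y
    return acc

L, D = 5, 3
token_outs = [np.ones((L, D)), 2.0 * np.ones((L, D))]
channel_outs = [3.0 * np.ones((L, D))]

# 0.5 * 1 + 0.25 * 2 + 1.0 * 3 = 4 at every position
mixed = dense_input(token_outs, channel_outs, alpha=[0.5, 0.25], beta=[1.0])
print(mixed[0])
```

Giving each layer direct access to earlier outputs is what provides the information shortcuts described above, at the cost of one extra scalar weight per earlier block.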
7. Significance, Limitations, and Outlook
The dual selective SSM projector encapsulates a shift toward structured, content-adaptive, and axis-complementary state space modeling across both software and hardware domains. It attains high task performance, efficiency (linear complexity), and modular extensibility by leveraging dual-axis or dual-branch selectivity.
A key finding is that cross-token and cross-channel selectivity are empirically complementary: ablating either one systematically degrades performance on representative tasks (e.g., PSNR in image restoration drops when either stage is omitted) (Deng et al., 2024). Similar trends are reported in 3D occupancy prediction and few-shot learning, where dual branches (e.g., class-disjoint) substantially outperform single-branch formulations (Chen et al., 3 Jul 2025, Li et al., 2024).
Physical implementations confirm that dual selectivity (e.g., polarization + amplitude, or orthogonal color projectors) enables hardware modularity, efficiency, and coverage beyond what is feasible with conventional single-axis modulation (Ou et al., 2024, Je et al., 2015).
Current challenges include tuning for domain-specific heterogeneity, scaling to higher-dimensional axes, and understanding gating dynamics in deep stacked configurations. A plausible implication is that dual selective SSM projectors will serve as a foundational motif in both next-generation neural sequence modeling and emerging optical, imaging, and embedded intelligent sensor platforms.