MoE-Enhanced Bi-Path Rectifier
- The paper presents a novel integration of MoE-based expert routing with spatial-context and channel-semantic dual paths to refine complex motion fields.
- The system uses probabilistic prior-guided decomposition to partition signals into structure-aware sub-fields for targeted bi-path enhancement and subsequent expert-driven rectification.
- Empirical evidence shows significant improvements in relative pose, homography estimation, and point cloud registration, validating the architecture's precision and robustness.
A Mixture-of-Experts-Enhanced Bi-Path Rectifier is a neural architectural strategy that integrates a Mixture-of-Experts (MoE) sparse routing mechanism with dual-path (bi-path) enhancement modules, enabling fine-grained, structure-aware rectification of heterogeneous signal components. Most notably instantiated in the GeoMoE framework for two-view geometry (Le et al., 1 Aug 2025), the approach adaptively refines motion-field sub-structures by combining spatial-contextual and channel-semantic processing, followed by MoE-based expert selection for specialized modeling.
1. Structural Decomposition and Probabilistic Prior Guidance
The bi-path rectifier is situated downstream of a probabilistic prior-guided decomposition stage, which partitions a global signal (e.g., a motion field) into multiple structure-aware sub-fields. This decomposition employs a soft clustering methodology:
- The input to the decomposition is the feature tensor at the current layer.
- A multi-layer perceptron (MLP) followed by a softmax maps this tensor to soft assignment scores over the sub-fields.
- The clustering is informed by inlier probability priors, typically extracted from deep features or geometric consensus, which suppress the influence of outliers and encourage groupings reflecting coherent motion regimes.
This structure-aware decomposition yields sub-fields that are suitable for path-specific enhancement and eventual MoE-based rectification.
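The soft clustering step can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the GeoMoE implementation: the two-layer MLP, the sizes `N`, `C`, `H`, `K`, the names `soft_decompose` and `inlier_prob`, and the choice to inject the prior by multiplying it into the assignment scores are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_decompose(feats, inlier_prob, w1, b1, w2, b2):
    """Split N feature vectors into K soft sub-fields.

    feats:       (N, C) per-element features of the global signal
    inlier_prob: (N,)   prior probability that each element is an inlier
    Returns the assignment scores (N, K) and the sub-fields (K, N, C).
    """
    hidden = np.maximum(feats @ w1 + b1, 0.0)       # (N, H) ReLU layer
    assign = softmax(hidden @ w2 + b2, axis=1)      # (N, K) soft assignments
    # Assumption: the prior scales each assignment, so likely outliers
    # contribute little to every sub-field.
    weights = assign * inlier_prob[:, None]         # (N, K)
    sub_fields = weights.T[:, :, None] * feats[None, :, :]  # (K, N, C)
    return assign, sub_fields

rng = np.random.default_rng(0)
N, C, H, K = 200, 16, 32, 4                         # hypothetical sizes
w1, b1 = rng.normal(size=(C, H)) * 0.1, np.zeros(H)
w2, b2 = rng.normal(size=(H, K)) * 0.1, np.zeros(K)
feats = rng.normal(size=(N, C))
inlier_prob = rng.uniform(size=N)

assign, sub_fields = soft_decompose(feats, inlier_prob, w1, b1, w2, b2)
print(assign.shape, sub_fields.shape)
```

Because the assignments for each element sum to one, summing the sub-fields over `K` recovers the prior-weighted input, so in this sketch the decomposition redistributes the signal rather than discarding any of it.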
2. Bi-Path Enhancement: Spatial-Context and Channel-Semantic Streams
Each decomposed sub-field undergoes dual-path enhancement:
A. Spatial-Context Path
- Employs graph-based aggregation (e.g., Graph Attention Networks) to encode fine-grained spatial dependencies and neighborhood relationships within the sub-field.
- Captures geometric configurations, boundary dynamics, and regional motion consistency.
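As an illustration of the spatial-context path, the sketch below applies one single-head, GAT-style attention layer over a k-nearest-neighbor graph. The graph construction, the exclusion of self-loops, the sizes, and the names `knn_indices`, `graph_attention`, `a_src`, and `a_dst` are assumptions made for this example; the source only states that graph-based aggregation such as a Graph Attention Network is used.

```python
import numpy as np

def knn_indices(pos, k):
    # Indices of the k nearest neighbors of every node, excluding itself.
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)  # (M, M)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]                          # (M, k)

def graph_attention(h, nbrs, W, a_src, a_dst, slope=0.2):
    """One attention layer: each node aggregates its neighbors' features."""
    z = h @ W                                       # (M, D) projected features
    # GAT score a^T [z_i || z_j], split into a source and a neighbor term.
    e = (z @ a_src)[:, None] + (z @ a_dst)[nbrs]    # (M, k)
    e = np.where(e > 0, e, slope * e)               # LeakyReLU
    e = e - e.max(axis=1, keepdims=True)            # stabilize the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # weights over neighbors
    out = (alpha[:, :, None] * z[nbrs]).sum(axis=1)   # (M, D)
    return out, alpha

rng = np.random.default_rng(1)
M, C, D, k = 50, 8, 8, 6                            # hypothetical sizes
pos = rng.uniform(size=(M, 2))                      # element positions
h = rng.normal(size=(M, C))                         # sub-field features
W = rng.normal(size=(C, D)) * 0.3
a_src, a_dst = rng.normal(size=D), rng.normal(size=D)

nbrs = knn_indices(pos, k)
spatial, alpha = graph_attention(h, nbrs, W, a_src, a_dst)
print(spatial.shape)
```

Restricting attention to spatial neighbors is what lets this path model boundaries and regional consistency: a node's output depends only on nearby elements of the same sub-field.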
B. Channel-Semantic Path
- Applies global average pooling over each sub-field, followed by an MLP to distill semantic cues.
(The distilled cues may optionally be fused with the original sub-field features.)
- Encodes semantic attributes such as material, texture, or motion category.
These paths are concatenated along the channel axis and fused by a lightweight MLP, generating an enriched representation.
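A minimal sketch of the channel-semantic path and the final fusion is given below. It assumes that the optional fusion with the original features is a channel-wise sigmoid gate and that the fusion MLP is a single ReLU layer; these choices, the random stand-in for the spatial-path output, and the names `channel_semantic` and `fuse_paths` are hypothetical rather than taken from the source.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_semantic(sub, w1, w2):
    """Global average pooling -> MLP -> channel-wise gate on the sub-field."""
    pooled = sub.mean(axis=0)                    # (C,) one value per channel
    gate = sigmoid(relu(pooled @ w1) @ w2)       # (C,) semantic cue in (0, 1)
    # Assumed fusion: rescale every element's channels by the shared gate.
    return sub * gate, gate

def fuse_paths(spatial, semantic, wf, bf):
    """Concatenate both paths on the channel axis and mix with one layer."""
    cat = np.concatenate([spatial, semantic], axis=1)   # (M, 2C)
    return relu(cat @ wf + bf)                           # (M, C)

rng = np.random.default_rng(2)
M, C, R = 50, 8, 4                               # hypothetical sizes
sub = rng.normal(size=(M, C))                    # one sub-field
spatial = rng.normal(size=(M, C))                # stand-in for the spatial path
w1 = rng.normal(size=(C, R)) * 0.3
w2 = rng.normal(size=(R, C)) * 0.3
wf, bf = rng.normal(size=(2 * C, C)) * 0.3, np.zeros(C)

semantic, gate = channel_semantic(sub, w1, w2)
enriched = fuse_paths(spatial, semantic, wf, bf)
print(enriched.shape)
```

The pooled vector is shared by all elements of the sub-field, so this path contributes a global, per-channel signal that complements the local aggregation of the spatial path.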
3. MoE Routing and Expert-Driven Rectification
The bi-path-fused representation is routed to a set of specialized experts using an MoE layer:
- Routing logits are computed by an MLP applied to the fused representation.
- A softmax converts the logits to routing probabilities, and top-k selection yields a sparse routing mask.
- The output aggregates the selected experts' outputs, weighted by their routing probabilities. Each expert is a feedforward network tailored to a specific motion or semantic regime.
This expert-driven refinement suppresses interference between heterogeneous sub-fields and reduces representational entanglement, yielding precise rectification per motion regime.
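The routing steps above can be sketched as follows. The sketch assumes a two-layer router MLP, two-layer ReLU experts, and `top_k = 2`; the sizes and the name `moe_rectify` are hypothetical. Selected probabilities are used as weights without renormalization, which is one common convention and not necessarily the one used in GeoMoE.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_rectify(x, router, experts, top_k=2):
    """Route each row of x to its top_k experts and mix their outputs."""
    r1, r2 = router
    logits = np.maximum(x @ r1, 0.0) @ r2            # (M, E) router MLP
    probs = softmax(logits, axis=1)                  # (M, E)
    top = np.argsort(-probs, axis=1)[:, :top_k]      # (M, top_k) expert ids
    mask = np.zeros_like(probs)
    np.put_along_axis(mask, top, 1.0, axis=1)        # sparse routing mask
    gates = probs * mask                             # zero for unselected experts
    out = np.zeros_like(x)
    for e, (w1, w2) in enumerate(experts):
        sel = mask[:, e] > 0                         # rows routed to expert e
        if sel.any():
            y = np.maximum(x[sel] @ w1, 0.0) @ w2    # expert feedforward net
            out[sel] += gates[sel, e:e + 1] * y
    return out, gates

rng = np.random.default_rng(3)
M, C, H, E = 50, 8, 16, 4                            # hypothetical sizes
x = rng.normal(size=(M, C))                          # bi-path-fused features
router = (rng.normal(size=(C, H)) * 0.3, rng.normal(size=(H, E)) * 0.3)
experts = [(rng.normal(size=(C, H)) * 0.3, rng.normal(size=(H, C)) * 0.3)
           for _ in range(E)]

rectified, gates = moe_rectify(x, router, experts, top_k=2)
print(rectified.shape, (gates > 0).sum(axis=1)[:5])
```

Only the selected experts are evaluated for each row, which is the source of the sparsity: compute grows with `top_k`, not with the total number of experts.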
4. Sequential Integration and Global Recomposition
Following MoE-based rectification, the system recombines the refined sub-field outputs into an updated global signal:
- Aggregation (typically via graph-based attention) fuses the rectified sub-field outputs with the previous global feature.
This sequential pipeline (decomposition, bi-path enhancement, MoE rectification, and global recomposition) supports divide-and-conquer modeling of complex and highly variable signals, such as motion fields with diverse, disconnected movement patterns.
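One way to picture the recomposition step is a cross-attention update in which the previous global features query the rectified sub-field outputs. The sketch below uses scaled dot-product attention with a residual connection as a stand-in for the graph-based attention mentioned above; that substitution, the sizes, and the names `recompose`, `wq`, `wk`, and `wv` are assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def recompose(global_prev, refined, wq, wk, wv):
    """Update global features by attending over rectified sub-field tokens.

    global_prev: (N, C) global feature before the update
    refined:     (T, C) rectified sub-field outputs
    """
    q = global_prev @ wq                              # (N, D) queries
    k = refined @ wk                                  # (T, D) keys
    v = refined @ wv                                  # (T, C) values
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)   # (N, T)
    # Residual update keeps the previous global signal and adds refinements.
    return global_prev + attn @ v, attn

rng = np.random.default_rng(4)
N, T, C, D = 200, 12, 8, 8                            # hypothetical sizes
global_prev = rng.normal(size=(N, C))
refined = rng.normal(size=(T, C))
wq = rng.normal(size=(C, D)) * 0.3
wk = rng.normal(size=(C, D)) * 0.3
wv = rng.normal(size=(C, C)) * 0.3

global_next, attn = recompose(global_prev, refined, wq, wk, wv)
print(global_next.shape)
```

The residual form means the refined sub-fields act as a correction to the global signal, so the update can be stacked over several layers without losing the original features.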
5. Empirical Impact and Ablation Evidence
Experimental results for MoE-Enhanced Bi-Path Rectifiers in GeoMoE (Le et al., 1 Aug 2025) include:
- Relative Pose Estimation: On the reported metric, GeoMoE achieves an improvement of approximately 10.75% over DeMatch when using the weighted eight-point algorithm.
- Homography Estimation and Point Cloud Registration: Lower average corner error and higher registration recall than baseline models that do not use this architecture.
- Ablations: Incremental benefits are attributed to (a) probabilistic prior injection, (b) spatial and channel dual-path enhancement, and (c) MoE-based refinement; peak performance is associated with their joint use.
These results indicate robust generalization and accuracy improvements across varying geometric tasks, particularly in settings with non-uniform and multi-modal motion distributions.
6. Relation to Broader MoE Routing and System Design
MoE-Enhanced Bi-Path Rectifiers extend the general MoE paradigm (as in SMILE (He et al., 2022), SiDA-MoE (Du et al., 2023), and HEXA-MoE (Luo et al., 2 Nov 2024)) to domains requiring localized and semantically aware rectification. The bi-path strategy conceptually parallels bi-level MoE routing in SMILE by dividing processing into complementary flows but distinguishes itself by explicitly focusing on spatial-context and channel-semantic disambiguation before expert dispatch. The resulting architecture is compatible with broader MoE routing optimization and scaling frameworks and can be further combined with device- or memory-aware scheduling and resource allocation.
7. Technical Caveats and Architectural Implications
The compositional complexity inherent to bi-path and MoE design introduces system-level challenges, including the need for careful expert specialization, avoidance of expert collapse, and management of sequential fusion modules. Tuning routing network capacity, expert granularity, and enhancement fusion strategies is non-trivial, especially for high-dimensional signals with subtle inter-regional dependencies. Nonetheless, empirical ablations suggest that when properly instantiated, this architectural stack yields notable advances in both precision and robustness for signal rectification tasks in vision and geometry.
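Expert collapse is commonly discouraged with an auxiliary load-balancing loss. The sketch below implements the form popularized by the Switch Transformer (number of experts times the sum over experts of dispatch fraction times mean routing probability); it is shown as a generic MoE technique, and the source does not state whether GeoMoE uses it. The function name `load_balance_loss` is hypothetical.

```python
import numpy as np

def load_balance_loss(probs):
    """Auxiliary loss that is smallest when experts are used evenly.

    probs: (M, E) routing probabilities, one row per routed element.
    """
    n_tokens, n_experts = probs.shape
    top1 = probs.argmax(axis=1)                             # chosen expert per row
    frac = np.bincount(top1, minlength=n_experts) / n_tokens  # dispatch fraction
    mean_prob = probs.mean(axis=0)                          # mean router probability
    return n_experts * float(frac @ mean_prob)

# Balanced routing: each of 4 experts receives a quarter of the rows.
balanced = np.tile(np.eye(4), (5, 1))
# Collapsed routing: every row is sent to expert 0.
collapsed = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (20, 1))

print(load_balance_loss(balanced))    # 1.0
print(load_balance_loss(collapsed))   # 4.0
```

In training, a term like this is typically added to the task loss with a small coefficient so that the router is nudged toward even utilization without overriding the main objective.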
The MoE-Enhanced Bi-Path Rectifier exemplifies a hybrid approach that systematically decomposes, contextually enhances, and adaptively routes sub-structural signal components through expert networks, advancing the modeling of heterogeneous, context-rich real-world data in a mathematically principled and empirically validated manner (Le et al., 1 Aug 2025).