Local Continuity Module (LCM)

Updated 2 December 2025

LCM is a module that integrates domain-specific local correspondence and compact modeling to capture fine-grained spatial and inter-image relationships.
In co-salient object detection, LCM uses multi-stage pairwise correlation and 3D convolutions to fuse local and global features, significantly improving accuracy.
For point cloud masked modeling, LCM leverages a locally constrained encoder and a Mamba-based decoder to reduce computational cost while boosting reconstruction fidelity.

The Local Continuity Module (LCM) designates two distinct, high-impact architectural strategies for modeling fine-grained local relationships: (1) Local Correspondence Modeling in co-salient object detection, and (2) Locally Constrained Compact Models for efficient masked point modeling. Both lines of work replace or augment standard attention frameworks with domain-specific locality-aware components to encode spatial or inter-image affinities, achieving improvements in both accuracy and computational efficiency. The principal designs are exemplified by the LCM in GLNet for co-salient object detection (Cong et al., 2022), and the Locally Constrained Compact Model for point-cloud masked modeling (Zha et al., 2024).

1. LCM in Co-Salient Object Detection: Architecture and Operations

In the context of co-salient object detection (CoSOD), the Local Correspondence Modeling (LCM) module is a core component of the global-and-local collaborative learning architecture (GLNet), engineered to explicitly capture local inter-image correspondence for robust co-saliency prediction (Cong et al., 2022).

The LCM operates on a feature map $F^n_{ia} \in \mathbb{R}^{C \times H \times W}$ for each image $n$ in a group of $N$ images, typically with $C=512$ for VGG16-based backbones. For each image $k$ , LCM computes pairwise local correspondences with all other images $j\neq k$ via a multi-stage Pairwise Correlation Transformation (PCT):

Subspace Mapping: 1×1 convolution projects each $F^n_{ia}$ to $U^n \in \mathbb{R}^{C \times H \times W}$ .
Affinity Estimation: Each $U^k$ and $U^j$ are reshaped to $\mathbb{R}^{C \times (HW)}$ ; their affinities $A^{kj} \in \mathbb{R}^{(HW) \times (HW)}$ are computed as transposed matrix product, measuring pixel-wise similarity.
Score Pooling and Normalization: For image $k$ , globally pooled local maxima $\bar A^{kj}$ and softmax normalization yield a weighting map $\tilde A^{kj} \in \mathbb{R}^{1 \times H \times W}$ .
Feature Fusion with Attention: Local affinity maps are broadcast and fused into residual-attention-weighted feature flows: $W^{kj} = \mathrm{SA}(\mathrm{CA}(\tilde A^{kj} \odot F^k_{ia} \oplus F^k_{ia}))$ .
Inter-image Aggregation: These $N-1$ local maps for each $k$ are stacked, followed by stacked 3D convolutions (kernel $2 \times 3 \times 3$ ), yielding the local inter-image descriptor $P^k$ .

Internal attention mechanisms (SE-based channel attention and CBAM-style spatial attention) refine both fusion and local context. Key architectural details include two 3D convolutions (approx. $9.4$M parameters for $C=512$ ), a $1\times1$ conv, and attention modules, totaling $10$–$11$M parameters per image $k$ .

2. LCM Contribution to Global-and-Local Feature Fusion

The LCM’s output $P^k$ provides fine-grained pairwise local descriptors, which are fused with global group-level features $G \in \mathbb{R}^{C \times H \times W}$ from the Global Correspondence Modeling (GCM) module. Fusion occurs via the Global-and-Local Correspondence Aggregation (GLA):

$[G, P^k] \in \mathbb{R}^{2C \times H \times W}$ is fused using a $2 \times 3 \times 3$ 3D convolution, ReLU, and subsequent channel/spatial attention operations.
This yields the final inter-image feature $F^k_{ie}$ incorporated into downstream co-saliency prediction.

Ablation studies demonstrate that removing the LCM results in significant performance degradation: on the Cosal2015 dataset, $F_\beta$ drops from $0.8936$ to $0.8550$ and $MAE$ increases from $0.0648$ to $0.0783$, indicating substantial loss of co-saliency discrimination especially in groups with strong intra-class variance (Cong et al., 2022).

3. LCM in Point Cloud Modeling: Locally Constrained Compact Model Design

Separately, the Locally Constrained Compact Model (LCM) for masked point modeling (Zha et al., 2024) establishes a locality-driven alternative to quadratic-complexity Transformer frameworks, targeting redundancy reduction and linear scaling.

The architecture consists of two principal modules:

Locally Constrained Compact Encoder (LCCE): Replaces global self-attention with local aggregation layers. Each patch token finds its $k$ -nearest neighbors via geometric KNN on patch centers, aggregating local structure using concatenation and local MLPs, followed by channel-wise max-pooling. The static neighbor graph ( $N_i$ ) is shared across all encoder layers, enforcing locality and continuity.
Locally Constrained Mamba-Based Decoder (LCMD): Integrates a linear-time State-Space Model (SSM, as in Mamba) with a local-constrained feed-forward network (LCFFN). The decoder preserves mutual information for masked patch reconstruction by ensuring only geometric neighbors communicate, achieving robustness to patch ordering and high reconstruction fidelity.

This design compresses a Point-MAE backbone to $2.7$M parameters (from $22.1$M) and $1.3$G FLOPs (from $4.8$G), while increasing ScanObjectNN OBJ-BG accuracy from $92.67\%$ to $94.51\%$ and ScanNetV2 AP $_{0.25}$ from $59.5$ to $64.7$ (+$5.2$) (Zha et al., 2024).

4. Mathematical Operations in LCM Modules

Affinity matrix between $U^k, U^j$ :

$A^{kj} = [\Re_1(U^k)]^T \otimes \Re_1(U^j)$

Global score and weighting:

$\bar A^{kj} = \max_{\text{rows}} A^{kj}, \quad \tilde A^{kj} = \mathrm{reshape}\bigl(\mathrm{softmax}(\bar A^{kj})\bigr)$

Fusion:

$W^{kj} = \mathrm{SA}(\mathrm{CA}(\tilde A^{kj} \odot F^k_{ia} \oplus F^k_{ia}))$

3D Convolutional Aggregation:

$P^k = \delta(f^{(2)}_{2\times3\times3}(\delta(f^{(1)}_{2\times3\times3}(\widehat W^k))))$

Local Aggregation Layer:

For each token $i$ :

$a_i = \max_{1 \leq j \leq k} g_\downarrow(\mathrm{concat}(X[i], X[N_i[j]]))$

$L_i = g_\uparrow(a_i)$

Mutual Information Guarantee (Decoder): The mutual information preserved by the Mamba SSM decoder:

$I(Y_2^{Mamba}; X_1, X_2) \geq I(Y_2^{Trans}; X_1, X_2)$

due to the data processing inequality and the linear nature of the SSM.

5. Computational Characteristics and Ablation Insights

Both forms of LCM dramatically reduce computational cost by confining feature interactions to local neighborhoods.

Parameter Efficiency: In point cloud MPM, LCM reduces parameter count by $88\%$ (from $22.1$M to $2.7$M) and FLOPs by $73\%$ (from $4.8$G to $1.3$G).
Accuracy Gains: LCM-Point-MAE outperforms Transformer-based Point-MAE by $+1.84\%$ (OBJ-BG), $+0.67\%$ (OBJ-ONLY), and $+0.60\%$ (PB-T50-RS).
Key Hyperparameters: Local aggregation with $k=5$ neighbors provides optimal accuracy/resource tradeoff; geometric KNN matches or slightly outperforms dynamic/feature-space KNN.
Ablations: In the point-cloud domain, inclusion of both local aggregation and FFN in the encoder yields optimal accuracy ( $88.06\%$ PB-T50-RS), but local aggregation is the dominant driver, accounting for most performance improvements.
Decoder Variants: Mamba+LCFFN configuration yields the highest masked reconstruction accuracy ( $89.35\%$ on ScanObjectNN PB-T50-RS).

Model/Setting	Params	FLOPs	Acc./Metric
Transformer (PM)	22.1M	4.8G	OBJ-BG 92.67%
LCM (PM)	2.7M	1.3G	OBJ-BG 94.51%
GLNet w/ LCM	10–11M/𝑘	—	$F_\beta=.8936$
GLNet w/o LCM	<10M	—	$F_\beta=.8550$

6. Theoretical Significance and Limitations

LCM-based designs replace non-local self-attention with constrained, neighborhood-preserving aggregation, underpinned by the principle that most relevant contextual information in highly structured domains (e.g., 3D space, local object saliency) is localized. In point cloud modeling, information-theoretic analysis shows that the locally constrained Mamba decoder retains at least as much mutual information about masked regions as a Transformer, while relying on linear operations.

A plausible implication is that for structured data with clear geometric or semantic neighborhoods, LCM-like modules can deliver superior efficiency–accuracy profiles compared to transformer-based paradigms, provided domain knowledge about locality is available. However, for modeling long-range dependencies or highly non-local relationships, purely local architectures may require auxiliary modules or hybrid fusion.

7. Impact and Current Use

LCMs have proven critical both for improved model performance and for making inference or pretraining feasible on larger, more realistic inputs without the quadratic cost of classical attention. In image co-saliency detection, LCM enables fine-grained correspondence learning between images, overcoming limitations of global feature pooling. In point cloud masked modeling, the Locally Constrained Compact Model supports scalable pretraining and robust transfer across 3D tasks, with empirical evidence showing up to $88\%$ parameter reductions with no loss—and sometimes improvement—in downstream accuracy (Cong et al., 2022, Zha et al., 2024). In both cases, architectural modularity allows seamless integration with global contextual modeling, supporting a hierarchy of correspondence cues.

Markdown Upgrade to Chat

References (2)

Global-and-Local Collaborative Learning for Co-Salient Object Detection (2022)

LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling (2024)

Topic to Video (Beta)

No one has generated a video about this topic yet.

Whiteboard

No one has generated a whiteboard explanation for this topic yet.

Follow Topic

Get notified by email when new papers are published related to Local Continuity Module (LCM).

Local Continuity Module (LCM)

1. LCM in Co-Salient Object Detection: Architecture and Operations

2. LCM Contribution to Global-and-Local Feature Fusion

3. LCM in Point Cloud Modeling: Locally Constrained Compact Model Design

4. Mathematical Operations in LCM Modules

LCM in CoSOD (Cong et al., 2022):

LCM in Point Cloud Masked Modeling (Zha et al., 2024):

5. Computational Characteristics and Ablation Insights

6. Theoretical Significance and Limitations

7. Impact and Current Use

Topic to Video (Beta)

Whiteboard

Follow Topic

Continue Learning

Local Continuity Module (LCM)

1. LCM in Co-Salient Object Detection: Architecture and Operations

2. LCM Contribution to Global-and-Local Feature Fusion

3. LCM in Point Cloud Modeling: Locally Constrained Compact Model Design

4. Mathematical Operations in LCM Modules

LCM in CoSOD (Cong et al., 2022):

LCM in Point Cloud Masked Modeling (Zha et al., 2024):

5. Computational Characteristics and Ablation Insights

6. Theoretical Significance and Limitations

7. Impact and Current Use

Topic to Video (Beta)

Whiteboard

Follow Topic

Continue Learning

Related Topics