
SimMerge: Merging Strategies for Models & Code

Updated 16 January 2026
  • SimMerge is a framework that applies similarity measures for merging LLM checkpoints and semistructured code, addressing scalability and semantic alignment challenges.
  • It leverages both probe-based and structural features to predict optimal merge operations, preserving expert performance and recovering auxiliary capabilities.
  • Empirical results demonstrate significant improvements, with up to 65% gap closure in model merges and a 41% reduction in conflicts for code merging.

SimMerge refers to two distinct research directions: (1) predictive model merging for LLMs and (2) semistructured code merging based on syntactic separators. Both use similarity signals to improve merging outcomes at scale, whether for neural network checkpoints or source code artifacts. The following sections detail both variants as presented in the underlying arXiv publications (Bolton et al., 14 Jan 2026; Cavalcanti et al., 2024).

1. Predictive Model Merging: Overview and Problem Statement

SimMerge for LLMs addresses the model composition challenge stemming from the proliferation of fine-tuned checkpoints. The objective is to merge multiple models in weight space such that expert (on-domain) performance is preserved and auxiliary capabilities are recovered, without incurring the prohibitive cost of exhaustive merge-and-evaluate searches across operators, model subsets, and merge orders. SimMerge provides a predictive, task-agnostic framework that leverages similarity features computed from small sets of unlabeled probes and model weights, allowing for accurate pre-merge selection of merge operators, model subsets, and merge ordering (Bolton et al., 14 Jan 2026).
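To make the weight-space merge concrete, the following is a minimal sketch of the simplest such operator, linear parameter averaging over PyTorch state dicts. The helper name and framework choice are illustrative assumptions; SimMerge itself selects among richer operators (see Section 3) rather than prescribing this one.

```python
# Minimal sketch of a baseline weight-space merge (simple linear averaging).
# The helper name and use of PyTorch state dicts are illustrative assumptions;
# SimMerge selects among richer operators (Linear, SLERP, TIES) per merge step.
from typing import Dict, Sequence
import torch


def linear_merge(state_dicts: Sequence[Dict[str, torch.Tensor]],
                 weights: Sequence[float]) -> Dict[str, torch.Tensor]:
    """Merge checkpoints with identical architectures by weighted parameter averaging."""
    assert len(state_dicts) == len(weights) and abs(sum(weights) - 1.0) < 1e-6
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged
```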

2. Similarity Signal Extraction and Feature Construction

Similarity signals in SimMerge are categorized into functional (probe-based) and structural (weight-based) features. For any ordered model pair $(m_a, m_b)$ and task $t$, probe-based features use an unlabeled input set $\mathcal{P}_t$ to compute metrics such as the average Kullback–Leibler divergence of predictive distributions, cosine similarity of layerwise activations, and attention-pattern similarities. Structural features include the cosine similarity and Euclidean distance between flattened model parameters, as well as their respective norms. All sequence-valued metrics are summarized using robust statistics and concatenated with a learnable task embedding $c(t)$, yielding a comprehensive input $\widetilde{x}(m_a, m_b, t)$ for merge-operator selection.
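A compact sketch of this feature construction follows, assuming PyTorch models that return logits for the probe inputs; the specific metrics, robust statistics, and function names are illustrative rather than the paper's exact pipeline.

```python
# Sketch of the similarity-feature construction described above, assuming
# PyTorch models exposing logits; the exact probe metrics, robust statistics,
# and task-embedding dimensions in the paper may differ.
import torch
import torch.nn.functional as F


def probe_features(model_a, model_b, probes: torch.Tensor) -> torch.Tensor:
    """Functional (probe-based) similarity: KL of predictive distributions."""
    with torch.no_grad():
        log_pa = F.log_softmax(model_a(probes), dim=-1)
        log_pb = F.log_softmax(model_b(probes), dim=-1)
        kl = F.kl_div(log_pb, log_pa, log_target=True, reduction="none").sum(-1)
    # Summarize the sequence-valued metric with robust statistics.
    return torch.stack([kl.median(), kl.quantile(0.25), kl.quantile(0.75)])


def structural_features(model_a, model_b) -> torch.Tensor:
    """Structural (weight-based) similarity on flattened parameters."""
    wa = torch.cat([p.flatten() for p in model_a.parameters()])
    wb = torch.cat([p.flatten() for p in model_b.parameters()])
    return torch.stack([
        F.cosine_similarity(wa, wb, dim=0),  # directional agreement
        (wa - wb).norm(),                    # Euclidean distance
        wa.norm(), wb.norm(),                # individual norms
    ])


def merge_features(model_a, model_b, probes, task_embedding) -> torch.Tensor:
    """Concatenate functional, structural, and task features into x~(m_a, m_b, t)."""
    return torch.cat([probe_features(model_a, model_b, probes),
                      structural_features(model_a, model_b),
                      task_embedding])
```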

3. Predictive Merge-Selection and Plan Construction

SimMerge employs a two-layer MLP trained via cross-entropy to predict the optimal merge operator $\widehat{o}$ from the feature vector $\widetilde{x}(m_a, m_b, t)$, considering operators such as Linear, SLERP, and TIES. For multi-way merges ($k > 2$), merge plans $\pi$ are represented by concatenated pairwise feature blocks and are scored using an MLP regressor that approximates downstream utility, with intermediate features propagated using convexity and triangle-inequality bounds. The full plan-selection process consists of enumerating or sampling candidate plans, scoring each via $f_{\mathrm{plan}}$, and executing the plan with adaptive operator selection at each step.
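The operator-selection component can be illustrated with a small classifier over the pairwise feature vector; the hidden width, operator ordering, and training loop below are assumptions for the sketch, not the reported architecture.

```python
# Sketch of the two-layer operator-selection MLP; hidden size, operator set
# ordering, and training details are assumptions for illustration.
import torch
import torch.nn as nn

OPERATORS = ["linear", "slerp", "ties"]


class OperatorSelector(nn.Module):
    """f_op: maps pairwise merge features x~(m_a, m_b, t) to a merge operator."""

    def __init__(self, feature_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, len(OPERATORS)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # logits over {Linear, SLERP, TIES}


def train_step(selector, optimizer, features, best_op_idx):
    """One cross-entropy step against labelled best-operator data."""
    logits = selector(features)
    loss = nn.functional.cross_entropy(logits, best_op_idx)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```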

4. Generalization, Scalability, and Online Adaptation

The SimMerge selectors $f_{\mathrm{op}}$ and $f_{\mathrm{plan}}$ are trained solely on 2-way merges of 7B-parameter models but generalize without retraining to multi-way merges and 111B-parameter models. In empirical evaluations, SimMerge outperforms fixed operators (e.g., Linear, SLERP) on both multi-way and scale-transferred merges. For online adaptation, SimMerge introduces a contextual-bandit variant employing neural-linear models and Linear Thompson Sampling (LinTS), supporting rapid adaptation to new tasks, models, and operators. LinTS approaches oracle performance in minimizing regret and maintaining gap-closed metrics under partial-feedback conditions.
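As a rough illustration of the online variant, the sketch below implements plain Linear Thompson Sampling over candidate arms (e.g., merge operators or plans). The neural-linear feature extractor, prior scales, and reward definition used in the paper are abstracted away; everything below is an assumption-laden simplification.

```python
# Minimal Linear Thompson Sampling sketch for the online variant; the prior
# scale, feature construction, and reward definition are illustrative
# assumptions rather than the paper's exact neural-linear setup.
import numpy as np


class LinTS:
    """One Bayesian linear reward model per arm (candidate operator or plan)."""

    def __init__(self, n_arms: int, dim: int, noise: float = 1.0, prior: float = 1.0):
        self.A = [np.eye(dim) / prior for _ in range(n_arms)]  # precision matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]        # reward-weighted sums
        self.noise = noise

    def select(self, x: np.ndarray) -> int:
        """Sample a reward parameter per arm and pick the highest predicted utility."""
        scores = []
        for A, b in zip(self.A, self.b):
            cov = np.linalg.inv(A)
            theta = np.random.multivariate_normal(cov @ b, self.noise * cov)
            scores.append(x @ theta)
        return int(np.argmax(scores))

    def update(self, arm: int, x: np.ndarray, reward: float) -> None:
        """Incorporate the observed (partial) feedback for the chosen arm."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```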

5. Quantitative Results and Comparative Performance

In pairwise 7B model merges, SimMerge closes 65.0% of the expert–auxiliary gap, outperforming the best fixed operator (41.8%). Per-domain results show substantial increases in gap closed across the Code, Math, Multilingual, and RAG domains. For multi-way ($k=3,4$) 7B merges, SimMerge maintains a superior trade-off between expert degradation and auxiliary gain compared to fixed operators. In large-scale (111B, $k=3$) experiments, SimMerge achieves –7.8% degradation vs. the expert with +42.7% auxiliary gain, surpassing fixed-operator benchmarks. The online bandit variant closely tracks oracle performance in both cumulative regret and final macro-averaged gap closure (Bolton et al., 14 Jan 2026).
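One plausible reading of the gap-closed metric quoted above is the fraction of the expert-to-specialist gap on an auxiliary task that the merged model recovers; the snippet below encodes that reading, which is an assumption rather than the paper's stated definition.

```python
# A plausible reading of the "gap closed" metric cited above: the fraction of
# the difference between an expert's auxiliary-task score and the auxiliary
# specialist's score that the merged model recovers. This definition is an
# assumption for illustration; consult the paper for the exact formula.
def gap_closed(merged_score: float, expert_score: float, specialist_score: float) -> float:
    """Return the fraction of the expert-to-specialist gap closed by the merge."""
    gap = specialist_score - expert_score
    if gap == 0:
        return 0.0
    return (merged_score - expert_score) / gap


# Example: the expert scores 40 on an auxiliary task, the auxiliary specialist
# scores 80, and the merged model scores 66 -> 65% of the gap is closed.
print(gap_closed(66.0, 40.0, 80.0))  # 0.65
```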

6. Semistructured Code Merge Using Syntactic Separators

SimMerge (Sesame) for code merge introduces a formalism in which language-specific separator tokens $S$ (e.g., {, }, ;) are used to partition source files into non-overlapping segments $\omega_j$ and separators. All versions (base, left, right) are segmented and aligned accordingly. Each segment, placed on its own line and marked by placeholders, is merged using an unstructured tool (e.g., diff3). Postprocessed output recovers the original separator placement and wraps conflicts at the semantic block level. The algorithm exhibits linear time complexity in practice, avoids full AST matching, and provides significant reductions in spurious conflict rates compared to diff3 and robust semistructured engines (Cavalcanti et al., 2024).
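The normalization step at the heart of this approach can be sketched as follows, assuming a C-like separator set and a regular expression as a stand-in for the tool's tokenizer; the placeholder scheme, alignment, and diff3 invocation are simplified away.

```python
# Sketch of the separator-based normalization described above: source is split
# at language-specific separators so that an unstructured line merge (e.g.,
# diff3) operates on semantic segments. The separator set, placeholder scheme,
# and reassembly details are simplified assumptions, not Sesame's exact logic.
import re

SEPARATORS = ["{", "}", ";"]          # language-specific separator tokens S
SEP_PATTERN = re.compile(r"([{};])")  # capturing group keeps the separators


def normalize(source: str) -> list[str]:
    """Partition source into segments and separators, one token per line."""
    return [p for p in SEP_PATTERN.split(source) if p.strip()]


def denormalize(parts: list[str]) -> str:
    """Re-join merged segments, restoring separators inline (whitespace simplified)."""
    out = []
    for p in parts:
        out.append(p + "\n" if p in SEPARATORS else p)
    return "".join(out)


# Usage: normalize base, left, and right versions, hand the resulting line
# lists to an unstructured three-way merge such as diff3, then denormalize.
base = "int f(){return 0;}"
print(normalize(base))  # ['int f()', '{', 'return 0', ';', '}']
```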

Tool     Total Conflicts   Conflicting Files
diff3    2413              1090
s3m      1632              832
Sesame   1413              657

The separator-based approach achieves a 41% reduction in total conflicts versus diff3 and a 13% reduction versus s3m, but it introduces textual-alignment artifacts of its own and yields false negatives when semantic conflicts cross separator boundaries.

7. Limitations and Research Directions

SimMerge for LLMs is currently limited to existing merge operator sets; opportunities exist for incorporating Fisher-weighted or LoRA merges. Probe set optimization, tightness of metric propagation, true task-agnostic generalization, and scalable plan enumeration remain open. The framework could benefit from integration with continuous fine-tuning mechanisms.

The separator-based code merge exhibits limitations in syntactic alignment precision and false-negative detection for cross-block semantic conflicts. Extending the method to new languages requires separator calibration and could utilize hybrid fallbacks for "hard" merge cases. Semantic analysis overlays, as well as improved heuristics for separator grouping, delineate key future directions for code merge accuracy (Cavalcanti et al., 2024).

A plausible implication is that learning to select merge operators and plans from similarity signals—whether for neural network checkpoints or textual artifacts—constitutes a tractable strategy for scalable merging in the presence of vast candidate catalogs and limited evaluation resources.
