Scalability of CALM beyond two models

Determine whether Composition to Augment Language Models (CALM), which merges a base large language model with a specialized model via cross-attention, can scale beyond two constituent models without requiring a quadratic number of pairwise cross-attention connections.

Background

In its discussion of architectural integration methods, the paper describes CALM, which merges representations from a base LLM and a specialized model using cross-attention. The approach designates one model as the anchor and the other as the augmenting model. The authors note uncertainty about CALM's scalability: extending it beyond two models may require a quadratic number of pairwise cross-attention connections, which could hinder practical deployment in larger ensembles.
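
To ground the discussion, here is a minimal PyTorch sketch of the composition pattern described above: the anchor model's hidden states cross-attend to the augmenting model's hidden states through a small trained module, while both base models can remain unchanged. All names (`CrossAttentionComposer`) and dimensions are illustrative assumptions, not definitions from the CALM paper.

```python
import torch
import torch.nn as nn

class CrossAttentionComposer(nn.Module):
    """Illustrative CALM-style composition layer (hypothetical names).

    The anchor model's hidden states act as queries over the augmenting
    model's hidden states; only this module's parameters need training.
    """

    def __init__(self, anchor_dim: int, aug_dim: int, num_heads: int = 8):
        super().__init__()
        # Project augmenting-model states into the anchor's hidden size.
        self.proj = nn.Linear(aug_dim, anchor_dim)
        self.cross_attn = nn.MultiheadAttention(
            anchor_dim, num_heads, batch_first=True
        )
        self.norm = nn.LayerNorm(anchor_dim)

    def forward(self, anchor_h: torch.Tensor, aug_h: torch.Tensor) -> torch.Tensor:
        # anchor_h: (batch, seq, anchor_dim); aug_h: (batch, seq, aug_dim)
        kv = self.proj(aug_h)
        attended, _ = self.cross_attn(query=anchor_h, key=kv, value=kv)
        # Residual connection preserves the anchor's own representation.
        return self.norm(anchor_h + attended)


# Usage: fuse hidden states from an anchor (dim 4096) and an
# augmenting model (dim 2048) at one layer.
composer = CrossAttentionComposer(anchor_dim=4096, aug_dim=2048)
anchor_h = torch.randn(2, 16, 4096)
aug_h = torch.randn(2, 16, 2048)
fused = composer(anchor_h, aug_h)  # (2, 16, 4096)
```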

This open question asks whether CALM can be extended to more than two models efficiently, and whether the architecture fundamentally requires a quadratic number of pairwise connections as the ensemble grows.
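
To make the scaling concern concrete: a naive all-pairs extension of the scheme above would need one composition module per pair of models, so the module count grows quadratically. The counting below illustrates this assumed all-pairs extension; it is not a construction given in the paper.

```python
# Naive all-pairs extension: one composition module per model pair.
# With n constituent models this is n*(n-1)/2 cross-attention links.
for n in (2, 4, 8, 16):
    print(f"{n} models -> {n * (n - 1) // 2} pairwise connections")
# 2 models -> 1, 4 -> 6, 8 -> 28, 16 -> 120
```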

References

It is also not clear how this construction scales beyond two models, as it may require a quadratic number of pairwise cross-attention connections.

A Theoretical Framework for Modular Learning of Robust Generative Models (2602.17554 - Cortes et al., 19 Feb 2026) in Section 2 (Related Work), Mixtures, Merging, and Composition