Model Agreement via Anchoring

Published 26 Feb 2026 in cs.LG and cs.AI | (2602.23360v1)

Abstract: Numerous lines of work aim to control $\textit{model disagreement}$ -- the extent to which two machine learning models disagree in their predictions. We adopt a simple and standard notion of model disagreement in real-valued prediction problems, namely the expected squared difference in predictions between two models trained on independent samples, without any coordination of the training processes. We would like to be able to drive disagreement to zero with some natural parameter(s) of the training procedure using analyses that can be applied to existing training methodologies. We develop a simple general technique for proving bounds on independent model disagreement based on $\textit{anchoring}$ to the average of two models within the analysis. We then apply this technique to prove disagreement bounds for four commonly used machine learning algorithms: (1) stacked aggregation over an arbitrary model class (where disagreement is driven to 0 with the number of models $k$ being stacked), (2) gradient boosting (where disagreement is driven to 0 with the number of iterations $k$), (3) neural network training with architecture search (where disagreement is driven to 0 with the size $n$ of the architecture being optimized over), and (4) regression tree training over all regression trees of fixed depth (where disagreement is driven to 0 with the depth $d$ of the tree architecture). For clarity, we work out our initial bounds in the setting of one-dimensional regression with squared error loss -- but then show that all of our results generalize to multi-dimensional regression with any strongly convex loss.

Summary

  • The paper introduces a midpoint anchoring approach that uses the average of two independently trained models, within the analysis, to derive bounds on model disagreement.
  • It applies the method to stacked aggregation, gradient boosting, and non-convex model classes (neural networks and regression trees), proving a disagreement bound for each.
  • The proposed framework demonstrates that key training parameters, such as stack size and iteration count, can systematically drive disagreement to zero.

Introduction

Disagreement between independently trained machine learning models arises in many settings and is often referred to as model multiplicity or the Rashomon effect. In high-stakes settings, such disagreement can call into question the validity of decisions based on statistical models. The paper "Model Agreement via Anchoring" (2602.23360) presents a general technique for proving bounds on model disagreement, which this summary calls midpoint anchoring.

Disagreement Reduction Technique

The core idea is to anchor the analysis of disagreement to the midpoint model, defined as the average of two independently trained models. The midpoint appears only in the analysis: the two models are trained on independent samples without any coordination. This approach yields disagreement bounds that can be driven to zero as a function of natural parameters of the training procedure. The method is applied to widely used machine learning practices: stacked aggregation, gradient boosting, neural network training with architecture search, and regression tree training.

Specifically, the paper introduces a generalized bound on the expected disagreement between two models, $f_1$ and $f_2$, which are trained on independent samples from the same distribution: $D(f_1,f_2) = E_x[(f_1(x)-f_2(x))^2]$. By anchoring on the average model, $\bar{f} = (f_1+f_2)/2$, the authors demonstrate that the bounds can be driven to zero, contingent on specific training methodology parameters.
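For squared loss, one way to see why the midpoint is a useful anchor is a standard algebraic identity (offered here as an illustration, not necessarily the paper's exact statement): writing $L(f)$ for the expected squared error of $f$, we have $D(f_1,f_2) = 2\,(L(f_1) + L(f_2)) - 4\,L(\bar{f})$. So if neither model can be improved much by moving to the midpoint, the two models must nearly agree. The Python sketch below checks this identity numerically; the toy data, the two linear models, and all function names are hypothetical.

```python
import random

def disagreement(f1, f2, xs):
    """Empirical estimate of D(f1, f2) = E_x[(f1(x) - f2(x))^2]."""
    return sum((f1(x) - f2(x)) ** 2 for x in xs) / len(xs)

def midpoint(f1, f2):
    """Midpoint (anchor) model: the pointwise average of f1 and f2."""
    return lambda x: 0.5 * (f1(x) + f2(x))

def squared_loss(f, xs, ys):
    """Empirical mean squared error of f on the pairs (xs, ys)."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def disagreement_via_anchor(f1, f2, xs, ys):
    """Same quantity via the identity D = 2*(L(f1) + L(f2)) - 4*L(midpoint)."""
    f_bar = midpoint(f1, f2)
    loss_sum = squared_loss(f1, xs, ys) + squared_loss(f2, xs, ys)
    return 2.0 * loss_sum - 4.0 * squared_loss(f_bar, xs, ys)

# Hypothetical toy data and two hypothetical "independently trained" models.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
f1 = lambda x: 1.9 * x + 0.05
f2 = lambda x: 2.1 * x - 0.05

direct = disagreement(f1, f2, xs)
anchored = disagreement_via_anchor(f1, f2, xs, ys)
print(direct, anchored)  # the two values agree up to floating-point error
```

The identity holds pointwise for any target value, so the check does not depend on how the toy labels were generated.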

Applications of Midpoint Anchoring

Stacked Aggregation

Applying the midpoint anchoring technique to stacked aggregation shows that disagreement can be reduced by increasing the number of models $k$ included in the stack. The paper shows that the expected model disagreement $E[D(f_1,f_2)]$ can be upper-bounded and driven toward zero as $k$ increases, in line with the practical goal of stable ensembles.

Gradient Boosting

In the case of gradient boosting, a similar anchoring argument is employed, and the disagreement bound shrinks toward zero as the number of iterations $k$ grows. The bounds depend on properties of the model class and draw on the kind of convergence analysis typically associated with weak learners.
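As a rough illustration of how disagreement could be tracked against the iteration count $k$, the Python sketch below trains a toy gradient-boosting regressor (one-split stumps fit to squared-loss residuals) on two independent samples and estimates $D(f_1,f_2)$ on fresh inputs. This is an assumed toy setup, not the procedure analyzed in the paper; the data distribution, learning rate, and all function names are hypothetical.

```python
import random

def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    total = sum(residuals)
    left_sum, best = 0.0, None
    for rank in range(1, n):
        i, j = order[rank - 1], order[rank]
        left_sum += residuals[i]
        if xs[i] == xs[j]:
            continue  # no valid threshold between equal inputs
        left_mean = left_sum / rank
        right_mean = (total - left_sum) / (n - rank)
        # Maximizing this score is equivalent to minimizing squared error.
        score = rank * left_mean ** 2 + (n - rank) * right_mean ** 2
        if best is None or score > best[0]:
            best = (score, 0.5 * (xs[i] + xs[j]), left_mean, right_mean)
    _, thr, lo, hi = best
    return lambda x: lo if x <= thr else hi

def boost(xs, ys, rounds, lr=0.5):
    """Gradient boosting for squared loss: each round fits the residuals."""
    stumps, preds = [], [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)
        stumps.append(h)
        preds = [p + lr * h(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * h(x) for h in stumps)

def sample(rng, n):
    """Hypothetical data: y = x^2 plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys

rng = random.Random(0)
train_a, train_b = sample(rng, 200), sample(rng, 200)  # independent samples
test_xs = [rng.uniform(-1.0, 1.0) for _ in range(500)]

results = []
for k in (1, 5, 20):
    f1, f2 = boost(*train_a, rounds=k), boost(*train_b, rounds=k)
    d = sum((f1(x) - f2(x)) ** 2 for x in test_xs) / len(test_xs)
    results.append((k, d))
    print(f"k={k:2d}  estimated disagreement={d:.5f}")
```

Whether the printed values shrink with $k$ in this toy depends on the sample size, noise level, and learning rate; the paper's guarantee concerns its own analyzed procedure.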

Non-Convex Models

For non-convex model classes, such as neural networks and regression trees, the anchoring method still yields agreement results, even though two trained models may differ arbitrarily in parameter space. Here the relevant parameters are the size $n$ of the neural network architecture being optimized over and the depth $d$ of the regression trees. The technique shows that even complex, non-linear models can be brought into agreement in prediction space as these parameters grow.

Generalization

The paper further extends its results from one-dimensional regression with squared error loss to multi-dimensional regression with any strongly convex loss. This generalization involves changing the disagreement definition to the expected squared Euclidean distance between predictions, which broadens the applicability of the technique to multi-dimensional predictive modeling.
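The multi-dimensional disagreement measure described above can be estimated in the same way as the scalar one. The Python sketch below is a minimal illustration; the function name and the two toy vector-valued models are hypothetical.

```python
def disagreement_multi(f1, f2, xs):
    """Empirical estimate of E_x[ ||f1(x) - f2(x)||^2 ] for vector outputs."""
    total = 0.0
    for x in xs:
        # Squared Euclidean distance between the two prediction vectors.
        total += sum((a - b) ** 2 for a, b in zip(f1(x), f2(x)))
    return total / len(xs)

# Hypothetical models with 2-dimensional predictions.
g1 = lambda x: (x, 0.0)
g2 = lambda x: (x, 1.0)
value = disagreement_multi(g1, g2, [0.0, 0.5, 1.0])
print(value)
```

With one-dimensional outputs wrapped in a length-1 tuple, this reduces to the scalar definition of $D(f_1,f_2)$.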

Conclusion

This research lays out a general framework for bounding model disagreement through midpoint anchoring, with theoretical guarantees that apply to existing machine learning practices. By establishing broadly applicable bounds, the study shows which training parameters practitioners can increase to obtain agreement between independently trained models. The results are relevant to reproducibility, predictive churn, and fairness in decision-making processes that rely on learned models.
