
Collaborative Prediction: Tractable Information Aggregation via Agreement (2504.06075v1)

Published 8 Apr 2025 in cs.LG, cs.DS, and cs.GT

Abstract: We give efficient "collaboration protocols" through which two parties, who observe different features about the same instances, can interact to arrive at predictions that are more accurate than either could have obtained on their own. The parties only need to iteratively share and update their own label predictions, without either party ever having to share the actual features that they observe. Our protocols are efficient reductions to the problem of learning on each party's feature space alone, and so can be used even in settings in which each party's feature space is illegible to the other, which arises in models of human/AI interaction and in multi-modal learning. The communication requirements of our protocols are independent of the dimensionality of the data. In an online adversarial setting we show how to give regret bounds on the predictions that the parties arrive at with respect to a class of benchmark policies defined on the joint feature space of the two parties, despite the fact that neither party has access to this joint feature space. We also give simpler algorithms for the same task in the batch setting in which we assume that there is a fixed but unknown data distribution. We generalize our protocols to a decision theoretic setting with high dimensional outcome spaces, where parties communicate only "best response actions." Our theorems give a computationally and statistically tractable generalization of past work on information aggregation amongst Bayesians who share a common and correct prior, as part of a literature studying "agreement" in the style of Aumann's agreement theorem. Our results require no knowledge of (or even the existence of) a prior distribution and are computationally efficient. Nevertheless we show how to lift our theorems back to this classical Bayesian setting, and in doing so, give new information aggregation theorems for Bayesian agreement.

Summary

Collaborative Prediction: Tractable Information Aggregation via Agreement

The paper "Collaborative Prediction: Tractable Information Aggregation via Agreement" by Natalie Collina et al. addresses the challenge of how two parties, possessing distinct sets of features for the same instances, can interact to form predictions that surpass those made independently by either party. This problem is set in environments where feature sharing is impractical due to privacy concerns, data modality constraints, or legal restrictions, such as in human/AI interaction scenarios and vertically federated learning.

Key Contributions

  1. Collaboration Protocols: The authors develop efficient "collaboration protocols" through which two parties iteratively exchange label predictions, rather than raw features, to arrive at more accurate joint predictions (a minimal sketch of this exchange appears after this list). The approach is advantageous when direct feature sharing is infeasible or undesirable.
  2. Regret Bounds in Adversarial Settings: In an online adversarial setting, the paper establishes regret bounds for the predictions the parties converge to, measured against a class of benchmark policies defined on the joint feature space, even though neither party ever has access to that joint space.
  3. Batch Setting Algorithms: For the batch setting, in which data are drawn from a fixed but unknown distribution, the paper gives simpler algorithms that achieve analogous aggregation guarantees, again without any exchange of features.
  4. Generalization to Bayesian Settings: The results give a computationally and statistically tractable generalization of classical Bayesian information aggregation that requires no knowledge of (or even the existence of) a common prior, and, when lifted back to the Bayesian setting, they yield new information aggregation theorems for Bayesian agreement.
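
The following is a minimal sketch of the label-exchange loop underlying such a protocol. The interface here (a predict(own_features, other_prediction) method on each party) is hypothetical, and the actual protocols in the paper use specific update rules, agreement checks, and round bounds that this sketch does not reproduce; it only illustrates the information flow, in which parties exchange label predictions but never raw features.

# Minimal, hypothetical sketch of a two-party collaborative prediction loop.
# Each party wraps a model trained on its own feature space and exposes
# predict(own_features, other_prediction); neither party ever sees the
# other's features, only its latest label prediction.

def collaborate(party_a, party_b, x_a, x_b, tolerance=1e-3, max_rounds=20):
    """Iteratively exchange label predictions until the parties approximately agree."""
    pred_a = party_a.predict(x_a, other_prediction=None)       # A predicts from its own features alone
    pred_b = party_b.predict(x_b, other_prediction=pred_a)     # B responds using its features plus A's prediction

    for _ in range(max_rounds):
        if abs(pred_a - pred_b) <= tolerance:                  # approximate agreement reached
            break
        pred_a = party_a.predict(x_a, other_prediction=pred_b)  # A updates on B's latest prediction
        pred_b = party_b.predict(x_b, other_prediction=pred_a)  # B updates on A's latest prediction

    return 0.5 * (pred_a + pred_b)                             # one simple way to combine the final predictions

Note that the communication in this loop depends only on the number of rounds and the size of each prediction, not on the dimensionality of either party's features, which mirrors the paper's claim that communication requirements are independent of the data's dimensionality.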

Theoretical Implications

The collaboration protocols, built on iterative sharing of label predictions, demonstrate the power of agreement as a mechanism for information aggregation, a concept rooted in Aumann's agreement theorem. The work adapts this classical foundation to a computational setting that requires no distributional assumptions. A key technical ingredient is a weak learning condition that formally characterizes when swap regret with respect to one party's set of features implies external regret with respect to another.
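
For reference, the two regret notions mentioned above can be stated in their standard online-learning forms. The symbols below (loss $\ell$, features $x_t$, labels $y_t$, predictions $\hat{y}_t$, benchmark class $\mathcal{F}$, and remapping $\phi$) are generic placeholders used for illustration; the paper's precise weak learning condition, which relates these quantities across the two parties' feature spaces, is not reproduced here.

$$\mathrm{Reg}_{\mathrm{ext}}(T) \;=\; \sum_{t=1}^{T} \ell(\hat{y}_t, y_t) \;-\; \min_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell(f(x_t), y_t)$$

$$\mathrm{Reg}_{\mathrm{swap}}(T) \;=\; \sum_{t=1}^{T} \ell(\hat{y}_t, y_t) \;-\; \min_{\phi : \hat{\mathcal{Y}} \to \hat{\mathcal{Y}}} \sum_{t=1}^{T} \ell(\phi(\hat{y}_t), y_t)$$

External regret compares the learner's cumulative loss to the best fixed benchmark in hindsight, while swap regret compares it to the best post-hoc remapping of the learner's own predictions; relating guarantees of the second kind on one feature space to guarantees of the first kind on another is the role the weak learning condition plays, as described above.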

Practical Implications

These protocols provide a framework applicable to a variety of domains, especially those involving multi-modal data or strict privacy requirements. By showing that interaction alone, with communication independent of the data's dimensionality, can yield predictions competitive with benchmarks on the joint feature space, the work suggests new strategies for designing AI systems with strong collaborative potential, which is particularly relevant to human-AI collaboration.

Future Developments in AI

The extension of these theoretical frameworks to real-world applications could significantly impact areas that rely on multi-source data, such as healthcare, finance, and consumer applications. Future research might explore more refined communication strategies and extend these protocols to settings beyond those studied here, including adversarial and beyond-worst-case models that better reflect real-world uncertainty.

Overall, this paper lays the groundwork for collaborative prediction methodologies, highlighting the predictive gains achievable through efficient algorithmic interaction. As practical implementations advance, maintaining computational efficiency while managing growing data dimensionality and diverse feature spaces will remain a pivotal challenge.
