Market design for heterogeneous, compositional data goods

Design markets and mechanisms for datasets whose value is heterogeneous and compositional, including mechanisms that handle interdependent valuations among buyers, standards for provenance and quality certification enabling price discovery without full inspection, and computationally feasible attribution methods for models trained on millions of sources.

Background

Data’s heterogeneity and compositional value create interdependent valuations that standard markets do not handle well. Two further obstacles impede price discovery and compensation: the verification paradox (inspecting data before purchase enables copying it) and the intractability of attribution when models train on vast numbers of sources.

Section 6 of the paper calls for mechanism design, institutional standards for provenance and certification, and practical attribution approaches, drawing analogies to the historical development of standards and exchanges for grain and oil.

References

Building on these foundations, we outline four open research problems foundational to data economics: measuring context-dependent value, balancing governance with privacy, estimating data's contribution to production, and designing mechanisms for heterogeneous, compositional goods.

The Economics of AI Training Data: A Research Agenda (2510.24990 - Oderinwale et al., 28 Oct 2025) in Abstract