Identifiability in Unlinked Linear Regression: Some Results and Open Problems (2507.14986v1)
Abstract: Classical linear regression tacitly assumes that the link between covariates and responses is fully known. In Unlinked Linear Regression (ULR), this link is partially or completely missing. Although the causes of this missingness vary, a common challenge for statistical inference is the potential non-identifiability of the regression parameter. In this note, we review the existing literature on identifiability when the $d \ge 2$ components of the covariate vector are independent and identically distributed. When these components have different distributions, we show that similar theorems cannot be proved in the general case. Nevertheless, we prove some identifiability results, either under additional parametric assumptions for $d \ge 2$ or under conditions on the fourth moments in the case $d = 2$. Finally, we draw some interesting connections between ULR and the well-established field of Independent Component Analysis (ICA).
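
As a small illustration of why identifiability can fail, assume the commonly used ULR formulation $Y \overset{d}{=} \beta^\top X + \varepsilon$ with $\varepsilon$ independent of $X$. This notation is an assumption made here for illustration; the abstract does not state the model explicitly. Take $d = 2$ with i.i.d. components $X_1, X_2$, and let $\beta = (b_1, b_2)$ and $\beta' = (b_2, b_1)$. Since $(X_1, X_2) \overset{d}{=} (X_2, X_1)$,

$$
\beta^\top X = b_1 X_1 + b_2 X_2 \overset{d}{=} b_1 X_2 + b_2 X_1 = \beta'^\top X .
$$

Under these assumptions, the distribution of $Y$ cannot distinguish $\beta$ from $\beta'$, so in the i.i.d. setting $\beta$ can be identified at best up to a permutation of its entries.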