Source of performance gains: isotropic prior choice versus Euclidean gradient dynamics
Determine whether, in backbone–projector self-supervised learning architectures that impose an isotropic prior (such as a Gaussian or Laplace prior) on projector outputs while using only backbone features at inference, the observed downstream performance gains are attributable to the specific choice of isotropic prior or to the generally favorable learning dynamics of Euclidean gradients.
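As a point of reference for how the two named priors differ, the sketch below evaluates the gradient of the negative log-density of a zero-mean Gaussian and of a coordinate-wise zero-mean Laplace with respect to a projector output vector. It is only an illustration of the two densities, not the training objective of the source paper; the function names, the scale parameters `sigma` and `b`, and the choice of a coordinate-wise Laplace are assumptions made here.

```python
import math

def gaussian_nll_grad(z, sigma=1.0):
    """Gradient of -log N(z; 0, sigma^2 I) w.r.t. z, which is z / sigma^2.

    Hypothetical helper: the pull toward the origin grows linearly with |z_i|.
    """
    return [zi / sigma**2 for zi in z]

def laplace_nll_grad(z, b=1.0):
    """(Sub)gradient of -log prod_i Laplace(z_i; 0, b) w.r.t. z, which is sign(z_i) / b.

    Hypothetical helper: the pull toward the origin has constant magnitude;
    0.0 is used as the subgradient at z_i == 0.
    """
    return [math.copysign(1.0, zi) / b if zi != 0 else 0.0 for zi in z]

# A made-up projector output, used only to contrast the two gradients.
z = [0.1, -2.0, 3.0]
print(gaussian_nll_grad(z))  # [0.1, -2.0, 3.0]
print(laplace_nll_grad(z))   # [1.0, -1.0, 1.0]
```

The two gradients differ in how strongly they act on large versus small coordinates, which is the kind of difference one might expect to show up downstream if the specific prior mattered.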
References
Experimentally, we observe that the choice between Gaussian and Laplace priors in non-sliced experiments has negligible impact on downstream classification performance. Consequently, it remains unclear whether the performance gains stem from the specific choice of isotropic prior or simply from the favorable learning dynamics of Euclidean gradients.