JEPA-SCORE: Self-Supervised Density Estimation

Updated 16 October 2025
  • JEPA-SCORE is a closed-form estimator that computes sample likelihoods using the Jacobian of a JEPA model.
  • It leverages anti-collapse regularization to infer data density, aligning computed scores with ground-truth log-densities across various datasets.
  • Its practical applications include data curation, outlier detection, and scalable density estimation without requiring explicit generative models.

JEPA-SCORE is a closed-form estimator of sample density derived from Joint Embedding Predictive Architectures (JEPAs), a class of self-supervised representation learning frameworks. While JEPAs have traditionally been employed to learn robust, general-purpose embeddings without explicit generative modeling, recent theoretical work shows that the anti-collapse regularization typical of JEPAs implicitly learns the data distribution. JEPA-SCORE uses the network’s Jacobian at an input sample to extract a log-density proxy (a “score”) that the learned representation assigns to that data point. This provides a tractable, model-agnostic means of density estimation, with applications to data curation, outlier detection, and nonparametric sample likelihood estimation, validated across a range of modalities and JEPA instances (Balestriero et al., 7 Oct 2025).

1. Theoretical Foundation and Motivation

The core principle of JEPA-SCORE derives from the dual objectives underlying JEPA training: (i) a latent-space prediction term, which enforces invariance by requiring the representation of a perturbed sample to be predictable from that of the unperturbed one, and (ii) an anti-collapse term, typically realized by regularization mechanisms (e.g., forcing the empirical covariance of embeddings to match that of a standard Gaussian). The anti-collapse term not only prevents the trivial constant solution but also induces the learned embeddings $f(X)$ to globally match a Gaussian (or hyperspherical uniform) distribution.

Given this setup, the change-of-variables formula for probability densities formalizes the relationship between the data distribution $p_X(x)$ and the induced embedding distribution $p_{f(X)}(f(x))$, linking them through the Jacobian of the learned mapping $f$:

$$p_{f(X)}(f(x)) = \int_{\{x' \,:\, f(x') = f(x)\}} \frac{p_X(x')}{\prod_{k=1}^{\operatorname{rank}(J_f(x'))} \sigma_k(J_f(x'))} \, d\mathcal{H}^r(x')$$

where $J_f(x)$ is the Jacobian of $f$ at $x$, $\sigma_k$ are its singular values, and $\mathcal{H}^r$ is the $r$-dimensional Hausdorff measure over the level set of $f$ through $x$.

JEPA-SCORE exploits this by inverting the relationship: when ff is trained so that pf(X)p_{f(X)} matches the target Gaussian density, the preimage volume (modulated by the Jacobian) yields a closed-form likelihood proxy for input samples.
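As a sanity check on this inversion (a specialization of the formula above, not an additional claim from the source), consider an invertible encoder with a square, full-rank Jacobian; the level set collapses to the single point $x$ and the integral reduces to the classical change-of-variables identity:

```latex
% Invertible special case: \{x' : f(x') = f(x)\} = \{x\}, so the integral
% collapses to one term:
p_{f(X)}(f(x)) = \frac{p_X(x)}{\prod_k \sigma_k(J_f(x))}
               = \frac{p_X(x)}{\lvert \det J_f(x) \rvert}
% Rearranging and taking logarithms:
\log p_X(x) = \log p_{f(X)}(f(x)) + \sum_k \log \sigma_k(J_f(x))
% When the anti-collapse term pins p_{f(X)} to a fixed Gaussian, the
% remaining Jacobian sum is exactly the quantity JEPA-SCORE computes.
```

In this special case the score differs from the true log-likelihood only by the Gaussian log-density of the embedding itself.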

2. Formal Definition and Practical Computation

Specialized to practical JEPA settings, the sample-wise density proxy (JEPA-SCORE) is computed as:

$$\text{JEPA-SCORE}(x) \triangleq \sum_{k=1}^{\operatorname{rank}(J_f(x))} \log \sigma_k(J_f(x))$$

where $\sigma_k(J_f(x))$ are the singular values of the Jacobian of the encoder $f$ at input $x$.

Implementation of JEPA-SCORE involves the following steps:

  • For any trained JEPA model (e.g., I-JEPA, DINOv2, MetaCLIP):
    1. For an input $x$, perform a forward pass to compute $f(x)$.
    2. Compute the Jacobian $J_f(x)$ via autograd.
    3. Perform a singular value decomposition (SVD) of $J_f(x)$.
    4. For numerical stability, clip singular values below a small $\epsilon > 0$ (e.g., $10^{-6}$).
    5. Sum the logarithms of the singular values to yield the JEPA-SCORE.

This scalar can be interpreted (modulo constants) as the local log-determinant of the volume contraction from $x$ to $f(x)$; hence, as a log-likelihood under the learned embedding-induced data density.
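The steps above can be sketched end-to-end. In the sketch below, the toy tanh encoder, its dimensions, and the finite-difference Jacobian (standing in for autograd, e.g. `torch.autograd.functional.jacobian` on a real model) are illustrative assumptions, not details from the source:

```python
import numpy as np

def encoder(x, W, b):
    # Toy stand-in for a trained JEPA encoder f : R^d -> R^k.
    return np.tanh(W @ x + b)

def jacobian_fd(f, x, h=1e-5):
    # Central finite-difference Jacobian; a real implementation would use autograd.
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = h
        J[:, i] = (f(x + e) - f(x - e)) / (2 * h)
    return J

def jepa_score(f, x, eps=1e-6):
    J = jacobian_fd(f, x)                       # step 2: Jacobian at x
    s = np.linalg.svd(J, compute_uv=False)      # step 3: singular values
    s = np.clip(s, eps, None)                   # step 4: clip for stability
    return float(np.sum(np.log(s)))             # step 5: sum of log singular values

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6)) / np.sqrt(6)
b = rng.normal(size=4)
x = rng.normal(size=6)
f = lambda v: encoder(v, W, b)
score = jepa_score(f, x)
print(score)
```

For this encoder the Jacobian is available analytically ($\operatorname{diag}(1 - \tanh^2(Wx+b))\,W$), which makes the finite-difference result easy to cross-check.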

3. Empirical Findings and Cross-Model Validation

JEPA-SCORE has been empirically validated across synthetic and real data settings:

  • On Gaussian mixture and controlled synthetic datasets, JEPA-SCORE correlates with ground-truth log-densities: Langevin sampling according to JEPA-SCORE recovers the true density.
  • On ImageNet-1k, MNIST, and a Galaxy dataset, JEPA-SCORE computed with I-JEPA, DINOv2, and MetaCLIP correctly assigns higher scores to in-distribution samples and lower scores to out-of-distribution or undersampled cases. Ordering images by JEPA-SCORE within a class reveals semantic alignment with high- or low-density features (e.g., flying birds vs. seated birds).
  • Across architectures and modalities, the connection between learned Gaussian embeddings and data density via JEPA-SCORE remains robust, confirming the method’s broad applicability.
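The Langevin-sampling check on synthetic data can be illustrated in miniature. The sketch below runs unadjusted Langevin dynamics with the analytic score of a standard Gaussian standing in for the gradient of a JEPA-SCORE-derived log-density (which in practice would require differentiating through the score); chain count, step size, and initialization are arbitrary choices:

```python
import numpy as np

def grad_log_density(x):
    # Analytic score (gradient of log-density) of a standard Gaussian,
    # standing in for a learned, JEPA-SCORE-derived score function.
    return -x

def langevin_sample(grad_log_p, n_chains=2000, n_steps=500, step=0.1, seed=0):
    # Unadjusted Langevin dynamics:
    #   x <- x + (step / 2) * grad log p(x) + sqrt(step) * noise
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_chains) * 3.0   # deliberately wide initialization
    for _ in range(n_steps):
        x = x + 0.5 * step * grad_log_p(x) + np.sqrt(step) * rng.normal(size=n_chains)
    return x

samples = langevin_sample(grad_log_density)
print(samples.mean(), samples.var())
```

After burn-in the chains should concentrate near the target density (mean about 0, variance about 1, up to discretization bias of order `step`), mirroring the paper's reported recovery of the true density on synthetic data.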

4. Applications

JEPA-SCORE enables several downstream uses:

  • Data curation: By ranking samples according to JEPA-SCORE, practitioners can select representative (high-density) samples or sparsify overrepresented regions; conversely, samples with low scores may be selected for targeted augmentation or investigation.
  • Outlier detection: Samples with exceptionally low JEPA-SCORE are flagged as probable outliers or novelty cases, facilitating dataset cleaning or anomaly detection tasks.
  • Density estimation in high dimensions: JEPA-SCORE provides a scalable and efficient alternative to explicit generative modeling for likelihood estimation, without requiring density modeling in input space.
  • Model assessment and calibration: Distributions of JEPA-SCORE across datasets indicate coverage (or lack thereof) of the training distribution, providing a diagnostic for representation quality and generalization.
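The first two uses above reduce to simple operations on a precomputed array of per-sample JEPA-SCOREs. In this sketch the scores are synthetic, and the 1% threshold and keep-count are illustrative choices, not values from the paper:

```python
import numpy as np

def flag_outliers(scores, pct=1.0):
    # Flag samples whose JEPA-SCORE falls below the pct-th percentile
    # as probable outliers or novelty cases.
    thr = np.percentile(scores, pct)
    return scores < thr

def curate(scores, keep=100):
    # Keep the `keep` highest-score (highest-density) samples as a
    # representative subset; invert the sort to target low-density
    # samples for augmentation instead.
    return np.argsort(scores)[::-1][:keep]

rng = np.random.default_rng(0)
scores = rng.normal(size=1000)          # stand-in for real per-sample scores
scores[:5] -= 10.0                      # inject five synthetic low-density outliers
mask = flag_outliers(scores, pct=1.0)
print(int(mask.sum()), sorted(np.flatnonzero(mask)[:5].tolist()))
```

The injected low-score samples land below the percentile threshold and are excluded from the curated high-density subset.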

5. Relation to the JEPA Family and Interpretative Implications

The existence and utility of JEPA-SCORE follow directly from the anti-collapse (diversity) term in JEPA objectives. This term, often regarded as a mere collapse-prevention heuristic, fundamentally forces the network to allocate embedding space in proportion to the empirical data density. As such, any successfully trained JEPA (encompassing I-JEPA, DINOv2, MetaCLIP, etc.) yields a representation from which input sample densities can be extracted, in a model- and dataset-agnostic manner, using only the encoder’s local Jacobian.

Whereas generative models (VAEs, normalizing flows) model $p_X$ by explicit inversion or likelihood maximization, JEPAs accomplish similar density estimation in a latent, non-generative framework. JEPA-SCORE thus closes the gap between discriminative self-supervised learning and nonparametric density estimation.

6. Limitations and Future Research Directions

While JEPA-SCORE has demonstrated strong alignment with empirical densities in diverse scenarios, several practical and theoretical aspects merit further study:

  • Scalability of Jacobian computation for very high-dimensional input or particularly deep architectures.
  • Sensitivity to network architecture and regularization hyperparameters.
  • Extensions to more complex modalities (e.g., video, multimodal data) or architectures where level set structure is less trivial.
  • Integration with training procedures, e.g., for curriculum learning, balanced sampling, or zero-shot model diagnostics using density information.
  • Theoretical investigation into the sharpness of the learned density estimator and its relationship to sample likelihoods in non-unimodal or heavily skewed distributions.

7. Summary

JEPA-SCORE provides a theoretically grounded and empirically validated link between self-supervised embedding learning and data density estimation. By exploiting the relationship between JEPA’s anti-collapse constraints and the encoder’s Jacobian, it enables direct recovery of per-sample likelihoods without explicit generative modeling. Applications span data curation, outlier detection, density estimation, and model diagnostics, reflecting its utility as both a practical tool and a conceptual advance at the intersection of representation learning and statistical inference (Balestriero et al., 7 Oct 2025).
