Vendi Score: Kernel-Based Diversity Metric
- Vendi Score is a kernel-based diversity metric that quantifies effective diversity by computing the entropy of eigenvalues from a trace-normalized similarity matrix.
- It offers tunable sensitivity through the parameter q, allowing focus on rare types or dominant clusters across domains such as machine learning, ecology, and biology.
- Its algorithm involves constructing a kernel matrix, normalizing it, and performing spectral decomposition, with scalable approximations like the Nyström method enhancing efficiency.
The Vendi Score is a general, kernel-based diversity metric designed for quantifying the effective number of distinct elements in a finite sample, with deep connections to ecological diversity indices and quantum statistics. Its foundation is the entropy of the spectrum of a similarity matrix constructed via a user-specified positive semi-definite kernel. Unlike domain-specific or label-dependent metrics, the Vendi Score allows for tunable sensitivity to rare versus abundant types, admits theoretical generalizations, and has become a standard tool for measuring and optimizing diversity across machine learning, computational biology, ecology, and experimental design.
1. Formal Definition and Mathematical Foundations
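Let $X = \{x_1, \dots, x_n\}$ be a collection of items (e.g., images, sequences, trajectories) and $k(\cdot, \cdot)$ a user-chosen positive semi-definite kernel with the normalization $k(x, x) = 1$. Construct the similarity matrix $K \in \mathbb{R}^{n \times n}$, where $K_{ij} = k(x_i, x_j)$. For the trace-normalized matrix $\tilde{K} = K/n$, let $\lambda_1, \dots, \lambda_n$ denote its eigenvalues (so $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$).
The Vendi Score of order $q$ (with $q \ge 0$, $q \ne 1$) is defined as:
$$\mathrm{VS}_q(X; k) = \left( \sum_{i=1}^{n} \lambda_i^{q} \right)^{1/(1-q)}$$
The $q \to 1$ case recovers the exponential of the Shannon/von Neumann entropy of the spectrum (also equivalent to the Hill number of order 1 in ecology), with the convention $0 \log 0 = 0$:
$$\mathrm{VS}_1(X; k) = \exp\left( -\sum_{i=1}^{n} \lambda_i \log \lambda_i \right)$$
This effective-number interpretation means that $\mathrm{VS}_q$ lies in $[1, n]$, achieving $n$ if all items are mutually orthogonal and $1$ if all are identical (Pasarkar et al., 2023, Friedman et al., 2022).
The kernel function $k$ is central: by selecting $k$, the user defines which differences are considered meaningful.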
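As a small worked illustration (the three-item sample is hypothetical, chosen only to make the arithmetic easy), suppose items 1 and 2 are identical and item 3 is orthogonal to both. Writing $K$ for the similarity matrix and $\lambda$ for the eigenvalues of $K/3$:

$$K = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \lambda\left(K/3\right) = \left(\tfrac{2}{3}, \tfrac{1}{3}, 0\right)$$

$$\mathrm{VS}_1 = \exp\left(-\tfrac{2}{3}\log\tfrac{2}{3} - \tfrac{1}{3}\log\tfrac{1}{3}\right) = \frac{3}{2^{2/3}} \approx 1.89, \qquad \mathrm{VS}_2 = \left(\tfrac{4}{9} + \tfrac{1}{9}\right)^{-1} = 1.8$$

The sample has three items but an effective size below two, because the duplicated pair collapses into a single mode that carries two thirds of the spectral mass.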
2. Algorithmic Computation and Scalability
The computation of the Vendi Score proceeds as follows:
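- Compute the $n \times n$ kernel matrix: $K_{ij} = k(x_i, x_j)$.
- Normalize: $\tilde{K} = K/n$.
- Spectral decomposition: Obtain eigenvalues $\lambda_1, \dots, \lambda_n$ of $\tilde{K}$.
- Aggregate: Evaluate the relevant $q$-order function as above to compute $\mathrm{VS}_q$.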
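Pseudocode: $K \leftarrow [k(x_i, x_j)]_{i,j=1}^{n}$; $\tilde{K} \leftarrow K/n$; $\lambda \leftarrow \mathrm{eigenvalues}(\tilde{K})$; return $\left(\sum_i \lambda_i^{q}\right)^{1/(1-q)}$, or $\exp\left(-\sum_i \lambda_i \log \lambda_i\right)$ when $q = 1$.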
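The four steps above can be sketched in Python with NumPy. This is a minimal illustration rather than a reference implementation: the function names `vendi_score` and `cosine_kernel`, the tolerance `tol`, and the choice of a cosine kernel are assumptions made for the example.

```python
import numpy as np

def vendi_score(K, q=1.0, tol=1e-12):
    """Vendi Score of order q for an n x n PSD similarity matrix K with unit diagonal."""
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    # Steps 2-3: trace-normalize, then take the eigenvalues of the symmetric matrix.
    lam = np.linalg.eigvalsh(K / n)
    # Discard numerically zero (or slightly negative) modes; 0 * log 0 is treated as 0.
    lam = lam[lam > tol]
    lam = lam / lam.sum()
    # Step 4: aggregate the spectrum with the order-q formula.
    if np.isinf(q):
        return float(1.0 / lam.max())
    if np.isclose(q, 1.0):
        return float(np.exp(-np.sum(lam * np.log(lam))))
    return float(np.sum(lam ** q) ** (1.0 / (1.0 - q)))

def cosine_kernel(X):
    """Step 1 for one possible kernel: cosine similarity between row vectors."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T
```

Calling `vendi_score(cosine_kernel(X))` on an array of row embeddings `X` gives the order-1 score. An identity matrix (mutually orthogonal items) returns $n$, and an all-ones matrix (identical items) returns 1.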
Complexity: Kernel matrix computation takes $O(n^2)$ kernel evaluations; eigen-decomposition is $O(n^3)$ in general. If explicit feature embeddings are available (dimension $d < n$) and the kernel is their inner product, one can leverage the low-rank structure for $O(d^2 n)$ complexity, because the $d \times d$ matrix $X^\top X / n$ has the same nonzero eigenvalues as $K/n$ (Friedman et al., 2022, Lintunen, 3 Sep 2025). For large-scale problems, efficient approximations via the Nyström method or random projections enable sub-cubic scaling (Ospanov et al., 2024).
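The low-rank shortcut can be sketched as follows, assuming a linear kernel on unit-normalized row embeddings. The function name `vendi_score_from_embeddings` is hypothetical. The point of the sketch is that the $n \times n$ matrix is never formed.

```python
import numpy as np

def vendi_score_from_embeddings(X, q=1.0, tol=1e-12):
    """Vendi Score under a cosine kernel, computed from a d x d matrix."""
    X = np.asarray(X, dtype=float)
    # Unit-normalize rows so that k(x, x) = 1 under the inner-product kernel.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    n = X.shape[0]
    # X^T X / n (d x d) shares its nonzero eigenvalues with X X^T / n (n x n).
    lam = np.linalg.eigvalsh(X.T @ X / n)
    lam = lam[lam > tol]
    lam = lam / lam.sum()
    if np.isclose(q, 1.0):
        return float(np.exp(-np.sum(lam * np.log(lam))))
    return float(np.sum(lam ** q) ** (1.0 / (1.0 - q)))
```

Forming `X.T @ X` costs on the order of $d^2 n$ operations and the decomposition on the order of $d^3$, so the saving is largest when $d$ is much smaller than $n$.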
3. Theoretical Properties and Parameter Interpretation
Bounds, Invariance, and Interpretability
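- Range: $1 \le \mathrm{VS}_q \le n$.
- Duplication invariance: Duplicating every item in the sample leaves VS unchanged, and an exact copy of an existing item never adds a new nonzero eigenvalue; redundancy is not counted as diversity.
- Similarity sensitivity: The score interpolates between "species richness" ($q \to 0$, counts modes) and "dominant-mode" ($q \to \infty$, counts major clusters).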
- Label-free: No need for class labels or type frequencies; purely uses sample similarities (Pasarkar et al., 2023, Nielsen et al., 26 Sep 2025).
Tuning via $q$
- $q < 1$: Sensitive to rare clusters or outliers; emphasizes counting "distinct modes."
- $q = 1$: Balances rare and common types (Shannon entropy analog).
- $q > 1$: Emphasizes dominant groups; insensitive to rare types.
- $q = 0$: Counts the number of nonzero modes (matrix rank).
- $q \to \infty$: Returns $1/\lambda_{\max}$, the reciprocal of the largest eigenvalue, so the score is set by the dominant cluster alone.
This allows targeted sensitivity in applications, such as rare variant detection in genomics ($q < 1$) or memorization in deep generative modeling ($q > 1$) (Pasarkar et al., 2023, Nielsen et al., 26 Sep 2025).
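To make the effect of the order parameter concrete, the sketch below sweeps $q$ over a fixed spectrum. The spectrum (one dominant mode plus three rare ones) and the name `vendi_from_spectrum` are hypothetical, chosen for illustration only.

```python
import numpy as np

def vendi_from_spectrum(lam, q):
    """Order-q Vendi Score from eigenvalues that are nonnegative and sum to one."""
    lam = np.asarray(lam, dtype=float)
    lam = lam[lam > 0]                      # zero modes contribute nothing
    if np.isinf(q):
        return float(1.0 / lam.max())       # reciprocal of the dominant eigenvalue
    if np.isclose(q, 1.0):
        return float(np.exp(-np.sum(lam * np.log(lam))))
    return float(np.sum(lam ** q) ** (1.0 / (1.0 - q)))

# Hypothetical spectrum: one dominant mode plus three rare ones.
spectrum = [0.85, 0.05, 0.05, 0.05]
orders = [0, 0.5, 1, 2, np.inf]
scores = [vendi_from_spectrum(spectrum, q) for q in orders]
for q, s in zip(orders, scores):
    print(f"q = {q}: VS = {s:.3f}")
```

The score falls monotonically as the order grows, from 4 at order 0 (every nonzero mode counted equally) down to $1/0.85 \approx 1.18$ in the infinite-order limit (only the dominant mode matters).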
4. Practical Application Domains
Machine Learning
- Generative modeling: VS distinguishes between generative models that look similar under conventional metrics but differ in sample redundancy and true diversity. It is used to diagnose mode collapse, memorization, and duplication (Friedman et al., 2022, Pasarkar et al., 15 Feb 2025).
- Active learning: Vendi Information Gain policies combine informativeness with sample diversity for acquisition, outperforming entropy- or uncertainty-based selection (Nguyen et al., 13 May 2025, Nguyen et al., 12 Sep 2025).
- Self-supervised RL: VS serves as an intrinsic reward, encouraging agents to discover maximally diverse policies under arbitrary similarity functions (Lintunen, 3 Sep 2025).
Experimental Design and Discovery
- Quality-weighted VS: In scientific discovery and experimental design (e.g., active search, Bayesian optimization), VS is extended with a quality multiplier, yielding $\mathrm{qVS}_q(X) = \left(\frac{1}{n}\sum_{i=1}^{n} s(x_i)\right) \mathrm{VS}_q(X)$ for a per-item quality score $s$. Such criteria flexibly balance exploitation (high quality) and exploration (diversity), resulting in 70–170% increases in effective discoveries (Nguyen et al., 2024).
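A minimal sketch of the quality-weighted score, assuming the multiplier is the mean of nonnegative per-item quality scores. The name `quality_weighted_vendi` and the `quality` argument are hypothetical.

```python
import numpy as np

def quality_weighted_vendi(K, quality, q=1.0, tol=1e-12):
    """Mean item quality times the order-q Vendi Score of similarity matrix K."""
    K = np.asarray(K, dtype=float)
    quality = np.asarray(quality, dtype=float)
    # Spectrum of the trace-normalized similarity matrix.
    lam = np.linalg.eigvalsh(K / K.shape[0])
    lam = lam[lam > tol]
    lam = lam / lam.sum()
    if np.isclose(q, 1.0):
        vs = np.exp(-np.sum(lam * np.log(lam)))
    else:
        vs = np.sum(lam ** q) ** (1.0 / (1.0 - q))
    # Assumed form: average quality acts as a multiplier on diversity.
    return float(quality.mean() * vs)
```

Under this form, a batch of two orthogonal items with quality 0.5 each and a batch of two identical items with quality 1.0 each both score 1.0, which illustrates how the criterion trades quality against diversity.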
Computational Biology and Genomics
- Epidemiology: VS quantifies the diversity of viral populations in time-resolved sequence data and detects emerging low-diversity clusters indicative of new variants. It is particularly effective for unsupervised, reference-free tracking in large-scale surveillance (Nielsen et al., 26 Sep 2025).
- Protein/materials universe analysis: The Vendiscope applies VS with learned weighting to entire scientific datasets, quantifying rarity and identifying near-duplicate and high-diversity instances at scale (Pasarkar et al., 15 Feb 2025).
Generative Model Evaluation
- Conditional and Information-Vendi: For generative models conditioned on prompts, VS has been extended to decompose observed diversity into model-induced vs. prompt-induced components, enabling precise analysis of text-to-image, image-to-text, and video generators (Jalali et al., 2024).
OOD Detection
- Vendi Novelty Score (VNS): Measures the increase in VS when a test sample is added to the in-distribution set. VNS achieves state-of-the-art OOD detection using only samples and similarities, avoiding density estimation (Pasarkar et al., 10 Feb 2026).
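The idea of scoring a test point by the change in VS can be sketched as below. This follows only the one-sentence description above; the exact definition and normalization in the cited paper may differ. The cosine kernel and the names `_vendi` and `novelty_score` are assumptions made for illustration.

```python
import numpy as np

def _vendi(X, tol=1e-12):
    """Order-1 Vendi Score of row embeddings X under a cosine kernel."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    lam = np.linalg.eigvalsh(X @ X.T / X.shape[0])
    lam = lam[lam > tol]
    lam = lam / lam.sum()
    return float(np.exp(-np.sum(lam * np.log(lam))))

def novelty_score(reference, x):
    """Change in Vendi Score when test point x joins the reference set."""
    reference = np.asarray(reference, dtype=float)
    x = np.asarray(x, dtype=float)
    # Append x as one extra row and compare against the reference set alone.
    return _vendi(np.vstack([reference, x])) - _vendi(reference)
```

A test point that duplicates the reference data leaves the score essentially unchanged, while a point orthogonal to it raises the score. A threshold on this difference would then act as the out-of-distribution decision rule.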
5. Approximation Methods and Convergence
Truncated and Approximated Versions
- For large $n$, the full spectrum is expensive and may not converge quickly (especially under infinite-dimensional kernels such as RBF/Gaussian).
- The $t$-truncated Vendi Score uses just the top $t$ eigenvalues, requiring only $O(t)$ samples for convergence. Efficient approximations via Nyström and FKEA random-feature methods concentrate tightly around the truncated statistic, with precise finite-sample error bounds (Ospanov et al., 2024).
Empirical findings
- On finite-dimensional kernels, VS converges rapidly with $n = O(d)$, where $d$ is feature dimension.
- On infinite-dimensional kernels, convergence requires truncation and approximation. Nyström and random features provide accurate, scalable solutions.
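A truncated score can be sketched as follows. How the discarded spectral mass is handled is an assumption here (it is spread evenly over the kept eigenvalues so they still sum to one), so the construction should be checked against Ospanov et al. (2024) before use. The name `truncated_vendi` is hypothetical.

```python
import numpy as np

def truncated_vendi(K, t, tol=1e-12):
    """Order-1 Vendi Score computed from the top-t eigenvalues of K / n."""
    K = np.asarray(K, dtype=float)
    lam = np.linalg.eigvalsh(K / K.shape[0])   # returned in ascending order
    top = np.clip(lam[-t:], 0.0, None)         # keep the t largest, clip round-off
    # Assumption: spread the discarded spectral mass evenly over the kept modes
    # so the truncated spectrum still sums to one.
    top = top + (1.0 - top.sum()) / t
    top = top[top > tol]
    return float(np.exp(-np.sum(top * np.log(top))))
```

The sketch computes the full spectrum for clarity, which saves nothing. In practice a partial eigensolver such as `scipy.sparse.linalg.eigsh(K, k=t, which="LA")`, or a Nyström approximation, would supply the top eigenvalues without the full decomposition. By construction the truncated score cannot exceed $t$.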
6. Pitfalls, Limitations, and Implementation Guidance
- Kernel dependence: The selected similarity function fundamentally determines what diversity is measured.
- Computational scaling: Exact VS is $O(n^3)$; scalable SVD/approximation methods are advised for large $n$.
- Reference-freeness: VS measures internal diversity; it must be paired with a quality or precision metric to avoid high-diversity but low-quality (e.g., random noise) artifacts (Friedman et al., 2022).
- Sensitivity parameter tuning: The $q$ parameter must be selected according to application needs; $q = 1$ is generally robust, but $q < 1$ is better suited to rare species and $q > 1$ to memorization/duplication detection (Pasarkar et al., 2023, Nguyen et al., 2024).
- Sparse or imbalanced data: Imbalanced prevalence affects sensitivity; in extreme cases, the probability-weighted form of VS is recommended (Pasarkar et al., 15 Feb 2025).
7. Extensions and Theoretical Innovations
- Conditional, Information, and Entropic Decompositions: Matrix-based analogs of conditional entropy and mutual information using the Vendi entropy underpin recent advances in prompt disentanglement for generative models, label-free information gain estimation, and active learning (Jalali et al., 2024, Nguyen et al., 13 May 2025).
- Generalized Information Metrics: The Vendi Information Gain (VIG) provides an asymmetric, similarity-aware extension of mutual information, reducing to MI when samples are maximally distinct and outperforming MI in sample-based, high-dimensional, and geometric settings (Nguyen et al., 13 May 2025).
- Gradient and Differentiability: VS-based objectives are differentiable, facilitating their use in gradient-based optimization for experimental design and generative modeling (Pasarkar et al., 15 Feb 2025, Nguyen et al., 2024, Hemmat et al., 2024).
Key References
- Fundamental metric, theoretical properties, and ML applications: (Friedman et al., 2022, Pasarkar et al., 2023)
- RL/skill learning: (Lintunen, 3 Sep 2025)
- Scalability and convergence: (Ospanov et al., 2024)
- Quality-weighted experimental design: (Nguyen et al., 2024)
- Genomics/epidemiology: (Nielsen et al., 26 Sep 2025)
- Prompt-based generation and conditional diversity: (Jalali et al., 2024)
- OOD detection: (Pasarkar et al., 10 Feb 2026)
- Vendiscope and large-data applications: (Pasarkar et al., 15 Feb 2025)
- Generalized information theory and VIG: (Nguyen et al., 13 May 2025)