
Embedding-Virtualized Knowledge (EVK)

Updated 9 February 2026
  • Embedding-Virtualized Knowledge (EVK) is an embedding-centric method that perturbs token embeddings with Gaussian noise to create virtual knowledge points for latent factual evaluation.
  • EVK-Bench enables continuous, unsupervised measurement of knowledge drift, uncovering subtle side effects in LLM edits beyond conventional discrete prompts.
  • The EVK-Align module regularizes edits by confining embedding drift, thereby preserving specificity and efficacy while ensuring scalable, annotation-free evaluation.

Embedding-Virtualized Knowledge Benchmark (EVK-Bench) is a systematic framework for probing the effects of factual model editing in LLMs beyond traditional sample-based assessment. By leveraging controlled, continuous perturbations in embedding space, EVK-Bench enables unsupervised, high-resolution quantification of knowledge drift, offering a rigorous alternative to discrete prompt-based evaluation paradigms. This approach provides insight into latent side effects of editing methods—typically overlooked by finite textual benchmarks—and is complemented by the EVK-Align module, which regularizes edits to contain drift in the embedding-neighborhood without sacrificing efficacy or specificity (Liu et al., 2 Feb 2026).

1. Embedding-Virtualized Knowledge: Definition and Conceptual Basis

Embedding-Virtualized Knowledge (EVK) represents an embedding-centric method for characterizing the knowledge structure of LLMs. Rather than expanding the set of prompt texts to probe and evaluate an edited fact, EVK generates “virtual” knowledge points directly in the token embedding space by perturbing the embeddings associated with the subject, the relation, or the whole input.

Given a prompt $P$ encoding a tuple $(s, r, o)$, its token-level embedding matrix is $\mathbf{E} = [\mathbf{e}_1, \dots, \mathbf{e}_n] \in \mathbb{R}^{n \times d}$. EVK identifies index sets $\mathcal{I}_s, \mathcal{I}_r$ corresponding to the subject and relation, extracting the relevant sub-embeddings $\mathbf{E}_s$ and $\mathbf{E}_r$. Gaussian noise $\Delta_j \sim \mathcal{N}(0, \sigma^2 \mathbf{I})$ for $j \in \{s, r, a\}$ is then applied to generate:

$$
\widetilde{\mathbf{E}} =
\begin{cases}
\mathbf{E}_s + \Delta_s & \text{(Subject Drift)} \\
\mathbf{E}_r + \Delta_r & \text{(Relation Drift)} \\
\mathbf{E} + \Delta_a & \text{(All Drift)}
\end{cases}
$$

Each $\widetilde{\mathbf{E}}$ instantiates a point in the continuous semantic manifold in proximity to the original fact, with distance modulated by $\sigma$. Iterating over $\sigma$ and offset samples $\Delta_j$ allows coverage of a dense neighborhood around a knowledge point, surpassing what can be realized through discrete prompt engineering.
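The three drift modes above can be sketched in NumPy as follows; the function name, argument layout, and index handling are illustrative assumptions rather than the authors' implementation:

```python
import numpy as np

def make_evk_sample(E, subj_idx, rel_idx, sigma, mode, rng=None):
    """Create one virtual knowledge point by adding Gaussian noise
    N(0, sigma^2 I) to the token embedding matrix E (shape n x d).

    mode selects which rows are perturbed, mirroring the three cases:
    'subject' -> E_s + Delta_s, 'relation' -> E_r + Delta_r, 'all' -> E + Delta_a.
    """
    rng = rng or np.random.default_rng(0)
    E_tilde = E.copy()  # leave the original embeddings untouched
    if mode == "subject":
        E_tilde[subj_idx] += rng.normal(0.0, sigma, size=(len(subj_idx), E.shape[1]))
    elif mode == "relation":
        E_tilde[rel_idx] += rng.normal(0.0, sigma, size=(len(rel_idx), E.shape[1]))
    elif mode == "all":
        E_tilde += rng.normal(0.0, sigma, size=E.shape)
    else:
        raise ValueError(f"unknown drift mode: {mode}")
    return E_tilde
```

Sweeping `sigma` and drawing many noise samples per mode then populates the dense embedding neighborhood the text describes.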

2. EVK-Bench: Benchmark Construction and Methodology

EVK-Bench quantifies knowledge drift from factual edits by systematically sampling the latent neighborhood around edited points, applying the following pipeline:

  • Prompt Construction: For a dataset triple $(s, r, o)$, form a templated natural-language prompt $P$, tokenize it, compute the embedding matrix, and identify and extract $\mathbf{E}_s$ and $\mathbf{E}_r$.
  • EVK Sample Generation: Randomly sample Gaussian offset matrices $\Delta_s$, $\Delta_r$, and $\Delta_a$ to construct Subject, Relation, and All EVK instances via the embedding transformations defined above, then decode them through the model's stack (identical surface text, modified embeddings).
  • Stability Metrics: For each EVK input $\widetilde{\mathbf{E}}$, compute the pre-edit and post-edit model hidden states ($\mathbf{h}_\text{pre}$, $\mathbf{h}_\text{post}$) at the final token. Embedding Stability (ES) is their cosine similarity:

    $\mathrm{ES} = \cos(\mathbf{h}_\text{pre}, \mathbf{h}_\text{post})$

For text-level drift, Counterfact attribution prompts are evaluated analogously to yield Text Stability (TS). Both metrics are scalars, annotation-free, and generalizable to any editing benchmark.
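The ES metric is a plain cosine similarity between final-token hidden states; a minimal sketch, assuming the hidden states have already been extracted as vectors:

```python
import numpy as np

def embedding_stability(h_pre, h_post, eps=1e-12):
    """Embedding Stability (ES): cosine similarity between the pre-edit
    and post-edit final-token hidden states for the same EVK input.
    Returns a scalar in [-1, 1]; values near 1 indicate little drift."""
    num = float(np.dot(h_pre, h_post))
    denom = float(np.linalg.norm(h_pre) * np.linalg.norm(h_post)) + eps
    return num / denom
```

Averaging this score over many EVK samples yields the benchmark-level ES figure; Text Stability (TS) applies the same comparison to hidden states from attribution prompts.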

Contrasted with conventional benchmarks—limited by curated sets of discrete surface prompts—EVK-Bench enables scalable, annotation-free, continuous assessment, allowing detection of subtle, non-local model perturbations otherwise invisible to traditional evaluation.

3. Conventional vs. EVK-Based Evaluation

| Aspect | Conventional Benchmarks | EVK-Bench |
| --- | --- | --- |
| Sampling Mode | Discrete surface prompts | Continuous embedding perturbations |
| Coverage | Finite, dataset-bound | Dense, annotation-free embedding neighborhood |
| Drift Detection | Only direct/nearby edits | Fine-grained, latent memory side effects |

EVK-Bench reveals knowledge drift not captured by standard sample-based metrics. This motivates its deployment alongside existing benchmarks such as Counterfact and ZsRE. Empirical analysis demonstrates that even state-of-the-art Locate-Then-Edit (LTE) methods—including ROME, MEMIT, RECT, and AlphaEdit—can introduce measurable instability in the embedding space that EVK-Bench alone can discern (Liu et al., 2 Feb 2026).

4. EVK-Align Regularization and Objective

To constrain drift identified by EVK-Bench, EVK-Align introduces an embedding-level alignment loss as a plug-and-play module for LTE editors. The composite objective is:

  • Edit Loss: For edit data $D = \{(x_i, y_i)\}$:

    $\mathcal{L}_\text{Edit} = -\frac{1}{|D|} \sum_{i} \log p_{\theta+\delta}(y_i \mid x_i)$

where $\delta$ is a parameter update applied to an FFN output layer.

  • EVK Alignment Loss: For EVK samples $\{\hat{x}_i\}_{i=1}^N$:

    $\mathcal{L}_\text{EVK} = \frac{1}{N} \sum_{i=1}^N D_\text{KL}\big(p_\theta(\cdot \mid \hat{x}_i) \,\|\, p_{\theta+\delta}(\cdot \mid \hat{x}_i)\big)$

where $D_\text{KL}$ is computed over the top-$k$ tokens (with $k$ increased over the course of optimization for efficiency).

  • Combined Objective:

    $\mathcal{L} = \mathcal{L}_\text{Edit} + \lambda \, \mathcal{L}_\text{EVK}$

    where $\lambda$ controls the balance between locality of the edit and preservation of the embedding vicinity. This approach is compatible with both closed-form and gradient-based LTE update protocols.
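The composite objective can be sketched numerically as below. Restricting the KL term to the pre-edit model's top-$k$ tokens and renormalizing over that support is one plausible reading of the paper's top-$k$ trick; the function names and that renormalization choice are assumptions, not the authors' code:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def topk_kl(logits_pre, logits_post, k):
    """KL(p_theta || p_{theta+delta}) restricted to the top-k tokens of the
    pre-edit model, with both distributions renormalized on that support."""
    idx = np.argsort(logits_pre)[-k:]
    p = softmax(logits_pre[idx])
    q = softmax(logits_post[idx])
    return float(np.sum(p * (np.log(p) - np.log(q))))

def evk_align_objective(edit_nll, evk_kls, lam):
    """L = L_Edit + lambda * L_EVK, where L_EVK averages the per-sample
    KL divergences over the N EVK samples."""
    return edit_nll + lam * float(np.mean(evk_kls))
```

In practice `edit_nll` would come from the editor's negative log-likelihood on the edit data and `evk_kls` from the model's logits on each perturbed embedding input; `lam` is the $\lambda$ trade-off knob from the objective above.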

5. Experimental Findings and Quantitative Evaluation

Extensive empirical studies were conducted on GPT2-XL (1.5B), GPT-J (6B), and LLaMA3-8B, utilizing Counterfact (2k triples), ZsRE, and EVK-Bench itself (three EVK variants per Counterfact prompt at $\sigma = 0.3$; 6k embedding samples and 5k attribution prompts).

Key findings include:

  • On Counterfact and ZsRE, EVK-Edit (AlphaEdit + EVK-Align) achieves matching or improved Efficacy (Eff.) and Specificity (Spe.), e.g., GPT2-XL: Eff. ≈ 99.8 vs 99.6, Spe. ≈ 72.3 vs 70.1.
  • On EVK-Bench, EVK-Edit consistently achieves higher ES and TS: for GPT2-XL, ES increases from 67.70 to 69.52; TS from 75.58 to 76.60. Similar trends are observed on LLaMA3 and GPT-J.
  • EVK-Align thus confines the edit's effect to the targeted key–value while preserving latent associations.

Ablation studies demonstrate that smaller $\sigma$ and larger $\lambda$ tighten preservation (higher specificity) with negligible generalization loss. Increasing the number of EVK samples $N$ improves stability linearly with compute; higher $k$ in the KL loss offers marginal gains. Visualization (e.g., UMAP) confirms that EVK samples populate a much denser region around the edited point than typical discrete neighborhood prompts. GLUE evaluations show that EVK-Align does not degrade, and in some cases slightly improves, general language modeling ability across six NLU tasks.

6. Broader Implications and Applicability

EVK-Bench establishes a paradigm shift in evaluating model editing, offering the first high-resolution, embedding-level view of global and local side effects in LLMs' factual memory. EVK-Align, as an unsupervised, model-agnostic regularization layer, constitutes a lightweight addition that yields improved knowledge preservation without compromising edit precision or breadth. The methodology is annotation-free, scalable to any LTE-compatible LLM, and integrates seamlessly with prevailing edit pipelines.

This suggests expanded applications in robustness evaluation, interpretability research, and potentially, life-cycle management of deployed LLMs, where comprehensive characterization of knowledge drift is critical.
