Empirical Activation Similarity (EAS) Overview

Updated 5 June 2026

Empirical Activation Similarity (EAS) is a family of metrics that quantify the similarity, most often cosine similarity, between high-dimensional activation vectors in order to assess statistical alignment in neural systems.
It is applied in both artificial neural networks and cognitive neuroscience to analyze representational specialization and to inform model pruning and calibration strategies.
Empirical studies report that EAS-guided methods can improve model compression efficiency and that time-resolved similarity measures support dynamic analysis of domain sensitivity.

Empirical Activation Similarity (EAS) quantifies the statistical alignment or correspondence between high-dimensional activation patterns elicited by different inputs within neural systems, including artificial neural networks and the human brain. EAS metrics have been deployed to measure semantic similarity, to guide model pruning, and to analyze representational specialization across domains and layers. Core instantiations span model-comparison in cognitive neuroscience, gradient-driven attribution in transformer architectures, and angular fidelity loss for deep learning compression. Definitions and protocols vary by field, but EAS typically leverages second-order activation statistics or cosine-based similarity over activation vectors, grounded directly in observed (empirical) activity rather than parametric modeling assumptions.

1. Fundamental Formulations of EAS Across Modalities

In LLMs, EAS formalizes the cosine similarity between “activation vectors” derived from parameterwise gradients of model outputs, measuring which parameters are influential for a particular input. Given a model output functional $D(X, w)$ for input $X$ and parameters $w = (w_1, ..., w_n)$, the per-parameter activation metric is

$\mathcal{A}(X, w_i) \approx |w_i\, \partial D(X,w)/\partial w_i|$

and $\mathcal{A}(X) = (\mathcal{A}(X, w_1), ..., \mathcal{A}(X, w_n))$ denotes the resulting $n$-dimensional activation vector. For two inputs $X_1, X_2$, EAS is computed as

$\mathrm{EAS}(X_1, X_2) = \frac{\mathcal{A}(X_1) \cdot \mathcal{A}(X_2)}{\|\mathcal{A}(X_1)\|\, \|\mathcal{A}(X_2)\|}$

This metric, referred to as LLMDcos, takes values in $[0,1]$ due to nonnegative elements, with unity indicating maximal overlap in activated parameters (Wang et al., 2024).
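
As a rough illustration of the gradient-based formulation above, the following sketch computes activation vectors and their cosine similarity for a toy model. The function `toy_output` is a hypothetical stand-in for $D(X, w)$, and gradients are approximated by central differences rather than backpropagation; none of these names come from Wang et al. (2024).

```python
import math

def toy_output(x, w):
    """Hypothetical scalar model output D(X, w): tanh of a weighted sum."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

def activation_vector(x, w, eps=1e-6):
    """A(X, w_i) ~ |w_i * dD/dw_i|, with dD/dw_i taken by central differences."""
    acts = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad = (toy_output(x, w_plus) - toy_output(x, w_minus)) / (2 * eps)
        acts.append(abs(w[i] * grad))
    return acts

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def eas(x1, x2, w):
    """EAS(X1, X2): cosine similarity of the two activation vectors."""
    return cosine(activation_vector(x1, w), activation_vector(x2, w))

if __name__ == "__main__":
    w = [0.3, -0.2, 0.8]
    print(eas([1.0, 2.0, -0.5], [0.5, 1.5, 0.0], w))
```

Because every entry of an activation vector is an absolute value, the result stays in $[0,1]$; inputs that touch disjoint sets of parameters give a similarity of zero.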

In neural data analysis (e.g., MEG studies), EAS characterizes stimulus similarity via the Pearson correlation between empirically reduced activation vectors $x_i(t), x_j(t)$ at time $t$:

$\mathrm{EAS}_{ij}(t) = \mathrm{corr}\big(x_i(t),\, x_j(t)\big)$

where $\mathrm{EAS}_{ij}(t)$ is the time-resolved empirical activation similarity (Wardle et al., 2015). Alternative constructions may use classification-based dissimilarity measures (e.g., pairwise discriminability from decoding analysis), normalized and inverted to produce similarity scores.
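
The time-resolved correlation described above can be sketched as follows. The data layout (`activations[t][i]` holding the reduced activation vector for stimulus `i` at timepoint `t`) and the function names are assumptions made for illustration, not the pipeline of Wardle et al. (2015).

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length activation vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return num / den if den > 0 else 0.0

def similarity_over_time(activations):
    """Return one stimulus-by-stimulus correlation matrix per timepoint.

    activations[t][i] is the (assumed, already reduced) activation vector
    for stimulus i at timepoint t.
    """
    matrices = []
    for frame in activations:
        k = len(frame)
        matrices.append(
            [[pearson(frame[i], frame[j]) for j in range(k)] for i in range(k)]
        )
    return matrices

if __name__ == "__main__":
    # One timepoint, three stimuli, three reduced components each (toy values).
    toy = [[[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]]
    for row in similarity_over_time(toy)[0]:
        print(row)
```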

For transformer interpretability, gradient × activation saliency maps define tokenwise and word-group activations, enabling EAS-like explanatory matching of words/phrases between text pairs (Malkiel et al., 2022).

2. EAS in Model Compression and Pruning

Recent pruning strategies exploit EAS to preserve the angular structure of representations during parameter ablation. In the ACE framework, Empirical Activation Similarity measures the cosine fidelity between unpruned and pruned model activations:

$\mathrm{EAS}(Y, \hat{Y}) = \frac{\langle Y, \hat{Y} \rangle}{\|Y\|\, \|\hat{Y}\|}$

where $Y$ and $\hat{Y}$ are the dense and pruned layer outputs for an $N$-token batch. The pruning score for each connection combines a weight-magnitude × activation-norm factor (CosP) with an activation-variance factor (VarP); the exact expressions for CosP, VarP, and the combined score are given in (2505.21987).

The ACE algorithm prunes weights ranked by this combined score, directly minimizing angular distortion and improving calibration efficiency. Experiments show EAS-informed pruning achieves up to 18% reduction in perplexity and up to 63% reduction in time relative to non-EAS baselines, while requiring as few as 16 tokens of calibration data (2505.21987).
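
To make the idea of cosine fidelity between dense and pruned outputs concrete, here is a minimal sketch for a single linear layer. It is not the ACE algorithm: the score used is only the weight-magnitude × activation-norm product mentioned above, the variance factor is omitted, and the per-row pruning rule, function names, and toy values are all assumptions.

```python
import math

def matvec(W, x):
    """Output of a bias-free linear layer y = W x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def column_norms(X):
    """L2 norm of each input feature over the calibration tokens (rows of X)."""
    return [math.sqrt(sum(tok[j] ** 2 for tok in X)) for j in range(len(X[0]))]

def prune_by_score(W, X, sparsity):
    """Zero the lowest-scoring fraction of weights in each row.

    Assumed score: |W_ij| * ||X_j||_2 (magnitude times activation norm).
    """
    norms = column_norms(X)
    pruned = []
    for row in W:
        scores = [abs(wij) * norms[j] for j, wij in enumerate(row)]
        k = int(len(row) * sparsity)
        drop = set(sorted(range(len(row)), key=lambda j: scores[j])[:k])
        pruned.append([0.0 if j in drop else wij for j, wij in enumerate(row)])
    return pruned

def mean_cosine_fidelity(W, W_pruned, X):
    """Average per-token cosine between dense and pruned layer outputs."""
    sims = [cosine(matvec(W, tok), matvec(W_pruned, tok)) for tok in X]
    return sum(sims) / len(sims)

if __name__ == "__main__":
    W = [[2.0, 0.1, -1.5, 0.05], [0.2, -3.0, 0.1, 1.0]]
    X = [[1.0, 1.0, 1.0, 1.0], [0.5, -1.0, 2.0, 0.1]]  # two calibration tokens
    W_half = prune_by_score(W, X, sparsity=0.5)
    print(W_half, mean_cosine_fidelity(W, W_half, X))
```

A fidelity close to 1 indicates that pruning mostly rescaled the layer outputs without rotating them, which is the property an angular-fidelity objective is meant to preserve.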

3. Layerwise and Domain-Sensitivity Analysis Using EAS

EAS provides a lens onto internal specialization, differentiating “universal encoder” layers (high activation similarity across domains) from deep “expert” layers which activate differently for task-specific or cross-domain inputs (Wang et al., 2024). Empirically:

Within-domain EAS: High for all layers; parameters are consistently co-activated.
Cross-domain EAS: High for shallow layers, then decays in deep layers; deep blocks exhibit representational individuality.
Peak domain-agnosticity in layer 2; maximal specialization in layers 20–30 (for Llama2-7B).

Averaging EAS matrices over datasets recovers an interpretable domain-task similarity structure, accurately reflecting semantic or procedural overlap between benchmarks.
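
The averaging step can be sketched as below: given a pairwise EAS matrix over individual inputs and a domain label for each input, the entries are averaged per pair of domains. The labels, values, and function name are hypothetical and only illustrate the aggregation, not any reported result.

```python
def average_by_domain(eas_matrix, domains):
    """Average pairwise EAS per unordered (domain, domain) pair.

    eas_matrix[i][j] is the EAS between inputs i and j; domains[i] is the
    (assumed) domain label of input i. Self-pairs (i == j) are skipped.
    """
    sums, counts = {}, {}
    n = len(domains)
    for i in range(n):
        for j in range(i + 1, n):
            key = tuple(sorted((domains[i], domains[j])))
            sums[key] = sums.get(key, 0.0) + eas_matrix[i][j]
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

if __name__ == "__main__":
    domains = ["math", "math", "code", "code"]
    eas_matrix = [
        [1.0, 0.9, 0.2, 0.4],
        [0.9, 1.0, 0.3, 0.1],
        [0.2, 0.3, 1.0, 0.8],
        [0.4, 0.1, 0.8, 1.0],
    ]
    for pair, value in sorted(average_by_domain(eas_matrix, domains).items()):
        print(pair, round(value, 3))
```

Running the same aggregation separately on per-layer EAS matrices would give the kind of within-domain versus cross-domain profile across depth described in this section.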

4. Empirical Activation Similarity in Cognitive Neuroscience

EAS enables the comparison of neural representations evoked by sensory stimuli. In MEG studies (Wardle et al., 2015), EAS is calculated as correlation similarity between PCA-reduced activation vectors for each stimulus at each timepoint, yielding dynamic similarity matrices. This approach supports representational similarity analysis (RSA) to compare empirical neural geometry to external models (retinotopic, computational, or perceptual). Key empirical findings:

Early visual cortex representations align with retinotopic models ($\sim$50–80 ms post-stimulus).
From $\sim$150 ms, EAS with perceptual-similarity models approaches the empirical noise ceiling.
EAS provides a metric for empirical quantification of perceptual Gestalts via brain-wide activation patterns.
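
A common way to compare an empirical similarity matrix with a candidate model in RSA is a rank correlation over the off-diagonal entries. The sketch below assumes Spearman correlation over the upper triangle; the function names and toy matrices are illustrative and are not taken from Wardle et al. (2015).

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return num / den if den > 0 else 0.0

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result

def upper_triangle(matrix):
    """Entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def rsa_score(empirical, model):
    """Spearman correlation between the two matrices' upper triangles."""
    return pearson(ranks(upper_triangle(empirical)), ranks(upper_triangle(model)))

if __name__ == "__main__":
    empirical = [[1.0, 0.9, 0.2], [0.9, 1.0, 0.5], [0.2, 0.5, 1.0]]
    model = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.3], [0.1, 0.3, 1.0]]
    print(rsa_score(empirical, model))
```

Applying `rsa_score` to the empirical matrix at each timepoint yields a time course of model fit, which is how statements such as "aligns with the retinotopic model early, with the perceptual model later" can be quantified.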

5. Applications: Pruning, Interpretability, Retrieval, Model Calibration

EAS metrics have demonstrable utility:

Adaptive Model Pruning: EAS guides unstructured and semi-structured pruning; layerwise pruning ratios are tuned by observed activation density/sparsity (e.g., densest layers pruned less aggressively) (Wang et al., 2024, 2505.21987).
Calibration Efficiency: EAS-based pruning remains robust with very short calibration sequences, supporting rapid compression.
Interpretability and Attribution: EAS-inspired saliency and word-pair matching provide token-level explanations for BERT similarity (Malkiel et al., 2022).
Semantic Similarity: Deep-layer EAS correlates with human judgment on STS-B and SICK, offering embedding-free data relevance signals.
Monitoring and Robustness: Large rotational changes in activation similarity can signal domain shift or calibration drift in deployment (2505.21987).
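
The monitoring use case in the list above could be prototyped along these lines: store a reference activation vector per layer, then flag layers whose current activations have rotated by more than a chosen angle. The 15-degree threshold, the per-layer layout, and the function names are assumptions for illustration only; a real deployment would need to calibrate the threshold empirically.

```python
import math

def angle_degrees(u, v):
    """Angle between two activation vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos))

def flag_drift(reference, current, threshold_deg=15.0):
    """Indices of layers whose activations rotated more than threshold_deg.

    reference[i] and current[i] are (assumed) mean activation vectors for
    layer i at calibration time and at monitoring time, respectively.
    """
    return [
        i
        for i, (ref, cur) in enumerate(zip(reference, current))
        if angle_degrees(ref, cur) > threshold_deg
    ]

if __name__ == "__main__":
    reference = [[1.0, 0.0], [1.0, 1.0]]
    current = [[1.0, 0.05], [-1.0, 1.0]]
    print(flag_drift(reference, current))
```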

6. Implementation Protocols and Assessment

EAS operationalizes as batch-wise or layer-wise cosine similarity between activation vectors. Variants exist:

Parameter-space activation statistics (gradient × parameter; Wang et al., 2024)
Activation vector correlation (MEG; Wardle et al., 2015)
Token/word-level saliency (transformers; Malkiel et al., 2022)
Angular deviation between dense and compressed model activations (ACE; 2505.21987)

Calibration batch size and sequence length are key hyperparameters. Validation is performed by correlating EAS matrices with external similarity labels or task/domain outcomes, using Spearman or Wilcoxon statistics.

7. Comparative Table: EAS Usage Across Domains

Field	EAS Formulation	Principal Use
LLMs/Pruning (2505.21987)	Cosine sim. of dense/pruned acts	Compression, calibration
LLMs/Interpretability (Wang et al., 2024)	Cosine sim. of activation vectors	Layer specialization, domain analysis
BERT/Interpretation (Malkiel et al., 2022)	Gradient × activation saliency	Token/word attribution
Cognitive Neuroscience (Wardle et al., 2015)	Corr. similarity over neural acts	Representational similarity

The diversity of EAS instantiations reflects the underlying generality of empirical activation geometry as a unifying framework for measuring representational, functional, and semantic similarity in both artificial and biological systems.


References

Wang et al. (2024). Exploring Activation Patterns of Parameters in Language Models.

Malkiel et al. (2022). Interpreting BERT-based Text Similarity via Activation and Saliency Maps.

2505.21987 (2025). ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning.
