Gaze Transition Entropy (GTE) Metrics
- Gaze Transition Entropy (GTE) is an information-theoretic metric that quantifies the predictability of sequential gaze transitions between predefined areas of interest.
- It is computed by estimating transition probabilities from eye-tracking data and applying conditional Shannon entropy under a first-order Markov assumption.
- Empirical studies show lower GTE in structured, goal-directed scanning while higher GTE indicates erratic, exploratory gaze behavior that may signal increased cognitive load.
Gaze Transition Entropy (GTE) is an information-theoretic metric used to quantify the predictability of sequential gaze transitions between predefined Areas of Interest (AOIs) during visual tasks. It formalizes the uncertainty associated with the location of the next fixation (Fₜ₊₁) given the current fixation (Fₜ), operationalizing a first-order Markov assumption over the scanpath sequence. Lower GTE values indicate structured, goal-directed scanning, whereas higher values reflect erratic, exploratory gaze patterns. GTE is widely used in cognitive science, human-computer interaction, and applied domains such as design evaluation and adaptive interfaces, often complementing static measures like Stationary Gaze Entropy (SGE) to provide a dynamic perspective on attentional control (Wollstadt et al., 2020; Hakiminejad et al., 5 Jan 2025).
1. Theoretical Definition and Mathematical Formalism
Gaze Transition Entropy is grounded in conditional Shannon entropy, capturing the average remaining uncertainty about $F_{t+1}$ given knowledge of $F_t$. Let $F_t$ be a random variable denoting the AOI of the fixation at time $t$, assuming values in a finite set $\mathcal{A} = \{1, \dots, n\}$. The joint transition probability $p(F_t = i, F_{t+1} = j)$ is estimated from scanpaths by tallying adjacent AOI-pair occurrences, and GTE is the corresponding conditional entropy:

$$H(F_{t+1} \mid F_t) = -\sum_{i \in \mathcal{A}} \sum_{j \in \mathcal{A}} p(F_t = i, F_{t+1} = j) \, \log p(F_{t+1} = j \mid F_t = i)$$

where $p(F_{t+1} = j \mid F_t = i) = p(F_t = i, F_{t+1} = j) / p(F_t = i)$. The double sum averages the conditional log-probability over all possible AOI transitions, and the negative sign ensures non-negativity, since every log-probability is at most zero. Conceptually, GTE provides a Markov order-1 model of scanpath uncertainty, with lower entropy indicating that the next fixation is highly predictable from the previous one, and higher entropy consistent with random exploration (Wollstadt et al., 2020).
2. Data-Driven Estimation and Workflow
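Empirical estimation of GTE from eye-tracking data proceeds by first partitioning the stimulus into $n$ non-overlapping AOIs. Each fixation is mapped to an AOI index, generating a sequence for each participant. All observed transitions from AOI $i$ to AOI $j$ are counted ($n_{ij}$), then normalized by row to obtain the first-order Markov transition probability matrix $P = [p_{ij}]$:

$$p_{ij} = \frac{n_{ij}}{\sum_{k=1}^{n} n_{ik}}$$

The stationary probability of visiting AOI $i$, estimated from the same counts, is:

$$\pi_i = \frac{\sum_{j=1}^{n} n_{ij}}{\sum_{k=1}^{n} \sum_{l=1}^{n} n_{kl}}$$

The operational formula for GTE is:

$$\mathrm{GTE} = -\sum_{i=1}^{n} \pi_i \sum_{j=1}^{n} p_{ij} \log p_{ij}$$

with terms for which $p_{ij} = 0$ contributing zero.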
GTE can be contrasted with SGE, defined as $\mathrm{SGE} = -\sum_{i=1}^{n} \pi_i \log \pi_i$, where $\pi_i$ is the stationary probability of AOI $i$. SGE characterizes instantaneous spatial dispersion without regard to sequence: it provides a univariate summary of AOI occupancy, whereas GTE encodes the sequential structure of gaze allocation (Hakiminejad et al., 5 Jan 2025).
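The estimation steps above can be sketched in Python. This is a minimal plug-in estimator, not the implementation used in the cited studies: the function names are illustrative, the logarithm base defaults to 2 (bits), consecutive fixations on the same AOI are kept as self-transitions, and the stationary distribution is taken from the row totals of the transition counts. All of these are assumptions that a real analysis may need to change.

```python
import math
from collections import Counter


def transition_counts(aoi_sequence):
    """Count n_ij: how often a fixation on AOI i is followed by one on AOI j."""
    return Counter(zip(aoi_sequence[:-1], aoi_sequence[1:]))


def row_totals(counts):
    """Sum the counts of all transitions leaving each source AOI."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return totals


def gaze_transition_entropy(aoi_sequence, base=2.0):
    """Plug-in GTE: -sum_i pi_i sum_j p_ij log p_ij (assumed base 2)."""
    counts = transition_counts(aoi_sequence)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # fewer than two fixations: no transitions to evaluate
    rows = row_totals(counts)
    gte = 0.0
    for (src, _dst), n in counts.items():
        pi_src = rows[src] / total  # estimated stationary probability of src
        p_cond = n / rows[src]      # estimated transition probability p_ij
        gte -= pi_src * p_cond * math.log(p_cond, base)
    return gte


def stationary_gaze_entropy(aoi_sequence, base=2.0):
    """Plug-in SGE: -sum_i pi_i log pi_i, using the same pi_i as the GTE."""
    counts = transition_counts(aoi_sequence)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    rows = row_totals(counts)
    return -sum((n / total) * math.log(n / total, base) for n in rows.values())


# Toy scanpath over three AOIs labelled A, B, C (made-up data).
scanpath = list("ABACABAC")
print(gaze_transition_entropy(scanpath))  # 4/7, about 0.571 bits
print(stationary_gaze_entropy(scanpath))
```

In the toy scanpath, B and C always lead back to A, so only the transitions leaving A contribute uncertainty. Plug-in estimates of this kind are biased for short sequences, so very short trials should be interpreted with care.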
3. Interpretive Scope and Limitations
GTE is a direct quantification of scanpath unpredictability under a first-order Markov assumption. Its principal limitation is the neglect of higher-order temporal dependencies: patterns or regularities involving more than the immediately preceding fixation are ignored. Empirical validation of the order-one assumption is infrequent in practical gaze studies. When scanpaths exhibit long-range regularities (e.g., cyclic revisits, context-dependent returns), GTE tends to overestimate entropy, thus underreporting actual predictability (Wollstadt et al., 2020). For instance, if three-step cyclical returns govern gaze, the conditional uncertainty computed from only pairwise transitions remains artificially high. This limitation has motivated the search for metrics incorporating longer memory or multivariate dependencies.
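The overestimation effect can be illustrated with a toy scanpath. The sketch below uses a made-up, fully deterministic cycle A, B, A, C in which the next AOI after A depends on the fixation before it; the helper `conditional_entropy` and its `order` parameter are illustrative names, and the estimator is a simple plug-in one rather than anything taken from the cited work.

```python
import math
from collections import Counter


def conditional_entropy(seq, order=1, base=2.0):
    """Plug-in estimate of H(next AOI | previous `order` AOIs)."""
    joint = Counter()    # counts of (context, next AOI)
    context = Counter()  # counts of context alone
    for t in range(order, len(seq)):
        ctx = tuple(seq[t - order:t])
        joint[(ctx, seq[t])] += 1
        context[ctx] += 1
    total = sum(context.values())
    if total == 0:
        return 0.0
    h = 0.0
    for (ctx, _nxt), n in joint.items():
        h -= (n / total) * math.log(n / context[ctx], base)
    return h


# Deterministic cycle: after A comes B or C, depending on the fixation before A.
scanpath = list("ABAC" * 50)

h1 = conditional_entropy(scanpath, order=1)  # the GTE-style, order-1 view
h2 = conditional_entropy(scanpath, order=2)  # conditioning on two past fixations

print(h1)  # about 0.5 bits: A looks like a coin flip between B and C
print(h2)  # 0 bits: two past fixations determine the next one exactly
```

The order-1 estimate reports residual uncertainty even though the sequence is perfectly predictable, which is the sense in which GTE underreports predictability when longer-range structure is present.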
4. Comparative Information-Theoretic Measures
Active Information Storage (AIS) has been proposed as a more general metric that overcomes the Markov order-one restriction of GTE. AIS quantifies the mutual information between the next fixation and a vector of past fixations, parametrized by an embedding dimension $k$:

$$\mathrm{AIS}(F_{t+1}) = I\big(F_{t+1}; \mathbf{F}_t^{(k)}\big) = H(F_{t+1}) - H\big(F_{t+1} \mid \mathbf{F}_t^{(k)}\big)$$

where $\mathbf{F}_t^{(k)} = (F_t, F_{t-1}, \dots, F_{t-k+1})$. AIS admits data-driven selection of relevant lags, using non-uniform embedding based on greedy statistical testing, thus flexibly capturing long-range dependencies as warranted by the data. Empirically, at least one higher-order lag was informative in 74% of trials, indicating that scanpaths are often non-Markovian (Wollstadt et al., 2020). Unlike GTE, which measures residual uncertainty, AIS directly measures the information contained in prior fixations about the next, thus being more sensitive to structured patterns spanning multiple transitions.
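A simplified version of this quantity can be sketched as follows. Note the substitution: the code uses a fixed, uniform embedding of the last `k` fixations with a plug-in mutual information estimate, not the non-uniform embedding with greedy statistical testing described above. The function name, the `k` parameter, and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def active_information_storage(seq, k=1, base=2.0):
    """Plug-in mutual information between the next AOI and the last k AOIs."""
    joint = Counter()  # counts of (past k AOIs, next AOI)
    past = Counter()   # counts of the past vector alone
    nxt = Counter()    # counts of the next AOI alone
    for t in range(k, len(seq)):
        ctx = tuple(seq[t - k:t])
        joint[(ctx, seq[t])] += 1
        past[ctx] += 1
        nxt[seq[t]] += 1
    total = sum(past.values())
    if total == 0:
        return 0.0
    ais = 0.0
    for (ctx, x), n in joint.items():
        # p(ctx, x) * log( p(ctx, x) / (p(ctx) * p(x)) )
        ais += (n / total) * math.log(n * total / (past[ctx] * nxt[x]), base)
    return ais


# Same kind of toy cycle as a scanpath with order-2 structure (made-up data).
scanpath = list("ABAC" * 50)

ais_1 = active_information_storage(scanpath, k=1)
ais_2 = active_information_storage(scanpath, k=2)

print(ais_1)  # roughly 1.0 bit stored in the single previous fixation
print(ais_2)  # roughly 1.5 bits once two past fixations are considered
```

Increasing `k` from 1 to 2 recovers the information that a first-order model misses. With a fixed `k`, the number of possible past vectors grows exponentially, so for real data the estimate becomes unreliable quickly; this is one motivation for data-driven lag selection.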
5. Experimental Applications and Findings
In applied research on public transportation design, GTE has been used as a core metric to evaluate the predictability of gaze behavior across varied built environments (Hakiminejad et al., 5 Jan 2025). During a controlled 10-second viewing window, participants’ fixations across manually defined AOIs were recorded, parsed, and used to compute transition matrices. GTE values were found to be lower in enhanced, biophilic, cyclist-friendly, and productivity-focused cabin designs compared to the conventional, poorly maintained baseline. Specifically, lower GTE in alternative environments was interpreted as reflecting more predictable, guided scanpaths, suggesting reduced cognitive load and more efficient allocation of attention. Conversely, higher GTE in degraded environments signaled erratic, cognitively demanding exploration. Demographic factors (e.g., ethnicity, transport habits) modulated fixation duration and related gaze metrics, but did not perturb the central finding that thoughtful design can shape gaze regularity as indexed by GTE.
6. Practical and Methodological Implications
The choice between GTE and more general measures such as AIS depends on the empirical gaze dynamics of the task. GTE, being computationally efficient and interpretable, remains suitable when scanpaths are known or assumed to be well-modeled as order-one Markov chains. For tasks or users exhibiting long-range dependencies or highly structured patterns, AIS offers a principled, data-adaptive measure of predictability. In human–machine interaction and adaptive interface contexts, fluctuations in GTE or AIS can be leveraged for real-time inference of user state (e.g., stress, overload), enabling responsive system adaptations when attention becomes disordered (Wollstadt et al., 2020). A plausible implication is that combining SGE, GTE, and AIS affords a comprehensive toolkit for both static and dynamic analysis of visual attention, with each metric capturing distinct but complementary aspects of gaze behavior.
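One way such real-time monitoring might look is a sliding-window GTE over the incoming fixation stream. The sketch below is purely illustrative: the generator `rolling_gte`, the window length, and the toy stream are assumptions, and no particular window size or alert threshold is implied by the cited work.

```python
import math
from collections import Counter, deque


def gte(aoi_sequence, base=2.0):
    """Plug-in first-order gaze transition entropy of a short AOI sequence."""
    counts = Counter(zip(aoi_sequence[:-1], aoi_sequence[1:]))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    rows = Counter()
    for (src, _dst), n in counts.items():
        rows[src] += n
    return -sum(
        (n / total) * math.log(n / rows[src], base)
        for (src, _dst), n in counts.items()
    )


def rolling_gte(fixation_stream, window=20):
    """Yield the GTE of the most recent `window` fixations as new ones arrive."""
    buffer = deque(maxlen=window)  # old fixations drop out automatically
    for aoi in fixation_stream:
        buffer.append(aoi)
        if len(buffer) == window:
            yield gte(list(buffer))


# Made-up stream: a regular A-B alternation followed by less regular scanning.
stream = list("AB" * 20) + list("ABCACB" * 5)
values = list(rolling_gte(stream, window=20))

print(values[0])   # 0 bits while the gaze alternates strictly between A and B
print(values[-1])  # higher once each AOI has more than one likely successor
```

A system could compare the latest value against a per-user baseline to decide when to adapt. Short windows make the estimate noisy, so the window length trades responsiveness against reliability.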