Fixation Sampling Algorithm

Updated 23 June 2026

Fixation Sampling Algorithms are systematic methods that extract points of focused attention from complex signals using deterministic, stochastic, and learning-based frameworks.
They are applied in saliency prediction and deep learning architectures to optimize attention mechanisms, reduce computational costs, and enhance model precision.
In fields like evolutionary dynamics, surgical planning, and VR self-calibration, these algorithms improve simulation efficiency and optimize spatial or event-based sampling.

A fixation sampling algorithm refers to any systematic procedure that outputs locations, intervals, or events corresponding to “fixations”—discrete or continuous points of attentive focus—within a signal, map, trajectory, or stochastic process. The term arises primarily in visual neuroscience, computational vision, eye-tracking technology, deep learning models for saliency prediction, stochastic evolutionary models (such as fixations of genetic types), and in procedural planning (e.g., surgical fixation paths). Modern fixation sampling algorithms span deterministic, stochastic, parametric, and learning-based frameworks, often designed to faithfully mimic empirical fixation generation or to efficiently sample trajectories, points, or event times with specific statistical properties.

1. Fixation Sampling in Saliency Prediction

In computer vision, particularly in saliency prediction, fixation sampling algorithms are used to convert dense human fixation maps into sparse, information-preserving sets and to design loss functions for neural network training. In "Learning Saliency Prediction From Sparse Fixation Pixel Map" (Xiao, 2018), the fixation sampling algorithm operates as follows:

Sparse Fixation Extraction: Given a binary fixation pixel map $F \in \{0,1\}^{H' \times W'}$ (obtained by thresholding human gaze heatmaps), the algorithm identifies all fixation points $\{\mathbf{x}_i\}_{i=1}^N$ with $F(\mathbf{x}_i)=1$.
Clustering: A core step employs hierarchical agglomerative clustering (Ward linkage, Euclidean affinity) to partition $\{\mathbf{x}_i\}$ into $K \ll N$ clusters. At each merge, Ward linkage joins the pair of clusters $A$, $B$ whose union gives the smallest increase in the within-cluster sum of squares:

$\Delta(A,B) = \frac{|A||B|}{|A|+|B|} \|\mu_A - \mu_B\|^2$

where $\mu_A$ and $\mu_B$ are the centroids of clusters $A$ and $B$.

Parameter Tuning: The number of clusters $K$ is determined by sweeping $K$ and evaluating the information loss relative to the ground-truth Gaussian-blurred saliency map, using metrics such as AUC-Judd, NSS, Similarity, and KLD. On SALICON, the sweep selects the $K$ that yields the best trade-off.
Sparse Map Formation: Each cluster center is rounded to the nearest pixel $\hat{\mathbf{x}}_k$, and a sparse map $S$ sets $S(\hat{\mathbf{x}}_k)=1$ and zero elsewhere, defining the sparse “fixation” output.
Max-Pooling Postprocessing: To address spatial misalignment between predicted and ground-truth fixations, a max-pooling operation is applied to predictions before loss computation, so that near-misses do not incur severe loss penalties.
Loss Function: The final training loss is a pooling Kullback–Leibler divergence:

$L_{\mathrm{KL}} = \sum_{\mathbf{x}} \tilde{S}(\mathbf{x}) \log \frac{\tilde{S}(\mathbf{x})}{\tilde{P}(\mathbf{x})}$

where $\tilde{S}$ is the normalized sparse fixation map and $\tilde{P}$ is the pooled prediction.
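The extraction, clustering, and sparse-map steps can be illustrated with a small standard-library sketch. It is not the implementation from Xiao (2018): the function names `ward_cost` and `sparsify_fixations` are hypothetical, the merge search is a naive quadratic scan, and Python's built-in `round` stands in for the rounding rule.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge.

    Each cluster is a pair (size, (row_centroid, col_centroid)).
    """
    (na, ca), (nb, cb) = a, b
    d2 = (ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2
    return na * nb / (na + nb) * d2


def sparsify_fixations(fix_map, k):
    """Reduce a binary fixation map to at most k cluster-centre pixels."""
    # Start with one singleton cluster per fixation pixel.
    clusters = [(1, (float(r), float(c)))
                for r, row in enumerate(fix_map)
                for c, v in enumerate(row) if v == 1]
    while len(clusters) > k:
        # Naive search for the cheapest merge under the Ward criterion.
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: ward_cost(clusters[p[0]], clusters[p[1]]))
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        n = na + nb
        merged = (n, ((na * ca[0] + nb * cb[0]) / n,
                      (na * ca[1] + nb * cb[1]) / n))
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append(merged)
    # Round each centroid to a pixel and mark it in the sparse map.
    sparse = [[0] * len(fix_map[0]) for _ in fix_map]
    for _, (r, c) in clusters:
        sparse[int(round(r))][int(round(c))] = 1
    return sparse
```

For realistic map sizes a library implementation of Ward clustering would be the practical choice; the loop above only makes the merge criterion explicit.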

This clustering-based sampling approach is foundational for learning from sparse, high-fidelity representations of human eye fixations (Xiao, 2018).
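The max-pooling and loss steps of this section can be sketched in the same spirit. The window size `k=3`, the stride of 1, the `eps` smoothing, and the function names are all assumptions made for illustration, not values taken from the paper.

```python
import math


def max_pool_same(pred, k=3):
    """k x k max-pooling with stride 1; windows are clipped at the border."""
    h, w, r = len(pred), len(pred[0]), k // 2
    return [[max(pred[i][j]
                 for i in range(max(0, y - r), min(h, y + r + 1))
                 for j in range(max(0, x - r), min(w, x + r + 1)))
             for x in range(w)] for y in range(h)]


def normalize(m, eps=1e-12):
    """Scale a non-negative map so that its entries sum to (almost) one."""
    total = sum(map(sum, m)) + eps
    return [[v / total for v in row] for row in m]


def pooled_kl(sparse_fix, pred, k=3, eps=1e-12):
    """KL divergence from the normalized sparse map to the pooled prediction."""
    q = normalize(sparse_fix)
    p = normalize(max_pool_same(pred, k))
    return sum(qv * math.log(qv / (pv + eps))
               for q_row, p_row in zip(q, p)
               for qv, pv in zip(q_row, p_row) if qv > 0)
```

With a single target pixel and a prediction peak one pixel away, `pooled_kl(..., k=3)` is far smaller than the unpooled case `k=1`, which is the near-miss tolerance the pooling step is meant to provide.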

2. Fixation Sampling in Deep Learning Architectures

Recent neural architectures integrate explicit fixation sampling modules to improve computational efficiency and focus model capacity on informative regions.

Task-Driven Fixation Network (TDFN): In "Task-Driven Fixation Network: An Efficient Architecture with Fixation Selection" (Wang et al., 2 Jan 2025), fixation sampling is implemented as a module that outputs a sequence of attention points. These points drive high-resolution cropping and feature extraction in a two-channel transformer architecture, fusing low-resolution global and local high-resolution features.
- Fixation Point Generator: This component, parameterized by a two-layer MLP, produces a categorical distribution over discretized image locations at each step and samples the next fixation point from it. The distribution is computed from the internal recon token produced by multi-modal fusion.
- Training: The generator is initially decoupled (fixations sampled uniformly at random); it is subsequently fine-tuned using policy gradients (REINFORCE), with the reward signal equaling the task loss reduction after each fixation.
- Complexity: Fixation sampling reduces self-attention costs by an order of magnitude compared to exhaustive high-resolution global processing (Wang et al., 2 Jan 2025).

This direct fixation sampling design, driven by downstream task objectives and learned via reinforcement learning, formalizes and automates fixation choice in neural attention models.
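A minimal sketch of the policy-gradient idea follows, under strong simplifications: the policy is a free vector of logits over grid cells rather than an MLP conditioned on a recon token, the running-mean baseline is an added variance-reduction choice, and `reward_fn` is a hypothetical stand-in for the task-loss reduction obtained by fixating a cell.

```python
import math
import random


def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(probs, rng):
    """Draw an index from a categorical distribution by inverse CDF."""
    u, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1


def train_fixation_policy(reward_fn, n_cells, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * n_cells   # uniform start, like random initial fixations
    baseline = 0.0             # running mean reward (assumed, not from source)
    for _ in range(steps):
        probs = softmax(logits)
        a = sample_categorical(probs, rng)
        reward = reward_fn(a)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # REINFORCE: d log pi(a) / d logit_i = 1[i == a] - pi_i
        for i in range(n_cells):
            logits[i] += lr * advantage * ((1.0 if i == a else 0.0) - probs[i])
    return softmax(logits)
```

For example, `train_fixation_policy(lambda a: 1.0 if a == 5 else 0.0, 9)` concentrates probability on cell 5 of a 3 x 3 grid, the only cell whose fixation is rewarded.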

3. Fixation Sampling in Stochastic Processes

In stochastic modeling—particularly population genetics and evolutionary dynamics—the "fixation sampling algorithm" refers to procedures for sampling fixation times or events in Markov processes.

Moran Process and Effective Step Sampling: In "Faster Monte-Carlo Algorithms for Fixation Probability of the Moran Process on Undirected Graphs" (Chatterjee et al., 2017), an effective-step sampling procedure greatly accelerates simulation of fixation events by skipping ineffective steps (those that do not change system state).
- Algorithm: The simulation advances only at effective steps, that is, when a transition occurs (a node flips); these steps are identified via degree- and type-based data structures. Each effective step can be sampled efficiently, and the expected number of effective steps to absorption is bounded, with a tight bound.
- Monte Carlo Implementation: Repeated meta-simulations, with effective-step truncation, yield a Fully Polynomial Randomized Approximation Scheme (FPRAS) for the fixation probability, with substantial speedup over classical methods (Chatterjee et al., 2017).
Sampling the Fixation Time Distribution—Birth-Death Process: In "When the mean is not enough: Calculating fixation time distributions in birth-death processes" (Ashcroft et al., 2015), fixation sampling is framed as a phase-type process:
- Eigenspace Algorithm: The transient generator matrix is spectrally decomposed; waiting times for transitions are sampled from exponentials determined by the eigenvalues, with explicit transition probabilities. After a one-time preprocessing phase, this yields exact samples from the fixation time distribution at a per-sample cost that dramatically outperforms direct trajectory simulation (Ashcroft et al., 2015).

These algorithms, by focusing sampling effort solely on informative steps or by reparameterizing the process in spectral space, enable exact statistics or efficient rare-event simulation for fixation-related quantities.
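The effective-step idea can be sketched for a birth-death Moran process on a connected undirected graph. This is an illustration only: it rescans every edge at each step instead of using the degree- and type-based data structures of Chatterjee et al. (2017), it omits their truncation and error analysis, and the names `adj`, `r`, and `fixation_probability` are hypothetical.

```python
import random


def effective_step(adj, mutant, r, rng):
    """Sample one state-changing step of the Moran process and apply it.

    A step (u, v) means u reproduces onto neighbour v. Conditioned on the
    step being effective (u and v differ in type), its probability is
    proportional to fitness(u) / degree(u).
    """
    moves, weights = [], []
    for u, nbrs in adj.items():
        fitness = r if u in mutant else 1.0
        for v in nbrs:
            if (u in mutant) != (v in mutant):
                moves.append((u, v))
                weights.append(fitness / len(nbrs))
    u, v = rng.choices(moves, weights=weights)[0]
    if u in mutant:
        mutant.add(v)       # mutant offspring replaces a resident
    else:
        mutant.discard(v)   # resident offspring replaces a mutant


def fixation_probability(adj, r, trials=4000, seed=0):
    """Monte Carlo estimate from a uniformly random initial mutant."""
    rng = random.Random(seed)
    nodes = list(adj)
    fixed = 0
    for _ in range(trials):
        mutant = {rng.choice(nodes)}
        while 0 < len(mutant) < len(nodes):
            effective_step(adj, mutant, r, rng)
        fixed += len(mutant) == len(nodes)
    return fixed / trials
```

On a complete graph the estimate can be checked against the well-mixed closed form $(1 - 1/r)/(1 - r^{-n})$, which is a convenient sanity test for the sampling weights.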

4. Fixation Sampling in Cognitive and Scanpath Models

Generative fixation sampling is central to models of eye-movement behavior, particularly in reading or scene viewing.

Eyettention II (Autoregressive Scanpath Sampling): "Eyettention II: A Dual-Sequence Architecture for Modeling Fixation Location, Within-Word Landing Position, and Fixation Duration in Reading" (Deng et al., 1 Jun 2026) details a fixation sampling algorithm that outputs triplets $(k, l, d)$: word index, landing position, and duration. The model is structured as follows:
- Bi-Encoder Architecture: Separate encoders for word sequences (contextual embeddings via frozen transformers plus BiLSTM) and fixation sequences (stacked LSTM). The previous fixation state attends via Gaussian-smoothed cross-attention over a window of neighboring words.
- Autoregressive Sampling: At each iteration, the algorithm (a) softmax-samples the next word index $k$, (b) regresses the landing point $l$, and (c) predicts the duration $d$, updating the fixation state history for the next step.
- Training and Loss Function: The combined NLL, position MSE, and duration MSE are minimized via teacher forcing, ensuring the sampling distribution is closely matched to empirical human scanpaths.
- Cognitive Alignment: The cross-attention mechanism with local Gaussian window models perceptual foveation and saccadic behavior, paralleling cognitive models such as SWIFT and E-Z Reader (Deng et al., 1 Jun 2026).

These implementations demonstrate the use of fixation sampling as both a generative and discriminative process, systematically yielding biologically plausible or statistically matched fixation sequences for downstream applications in psycholinguistics and human-AI interaction.
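The autoregressive loop over (word index, landing position, duration) can be sketched as a toy sampler. Nothing here reproduces Eyettention II: the Gaussian-window logits stand in for learned cross-attention scores, and the landing-position and duration rules are arbitrary placeholders for the model's regression heads. All names and constants are hypothetical.

```python
import math
import random


def sample_scanpath(word_lengths, n_fix, sigma=1.5, seed=0):
    """Toy autoregressive sampler returning (k, l, d) triplets."""
    rng = random.Random(seed)
    n = len(word_lengths)
    k, path = -1, []   # k = -1 so the first window is centred on word 0
    for _ in range(n_fix):
        # (a) Softmax-sample the next word index from a Gaussian window
        #     centred one word ahead of the current fixation.
        logits = [-((j - (k + 1)) ** 2) / (2 * sigma ** 2) for j in range(n)]
        top = max(logits)
        weights = [math.exp(v - top) for v in logits]
        k = rng.choices(range(n), weights=weights)[0]
        # (b) Landing position in characters: placeholder rule.
        l = 0.4 * word_lengths[k]
        # (c) Duration in milliseconds: placeholder rule.
        d = 180.0 + 12.0 * word_lengths[k]
        path.append((k, l, d))
    return path
```

Because each step conditions on the previously sampled `k`, the output shows forward saccades, refixations, and occasional regressions, which is the qualitative behavior an autoregressive scanpath sampler is meant to capture.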

5. Fixation Sampling in Self-Calibration and Anatomical Planning

Fixation sampling algorithms also underpin calibration and planning methodologies where the identification and evaluation of fixation-like locations are central.

VR Eye-Tracking Self-Calibration: In "Fixation-based Self-calibration for Eye Tracking in VR Headsets" (Uramune et al., 2023), a 3D extension of the I-VDT (velocity and dispersion threshold identification) algorithm is used.
- Algorithm: Fixations are robustly detected in noisy, head-motion-compensated 3D gaze data by simultaneous velocity (gaze stability) and projected-dispersion (tightness of point-of-regard clusters) criteria. Aggregated fixations drive a global calibration procedure by minimizing the mean-squared projection error (dispersion) with respect to the calibration parameters.
- Optimization: The minimization is performed using differential evolution, subdividing the parameter space for robustness.
Surgical Planning—Arc Fixation Sampling: In "An Internal Arc Fixation Channel and Automatic Planning Algorithm for Pelvic Fracture" (Yang et al., 2021), candidate internal arc fixation channels are sampled across a discretized product of feasible points on the bone surface. Each arc is constructed by fitting a circle through candidate entry, middle, and exit points, then sampled along its length for safety evaluation (clearance from cortex, anatomical constraints). The optimal arc is selected via a multi-stage safety filter and optimization over the candidate sample space.
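The geometric core of the arc construction, a circle through three points sampled along its length, can be sketched as follows. The circumcenter formula is standard; the function names, the number of samples `n`, and the `clearance_fn` callback (a stand-in for a distance-to-cortex query) are assumptions and not part of the planning algorithm of Yang et al. (2021).

```python
import math


def sub(p, q): return tuple(a - b for a, b in zip(p, q))
def add(p, q): return tuple(a + b for a, b in zip(p, q))
def scale(p, s): return tuple(a * s for a in p)
def dot(p, q): return sum(a * b for a, b in zip(p, q))
def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def arc_through(entry, middle, exit_, n=21):
    """Sample n points on the circular arc entry -> middle -> exit_ (3D)."""
    a, b = sub(entry, exit_), sub(middle, exit_)
    axb = cross(a, b)
    denom = 2.0 * dot(axb, axb)
    if denom < 1e-12:
        raise ValueError("points are collinear; no unique circle")
    # Circumcenter of the triangle (entry, middle, exit_).
    offset = cross(sub(scale(b, dot(a, a)), scale(a, dot(b, b))), axb)
    center = add(exit_, scale(offset, 1.0 / denom))
    radius = math.sqrt(dot(sub(entry, center), sub(entry, center)))
    # Orthonormal basis of the circle's plane, with e1 pointing at entry.
    e1 = scale(sub(entry, center), 1.0 / radius)
    normal = scale(axb, 1.0 / math.sqrt(dot(axb, axb)))
    e2 = cross(normal, e1)

    def angle(p):
        d = sub(p, center)
        return math.atan2(dot(d, e2), dot(d, e1)) % (2 * math.pi)

    t_mid, t_exit = angle(middle), angle(exit_)
    if t_mid > t_exit:          # go the other way round so we pass middle
        t_exit -= 2 * math.pi
    return [add(center, add(scale(e1, radius * math.cos(t)),
                            scale(e2, radius * math.sin(t))))
            for t in (t_exit * i / (n - 1) for i in range(n))]


def arc_is_safe(points, clearance_fn, min_clearance):
    """True if every sampled point keeps the required clearance."""
    return all(clearance_fn(p) >= min_clearance for p in points)
```

A planner in this style would enumerate candidate (entry, middle, exit) triples, discard arcs for which `arc_is_safe` fails, and rank the survivors by an objective of its choosing.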

These approaches repurpose fixation sampling to robustly aggregate, select, or optimize spatial locations under physical, anatomical, or geometric constraints.
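The combined velocity and dispersion criteria used for fixation detection can be illustrated with a simplified 2D sketch. It is not the 3D, head-motion-compensated variant of Uramune et al. (2023): gaze is assumed to be given as `(t, x, y)` samples, each low-velocity run is tested as a whole rather than with a moving dispersion window, and the threshold names are hypothetical.

```python
import math


def _close(run, disp_thresh, min_dur, out):
    """Emit the run as a fixation if it is long enough and tight enough."""
    if len(run) < 2 or run[-1][0] - run[0][0] < min_dur:
        return
    xs = [p[1] for p in run]
    ys = [p[2] for p in run]
    # Dispersion measured as horizontal extent plus vertical extent.
    if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= disp_thresh:
        out.append((run[0][0], run[-1][0],
                    sum(xs) / len(xs), sum(ys) / len(ys)))


def detect_fixations(samples, vel_thresh, disp_thresh, min_dur):
    """samples: list of (t, x, y). Returns (t_start, t_end, cx, cy) tuples."""
    out, run = [], [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        speed = math.hypot(cur[1] - prev[1], cur[2] - prev[2]) / (cur[0] - prev[0])
        if speed < vel_thresh:
            run.append(cur)          # still within a low-velocity run
        else:
            _close(run, disp_thresh, min_dur, out)
            run = [cur]              # sample after a fast movement starts a new run
    _close(run, disp_thresh, min_dur, out)
    return out
```

The two thresholds play different roles: the velocity test splits the recording at saccades, while the dispersion test rejects slow drifts that pass the velocity test but are not stable fixations.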

6. Fixation Sampling in Structured Population Models

In mathematical population genetics, sampling from trajectories or genealogies conditioned on fixation events is central.

$\Lambda$-Seed-Bank-Wright-Fisher Process Conditioned on Fixation: The process in (Fittipaldi et al., 25 Nov 2025) describes allele frequencies in populations with skewed offspring distributions and dormancy. Conditioning on fixation is performed via a Doob $h$-transform, altering the dynamics to force eventual absorption at a specified state.
- Lookdown Sampling: The lookdown construction provides an explicit sampling algorithm for trajectories conditioned on eventual fixation. The system is simulated as an event-driven particle process, with coordinated mutation events and a random switching environment driving the fixation (Fittipaldi et al., 25 Nov 2025).
- Duality and Coalescence: Backward-in-time sampling (the dual) provides efficient calculation of sampling probabilities and moments for the fixation-conditioned process.

Fixation sampling in these structured models often leverages advanced probabilistic and algorithmic techniques to handle pathwise conditioning and rare-event simulation.
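As a worked illustration of conditioning by a Doob $h$-transform, consider the classical neutral Wright-Fisher diffusion rather than the seed-bank model of the cited work. Its generator $\mathcal{L}$ acts on allele frequency $x \in [0,1]$, and the fixation probability $h(x) = x$ is harmonic ($\mathcal{L}h = 0$). The transform $\mathcal{L}^{h} f = h^{-1}\mathcal{L}(hf)$ then gives the generator of the process conditioned on fixation:

```latex
\mathcal{L} f(x) = \tfrac{1}{2}\, x(1-x)\, f''(x), \qquad h(x) = x,
\\[4pt]
\mathcal{L}^{h} f(x) = \frac{1}{h(x)}\, \mathcal{L}(h f)(x)
  = \frac{1}{x} \cdot \tfrac{1}{2}\, x(1-x)\, \bigl( 2 f'(x) + x f''(x) \bigr)
  = \tfrac{1}{2}\, x(1-x)\, f''(x) + (1-x)\, f'(x).
```

Conditioning leaves the diffusion coefficient unchanged and adds the drift $(1-x)$, which pushes the frequency away from loss at $0$ and toward fixation at $1$.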

7. Summary Table: Fixation Sampling Algorithmic Contexts

Research Area	Fixation Sampling Role	Representative Reference
Saliency/vision DNN	Clustered map sparsification	(Xiao, 2018)
Task-driven neural attention	Learned gaze selection points	(Wang et al., 2 Jan 2025)
Evolutionary dynamics	Efficient event simulation	(Chatterjee et al., 2017, Ashcroft et al., 2015)
Cognitive scanpath modeling	Generative sampling of (k, l, d) triplets	(Deng et al., 1 Jun 2026)
VR eye-tracking/self-calibration	Fixation aggregation for calibration	(Uramune et al., 2023)
Surgical planning	Feasible channel sampling	(Yang et al., 2021)
Population genetics (fixation)	Conditioned path/genealogy	(Fittipaldi et al., 25 Nov 2025)

Each context adapts fixation sampling algorithms to the structure, demands, and scientific goals of the domain, ranging from efficient simulation and learning to robust medical planning and behavioral data synthesis.