
AdaSplash: GPU-Efficient Adaptive Sparse Attention

Updated 12 June 2026
  • AdaSplash is a family of GPU-efficient adaptive sparse attention algorithms that employs the α-entmax transformation to enhance scalability and accuracy in transformers.
  • It integrates hardware-tailored kernels, specialized root-finding solvers, and bitpacked block masking to achieve superior throughput compared to prior approaches.
  • AdaSplash-2 introduces a histogram-based initialization scheme, effectively addressing both algorithmic and system challenges posed by adaptive, input-dependent sparsity.

AdaSplash is a family of GPU-efficient adaptive sparse attention algorithms for transformers, centered on high-performance implementations of the α-entmax family of attention mechanisms. AdaSplash methods address both the algorithmic and the systems challenges posed by adaptive, input-dependent sparsity in attention, and improve on prior α-entmax implementations in efficiency, scale, and integration with end-to-end transformer training. The approach combines specialized root-finding solvers, hardware-tailored kernels, bitpacked block masking, and, in AdaSplash-2, a histogram-based initialization scheme that accelerates the computation of the entmax normalizer. AdaSplash methods achieve throughput competitive with or superior to FlashAttention-2 in moderate-to-high sparsity regimes and remain on par with softmax baselines in accuracy on both short- and long-context benchmarks (Gonçalves et al., 17 Feb 2025, Gonçalves et al., 16 Apr 2026).
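As a rough illustration of what bitpacked block masking can look like, the sketch below records one bit per attention block, set when the block contains any nonzero weight, and packs those bits into bytes. It is a CPU-side NumPy toy, not the AdaSplash GPU kernel; the function names `pack_block_mask` and `unpack_block_mask`, the block sizes, and the byte-level layout are all assumptions made for this example.

```python
import numpy as np

def pack_block_mask(P, block_rows, block_cols):
    """Pack a per-block 'has any nonzero' mask of P into bytes (hypothetical layout)."""
    n_q, n_k = P.shape
    assert n_q % block_rows == 0 and n_k % block_cols == 0
    # View P as a grid of (block_rows x block_cols) tiles.
    tiles = P.reshape(n_q // block_rows, block_rows, n_k // block_cols, block_cols)
    # A tile is active if at least one of its entries is nonzero.
    active = (tiles != 0).any(axis=(1, 3))
    # One bit per tile, eight tiles per byte along the key-block axis.
    return np.packbits(active, axis=-1), active.shape[1]

def unpack_block_mask(packed, n_col_blocks):
    """Recover the boolean block mask from its packed form."""
    return np.unpackbits(packed, axis=-1, count=n_col_blocks).astype(bool)

# Toy sparse attention matrix: 4 queries x 8 keys, two nonzero weights.
P = np.zeros((4, 8))
P[0, 0] = 0.7
P[3, 7] = 0.3
packed, n_col_blocks = pack_block_mask(P, block_rows=2, block_cols=2)
mask = unpack_block_mask(packed, n_col_blocks)
print(packed.shape, mask.astype(int))
```

A kernel that consults such a mask can skip inactive blocks entirely, which is where the savings at moderate-to-high sparsity would come from; the packed form keeps the mask itself small.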

1. Mathematical Foundations: α-entmax and Sparse Attention

The α-entmax transformation is a parametric family of differentiable, input-adaptive sparse alternatives to softmax [Peters et al. 2019]. For a score vector $s\in\mathbb{R}^n$, the softmax attention weights are given by

$$
\mathrm{softmax}(s)_i = \frac{\exp(s_i)}{\sum_{j=1}^{n} \exp(s_j)}, \qquad i = 1, \dots, n.
$$
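To make the contrast between softmax and α-entmax concrete, here is a minimal NumPy sketch that computes both for one score vector. It assumes the standard α-entmax form from Peters et al. 2019, $p_i = [(\alpha-1)s_i - \tau]_+^{1/(\alpha-1)}$ with $\tau$ chosen so the weights sum to one, and finds $\tau$ by plain bisection. The function name `entmax_bisect`, the iteration count, and the use of bisection are choices made for this example and are not the AdaSplash solver.

```python
import numpy as np

def softmax(s):
    """Dense softmax weights (always strictly positive)."""
    z = np.exp(s - np.max(s))
    return z / z.sum()

def entmax_bisect(s, alpha=1.5, n_iter=50):
    """alpha-entmax of a 1-D score vector for alpha > 1, via bisection on tau."""
    s = np.asarray(s, dtype=np.float64)
    z = (alpha - 1.0) * s
    power = 1.0 / (alpha - 1.0)
    # The sum of [z_i - tau]_+^power decreases in tau, so bracket the root:
    # at lo the largest entry alone contributes 1; at hi each entry is <= 1/n.
    lo = z.max() - 1.0
    hi = z.max() - (1.0 / z.size) ** (alpha - 1.0)
    for _ in range(n_iter):
        tau = 0.5 * (lo + hi)
        total = np.sum(np.clip(z - tau, 0.0, None) ** power)
        if total >= 1.0:
            lo = tau
        else:
            hi = tau
    p = np.clip(z - 0.5 * (lo + hi), 0.0, None) ** power
    return p / p.sum()  # remove residual bisection error

scores = np.array([2.0, 1.0, -1.0, -3.0])
print("softmax:", softmax(scores))
print("entmax :", entmax_bisect(scores, alpha=1.5))
```

For these scores the two lowest entries receive exactly zero weight under 1.5-entmax, while softmax assigns every entry a positive weight. That input-dependent support is what a sparse attention kernel can exploit.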

