Takum Format: Tapered-Precision Arithmetic

Updated 24 March 2026

Takum Format is a family of real number representations that generalizes tapered-precision floating-point arithmetic with a capped regime run-length to ensure a minimum mantissa allocation.
The format employs novel exponent regime coding which simplifies hardware design, offering flat latency and efficient resource usage compared to traditional posit or IEEE 754 methods.
Takum has demonstrated robust performance in scientific computing and mixed numeric workloads, providing enhanced accuracy and numerical stability in high-performance applications.

The Takum number format is a family of machine representations for real numbers that generalizes "tapered-precision" floating-point arithmetic. Takum arithmetic combines features of posit arithmetic and classical IEEE 754 floating-point, introducing a novel exponent regime coding that provides a bounded but very large dynamic range with a guaranteed lower bound on precision for all representable values. The key distinction is a capped regime run-length: bounding the regime guarantees a minimum mantissa allocation across the entire range, avoids the extreme loss of precision seen in posits, and simplifies hardware codec design. Takum codecs have been implemented in VHDL and show favorable latency, resource usage, binary coding efficiency, and numerical robustness in a variety of high-performance and scientific computing applications.

1. Formal Structure and Parameterization

An $n$-bit Takum codeword $T$ is structured as

$T = (S,\;D,\;R_{2:0},\;C_{r-1:0},\;M_{p-1:0})$

where

$S \in \{0,1\}$: sign bit,
$D \in \{0,1\}$: direction ("regime-sign") bit,
$R_{2:0}$: 3-bit regime pattern, interpreted with $D$ to yield the regime run-length $r \in \{0,\ldots,7\}$,
$C_{r-1:0}$: characteristic (exponent) field of length $r$,
$M_{p-1:0}$: mantissa (fraction) field of length $p = n - 5 - r$.

Detailed decoding proceeds:

For $D=1$, $r = \mathit{to\_uint}(R)$; for $D=0$, $r = \mathit{to\_uint}(\overline{R})$.
The raw characteristic is $uc = \mathit{to\_uint}(C)$.
The signed characteristic is

$c = \begin{cases} -2^{r+1} + 1 + uc & D=0 \\ 2^r - 1 + uc & D=1 \end{cases}$

The fraction is $f = \mathit{to\_uint}(M)/2^p$.

The interpreted value (linear Takum) is

$\hat\tau(T) = \begin{cases} 0 & \text{special zero pattern} \\ \mathrm{NaR} & \text{NaR pattern} \\ (-1)^S(1+f)2^e & \text{otherwise} \end{cases}$

where $e = (-1)^S(c+S)$ and $\mathrm{NaR}$ denotes "Not a Real" (Hunhold, 2024, Hunhold, 2024).

Takum can also be realized as a logarithmic number system (LNS) variant, with $\ell = (-1)^S(c+f)$ , giving the real value $\tau(T) = (-1)^S e^{\ell}$ .
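The decoding steps in this section can be sketched in Python as a literal transcription of the formulas above. This is an illustrative sketch, not the reference codec: the names `TakumFields`, `decode_fields`, and `linear_value` are hypothetical, the sketch assumes $n \ge 12$ so that every field fits in the word, and it assumes that the zero pattern is the all-zero word and the NaR pattern is a leading 1 followed by zeros (the text only refers to them as special patterns).

```python
import math
from typing import NamedTuple


class TakumFields(NamedTuple):
    S: int    # sign bit
    D: int    # direction bit
    r: int    # regime run-length, 0..7
    c: int    # signed characteristic
    p: int    # mantissa length in bits
    f: float  # fraction in [0, 1)


def decode_fields(word: int, n: int) -> TakumFields:
    """Split an n-bit codeword into its fields, following the formulas in the text."""
    if n < 12:
        raise ValueError("this sketch assumes n >= 12")
    if not 0 <= word < (1 << n):
        raise ValueError("word does not fit in n bits")
    S = (word >> (n - 1)) & 1
    D = (word >> (n - 2)) & 1
    R = (word >> (n - 5)) & 0b111
    r = R if D == 1 else (R ^ 0b111)       # to_uint(R) or to_uint(not R)
    p = n - 5 - r
    uc = (word >> p) & ((1 << r) - 1)      # raw characteristic, r bits
    if D == 1:
        c = (1 << r) - 1 + uc
    else:
        c = -(1 << (r + 1)) + 1 + uc
    f = (word & ((1 << p) - 1)) / (1 << p)
    return TakumFields(S, D, r, c, p, f)


def linear_value(word: int, n: int) -> float:
    """Linear Takum value as written in the text; NaR is mapped to float NaN."""
    if word == 0:                 # assumed zero pattern
        return 0.0
    if word == 1 << (n - 1):      # assumed NaR pattern
        return math.nan
    t = decode_fields(word, n)
    sign = -1 if t.S else 1
    e = sign * (t.c + t.S)
    return sign * (1 + t.f) * 2.0 ** e
```

For example, with $n = 16$ the words `0x3800`, `0x4000`, `0x4400`, and `0x4800` decode to 0.5, 1.0, 1.5, and 2.0 under this sketch, which also illustrates that larger codewords map to larger values for positive inputs.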

2. Regime Coding, Dynamic Range, and Tapered Precision

Unlike IEEE 754 (fixed-length exponent) or posit (unbounded regime consuming fraction bits for large exponents), Takum caps the regime run to $7$:

Exponents (characteristics) $c$ are restricted to $|c| \leq 2^8 - 1 = 255$.
The dynamic range thus saturates at $n=12$ bits: $\left| \hat{\tau} \right|_{\max} \approx 2^{254}$, $\left| \hat{\tau} \right|_{\min} \approx 2^{-255}$, with $c \in [-255,254]$.

Mantissa length $p = n - 5 - r$ is always lower-bounded: $p \ge n - 12$, so for $n > 12$ no value is left with zero mantissa bits. For small $|c|$ (i.e., around unity), $r$ is small and $p$ approaches its maximum of $n-5$ (reached at $r = 0$); for extreme exponents, $r = 7$ and $p = n-12$. The result is a tapered-precision profile similar to posit but with less severe mantissa starvation at scale (Hunhold, 2024, Hunhold, 2024).
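The range and precision figures above follow directly from the characteristic formula of Section 1 and from $p = n - 5 - r$. The short Python sketch below enumerates every $(D, r)$ pair to show this; the helper names `characteristic_range` and `mantissa_bits` are hypothetical and introduced only for illustration.

```python
def characteristic_range(D: int, r: int) -> tuple[int, int]:
    """Smallest and largest characteristic c reachable for a given (D, r)."""
    lo = (1 << r) - 1 if D == 1 else -(1 << (r + 1)) + 1   # uc = 0
    hi = lo + (1 << r) - 1                                  # uc = 2^r - 1
    return lo, hi


def mantissa_bits(n: int, r: int) -> int:
    """Mantissa length p = n - 5 - r."""
    return n - 5 - r


if __name__ == "__main__":
    n = 16
    for D in (0, 1):
        for r in range(8):
            lo, hi = characteristic_range(D, r)
            print(f"D={D} r={r}: c in [{lo}, {hi}], p = {mantissa_bits(n, r)}")
```

Taken over all sixteen $(D, r)$ pairs, the intervals tile the integers from $-255$ to $254$ exactly once, and for $n = 16$ the mantissa length runs from 11 bits at $r = 0$ down to 4 bits at $r = 7$.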

Exponent/characteristic decoding is implemented via a small set of combinatorial operations (masking, incrementing, conditional negation), and regime detection never requires scanning more than 7 bits, enabling flat decoding latency as $n$ increases.

3. Hardware Architecture and Implementation

Takum codecs have been architected for FPGA (Vivado 2024.1, Kintex US+ KCU116), with modular pipelines for both encode and decode:

Decoder stages: predecoder (extract S, D, regime window), regime detector (run counter), exponent determinator, postprocessing into LNS or FP outputs, and final flagging.
Encoder stages: under/overflow predictor, characteristic precursor, leading-one detector (LOD), rounding generator, bit packer, and override logic.

Key RTL modules include 8×1 LOD blocks, 8-gate conditional-OR biases, barrel shifters, and parallel incrementers. All modules are combinatorial and designed for single-cycle pipelines (Hunhold, 2024).
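To illustrate why the encoder needs a leading-one detector, the following Python sketch inverts the characteristic formula of Section 1: given a signed characteristic $c$, it recovers $(D, r, uc)$. It is a software analogy under stated assumptions, not the VHDL design; the function names are hypothetical, and Python's `int.bit_length` stands in for the hardware LOD.

```python
def encode_characteristic(c: int) -> tuple[int, int, int]:
    """Return (D, r, uc) such that the Section 1 formula reproduces c."""
    if not -255 <= c <= 254:
        raise ValueError("characteristic out of range")
    if c >= 0:
        D = 1
        # c + 1 lies in [2^r, 2^(r+1) - 1], so its leading one gives r.
        r = (c + 1).bit_length() - 1
        uc = c - ((1 << r) - 1)
    else:
        D = 0
        # -c lies in [2^r, 2^(r+1) - 1], so its leading one gives r.
        r = (-c).bit_length() - 1
        uc = c + (1 << (r + 1)) - 1
    return D, r, uc


def decode_characteristic(D: int, r: int, uc: int) -> int:
    """Signed characteristic from (D, r, uc), as defined in Section 1."""
    if D == 1:
        return (1 << r) - 1 + uc
    return -(1 << (r + 1)) + 1 + uc
```

Because the operand of the leading-one search is at most 8 bits wide regardless of $n$, this step does not grow with the word size, which is consistent with the 8×1 LOD blocks mentioned above.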

Measured performance:

$n$	Takum (Decoder CLB LUTs)	Takum (Decoder max latency, ns)	Posit (FloPoCo-2C LUTs)	Posit (latency, ns)
8	22 / 21	3.19 / 3.06	15	3.37
16	39 / 39	3.66 / 3.65	57	3.91
32	68 / 67	3.66 / 3.65	106	4.91
64	125 / 125	3.66 / 3.66	250	5.86

Takum decoder latency remains virtually flat as $n$ increases (about 3.66 ns from $n=16$ through $n=64$ in the table above), while LUT usage grows with $n$ but more slowly than for posit implementations, which scale less favorably due to unbounded regime detection. Takum codecs typically run 10–30% faster and use up to 50% fewer LUTs at larger widths (Hunhold, 2024).
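The relative figures can be recomputed from the table. The Python sketch below does so using the first value of each Takum pair; which decoder variant each value of a pair refers to is not stated in the table, so that choice is an assumption, and the names `ROWS` and `compare` are hypothetical.

```python
# n: (takum_luts, takum_latency_ns, posit_luts, posit_latency_ns)
# Takum entries use the first value of each "a / b" pair in the table.
ROWS = {
    8: (22, 3.19, 15, 3.37),
    16: (39, 3.66, 57, 3.91),
    32: (68, 3.66, 106, 4.91),
    64: (125, 3.66, 250, 5.86),
}


def compare(n: int) -> tuple[float, float]:
    """Return (LUT ratio Takum/posit, fractional latency reduction vs. posit)."""
    takum_luts, takum_ns, posit_luts, posit_ns = ROWS[n]
    return takum_luts / posit_luts, 1.0 - takum_ns / posit_ns


if __name__ == "__main__":
    for n in ROWS:
        lut_ratio, latency_gain = compare(n)
        print(f"n={n}: LUT ratio {lut_ratio:.2f}, latency reduction {latency_gain:.1%}")
```

On these numbers the Takum decoder is larger than the posit decoder only at $n = 8$, and at $n = 64$ it uses half the LUTs.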

4. Role in Scientific and Numerical Computing

Takum arithmetic has been evaluated in core scientific kernels including FFT-based spectral solvers, sparse linear algebra, and Krylov subspace eigenmethods, with direct comparisons to IEEE 754, posit, float16/bfloat16, and OFP8.

FFT and Spectral Methods: Takum16 consistently outperforms bfloat16 and float16 (which frequently overflow in PDE solvers), approaches posit16 numerically, and avoids the instability of posit at high dynamic range. Takum16 achieves $\mathcal{O}(10^{-3})$ to $\mathcal{O}(10^{-4})$ error in heat- and Poisson-equation solvers, and remains robust in image and audio transforms (Hunhold et al., 29 Apr 2025).
Sparse Linear Solvers: In direct solvers (LU, QR), Takum formats show no range failures at 8–16 bits, outperform both IEEE and posit in accuracy, and avoid breakdowns typical in low-precision arithmetic. In mixed-precision iterative refinement and preconditioned GMRES, Takum matches or exceeds posit performance and converges where IEEE and bfloat16 do not (Hunhold et al., 2024).
Krylov Eigensolvers: Takum yields consistently smaller relative eigenvalue and eigenvector errors than float16/32/64 and posit, especially at higher precisions. It delivers an order of magnitude improvement in stability for 32- and 64-bit types and remains viable in 8- and 16-bit low-precision settings where IEEE and posit often fail (Hunhold et al., 29 Apr 2025).

5. Integer Representation and Mixed Numeric Workloads

Rigorous analysis demonstrates that Takums represent integer values more efficiently than posits and, for $n\ge32$, match or surpass IEEE 754 formats in the largest integer up to which every integer is exactly representable. The minimal Takum encoding of an integer $m$ of bit-width $v$ uses $\approx v+\log_2v+4$ bits, compared to $v+1$ for IEEE and $1.25v$ for posit. The safe-integer range grows as $2^{n-3-\log_2(n-3)}$, already exceeding $2^{53}$ at $n=64$ (Hunhold, 2024). This property is attractive in numerically mixed scenarios (e.g., discrete indices or counters within floating-point-heavy pipelines).
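The two expressions quoted above are easy to evaluate numerically. The following Python sketch simply plugs values into them; it assumes the approximations hold as stated in the text, and the function names are hypothetical.

```python
import math


def takum_safe_integer_log2(n: int) -> float:
    """Exponent of the safe-integer range 2^(n - 3 - log2(n - 3)) from the text."""
    return (n - 3) - math.log2(n - 3)


def encoding_bits(v: int) -> dict[str, float]:
    """Approximate bits needed for a v-bit integer, per the text's estimates."""
    return {
        "takum": v + math.log2(v) + 4,
        "ieee": v + 1,
        "posit": 1.25 * v,
    }


if __name__ == "__main__":
    for n in (16, 32, 64):
        print(f"n={n}: safe integers up to about 2^{takum_safe_integer_log2(n):.2f}")
    print(encoding_bits(64))
```

For $n = 64$ the exponent evaluates to roughly 55.07, above the 53-bit significand of float64, and for $n = 32$ it is roughly 24.14, just above the 24-bit significand of float32.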

6. Architectural and ISA Integration

Replacement of diverse non-standard IEEE 754-based types with Takum (for example, in Intel AVX10.2 SIMD ISAs) enables substantial instruction-set simplification. Takum provides:

Uniform regime decoder logic for all sizes ( $n$ ), reducing decoder logic by $\approx25\%$ and microcode size by up to 40%.
One-to-one mapping of arithmetic and conversion instructions, with a flexible, parameterizable data-type family (T8, T16, T32, T64) supplanting the heterogeneous E4M3, E5M2, bfloat16, and float16 formats.
Empirically, T8 achieves $\approx90\%$ of float32 accuracy at 8-bit storage and near-identical vector throughput, but with lower code size and complexity (Hunhold, 18 Mar 2025).

7. Limitations, Trade-Offs, and Prospective Applications

The Takum regime cap ( $r \le 7$ ) sets a hard bound on dynamic range ( $2^{254}$ in linear mode, $e^{255}$ in log mode), which suffices for most scientific and ML workloads but does not reach arbitrary exponent magnitudes supported by posits as $n \to \infty$ . In LNS Takum, arithmetic operations other than multiplication/division (notably addition) require Gaussian log hardware or table-lookup, analogous to classic LNS drawbacks.
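The asymmetry between multiplication and addition in an LNS can be seen in a few lines of Python. The sketch below works on positive values only, represents each value by its logarithm $\ell$ with base $e$ as in the expression $\tau = (-1)^S e^{\ell}$ given in Section 1, and uses floating-point `math` calls where hardware would use a Gaussian-logarithm unit or lookup table. The function names are hypothetical.

```python
import math


def lns_mul(l1: float, l2: float) -> float:
    """Multiplication of two positive LNS values is an addition of logarithms."""
    return l1 + l2


def lns_add(l1: float, l2: float) -> float:
    """Addition needs the Gaussian logarithm log(1 + e^d) with d <= 0."""
    hi, lo = max(l1, l2), min(l1, l2)
    return hi + math.log1p(math.exp(lo - hi))


if __name__ == "__main__":
    a, b = math.log(3.0), math.log(5.0)
    print(math.exp(lns_mul(a, b)))  # close to 15
    print(math.exp(lns_add(a, b)))  # close to 8
```

Multiplication reduces to one fixed-point addition, whereas addition requires evaluating a nonlinear function of the difference $d$, which is the cost referred to above.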

Notably, Takum codecs guarantee a minimal mantissa width, avoid complete mantissa starvation, and provide a monotonic, injective binary representation. Primary use cases include:

Mixed-precision deep learning (enabling 8–16 bit uniform arithmetic with robust dynamic range and stable rounding).
Numerically sensitive HPC kernels (where dynamic range robustness and error analysis simplicity are necessary).
SIMD/FPGA vector pipelines and hardware streamlining (uniform opcode and register design).

Takum arithmetic combines tapered-precision efficiency, minimal loss of near-unity precision, and maximal representational regularity, supporting both advanced hardware pipelines and large-scale numerical computing (Hunhold, 2024, Hunhold, 2024, Hunhold et al., 29 Apr 2025, Hunhold et al., 2024, Hunhold, 2024, Hunhold et al., 29 Apr 2025, Hunhold, 18 Mar 2025).