- The paper presents an 84-format numeric catalog offering a comprehensive, vendor-neutral registry of low-precision formats for machine learning.
- It details bit-exact encode/decode vectors, cryptographic validation, and cross-validation with reference implementations like ml_dtypes.
- The work documents spec-sanctioned divergences and provides practical infrastructure for cross-vendor numerical conformance in ML.
Motivation and Context
The steady expansion of low-precision numerical formats—for example, FP8 variants (E4M3, E5M2), BF16, MXFP4, block-based microscaling, and other research-driven alternatives—has accelerated beyond the pace of interoperable, vendor-neutral, and bit-exact reference data. Model porting and validation across ML accelerators frequently encounter silent representation-layer divergences that are not minimally detectable without a canonical ruler. No comprehensive, multi-format public resource existed before this work that could verify bit-accurate conformance across diverse vendors and standardization efforts, especially in the most widely deployed numerics for machine learning.
This paper bridges that gap by constructing and disseminating two key artifacts:
- A catalog of 84 numeric formats spanning 13 structured clusters, each tracked with a uniform schema and a taxonomy—providing an explicit registry for the evolving landscape;
- A suite of six conformance packs for the most immediately relevant formats (GF16, MXFP4, BF16, FP8 E4M3, FP8 E5M2, E8M0 block), each defined by bit-exact encode/decode vectors and cryptographically anchored via SHA-256 hashes for provenance and tamper evidence.
Practical verification is grounded in cross-validation against reference implementations (notably, ml_dtypes 0.5.4 [Google/JAX]), and all divergences are tracked and reported explicitly, modeling the spec-permitted leeway in interpretation (notably, overflow handling and block-scale quantization). The entire corpus is open-licensed and published for broader adoption and audit.
Catalog Architecture and Claim Verification
The t27 catalog systematizes 84 formats into 13 clusters, such as classic IEEE-754 (binary/decimal), MLLowPrecision (including BF16, TF32, FP8 families), block-based microscaling, GoldenFloat (phi-anchored numerics), Posit/Unum III, LNS, integer/fixed-point, historical vendor types, and emerging quantized formats. This registry is strictly invariant-enforced—e.g., name uniqueness, bit-width constraints, saturation policies present, max-finite positivity, and explicit claim-status (Verified, Empirical_fit, Open_conjecture, Risk, Retracted). Each entry encodes the precise wire-level interpretation, avoiding ambiguities endemic to documentation-driven or vendor-defined references.
Technical reproducibility is ensured through a single-source JSON definition, with downstream code generation into markdown, JSON, Python (dataclasses), Rust (serde-annotated structs), C (header constants), and TypeScript (enums). A continuous integration matrix validates cross-language correctness, and generated artifacts are re-fingerprinted on every update.
Each conformance pack is a structured JSON recording, with per-row fields for input (f64, hex), corresponding bit-patterns, decoded f64, absolute error, and categorical label (zero, normal, subnormal, inf, nan, etc). Every pack is cryptographically fingerprinted (SHA-256); per-pack anchors embed the $3.0$ identity, consistent with the GoldenFloat phi-anchored identity φ2+1/φ2=3 (Vasiliev, 3 Jun 2026).
Strong empirical findings and documented gaps include:
- BF16: All 21 test vectors exactly match
ml_dtypes bfloat16 (21/21), encompassing zero, infinity, NaN (with payload), sub/normal boundaries, overflows, and round-to-nearest-even tie cases. This confirms high inter-vendor agreement for round-trip encoding (2606.09686).
- FP8 E5M2: Full alignment with
ml_dtypes (17/17), verifying all representational and special-value boundaries.
- FP8 E4M3: 15/16 exact matches. The documented divergence is in overflow behavior:
- For value $1000.0$:
- The tt-metal/AMD convention saturates at max-finite (0x7E; $448.0$);
- The JAX/TPU (
ml_dtypes) convention overflows to NaN (0x7F).
- Both behaviors are spec-permitted by OCP MX v1.0; this gap is stated explicitly and is not treated as an implementation error.
- MXFP4 element and E8M0 block: Defined and cross-validated against OCP MX specification tables; functional validation against
ml_dtypes for the latter.
- GF16: Implements phi-anchored numerics; lacking a public reference implementation in
ml_dtypes, conformance is limited to round-trip consistency.
All suites implement an honest abs_error approach: every deviation—underflow, overflow, rounding—is tracked and not suppressed to improve superficial match statistics.
Standardization and Cross-Referencing
A rigorous cross-walk aligns the six principal representation-layer packs with the v3.2.0 interim of IEEE P3109, highlighting:
- Direct format matches (e.g., Binary8p3se ↔ FP8 E4M3; Binary4p1sf ↔ MXFP4 element), with explicit differences (e.g., overflow saturation policies, block-structure in MX).
- No direct P3109 analog for GF16, BF16, and E8M0 block, which are either outside the P3109 operational scope or orthogonal (scale-only fields).
- Preparation for future extensions covering operation-layer conformance (arithmetic, rounding, comparison, and blockwise operations) as per P3109's StandardOperations.yaml and ongoing formal verification initiatives (Chang et al., 17 Feb 2026).
Documented Interpretation Gaps
The paper foregrounds two types of critical, spec-sanctioned gaps as domain-relevant for practitioners:
- FP8 E4M3 Overflow Policy: The spec allows both saturation and NaN for out-of-range inputs. This manifests as silent divergence in hardware, only discoverable through vendor-neutral bit-exact vectors. The conformance pack's role is precisely to reveal and document these gaps rather than implicitly assert uniformity.
- Block Structure Divergences (MXFP4 vs NVFP4): While implementing the same S1E2M1 element, blockwise arrangements differ (32-element E8M0 vs 16-element FP8 E4M3 scales), with nontrivial differences in scale range and quantization. This has practical implications for interop: tensor element match does not guarantee identical decoded tensors due to scale quantization and block sizing.
Limitations and Scope
The current release verifies only the representation layer—not arithmetic operations, which are slated for future tracks. Oracle cross-validation is limited to ml_dtypes; other oracles, e.g., libtakum for Posit/Unum or Pychop for arbitrary low-precision, are not yet integrated. Catalog coverage prioritizes the formats with actionable reference material; less mature or formally described types are included only as registry rows pending full vetting.
No accuracy, performance, or optimality claims are made regarding any format. The contribution is strictly infrastructural, not prescriptive. All artifacts (catalog, codegen paths, conformance vectors, manifest hashing) are fully open-source, enabling both direct reuse and third-party audits.
Implications and Future Directions
This work provides a direct, immediate impact for accelerator vendors, compiler/toolchain integrators, and researchers engaged in ML hardware/firmware validation. By providing a uniform, bit-exact, cryptographically anchored, and provenance-traceable reference for the most salient representations in contemporary training and inference, it greatly reduces the effort and ambiguity of cross-hardware numerical porting and validation.
The planned extension to operation-layer vectors, broader oracle integration, fuzz-based round-trip validation, and expansion of catalog coverage to less mainstream or newly emerging formats will increase its utility as the numeric substrate for standardized conformance testing in ML.
The explicit documentation of vendor- and spec-sanctioned divergence points models an essential practical attitude: uniformity cannot be presumed, and cross-vendor compatibility must be empirically established rather than inferred from documentation.
Conclusion
"An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors" (2606.09686) establishes a robust, reproducible, and open-licensed foundation for cross-vendor and cross-standard evaluation of low-precision floating-point and microscaling formats. By resolving long-standing gaps in the reference infrastructure for bit-level conformance, explicitly documenting spec-permitted differences, and providing a scalable schema for rapid format inclusion, it sets a new baseline for industry-wide numerical interoperability and standards compliance in machine learning. Its practical utility will compound as the catalog expands and the operation layer is incorporated, offering a model for rigorous, vendor-agnostic numeric specification in AI hardware and software engineering.