
Ironwood: Quantum-Resilient Protocols & AI Hardware

Updated 1 July 2026
  • Ironwood is a dual system integrating a post-quantum cryptographic protocol based on braid groups with a seventh-generation TPU for AI.
  • The Ironwood MKAAP leverages non-abelian braid group algebra to resist both classical and quantum attacks, achieving efficient key agreement on constrained platforms.
  • The TPU 7 architecture employs multi-chiplet designs and high-speed interconnects to deliver exascale AI performance with enhanced energy efficiency and fault tolerance.

Ironwood denotes two distinct systems in contemporary computing and cryptography. The first is the Ironwood Meta Key Agreement and Authentication Protocol (MKAAP), a post-quantum authentication and key agreement protocol built on braid group algebra. The second is Ironwood (TPU 7), the seventh generation of Google’s AI training supercomputers, which culminates eight years of architectural scaling in high-performance, resilient, and sustainable AI hardware. The following sections cover both: their internal methodology, core mathematical and computational innovations, performance metrics, and architectural significance in their respective domains.

1. Ironwood Meta Key Agreement and Authentication Protocol (MKAAP)

Ironwood MKAAP is an asymmetric-style protocol for mutual authentication and ephemeral key agreement, designed to withstand quantum attacks by exploiting the complexity of group-theoretic operations in braid groups rather than elliptic curve or number-theoretic primitives. It authenticates two entities, a “Home Device” (HD) and a “Device” ($D_i$), using a single pre-provisioning stage from a Trusted Third Party (TTP) and no real-time third-party interaction (Anshel et al., 2017).

System Model and Provisioning

Key parameters and entities in Ironwood include:

  • $N$: Even integer, $N \geq 10$, size of the Artin braid group $B_N$.
  • $F_q$: Finite field, $q \geq 7$.
  • $m_0$: Non-singular base matrix in $GL(N, F_q)$.
  • $C_\alpha$, $C_\gamma$: TTP-selected conjugate sets in $B_N$ whose elements pairwise commute across sets.
  • T-values $\{\tau_1, \ldots, \tau_N\}$: Non-unit elements in $F_q$, distributed per device.

Key provisioning involves the TTP sampling private matrices […] (polynomials in $m_0$) and private braids for each $D_i$, generating signed device-specific certificates […]. HD only needs to store […] and the T-values post-provisioning.

Key Agreement and Protocol Flow

The interactive protocol consists of these high-level stages:

  1. $D_i$ presents […] to HD.
  2. HD selects random matrices and braids, computes E-Multiplications via the Colored Burau representation, then blends these with the public data from $D_i$.
  3. Shared secret is computed from designated columns of resulting matrices and exchanged, with device-side verification ensuring protocol consistency and mutual authentication.
  4. Mutual authentication is finalized using hashes or MACs over the shared secret and fresh nonces.
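The final stage can be pictured with a short Python sketch. It is a minimal illustration under stated assumptions, not the protocol's specified construction: HMAC-SHA256 is one possible MAC, the 16-byte nonces and the role labels `b"home"` and `b"device"` are hypothetical choices, and a random byte string stands in for the shared secret produced in stage 3.

```python
import hashlib
import hmac
import secrets


def auth_tag(shared_secret: bytes, role: bytes,
             nonce_hd: bytes, nonce_dev: bytes) -> bytes:
    """MAC over a role label and both fresh nonces, keyed by the shared secret.

    The role label (hypothetical) makes the two directions produce
    different tags, so one side cannot simply echo the other's tag.
    """
    message = role + nonce_hd + nonce_dev
    return hmac.new(shared_secret, message, hashlib.sha256).digest()


def verify_tag(shared_secret: bytes, role: bytes,
               nonce_hd: bytes, nonce_dev: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = auth_tag(shared_secret, role, nonce_hd, nonce_dev)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    secret = secrets.token_bytes(32)     # stand-in for the agreed secret
    nonce_hd = secrets.token_bytes(16)   # fresh nonce chosen by HD
    nonce_dev = secrets.token_bytes(16)  # fresh nonce chosen by the device

    tag_dev = auth_tag(secret, b"device", nonce_hd, nonce_dev)
    tag_hd = auth_tag(secret, b"home", nonce_hd, nonce_dev)

    # Each side checks the tag it received from the other.
    print(verify_tag(secret, b"device", nonce_hd, nonce_dev, tag_dev))
    print(verify_tag(secret, b"home", nonce_hd, nonce_dev, tag_hd))
```

Because both nonces enter every tag, a tag recorded in one session does not verify in a later session with fresh nonces.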

Algebraic Backbone

The protocol’s security derives from the infinite, non-abelian, torsion-free structure of the braid groups $B_N$, their representation in terms of colored Burau matrices, and the E-Multiplication operation:

[…]

where […] substitutes T-values into the corresponding Laurent polynomial entries, and […] permutes matrix indices.
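As an illustration of E-Multiplication, the sketch below applies braid generators one at a time to a (matrix, permutation) pair over a prime field. It is a minimal sketch under stated assumptions, not the authors' implementation: the explicit colored Burau entries follow the form commonly given in the braid-group cryptography literature rather than anything stated in this text, generators are 0-indexed, $q$ is assumed prime so inverses can be taken with modular exponentiation, the T-values are assumed nonzero, and all function names are hypothetical.

```python
def colored_burau(i, sign, n, t, q):
    """Evaluated colored Burau matrix of the generator b_i^{sign} over F_q.

    i is a 0-based generator index (0 <= i <= n - 2), sign is +1 or -1,
    and t[k] is the field value already substituted for variable k.
    Assumes q is prime and every t[k] is nonzero mod q.
    """
    m = [[int(r == c) for c in range(n)] for r in range(n)]
    if sign == 1:
        if i > 0:
            m[i][i - 1] = t[i] % q
        m[i][i] = -t[i] % q
        m[i][i + 1] = 1
    else:
        inv = pow(t[i + 1], -1, q)  # modular inverse, Python 3.8+
        if i > 0:
            m[i][i - 1] = 1
        m[i][i] = -inv % q
        m[i][i + 1] = inv
    return m


def matmul(a, b, q):
    """Square matrix product reduced mod q."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) % q
             for c in range(n)] for r in range(n)]


def e_multiply(state, gen, tau, q):
    """One step (M, perm) * b_i^{sign} -> (M', perm')."""
    m, perm = state
    i, sign = gen
    n = len(perm)
    # Permute the variables by the current permutation, then substitute
    # the T-values: variable k is evaluated at tau[perm[k]].
    t = [tau[perm[k]] for k in range(n)]
    m_new = matmul(m, colored_burau(i, sign, n, t, q), q)
    # Compose with the transposition of positions i and i + 1.
    perm_new = list(perm)
    perm_new[i], perm_new[i + 1] = perm[i + 1], perm[i]
    return m_new, perm_new


def e_multiply_word(state, word, tau, q):
    """Apply a braid word, given as a list of (index, sign) pairs."""
    for gen in word:
        state = e_multiply(state, gen, tau, q)
    return state
```

In this sketch, applying a generator and then its inverse returns the starting pair, and the two sides of the braid relation $b_i b_{i+1} b_i = b_{i+1} b_i b_{i+1}$ give the same result, which is a useful sanity check on the index conventions chosen here.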

2. Security Properties and Quantum Resistance of MKAAP

Ironwood MKAAP is specifically constructed to resist both classical and quantum attacks:

  • Classical attacks such as invalid-public-key, length-based, and simultaneous conjugacy attacks are prevented because adversaries cannot obtain both conjugate sets, and state validation requires nonzero-entry checks and valid certificates.
  • Quantum resistance: Shor’s algorithm is ineffective over $B_N$ due to its non-abelian, infinite nature. Secret guessing complexity scales linearly in […], so Grover’s quantum search only provides quadratic, not exponential, improvement. For […], […], brute-force attack cost exceeds […].
  • Weak key mitigation: Probability of weak (commuting) matrices occurring is negligible: […].
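The effect of a quadratic speedup can be made concrete with a little arithmetic. The sketch below is a generic illustration of Grover's search cost, not a security estimate for Ironwood: the function names are hypothetical, and the key-space sizes passed in are arbitrary examples rather than parameters from the source.

```python
import math


def grover_queries(search_space_size: int) -> float:
    """Approximate Grover iterations to find one marked item among N.

    Uses the standard (pi / 4) * sqrt(N) estimate.
    """
    return (math.pi / 4) * math.sqrt(search_space_size)


def grover_security_bits(classical_bits: float) -> float:
    """Quantum search cost in bits for a 2**classical_bits key space.

    A quadratic speedup halves the exponent, ignoring constant factors.
    """
    return classical_bits / 2


if __name__ == "__main__":
    for bits in (128, 256):  # arbitrary example key-space sizes
        print(bits, "->", grover_security_bits(bits), "bits against Grover")
```

Halving the exponent is why doubling the size of the secret space is the usual countermeasure to Grover-style search.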

3. Implementation and Performance: MKAAP

Ironwood is engineered for efficient execution on resource-constrained platforms typical in IoT:

| Platform | Clock | ROM (bytes) | RAM (bytes) | Avg. Key-Agreement Time |
|---|---|---|---|---|
| MSP430 | 25 MHz | 3,126 | 354 | 212 ms |
| ARM Cortex-M3 (LPC1768) | 48 MHz | 2,578 | 1,192 | 37.4 ms |
| ARM Cortex-M3 (CC2650) | 48 MHz | 3,568 | 1,192 | 37.4–37.6 ms |

For comparison, Curve25519 key agreement typically requires 200–700 ms and […]8 kB code size on comparable MCUs. Ironwood therefore completes mutual authentication and shared-secret agreement in roughly 37–212 ms on the platforms above, at ROM […] kB and RAM […] kB, using quantum-resilient primitives (Anshel et al., 2017).
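The reported timings can be turned into rough cycle counts, which makes platforms with different clocks easier to compare. The sketch below uses only the clock and timing figures listed for the MSP430 and LPC1768; it assumes the listed clock was the effective clock during measurement, and the names are hypothetical.

```python
# (clock in MHz, average key-agreement time in ms), taken from the table
MEASUREMENTS = {
    "MSP430": (25, 212.0),
    "ARM Cortex-M3 (LPC1768)": (48, 37.4),
}


def approx_cycles(clock_mhz: float, time_ms: float) -> int:
    """Cycles = clock (cycles per second) * elapsed time (seconds)."""
    return round(clock_mhz * 1_000_000 * time_ms / 1_000)


if __name__ == "__main__":
    for platform, (mhz, ms) in MEASUREMENTS.items():
        print(f"{platform}: about {approx_cycles(mhz, ms):,} cycles")
```

On these figures the 16-bit MSP430 spends roughly three times as many cycles per key agreement as the Cortex-M3, so the gap in wall-clock time is not due to clock speed alone.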

4. Architectural Innovations in Google Ironwood TPU (TPU 7)

Google’s “Ironwood” denotes the seventh-generation TPU, representing the apex of a lineage focused on architectural stability, massive scaling, and efficiency for AI training applications (Jouppi et al., 14 Jun 2026). Its architecture is characterized by:

  • Multi-chiplet packaging: Two compute dies per package with eight HBM3E stacks, four per die; four times the HBM2E stacks of previous generations.
  • Enhanced TensorCores: Each with four […] BF16 arrays and four […] FP8 arrays; doubles both count and size of v5p arrays.
  • Vector and fabric scaleout: 16 vector lanes of […]-bit (was […]), each with four full ALUs (was two restricted).
  • Persistent SparseCores: Four per node, each with 16 tiles for embedding and collective operations.
  • High-speed interconnect: Six […] GB/s ICI links per node, full 3D torus at pod scale, with distributed on-chip routers.
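Six links per node is exactly what a 3D torus needs: one link in each direction along each of three axes, with wraparound at the edges. The sketch below illustrates that addressing scheme in Python. It is a generic torus model, not Google's routing logic, and the 4×4×4 dimensions in the usage lines are a hypothetical example.

```python
def torus_neighbors(coord, dims):
    """Return the six wraparound neighbors of a node in a 3D torus.

    coord is an (x, y, z) tuple and dims gives the size of each axis.
    Neighbors are listed axis by axis, negative direction first.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # wrap at the edges
            neighbors.append(tuple(n))
    return neighbors


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical example size
    for neighbor in torus_neighbors((0, 0, 0), dims):
        print(neighbor)
```

Every node has the same degree regardless of its position, which is what distinguishes a torus from a mesh with edge and corner nodes.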

The VMEM scratchpad remains compiler-managed (128 MiB), eschewing hardware caches for predictable memory operations.

5. Performance, Scaling, and Power Efficiency of Ironwood TPU

Ironwood achieves significant scaling in compute and memory bandwidth:

  • HBM memory: […] GiB/node ([…] increase vs. v2); […] GB/s/node ([…]).
  • Per-node compute performance:
    • […] PFLOPS
    • […] PFLOPS
  • Pod-scale throughput: […]; […] EFLOPS, […] EFLOPS.

Power and sustainability advances are quantitatively significant:

  • Power efficiency: […]; […] improvement over TPU v2, driven by both architectural and process scaling.
  • Carbon intensity (“CCI”): […] gCO₂e/ExaFLOP ([…] gCO₂e/FLOP), a […] (operational) and […] (embodied) improvement over TPU v4.

6. Fault Tolerance, Network Architecture, and Scalability

Ironwood incorporates advanced features for large-scale AI job reliability and deployment:

  • Optical circuit switches (OCS): Millisecond-responsive, scalable 3D MEMS-mirror OCSes support rapid topology changes, incremental upgrades, and routing around failures; a single cube of […] chips forms the network building block.
  • Functional Built-In Self-Test (FBIST): MXU-embedded PVT testers run during production, burn-in, and in situ, targeting silent fault exclusion.
  • Hardware VPU replay: Compiler-transparent, lane-randomized replay for on-the-fly detection of transient datapath errors, maintaining […]90% goodput at pod scale.

7. Defining Features of Ironwood Systems

Across both Ironwood cryptography and AI hardware, the following design features are emphasized (Anshel et al., 2017, Jouppi et al., 14 Jun 2026):

  • For MKAAP: Unique blend of asymmetric deployment properties and symmetric-like TTP bootstrapping, quantum-resilient braid group structure, and extremely low resource demands for target platforms.
  • For TPU 7: Enduring value of systolic matrix-multiply cores, narrow floating-point formats (BF16/FP8/FP4), dedicated HBM main memory, custom high-speed interconnects, DMA-managed on-chip SRAM, and vector units supporting general non-matrix operations. OCS-enabled scaling and SparseCore-accelerated embedding and collective operations are distinctive to the TPU lineage.

These systems illustrate that stability in architectural primitives, coupled with targeted advances in scale, resilience, and efficiency, enables quantum-resilient cryptography and exascale AI training with regime-leading energy and carbon efficiency (Anshel et al., 2017, Jouppi et al., 14 Jun 2026).
