Semantic Instruction Set Architecture

Updated 4 July 2026

Semantic ISA is a formally specified instruction set where each instruction’s effect is a pure function on a machine state.
It uses executable operational semantics enabling machine-checking, theorem proving, and cross-level validation across various frameworks.
It integrates security guarantees and weak memory models, providing universal contracts that bridge compiler correctness with hardware refinement.

A Semantic Instruction Set Architecture (ISA) is an instruction-set architecture whose syntax and behavior are given by an executable, formally specified operational semantics rather than by informal prose; in one formulation, every instruction’s effect is a pure function on a machine state (Goel, 2017, Huyghebaert et al., 2023). In the literature, the term covers several related but distinct uses: a formal model for reasoning about x86 and RISC-V machine code, a first-class semantics reusable across testing, proof, model checking, and hardware generation, a bridge between compiler correctness and hardware refinement, a semantics for weakly consistent litmus-test languages, and a governance-oriented contract between an untrusted probabilistic component and a deterministic kernel (Bourgeat et al., 2021, Kan et al., 6 May 2026, Alglave et al., 2016, Wen et al., 20 Apr 2026). The common theme is mechanization: the ISA contract is expressed in a proof-assistant-readable or executable formalism, so that simulation, theorem proving, and cross-level validation operate over the same semantic object.

1. Definition and conceptual scope

The core idea is to replace prose manuals and pseudocode with a precise syntax, a small-step or big-step operational semantics, and, in some formulations, formal security assertions. Traditional ISA descriptions define instruction behavior in English, with pseudocode or informal rule schemas; such specifications are useful for human designers but are not machine-checked and cannot directly discharge end-to-end correctness obligations (Kan et al., 6 May 2026). A semantic ISA lifts that contract into a formal mechanized model that specifies exactly what effects an instruction may have on registers, memory, and control flow.

In the x86isa formulation, the semantic ISA is “the architecture’s formal, executable operational semantics, made precise in a theorem-proving environment so that—and because—every instruction’s effect is a pure function on a machine state” (Goel, 2017). In the Sail-based formulation, the same notion is extended beyond ordinary functional behavior to include security guarantees, so that syntax, behavior, and security assertions are all stated formally and can be mechanically checked (Huyghebaert et al., 2023).

The scope of the term is broader than conventional processor manuals. For RISC-V, a machine-checked semantics is described as the bridge between compiler correctness proofs and hardware refinement proofs (Kan et al., 6 May 2026). For LISA, the semantic ISA is specialized to weak-memory litmus tests, separating computation from the consistency policy enforced by a separate .cat specification (Alglave et al., 2016). For Arbiter-K, the term is used in a domain-specific, high-level sense: a Semantic ISA is the formal contract between the untrusted Probabilistic Processing Unit and a deterministic, rule-based kernel, with instructions such as TOOL_CALL, VERIFY, and INTERRUPT rather than conventional arithmetic or system opcodes (Wen et al., 20 Apr 2026). This suggests that “semantic ISA” denotes a style of formal architectural contract rather than a single fixed representation.

2. Formal substrates and machine-state representations

Representative semantic-ISA frameworks use markedly different mechanization substrates while preserving the same basic role: they define state, instruction meaning, and execution in an executable formal object (Goel, 2017, Bourgeat et al., 2021, Kan et al., 6 May 2026, Kwan, 25 Jul 2025, Huyghebaert et al., 2023, Alglave et al., 2016, Wen et al., 20 Apr 2026).

Framework	Representative paper	Core abstraction
ACL2 x86isa	(Goel, 2017)	abstract stobj `x86`, `x86-fetch-decode-execute`, `x86-run`
ACL2 RV32I	(Kwan, 25 Jul 2025)	stobj `rv32`, separate `decode` and `eval`
Haskell/Coq RISC-V	(Bourgeat et al., 2021)	`RiscvMachine p t` type class over a monad
Rocq ITrees RISC-V	(Kan et al., 6 May 2026)	`itree E R` with `ProcessorE` and `VMemE` events
Sail + Katamaran	(Huyghebaert et al., 2023)	definitional interpreter `fdeStep()` / `fdeCycle()`
LISA	(Alglave et al., 2016)	per-thread traces plus global `rf`
Arbiter-K	(Wen et al., 20 Apr 2026)	five logical cores and 37 abstract instructions

The state models reflect the target domain. In x86isa, the entire x86-64 machine state is packaged as an abstract stobj `x86`, with general-purpose registers, `rip`, `rflags`, segment/control/descriptor/machine-specific/floating-point/SIMD registers, byte-addressable main memory of size $2^{52}$ bytes, and model-control fields such as `ms`, `user-level-mode`, `page-structure-marking-mode`, `os-info`, `env`, and `undef`; a concrete stobj `x86$c` underlies the abstract `x86`, and the two are shown equivalent by a correspondence proof (Goel, 2017). In the ACL2 RV32I model, the stobj `rv32` contains a register file of thirty-two 32-bit words with `x0` hard-wired to zero, a 32-bit program counter, a byte-addressable memory of size $2^{32}$ bytes, and an auxiliary model-state field `ms` (Kwan, 25 Jul 2025).

Other frameworks abstract the machine through interfaces rather than a single concrete record. In the type-class approach, `RiscvMachine p t` bundles the effect monad `p`, the register-width index `t`, and abstract primitives such as `getRegister`, `setRegister`, `getPC`, `setPC`, `loadWord`, `storeWord`, and `endCycle`, so that memory and CSR behavior can be instantiated by a pure functional map, an `IO` array, a symbolic-constraint builder, or a weakest-precondition generator (Bourgeat et al., 2021). In the ITree approach, the semantic interface is event-based: `ProcessorE` covers `RegRead`, `RegWrite`, `PCRead`, `PCWrite`, `CSRRead`, and `CSRWrite`, while `VMemE` covers `VMemRead`, `VMemWrite`, and `VMemInstrFetch` (Kan et al., 6 May 2026).

A further distinction concerns syntax. Some semantic ISAs are tightly connected to concrete machine-code encodings. The RV32I ACL2 model explicitly separates a decode layer $\mathit{decode}:\{0,1\}^{32}\to\mathrm{Instr}$ from an evaluate layer $\mathit{eval}:\mathrm{Instr}\times\mathrm{State}\to\mathrm{State}$, and proves inversion theorems of the form $\forall p.\ \mathit{decode}(\mathit{encode}(p))=p$ (Kwan, 25 Jul 2025). By contrast, Arbiter-K treats its ISA at an abstract semantic layer and gives no bit-level encodings or fixed operand widths (Wen et al., 20 Apr 2026).

3. Operational semantics and execution models

Most semantic ISAs are organized around an executable fetch–decode–execute cycle. In ACL2 x86isa, the single-step function is `x86-fetch-decode-execute`, and the interpreter that runs for $n$ steps is `x86-run`; the recursive specification is

$$\mathit{run}(0,s)=s,\quad \mathit{run}(n+1,s)=\mathit{run}\bigl(n,\mathit{step}(s)\bigr),$$

with stepping halted when the model-control field indicates an error or halt (Goel, 2017). Sail adopts an analogous definitional interpreter:
fdeCycle() ≔ fdeStep(); fdeCycle() and fdeStep() ≔ let w = fetch(PC) in let i = decode(w) in execute(i) (Huyghebaert et al., 2023).
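
The fetch, decode, execute pattern and the recursive `run` specification can be sketched in a few lines of Python. This is a toy illustration only, not the ACL2 or Sail code: the `State` record, the two-nibble instruction encoding, and the opcode names are all assumptions made up for the example. The `ms` field plays the role of the model-state field that stops stepping.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

@dataclass(frozen=True)
class State:
    pc: int
    regs: Tuple[int, ...]      # toy register file (16 registers)
    mem: Tuple[int, ...]       # toy instruction memory, one word per cell
    ms: Optional[str] = None   # None means "still running"

def fetch(s: State) -> int:
    return s.mem[s.pc]

def decode(word: int) -> Tuple[str, int]:
    # Hypothetical encoding: opcode in the high nibble, operand in the low nibble.
    ops = {0x0: "halt", 0x1: "inc", 0x2: "jmp"}
    return ops.get(word >> 4, "illegal"), word & 0xF

def execute(instr: Tuple[str, int], s: State) -> State:
    op, arg = instr
    if op == "inc":   # increment register `arg`, then fall through
        regs = s.regs[:arg] + ((s.regs[arg] + 1) & 0xFFFFFFFF,) + s.regs[arg + 1:]
        return replace(s, regs=regs, pc=s.pc + 1)
    if op == "jmp":   # absolute jump to address `arg`
        return replace(s, pc=arg)
    return replace(s, ms=op)   # "halt" or "illegal": record it and stop

def step(s: State) -> State:
    """One fetch-decode-execute step; a pure function from state to state."""
    if s.ms is not None:
        return s
    if not 0 <= s.pc < len(s.mem):
        return replace(s, ms="fetch-fault")
    return execute(decode(fetch(s)), s)

def run(n: int, s: State) -> State:
    """run(0, s) = s; run(n+1, s) = run(n, step(s)); stops early once ms is set."""
    while n > 0 and s.ms is None:
        s = step(s)
        n -= 1
    return s
```

Because every function returns a new `State` instead of mutating one, the same definitions can be executed as a simulator or reasoned about equationally, which is the property the text attributes to semantic ISAs.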

Instruction clauses are then ordinary executable definitions over the chosen state interface. The x86isa exposition illustrates this with ADD r_1,r_2, whose effect updates the destination register, updates flags via updateFlags, and advances rip by the instruction size (Goel, 2017). The ACL2 RV32I model gives the same style for ADD and LW, with ADD updating regs[d] and pc := pc+4, and LW computing an effective address, reading memory with rm32, and advancing pc (Kwan, 25 Jul 2025).
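
As a rough illustration of such instruction clauses, the sketch below writes ADD and LW as pure functions over an immutable RV32I-like state. The names `RV32`, `write_reg`, `exec_add`, and `exec_lw` are hypothetical; only `rm32` echoes a name from the text. Misaligned accesses, exceptions, and the `ms` field are deliberately left out, and little-endian byte order is assumed.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple

MASK32 = 0xFFFFFFFF

@dataclass(frozen=True)
class RV32:
    regs: Tuple[int, ...]   # 32 registers; regs[0] stays 0
    pc: int
    mem: Dict[int, int]     # sparse byte memory (never mutated); absent bytes read as 0

def write_reg(s: RV32, d: int, val: int) -> RV32:
    """Register write with x0 hard-wired to zero."""
    if d == 0:
        return s
    return replace(s, regs=s.regs[:d] + (val & MASK32,) + s.regs[d + 1:])

def rm32(s: RV32, addr: int) -> int:
    """Read a little-endian 32-bit word from byte memory."""
    return sum(s.mem.get((addr + i) & MASK32, 0) << (8 * i) for i in range(4))

def sign_extend12(imm: int) -> int:
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm

def exec_add(s: RV32, d: int, r1: int, r2: int) -> RV32:
    # regs[d] := regs[r1] + regs[r2] (mod 2^32); pc := pc + 4
    s = write_reg(s, d, s.regs[r1] + s.regs[r2])
    return replace(s, pc=(s.pc + 4) & MASK32)

def exec_lw(s: RV32, d: int, base: int, imm: int) -> RV32:
    # effective address = regs[base] + sign-extended 12-bit immediate
    addr = (s.regs[base] + sign_extend12(imm)) & MASK32
    s = write_reg(s, d, rm32(s, addr))
    return replace(s, pc=(s.pc + 4) & MASK32)
```

For example, with `x1 = 5` and `x2 = 0xFFFFFFFF`, `exec_add(s, 3, 1, 2)` leaves 4 in `x3` because the sum wraps modulo $2^{32}$.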

The semantics of memory can be either direct or elaborated. In the x86 user-level model, all linear memory accesses are served by mem_read(addr,s) and mem_write(addr,val,s), with direct access to a linear-memory record; in the system-level model, these are elaborated by a page walk that updates accessed/dirty bits (Goel, 2017). In the ITree RISC-V model, memory interaction is explicit in the event type, so that exec_LOAD first reads a base register, computes a sign-extended offset, triggers VMemRead, and either writes the resulting value to a register or returns an error (Kan et al., 6 May 2026).
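
The event-based style can be imitated with a Python generator: the instruction yields events and a separate handler decides what they mean. This is only an analogy to interaction trees, not the Rocq development. The event names `RegRead`, `VMemRead`, and `RegWrite` follow the text, while `exec_load`, `interpret`, the word-addressed `mem` dictionary, and the use of `None` to signal a memory error are assumptions of the sketch.

```python
def exec_load(rd, rs1, imm12):
    """LOAD as a stream of events; imm12 is a raw 12-bit immediate (0..0xFFF)."""
    base = yield ("RegRead", rs1)
    offset = imm12 - 0x1000 if imm12 & 0x800 else imm12   # sign-extend 12 bits
    addr = (base + offset) & 0xFFFFFFFF
    value = yield ("VMemRead", addr)
    if value is None:            # the handler reported a memory error
        return "error"
    yield ("RegWrite", rd, value)
    return "ok"

def interpret(prog, regs, mem):
    """One possible handler: answer events from a concrete register list and memory dict.

    Returns (status, trace) where trace lists the events in the order they were raised.
    """
    trace = []
    try:
        event = next(prog)
        while True:
            trace.append(event)
            reply = None
            if event[0] == "RegRead":
                reply = regs[event[1]]
            elif event[0] == "VMemRead":
                reply = mem.get(event[1])      # None models a failed translation
            elif event[0] == "RegWrite":
                regs[event[1]] = event[2]
            event = prog.send(reply)
    except StopIteration as done:
        return done.value, trace
```

Swapping `interpret` for a different handler (for instance one that logs, or one that answers symbolically) changes the meaning of memory without touching `exec_load`, which is the separation the event-based formulation is after.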

A central design choice is whether the semantics is monolithic or parameterized. In the type-class RISC-V approach, an instruction such as Jalr is a monadic definition that only invokes abstract primitives like getRegister, getPC, raiseExceptionWithInfo, setRegister, and setPC; all state, alignment checks, arithmetic, and exception raising are delegated to the instance (Bourgeat et al., 2021). This makes the semantics a first-class object that can be instantiated differently within the same logic.
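
A loose Python analogue of this parameterization uses an abstract base class in place of the Haskell type class. The method names are snake_case stand-ins for the primitives named in the text, and `SimMachine` and `LoggingMachine` are hypothetical instances invented for the example. As a simplification, the address arithmetic is done on plain Python integers inside `jalr` rather than being delegated to the instance.

```python
from abc import ABC, abstractmethod

class Machine(ABC):
    """Abstract primitives that an instruction definition is allowed to use."""
    @abstractmethod
    def get_register(self, r): ...
    @abstractmethod
    def set_register(self, r, v): ...
    @abstractmethod
    def get_pc(self): ...
    @abstractmethod
    def set_pc(self, v): ...
    @abstractmethod
    def raise_exception(self, cause, info): ...

def jalr(m: Machine, rd: int, rs1: int, imm: int) -> None:
    """JALR written once, against the interface only."""
    x = m.get_register(rs1)
    pc = m.get_pc()
    new_pc = (x + imm) & 0xFFFFFFFE        # wrap to 32 bits and clear bit 0
    if new_pc % 4 != 0:
        m.raise_exception(cause=0, info=new_pc)   # misaligned target
    else:
        m.set_register(rd, (pc + 4) & 0xFFFFFFFF)
        m.set_pc(new_pc)

class SimMachine(Machine):
    """Instance 1: a plain concrete simulator."""
    def __init__(self):
        self.regs, self.pc, self.trap = [0] * 32, 0, None
    def get_register(self, r): return self.regs[r]
    def set_register(self, r, v):
        if r != 0:
            self.regs[r] = v & 0xFFFFFFFF
    def get_pc(self): return self.pc
    def set_pc(self, v): self.pc = v & 0xFFFFFFFF
    def raise_exception(self, cause, info): self.trap = (cause, info)

class LoggingMachine(SimMachine):
    """Instance 2: same behavior, but records which primitives were invoked."""
    def __init__(self):
        super().__init__()
        self.log = []
    def get_register(self, r): self.log.append("get_register"); return super().get_register(r)
    def set_register(self, r, v): self.log.append("set_register"); super().set_register(r, v)
    def get_pc(self): self.log.append("get_pc"); return super().get_pc()
    def set_pc(self, v): self.log.append("set_pc"); super().set_pc(v)
    def raise_exception(self, cause, info): self.log.append("raise_exception"); super().raise_exception(cause, info)
```

The single `jalr` definition runs unchanged on both instances; in the type-class setting the analogous instances are a simulator, a model checker, or a proof-oriented interpretation.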

LISA departs from the single-thread interpreter pattern because its target is weak memory. An anarchic execution is a pair $(\tau_{\mathrm{init}}\parallel\tau_0\parallel\dots\parallel\tau_{n-1}\parallel\tau_{\mathrm{fin}}, rf)$ in which each trace records local steps and states, while the only global component is the read-from relation $rf$ (Alglave et al., 2016). Computation and consistency are deliberately separated: one first generates anarchic executions, then filters them through a separate .cat specification.

4. Verification styles, theorem reuse, and validation

Semantic ISAs are intended not only to execute code but also to support proofs about code. In x86isa, reasoning about “run $n$ instructions” uses step-function opener rules, run-function opener rules, and sequential-composition rules such as

$$\mathit{run}(n+m,s)=\mathit{run}\bigl(m,\mathit{run}(n,s)\bigr)$$

when there is no intervening ms; the framework also supplies read-over-write and write-over-write lemmas for registers and memory (Goel, 2017). Straight-line arithmetic-intensive code can be verified by symbolic execution with GL, while loops and branching are handled with a clock function clk(s) and induction on the clock; safety-style invariants, termination, and non-interference properties are also proved in this style (Goel, 2017).
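
The shape of these lemmas can be illustrated by checking them exhaustively on a tiny model. The sketch below is a test, not a proof, and everything in it (the dictionary memory, the two-field toy `step`, the chosen address and value ranges) is an assumption for illustration. The toy state has no `ms` field, so the side condition on sequential composition holds trivially.

```python
from itertools import product

def write(mem, addr, val):
    """Functional byte write: returns a new memory and leaves `mem` untouched."""
    new = dict(mem)
    new[addr] = val & 0xFF
    return new

def read(mem, addr):
    return mem.get(addr, 0)

def step(s):
    """Toy step on a state (pc, acc): accumulate pc into acc, advance pc."""
    pc, acc = s
    return (pc + 1, (acc + pc) & 0xFF)

def run(n, s):
    for _ in range(n):
        s = step(s)
    return s

def check_lemmas(addrs=range(4), vals=(0, 1, 255)):
    mem = {0: 7, 1: 9}
    for a, b, v, w in product(addrs, addrs, vals, vals):
        # read-over-write: a read sees the write iff the addresses coincide
        assert read(write(mem, a, v), b) == (v if a == b else read(mem, b))
        # write-over-write, same address: the later write wins
        assert write(write(mem, a, v), a, w) == write(mem, a, w)
        # write-over-write, distinct addresses: the writes commute
        if a != b:
            assert write(write(mem, a, v), b, w) == write(write(mem, b, w), a, v)
    for n, m in product(range(5), range(5)):
        # sequential composition of runs
        assert run(n + m, (0, 0)) == run(m, run(n, (0, 0)))
    return True
```

In a theorem prover these equalities are stated once for all addresses, values, and step counts and then used as rewrite rules; the enumeration here only samples a small finite domain.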

The ACL2 RV32I model emphasizes automation and modularity. It proves read-over-write, write-over-write, writing-the-read, and state well-formedness theorems for the stobj accessors and updaters, and it verifies encoding/decoding functions for each RV32I instruction automatically with GL bit-blasting and rewrite rules (Kwan, 25 Jul 2025). Because decode lemmas are separated from semantic lemmas, proofs about bitfields and proofs about architectural behavior do not reopen one another’s code.
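
The inversion property $\mathit{decode}(\mathit{encode}(p))=p$ can be demonstrated for a handful of R-type instructions using the standard RV32I field layout (funct7, rs2, rs1, funct3, rd, opcode `0x33`). This sketch checks the property by brute-force enumeration rather than by GL bit-blasting, and the `RType` record and function names are assumptions for the example.

```python
from typing import NamedTuple, Optional

class RType(NamedTuple):
    op: str
    rd: int
    rs1: int
    rs2: int

OPCODE_OP = 0x33   # RV32I register-register opcode
# op -> (funct7, funct3)
FUNCT = {"add": (0x00, 0x0), "sub": (0x20, 0x0), "xor": (0x00, 0x4),
         "or": (0x00, 0x6), "and": (0x00, 0x7)}
INV_FUNCT = {v: k for k, v in FUNCT.items()}

def encode(i: RType) -> int:
    f7, f3 = FUNCT[i.op]
    return (f7 << 25) | (i.rs2 << 20) | (i.rs1 << 15) | (f3 << 12) | (i.rd << 7) | OPCODE_OP

def decode(w: int) -> Optional[RType]:
    """Return the instruction, or None if the word is not one of the modeled R-types."""
    if w & 0x7F != OPCODE_OP:
        return None
    key = ((w >> 25) & 0x7F, (w >> 12) & 0x7)
    if key not in INV_FUNCT:
        return None
    return RType(INV_FUNCT[key], (w >> 7) & 0x1F, (w >> 15) & 0x1F, (w >> 20) & 0x1F)

def check_roundtrip() -> int:
    """Check decode(encode(p)) == p for every modeled instruction; return the count."""
    count = 0
    for op in FUNCT:
        for rd in range(32):
            for rs1 in range(32):
                for rs2 in range(32):
                    p = RType(op, rd, rs1, rs2)
                    assert decode(encode(p)) == p
                    count += 1
    return count
```

Keeping `encode` and `decode` free of any reference to machine state mirrors the separation described above: bitfield reasoning never needs to look at what an instruction does.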

The type-class approach is designed for theorem reuse across tools. A single core semantics supports testing, interactive proof, and model checking of both software and hardware, and can even support a theorem relating hardware multiply to a software trap-handler implementation of multiply within one proof (Bourgeat et al., 2021). Concretely, instantiating p = IO yields a simulator running at approximately 200 MIPS that passes riscv-tests with exceptions and a Linux boot; instantiating monad transformers differently yields a multicore litmus-test model checker for RVWMO, while interpreting the same Haskell specification with Clash generates Verilog checked by riscv-formal’s Yosys/SMT flow (Bourgeat et al., 2021).

The ITree RISC-V development places particular emphasis on machine-checked instruction lemmas and cross-level proofs. It reports 131 instruction-level correctness proofs, 27 page-table/VM invariants, and 60 encode/decode lemmas (Kan et al., 6 May 2026). It further proves heterogeneous bisimulation between LLVM ITrees and RISC-V ITrees for an array-load example, validates a macro-fusion instruction reordering via eutt, and proves that a Kôika hardware ALU correctly implements all R-type integer operations against the ISA contract (Kan et al., 6 May 2026). Extraction to OCaml yields a standalone simulator, and execution of all official RISC-V tests for I, M, F, A, Zicsr on RV32 and RV64 reports 172 test binaries / 3,228 test cases total, all passed (Kan et al., 6 May 2026).

For Sail-based security verification, Katamaran symbolically executes Sail code in Coq, prunes unreachable branches, calls out to an SMT-style solver for pure side conditions, allows user-written lemmas as ghost statements, and emits residual proof goals for Coq; in the MinimalCaps case study it discharges roughly 90% of the proof for all 50+ helper functions, leaving about 120 LoC of manual Coq proof (Huyghebaert et al., 2023).

5. Security guarantees, universal contracts, and governance

A major extension of the semantic-ISA idea is the claim that the formal ISA contract should include security guarantees, not only functional behavior. In the Sail-based account, this is captured by universal contracts: Hoare triples of the form

$$\{P\}\ \mathtt{fdeCycle()}\ \{Q\},$$

where the precondition $P$ states the only authority that arbitrary untrusted code may have (Huyghebaert et al., 2023). Because fdeCycle() runs arbitrary instructions from memory, the contract must hold universally for all code images in memory, provided the initial state satisfies the precondition.

Two case studies illustrate this program. For MinimalCaps, a logical relation characterizes safe words, and the universal contract states that if the program counter and registers contain values satisfying that relation, then one cycle preserves the intended invariants (Huyghebaert et al., 2023). For a simplified RISC-V with PMP, the universal contract formalizes memory integrity by combining privilege-mode state, trap behavior, PMP entries, and a separation-logic predicate controlling access to physical addresses (Huyghebaert et al., 2023). That contract is then used to verify a femtokernel: if memory[84] = 42 initially, then after running fdeCycle() to completion, including adversarial user-mode steps and the trap handler, memory[84] = 42 (Huyghebaert et al., 2023).
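
The flavor of the femtokernel claim can be conveyed by a bounded exhaustive check on a toy machine. This is not the Sail model and not a proof: the instruction alphabet, the protected address range, and the trap handler below are all invented for the example, and the check covers only user programs up to a fixed length, whereas the real contract quantifies over every code image.

```python
from itertools import product

PROTECTED = range(80, 88)                      # hypothetical region denied to user mode
INITIAL = (0,) * 84 + (42,) + (0,) * 43        # 128 bytes with memory[84] = 42

def step(mem, instr):
    """One user-mode step. Returns (new_mem, trapped)."""
    if instr[0] == "store":
        _, addr, val = instr
        if addr in PROTECTED:
            return mem, True                   # access fault: trap without writing
        return mem[:addr] + (val,) + mem[addr + 1:], False
    if instr[0] == "ecall":
        return mem, True
    return mem, False                          # nop

def trap_handler(mem):
    """Machine-mode handler: bumps a trap counter at protected address 80, nothing else."""
    return mem[:80] + ((mem[80] + 1) & 0xFF,) + mem[81:]

def run(program, mem):
    for instr in program:
        mem, trapped = step(mem, instr)
        if trapped:
            mem = trap_handler(mem)
    return mem

def check_contract(max_len=3):
    """Check memory[84] == 42 after every user program up to max_len; return how many ran."""
    alphabet = [("store", a, v) for a in (79, 84, 90) for v in (0, 7)]
    alphabet += [("ecall",), ("nop",)]
    checked = 0
    for n in range(max_len + 1):
        for program in product(alphabet, repeat=n):
            assert run(program, INITIAL)[84] == 42
            checked += 1
    return checked
```

The handler is part of the trusted code, so the property depends on it as much as on the access check: a handler that wrote to address 84 would break the invariant even though user stores are blocked.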

Arbiter-K transfers the same structural idea to agentic execution. Its Semantic ISA reifies probabilistic messages into discrete instructions, associates each instruction with governance metadata, and allows the kernel to maintain a Security Context Registry and build an Instruction Dependency Graph at runtime (Wen et al., 20 Apr 2026). Taint propagation is defined by an explicit propagation rule, and sink interdiction blocks high-risk execution-core instructions when taint or policy checks fail (Wen et al., 20 Apr 2026). The architecture also includes VERIFY, FALLBACK, and bounded rollback to a safe checkpoint. The paper reports evaluations on OpenClaw and NanoBot in which Arbiter-K achieves 76% to 95% unsafe interception for a 92.79% absolute gain over native policies (Wen et al., 20 Apr 2026).
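
To make the general idea of dependency-based taint tracking concrete, the sketch below uses the simplest possible rule: an instruction is tainted if it ingests untrusted content or depends on a tainted instruction. This rule is an assumption chosen for illustration and is not claimed to be Arbiter-K's actual definition. `TOOL_CALL` is an instruction name from the text; `READ`, `PLAN`, the `Instr` record, and the choice of `TOOL_CALL` as the only high-risk sink are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set, Tuple

class Instr(NamedTuple):
    opcode: str
    deps: Tuple[int, ...] = ()   # ids of earlier instructions whose outputs are consumed
    untrusted: bool = False      # True if the instruction ingests unvetted external content

HIGH_RISK = {"TOOL_CALL"}        # assumed sink set for this sketch

def propagate_taint(program: Dict[int, Instr]) -> Set[int]:
    """Assumed rule: tainted(i) = untrusted(i) or any dependency of i is tainted.

    Instruction ids are assumed to follow issue order, so dependencies point backwards.
    """
    tainted: Set[int] = set()
    for iid in sorted(program):
        ins = program[iid]
        if ins.untrusted or any(d in tainted for d in ins.deps):
            tainted.add(iid)
    return tainted

def interdict(program: Dict[int, Instr]) -> List[int]:
    """Ids of high-risk instructions that the kernel would block."""
    tainted = propagate_taint(program)
    return [iid for iid in sorted(program)
            if program[iid].opcode in HIGH_RISK and iid in tainted]
```

For a program in which a `TOOL_CALL` depends, through a `PLAN` step, on an untrusted `READ`, that call is blocked, while an independent `TOOL_CALL` with no tainted ancestry is allowed through.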

The low-level and high-level cases differ sharply in representation, but both treat the ISA as the locus where authority, state transition, and enforcement become explicit. This suggests that security-oriented semantic ISAs can be formulated either as instruction-level machine semantics with separation-logic contracts or as governance-typed runtime instruction systems.

6. Weak memory, open limitations, and evolving directions

Semantic ISAs also appear in settings where the main issue is not single-core functional correctness but the structure of concurrent executions. LISA is a minimal assembly-style language for litmus tests whose semantics is analytic: the set of program behaviors is the intersection of an anarchic semantics, with no constraints on which write feeds which read, and a communication semantics specified in a separate .cat model (Alglave et al., 2016). Its candidate executions record events, program order, the read-from relation $rf$, initial writes, and final writes, so that control/register semantics and memory-consistency policy remain cleanly separated (Alglave et al., 2016). This is a different but related notion of semantic ISA: the operational core generates executions, and a second formal layer validates communication patterns.
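
The generate-then-filter structure can be sketched on the classic store-buffering litmus test (thread 0 writes x then reads y; thread 1 writes y then reads x). The code below is an illustration in Python, not LISA or the cat language: the event names are made up, and `sc_consistent` is a hand-written sequential-consistency check standing in for a .cat model.

```python
from itertools import product

# Writes: name -> (location, value). "ix"/"iy" are the initial writes.
WRITES = {"ix": ("x", 0), "iy": ("y", 0), "a": ("x", 1), "c": ("y", 1)}
READS = {"b": "y", "d": "x"}                  # read name -> location
PO = {("a", "b"), ("c", "d")}                 # program order within each thread
CO = {("ix", "a"), ("iy", "c")}               # coherence: initial write first

def anarchic_executions():
    """Every rf choice: each read may take any write to its own location."""
    names = sorted(READS)
    choices = [[w for w, (loc, _) in WRITES.items() if loc == READS[r]] for r in names]
    for pick in product(*choices):
        yield dict(zip(names, pick))          # read -> write it reads from

def acyclic(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
    state = {}                                # 1 = on the DFS stack, 2 = finished
    def visit(u):
        if state.get(u) == 1:
            return False                      # back edge: cycle found
        if state.get(u) == 2:
            return True
        state[u] = 1
        ok = all(visit(v) for v in graph.get(u, []))
        state[u] = 2
        return ok
    return all(visit(u) for u in list(graph))

def sc_consistent(rf):
    """Stand-in consistency model: po, rf, co and from-read together must be acyclic."""
    rf_edges = {(w, r) for r, w in rf.items()}
    fr = {(r, w2) for r, w in rf.items() for (w1, w2) in CO if w1 == w}
    return acyclic(PO | CO | rf_edges | fr)

def outcomes(model):
    """Sorted (value read by b, value read by d) pairs for executions the model accepts."""
    return sorted((WRITES[rf["b"]][1], WRITES[rf["d"]][1])
                  for rf in anarchic_executions() if model(rf))
```

With the permissive model `lambda rf: True` all four outcomes survive; under `sc_consistent` the outcome in which both reads return 0 is filtered out. Replacing the filter changes the memory model without touching the generator, which is the separation the analytic semantics relies on.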

Current systems also expose clear limitations. In x86isa, missing capabilities include full exceptions and asynchronous interrupts, I/O instructions such as in/out, more MSRs, xcr0, a memory hierarchy with caches and TLBs, multiprocessor coherence, automation of precondition discovery, and tighter integration with Codewalker (Goel, 2017). Planned work includes modeling exceptions by invoking descriptor tables and handler code, supporting in/out and env-based nondeterminism for device I/O, adding missing MSRs via abstract stobjs, and longer-term support for cache hierarchy, SMP, interrupt-oracle scheduling, and OS-level validation by booting a stripped-down kernel on the model (Goel, 2017).

The RISC-V literature identifies a related tension between specialization and reuse. Existing formal RISC-V specifications are described as focusing on hardware tooling rather than cross-level verification, while the ITree approach is designed specifically to support instruction-level lemmas, compiler/ISA bisimulation, and hardware conformance in a single framework (Kan et al., 6 May 2026). The type-class approach argues for a different unification principle: new features such as exceptions, floating point, atomics, weak memory, or I/O become new monad-transformer instances rather than a new DSL or translator (Bourgeat et al., 2021).

Taken together, these developments indicate that a semantic ISA is not merely a formal simulator. It is an executable semantic contract that can be instantiated as an ACL2 stobj model, a Sail definitional interpreter, a type-class-parameterized semantics, an ITree event system, an analytic weak-memory language, or a governance-typed kernel interface. Across those settings, its distinguishing function is the same: to make architectural behavior exact enough for symbolic execution, proof, validation, and cross-level reasoning.

      
        
          
  
    

References

1. Goel (2017). The x86isa Books: Features, Usage, and Future Plans.
2. Huyghebaert et al. (2023). Formalizing, Verifying and Applying ISA Security Guarantees as Universal Contracts.
3. Bourgeat et al. (2021). Flexible Instruction-Set Semantics via Type Classes.
4. Kan et al. (2026). Interaction Tree Semantics for RISC-V: Bridging Compiler and Hardware Verification.
5. Alglave et al. (2016). Syntax and analytic semantics of LISA.
6. Wen et al. (2026). From Craft to Kernel: A Governance-First Execution Architecture and Semantic ISA for Agentic Computers.
7. Kwan (2025). RV32I in ACL2.