
Deegen: A JIT-Capable VM Generator for Dynamic Languages (2411.11469v2)

Published 18 Nov 2024 in cs.PL

Abstract: Building a high-performance JIT-capable VM for a dynamic language has traditionally required a tremendous amount of time, money, and expertise. We present Deegen, a meta-compiler that allows users to generate a high-performance JIT-capable VM for their own language at an engineering cost similar to writing a simple interpreter. Deegen takes in the execution semantics of the bytecodes implemented as C++ functions, and automatically generates a two-tier VM execution engine with a state-of-the-art interpreter, a state-of-the-art baseline JIT, and the tier-switching logic that connects them into a self-adaptive system. We are the first to demonstrate the automatic generation of a JIT compiler, and the automatic generation of an interpreter that outperforms the state of the art. Our performance comes from a long list of optimizations supported by Deegen, including bytecode specialization and quickening, register pinning, tag register optimization, call inline caching, generic inline caching, JIT polymorphic IC, JIT IC inline slab, type-check removal and strength reduction, type-based slow-path extraction and outlining, JIT hot-cold code splitting, and JIT OSR-entry. These optimizations are either employed automatically, or guided by the language implementer through intuitive APIs. As a result, the disassembly of the Deegen-generated interpreter, baseline JIT, and the generated JIT code rivals the assembly code hand-written by experts in state-of-the-art VMs. We implement LuaJIT Remake (LJR), a standard-compliant Lua 5.1 VM, using Deegen. Across 44 benchmarks, LJR's interpreter is on average 179% faster than the official PUC Lua interpreter, and 31% faster than LuaJIT's interpreter. LJR's baseline JIT has negligible startup delay, and its execution performance is on average 360% faster than PUC Lua and only 33% slower (but faster on 13/44 benchmarks) than LuaJIT's optimizing JIT.

Summary

  • The paper presents a meta-compiler that automatically generates both a high-performance interpreter and a baseline JIT for dynamic languages.
  • It details a novel two-tier VM architecture that integrates continuation-passing style and register-pinning techniques for efficient state management.
  • Benchmark results on a Lua 5.1 VM built with Deegen show the baseline JIT running on average 360% faster than the PUC Lua interpreter, supporting its practical value for VM development.

An Examination of Deegen: A Meta-Compiler for High-Performance VM Generation

The paper "Deegen: A JIT-Capable VM Generator for Dynamic Languages" addresses a long-standing problem in programming language implementation: building a high-performance, just-in-time (JIT) capable virtual machine (VM) for a dynamic language. Traditionally this has demanded a substantial investment of time, money, and specialized expertise. The authors, Haoran Xu and Fredrik Kjolstad, propose Deegen, a meta-compiler that lets language implementers generate such a VM at an engineering cost similar to writing a simple interpreter.

Overview of Deegen's Architecture

Deegen takes the execution semantics of each bytecode, written as C++ functions, and generates a two-tier VM execution engine: a high-performance interpreter, a baseline JIT, and the tier-switching logic that connects them into a self-adaptive system. The two tiers are intended to give good performance for both short-running and long-running workloads, and Deegen automatically applies many optimizations that are hand-crafted in traditional VMs.
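To make the idea of tier-switching logic concrete, here is a minimal sketch of a counter-based policy, a common way to decide when a function is hot enough to leave the interpreter. This is not Deegen's actual mechanism; the names `FunctionProfile`, `on_function_entry`, and the threshold value are assumptions made purely for illustration.

```cpp
#include <cstdint>

// Hypothetical two-tier model: every function starts in the interpreter.
enum class Tier { Interpreter, BaselineJit };

struct FunctionProfile {
    uint32_t hotness = 0;            // number of interpreted entries so far
    Tier tier = Tier::Interpreter;   // tier the function currently runs in
};

// Illustrative threshold only; a real VM would tune this value.
constexpr uint32_t kTierUpThreshold = 3;

// Called on every function entry. Returns the tier the call should run in.
Tier on_function_entry(FunctionProfile& f) {
    if (f.tier == Tier::Interpreter && ++f.hotness >= kTierUpThreshold) {
        // A real VM would invoke the baseline JIT here and install the code.
        f.tier = Tier::BaselineJit;
    }
    return f.tier;
}
```

With this threshold, the first two calls to a function stay in the interpreter and the third call onward runs in the baseline tier. Cold code never pays any compilation cost, which is one way a two-tier design can serve both short-running and long-running workloads.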

One of the paper's key contributions is demonstrating that both an interpreter and a JIT compiler can be generated automatically at a quality that rivals hand-written, state-of-the-art systems. The generated interpreter uses continuation-passing style, with each bytecode handler tail-calling the next, and register pinning via the GHC calling convention, so that VM state is always passed in fixed CPU registers. For the JIT, the authors use an improved Copy-and-Patch technique, which allows fast, lightweight code generation.
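The continuation-passing interpreter style can be sketched as follows: each bytecode handler finishes by tail-calling the handler of the next instruction, and all VM state travels as function arguments rather than through memory. This is a hand-written toy, not Deegen output. The three opcodes, the `Insn` layout, and the `run` helper are hypothetical, and the standard C++ calling convention stands in for the GHC convention the paper describes. The `[[clang::musttail]]` attribute is Clang-specific; on other compilers the macro expands to nothing and the tail call becomes an optimization rather than a guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

#if defined(__clang__)
#define MUSTTAIL [[clang::musttail]]  // guaranteed tail call on Clang
#else
#define MUSTTAIL                      // best effort elsewhere
#endif

// Hypothetical toy bytecode, for illustration only.
enum Op : uint8_t { OP_ADDI, OP_DEC_JNZ, OP_HALT };
struct Insn { Op op; int32_t arg; };

// All VM state (pc, accumulator, loop counter) is passed as arguments,
// so the compiler can keep it in registers across handlers.
using Handler = int64_t (*)(const Insn* pc, int64_t acc, int64_t counter);

static int64_t op_addi(const Insn* pc, int64_t acc, int64_t counter);
static int64_t op_dec_jnz(const Insn* pc, int64_t acc, int64_t counter);
static int64_t op_halt(const Insn* pc, int64_t acc, int64_t counter);

static const Handler kDispatch[] = {op_addi, op_dec_jnz, op_halt};

// The "continuation": jump straight to the next instruction's handler.
#define DISPATCH(pc, acc, counter) \
    MUSTTAIL return kDispatch[(pc)->op]((pc), (acc), (counter))

// acc += arg
static int64_t op_addi(const Insn* pc, int64_t acc, int64_t counter) {
    acc += pc->arg;
    DISPATCH(pc + 1, acc, counter);
}

// counter -= 1; if counter != 0, jump by arg (relative to this instruction)
static int64_t op_dec_jnz(const Insn* pc, int64_t acc, int64_t counter) {
    --counter;
    const Insn* next = (counter != 0) ? pc + pc->arg : pc + 1;
    DISPATCH(next, acc, counter);
}

// Stop and return the accumulator.
static int64_t op_halt(const Insn* pc, int64_t acc, int64_t counter) {
    (void)pc;
    (void)counter;
    return acc;
}

// Entry point. The program must end in OP_HALT, and counter must be >= 1.
int64_t run(const std::vector<Insn>& program, int64_t counter) {
    assert(!program.empty() && program.back().op == OP_HALT);
    const Insn* pc = program.data();
    return kDispatch[pc->op](pc, 0, counter);
}
```

For the program `{OP_ADDI 5, OP_DEC_JNZ -1, OP_HALT}` and a counter of 3, the loop body runs three times and `run` returns 15. There is no central dispatch loop: control flows handler to handler, which is what lets a register-pinning calling convention keep VM state in fixed registers throughout.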

Performance and Engineering Claims

The quantitative claims rest on LuaJIT Remake (LJR), a standard-compliant Lua 5.1 VM that the authors implement using Deegen. Across 44 benchmarks, LJR's interpreter is on average 179% faster than the official PUC Lua interpreter and 31% faster than LuaJIT's interpreter. LJR's baseline JIT has negligible startup delay and is on average 360% faster than PUC Lua; it remains on average 33% slower than LuaJIT's optimizing JIT, although it is faster on 13 of the 44 benchmarks.
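Percentages such as "179% faster" are easy to misread as ratios, so the small helper below converts them. It assumes the usual reading in which "p% faster" means a speed ratio of 1 + p/100; the function name is hypothetical, and how the paper defines "33% slower" is not stated in this summary, so that figure is deliberately left out.

```cpp
#include <cmath>

// Converts "p% faster" into a speed ratio, assuming the common reading
// that "p% faster" means (1 + p/100) times the baseline's speed.
double speedup_from_percent_faster(double percent) {
    return 1.0 + percent / 100.0;
}
```

Under this reading, 179% faster corresponds to roughly 2.79x the speed of PUC Lua, 31% faster to 1.31x the speed of LuaJIT's interpreter, and 360% faster to 4.6x the speed of PUC Lua. These are averages over the 44 benchmarks, so they should not be multiplied or divided to derive comparisons the paper does not report.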

These results support the authors' claim that Deegen-generated VMs can rival expert-crafted ones. The paper describes in detail the optimizations Deegen supports, including bytecode specialization and quickening, call and generic inline caching, type-check removal, and hot-cold code splitting. These are either applied automatically or guided by the language implementer through Deegen's APIs.
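As an illustration of the inline caching idea, the sketch below shows a monomorphic cache for a field lookup: a call site remembers the object layout it saw last and skips the hash lookup when the next object has the same layout. This is a generic textbook form, not Deegen's API or its JIT implementation; `Shape`, `Object`, `InlineCache`, and `get_field` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical object model: a Shape maps field names to slot indices,
// and objects with the same layout share one Shape.
struct Shape {
    std::unordered_map<std::string, std::size_t> slot_of;
};

struct Object {
    const Shape* shape;
    std::vector<int64_t> slots;
};

// One cache per call site; it remembers the last shape seen there.
struct InlineCache {
    const Shape* cached_shape = nullptr;
    std::size_t cached_slot = 0;
    unsigned hits = 0;
    unsigned misses = 0;
};

// Field read at a single call site. The field name is fixed per site,
// so a shape match alone is enough to validate the cached slot.
int64_t get_field(const Object& obj, const std::string& name, InlineCache& ic) {
    if (obj.shape == ic.cached_shape) {  // fast path: one pointer compare
        ++ic.hits;
        return obj.slots[ic.cached_slot];
    }
    ++ic.misses;                         // slow path: hash lookup, then cache
    const std::size_t slot = obj.shape->slot_of.at(name);
    ic.cached_shape = obj.shape;
    ic.cached_slot = slot;
    return obj.slots[slot];
}
```

Reading the same field from two objects that share a shape produces one miss followed by one hit. A JIT can go further by patching the cached check directly into generated code, which is the general direction of the polymorphic IC and inline slab optimizations listed in the abstract.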

Implications and Future Directions

Deegen lowers the complexity and cost barriers to developing high-performance VMs for dynamic languages, which could enable smaller teams and individual developers to build efficient execution environments for their languages. Its approach also provides a single source of truth (the bytecode semantic description), which reduces maintenance overhead and the risk of inconsistencies between execution tiers.

Looking forward, the authors suggest extending Deegen with an optimizing JIT tier to improve peak throughput. Deegen currently targets only the x86-64 architecture; extending it to other ISAs is presented as possible, contingent on user demand and resource availability.

Conclusion

Overall, Deegen is a significant advance in VM generation for dynamic languages. By automating the construction of the interpreter and the JIT compiler, Xu and Kjolstad show that a generated system can approach the quality of VMs hand-written by experienced engineers. By lowering the barrier to efficient execution, Deegen may also stimulate further research into automated compiler generation in other contexts.
