Memelang: Axial DSL for LLM Queries
- Memelang is a deterministic, axial multi-dimensional DSL that uses rank-specific separators to reconstruct a three-dimensional grid for precise query mapping.
- It features implicit context carry-forward, parse-time variable binding, and inline tags for aggregation, grouping, and ordering, streamlining SQL compilation.
- By integrating relational and vector-aware operations, Memelang enhances LLM pipelines with minimal syntax and clear semantic slot assignment.
Memelang is a deterministic, axial, multi-dimensional grammar and a compact domain-specific language (DSL) designed as an intermediate representation (IR) for Linear LLMs to emit complex relational and vector-relational database queries. Utilizing a linear token sequence encoded with rank-specific separators, Memelang recovers an explicit three-dimensional grid structure in a single left-to-right parse pass. Each “coordinate axis” in this grid is mapped to a semantic role, enabling unambiguous assignment of tokens to table, column, and value slots, thus supporting tool-augmented LLM pipelines with minimal surface syntax and maximal downstream determinism. The Memelang system includes mechanisms for implicit context carry-forward, parse-time variable binding, inline grouping/aggregation/ordering, and compilation to parameterized SQL—optionally targeting vector-aware backends such as PostgreSQL with pgvector support (Holt, 18 Dec 2025).
1. Formal Structure of Memelang’s Axial Grammar
Memelang instantiates the general axial grammar framework with three axes: Matrix (2), Vector (1), and Limit (0). The token vocabulary is partitioned into atoms (non-separators) and separators. Each separator token receives a rank equal to the index of the axis it advances: whitespace has rank 0, a single semicolon (";") has rank 1, and a double semicolon (";;") has rank 2. Parsing proceeds with a scan state that holds one counter per axis and is incremented and reset according to the rank of each encountered separator. Each non-separator token is mapped uniquely to a coordinate $x$ = (Matrix, Vector, Limit), yielding the grid $C(x)$ of cell contents for the token stream.
Memelang's EBNF grammar—abbreviated—includes:
- Query ::= ( Matrix “;;” )+
- Matrix ::= Vector ( “;” Vector )*
- Vector ::= Limit ( WS Limit )*
- Limit ::= [left] [cmp] right | left (cmp right)?
- Term ::= atom | (mod atom) | (atom mod atom)
- Inline tags (e.g., “:sum”, “:grp”, “:asc”) modify values. Each vector has three slots, indexed along axis 0 (Limit) after right-aligned local reversal: 2 (Table), 1 (Column), and 0 (Value/Predicate).
2. Streaming Parsing and Coordinate Indexing
The parser makes a single left-to-right pass over the token stream. A separator of rank $r$ updates the coordinate as follows: all lower axes (those with index below $r$) are reset to zero, axis $r$ is incremented, and higher axes remain unchanged. Each atom is written into the cell at the current coordinate. There is no need for parentheses or indentation; the ranked separators are sufficient to recover the three-dimensional structure.
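The coordinate update can be illustrated with a short Python sketch. It is not the reference implementation: the tokenizer regex, the names `rank` and `parse`, and the choice to fold whitespace next to a semicolon into that semicolon are assumptions made for illustration. The sketch records raw left-to-right Limit indices and leaves right-alignment and cell interpretation to later passes.

```python
import re
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int]  # (matrix, vector, limit)

# Assumed tokenizer: ";;" and ";" absorb surrounding whitespace so that
# "a; b" advances only the Vector axis; quoted literals stay whole.
TOKEN_RE = re.compile(r'(\s*;;\s*|\s*;\s*|\s+|"[^"]*"|[^\s;"]+)')


def rank(token: str) -> int:
    """Return the separator rank (0, 1, 2), or -1 for an atom."""
    core = token.strip()
    if core == ";;":
        return 2
    if core == ";":
        return 1
    if core == "":
        return 0
    return -1


def parse(stream: str) -> Dict[Coord, List[str]]:
    """Single left-to-right pass that fills the cell map."""
    state = [0, 0, 0]  # state[axis]: 0 = Limit, 1 = Vector, 2 = Matrix
    cells: Dict[Coord, List[str]] = {}
    for token in TOKEN_RE.findall(stream.strip()):
        r = rank(token)
        if r < 0:
            # Atom: write it into the cell at the current coordinate.
            coord = (state[2], state[1], state[0])
            cells.setdefault(coord, []).append(token)
            continue
        # Separator of rank r: reset lower axes, increment axis r.
        for axis in range(r):
            state[axis] = 0
        state[r] += 1
    return cells
```

For the stream `movies year <1970; title _;;` this places `movies`, `year`, and `<1970` at raw coordinates (0,0,0), (0,0,1), and (0,0,2), and `title` and `_` at (0,1,0) and (0,1,1). Atoms that are not separated by any separator, such as an operator followed by a quoted literal, accumulate in the same cell.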
Once parsing is complete, a small record is constructed for each non-empty cell using a “cell interpreter”, which parses local token structure and accumulates inline tags and binding information. This coordinate system enables deterministic slot interpretation and downstream compositional compilation.
3. Semantic Roles, Carry-Forward, and Variable Binding
Each axis is mapped to a fixed role: Matrix (query/subquery block), Vector (column/predicate specification), and Limit (slot type). Within each (Matrix, Vector) slice, the last three non-empty positions are assigned to Table, Column, and Value. Carry-forward is applied along the Vector axis for Table and Column slots: if a slot is empty, its most recent non-empty value is inherited from the previous Vector in the same Matrix.
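A minimal Python sketch of right-aligned slot assignment with carry-forward follows. It assumes each Vector has already been reduced to a list of atoms within one Matrix; the function name `assign_slots` and the use of `None` for an empty slot are hypothetical, and special forms such as "@" self-join markers or meta-mode Vectors are not handled.

```python
from typing import List, Optional, Tuple

# (table, column, value); None marks a slot that is still empty.
Slots = Tuple[Optional[str], Optional[str], Optional[str]]


def assign_slots(vectors: List[List[str]]) -> List[Slots]:
    """Right-align each Vector into three slots, then carry Table and
    Column forward from the previous Vector when they are empty."""
    resolved: List[Slots] = []
    prev_table: Optional[str] = None
    prev_column: Optional[str] = None
    for atoms in vectors:
        # The last three positions become Table, Column, Value.
        padded = [None] * (3 - len(atoms)) + list(atoms[-3:])
        table, column, value = padded
        if table is None:
            table = prev_table
        if column is None:
            column = prev_column
        resolved.append((table, column, value))
        prev_table, prev_column = table, column
    return resolved
```

Applied to the two Vectors `["movies", "year", "<1970"]` and `["title", "_"]`, the second Vector inherits `movies` as its Table while keeping its own Column and Value.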
Variable binding is achieved with the inline tag “:$x” placed in the Value slot, which binds the variable x to that coordinate. Subsequently, any occurrence of “$x” in the same Matrix references the bound cell, supporting relative self-joins, nested queries, or parameter reuse.
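The binding pass can be sketched in Python as a single scan over the Value slots of one Matrix. This is an illustration under assumptions: the input is a hypothetical mapping from coordinates to Value-slot text, the two regular expressions are simplifications that ignore quoted literals, and the sketch records the coordinate of the cell in which the binding tag appears.

```python
import re
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int]  # (matrix, vector, limit)

BIND_RE = re.compile(r':\$(\w+)')      # binding tag, e.g. :$a
REF_RE = re.compile(r'(?<!:)\$(\w+)')  # later use, e.g. !=$a


def resolve_bindings(values: Dict[Coord, str]):
    """Return (bindings, references) for the Value slots of one Matrix.

    bindings maps a variable name to the coordinate that bound it;
    references maps a coordinate to the (name, bound coordinate) pairs
    it uses. A use before its binding raises KeyError.
    """
    bindings: Dict[str, Coord] = {}
    references: Dict[Coord, List[Tuple[str, Coord]]] = {}
    for coord in sorted(values):
        text = values[coord]
        for name in REF_RE.findall(text):
            if name not in bindings:
                raise KeyError(f"${name} is referenced before it is bound")
            references.setdefault(coord, []).append((name, bindings[name]))
        for name in BIND_RE.findall(text):
            bindings[name] = coord
    return bindings, references
```

Processing the cells in coordinate order keeps the binding step a parse-time operation: a reference is resolved the moment it is seen, and a variable that has not been bound yet is reported immediately rather than deferred to SQL generation.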
4. Inline Tags, Aggregation, Grouping, and Ordering
Memelang provides inline tags as Value-slot prefixes to enable query plan specification:
- Aggregates: :sum, :cnt, :min, :max, :avg, :last (compile to corresponding SQL aggregation functions)
- Grouping key: :grp (marks column as GROUP BY key)
- Ordering: :asc, :des (specifies ORDER BY)
- Variable binding: :$x (binds the variable x to the cell)
When a Value slot contains inline tags “:f_1”, ..., “:f_m” alongside its core term, the tags are separated from the core term, and the interpreter processes each in turn, emitting the necessary aggregation, grouping, or ordering logic. Group keys are propagated to the SELECT and GROUP BY clauses, while ordering tags produce primary and secondary ORDER BY terms on aggregate or raw expressions.</li>
</ul>
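Separating tags from the core term can be sketched in a few lines of Python. The tag names come from the list above; the helper names `split_tags` and `classify`, the regular expression, and the shape of the returned plan are assumptions for illustration, not the reference cell interpreter.

```python
import re
from typing import Dict, List, Optional, Tuple

AGGREGATES = {"sum", "cnt", "min", "max", "avg", "last"}
ORDERING = {"asc": "ASC", "des": "DESC"}

# Match a quoted literal (kept as-is) or an inline tag such as :min or :$a.
PIECE_RE = re.compile(r'"[^"]*"|:(\$?\w+)')


def split_tags(value: str) -> Tuple[List[str], str]:
    """Return (tags, core term) for the text of one Value slot."""
    tags: List[str] = []

    def keep_or_collect(match: "re.Match") -> str:
        if match.group(1) is None:  # quoted literal: leave untouched
            return match.group(0)
        tags.append(match.group(1))
        return ""

    core = PIECE_RE.sub(keep_or_collect, value)
    return tags, core


def classify(tags: List[str]) -> Dict[str, Optional[object]]:
    """Turn a tag list into aggregation, grouping, ordering, binding."""
    plan: Dict[str, Optional[object]] = {
        "aggregate": None, "group": False, "order": None, "bind": None,
    }
    for tag in tags:
        if tag in AGGREGATES:
            plan["aggregate"] = tag
        elif tag == "grp":
            plan["group"] = True
        elif tag in ORDERING:
            plan["order"] = ORDERING[tag]
        elif tag.startswith("$"):
            plan["bind"] = tag[1:]
        else:
            raise ValueError(f"unknown inline tag :{tag}")
    return plan
```

For example, `split_tags(':min:des')` returns the tags `min` and `des` with an empty core term, and `classify` turns them into a MIN aggregate ordered descending. Colons inside quoted literals are left alone because the quoted alternative is matched first.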
<h2 class='paper-heading' id='reference-implementation-lexer-parser-and-sql-compiler'>5. Reference Implementation: Lexer, Parser, and SQL Compiler</h2>
<p>The Memelang reference pipeline includes:</p>
<ol>
<li><strong>Lexer</strong>: Splits tokens on separators (whitespace, ";", ";;"), comparators, colon, commas, quoted literals, and variables, yielding a flat token stream.</li>
<li><strong>Parser</strong>: Executes the streaming coordinate assignment and populates the $C(x)$ cell map.</li>
<li><strong>Cell interpreter</strong>: For each cell, parses as (Table?), (Column?), (Value+tags?), extracting SQL fragment, aggregation, grouping, ordering, and binding information.</li>
<li><strong>Carry-forward and binding passes</strong>: Fill Table/Column slots where necessary and resolve variable bindings.</li>
<li><strong>IR→SQL compiler</strong>:
<ul>
<li>Enumerates unique Table instances, assigns aliases (t₀, t₁, ...).</li>
<li>Constructs FROM/JOIN clauses based on Table aliasing and self-join markers ("@").</li>
<li>Builds the WHERE clause from the Value-slot predicates that carry a comparator.</li>
<li>SELECT clause is built from those marked for projection, applying aggregation/wrapping based on grouping.</li>
<li>GROUP BY collects all columns tagged :grp.</li>
<li>ORDER BY is constructed from slots with :asc/:des.</li>
<li>LIMIT is included if specified via a meta-mode Vector.</li>
<li>Vector similarity operators ("<=>" etc.) compile to PostgreSQL pgvector operators.</li>
</ul></li>
</ol>
<p>All literal values become parameterized SQL variables in the order they appear, and vector expressions are mapped to $n::VECTOR forms (for example, $1::VECTOR) for backend compatibility.</p>
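The parameterization step can be sketched for the simplest case, a single table with scalar predicates. This Python example is a hypothetical illustration rather than the reference compiler: the function name `compile_single_table`, the comparator regex, and the literal handling (integers and double-quoted strings only) are assumptions, and joins, tags, and vector operators are out of scope.

```python
import re
from typing import List, Tuple

# Longest comparators first so "<=" is not read as "<" followed by "=".
CMP_RE = re.compile(r'^(<=|>=|!=|<|>|=)(.+)$')


def compile_single_table(
    slots: List[Tuple[str, str, str]]
) -> Tuple[str, List[object]]:
    """Compile resolved (table, column, value) slots to SQL plus params."""
    table = slots[0][0]
    select: List[str] = []
    where: List[str] = []
    params: List[object] = []
    for tbl, column, value in slots:
        if tbl != table:
            raise ValueError("this sketch handles a single table only")
        select.append(f"t0.{column}")
        if value == "_":  # wildcard: project the column, no predicate
            continue
        match = CMP_RE.match(value)
        if match is None:
            raise ValueError(f"unsupported value slot: {value!r}")
        op, literal = match.groups()
        # Literals never enter the SQL text; they become $1, $2, ...
        params.append(int(literal) if literal.isdigit() else literal.strip('"'))
        where.append(f"t0.{column} {op} ${len(params)}")
    sql = f"SELECT {', '.join(select)} FROM {table} AS t0"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params
```

Only literal values are parameterized here. Table and column names are interpolated directly into the SQL text, so a real compiler would need to validate them against the schema before emitting the statement.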
<h2 class='paper-heading' id='annotated-examples'>6. Annotated Examples</h2>
<p><strong>Scalar filter with carry-forward</strong></p>
<p>Memelang:</p>
<pre><code>movies year <1970; title _;;</code></pre>
<p>The coordinate assignment yields:</p>

| (Matrix, Vector, Limit) | Token(s)    | Slot                     |
|-------------------------|-------------|--------------------------|
| (0,0,2)                 | [movies]    | Table@V0                 |
| (0,0,1)                 | [year]      | Column@V0                |
| (0,0,0)                 | [<1970]     | Value                    |
| (0,1,2)                 | [] → movies | Table@V1 (carry-forward) |
| (0,1,1)                 | [title]     | Column@V1                |
| (0,1,0)                 | [_]         | Value (wildcard)         |

<p>This compiles to:</p>
<pre><code>SELECT t0.year, t0.title
FROM movies AS t0
WHERE t0.year < $1</code></pre>
<p>with parameter $1=1970.</p>
<p><strong>Vector predicate and ordering</strong></p>
<p>Memelang:</p>
<pre><code>movies description <=>"robot":asc<0.35; title _;;</code></pre>
<p>This parses to expressions with the inline ordering tag :asc and compiles to:</p>
<pre><code>SELECT (t0.description <=> $1::VECTOR) AS col0, t0.title AS col1
FROM movies AS t0
WHERE (t0.description <=> $1::VECTOR) < $2
ORDER BY (t0.description <=> $1::VECTOR) ASC</code></pre>
<p>with parameters $1=embedding("robot"), $2=0.35.</p>
<p><strong>Co-star self-join with binding</strong></p>
<p>Memelang:</p>
<pre><code>roles actor :$a="Bruce Willis"; movie _; @ @ @; actor !=$a;;</code></pre>
<p>Cell (0,0,1) binds $a, the “@”s initiate self-joins, and all Table/Column/Value slots are inherited through carry-forward and binding passes.</p>
<p><strong>Grouped join, vector predicate, limit</strong></p>
<p>Memelang:</p>
<pre><code>movies year <1980; description <=>"war"<=$sim; title :grp; roles movie @; rating :min:des; %m lim 12;;</code></pre>
<p>The parsed tags :grp and :min:des determine grouping and order. Output:</p>
<pre><code>SELECT
  MAX(t0.year) AS col0,
  MAX(t0.description <=> $1::VECTOR) AS col1,
  t0.title AS col2,
  MAX(t1.movie) AS col3,
  MIN(t1.rating) AS col4
FROM movies AS t0, roles AS t1
WHERE t0.year < $2
  AND (t0.description <=> $1::VECTOR) <= $3
  AND t1.movie = t0.title
GROUP BY t0.title
ORDER BY MIN(t1.rating) DESC
LIMIT 12</code></pre>
<p>with $1=embedding("war"), $2=1980, $3=$sim.</p>
7. Context and Significance
Memelang’s axial grammar enables deterministic, context-minimal parsing for LLM-emitted relational and vector-relational queries, reducing verbosity, surface ambiguity, and post-processing overhead. Its coordinate-indexed slot assignments, along with mechanisms for context carry-forward, variable binding, and inline manipulation of group/aggregate/order semantics, permit LLMs to emit queries whose parse trees and execution plans require no ambiguity resolution or second-pass transformation. Deployments benefit from transparent mapping to parameterized SQL, composable joins/self-joins, full vector operator support for pgvector, and programmatic suitability for streaming tool-use settings. The formal framework also sets a foundation for further research on higher-rank DSLs and deterministic interface grammars for LLM tool-use (Holt, 18 Dec 2025).
References (1)