Greibach Normal Form in Lambek Grammar

Updated 11 November 2025

Greibach Normal Form in Lambek Grammar is a canonical representation that mirrors the leftmost derivation structure of context-free grammars.
It employs a three-step algorithm that extracts a CFG, converts it into Greibach Normal Form, and reconstructs an L(/→)-grammar with types in Greibach-shape.
Cut-elimination in the sequent calculus enforces the derivational normal form, and the translation preserves the recognized language, so complexity bounds, closure properties, and polynomial-time parsing carry over from CFGs.

Greibach Normal Form (GNF) in the context of Lambek grammar refers to a precise and constructive proof-theoretic analogue of the familiar GNF for context-free grammars (CFGs), realized within the purely right-implicative fragment of the Lambek calculus, denoted L(/→). This correspondence establishes that every language recognized by an L(/→)-grammar (excluding the empty string) can be generated by an L(/→)-grammar whose assigned types are in a canonical "Greibach-shape." The result demonstrates that essential properties of CFGs—including complexity and closure properties—have direct counterparts in this logical calculus, and that cut-elimination in sequent calculus realizes, internally, the leftmost derivation scheme associated with GNF in standard grammar theory (Nishimiya et al., 4 Nov 2025).

1. Formal Definition and Sequent Calculus Fragment

The fragment L(/→) is defined by:

Primitive Types: A fixed finite set $Pr$ corresponding to nonterminals.
Types: $Tp(/) = \{ A \mid A \in Pr \} \cup \{ \beta/\gamma \mid \beta, \gamma \in Tp(/) \}$ , allowing only the right-implicative connective $/$ .
Sequents: Expressions of the form $\Gamma \to \alpha$ , where $\Gamma = \alpha_1, \dots, \alpha_n$ with $n \ge 1$ and $\alpha_1, \dots, \alpha_n, \alpha \in Tp(/)$ ; antecedents are never empty.
Inference Rules:
- (AX): $A \to A$
- (Cut): $\dfrac{\,\Gamma, \alpha, \Theta \to \gamma~~~\Delta \to \alpha\,}{\Gamma, \Delta, \Theta \to \gamma}$
- ( $\to/$ ): $\dfrac{\,\Gamma, \alpha \to \beta\,}{\Gamma \to \beta/\alpha}$ (with $\Gamma \neq \emptyset$ )
- (/ $\to$ ): $\dfrac{\,\Gamma \to \alpha~~~~\Delta, \beta, \Theta \to \gamma\,}{\Delta, (\beta/\alpha), \Gamma, \Theta \to \gamma}$

Structural rules such as weakening, contraction, and exchange are absent.
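As an illustration, the rules can be turned into a small backward proof search. The sketch below is not taken from the source: the tuple encoding of types and the names `slash` and `provable` are assumptions made for this example, the left rule is read as allowing an optional right context after the argument block, and (Cut) is omitted on the strength of cut-elimination. It is a naive search intended for small sequents, not the polynomial-time procedure.

```python
from functools import lru_cache

# Hypothetical encoding: a primitive type is a str, and ("/", b, a) stands for b/a.
def slash(b, a):
    return ("/", b, a)

@lru_cache(maxsize=None)
def provable(gamma, goal):
    """Cut-free backward search for the sequent gamma -> goal (gamma is a tuple)."""
    if not gamma:                      # antecedents must be nonempty
        return False
    if isinstance(goal, tuple):        # (->/): Gamma -> b/a follows from Gamma, a -> b
        _, b, a = goal
        return provable(gamma + (a,), b)
    if gamma == (goal,):               # (AX) on a primitive type
        return True
    # (/->): choose b/a at position i and a nonempty argument block gamma[i+1:j]
    for i, t in enumerate(gamma):
        if isinstance(t, tuple):
            _, b, a = t
            for j in range(i + 2, len(gamma) + 1):
                if (provable(gamma[i + 1:j], a)
                        and provable(gamma[:i] + (b,) + gamma[j:], goal)):
                    return True
    return False

print(provable((slash("S", "B"), "B"), "S"))   # S/B, B -> S
```

Applying ( $\to/$ ) eagerly whenever the succedent is a slash type is safe here because that rule is invertible; every rule application removes one connective, so the search terminates.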

2. The Greibach Analogue Theorem

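A type $\tau \in Tp(/)$ is in Greibach-shape if it is a primitive type applied to zero or more primitive arguments, that is, if:

$\tau = A$ for some $A \in Pr$ ,
or $\tau = (\cdots((A/\beta_n)/\beta_{n-1})\cdots)/\beta_1$ for $A, \beta_1, \ldots, \beta_n \in Pr$ , $n \ge 1$ .

The head $A$ is the start type $S$ for types that can open a derivation of $S$ ; other heads, such as $B$ in $f(b) = \{B\}$ , play the role of the remaining nonterminals.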
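The shape condition is easy to check mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which a primitive type is a string and `("/", b, a)` stands for $b/a$ ; the names `unfold` and `is_greibach_shape` are illustrative only. The optional `head` argument restricts the check to types headed by a given primitive such as `"S"`.

```python
def unfold(t):
    """Split (...((A/b_n)/b_{n-1})...)/b_1 into (A, [b_1, ..., b_n])."""
    args = []
    while isinstance(t, tuple):
        _, t, a = t        # strip the outermost slash, remember its argument
        args.append(a)
    return t, args

def is_greibach_shape(t, head=None):
    """True if every argument is primitive (and the head matches, if one is given)."""
    h, args = unfold(t)
    if head is not None and h != head:
        return False
    return all(isinstance(a, str) for a in args)

t = ("/", ("/", "S", "Y"), "X")                   # (S/Y)/X
print(unfold(t))                                  # ('S', ['X', 'Y'])
print(is_greibach_shape(("/", "S", ("/", "A", "B"))))   # False: argument A/B is not primitive
```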

Theorem (Greibach Analogue):

For every L(/→)-grammar $\mathcal{G} = (Pr, \Sigma, S, f)$ , where $f: \Sigma \to \mathcal{P}(Tp(/))$ and $L(\mathcal{G})$ is the set of strings $w \in \Sigma^+$ such that some type-sequence $\Gamma \in f^+(w)$ yields a provable sequent $\Gamma \to S$ , there exists an effectively constructible grammar $\mathcal{G}' = (Pr', \Sigma, S, f')$ such that

- every type in every $f'(a)$ is in Greibach-shape, and
- $L(\mathcal{G}') = L(\mathcal{G})$ .

Conversely, every CFG in standard GNF (all productions of the form $A \to a\,B_1\cdots B_k$ or $A \to a$ ) can be translated into an L(/→)-grammar in which $f(a)$ consists of the corresponding Greibach-shape types.

3. The Three-Step Construction Algorithm

The transformation from an arbitrary L(/→)-grammar to Greibach-shape proceeds in three steps:

Step	Description	Output Grammar
1. Extract a CFG	Build a CFG $G$ from $f$ : for each $a \in \Sigma$ , map every type $(\cdots((A/\beta_n)/\beta_{n-1})\cdots)/\beta_1 \in f(a)$ to a production $A \to a\,\beta_1\cdots\beta_n$ , and every primitive $A \in f(a)$ to $A \to a$ .	$G=(N,\Sigma,P,S)$
2. Put into GNF	Apply a standard CFG-to-GNF conversion: eliminate left recursion, factor, and introduce fresh nonterminals where needed.	$G_{\text{GNF}}$
3. Reconstruct L(/→)	For each $A \to a\,B_1\cdots B_m$ in $G_{\text{GNF}}$ , add the type $(\cdots((A/B_m)/B_{m-1})\cdots)/B_1$ to $f'(a)$ . For each $A \to a$ , add $A$ to $f'(a)$ .	$\mathcal{G}'$

This algorithm preserves language recognition at every step and ensures all types in $f'(a)$ are in Greibach-shape.
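Steps 1 and 3 are inverse translations between productions and types, which can be sketched as follows. This is an illustration under stated assumptions, not the source's implementation: types are encoded as strings (primitives) or tuples `("/", b, a)` for $b/a$ , productions as triples `(head, terminal, body)`, and all function names are hypothetical. Step 2 (the CFG-to-GNF conversion itself) is not shown, and the extraction here only handles types whose arguments are already primitive.

```python
def production_to_type(head, body):
    """A -> a B_1 ... B_m  becomes  (...((A/B_m)/B_{m-1})...)/B_1."""
    t = head
    for b in reversed(body):       # B_m ends up innermost, B_1 outermost
        t = ("/", t, b)
    return t

def type_to_production(t):
    """Inverse direction: read the head and the argument list off a type."""
    body = []
    while isinstance(t, tuple):
        _, t, b = t
        if not isinstance(b, str):
            raise ValueError("non-primitive argument: this sketch does not apply")
        body.append(b)
    return t, tuple(body)

def cfg_to_lexicon(productions):
    """Step 3: collect, for each terminal a, the types assigned to a."""
    lexicon = {}
    for head, terminal, body in productions:
        lexicon.setdefault(terminal, set()).add(production_to_type(head, body))
    return lexicon

def lexicon_to_cfg(lexicon):
    """Step 1: one production per (terminal, type) pair."""
    return {(head, terminal, body)
            for terminal, types in lexicon.items()
            for t in types
            for head, body in [type_to_production(t)]}

gnf = {("S", "c", ("X", "Y")), ("X", "d", ()), ("Y", "e", ())}
lex = cfg_to_lexicon(gnf)
print(lex["c"])                      # {('/', ('/', 'S', 'Y'), 'X')}, i.e. (S/Y)/X
print(lexicon_to_cfg(lex) == gnf)    # True: the round trip recovers the productions
```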

4. Cut-Elimination and Normal Form Proof Structure

A central technical component is the Reducibility Lemma:

Let $\Gamma$ be a nonempty sequence of types in $Tp(/)$ . Then

$\mbox{L(/→)} \vdash \Gamma \to S \iff \Gamma = \alpha, \Delta_1, \dots, \Delta_n$

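for some $n \ge 0$ , where $\alpha = (\cdots((S/\beta_n)/\beta_{n-1})\cdots)/\beta_1$ (so $\alpha = S$ when $n = 0$ ), each $\Delta_k$ is a nonempty sequence of types, and each $\Delta_k \to \beta_k$ is provable in L(/→). Moreover, the proof may be assumed cut-free.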

Proof Sketch:

Cut-elimination in L(/→) ensures that every provable sequent has a cut-free proof. With only the two rules for $/$ left, and with a primitive succedent $S$ , a cut-free proof of $\Gamma \to S$ can only decompose the antecedent by peeling off one slash per application of (/ $\to$ ), discharging the argument $\beta_k$ against the block $\Delta_k$ . This matches one step of a leftmost derivation in a GNF grammar. Induction on the number of slashes then yields exactly the decomposition stated in the lemma.

This correspondence enables:

Extraction of CFG productions directly from cut-free L(/→) proofs.
Synthesis of L(/→) derivations for languages generated by CFGs in GNF via direct encoding with Greibach-shape types.
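For a grammar whose types are already in Greibach-shape, the peeling mechanism amounts to a leftmost-derivation recognizer: keep a list of primitive goals still to be proved, and let each input symbol replace the first goal by the arguments of one of its types. The sketch below illustrates this under assumptions of its own: the lexicon maps each terminal to `(head, args)` pairs, that is, unfolded Greibach-shape types, and the name `recognizes` is hypothetical. It is a memoized backtracking search, not the cubic-time parser.

```python
from functools import lru_cache

def recognizes(lexicon, word, start="S"):
    """Decide whether some type sequence for `word` derives `start`."""
    @lru_cache(maxsize=None)
    def go(i, pending):
        # pending: primitive goals still to be proved, leftmost first;
        # each one needs at least one input symbol, so prune when too many remain
        if len(pending) > len(word) - i:
            return False
        if i == len(word):
            return True                  # input consumed and no goals left
        if not pending:
            return False                 # input left over
        goal, rest = pending[0], pending[1:]
        # peel: a type with head `goal` replaces it by its arguments
        return any(go(i + 1, tuple(args) + rest)
                   for head, args in lexicon.get(word[i], ())
                   if head == goal)
    return go(0, (start,))

# S -> a S B | a B,  B -> b   (an illustrative GNF grammar for a^n b^n, n >= 1)
lexicon = {"a": [("S", ("S", "B")), ("S", ("B",))], "b": [("B", ())]}
print(recognizes(lexicon, "aabb"))   # True
print(recognizes(lexicon, "aab"))    # False
```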

5. Explicit Examples

Example 1:

Let $\Sigma = \{a, b\}$ , $Pr = \{S, B\}$ , $f(a) = \{S/B\}$ , $f(b) = \{B\}$ . Language recognized: $\{ab\}$ , since $S/B, B \to S$ is the only provable sequent with succedent $S$ whose antecedent is built from these types.

Construction Step	Grammar or Type Assignment
1. CFG Extraction	$S \to a\,B;\;\; B \to b$
2. GNF Conversion	Already in GNF
3. L(/→) Reconstruction	$f'(a) = \{S/B\};\;\; f'(b) = \{B\}$

The derivation of $S/B, B \to S$ is a single application of (/ $\to$ ) to the axioms $B \to B$ and $S \to S$ , which peels the one argument $B$ off $S/B$ .

Example 2:

$\Sigma = \{c,d,e\}$ , $Pr = \{S,X,Y\}$ , with GNF productions $S \to c\,X\,Y$ , $X \to d$ , $Y \to e$ . The corresponding Greibach-shape assignments are $f'(c) = \{(S/Y)/X\}$ , $f'(d) = \{X\}$ , $f'(e) = \{Y\}$ . Derivation for $cde$ is constructed with antecedent $((S/Y)/X), X, Y$ , following the same peeling mechanism.
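Spelled out, the derivation for $cde$ uses two applications of (/ $\to$ ), one per slash of $(S/Y)/X$ . The proof tree below is a worked illustration, reading the left rule as allowing a right context after the argument block (here $Y$ in the final step):

$$
\dfrac{X \to X \qquad \dfrac{Y \to Y \qquad S \to S}{S/Y,\; Y \to S}\;(/\to)}{(S/Y)/X,\; X,\; Y \to S}\;(/\to)
$$

Read bottom-up, the first step discharges $X$ against the argument of the outer slash and the second discharges $Y$ , mirroring the leftmost derivation $S \Rightarrow c\,X\,Y \Rightarrow c\,d\,Y \Rightarrow c\,d\,e$ .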

6. Complexity and Closure Properties

Language-theoretic equivalence: L(/→)-grammars characterize precisely the context-free languages over $\Sigma^+$ , excluding the empty word.
Parsing complexity: Parsing can be performed in $O(n^3)$ time (as in Cocke–Younger–Kasami) via reduction to CFG and back.
Closure properties: Union, concatenation, Kleene star, homomorphism, intersection with regular languages, and reversal are preserved under the three-step transformation. Because the empty word is excluded, Kleene star is to be read as Kleene plus and homomorphisms as non-erasing.
Decidability: Derivability in L(/→) is decidable in polynomial time, in contrast to the full Lambek calculus, where derivability is NP-complete, and to richer non-commutative systems for which PSPACE-completeness is known.
Conceptual significance: The cut-elimination theorem, within this minimal two-rule fragment, enforces a sequent structure matching the leftmost derivation order of GNF grammars, offering a direct logical interpretation of an external grammar-theoretic constraint.

7. Context and Broader Impact

This identification of a Greibach Normal Form analogue within L(/→) provides a transparent, proof-theoretic account of context-free syntax within a system of intuitionistic, non-commutative linear logic. The constructive method clarifies the direct interplay between cut-elimination, derivational normal forms, and grammatical leftmost derivation. The approach is notably simpler and more transparent than previous correspondences between Lambek calculus and formal languages, relying only on two inference rules and induction on sequents. A plausible implication is that richer fragments of Lambek calculus may admit finer complexity stratifications and additional normal forms as further explored in recent literature (Nishimiya et al., 4 Nov 2025). The construction provides a pathway for the transfer of classical parsing and closure results to type-logical grammars and formal proof systems.


References (1)

Non-commutative linear logic fragments with sub-context-free complexity (2025)
