Telling-Making-Enacting: Narrative Pipeline

Updated 16 April 2026

Telling-Making-Enacting (TME) is a computational framework that decomposes narrative generation into deep semantic encoding, systematic content transformation, and persona-driven surface realization.
The Making phase algorithmically converts monologic stories into dialogs using techniques like content allocation, aggregation, pronominalization, and lexical adjustments.
Empirical evaluations of TME, implemented in the M2D system, demonstrate enhanced engagement and clear personality instantiation in generated dialogic narratives.

The Telling–Making–Enacting (TME) paradigm constitutes a pipeline architecture for transforming deep story representations into dialogic conversational narratives. Within the context of computational storytelling, TME formally decomposes the narrative generation process into “Telling” (deep semantic encoding), “Making” (systematic content transformation), and “Enacting” (persona-driven surface realization). The architecture is instantiated in the M2D (“Monolog to Dialog”) system, which algorithmically converts monologic retellings into multi-speaker dialog while incorporating personality models and explicit dialogic variation (Bowden et al., 2017).

1. Deep Representation (“Telling”)

The Telling phase composes the underlying semantic content of a story using two hierarchical structures: the Story Intention Graph (SIG) and Deep Syntactic Structures (DsyntS).

Story Intention Graph (SIG): Formally, the SIG is a directed graph

$G = (V_E \,\cup\, V_C,\, E_T \,\cup\, E_C \,\cup\, E_A)$

where
- $V_E = \{e_1, ..., e_n\}$ are event nodes (atomic story actions),
- $V_C = \{c_1, ..., c_m\}$ are character/prop nodes,
- $E_T \subset V_E \times V_E$ encodes temporal precedence ($e_i \to e_j$ if $e_i$ precedes $e_j$),
- $E_C \subset (V_E \times V_C) \cup (V_C \times V_E)$ are participation edges (event-character/prop links),
- $E_A \subset V_E \times \{\textit{goal}, \textit{plan}, \textit{affect}\} \times V_C$ annotates intentions and affects.
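As an illustration of the definition above, the following minimal Python sketch stores the five components of the graph as plain sets. The class name, method names, and the example nodes are hypothetical, and participation edges are stored in the event-to-entity direction only for brevity; this is not the data model of the original system.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from graphlib import TopologicalSorter

# Annotation kinds taken from the definition of E_A.
ANNOTATION_KINDS = {"goal", "plan", "affect"}


@dataclass
class StoryIntentionGraph:
    """Hypothetical container mirroring G = (V_E u V_C, E_T u E_C u E_A)."""
    events: set[str] = field(default_factory=set)                        # V_E
    entities: set[str] = field(default_factory=set)                      # V_C
    temporal: set[tuple[str, str]] = field(default_factory=set)          # E_T
    participation: set[tuple[str, str]] = field(default_factory=set)     # E_C
    annotations: set[tuple[str, str, str]] = field(default_factory=set)  # E_A

    def add_precedence(self, earlier: str, later: str) -> None:
        # Temporal edges may only connect two event nodes.
        if earlier not in self.events or later not in self.events:
            raise ValueError("temporal edges must connect two event nodes")
        self.temporal.add((earlier, later))

    def add_participation(self, event: str, entity: str) -> None:
        if event not in self.events or entity not in self.entities:
            raise ValueError("participation links an event to a character/prop")
        self.participation.add((event, entity))

    def annotate(self, event: str, kind: str, entity: str) -> None:
        if kind not in ANNOTATION_KINDS:
            raise ValueError(f"unknown annotation kind: {kind}")
        if event not in self.events or entity not in self.entities:
            raise ValueError("annotation must link an event to a character/prop")
        self.annotations.add((event, kind, entity))

    def event_order(self) -> list[str]:
        # Topologically sort events so that every e_i precedes its successors.
        predecessors: dict[str, set[str]] = {e: set() for e in self.events}
        for earlier, later in self.temporal:
            predecessors[later].add(earlier)
        return list(TopologicalSorter(predecessors).static_order())


# Toy graph with made-up nodes.
sig = StoryIntentionGraph(events={"e1", "e2", "e3"},
                          entities={"narrator", "squirrel"})
sig.add_precedence("e1", "e2")
sig.add_precedence("e2", "e3")
sig.add_participation("e1", "narrator")
sig.annotate("e2", "goal", "squirrel")
```

Calling `sig.event_order()` on this toy graph returns the events in temporal order, which is the ordering a downstream stage would walk when splitting content into turns.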

Deep Syntactic Structures (DsyntS): Each text sentence $s_k$ is represented as a tree

$T_k = (N_k, \lambda_k, A_k)$

with node set $N_k$, node-label function $\lambda_k$, and labeled arcs $A_k$.

The coupling of the SIG (“fabula”) and DsyntS (“discourse skeleton”) comprises the narrative’s deep structure in the TME system (Bowden et al., 2017).

2. Systematic Conversion Pipeline (“Making”)

The Making phase constitutes the algorithmic transformation from deep story structures into dialogic form. This phase is realized by the M2D system as a multi-stage process, summarized both in descriptive steps and as high-level pseudocode.

Content Allocation: Parameter $\alpha$ determines the fraction of content assigned to Speaker 1, with $1 - \alpha$ for Speaker 2.
Aggregation/De-aggregation: If a DsyntS tree exceeds a node threshold $\tau$, it is split; if two consecutive trees are both below $\tau$ and share a participant, they are merged.
Pronominalization & Coreference: Entities previously mentioned and now available for first/second-person realization are replaced with pronouns.
Content Elaboration: Question–answer (QA) pairs are generated by pruning: with probability $p$, a selected node is replaced with a suitable wh-word and its subtree is dropped to form the question; the removed subtree becomes the answer.
Pragmatic Marker Insertion: A style-weight vector $w = (w_1, \dots, w_K)$ over $K$ features controls the insertion of dialogic markers (e.g., hedges, exclamations); the probability for marker $k$ is the softmax

$P(k) = \frac{\exp(w_k)}{\sum_{j=1}^{K} \exp(w_j)}$

Lexical Choice: Content words can be substituted by synonyms, either when a substitution condition holds or to diversify the wording.
Morphosyntactic Post-processing: Grammatical contractions, possessives, and final punctuation are processed before realization.
Surface Realization: RealPro renders DsyntS trees into grammatical English dialog turns.
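The content-allocation and aggregation steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the M2D implementation: the names `allocate` and `plan_aggregation`, the greedy allocation rule, and the representation of each DsyntS tree as a `(node_count, participants)` pair are all hypothetical. The planner only decides which operation applies; how a tree is actually split or merged is left open.

```python
from __future__ import annotations


def allocate(units: list[str], alpha: float) -> list[tuple[int, str]]:
    """Assign content units to speakers so Speaker 1 gets roughly a
    fraction alpha of them and Speaker 2 gets the rest."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    turns: list[tuple[int, str]] = []
    given_to_s1 = 0
    for i, unit in enumerate(units, start=1):
        # Give the unit to Speaker 1 while its running count is below target.
        if given_to_s1 < alpha * i:
            turns.append((1, unit))
            given_to_s1 += 1
        else:
            turns.append((2, unit))
    return turns


def plan_aggregation(trees: list[tuple[int, set[str]]], tau: int) -> list[tuple]:
    """Decide per tree whether to split, merge with its successor, or keep.

    Each tree is summarised as (node_count, participants)."""
    ops: list[tuple] = []
    i = 0
    while i < len(trees):
        size, people = trees[i]
        if size > tau:
            ops.append(("split", i))            # tree is too large
            i += 1
        elif (size < tau and i + 1 < len(trees)
              and trees[i + 1][0] < tau
              and people & trees[i + 1][1]):    # both small, shared participant
            ops.append(("merge", i, i + 1))
            i += 2
        else:
            ops.append(("keep", i))
            i += 1
    return ops


# Toy inputs (made up for illustration).
turns = allocate(["u1", "u2", "u3", "u4"], alpha=0.75)
ops = plan_aggregation(
    [(12, {"I"}), (3, {"squirrel"}), (4, {"squirrel", "I"}), (5, {"garden"})],
    tau=8,
)
```

With `alpha=0.75`, three of the four toy units go to Speaker 1; with `tau=8`, the first tree is marked for splitting and the next two, which share a participant, are marked for merging.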

Table: Core Pipeline Stages and Parameters

Stage	Key Parameter(s)	Operation Summary
Content Allocation	Speaker 1 content fraction $\alpha$	Split turns between speakers
Aggregation/De-aggregation	Node threshold $\tau$	Merge/split DsyntS trees
Pronominalization	--	Replace with pronoun when contextually licensed
Pragmatic Markers	Style-weight vector $w$	Insert dialogic feature by softmax
Lexical Choice	Synonym set, substitution condition	Synonym replacement

A high-level pseudocode implementation of the full pipeline is included as an explicit procedure (see original source for details) (Bowden et al., 2017).
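One stage of that procedure, content elaboration by pruning, might look like the sketch below. It assumes a simple hypothetical `Node` class for DsyntS trees and an illustrative mapping from arc labels to wh-words; neither is taken from the M2D source, and the pruning probability is passed in explicitly.

```python
from __future__ import annotations

import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical DsyntS node: a lexeme plus the arc label from its parent."""
    lexeme: str
    relation: str = "ROOT"
    children: list[Node] = field(default_factory=list)


# Illustrative mapping from arc label to wh-word (an assumption).
WH_WORDS = {"I": "who", "II": "what"}


def prune_to_question(root: Node, rng: random.Random, p: float):
    """With probability p, replace one eligible child subtree by a wh-word.

    Returns (question_tree, answer_subtree), or None when no pruning happens."""
    if rng.random() >= p:
        return None
    eligible = [c for c in root.children if c.relation in WH_WORDS]
    if not eligible:
        return None
    target = rng.choice(eligible)
    # Copy the root, swapping the chosen subtree for a bare wh-word node.
    question_children = [
        Node(WH_WORDS[c.relation], c.relation) if c is target else c
        for c in root.children
    ]
    question = Node(root.lexeme, root.relation, question_children)
    return question, target


# Toy tree for "plant (red) tomato" with a single prunable argument.
sentence = Node("plant", children=[Node("tomato", "II", [Node("red", "ATTR")])])
result = prune_to_question(sentence, random.Random(0), p=1.0)
```

Here `result` holds a question tree in which the object has been replaced by "what", together with the removed subtree that serves as the answer; the original tree is left unmodified so it can still be realized as a statement.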

3. Persona and Personality Control (“Enacting”)

Enacting operationalizes speaker-level variation in dialogic realization by controlling personality traits along a reduced Big-Five axis.

Trait Vector: Each speaker $s$ is given a trait value $t_s$ on the extraversion axis, taking one value for an extravert and another for an introvert.
Style-Weight Vector: For the dialog-feature set $F$, two prototype weight vectors are defined ($w^{E}$ for extraverts, $w^{I}$ for introverts), and speaker $s$ uses the prototype $w_s$ that matches its trait.

The probability of feature $f$ in speaker $s$'s turn is then the softmax over that speaker's weights:

$P(f \mid s) = \frac{\exp(w_{s,f})}{\sum_{f' \in F} \exp(w_{s,f'})}$

Content Allocation Bias: Extraverts are assigned more dialog content than introverts.

Personality instantiation directly modulates dialogic markers, degree of content allocation, and dialogic act selection in realization.
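To make the link between personality and dialogic markers concrete, the sketch below turns a prototype style-weight vector into feature probabilities with a softmax. The feature names and the weight values are invented for illustration and do not come from the source; only the softmax form follows the description of pragmatic marker insertion.

```python
import math

# Hypothetical prototype style weights over a small, made-up feature set.
W_EXTRAVERT = {"hedge": 0.2, "exclamation": 1.5, "tag_question": 1.0}
W_INTROVERT = {"hedge": 1.5, "exclamation": 0.1, "tag_question": 0.4}


def feature_probabilities(weights: dict[str, float]) -> dict[str, float]:
    """Softmax over style weights: P(f) = exp(w_f) / sum_j exp(w_j)."""
    peak = max(weights.values())  # subtract the max for numerical stability
    exps = {f: math.exp(w - peak) for f, w in weights.items()}
    total = sum(exps.values())
    return {f: e / total for f, e in exps.items()}


p_extravert = feature_probabilities(W_EXTRAVERT)
p_introvert = feature_probabilities(W_INTROVERT)
```

With these toy weights the extravert prototype puts most of its probability mass on exclamations and the introvert prototype on hedges, which is the kind of contrast the personality recognition task relies on.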

4. Key Mathematical Formulations

Dialogue-Act Selection: For a candidate dialog act $a$ in context $c$, the score for speaker $s$ combines three terms: one that encodes the local context and act, one that indicates interactivity, and one that biases extraverted behavior. The highest-scoring act is selected.
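The selection step can be illustrated with a small sketch. The additive form of the score, the weights, the coupling of the trait term to interactive acts, and the encoding of the trait as +1 (extravert) or -1 (introvert) are all assumptions made for this example; they stand in for the three terms described above and are not the source's formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogAct:
    name: str
    context_fit: float   # stand-in for the term encoding local context and act
    interactive: bool    # whether the act invites a response from the partner


def score(act: DialogAct, extraversion: float,
          w_interactive: float = 0.2, w_trait: float = 0.5) -> float:
    """Assumed additive score: context fit + interactivity + trait bias."""
    total = act.context_fit
    if act.interactive:
        # The trait bias only affects interactive acts in this sketch.
        total += w_interactive + w_trait * extraversion
    return total


def select_act(acts: list[DialogAct], extraversion: float) -> DialogAct:
    """Pick the highest-scoring candidate act for the given speaker trait."""
    return max(acts, key=lambda act: score(act, extraversion))


candidates = [
    DialogAct("statement", context_fit=1.0, interactive=False),
    DialogAct("question", context_fit=0.5, interactive=True),
]
extravert_choice = select_act(candidates, extraversion=1.0)
introvert_choice = select_act(candidates, extraversion=-1.0)
```

Under these toy settings the extraverted speaker chooses the question and the introverted speaker the plain statement, showing how a single trait value can shift act selection without changing the candidate set.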

LLM Adaptation: The basic LLM is adapted to a persona embedding; the adapted model combines an encoding of its input with a learned projection (Bowden et al., 2017).

5. Evaluation Methodology and Quantitative Results

M2D is empirically evaluated using a controlled setup on two narratives (Garden, Squirrel) from PersonaBank, with three system variants generated per story:

M2D-EST: Pure EST, equally segmented, no dialogic variation.
M2D-Basic: Adds pronominalization, aggregation/de-aggregation, minimal post-processing.
M2D-Chatty: Further introduces pragmatic markers, QA, paraphrase.

The evaluation protocol consists of a human-subjects assessment (Amazon Mechanical Turk, 5 judges per HIT) of engagement (“How engaging was the dialog?”) and naturalness (“How natural did it read?”) on a 1–5 Likert scale. In the personality recognition task, judges identify extraverted, introverted, or default dialog variants across four stories and three personality configurations, which yields a confusion matrix of judgments and an overall accuracy.

Key quantitative outcomes include:

Engagement: Mean scores are M2D-EST: 2.4, M2D-Basic: 2.6, M2D-Chatty: 3.1; the increase of Chatty over EST is reported as significant.
Naturalness: Mean scores are M2D-EST: 2.2, M2D-Basic: 3.4, M2D-Chatty: 2.9; Basic is reported as significantly more natural than EST.
Personality Recognition: 88% overall accuracy; confusion matrix shows near-perfect discrimination between extraverted and introverted instantiations.

These findings indicate that dialogic enrichments via the TME pipeline increase engagement, that minimal transformations boost naturalness, and that generated personalities are clearly discernible by human judges (Bowden et al., 2017).

6. Synthesis and Thematic Implications

The TME paradigm—synthesized as “Telling” (SIG + DsyntS), “Making” (pipeline control over content and dialogic structure), and “Enacting” (persona-informed realization)—provides a principled decomposition of computational conversational storytelling. Its explicit mapping from deep semantic representation to dialogic surface form enables controlled experimentation with dialogic and personality features, facilitating systematic ablation and enrichment analyses.

A plausible implication is that parameterized TME pipelines offer a robust framework for both fine-grained dialog generation research and scalable applications in storytelling systems, virtual agents, and educational narrative technologies. The explicit control over persona parameters and dialogic strategies distinguishes TME-based architectures from approaches that treat dialog generation as monolithic sequence transduction.


References

Bowden et al. (2017). M2D: Monolog to Dialog Generation for Conversational Story Telling.
