
Telling-Making-Enacting: Narrative Pipeline

Updated 16 April 2026
  • Telling-Making-Enacting (TME) is a computational framework that decomposes narrative generation into deep semantic encoding, systematic content transformation, and persona-driven surface realization.
  • The Making phase algorithmically converts monologic stories into dialogs using techniques like content allocation, aggregation, pronominalization, and lexical adjustments.
  • Empirical evaluations of TME, implemented in the M2D system, demonstrate enhanced engagement and clear personality instantiation in generated dialogic narratives.

The Telling–Making–Enacting (TME) paradigm constitutes a pipeline architecture for transforming deep story representations into dialogic conversational narratives. Within the context of computational storytelling, TME formally decomposes the narrative generation process into “Telling” (deep semantic encoding), “Making” (systematic content transformation), and “Enacting” (persona-driven surface realization). The architecture is instantiated in the M2D (“Monolog to Dialog”) system, which algorithmically converts monologic retellings into multi-speaker dialog while incorporating personality models and explicit dialogic variation (Bowden et al., 2017).

1. Deep Representation (“Telling”)

The Telling phase composes the underlying semantic content of a story using two hierarchical structures: the Story Intention Graph (SIG) and Deep Syntactic Structures (DsyntS).

  • Story Intention Graph (SIG): Formally, the SIG is a directed graph

$$G = (V_E \cup V_C,\; E_T \cup E_C \cup E_A)$$

where
    - $V_E = \{e_1, \dots, e_n\}$ are event nodes (atomic story actions),
    - $V_C = \{c_1, \dots, c_m\}$ are character/prop nodes,
    - $E_T \subset V_E \times V_E$ encodes temporal precedence ($e_i \to e_j$ if $e_i$ precedes $e_j$),
    - $E_C \subset (V_E \times V_C) \cup (V_C \times V_E)$ are participation edges (event-character/prop links),
    - $E_A \subset V_E \times \{\textit{goal}, \textit{plan}, \textit{affect}\} \times V_C$ annotates intentions and affects.

  • Deep Syntactic Structures (DsyntS): Each text sentence $s_k$ is represented as a tree

$$T_k = (N_k, \ell_k, A_k)$$

with node set $N_k$, node-label function $\ell_k$, and labeled arcs $A_k$.

The coupling of the SIG (“fabula”) and DsyntS (“discourse skeleton”) comprises the narrative’s deep structure in the TME system (Bowden et al., 2017).
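
The two structures above can be sketched as plain Python data classes. This is a minimal illustration, not the representation used in M2D: the class names, the field names, and the `timeline` helper are assumptions made for this example, and participation edges are stored in one direction only for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from graphlib import TopologicalSorter

INTENTION_KINDS = {"goal", "plan", "affect"}


@dataclass
class StoryIntentionGraph:
    """Hypothetical container mirroring G = (V_E u V_C, E_T u E_C u E_A)."""
    events: set[str] = field(default_factory=set)                       # V_E
    characters: set[str] = field(default_factory=set)                   # V_C
    temporal: set[tuple[str, str]] = field(default_factory=set)         # E_T
    participation: set[tuple[str, str]] = field(default_factory=set)    # E_C
    intentions: set[tuple[str, str, str]] = field(default_factory=set)  # E_A

    def add_precedence(self, earlier: str, later: str) -> None:
        """Record that `earlier` precedes `later`."""
        if earlier not in self.events or later not in self.events:
            raise ValueError("both endpoints must be event nodes")
        self.temporal.add((earlier, later))

    def add_participant(self, event: str, character: str) -> None:
        """Link an event to a character or prop that takes part in it."""
        if event not in self.events or character not in self.characters:
            raise ValueError("unknown event or character")
        self.participation.add((event, character))

    def annotate(self, event: str, kind: str, character: str) -> None:
        """Attach a goal, plan, or affect annotation to an event."""
        if kind not in INTENTION_KINDS:
            raise ValueError(f"unknown annotation kind: {kind}")
        if event not in self.events or character not in self.characters:
            raise ValueError("unknown event or character")
        self.intentions.add((event, kind, character))

    def timeline(self) -> list[str]:
        """Return the events in an order consistent with E_T."""
        predecessors = {event: set() for event in self.events}
        for earlier, later in self.temporal:
            predecessors[later].add(earlier)
        return list(TopologicalSorter(predecessors).static_order())


@dataclass
class DsyntsNode:
    """One node of a sentence tree: a label plus the arc label from its parent."""
    lexeme: str
    relation: str = "ROOT"
    children: list[DsyntsNode] = field(default_factory=list)

    def size(self) -> int:
        """Count the nodes in the subtree rooted here."""
        return 1 + sum(child.size() for child in self.children)


sig = StoryIntentionGraph(events={"plant", "water", "harvest"}, characters={"narrator"})
sig.add_precedence("plant", "water")
sig.add_precedence("water", "harvest")
sig.add_participant("plant", "narrator")
sig.annotate("plant", "goal", "narrator")

sentence = DsyntsNode("plant", children=[
    DsyntsNode("narrator", relation="I"),
    DsyntsNode("tomato", relation="II"),
])
print(sig.timeline())   # ['plant', 'water', 'harvest']
print(sentence.size())  # 3
```

Keeping the edge sets separate, rather than storing one typed edge list, makes each component of the formal definition directly inspectable.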

2. Systematic Conversion Pipeline (“Making”)

The Making phase constitutes the algorithmic transformation from deep story structures into dialogic form. This phase is realized by the M2D system as a multi-stage process, summarized both in descriptive steps and as high-level pseudocode.

  • Content Allocation: A parameter $\alpha \in [0, 1]$ determines the fraction of content assigned to Speaker 1, with $1 - \alpha$ for Speaker 2.
  • Aggregation/De-aggregation: If a DsyntS tree $T$ exceeds a node threshold $\tau$, it is split; if two consecutive trees are both below $\tau$ and share a participant, they are merged.
  • Pronominalization & Coreference: Entities previously mentioned and now available for first/second-person realization are replaced with pronouns.
  • Content Elaboration: Question–answer (QA) pairs are generated by pruning: with probability $p_{QA}$, a selected node is replaced with a suitable wh-word and its subtree is dropped to form the question; the removed subtree becomes the answer.
  • Pragmatic Marker Insertion: A style-weight vector $\mathbf{w}$ over $K$ features controls the insertion of dialogic markers (e.g., hedges, exclamations); the probability of marker $m$ is the softmax

$$P(m) = \frac{\exp(w_m)}{\sum_{k=1}^{K} \exp(w_k)}$$

  • Lexical Choice: Content words can be replaced by synonyms, either when a substitution condition is met or to diversify the wording.
  • Morphosyntactic Post-processing: Grammatical contractions, possessives, and final punctuation are processed before realization.
  • Surface Realization: RealPro renders DsyntS trees into grammatical English dialog turns.
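
The allocation and aggregation steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions: a content unit is a flat tuple of node labels standing in for a DsyntS tree, the names `Unit`, `allocate`, `split_large`, `merge_small`, `alpha`, and `tau` are hypothetical, and the deterministic turn-assignment rule is one possible way to honor a target fraction rather than the rule used in M2D.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A content unit: a flat stand-in for one DsyntS tree (an assumption)."""
    nodes: tuple[str, ...]
    participants: frozenset[str]


def allocate(units: list[Unit], alpha: float) -> list[tuple[str, Unit]]:
    """Give roughly a fraction `alpha` of the units to Speaker 1 (S1)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    turns = []
    for i, unit in enumerate(units):
        # S1 takes unit i whenever its running quota ceil(i * alpha) increases.
        to_first = math.ceil((i + 1) * alpha) > math.ceil(i * alpha)
        turns.append(("S1" if to_first else "S2", unit))
    return turns


def split_large(units: list[Unit], tau: int) -> list[Unit]:
    """De-aggregate: cut any unit with more than `tau` nodes into chunks."""
    if tau < 1:
        raise ValueError("tau must be at least 1")
    out = []
    for unit in units:
        for start in range(0, max(len(unit.nodes), 1), tau):
            out.append(Unit(unit.nodes[start:start + tau], unit.participants))
    return out


def merge_small(units: list[Unit], tau: int) -> list[Unit]:
    """Aggregate: join consecutive units that are both below `tau` nodes
    and share at least one participant."""
    out: list[Unit] = []
    for unit in units:
        prev = out[-1] if out else None
        if (prev is not None
                and len(prev.nodes) < tau
                and len(unit.nodes) < tau
                and prev.participants & unit.participants):
            out[-1] = Unit(prev.nodes + unit.nodes,
                           prev.participants | unit.participants)
        else:
            out.append(unit)
    return out


story = [
    Unit(("I", "plant", "tomato"), frozenset({"narrator"})),
    Unit(("I", "water", "it"), frozenset({"narrator"})),
    Unit(("squirrel", "steal", "tomato", "from", "garden", "yesterday"),
         frozenset({"squirrel"})),
]
pieces = split_large(story, tau=4)
merged = merge_small(pieces, tau=4)
turns = allocate(merged, alpha=0.5)
for speaker, unit in turns:
    print(speaker, " ".join(unit.nodes))
```

A real implementation would split and merge along subtree boundaries; the flat chunking here only shows where the threshold enters the decision.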

Table: Core Pipeline Stages and Parameters

| Stage | Key Parameter(s) | Operation Summary |
|---|---|---|
| Content Allocation | $\alpha$ (Speaker 1 content fraction) | Split turns between speakers |
| Aggregation/De-aggregation | $\tau$ (node threshold) | Merge/split DsyntS trees |
| Pronominalization | -- | Replace with pronoun when contextually licensed |
| Pragmatic Markers | $\mathbf{w}$ (style-weight vector) | Insert dialogic feature by softmax |
| Lexical Choice | synonym set, substitution condition | Synonym replacement |
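
The softmax-based marker insertion summarized in the table can be sketched as below. The feature names, the marker texts, and the explicit "none" option are assumptions made for this example; only the idea of sampling a dialogic marker from a softmax over style weights comes from the description above.

```python
from __future__ import annotations

import math
import random


def softmax(weights: dict[str, float]) -> dict[str, float]:
    """Turn style weights into a probability for each marker."""
    peak = max(weights.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(w - peak) for name, w in weights.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Hypothetical markers; each one rewrites the surface form of a turn.
MARKERS = {
    "hedge": lambda turn: turn.rstrip(".!?") + ", I guess.",
    "exclamation": lambda turn: turn.rstrip(".!?") + "!",
    "none": lambda turn: turn,
}


def insert_marker(turn: str, weights: dict[str, float], rng: random.Random) -> str:
    """Sample one marker according to the softmax and apply it to the turn."""
    probs = softmax(weights)
    names = list(probs)
    chosen = rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
    return MARKERS[chosen](turn)


rng = random.Random(0)
style = {"hedge": 0.2, "exclamation": 1.5, "none": 0.0}
print(softmax(style))
print(insert_marker("The squirrel ate the tomato.", style, rng))
```

Passing the random generator in explicitly keeps generated dialogs reproducible, which matters when the same story is realized under several style settings for comparison.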

The original source also presents the full pipeline as high-level pseudocode (Bowden et al., 2017).

3. Persona and Personality Control (“Enacting”)

Enacting operationalizes speaker-level variation in dialogic realization by controlling personality traits along a reduced Big-Five axis.

  • Trait Vector: Each speaker is given a trait value that marks the speaker as an extravert or an introvert.
  • Style-Weight Vector: For the dialog-feature set, two prototype style-weight vectors are defined, one for extraverts and one for introverts, and each speaker's style weights follow from that speaker's trait. The probability of a feature appearing in a speaker's turn is then computed from those style weights.
  • Content Allocation Bias: Extraverts are assigned more dialog content than introverts.

Personality instantiation directly modulates dialogic markers, degree of content allocation, and dialogic act selection in realization.
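
A minimal sketch of how a trait could drive both style and allocation is given below. Everything numeric here is an assumption for illustration: the prototype weights, the feature names, the use of a softmax to obtain feature probabilities, and the `bias` value that shifts content toward the extravert are hypothetical and are not taken from the source.

```python
from __future__ import annotations

import math

# Hypothetical prototype style weights, one vector per trait.
PROTOTYPES = {
    "extravert": {"exclamation": 1.5, "tag_question": 1.0, "hedge": -0.5},
    "introvert": {"exclamation": -1.0, "tag_question": -0.5, "hedge": 1.5},
}


def feature_probabilities(trait: str) -> dict[str, float]:
    """Map a speaker's trait to a probability for each dialog feature
    (assumed here to be a softmax over the matching prototype)."""
    weights = PROTOTYPES[trait]
    peak = max(weights.values())
    exps = {name: math.exp(w - peak) for name, w in weights.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def speaker_one_share(trait_one: str, trait_two: str, bias: float = 0.1) -> float:
    """Return Speaker 1's share of the content; extraverts receive more."""
    if not 0.0 <= bias <= 0.5:
        raise ValueError("bias must lie in [0, 0.5]")
    if trait_one == trait_two:
        return 0.5
    return 0.5 + bias if trait_one == "extravert" else 0.5 - bias


print(feature_probabilities("extravert"))
print(speaker_one_share("extravert", "introvert"))
```

The share returned by `speaker_one_share` would play the role of the content-allocation fraction in the Making phase, so a single trait setting changes both how much a speaker says and how it is phrased.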

4. Key Mathematical Formulations

  • Dialogue-Act Selection: For a candidate dialog act $a$ in context $c$, the score for speaker $s$ combines three terms: one that encodes the local context and act, one that indicates interactivity, and one that biases extraverted behavior. An act is then selected on the basis of this score.
  • LLM Adaptation: A basic LLM is adapted to a persona embedding; the adapted form involves an encoding term and a learned projection (Bowden et al., 2017).
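
Dialogue-act selection with the three score components described above might look like the following sketch. The additive combination, the weight values, the rule that the extraversion bias applies only to interactive acts, and the choice of the highest-scoring act are all assumptions made for this example; `DialogAct`, `score`, and `select_act` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DialogAct:
    name: str
    context_fit: float  # stands in for the term encoding local context and act
    interactive: bool   # whether the act invites a response from the partner


def score(act: DialogAct, extraverted: bool,
          interactivity_weight: float = 0.5,
          extraversion_bias: float = 1.0) -> float:
    """Add an interactivity bonus and, for extraverts, an extra bias."""
    total = act.context_fit
    if act.interactive:
        total += interactivity_weight
        if extraverted:
            total += extraversion_bias
    return total


def select_act(candidates: list[DialogAct], extraverted: bool) -> DialogAct:
    """Pick the highest-scoring candidate (an assumed selection rule)."""
    if not candidates:
        raise ValueError("no candidate acts")
    return max(candidates, key=lambda act: score(act, extraverted))


candidates = [
    DialogAct("statement", context_fit=1.0, interactive=False),
    DialogAct("question", context_fit=0.2, interactive=True),
]
print(select_act(candidates, extraverted=False).name)  # statement
print(select_act(candidates, extraverted=True).name)   # question
```

With these illustrative weights the same candidate set yields a plain statement for an introvert and a question for an extravert, which is the kind of persona-dependent divergence the scoring terms are meant to produce.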

5. Evaluation Methodology and Quantitative Results

M2D is empirically evaluated using a controlled setup on two narratives (Garden, Squirrel) from PersonaBank, with three system variants generated per story:

  • M2D-EST: Pure EST (Expressive Story Translator) output, equally segmented, no dialogic variation.
  • M2D-Basic: Adds pronominalization, aggregation/deaggregation, minimal post-processing.
  • M2D-Chatty: Further introduces pragmatic markers, QA, paraphrase.

The evaluation protocol uses human-subjects assessment (Amazon Mechanical Turk, 5 judges per HIT) of engagement (“How engaging was the dialog?”) and naturalness (“How natural did it read?”) on a 1–5 Likert scale. In the personality recognition task, judges identify extraverted, introverted, or default dialog variants across four stories and three personality configurations, which yields a confusion matrix of judgments and an overall accuracy.

Key quantitative outcomes include:

  • Engagement: Mean scores: M2D-EST 2.4, M2D-Basic 2.6, M2D-Chatty 3.1; the increase from EST to Chatty is significant.
  • Naturalness: Mean scores: M2D-EST 2.2, M2D-Basic 3.4, M2D-Chatty 2.9; Basic is significantly more natural than EST.
  • Personality Recognition: 88% overall accuracy; confusion matrix shows near-perfect discrimination between extraverted and introverted instantiations.

These findings indicate that dialogic enrichments via the TME pipeline reliably increase engagement, that minimal transformations boost naturalness, and that generated personalities are clearly discernible by human judges (Bowden et al., 2017).
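
For readers who want to reproduce the summary statistics from their own judgments, the two quantities reported above (a mean Likert score and an overall recognition accuracy from a confusion matrix) can be computed as in this sketch. The ratings and counts below are placeholder values invented for the example; they are not the study's data, and the function names are hypothetical.

```python
from __future__ import annotations


def mean_rating(ratings: list[int]) -> float:
    """Average a list of 1-5 Likert judgments."""
    if not ratings:
        raise ValueError("no ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on a 1-5 scale")
    return sum(ratings) / len(ratings)


def overall_accuracy(confusion: dict[str, dict[str, int]]) -> float:
    """Fraction of judgments on the diagonal of a confusion matrix.

    Outer keys are the intended personality, inner keys the judged one.
    """
    correct = sum(row.get(label, 0) for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    if total == 0:
        raise ValueError("empty confusion matrix")
    return correct / total


# Placeholder judgments, NOT taken from the evaluation described above.
toy_ratings = [3, 4, 2, 4, 3]
toy_confusion = {
    "extravert": {"extravert": 9, "introvert": 0, "default": 1},
    "introvert": {"extravert": 0, "introvert": 8, "default": 2},
    "default": {"extravert": 1, "introvert": 1, "default": 8},
}
print(mean_rating(toy_ratings))
print(overall_accuracy(toy_confusion))
```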

6. Synthesis and Thematic Implications

The TME paradigm—synthesized as “Telling” (SIG + DsyntS), “Making” (pipeline control over content and dialogic structure), and “Enacting” (persona-informed realization)—provides a principled decomposition of computational conversational storytelling. Its explicit mapping from deep semantic representation to dialogic surface form enables controlled experimentation with dialogic and personality features, facilitating systematic ablation and enrichment analyses.

A plausible implication is that parameterized TME pipelines offer a robust framework for both fine-grained dialog generation research and scalable applications in storytelling systems, virtual agents, and educational narrative technologies. The explicit control over persona parameters and dialogic strategies distinguishes TME-based architectures from approaches that treat dialog generation as monolithic sequence transduction.
