Telling-Making-Enacting: Narrative Pipeline
- Telling-Making-Enacting (TME) is a computational framework that decomposes narrative generation into deep semantic encoding, systematic content transformation, and persona-driven surface realization.
- The Making phase algorithmically converts monologic stories into dialogs using techniques like content allocation, aggregation, pronominalization, and lexical adjustments.
- Empirical evaluations of TME, implemented in the M2D system, demonstrate enhanced engagement and clear personality instantiation in generated dialogic narratives.
The Telling–Making–Enacting (TME) paradigm constitutes a pipeline architecture for transforming deep story representations into dialogic conversational narratives. Within the context of computational storytelling, TME formally decomposes the narrative generation process into “Telling” (deep semantic encoding), “Making” (systematic content transformation), and “Enacting” (persona-driven surface realization). The architecture is instantiated in the M2D (“Monolog to Dialog”) system, which algorithmically converts monologic retellings into multi-speaker dialog while incorporating personality models and explicit dialogic variation (Bowden et al., 2017).
1. Deep Representation (“Telling”)
The Telling phase composes the underlying semantic content of a story using two hierarchical structures: the Story Intention Graph (SIG) and Deep Syntactic Structures (DsyntS).
- Story Intention Graph (SIG): Formally, the SIG is a directed graph whose nodes are event nodes (atomic story actions) and character/prop nodes. Its edges encode temporal precedence (one event points to another if it precedes it) and participation (event-character/prop links), and further labels annotate intentions and affects.
- Deep Syntactic Structures (DsyntS): Each text sentence is represented as a tree defined by a node set, a node-label function, and a set of labeled arcs.
The coupling of the SIG (“fabula”) and DsyntS (“discourse skeleton”) comprises the narrative’s deep structure in the TME system (Bowden et al., 2017).
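As a rough illustration of how the two structures might be held in memory, the sketch below models a SIG and a DsyntS tree as Python dataclasses. The class names, field names, and the toy story are assumptions made for this example and are not taken from Bowden et al. (2017) or from PersonaBank; the arc labels I, II, and ATTR are used here only as typical deep-syntactic relation names.

```python
from dataclasses import dataclass, field


@dataclass
class StoryIntentionGraph:
    """Toy SIG: two kinds of nodes plus precedence, participation, and annotations."""
    events: list = field(default_factory=list)        # atomic story actions
    characters: list = field(default_factory=list)    # characters and props
    precedes: set = field(default_factory=set)        # (earlier_event, later_event)
    participates: set = field(default_factory=set)    # (event, character_or_prop)
    annotations: dict = field(default_factory=dict)   # event -> intention/affect note

    def participants(self, event):
        """Return every character or prop linked to the given event."""
        return {who for ev, who in self.participates if ev == event}


@dataclass
class DsyntNode:
    """One node of a toy DsyntS tree; `rel` labels the arc from its parent."""
    lexeme: str
    rel: str = "ROOT"
    children: list = field(default_factory=list)

    def size(self):
        """Number of nodes in the subtree rooted here."""
        return 1 + sum(child.size() for child in self.children)


# Hypothetical two-event story, invented for illustration only.
sig = StoryIntentionGraph(
    events=["plant", "water"],
    characters=["narrator", "tomatoes"],
    precedes={("plant", "water")},
    participates={("plant", "narrator"), ("plant", "tomatoes"), ("water", "narrator")},
    annotations={"plant": "goal: grow food"},
)
tree = DsyntNode("plant", children=[
    DsyntNode("narrator", rel="I"),
    DsyntNode("tomato", rel="II", children=[DsyntNode("ripe", rel="ATTR")]),
])
```

Keeping the graph (what happened) separate from the trees (how each sentence is built) mirrors the fabula versus discourse-skeleton split described above.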
2. Systematic Conversion Pipeline (“Making”)
The Making phase constitutes the algorithmic transformation from deep story structures into dialogic form. This phase is realized by the M2D system as a multi-stage process, summarized both in descriptive steps and as high-level pseudocode.
- Content Allocation: A parameter $\alpha$ sets the fraction of content assigned to Speaker 1, with the remaining $1 - \alpha$ going to Speaker 2.
- Aggregation/De-aggregation: If a DsyntS tree has more nodes than a node threshold, it is split; if two consecutive trees both fall below the threshold and share a participant, they are merged.
- Pronominalization & Coreference: Entities previously mentioned and now available for first/second-person realization are replaced with pronouns.
- Content Elaboration: Question-answer (QA) pairs are generated by pruning. With a set probability, a selected node is replaced with a suitable wh-word and its subtree is dropped to form the question; the removed subtree becomes the answer.
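- Pragmatic Marker Insertion: A style-weight vector $w$ over $K$ dialogic features controls the insertion of dialogic markers (e.g., hedges, exclamations). Under a softmax over these weights, the probability of marker $i$ is
$$P(i) = \frac{\exp(w_i)}{\sum_{j=1}^{K} \exp(w_j)}$$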
- Lexical Choice: Content words can be replaced by synonyms, either when a substitution criterion is met or simply to diversify wording.
- Morphosyntactic Post-processing: Grammatical contractions, possessives, and final punctuation are processed before realization.
- Surface Realization: RealPro renders DsyntS trees into grammatical English dialog turns.
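The first two steps above can be sketched on simplified data. This is a minimal illustration, not the M2D implementation: each DsyntS tree is flattened to a dict with a list of node lexemes and a set of participants, the split simply halves the node list, and the quota-style allocation rule is an assumption about how a content fraction could be applied. The function names `allocate` and `rebalance` are hypothetical.

```python
import math


def allocate(units, alpha):
    """Give speaker 1 roughly a fraction alpha of the units, spread evenly."""
    turns = []
    for i, unit in enumerate(units):
        # Speaker 1 takes a unit whenever floor(k * alpha) increases.
        takes = math.floor((i + 1) * alpha) > math.floor(i * alpha)
        turns.append((1 if takes else 2, unit))
    return turns


def rebalance(trees, threshold):
    """Split trees above the threshold; merge small neighbours sharing a participant."""
    out = []  # entries are [tree, unsplit]; pieces produced by a split are never merged
    for tree in trees:
        nodes, people = tree["nodes"], tree["participants"]
        if len(nodes) > threshold:
            mid = len(nodes) // 2  # assumption: a real system would split at a clause
            out.append([{"nodes": nodes[:mid], "participants": people}, False])
            out.append([{"nodes": nodes[mid:], "participants": people}, False])
            continue
        if len(nodes) < threshold and out and out[-1][1]:
            prev = out[-1][0]
            if len(prev["nodes"]) < threshold and prev["participants"] & people:
                out[-1][0] = {"nodes": prev["nodes"] + nodes,
                              "participants": prev["participants"] | people}
                continue
        out.append([tree, True])
    return [tree for tree, _ in out]


# Toy input, invented for illustration.
trees = [
    {"nodes": ["walk", "narrator"], "participants": {"narrator"}},
    {"nodes": ["see", "narrator", "dog"], "participants": {"narrator", "dog"}},
    {"nodes": ["chase", "dog", "ball", "red", "quickly", "across", "park", "big"],
     "participants": {"dog"}},
]
balanced = rebalance(trees, threshold=6)
turns = allocate(["u1", "u2", "u3", "u4"], alpha=0.75)
```

With a threshold of 6, the two short trees share the narrator and are merged, while the eight-node tree is split into two halves. With `alpha=0.75`, three of the four units go to speaker 1.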
Table: Core Pipeline Stages and Parameters
| Stage | Key Parameter(s) | Operation Summary |
|---|---|---|
| Content Allocation | Speaker 1 content fraction | Split turns between speakers |
| Aggregation/De-aggregation | Node threshold | Merge/split DsyntS trees |
| Pronominalization | -- | Replace with pronoun when contextually licensed |
| Pragmatic Markers | Style-weight vector | Insert dialogic feature by softmax |
| Lexical Choice | Synonym set, substitution criterion | Synonym replacement |
The original source also presents the full pipeline as high-level pseudocode in an explicit procedure (Bowden et al., 2017).
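The question-answer elaboration step of the Making phase lends itself to a short sketch. The version below is an assumption-laden illustration rather than the procedure from the source: the `Node` class, its `wh` field (the wh-word that may stand in for a node), and the function `make_qa` are hypothetical, and only direct children of the root are considered for pruning.

```python
import copy
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy DsyntS node; `wh` is the wh-word that may replace it ('' if none)."""
    lexeme: str
    rel: str = "ROOT"
    wh: str = ""
    children: list = field(default_factory=list)


def make_qa(root, p, rng):
    """With probability p, prune one wh-eligible child to form a question.

    Returns (question_tree, answer_subtree), or None when nothing is pruned.
    The input tree is left unchanged.
    """
    candidates = [i for i, child in enumerate(root.children) if child.wh]
    if not candidates or rng.random() >= p:
        return None
    i = rng.choice(candidates)
    answer = root.children[i]
    question = copy.deepcopy(root)
    # The wh-word keeps the arc label but drops the whole subtree.
    question.children[i] = Node(answer.wh, rel=answer.rel)
    return question, answer


# Toy sentence tree, invented for illustration.
root = Node("plant", children=[
    Node("narrator", rel="I"),
    Node("tomato", rel="II", wh="what", children=[Node("ripe", rel="ATTR")]),
])
question, answer = make_qa(root, p=1.0, rng=random.Random(0))
skipped = make_qa(root, p=0.0, rng=random.Random(0))
```

Here the question tree keeps the verb and its subject but carries "what" in place of the object, and the removed object subtree is returned as the answer for the other speaker's turn.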
3. Persona and Personality Control (“Enacting”)
Enacting operationalizes speaker-level variation in dialogic realization by controlling personality traits along a reduced Big-Five axis.
- Trait Vector: Each speaker is assigned a trait value on the extraversion dimension, with one setting denoting an extravert and the other an introvert.
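- Style-Weight Vector: For the dialog-feature set, two prototype weight vectors are defined, one for extraverts and one for introverts, and each speaker's style-weight vector is derived from these prototypes according to the speaker's trait. The probability of a given feature appearing in a speaker's turn is then computed from that speaker's style-weight vector.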
- Content Allocation Bias: Extraverts are assigned a larger share of the dialog content than introverts.
Personality instantiation directly modulates dialogic markers, degree of content allocation, and dialogic act selection in realization.
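One way to picture these three controls is the sketch below. Everything specific in it is an assumption made for illustration: the trait is coded as 1.0 for an extravert and 0.0 for an introvert, the speaker's weights are a linear blend of the two prototypes, feature probabilities use the softmax named for marker insertion in the Making phase, and the allocation share is a clipped linear function with a hypothetical `bias` parameter. None of these formulas are confirmed by the source.

```python
import math


def style_weights(trait, extravert_proto, introvert_proto):
    """Blend the prototypes: trait 1.0 is fully extravert, 0.0 fully introvert."""
    return {f: trait * extravert_proto[f] + (1.0 - trait) * introvert_proto[f]
            for f in extravert_proto}


def feature_probabilities(weights):
    """Softmax over the feature weights."""
    peak = max(weights.values())  # subtract the max for numerical stability
    exps = {f: math.exp(w - peak) for f, w in weights.items()}
    total = sum(exps.values())
    return {f: e / total for f, e in exps.items()}


def allocation_share(trait_1, trait_2, bias=0.2):
    """Share of content for speaker 1; grows with its relative extraversion."""
    return min(1.0, max(0.0, 0.5 + bias * (trait_1 - trait_2)))


# Hypothetical prototypes over two dialogic features.
EXTRAVERT = {"exclamation": 2.0, "hedge": 0.0}
INTROVERT = {"exclamation": 0.0, "hedge": 2.0}

p_extravert = feature_probabilities(style_weights(1.0, EXTRAVERT, INTROVERT))
p_introvert = feature_probabilities(style_weights(0.0, EXTRAVERT, INTROVERT))
share = allocation_share(trait_1=1.0, trait_2=0.0)
```

Under these assumptions the extravert speaker favours exclamations, the introvert favours hedges, and an extravert paired with an introvert receives more than half of the content.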
4. Key Mathematical Formulations
- Dialog-Act Selection: For each candidate dialog act in a given context, a speaker-specific score is computed from three components: a term encoding the local context and the act, an indicator of interactivity, and a bias toward extraverted behavior. The highest-scoring act is selected.
- LLM Adaptation: A basic language model is adapted to a speaker by conditioning it on a persona embedding, which is mapped into the model through a learned projection (Bowden et al., 2017).
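The dialog-act selection rule can be sketched as an argmax over candidate acts. The additive form of the score, the parameters `lam` and `beta`, and the choice to apply the extraversion bias only to interactive acts are assumptions for this example, since the exact formula is not reproduced here. The act names and fit values are likewise invented.

```python
def select_act(candidates, local_fit, interactive_acts, trait, lam=0.5, beta=1.0):
    """Pick the highest-scoring act for a speaker with the given trait.

    local_fit: act -> how well the act suits the local context (assumed given)
    interactive_acts: acts that count as interactive
    trait: 1.0 for an extravert, 0.0 for an introvert (assumed coding)
    """
    def score(act):
        interactive = 1.0 if act in interactive_acts else 0.0
        # context term + interactivity term + extraversion bias on interactive acts
        return local_fit[act] + lam * interactive + beta * trait * interactive

    return max(candidates, key=score)


# Toy candidates and fit values, invented for illustration.
acts = ["statement", "question"]
fit = {"statement": 1.0, "question": 0.2}

extravert_act = select_act(acts, fit, {"question"}, trait=1.0)
introvert_act = select_act(acts, fit, {"question"}, trait=0.0)
```

With these toy numbers the extravert speaker asks a question (score 1.7 against 1.0), while the introvert speaker falls back to a plain statement (0.7 against 1.0).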
5. Evaluation Methodology and Quantitative Results
M2D is empirically evaluated using a controlled setup on two narratives (Garden, Squirrel) from PersonaBank, with three system variants generated per story:
- M2D-EST: Pure EST (Expressive Story Translator) output, equally segmented, with no dialogic variation.
- M2D-Basic: Adds pronominalization, aggregation/deaggregation, minimal post-processing.
- M2D-Chatty: Further introduces pragmatic markers, QA, paraphrase.
The evaluation protocol consists of a human-subjects assessment (Amazon Mechanical Turk, 5 judges per HIT) of engagement (“How engaging was the dialog?”) and naturalness (“How natural did it read?”) on a 1–5 Likert scale. In the personality recognition task, judges identify extraverted, introverted, or default dialog variants across four stories and three personality configurations, yielding a confusion matrix of judgments and an overall accuracy.
Key quantitative outcomes include:
- Engagement: Mean scores were 2.4 for M2D-EST, 2.6 for M2D-Basic, and 3.1 for M2D-Chatty; the increase from EST to Chatty is reported as significant.
- Naturalness: Mean scores were 2.2 for M2D-EST, 3.4 for M2D-Basic, and 2.9 for M2D-Chatty; Basic is reported as significantly more natural than EST.
- Personality Recognition: 88% overall accuracy; confusion matrix shows near-perfect discrimination between extraverted and introverted instantiations.
These findings indicate that dialogic enrichments via the TME pipeline reliably increase engagement, minimal transformations boost naturalness, and generated personalities are clearly discernable by human judges (Bowden et al., 2017).
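For readers who want to reproduce the style of these summary statistics, the sketch below shows how a mean Likert score and an overall recognition accuracy are computed from raw judgments. The ratings and confusion counts are made-up toy values chosen only to exercise the arithmetic; they are not the study's data and do not reproduce the reported 88%.

```python
from statistics import mean


def accuracy(confusion):
    """Overall accuracy: correct judgments (the diagonal) over all judgments."""
    correct = sum(confusion[label][label] for label in confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


# Toy values, invented for illustration; rows are the intended personality,
# columns are what the judges chose.
toy_confusion = {
    "extravert": {"extravert": 9, "introvert": 0, "default": 1},
    "introvert": {"extravert": 0, "introvert": 8, "default": 2},
    "default":   {"extravert": 1, "introvert": 1, "default": 8},
}
toy_ratings = [3, 4, 2, 3, 3]  # five judges rating one dialog on a 1-5 scale

toy_accuracy = accuracy(toy_confusion)   # 25 correct out of 30 judgments
toy_engagement = mean(toy_ratings)       # average of the five ratings
```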
6. Synthesis and Thematic Implications
The TME paradigm, synthesized as “Telling” (SIG + DsyntS), “Making” (pipeline control over content and dialogic structure), and “Enacting” (persona-informed realization), provides a principled decomposition of computational conversational storytelling. Its explicit mapping from deep semantic representation to dialogic surface form enables controlled experimentation with dialogic and personality features, facilitating systematic ablation and enrichment analyses.
A plausible implication is that parameterized TME pipelines offer a robust framework for both fine-grained dialog generation research and scalable applications in storytelling systems, virtual agents, and educational narrative technologies. The explicit control over persona parameters and dialogic strategies distinguishes TME-based architectures from approaches that treat dialog generation as monolithic sequence transduction.