Character Interaction Networks
- Character interaction networks are graph representations that encode narrative characters as nodes and their interactions as edges, capturing relationships and dynamics.
- They are extracted with modular NLP pipelines (tokenization, NER, coreference resolution) that detect and weight character co-occurrences, enabling static, dynamic, and multilayer analyses.
- These networks support quantitative narrative analysis, including the study of plot progression, genre classification, and social dynamics across diverse media.
A character interaction network is a graph-theoretic representation of a narrative's ensemble, in which each node denotes a character and each edge encodes an interaction, relation, or co-occurrence between two characters. This formalism is inherently multi-modal: it is applicable to novels, plays, movie and TV scripts, comics, and even audio-visual narratives. The approach enables quantitative and algorithmic analysis of narrative structure, plot progression, and social dynamics within fiction, drama, and other narrative genres (Labatut et al., 2019).
1. Formal Structures and Definitions
Let $G = (V, E)$, where $V$ is the set of characters and $E$ is the set of interactions, typically instantiated as edges in a (usually undirected) graph (Labatut et al., 2019). Edges can be binary (presence/absence), weighted (by frequency, intensity, or semantic content), signed (friendship/rivalry), directed (speaker → addressee), and time-resolved (for dynamic network models). The adjacency matrix $A$ encodes this structure, with $A_{ij} = 1$ for an interaction between characters $v_i$ and $v_j$, and $A_{ij} = 0$ otherwise. In dynamic contexts, a sequence of graphs $G_1, \dots, G_T$ or of adjacency matrices $A^{(1)}, \dots, A^{(T)}$ encodes how the network evolves across scenes, chapters, or time-slices (Min et al., 2016, Amalvy et al., 2 Jul 2024).
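As a minimal sketch of this definition, the following builds the binary and the count-weighted adjacency matrices of an undirected network from a list of pairwise interactions. The cast, the interaction list, and the function name `adjacency_matrices` are invented for illustration and do not come from any of the cited works.

```python
# Hypothetical toy cast and interaction list, invented for illustration.
characters = ["Alice", "Bob", "Carol", "Dave"]
interactions = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Alice")]


def adjacency_matrices(characters, interactions):
    """Return (A, W): binary and count-weighted symmetric adjacency matrices."""
    index = {name: i for i, name in enumerate(characters)}
    n = len(characters)
    W = [[0] * n for _ in range(n)]
    for u, v in interactions:
        i, j = index[u], index[v]
        if i == j:
            continue  # ignore self-interactions
        W[i][j] += 1
        W[j][i] += 1  # undirected graph: keep the matrix symmetric
    # A records presence/absence only, regardless of how often a pair interacts.
    A = [[1 if w > 0 else 0 for w in row] for row in W]
    return A, W


A, W = adjacency_matrices(characters, interactions)
```

Repeated interactions between the same pair raise the entry in `W` but leave the corresponding entry in `A` at 1; a character with no interactions (here "Dave") keeps an all-zero row.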
Character interaction networks may be further enriched by node and edge attributes: narrative roles, gender, sentiment, topic-profiles, or narrative importance (Mujtaba et al., 14 Dec 2025, Tripto et al., 2023, Min et al., 2016).
2. Extraction Pipelines and Methodologies
Extraction proceeds through an NLP pipeline, with the canonical modular sequence:
- Tokenization: Segmenting the text into sentences and words.
- Named Entity Recognition (NER): Identification of PERSON entities in the text (with BERT-based models achieving 91.5% precision, 85.2% recall in French literary corpora (Chen et al., 26 Dec 2024)).
- Coreference Resolution: Clustering entity mentions (names, pronouns, titles) that denote the same character, with dedicated models for the literature domain (e.g., BookNLP-fr for French).
- Alias/Canonicalization: Resolution of alternate names/aliases to unique character IDs.
- Interaction Detection: Defining narrative windows (sentence, paragraph, scene, fixed token/k-sentence window), then declaring a tie when two characters co-occur, converse, or interact in actions or dialogue (Amalvy et al., 2 Jul 2024, Bonato et al., 2016, Holanda et al., 2017, Tripto et al., 2023).
- Edge Weighting: Aggregating interaction counts or employing more complex weighting (TF–IDF, sentiment, or topic similarity) (Amalvy et al., 2 Jul 2024, Min et al., 2016).
- Graph Construction: Assembling the final graph from nodes and weighted edges.
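The last three steps of the list above (interaction detection, edge weighting, graph construction) can be sketched in a few lines once characters have been identified. In this sketch a hand-written alias table stands in for NER, coreference resolution, and canonicalization, and the text is split into non-overlapping blocks of `window` sentences. The alias table, the example sentences, and the function names are assumptions made for illustration; this is not the API of Renard or of any other cited tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical alias table standing in for NER + coreference + canonicalization.
ALIASES = {
    "Elizabeth": "Elizabeth Bennet",
    "Lizzy": "Elizabeth Bennet",
    "Darcy": "Fitzwilliam Darcy",
    "Jane": "Jane Bennet",
}


def characters_in(sentence):
    """Return the set of canonical characters mentioned in one sentence."""
    tokens = sentence.replace(",", " ").replace(".", " ").split()
    return {ALIASES[t] for t in tokens if t in ALIASES}


def cooccurrence_edges(sentences, window=1):
    """Count character pairs that co-occur in non-overlapping blocks of `window` sentences."""
    edges = Counter()
    for start in range(0, len(sentences), window):
        present = set()
        for sentence in sentences[start:start + window]:
            present |= characters_in(sentence)
        # Each unordered pair present in the block adds 1 to its edge weight.
        for pair in combinations(sorted(present), 2):
            edges[pair] += 1
    return edges


# Invented example sentences.
sentences = [
    "Elizabeth met Darcy at the ball.",
    "Jane stayed home.",
    "Lizzy wrote to Jane.",
    "Darcy left town.",
]
per_sentence = cooccurrence_edges(sentences, window=1)
per_two_sentences = cooccurrence_edges(sentences, window=2)
```

The window size changes the resulting graph: with `window=1` only characters named in the same sentence are tied, while `window=2` also links characters mentioned in adjacent sentences. A sliding (overlapping) window is an equally common choice, but it can count the same co-occurrence more than once.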
This pipeline is modular and extensible, as exemplified by the Renard library, which allows users to compose pipelines from interchangeable NLP modules, enforce I/O contract validity between steps, and extract both static and dynamic interaction networks (Amalvy et al., 2 Jul 2024).
3. Static, Dynamic, and Multilayer Character Networks
Static Networks
A static network aggregates all character interactions over an entire narrative; edge weights count the number of co-occurrences within defined windows, resulting in a weighted adjacency matrix (Chen et al., 26 Dec 2024, Amalvy et al., 2 Jul 2024).
Dynamic Networks
Dynamic analysis partitions the narrative into slices (e.g., by chapter) and computes a snapshot network for each (Min et al., 2016, Bost et al., 2018). In Renard, this yields a collection $\{G_t\}_{t=1}^{T}$, with $w_{ij}^{(t)}$ as the number of interactions between characters $v_i$ and $v_j$ in slice $t$ (Amalvy et al., 2 Jul 2024). Advanced techniques, such as narrative smoothing, interpolate or prorate edge strengths between explicit occurrences to accommodate parallel or non-linear storylines (e.g., in episodic TV) (Bost et al., 2018).
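A minimal sketch of the slicing idea, assuming interactions have already been detected and tagged with a slice identifier such as a chapter number. The event list and the function name `snapshot_weights` are hypothetical; the sketch covers plain per-slice counting only and does not implement narrative smoothing.

```python
from collections import Counter, defaultdict


def snapshot_weights(events):
    """Group (slice_id, u, v) events into one weighted edge Counter per slice."""
    snapshots = defaultdict(Counter)
    for t, u, v in events:
        if u != v:
            # Undirected edge: store each pair under a canonical sorted key.
            snapshots[t][tuple(sorted((u, v)))] += 1
    return dict(sorted(snapshots.items()))


# Hypothetical events: (chapter, character, character).
events = [
    (1, "A", "B"), (1, "B", "A"),
    (2, "A", "C"),
    (3, "B", "C"), (3, "A", "B"),
]
snapshots = snapshot_weights(events)

# Summing all snapshots recovers the static, whole-narrative network.
static = sum(snapshots.values(), Counter())
```

Slices without any interaction simply do not appear in the result, so a caller that needs one snapshot per chapter would have to fill the gaps with empty counters.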
Multilayer Networks
Some frameworks distinguish interaction types and additional semantic layers, yielding multilayer (or multiplex) character networks. In movie script analysis, nodes are partitioned as characters ($C$), locations ($L$), and keywords ($K$), with explicit intra- and inter-layer edges captured in a block supra-adjacency matrix (Mourchid et al., 2018).
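The block structure of a supra-adjacency matrix can be illustrated as follows. Nodes are ordered layer by layer, so edges within a layer land in diagonal blocks and edges between layers land in off-diagonal blocks. The layer labels `C`, `L`, `K` (characters, locations, keywords), the toy nodes, and both function names are chosen here for illustration; this is a sketch of the general idea, not a reproduction of the construction in Mourchid et al. (2018).

```python
def supra_adjacency(layers, edges):
    """Build a symmetric 0/1 supra-adjacency matrix with nodes ordered layer by layer."""
    order = [(layer, node) for layer, nodes in layers.items() for node in nodes]
    index = {key: i for i, key in enumerate(order)}
    n = len(order)
    M = [[0] * n for _ in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        M[i][j] = M[j][i] = 1
    return M, index


def block(M, index, row_layer, col_layer):
    """Extract the sub-matrix linking nodes of `row_layer` to nodes of `col_layer`."""
    rows = [i for (layer, _), i in index.items() if layer == row_layer]
    cols = [j for (layer, _), j in index.items() if layer == col_layer]
    return [[M[i][j] for j in cols] for i in rows]


# Hypothetical toy layers and edges; each node is a (layer, name) pair.
layers = {"C": ["hero", "villain"], "L": ["castle"], "K": ["sword"]}
edges = [
    (("C", "hero"), ("C", "villain")),   # intra-layer: character-character
    (("C", "hero"), ("L", "castle")),    # inter-layer: character-location
    (("L", "castle"), ("K", "sword")),   # inter-layer: location-keyword
]
M, index = supra_adjacency(layers, edges)
character_block = block(M, index, "C", "C")
character_location_block = block(M, index, "C", "L")
```

Extracting `block(M, index, "C", "C")` recovers an ordinary character network, while the off-diagonal blocks record which characters are tied to which locations or keywords.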
4. Analytical Metrics and Structural Properties
Character interaction networks are analyzed using classic and advanced network measures (Labatut et al., 2019, Chen et al., 26 Dec 2024, Min et al., 2016, Tripto et al., 2023):
- Degree centrality ($C_D(i)$): Fraction of the other characters a focal node interacts with. $C_D(i) = k_i / (N - 1)$, where $k_i = \sum_j A_{ij}$ is the degree of node $i$ and $N = |V|$.
- Strength / Weighted degree ($s_i$): Sum of raw interaction weights. $s_i = \sum_j w_{ij}$.
- Betweenness centrality ($C_B(i)$): Frequency with which node $i$ intermediates shortest paths between pairs.
- Closeness centrality ($C_C(i)$): Inverse mean shortest path length from $i$ to all others.
- Clustering coefficient ($C_i$): Fraction of possible triangles through node $i$ that are closed.
- Eigenvector centrality: Node importance reflecting connection to other central nodes.
- Modularity $Q$: For community detection, capturing the extent to which the network decomposes into subplots or social groups.
- Assortativity: Tendency for nodes to connect to others of similar degree ($r < 0$ in character networks, indicating disassortativity (Holanda et al., 2017)).
- Motif analysis: Frequencies of 3- and 4-node subgraphs, used to fingerprint local connectivity and fit random graph models (Bonato et al., 2016).
- Growth and densification laws: Network size and edge growth across narrative progression can reveal expository phases and segmental structure (Min et al., 2016).
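Three of the local measures in the list above (degree centrality, strength, and the clustering coefficient) can be computed directly from a weighted edge list. The sketch below uses the standard textbook definitions: degree divided by the number of other nodes, the sum of incident edge weights, and the fraction of neighbor pairs that are themselves connected. The toy weights and function names are invented; characters without any edge are not represented in this input format and are therefore left out of the node count.

```python
from itertools import combinations


def build_adjacency(weights):
    """Turn {(u, v): w} into a symmetric dict-of-dicts adjacency structure."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the number of other nodes."""
    n = len(adj)
    return {u: len(nbrs) / (n - 1) for u, nbrs in adj.items()}


def strength(adj):
    """Sum of the weights of the edges incident to each node."""
    return {u: sum(nbrs.values()) for u, nbrs in adj.items()}


def clustering(adj):
    """Fraction of neighbor pairs of each node that are themselves connected."""
    result = {}
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            result[u] = 0.0  # no neighbor pair, so no triangle is possible
            continue
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        result[u] = closed / (k * (k - 1) / 2)
    return result


# Hypothetical weighted character network.
weights = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 2, ("A", "D"): 1}
adj = build_adjacency(weights)
centrality = degree_centrality(adj)
strengths = strength(adj)
local_clustering = clustering(adj)
```

In this toy network "A" is tied to every other character, so its degree centrality is 1, yet only one of its three neighbor pairs is connected, which gives it a lower clustering coefficient than "B" or "C".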
Empirical studies show that degree distributions typically follow a truncated power law, with a few hub characters and many with low degree (Holanda et al., 2017, Bonato et al., 2016).
5. Advanced Modeling: Edge Attribution, Multiplexity, Embeddings
Edge Attribution
Edges may be further enriched to capture additional aspects of relationships:
- Sentiment annotation: Aggregation of sentiment polarity indices over co-occurrences yields signed or real-valued edges, revealing positive vs. negative dynamics (Min et al., 2016, Tripto et al., 2023).
- Topic-weighted edges: Pairwise cosine similarity between topic-distribution vectors of characters quantifies thematic proximity (Min et al., 2016).
- Action-based or dialog-based interactions: Dependency parsing or dialog-role mining recovers directed or differentiated edge types (Labatut et al., 2019).
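The first two enrichments in the list above can be sketched as follows. Signed edges are obtained here by averaging a polarity score over all co-occurrences of a pair, and topic-weighted edges by the cosine similarity of two characters' topic-distribution vectors. Averaging is only one possible aggregation and is an assumption of this sketch, as are the polarity scores, the topic vectors, and the function names; sentiment scoring and topic modeling themselves are taken as given inputs.

```python
import math
from collections import defaultdict


def signed_edges(scored_cooccurrences):
    """Average per-co-occurrence polarity scores into one signed weight per pair."""
    scores = defaultdict(list)
    for u, v, polarity in scored_cooccurrences:
        scores[tuple(sorted((u, v)))].append(polarity)
    return {pair: sum(vals) / len(vals) for pair, vals in scores.items()}


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


# Hypothetical polarity scores in [-1, 1], one per co-occurrence.
scored = [("A", "B", 0.5), ("B", "A", -0.1), ("A", "C", -0.8)]
sentiment = signed_edges(scored)

# Hypothetical topic distributions over three topics, one vector per character.
topics = {"A": [0.7, 0.2, 0.1], "B": [0.6, 0.3, 0.1], "C": [0.0, 0.1, 0.9]}
sim_ab = cosine_similarity(topics["A"], topics["B"])
sim_ac = cosine_similarity(topics["A"], topics["C"])
```

A positive mean polarity marks a predominantly friendly tie and a negative one a hostile tie; in the topic example, "A" is thematically closer to "B" than to "C" because their vectors concentrate on the same topic.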
Multimodal and Multilayer Models
Integration of visual (face tracks, speaker diarization), textual (dialogue, narrative prose), and metadata (narrative role, gender, age) is supported in recent neural character interaction frameworks (Kukleva et al., 2020, Mujtaba et al., 14 Dec 2025). Multilayer models can quantify the role of non-character entities in narrative flow (Mourchid et al., 2018).
Representation Learning
Graph-level and node-level embeddings, obtained via unsupervised schemes (Graph2Vec, node2vec) or supervised Graph Attention Networks (GAT), capture structural and attribute-based information and enable downstream tasks such as authorship attribution, genre classification, and information retrieval (Mujtaba et al., 14 Dec 2025). In Urdu authorship attribution, GAT models leveraging node semantics achieve up to 0.857 accuracy, significantly surpassing hand-crafted feature baselines.
6. Applications, Empirical Results, and Interpretative Impact
Character interaction networks have facilitated:
- Quantitative plot analysis and literary theory evaluation (e.g., tracing motif transitions such as “star-to-clique” in “Boule de Suif”) (Chen et al., 26 Dec 2024).
- Cross-lingual and cross-genre comparison (French, Bengali, Urdu corpora) (Chen et al., 26 Dec 2024, Tripto et al., 2023, Mujtaba et al., 14 Dec 2025).
- Genre and authorship classification (via structural features or learned embeddings) (Holanda et al., 2017, Mujtaba et al., 14 Dec 2025).
- Extraction of information related to social structure, gender/role prominence, and historical change (e.g., quantifying shifts in women’s network centrality in Bengali fiction after legislative changes) (Tripto et al., 2023).
- Simulation and generative narratives: Character networks can serve as the backbone for automated plot generation, hypothesis-testing on narrative structure, or procedural storytelling, particularly when guided by appropriate random graph null models (the Chung–Lu model fits character networks better than preferential attachment or configuration models (Bonato et al., 2016)).
- Dynamic analysis and event detection: Tracking node strength or intercommunity connectivity over time exposes key plot twists, sub-plot emergence, or protagonist role changes (Bost et al., 2018, Min et al., 2016).
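The dynamic analysis in the last item can be illustrated with a deliberately simple heuristic: compute a character's strength in each slice and flag the slice where it changes most. This is an illustrative assumption rather than the method of Bost et al. (2018) or Min et al. (2016); the snapshot data and function names are invented.

```python
def strength_series(snapshots, character):
    """Strength of `character` in each snapshot, given as {(u, v): weight} dicts."""
    return [
        sum(w for pair, w in snap.items() if character in pair)
        for snap in snapshots
    ]


def largest_jump(series):
    """Return (slice_index, delta) for the largest absolute change between consecutive slices."""
    if len(series) < 2:
        return None
    i = max(range(1, len(series)), key=lambda k: abs(series[k] - series[k - 1]))
    return i, series[i] - series[i - 1]


# Hypothetical per-chapter snapshots of a small network.
snapshots = [
    {("A", "B"): 1},
    {("A", "B"): 2, ("A", "C"): 1},
    {("A", "C"): 7, ("A", "B"): 1},
    {("B", "C"): 2},
]
series_a = strength_series(snapshots, "A")
jump = largest_jump(series_a)
```

Here the strength of "A" rises over the first three slices and drops to zero in the last one, so the largest change is flagged at index 3. Whether such a change corresponds to a plot event is an interpretive question that the numbers alone do not settle.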
7. Challenges, Limitations, and Prospects
Character identification can be hampered by alias proliferation, ambiguity in pronoun resolution, and domain transfer limitations of generic NER/coreference tools, especially for low-resource languages (Tripto et al., 2023, Labatut et al., 2019). Static networks obscure temporal plot evolution, while dynamic slicing remains sensitive to narrative windowing choices and can struggle with nonlinear or intertwined plots (Bost et al., 2018, Min et al., 2016).
Manual annotation often supplements automated pipelines under such constraints. Richer models—including those supporting multilayer structure, directed/signed interactions, and joint multimodal inference—are active research areas (Mourchid et al., 2018, Kukleva et al., 2020). Future directions include integrating generative models of narrative, learning graph embeddings across media and languages, advancing coreference for complex literary registers, and extending evaluation to dynamic, at-scale gold standards (Labatut et al., 2019, Amalvy et al., 2 Jul 2024, Mujtaba et al., 14 Dec 2025).
Key References:
- Renard pipeline: modular extraction, static/dynamic, and custom weighting (Amalvy et al., 2 Jul 2024)
- Multilingual corpora and cross-genre analytics (French, Bengali, Urdu) (Chen et al., 26 Dec 2024, Tripto et al., 2023, Mujtaba et al., 14 Dec 2025)
- Narrative smoothing, dynamic networks for screen media (Bost et al., 2018)
- Multilayer movie script models (Mourchid et al., 2018)
- Empirical and generative modeling, motif analysis (Bonato et al., 2016, Holanda et al., 2017)
- Comprehensive survey (Labatut et al., 2019)
- Advanced textual networks with sentiment/topic weights (Min et al., 2016)
- Multimodal interaction/relationship learning (Kukleva et al., 2020)