BookWorld System Framework
- BookWorld System is a comprehensive, modular framework for quantifying and simulating literary social networks and narrative structures.
- It employs network science techniques and community detection methods to extract character relationships and reveal evolving narrative dynamics.
- The system integrates agent-based simulation with LLM-driven planning to generate interactive, empirically validated narratives.
A BookWorld System is a comprehensive, modular framework for extracting, representing, analyzing, and simulating the social and narrative structure of literary worlds, particularly novels and their adaptations. Modern BookWorld Systems combine network science techniques for quantifying character relationships with LLM-based multi-agent simulation for generative and interactive storytelling. The architecture extends from network-extracted character maps to dynamic agent-based generative narratives, supporting empirical comparative analysis, creative story generation, interactive games, and narrative-driven social simulation (Janosov, 2022; Ran et al., 20 Apr 2025).
1. Data-Driven Social Network Extraction
BookWorld Systems originate with the exhaustive extraction and quantification of character networks from raw narrative sources (novel text, subtitle files, screenplays). The canonical workflow is as follows (Janosov, 2022):
- Construct an authoritative character list and assign unique IDs.
- Tokenize the narrative into sentences; apply Named-Entity Recognition (NER) or deterministic string matching to identify character presence in each sentence.
- Using a sliding window of $k$ sentences (with $k$ chosen empirically), for all sentence pairs $(s_a, s_b)$ with $|a - b| \le k$, increment the edge weight $w_{ij}$ between co-occurring character pairs $(i, j)$.
- Build the adjacency matrix $A$, where $A_{ij} = w_{ij}$, or an edge list representation.
- Extend to multi-layer networks for cross-media comparison by defining separate adjacency matrices $A^{(\ell)}$ for each medium $\ell$, supporting a tensorial representation $A_{ij\ell}$.
This sequence transforms the corpus into a quantitative social map for downstream analysis.
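The workflow above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the cited pipeline: the regex sentence splitter and alias matching stand in for a real tokenizer and NER step, and the names `split_sentences`, `characters_in`, `cooccurrence_edges`, `adjacency_matrix`, and the `window` parameter are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def split_sentences(text):
    """Naive sentence splitter; a real pipeline would use an NLP tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def characters_in(sentence, aliases):
    """Return the IDs of characters with at least one alias in the sentence."""
    found = set()
    for char_id, names in aliases.items():
        if any(re.search(rf"\b{re.escape(n)}\b", sentence) for n in names):
            found.add(char_id)
    return found


def cooccurrence_edges(text, aliases, window=1):
    """Count co-mentions for sentence pairs at most `window` sentences apart."""
    present = [characters_in(s, aliases) for s in split_sentences(text)]
    weights = defaultdict(int)
    for a in range(len(present)):
        for b in range(a, min(a + window + 1, len(present))):
            if a == b:
                # Same sentence: each unordered pair is counted once.
                pairs = combinations(sorted(present[a]), 2)
            else:
                # Different sentences: pair characters across the two.
                pairs = ((i, j) for i in present[a] for j in present[b] if i != j)
            for i, j in pairs:
                weights[tuple(sorted((i, j)))] += 1
    return dict(weights)


def adjacency_matrix(weights, ids):
    """Symmetric weighted adjacency matrix, rows/columns ordered as `ids`."""
    index = {c: n for n, c in enumerate(ids)}
    A = [[0] * len(ids) for _ in ids]
    for (i, j), w in weights.items():
        A[index[i]][index[j]] = A[index[j]][index[i]] = w
    return A


# Toy input (hypothetical text and character list).
text = "Alice met Bob. Carol waved. Bob left with Carol."
aliases = {"alice": ["Alice"], "bob": ["Bob"], "carol": ["Carol"]}
weights = cooccurrence_edges(text, aliases, window=1)
A = adjacency_matrix(weights, ["alice", "bob", "carol"])
print(weights)
print(A)
```

In this toy text the Bob and Carol pair accumulates the largest weight, because the two characters co-occur both within a sentence and across adjacent sentences. Counting a pair once per qualifying sentence pair is one of several reasonable conventions; other pipelines may count once per window position instead.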
2. Network Analytic Metrics and Community Structure
BookWorld Systems leverage standard network-theoretic metrics for quantifying centrality, influence, and group structure:
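- Degree and strength centrality: $k_i = \sum_j \mathbb{1}[A_{ij} > 0]$ (unweighted); $s_i = \sum_j A_{ij}$ (weighted).
- Betweenness centrality: $C_B(v) = \sum_{s \neq v \neq t} \sigma_{st}(v) / \sigma_{st}$, where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(v)$ is the count passing through $v$.
- Closeness centrality: $C_C(i) = 1 / \sum_{j \neq i} d(i, j)$, where $d(i, j)$ is the shortest-path length between $i$ and $j$.
- Eigenvector centrality: $A\mathbf{x} = \lambda\mathbf{x}$, with the entries $x_i$ of the leading eigenvector as relative scores.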
- Optional: Clustering coefficient, PageRank.
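As a rough illustration of the metrics listed above, the following standard-library sketch computes degree, strength, closeness, and eigenvector centrality on a small hypothetical adjacency matrix. Closeness is taken here as the reciprocal of the summed hop distances, which is one common convention, and the eigenvector scores come from a simple power iteration. Betweenness is omitted for brevity; in practice a library such as NetworkX would typically be used for all of these.

```python
from collections import deque

# Toy weighted adjacency matrix (hypothetical data): node 1 is linked to 0, 2, 3.
A = [
    [0, 2, 0, 0],
    [2, 0, 1, 3],
    [0, 1, 0, 0],
    [0, 3, 0, 0],
]


def degree(A, i):
    """Number of neighbours of node i (unweighted)."""
    return sum(1 for w in A[i] if w > 0)


def strength(A, i):
    """Sum of edge weights incident to node i (weighted degree)."""
    return sum(A[i])


def hop_distances(A, source):
    """Breadth-first search distances (in hops) from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, w in enumerate(A[u]):
            if w > 0 and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(A, i):
    """Reciprocal of the summed distances from i to all reachable nodes."""
    total = sum(d for j, d in hop_distances(A, i).items() if j != i)
    return 1.0 / total if total > 0 else 0.0


def eigenvector_centrality(A, iterations=200):
    """Power iteration on A + I, scaled so the largest score equals 1.

    Adding the identity keeps the eigenvectors of A but prevents the
    oscillation that plain power iteration shows on bipartite graphs.
    """
    n = len(A)
    x = [1.0] * n
    for _ in range(iterations):
        y = [x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        top = max(y)
        x = [v / top for v in y]
    return x


degrees = [degree(A, i) for i in range(len(A))]
strengths = [strength(A, i) for i in range(len(A))]
scores = eigenvector_centrality(A)
print(degrees, strengths, closeness(A, 1), scores)
```

On this star-shaped toy graph the hub (node 1) ranks first on every measure, while the eigenvector scores of the leaves follow their edge weights to the hub.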
Community detection algorithms expose the mesoscale structure, primarily Louvain (greedy modularity maximization) and Girvan–Newman (divisive partitioning by iterative removal of high-betweenness edges). Clique enumeration (Bron–Kerbosch) identifies maximal cliques, i.e., fully connected subgroups that cannot be extended by another character.
Global statistics include node and edge counts $N$ and $m$, density $2m/[N(N-1)]$, average path length, diameter, and mean clustering coefficient.
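To make the modularity objective and the density formula concrete, the sketch below evaluates both on a hypothetical graph of two triangles joined by a single edge. It only scores a given partition using the standard Newman modularity $Q = \frac{1}{2m}\sum_{ij}\left[A_{ij} - \frac{s_i s_j}{2m}\right]\delta(c_i, c_j)$; it does not implement the Louvain or Girvan–Newman search itself. The function names are illustrative.

```python
def modularity(A, labels):
    """Newman modularity of the partition `labels` for a symmetric matrix A."""
    two_m = sum(sum(row) for row in A)      # total weight, each edge counted twice
    strength = [sum(row) for row in A]      # weighted degree of each node
    q = 0.0
    for i in range(len(A)):
        for j in range(len(A)):
            if labels[i] == labels[j]:
                q += A[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


def density(A):
    """Fraction of possible undirected edges present: 2m / [N(N-1)]."""
    n = len(A)
    m = sum(1 for i in range(n) for j in range(i + 1, n) if A[i][j] > 0)
    return 2 * m / (n * (n - 1))


# Hypothetical graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

two_groups = [0, 0, 0, 1, 1, 1]
one_group = [0] * 6
print(modularity(A, two_groups), modularity(A, one_group), density(A))
```

Splitting the graph into its two triangles gives a positive modularity (5/14), whereas placing every node in one community gives zero, which is the contrast a modularity-maximizing method exploits.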
3. Visualization and Comparative Workflow
Information visualization is integral to the BookWorld analytic pipeline:
- Force-directed layout algorithms (e.g., ForceAtlas2, Fruchterman–Reingold) provide interpretable, low-dimensional embeddings.
- Aesthetic encoding: node size proportional to centrality; color by community label; edge thickness by weight; selective labeling to reduce clutter.
- Export GraphML or GEXF formats to Gephi or similar platforms for interactive exploration.
The pipeline supports direct comparison of novel and screen adaptation layers, including:
- Character overlap analyses
- Centrality ranking contrasts
- Community structure evolution (merges/splits)
- Subgraph induction for intersectional metrics
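The comparisons listed above can be illustrated with a short standard-library sketch. It assumes each layer is available as a dictionary of centrality scores and a dictionary of weighted edges; the Jaccard index for overlap, the rank-difference measure, and all names and data here are illustrative choices rather than the cited method.

```python
def jaccard(a, b):
    """Character overlap between two layers as |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def ranks(scores):
    """Map each character to its rank (1 = highest score; ties broken by name)."""
    ordered = sorted(scores, key=lambda c: (-scores[c], c))
    return {c: r for r, c in enumerate(ordered, start=1)}


def rank_shifts(novel_scores, screen_scores):
    """Screen rank minus novel rank, computed over shared characters only."""
    shared = set(novel_scores) & set(screen_scores)
    rank_novel = ranks({c: novel_scores[c] for c in shared})
    rank_screen = ranks({c: screen_scores[c] for c in shared})
    return {c: rank_screen[c] - rank_novel[c] for c in sorted(shared)}


def induced_edges(edges, keep):
    """Restrict an edge dictionary to pairs whose endpoints are both in `keep`."""
    return {p: w for p, w in edges.items() if p[0] in keep and p[1] in keep}


# Hypothetical centrality scores and edges for a novel and its adaptation.
novel = {"A": 10, "B": 7, "C": 3, "D": 1}
screen = {"A": 6, "B": 9, "C": 2, "E": 4}
novel_edges = {("A", "B"): 3, ("A", "D"): 1, ("B", "C"): 2}

overlap = jaccard(novel, screen)
shifts = rank_shifts(novel, screen)
shared_subgraph = induced_edges(novel_edges, set(novel) & set(screen))
print(overlap, shifts, shared_subgraph)
```

A positive shift means a character ranks lower in the adaptation than in the novel. Re-ranking within the shared character set keeps the two layers comparable when one medium drops or adds characters.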
4. Agent-Based Simulation and Generative Narrative
Contemporary BookWorld Systems implement book-derived multi-agent environments for simulation and story generation (Ran et al., 20 Apr 2025):
- Persona encoder: Extracts static character profiles, dynamic personality vectors, and act-by-act outlines via LLM chunking, fact extraction, filtering, clustering, and summarization.
- World Model: Represents settings as a discrete geospatial map; a World Agent maintains global state, occupancy, and worldview constraints, resolving environment interactions through LLM prompts.
- Role Agents: Each major character is an agent with static and dynamic attributes (goals, health, memories), including short-term and long-term memory subsystems augmented via vector retrieval.
- Action Planning: At each simulation turn, an LLM-driven planner for each agent takes the agent's context (profile, memories, worldview, visible actors) and produces a JSON-formatted action plan.
Story generation proceeds by iterating through scenes, simulating agent interactions—role-to-role, NPC, environment, or solitary—capturing action logs, and employing post-simulation LLM rephrasing to synthesize a readable narrative.
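The turn-based loop described above can be sketched as follows. This is a deliberately simplified illustration, not the system's actual implementation: `stub_planner` is a hypothetical deterministic stand-in for the LLM call, the class and field names are assumptions, long-term memory with vector retrieval is left out, and the final LLM rephrasing step is reduced to returning the raw action log.

```python
import json
from collections import deque
from dataclasses import dataclass, field


def stub_planner(context):
    """Hypothetical stand-in for an LLM call; returns a JSON action plan as text."""
    name, visible = context["name"], context["visible"]
    if visible:
        plan = {"action": "speak", "target": visible[0],
                "detail": f"{name} greets {visible[0]}"}
    else:
        goal = context["goal_location"]
        plan = {"action": "move", "target": goal,
                "detail": f"{name} heads to {goal}"}
    return json.dumps(plan)


@dataclass
class RoleAgent:
    name: str
    location: str
    goal_location: str
    # Short-term memory only: the last few events this agent took part in.
    short_term: deque = field(default_factory=lambda: deque(maxlen=5))

    def plan(self, visible):
        """Assemble the planning context and parse the JSON plan."""
        context = {"name": self.name, "goal_location": self.goal_location,
                   "visible": visible, "memories": list(self.short_term)}
        return json.loads(stub_planner(context))


@dataclass
class WorldAgent:
    agents: dict
    log: list = field(default_factory=list)

    def visible_to(self, agent):
        """Names of other agents at the same location."""
        return sorted(a.name for a in self.agents.values()
                      if a is not agent and a.location == agent.location)

    def apply(self, agent, plan):
        """Update global state and record the action."""
        if plan["action"] == "move":
            agent.location = plan["target"]
        self.log.append(plan["detail"])
        agent.short_term.append(plan["detail"])


def run_scene(world, turns):
    """Let every agent plan and act once per turn; return the action log."""
    for _ in range(turns):
        for agent in world.agents.values():
            world.apply(agent, agent.plan(world.visible_to(agent)))
    return world.log


# Toy scene with two hypothetical agents and locations.
world = WorldAgent({
    "Harker": RoleAgent("Harker", "village", "castle"),
    "Dracula": RoleAgent("Dracula", "castle", "castle"),
})
log = run_scene(world, turns=2)
print(log)
```

Keeping the planner behind a single function that returns JSON text means a real LLM client could replace `stub_planner` without changing the agent or world classes, provided its output is validated before `json.loads` is trusted.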
BookWorld is formalized as an MDP-style multi-agent system with an extended narrative reward that combines character fidelity, world consistency, and narrative tension.
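The functional form of this reward is not given here. One way such a combination could be written, offered purely as an illustrative assumption (the weighted-sum form, the weights $\lambda_k$, and the component names are hypothetical and not taken from the cited work), is:

$$R(s_t, \mathbf{a}_t) = \lambda_1 R_{\text{fidelity}}(s_t, \mathbf{a}_t) + \lambda_2 R_{\text{world}}(s_t, \mathbf{a}_t) + \lambda_3 R_{\text{tension}}(s_t, \mathbf{a}_t), \qquad \lambda_k \ge 0$$

where $s_t$ denotes the global world state at turn $t$ and $\mathbf{a}_t$ the joint action of all role agents. Under this reading, the weights would control the trade-off between staying faithful to the source characters, respecting the world's constraints, and keeping the plot dramatically engaging.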
5. Implementation Details, Empirical Results, and Tools
A typical pipeline combines open-source and cloud-based tools:
| Step | Implementation | Tools/Libraries |
|---|---|---|
| Character/entity extraction | NER / LLM fact extraction | spaCy, LLM APIs, NLTK, pandas |
| Network assembly | Co-mention counting, matrices | Python (NetworkX, igraph), R, pandas |
| Analysis & community | Centrality, modularity | NetworkX, python-igraph, Gephi, R (ggraph) |
| Visualization | Force-directed layouts | Gephi, NetworkX, GraphML/GEXF export |
| Agent-based simulation | LLM orchestration, retrieval | GPT-4o, Gemini-2, Qwen-Plus, vector DBs |
Empirical evaluation on six Chinese and ten English novels (e.g., A Song of Ice and Fire, Solaris, Dracula) demonstrates:
- BookWorld surpasses direct generation on anthropomorphism (91.3%), character fidelity (73.9%), immersion and setting (98.5%), writing quality (91.3%), and storyline/creativity (87.0%).
- Against the HoLLMwood baseline [Chen et al., 2024], BookWorld achieves win rates of 56.5–97.1% across evaluation dimensions, with substantial improvements in setting immersion and source fidelity.
- Aggregate win in majority metrics is 75.36%.
- Ablation reveals degraded narrative coherence and immersion if scene segmentation, environmental responses, or worldview settings are omitted (Ran et al., 20 Apr 2025).
6. Applications, Extensions, and Future Trajectories
BookWorld Systems support a spectrum of applications:
- Creative story branching: Exploration of counterfactual narrative scenarios within established worlds.
- Interactive games: Tabletop or digital environments with dynamic non-player character (NPC) and world modeling.
- Social simulation and narrative analysis: Controlled experiments on character network dynamics and emergent groups under modified conditions.
Ongoing and future research targets:
- Enhanced spatial reasoning and path planning
- Fine-grained emotion and affect simulation
- Real-time multi-user interactivity
- Novel-specific LLM fine-tuning for style consistency and deep persona fidelity
- Reinforcement learning targeting the global narrative reward for increased coherence and dramatic quality
A plausible implication is that BookWorld Systems offer a reproducible, extensible methodology bridging quantitative literary analysis and generative AI-based story simulation, combining the rigor of network science with the creative flexibility of LLMs (Janosov, 2022, Ran et al., 20 Apr 2025).