Thoughtbubbles in AI: Adaptive and Visual Reasoning

Updated 2 October 2025
  • Thoughtbubbles are computational and physical constructs that isolate and parallelize individual reasoning units in AI, multimodal systems, and robotics.
  • They leverage adaptive computation, graph-based decomposition, and interactive visualization to improve inference quality and operational efficiency.
  • Empirical studies show improvements in perplexity, zero-shot generalization, and user engagement across transformer models, reasoning graphs, and expressive robotic platforms.

Thoughtbubbles refer to computational and physical constructs designed to isolate, parallelize, or visualize individual units of thought or reasoning, whether in transformer-based AI architectures, multimodal human-computer interfaces, or expressive robotic systems. Recent research addresses “Thoughtbubbles” in multiple domains: transformer mechanisms for adaptive parallel reasoning, frameworks for graph-based decomposition of problem-solving steps, interactive visualization platforms, expressive modules for social robots, and reflective ideation in virtual reality.

1. Adaptive Parallel Reasoning in Transformers

The “Thoughtbubbles” model (Liu et al., 30 Sep 2025) introduces an unsupervised architectural mechanism for adaptive parallel computation in transformer networks. Unlike chain-of-thought (CoT) approaches, which verbalize reasoning as explicit text at inference time, Thoughtbubbles allows transformers to natively “fork” residual streams in latent space. At designated layers, each token can either clone (fork) its internal representation, spawning a “bubble” of additional computation, or be deleted, enabling dynamic per-token allocation of compute. The decision is governed by fork and keep scores, computed per token i at layer k as:

$$\hat{c}^{\,k}_{\text{fork},i} = p^{\,k-1}_{\text{cum},i} \, p^{\,k}_{\text{fork},i} \qquad (1)$$

$$\hat{c}^{\,k}_{\text{keep},i} = p^{\,k-1}_{\text{cum},i} \, p^{\,k}_{\text{keep},i} \qquad (2)$$

A top-k operation is applied at each layer to constrain the compute budget, with forked bubbles inheriting a learned offset for position encoding. The adaptive computation is learned entirely under the language modeling objective; no extra supervision is used. Evaluations on OpenWebText, peS2o, HellaSwag, and LAMBADA show lower perplexity and improved zero-shot generalization over standard non-adaptive baselines.
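
As a rough illustration of this selection step, the sketch below pools per-stream fork and keep scores and applies a top-k budget; the function `select_bubbles`, its tensor shapes, and the pooling strategy are assumptions for exposition, not the paper's implementation:

```python
import torch

def select_bubbles(p_fork, p_keep, p_cum_prev, budget_k):
    """Hypothetical per-layer stream allocation in the Thoughtbubbles style.

    p_fork, p_keep: (num_streams,) fork/keep probabilities at layer k
    p_cum_prev:     (num_streams,) cumulative scores carried from layer k-1
    budget_k:       maximum number of residual streams allowed at this layer
    """
    # Eqs. (1)-(2): weight each decision by the stream's cumulative score.
    c_fork = p_cum_prev * p_fork
    c_keep = p_cum_prev * p_keep

    # Pool both candidate sets and retain the top-k under the layer budget;
    # streams selected via c_fork are cloned, streams selected via c_keep
    # are retained, and everything else is deleted.
    scores = torch.cat([c_fork, c_keep])                    # (2 * num_streams,)
    top = torch.topk(scores, k=budget_k).indices
    fork_idx = top[top < p_fork.numel()]                    # streams to clone
    keep_idx = top[top >= p_fork.numel()] - p_fork.numel()  # streams to retain
    return fork_idx, keep_idx
```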

2. Graph-Based Reasoning and Information Aggregation

The Graph of Thoughts (GoT) framework (Besta et al., 2023) extends thoughtbubbles into the prompt engineering and reasoning paradigm for LLMs. Here, each “thought” is treated as a vertex (“bubble”) in an arbitrarily structured graph, where edges represent dependencies among reasoning units. This approach generalizes CoT and tree-of-thoughts (ToT), supporting aggregation, feedback loops, and the merging of partial solutions. The volume of a thought—quantifying how many bubbles contributed contextually—is formalized as:

$$\text{Volume}(v) = \bigl|\{\, w \in V : \text{there is a path from } w \text{ to } v \,\}\bigr|$$
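
Computing this volume reduces to reverse reachability in the dependency graph. The sketch below instantiates the definition directly over a plain edge list; the function name and graph representation are illustrative, not part of the GoT framework's API:

```python
from collections import deque

def volume(edges, v):
    """|{ w in V : there is a path from w to v }| via reverse BFS.

    edges: iterable of (src, dst) dependency pairs in the reasoning graph
    v:     the target thought (vertex)
    """
    # Build the reversed adjacency list and search backwards from v.
    rev = {}
    for src, dst in edges:
        rev.setdefault(dst, []).append(src)

    seen, queue = {v}, deque([v])
    while queue:
        node = queue.popleft()
        for pred in rev.get(node, []):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    # Count proper predecessors; add 1 if the trivial path w = v should count.
    return len(seen) - 1
```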

Task performance (sorting, set intersection, document merging) demonstrates that GoT yields up to 62% improvement in output quality and 31% reduction in inference cost versus ToT. Crucially, GoT mirrors human cognition—nonlinear, network-like aggregation of ideas—rather than strict serial logic.

3. Interactive Visualization and Human-LLM Collaboration

“Thoughtbubbles” as a metaphor and technical construct appear in interactive systems such as iToT (Boyle et al., 31 Aug 2024), which enables users to navigate, edit, and evaluate a tree-of-thought reasoning process via a node-link visual interface. Every “thought” acts as a bubble/node connected by edges, with semantic grouping and scoring by path. The design facilitates expansion, user injection of custom thoughts, and real-time feedback—transforming static outputs into manipulable reasoning graphs.
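
A minimal sketch of the node-link state such an interface manipulates might look as follows; the field and method names are assumptions for illustration, not iToT's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ThoughtNode:
    """One 'bubble' in an interactive tree-of-thoughts view."""
    text: str                        # the thought's content
    score: float = 0.0               # evaluation score along its path
    user_added: bool = False         # True if injected by the user
    children: list["ThoughtNode"] = field(default_factory=list)

    def expand(self, candidates: list[str], user_added: bool = False) -> None:
        # Attach newly generated (or user-supplied) thoughts as children.
        self.children.extend(
            ThoughtNode(text=c, user_added=user_added) for c in candidates
        )

    def best_path(self) -> list["ThoughtNode"]:
        # Greedily follow the highest-scoring child down to a leaf.
        path, node = [self], self
        while node.children:
            node = max(node.children, key=lambda n: n.score)
            path.append(node)
        return path
```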

In virtual reality, VIVRA (Xing et al., 23 Sep 2024) uses LLMs to transcribe user speech, extract key topics, and generate floating “idea balloons” (thoughtbubbles) in a 3D space. Balloons are dynamically created, merged, edited, and repositioned with multimodal controls (gaze, controller input). Evaluations find that visualizing thoughts in this manner enhances reflection, engagement, and creative reorganization processes.
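
A schematic sketch of that transcribe-extract-place loop is shown below; `extract_topics` and `place` are hypothetical stand-ins for the system's LLM and spatial-layout components, not VIVRA's actual interface:

```python
from dataclasses import dataclass

@dataclass
class IdeaBalloon:
    """One floating 'idea balloon' in the 3D scene."""
    topic: str
    weight: float                          # mention count; drives rendered size
    position: tuple[float, float, float]   # 3D placement

def update_balloons(balloons, chunk, extract_topics, place):
    """Fold one transcribed speech chunk into the balloon scene.

    extract_topics: fn(text) -> list[str]  (stands in for the LLM topic call)
    place:          fn() -> (x, y, z)      (stands in for spatial layout logic)
    """
    index = {b.topic: b for b in balloons}
    for topic in extract_topics(chunk):
        if topic in index:
            index[topic].weight += 1.0     # merge repeats: grow the balloon
        else:
            balloons.append(IdeaBalloon(topic, 1.0, place()))
    return balloons
```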

4. Expressivity in Physical Robotic Platforms

Beyond computation, “Thoughtbubbles” serve as explicit nonverbal expressive cues in robotics (Koike et al., 2023). Fluid-based cues—especially bubbles—are prototyped as part of EmoPack, a modular add-on for social robots. Bubbles are chosen to convey curiosity, delight, and playful affect, leveraging cultural and graphical tropes from animation and comics. Their physical dynamics (drifting upward, transient presence) make robots seem lively even with static facial hardware, and rhythm/repetition can signal nuanced mental states. Bubbles augment traditional eye and head gestures, enriching the social presence and communicative repertoire of robots.

5. Iterative Reasoning and Visible Cognition

THiNK (Yu et al., 26 May 2025) operationalizes thoughtbubbles in the context of evaluating higher-order reasoning in LLMs. The framework uses multiple agents modeled on Bloom’s Taxonomy, each producing feedback and scores over iterative cycles of problem revision. The process is explicitly stepwise, mapping human think-aloud annotation and feedback onto the LLM’s output: each revision and critique can be read as a visible “thoughtbubble,” permitting detailed introspection and transparency in reasoning evaluation. Empirical results show improvements in higher-order cognitive abilities when structured feedback is provided, compared with static output evaluation.
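
A hedged sketch of the revise-and-critique cycle such a framework implies; the agent signatures and acceptance threshold here are illustrative assumptions, not THiNK's published interface:

```python
def think_loop(problem, solver, critics, accept=0.9, max_rounds=3):
    """Iteratively revise an answer under multi-agent critique.

    solver:  fn(problem, feedback) -> answer
    critics: list of fn(problem, answer) -> (score, feedback),
             e.g. one agent per level of Bloom's Taxonomy
    """
    feedback, history = [], []
    for _ in range(max_rounds):
        answer = solver(problem, feedback)          # revised attempt
        reviews = [critic(problem, answer) for critic in critics]
        feedback = [text for _, text in reviews]
        history.append((answer, reviews))           # each round is a visible "thoughtbubble"
        if all(score >= accept for score, _ in reviews):
            break                                   # assumed acceptance criterion
    return answer, history
```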

6. Reflection and Ideation Enhancement via Multimodal Systems

Multimodal systems such as VIVRA (Xing et al., 23 Sep 2024) demonstrate the utility of thoughtbubble visualization for personal and educational ideation. Rendering verbalized topics as spatially distributed, size-coded 3D bubbles gives users quick visual access to discussion structure and content emphasis. Empirical data show that this presentation promotes reflection and creativity better than transcript and word-cloud baselines. Supporting both interactive and recorded narrative modes allows time-based, reorganizable idea tracking.

7. Implications and Unified Directions

A cross-cutting implication of these approaches is the trend toward unifying train-time and inference-time behavior in reasoning models (Liu et al., 30 Sep 2025). Thoughtbubbles-style architectures learn computation allocation purely through language modeling, dissolving the artificial boundary between explicit chain-of-thought supervision and model-native “thinking.” Similarly, graph-, tree-, and feedback-linked thoughtbubbles provide extensible APIs for both human and automated reasoning systems, augmenting interpretability, transparency, and expressivity across physical and virtual platforms.

This suggests that future AI systems may benefit from deeper integration of native thoughtbubble mechanisms—adaptive computation, semantic visualization, physical expressiveness, and iterative think-aloud annotation—not only in language modeling but also in collaborative, educational, and social contexts.
