InMind: Cognitive & Neural Decoding Framework

Updated 3 July 2026

InMind is a dual-focus framework that evaluates LLMs on individualized reasoning in social deduction games and decodes visual signals from fMRI using a subject–object disentanglement approach.
The LLM component measures how well models capture personalized, temporally anchored reasoning through tasks like player identification, trace attribution, and role inference in structured game settings.
The $i$MIND neural pipeline employs a self-supervised ViT-MAE, orthonormal basis factorization, and dual decoding to achieve state-of-the-art cross-subject generalization and interpretable neural representation.

InMind encompasses both a cognitively anchored evaluation framework that tests how well LLMs capture and apply individualized human reasoning styles in social deduction games (SDGs), and the $i$MIND neural decoding architecture for subject-invariant decoding of visual signals from fMRI. The two address distinct dimensions of subjectivity and individuation in cognition, one in behavioral strategy and the other in neural representation and decoding, and exemplify convergent themes in computational cognitive science and neural engineering (Li et al., 22 Aug 2025, Yin et al., 22 Sep 2025).

1. Conceptual Foundation and Motivation

InMind (LLM): The InMind framework responds to limitations in existing theory-of-mind (ToM) benchmarks, which typically restrict evaluation to the global plausibility of intent judgments or to false-belief attribution. These settings fail to probe whether LLMs genuinely internalize the style and trajectory of a specific individual's reasoning as it unfolds in real-time, sequential social contexts. SDGs like Avalon, with their transparent utterances, sequential moves, and evolving private/public states, provide a testbed for observing and evaluating actual individualized reasoning behaviors (Li et al., 22 Aug 2025).

$i$MIND (Neural Decoding): In parallel, the $i$MIND (Insightful Multi-subject Invariant Neural Decoding) model addresses the problem that cross-subject neural decoding from fMRI is dominated by subject-specific variability, which both limits generalization and obscures interpretation of brain-based visual processing. The $i$MIND architecture is designed to explicitly factorize, decode, and interpret both individual- and object-level components of neural representations, enabling scalable, interpretable, and generalizable neural decoding within and across subjects (Yin et al., 22 Sep 2025).

2. InMind LLM Framework: Methodology and Task Structure

The InMind framework is built around a formal structured game representation: $\mathcal{G} = \langle mode,\,\mathcal{A},\,\{E_z\}_{z=1}^m,\,\mathcal{F}\rangle$, where $mode$ is the data-collection mode, $\mathcal{A}$ is the mapping from players to roles, $E_z$ encodes round $z$ (utterances, game state, and strategy trace), $m$ is the number of rounds, and $\mathcal{F}$ is the global session reflection. Data are collected under two modes: Observer (no direct participation) and Participant (active player).

Dual-layer cognitive annotations are attached: (1) round-level strategy traces $\{S_z\}$ capturing quasi-veridical records of evolving beliefs, intentions, and inference processes; and (2) a high-level reflective summary $\mathcal{F}$, interpreting key events and giving meta-strategic assessments.
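The tuple $\mathcal{G}$ and its two annotation layers can be pictured as a small record type. The following is a minimal sketch, not the authors' data schema: the class names `Round` and `GameSession`, the field names, and the toy player IDs and utterances are all hypothetical, chosen only to mirror the symbols above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Literal


@dataclass
class Round:
    """One round E_z: utterances, game state, and the strategy trace S_z."""
    utterances: List[str]
    game_state: Dict[str, str]  # hypothetical free-form state fields
    strategy_trace: str         # round-level annotation S_z


@dataclass
class GameSession:
    """G = <mode, A, {E_z}_{z=1..m}, F>."""
    mode: Literal["observer", "participant"]
    roles: Dict[str, str]                       # A: player id -> role
    rounds: List[Round] = field(default_factory=list)
    reflection: str = ""                        # F: post-game reflective summary

    def traces(self) -> List[str]:
        """Collect the round-level strategy traces {S_z} in order."""
        return [r.strategy_trace for r in self.rounds]


# Toy session with made-up content, for illustration only.
session = GameSession(
    mode="observer",
    roles={"P1": "Merlin", "P2": "Assassin"},
    rounds=[Round(["P1: I trust P3."], {"quest": "1"}, "Suspects P2 after the vote.")],
    reflection="P2's early vote was the key tell.",
)
```

Keeping the trace on each round and the reflection on the session mirrors the two annotation layers, so a masked-trace or masked-reflection task can be built by blanking one field without touching the other.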

Evaluation is operationalized as a two-stage pipeline:

Capturing: A profile is induced from observer-mode annotations using a "ProfilePrompt", producing a free-form profile that summarizes the subject's temporally diffuse reasoning style.
Applying: Given a subject profile and a participant-mode session, the model is evaluated on four tasks:

Task	Description	Metric(s)
Player Identification	Rank the true subject in an anonymized session	Top-$k$ accuracy (top-1 and top-3 reported)
Reflection Alignment	Fill in masked IDs in the post-game reflection	Exact-match accuracy
Trace Attribution	Map masked trace segments to the correct IDs per round	Match accuracy; adaptation gain from prior-trace context
Role Inference	Infer player roles at each round, with strict or grouped labeling	Strict/group accuracy
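Two of the metrics in the table are simple enough to state in code. This is a minimal sketch under assumptions: the function names, the ranked-list input format, and the `[M1]`-style mask slots are hypothetical and are not taken from the benchmark's release.

```python
from typing import Dict, Sequence


def top_k_accuracy(rankings: Sequence[Sequence[str]], truths: Sequence[str], k: int) -> float:
    """Fraction of sessions whose true subject appears in the model's top-k ranking."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


def exact_match(pred: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of masked slots whose predicted player ID equals the gold ID."""
    correct = sum(1 for slot, pid in gold.items() if pred.get(slot) == pid)
    return correct / len(gold)


# Two toy sessions: each ranking lists candidate players, most likely first.
rankings = [["P2", "P5", "P1"], ["P4", "P3", "P6"]]
truths = ["P5", "P6"]
top1 = top_k_accuracy(rankings, truths, 1)
top3 = top_k_accuracy(rankings, truths, 3)

# One toy reflection with two masked slots.
em = exact_match({"[M1]": "P3", "[M2]": "P1"}, {"[M1]": "P3", "[M2]": "P4"})
```

In this toy case neither true subject is ranked first but both are within the top three, so top-1 is 0.0 and top-3 is 1.0; one of the two masked slots is filled correctly, so exact match is 0.5.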

General-purpose LLMs and reasoning-enhanced models (DeepSeek-R1, QwQ, o3-mini) are compared under zero-shot prompting with enforced structured output. All data are transcribed from Mandarin voice chat (Li et al., 22 Aug 2025).

3. $i$MIND Neural Decoding Pipeline: Architecture and Objectives

The $i$MIND model is a three-stage end-to-end pipeline for multi-subject fMRI decoding:

Self-supervised ViT-MAE Pretraining: The input is a flattened, uniformly padded voxel vector, split into non-overlapping 64-voxel patches. The model employs a 12-layer ViT encoder and an 8-layer transformer decoder. A fraction of 75% of the patches is masked and reconstructed with a voxel-wise mean-squared error loss, yielding shared neural features (patch embeddings) that feed the later stages.
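The patching, masking, and reconstruction-loss steps of the pretraining stage can be sketched without any model. This is an illustrative sketch under stated assumptions, not the authors' code: zero-padding, uniform random masking, and computing the loss over masked patches only (the usual MAE convention) are assumptions, and the all-zeros "reconstruction" merely stands in for a decoder output.

```python
import random
from typing import List

PATCH = 64         # patch size stated in the text
MASK_RATIO = 0.75  # masking ratio stated in the text


def patchify(voxels: List[float], patch: int = PATCH) -> List[List[float]]:
    """Zero-pad to a multiple of `patch` (assumed) and split into non-overlapping patches."""
    pad = (-len(voxels)) % patch
    padded = voxels + [0.0] * pad
    return [padded[i:i + patch] for i in range(0, len(padded), patch)]


def sample_mask(n_patches: int, ratio: float, rng: random.Random) -> List[int]:
    """Indices of the patches hidden from the encoder."""
    n_masked = int(round(n_patches * ratio))
    return sorted(rng.sample(range(n_patches), n_masked))


def masked_mse(target: List[List[float]], recon: List[List[float]], masked: List[int]) -> float:
    """Voxel-wise mean squared error, averaged over masked patches only."""
    sq = [(t - r) ** 2 for i in masked for t, r in zip(target[i], recon[i])]
    return sum(sq) / len(sq)


rng = random.Random(0)
voxels = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # toy signal, not real fMRI
patches = patchify(voxels)                           # 1000 voxels -> padded to 1024 -> 16 patches
masked = sample_mask(len(patches), MASK_RATIO, rng)  # 12 of the 16 patches
recon = [[0.0] * PATCH for _ in patches]             # placeholder for the decoder's output
loss = masked_mse(patches, recon, masked)
```

Restricting the loss to masked patches means the encoder is never rewarded for copying visible input, which is what pushes it toward features that generalize across the signal.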

Subject–Object Disentanglement: Each patch embedding is factorized via a learned orthonormal basis and explicitly decomposed into a subject-specific component and an object-specific component.
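One way to picture a decomposition over an orthonormal basis is as a pair of projections onto complementary subspaces. The sketch below is only an illustration of that idea: in the model the basis is learned, whereas here it is built with Gram-Schmidt from arbitrary vectors, and assigning the first half of the basis to "subject" and the second half to "object" is a hypothetical choice, not the paper's stated split.

```python
from typing import List

Vec = List[float]


def dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt(vectors: List[Vec]) -> List[Vec]:
    """Turn linearly independent vectors into an orthonormal basis."""
    basis: List[Vec] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis


def project(z: Vec, basis: List[Vec]) -> Vec:
    """Orthogonal projection of z onto the span of `basis`."""
    out = [0.0] * len(z)
    for b in basis:
        c = dot(z, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out


raw = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0], [1.0, 0.0, 0.0, 2.0]]
basis = gram_schmidt(raw)
subject_basis, object_basis = basis[:2], basis[2:]  # hypothetical split of the basis

z = [0.5, -1.0, 2.0, 0.25]                          # toy patch embedding
z_subject = project(z, subject_basis)
z_object = project(z, object_basis)
recombined = [s + o for s, o in zip(z_subject, z_object)]
```

Because the basis is orthonormal and spans the whole space, the two components are orthogonal to each other and sum back to the original embedding, so nothing is lost by the split.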

Dual Decoding: Biometric and Semantic
- Biometric (subject ID): Pool the subject-specific features and apply a linear classifier that outputs logits over the subjects.
- Semantic (object classification): Freeze the CLIP visual features; cross-attend CLIP tokens with the object-specific features via multi-head attention, and classify the pooled outputs.

The objective function in the dual decoding phase combines classification losses (cross-entropy or binary cross-entropy) for subject and object with an orthonormality regularizer on the learned basis (Yin et al., 22 Sep 2025).
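A generic form consistent with this description, written under assumed notation, is shown below. The symbols are hypothetical: $B$ stands for the learned basis matrix, $\mathcal{L}_{\text{subj}}$ and $\mathcal{L}_{\text{obj}}$ for the subject and object classification losses, and $\lambda$ for a regularization weight. The unweighted sum of the two classification terms and the Frobenius-norm penalty are common choices, not confirmed details of the paper.

$$
\mathcal{L}_{\text{dual}} \;=\; \mathcal{L}_{\text{subj}} \;+\; \mathcal{L}_{\text{obj}} \;+\; \lambda \,\bigl\lVert B^{\top} B - I \bigr\rVert_F^{2}
$$

The penalty term is zero exactly when the columns of $B$ are orthonormal, so minimizing it keeps the subject and object directions from collapsing onto each other during training.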

4. Empirical Evaluation and Comparative Results

InMind (LLM):

The primary case study is on Avalon (6-player, Mandarin, transcribed voice chat). Dataset statistics: 30 sessions, 884 utterances, 160 traces, and 30 reflections. Quantitative results highlight several constraints:

Player identification: General LLMs achieve top-1 accuracy near baseline (0.16 for GPT-4o), with DeepSeek-R1 marginally higher (0.24). Top-3 accuracy remains modest. BERT-based cosine matching performs comparably, indicating frequent reliance on surface lexical cues.
Reflection alignment: With explicit trace input, models reach roughly 80% accuracy; without it, accuracy drops to roughly 30%, showing a strong dependency on anchoring.
Trace attribution: Incremental gains from prior-trace context are small or negative (as observed for GPT-4o), indicating limited true adaptive reasoning.
Role inference: Strict match accuracy is roughly 30–40%, and relaxed grouping roughly 60–70%. Reasoning-enhanced models outperform general-purpose LLMs, but performance remains far from ceiling (Li et al., 22 Aug 2025).
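For scale, the player-identification scores can be compared with uniform random guessing. The sketch below assumes the candidate pool is the six players of a session, which is an assumption about the task setup rather than a stated detail; `chance_top_k` is a hypothetical helper.

```python
from fractions import Fraction


def chance_top_k(n_candidates: int, k: int) -> Fraction:
    """Probability that a uniformly random ranking places the target in the top k."""
    return Fraction(min(k, n_candidates), n_candidates)


top1_chance = chance_top_k(6, 1)  # 1/6, about 0.167
top3_chance = chance_top_k(6, 3)  # 1/2
```

Under that assumption, a top-1 score of 0.16 sits at chance level and 0.24 is only modestly above it, which is consistent with the reading that models rely on surface cues.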

$i$MIND (Neural Decoding):

Using the NSD dataset (8 subjects, 10,000 images per subject, divided into training and test splits), $i$MIND delivers:

Method	mAP	AUC	Hamming	Subject-ID ACC	Generalization (mAP)
Single-subject ViT/MLP	.24–.26	.82–.85	—	—	—
CLIP-MUSED	.258	.877	—	—	—
$i$MIND (fMRI only)	.310	.913	.027	.999	.784 (holdout)
$i$MIND (fMRI+image)	.784	.984	.012	.999	.790 (all)

This demonstrates state-of-the-art performance, elimination of scalability limits, and robust cross-subject generalization (Yin et al., 22 Sep 2025).
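Two of the reported metrics, Hamming loss and (mean) average precision, have standard multi-label definitions that can be sketched directly. Whether the paper averages precision per class or per sample is not stated here, so the sketch computes average precision for a single ranked list; the function names and toy labels are illustrative.

```python
from typing import List


def hamming_loss(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of label slots (sample x class) that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def average_precision(labels: List[int], scores: List[float]) -> float:
    """Mean of precision values at the rank of each positive, ranking by descending score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


# Two samples, three object classes each.
hl = hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 0, 0], [0, 1, 1]])

# One class, four samples ranked by predicted score.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In the toy case two of six label slots are wrong, giving a Hamming loss of 1/3, and the positives land at ranks 1 and 3, giving an average precision of (1 + 2/3) / 2 = 5/6. Lower is better for Hamming loss and higher is better for mAP and AUC, which is the direction to read the table in.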

5. Key Insights and Theoretical Contributions

InMind (LLM):

Standard LLMs default to lexical mimicry and shallow pattern recognition rather than temporally consistent, individualized reasoning. Reflection and role inference tasks cannot be satisfied unless models receive explicit round-level traces, i.e., temporal anchoring is not learned in the absence of direct cues.
Reasoning-enhanced models, notably DeepSeek-R1, manifest partial style-sensitive reasoning: backward inference, hedging/certainty modulation, and context-consistent assignments improve, but absolute scores remain low. Performance gains under grouped scoring suggest that coarse-grained cognitive traits are more easily aligned than precise role attribution.

$i$MIND (Neural Decoding):

Produces interpretable voxel–object activation fingerprints by tracing Grad-CAM attributions through ViT layers, yielding visualization of region-selective activations (e.g., consistent ventral stream responses to “horse”/“bird,” strong social stimulus activation for “person”).
Reveals clear subject-specific attention dynamics during rapid (3 s) visual exposure: shared and residual attention maps highlight both universally salient object detection (e.g., “chair”) and idiosyncratic focal patterns (e.g., “cup” receiving elevated attention by specific subjects correlated with high recognition probability).
Clustering voxels by mean and standard deviation of activation exposes functional specialization: “bystanders,” “discriminators,” and “supporters” for semantic decoding roles (Yin et al., 22 Sep 2025).
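The voxel-clustering idea can be sketched as two steps: summarize each voxel by the mean and standard deviation of its activation, then group voxels in that two-dimensional feature space. The source does not say which clustering algorithm is used, so plain k-means with fixed initial centroids is swapped in here as an assumption, and the clusters are deliberately left unnamed because the mapping to the paper's three groups is not specified. The activation values are made up.

```python
from statistics import mean, pstdev
from typing import List, Tuple

Point = Tuple[float, float]


def voxel_features(acts: List[List[float]]) -> List[Point]:
    """acts[trial][voxel] -> one (mean, std) pair per voxel."""
    n_vox = len(acts[0])
    cols = [[trial[v] for trial in acts] for v in range(n_vox)]
    return [(mean(c), pstdev(c)) for c in cols]


def kmeans(points: List[Point], centroids: List[Point], iters: int = 10):
    """Plain k-means in 2-D with caller-supplied initial centroids."""
    labels: List[int] = []
    for _ in range(iters):
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2 + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        updated = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append((mean([m[0] for m in members]), mean([m[1] for m in members])))
            else:
                updated.append(centroids[k])  # keep an empty cluster's centroid in place
        centroids = updated
    return labels, centroids


# Toy activations: 4 trials x 6 voxels.
acts = [
    [0.0, 0.1, 2.0, 2.1, -2.0, 2.0],
    [0.1, 0.0, 2.1, 2.0, 2.0, -2.0],
    [0.0, 0.1, 2.0, 2.1, -2.0, 2.0],
    [0.1, 0.0, 2.1, 2.0, 2.0, -2.0],
]
features = voxel_features(acts)
labels, centroids = kmeans(features, [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)])
```

The toy voxels fall into three visibly different regimes (near-silent, consistently active, and highly variable), which is the kind of separation a mean/std feature space makes easy to see.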

6. Limitations and Prospects

LLM Evaluation:

Current LLMs lack the ability to internalize individualized reasoning styles without explicit temporal and strategic annotation. Dynamic adaptation remains shallow, with models frequently treating sequential rounds as independent.
Future work on InMind aims to scale to diverse SDGs, automate strategy profile induction to mitigate annotation bias, and integrate memory/belief tracking modules to maintain cross-round coherence. Extension to cooperation and negotiation scenarios is also proposed, since these are crucial domains for contextually adaptive inference (Li et al., 22 Aug 2025).

Neural Decoding:

$i$MIND establishes a foundation for more interpretable and generalizable neural decoding, but practical constraints (e.g., the requirement for large fMRI datasets and a pre-defined object basis dimensionality) persist.
A plausible implication is that learned subject–object disentanglement and interpretability frameworks could inform future low-shot or online neural decoding domains, and illuminate the neural basis of visual and attentional idiosyncrasy at population scale (Yin et al., 22 Sep 2025).

References:

(Li et al., 22 Aug 2025) InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles
(Yin et al., 22 Sep 2025) $i$MIND: Insightful Multi-subject Invariant Neural Decoding
