
CogInstrument: Human–AI Cognitive Planning

Updated 11 May 2026
  • CogInstrument is a human–AI collaboration system that formalizes cognitive reasoning through compositional, revisable motifs—minimal subgraphs capturing causally-linked concepts.
  • It employs a graph-based interface and multi-stage motif extraction to enable explicit, bidirectional alignment between users and large language model assistants.
  • Empirical evaluations demonstrate significant improvements in reasoning externalization, dependency grounding, and revision coherence over traditional text-based LLM dialogues.

CogInstrument is a human–AI collaboration system for planning tasks that formalizes and externalizes cognitive reasoning through the concept of cognitive motifs—compositional, revisable subgraphs of causally-linked concepts. By extracting, representing, and manipulating cognitive motifs, CogInstrument facilitates explicit, bidirectional alignment between human users and LLM assistants, enabling structural agency, transparent revision, and verifiable alignment of intent and output. The system combines LLM-assisted motif extraction, a motif-centric interface, and mechanisms for graph-based reasoning negotiation, yielding measurable improvements in reasoning externalization, dependency grounding, and cross-task transfer compared to text-based LLM dialogues (Wang et al., 12 Apr 2026).

1. Cognitive Motifs: Definitions and Properties

A cognitive motif is defined as a minimal, reusable subgraph capturing a recurring unit of human reasoning, consisting of concepts connected by typed, directed causal dependencies and associated with a reasoning function.

Formally, a cognitive motif is a triple:

$\mu = (C_\mu, E_\mu, \varphi_\mu)$

where:

  • $C_\mu \subset C$ is a set of concept nodes;
  • $E_\mu \subset E$ is a set of typed causal edges (dependencies) among nodes;
  • $\varphi_\mu$ is an abstract reasoning function (e.g., constraint propagation, trade-off resolution).

Concepts ($c$) are typed as beliefs, constraints, preferences, or factual assertions. Causal dependencies are directed edges $c_i \xrightarrow{\phi} c_j$, with $\phi$ indicating enable, constrain, or determine. For example, a "budget-constraint" motif may be:

  • $C_\mu = \{\text{budget\_limit},\,\text{option\_cost},\,\text{option\_feasibility}\}$
  • $E_\mu = \{\text{budget\_limit} \xrightarrow{\text{constrain}} \text{option\_cost},\,\text{option\_cost} \xrightarrow{\text{determine}} \text{option\_feasibility}\}$
  • $\varphi_\mu$: "filter out options whose cost > budget_limit."

Motifs are abstractions over reasoning substructures, enabling their explicit reuse, revision, and transfer.
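
The triple above maps naturally onto a small data structure. The following is a minimal sketch, not the system's actual implementation: the class name `Motif`, the `(source, type, target)` edge layout, the helper `filter_by_budget`, and the example costs are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# Edge types named in the text; the (source, type, target) layout is an assumption.
EDGE_TYPES = {"enable", "constrain", "determine"}
Edge = Tuple[str, str, str]


@dataclass(frozen=True)
class Motif:
    concepts: FrozenSet[str]             # C_mu: concept nodes
    edges: FrozenSet[Edge]               # E_mu: typed causal edges
    reasoning_fn: Callable[..., object]  # phi_mu: abstract reasoning function

    def __post_init__(self) -> None:
        # Every edge must use a known type and connect concepts inside the motif.
        for src, etype, dst in self.edges:
            if etype not in EDGE_TYPES:
                raise ValueError(f"unknown edge type: {etype}")
            if src not in self.concepts or dst not in self.concepts:
                raise ValueError(f"edge ({src}, {dst}) leaves the motif")


def filter_by_budget(budget_limit: float, option_costs: Dict[str, float]) -> List[str]:
    """phi_mu for the budget-constraint motif: drop options whose cost > budget_limit."""
    return [name for name, cost in option_costs.items() if cost <= budget_limit]


budget_motif = Motif(
    concepts=frozenset({"budget_limit", "option_cost", "option_feasibility"}),
    edges=frozenset({
        ("budget_limit", "constrain", "option_cost"),
        ("option_cost", "determine", "option_feasibility"),
    }),
    reasoning_fn=filter_by_budget,
)

# Hypothetical options and costs, used only to exercise the reasoning function.
feasible = budget_motif.reasoning_fn(1000.0, {"hostel": 400.0, "resort": 2500.0})
```

Keeping the motif immutable (`frozen=True`) reflects the idea that a revision produces a new motif instance rather than silently mutating a shared one; that is a design choice of this sketch, not something the source specifies.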

2. Data Structures, Graph Representation, and Motif Library

CogInstrument represents cognitive motifs and their interrelations in an attributed directed acyclic graph (DAG), augmented with explicit conflict (tension) edges.

| Node type | Edge type | Motif attributes |
|---|---|---|
| Concept node (belief, constraint, preference, assertion) | enable / constrain / determine (within the DAG); conflict (outside the DAG) | name, $\varphi_\mu$, domain tags, timestamp, task-id, concept mapping |

Concepts are nodes with unique IDs, textual labels, type, optional values, and provenance/confidence metadata. Causal dependencies are directed edges with type (enable, constrain, determine), confidence weight, and attribution to utterances. Conflict edges represent unresolved tensions and are kept external to the DAG.

Motif templates and their instantiations are stored in a motif library, with meta-information supporting retrieval and transfer between tasks. Graph integrity is maintained by re-enforcing DAG invariants, removing weak-cycle edges (Tarjan’s algorithm), and performing layout and compaction (Sugiyama-style layering, Brandes–Köpf position stabilization, A* anchor heuristics).
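
The DAG-repair step can be illustrated with a short sketch. Note the substitutions: the text names Tarjan's algorithm, whereas this example uses a plain depth-first cycle search for brevity, and it interprets a "weak" edge as the lowest-confidence edge on a cycle. Both choices, and the names `find_cycle` and `enforce_dag`, are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (source, target) -> confidence weight; the layout is an illustrative assumption.
Edges = Dict[Tuple[str, str], float]


def find_cycle(edges: Edges) -> Optional[List[Tuple[str, str]]]:
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    adj: Dict[str, List[str]] = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])
    state: Dict[str, int] = {}  # absent = unvisited, 1 = on current path, 2 = finished
    path: List[str] = []

    def visit(node: str) -> Optional[List[Tuple[str, str]]]:
        state[node] = 1
        path.append(node)
        for nxt in adj[node]:
            if state.get(nxt) == 1:  # back edge closes a cycle
                cyc = path[path.index(nxt):] + [nxt]
                return list(zip(cyc, cyc[1:]))
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = 2
        return None

    for node in list(adj):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


def enforce_dag(edges: Edges) -> Tuple[Edges, List[Tuple[str, str]]]:
    """Drop the lowest-confidence edge of each cycle until no cycle remains."""
    kept = dict(edges)
    removed: List[Tuple[str, str]] = []
    while (cycle := find_cycle(kept)) is not None:
        weakest = min(cycle, key=lambda e: kept[e])
        del kept[weakest]
        removed.append(weakest)
    return kept, removed


# Hypothetical graph: the third edge closes a cycle and has the lowest confidence.
graph = {
    ("budget_limit", "option_cost"): 0.9,
    ("option_cost", "option_feasibility"): 0.8,
    ("option_feasibility", "budget_limit"): 0.3,
}
dag, dropped = enforce_dag(graph)
```

In this example the only removed edge is the low-confidence one from `option_feasibility` back to `budget_limit`. A production system might instead reroute such an edge into the separate set of conflict edges; the source does not say which.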

3. Motif Extraction and Maintenance Pipeline

The CogInstrument extraction pipeline operates in three stages:

  1. Concept and Edge Extraction
    • User utterances are processed by the LLM to extract candidate concepts and type labels.
    • Disambiguation and grounding leverage multi-turn evidence and external API verification.
    • Causal edge proposals are generated via LLM-assisted causal discovery.
    • An impact score is computed for each motif candidate; thresholded, uncertainty-driven confirmation questions are surfaced to the user as needed.

  2. Motif Abstraction

    • Pattern matching identifies stabilized subgraphs corresponding to motifs in a fixed taxonomy.
    • Matched subgraphs are instantiated as motifs (with metadata) and added to the library.
  3. Mixed-Initiative Maintenance
    • State separation distinguishes cognitive (user-grounded) and task-plan (LLM draft) representations.
    • Low-impact structural patches are automatically applied; high-impact diffs require user approval.
    • Motif transfer is supported at task initiation, with candidate motifs instantiated based on relevance.
    • After each patch, graph repairs and visual stabilization are performed.
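
The mixed-initiative rule in stage 3 (apply low-impact patches automatically, hold high-impact ones for approval) can be sketched as follows. This is a hedged illustration: the impact score is treated as a precomputed input because its formula is not reproduced here, and `Patch`, `GraphState`, `propose`, `resolve`, and the threshold value `0.5` are hypothetical names and numbers.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (source concept, edge type, target concept)


@dataclass
class Patch:
    edge: Edge
    impact: float  # assumed to be precomputed by the extraction stage


@dataclass
class GraphState:
    edges: Set[Edge] = field(default_factory=set)       # committed structure
    pending: List[Patch] = field(default_factory=list)  # awaiting user approval


IMPACT_THRESHOLD = 0.5  # hypothetical cut-off, not taken from the source


def propose(state: GraphState, patch: Patch, threshold: float = IMPACT_THRESHOLD) -> str:
    """Commit low-impact patches silently; queue high-impact ones for review."""
    if patch.impact < threshold:
        state.edges.add(patch.edge)
        return "auto_applied"
    state.pending.append(patch)
    return "needs_approval"


def resolve(state: GraphState, patch: Patch, approved: bool) -> None:
    """Apply the user's decision on a queued patch."""
    state.pending.remove(patch)
    if approved:
        state.edges.add(patch.edge)


state = GraphState()
minor = Patch(("option_cost", "determine", "option_feasibility"), impact=0.2)
major = Patch(("budget_limit", "constrain", "option_cost"), impact=0.9)
first = propose(state, minor)    # committed without asking
second = propose(state, major)   # held until the user decides
resolve(state, major, approved=True)
```

Graph repair and layout stabilization would run after each committed patch; they are omitted to keep the sketch focused on the approval gate.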


4. Interactive Cognitive Reasoning Interface

CogInstrument’s interface is composed of five coordinated panels:

  • Dialogue Panel: Supports free-form chat, draft plans, and clarifications.
  • Concept & Motif List: Tabular overview of active concepts and instantiated motifs.
  • Graph Canvas: Layered graph visualization with node and edge typology indicated through color and stroke.
  • Motif Detail Panel: Detailed views of selected motif rationale, $\varphi_\mu$ descriptions, evidence, and status.
  • Control Panel: Structural editing, patch acceptance/rejection, manual motif creation, and transfer actions.

Three user–system interaction modes enable different editing flows:

  • Auto-revision: Automated silent commit of low-impact edges.
  • Uncertainty-driven question: Single clarification prompts surfaced per interaction.
  • User-driven revision: Direct manipulation of graph and motif lists.

Editing operations include node/edge addition, deletion, relabeling, retyping, motif deprecation/cancellation, and selective transfer confirmation. Users can fluidly switch between chat-based, structural, and edit-based navigation without context loss.

5. Structural Bidirectional Human–LLM Reasoning Alignment

CogInstrument implements an explicit, bidirectional alignment mechanism between user cognitive representations and LLM outputs:

  • LLM-to-cognitive graph: At each planning step, the cognitive motif graph is serialized (via function calls or JSON) and incorporated into the LLM prompt, contextualizing subsequent generation. The system tracks motif usage (covered vs. unincorporated).
  • Cognitive graph-to-LLM grounding: User-driven edits (e.g., modifying edge types) are injected into the LLM’s prompt for stateful consistency. Cognitive state (user-confirmed motifs) and task-plan state (assistant suggestions) are maintained in parallel; only user-approved changes are promoted to the cognitive state.
  • Mismatch and transfer detection: The interface highlights motif omissions or contradictions in LLM outputs, requesting clarification or re-proposal. Motif transfer candidates are surfaced as reviewable suggestions at task commencement, supporting cross-task knowledge transfer.
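
The serialization and usage-tracking steps described above can be sketched in a few lines. This is an assumption-laden illustration: the JSON layout, the `keywords` field, the prompt wording, and the functions `build_prompt` and `motif_coverage` are hypothetical, and coverage is approximated by case-insensitive keyword matching as a stand-in for whatever check the system actually performs.

```python
import json
from typing import Dict, List


def serialize_motifs(motifs: List[dict]) -> str:
    """Serialize the confirmed motif graph to JSON for inclusion in a prompt."""
    return json.dumps({"motifs": motifs}, indent=2, sort_keys=True)


def build_prompt(user_request: str, motifs: List[dict]) -> str:
    """Prepend the serialized graph so generation is conditioned on it."""
    return (
        "Confirmed reasoning structure (JSON):\n"
        + serialize_motifs(motifs)
        + "\n\nRequest:\n"
        + user_request
    )


def motif_coverage(motifs: List[dict], llm_output: str) -> Dict[str, List[str]]:
    """Split motifs into covered vs. unincorporated using naive keyword matching."""
    text = llm_output.lower()
    report: Dict[str, List[str]] = {"covered": [], "unincorporated": []}
    for motif in motifs:
        hit = any(keyword.lower() in text for keyword in motif["keywords"])
        report["covered" if hit else "unincorporated"].append(motif["name"])
    return report


# Hypothetical motifs and assistant output, used only to exercise the functions.
motifs = [
    {"name": "budget-constraint", "keywords": ["budget"],
     "edges": [["budget_limit", "constrain", "option_cost"]]},
    {"name": "pace-preference", "keywords": ["relaxed", "slow"],
     "edges": [["pace", "determine", "daily_activities"]]},
]
prompt = build_prompt("Plan a three-day trip.", motifs)
coverage = motif_coverage(motifs, "Day 1: a hostel that stays within your budget.")
```

Here the second motif would be flagged as unincorporated, which is the kind of mismatch the interface is described as highlighting for clarification or re-proposal.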

6. Empirical Evaluation: User Study and Outcomes

A within-subjects study ($N = 12$ experienced LLM users with backgrounds in design, research, and engineering) evaluated CogInstrument against a baseline LLM chat (identical GPT-5.3 backend, no graph support) on travel and growth planning scenarios. Each task sequence included both open planning and constraint-injection phases, with order counterbalanced.

Assessment instruments included a custom 17-item Likert questionnaire aggregating into six constructs, System Usability Scale (SUS), Raw NASA-TLX, and detailed interaction/event logs.

Summary of key results (paired Wilcoxon tests; scores shown as baseline → CogInstrument, with $p$-values and effect sizes):

  • Reasoning Externalization: 3.10 → 6.00 ($p < 0.001$, effect size = 0.84)
  • Dependency Grounding: 2.89 → 5.97 ($p < 0.001$, effect size = 0.63)
  • Revision Coherence: 3.09 → 5.77 ($p < 0.001$, effect size = 0.64)
  • Trust & Control: 3.25 → 5.71 ($p < 0.001$, effect size = 0.60)
  • Cross-Task Transfer: 3.62 → 5.29 ($p = 0.014$, effect size = 0.46)
  • Diagnosis Clarity: 4.25 → 5.89 ($p = 0.009$, effect size = 0.48)
  • SUS and overall NASA-TLX: no significant difference; Mental Demand increased ($p = 0.041$).

Qualitative themes included users describing the graph as a “mirror” and “rein to steer” LLM reasoning, preference for localized patch edits and selective transfer, and reframing higher cognitive load as fostering agency.

7. Limitations, Applicability Boundaries, and Future Directions

Constraints and open challenges identified include:

  • Extraction fidelity: Absence of ground-truth labeled data precludes objective assessment of motif/edge accuracy.
  • Sample and generalizability: Evaluation limited to 12 expert LLM users; outcomes for novices or larger samples untested.
  • Plan quality: No independent expert rubric was used to assess the quality of the output plans.
  • Domain generality: Evaluation restricted to planning; brainstorming and Q&A contexts may require architectural adaptation.
  • Model dependency: Implementation built on GPT-5.3; cross-model robustness unexamined.
  • Motif granularity: Only single-level motifs; complex or hierarchical motifs remain unexplored.
  • Learning curve: Higher entry barrier for non-expert or casual users; the possibility of novelty effects remains.

Future research aims include developing extraction accuracy benchmarks, broadening sample and domain scope, introducing multi-level motif hierarchies, integrating objective plan-quality metrics and expert judgments, and extending reasoning structures beyond DAGs to allow for cycles and mutual enablement (Wang et al., 12 Apr 2026).
