SiliconFriend: AI Companion Chatbot

Updated 8 January 2026
  • SiliconFriend is an AI companion chatbot that leverages the MemoryBank framework to provide long-term, personalized, and contextually coherent interactions.
  • It employs a retrieval-augmented memory system with components for storage, retrieval, and decay to enhance user-specific dialog and empathy.
  • The system demonstrates significant advances in memory recall and adaptive dialog generation across both open- and closed-source large language models.

SiliconFriend is a long-term AI companion chatbot built upon the MemoryBank framework, specifically engineered to endow LLMs with anthropomorphic long-term memory capabilities. It is designed for domains such as psychological counseling and personal companionship where sustained, contextually coherent interaction and adaptation to user personality are critical. SiliconFriend’s architecture, methodology, and evaluation demonstrate significant advances in empathy, memory recall, and user-adaptive dialog generation across both open- and closed-source LLM platforms (Zhong et al., 2023).

1. MemoryBank Architecture and Mechanisms

MemoryBank provides SiliconFriend with a fine-grained, retrieval-augmented, and dynamically updated memory system. Its architecture consists of three principal components: Memory Storage, Memory Retriever, and Memory Updater.

  • Memory Storage archives:
    • Raw, timestamped multi-turn conversations.
    • Daily event summaries of user interactions.
    • Rolling global summaries spanning all sessions.
    • Daily and aggregated user-personality portraits.
  • Memory Retriever employs a dual-tower dense retrieval model with an encoder $E(\cdot)$: each memory piece $m$ is encoded to a vector $h_m$ and the live context $c$ to $h_c$, and the top-$k$ nearest memories are retrieved via a FAISS index. This permits contextually relevant recall at each dialog turn.
  • Memory Updater implements a forgetting/strengthening protocol inspired by the Ebbinghaus Forgetting Curve. Each memory $m$ tracks:
    • $t_m$: time since creation or last recall.
    • $S_m$: discrete memory strength (initialized to $1$).
    • Retention follows the rule $R_m(t) = \exp(-t_m / S_m)$.
    • A memory is pruned when $R_m(t) < \theta_{del}$; each retrieval strengthens it ($S_m \leftarrow S_m + 1$, $t_m \leftarrow 0$).
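The updater's decay rule can be sketched directly from the formulas above. This is a minimal illustration, not the paper's implementation: the class name `MemoryItem`, unit time steps, and the threshold value `THETA_DEL = 0.1` are all assumptions.

```python
import math

THETA_DEL = 0.1  # pruning threshold theta_del (assumed value, not from the paper)

class MemoryItem:
    def __init__(self, text):
        self.text = text
        self.t = 0   # t_m: time since creation or last recall
        self.S = 1   # S_m: discrete memory strength, initialized to 1

    def retention(self):
        # R_m(t) = exp(-t_m / S_m)
        return math.exp(-self.t / self.S)

    def recall(self):
        # Each retrieval strengthens the memory and resets its clock:
        # S_m <- S_m + 1, t_m <- 0
        self.S += 1
        self.t = 0

def tick_and_prune(memories):
    """Advance time by one step and drop memories with R_m(t) < theta_del."""
    for m in memories:
        m.t += 1
    return [m for m in memories if m.retention() >= THETA_DEL]
```

Note that strengthening raises $S_m$ and thereby flattens the decay curve, so frequently recalled memories survive far longer than untouched ones.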

Per-turn data flow:

  1. The user utters $u_t$, which is appended to the raw log.
  2. The Memory Retriever encodes the context $c_t \rightarrow h_c$ and queries FAISS for the top-$k$ memories.
  3. The prompt to the LLM combines task instructions, retrieved memories (raw and summarized), global summaries, user portraits, and recent history.
  4. The LLM generates response $R_t$.
  5. The Memory Updater logs $R_t$, updates decay and strengths, and periodically triggers summarization and personality-portrait updates.
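The per-turn flow above can be sketched as a single function. Everything here is a stand-in for illustration: the bag-of-words `encode`, the brute-force cosine search (in place of a dense encoder plus FAISS), and the injectable `llm` callable are all assumptions, not the paper's components.

```python
def encode(text):
    # Toy bag-of-words "embedding"; a real system uses a dense encoder.
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def dialog_turn(user_utt, memory_log, llm, k=2):
    memory_log.append(user_utt)                        # 1. append u_t to raw log
    h_c = encode(user_utt)                             # 2. encode context c_t -> h_c
    scored = sorted(memory_log[:-1],
                    key=lambda m: cosine(encode(m), h_c),
                    reverse=True)
    retrieved = scored[:k]                             #    top-k memories
    prompt = ("Memories: " + " | ".join(retrieved)     # 3. assemble prompt
              + "\nUser: " + user_utt)
    response = llm(prompt)                             # 4. generate R_t
    memory_log.append(response)                        # 5. log R_t for the updater
    return response
```

Decay updates and summarization (step 5) would run on top of this loop, as described in the Memory Updater section.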

Because each step maps directly onto a named component, this flow can be implemented from the mechanisms above, supporting reproducibility.

2. Integration with LLMs

SiliconFriend supports both closed- and open-source LLMs and decouples memory mechanisms from core model weights.

  • Closed-source (ChatGPT): Interacts via OpenAI's API from a lightweight orchestrator, which merges user input, retrieved memories, and constructed prompts before submission.
  • Open-source (ChatGLM, BELLE): Operates on in-house infrastructure using Python inference APIs. LangChain intermediates embedding, FAISS-based retrieval, and prompt assembly. Embedding models are MiniLM (English) and Text2vec (Chinese).
  • Orchestration structure:

[User] → [Orchestrator] → 
    ├─> [Memory Retriever: Embedding+FAISS] 
    ├─> [Memory Store: raw logs, summaries, portraits] 
    └─> [LLM Inference: OpenAI API or local ChatGLM/BELLE] 
         ← Response 
         → [Memory Updater: log/decay/strength/summarize]

This modular deployment achieves broad compatibility and extensibility, and cleanly separates memory management from language modeling.
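The decoupling can be made concrete with a backend abstraction: the memory pipeline stays identical whether inference goes to a hosted API or a local model. The class names and stub callables below are illustrative assumptions, not the actual API clients.

```python
class LLMBackend:
    """Common interface; only the inference call differs per backend."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError

class RemoteBackend(LLMBackend):
    """Would wrap a hosted API such as OpenAI's; stubbed here."""
    def __init__(self, call):
        self.call = call
    def generate(self, prompt):
        return self.call(prompt)

class LocalBackend(LLMBackend):
    """Would wrap local ChatGLM/BELLE inference; stubbed here."""
    def __init__(self, infer):
        self.infer = infer
    def generate(self, prompt):
        return self.infer(prompt)

def orchestrate(user_input, retrieve, backend, update):
    # Same retrieval -> prompt -> inference -> update path for any backend.
    memories = retrieve(user_input)
    prompt = ("Relevant memories:\n" + "\n".join(memories)
              + "\nUser: " + user_input)
    response = backend.generate(prompt)
    update(user_input, response)
    return response
```

Swapping `RemoteBackend` for `LocalBackend` changes nothing else in the pipeline, which is the compatibility claim in practice.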

3. Implementation: LLM Tuning and System Flow

  • Base models: ChatGPT (proprietary), ChatGLM 6.2B, BELLE 7B (open).
  • Empathy fine-tuning: Open-source models undergo parameter-efficient LoRA finetuning on 38,000 psychological dialog pairs:
    • LoRA injects low-rank adapters into linear layers: for $y = Wx$, the layer is revised as $y = Wx + BAx$, with $A \in \mathbb{R}^{r \times k}$, $B \in \mathbb{R}^{d \times r}$, and $r \ll \min(d, k)$; $r = 16$ is used and $W$ remains frozen.
    • Emotional scope includes anxiety, grief, relationship issues, enabling empathic tone, active listening, positive reframing.
  • Pipeline: Post-finetuning, MemoryBank is integrated with no further parameter updates, employing retrieval-augmented prompting for dynamic, memory-aware responses.
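The LoRA formulation above reduces to a few lines of linear algebra. This NumPy sketch uses toy dimensions ($d = k = 8$, $r = 2$ rather than the paper's $r = 16$) purely for illustration; $B$ is zero-initialized, as is conventional for LoRA, so the adapter starts as a no-op.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2  # toy shapes; the paper uses r = 16

W = rng.normal(size=(d, k))          # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable down-projection (r x k)
B = np.zeros((d, r))                 # trainable up-projection (d x r), zero-init

def lora_forward(x):
    # y = W x + B A x: a rank-r update added without touching W.
    return W @ x + B @ (A @ x)
```

Only $A$ and $B$ (roughly $r(d + k)$ parameters per layer) receive gradients during the 38,000-pair empathy finetuning, which is what makes the adaptation parameter-efficient.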

Real-time logic:

  1. User message traverses orchestrator, Memory Retriever fetches relevant logs, summaries, personality profiles.
  2. Prompt is constructed with all contextually relevant components.
  3. LLM infers response; output returned to user and Memory Updater.
  4. Decay, memory strength, and summarizations are atomically updated per dialog episode or at daily boundaries.

4. Evaluation Methodologies and Metrics

Qualitative Assessment

Deployed on a web platform for early user trials, SiliconFriend was benchmarked against a baseline ChatGLM without MemoryBank or empathy tuning. Real-world dialogs demonstrated:

  • Enhanced empathic phrasing and personal relevance.
  • Accurate recall of user-specific details (e.g., “Your girlfriend’s birthday is tomorrow…”).
  • Personality-adaptive suggestions.

Quantitative Simulation

Using ChatGPT to simulate 15 synthetic users over 10 dialog days (≥2 topics/day), SiliconFriend was evaluated on 194 probing questions, balanced between English and Chinese.

Metrics:

  • Memory Retrieval Accuracy: Correct memory recall (binary).
  • Response Correctness: Judged as 0/0.5/1.
  • Contextual Coherence: Judged as 0/0.5/1.
  • Model Ranking Score: $s = 1/r$ with rank $r \in \{1, 2, 3\}$ across the three model variants.
| Model | Retrieval Acc. | Correctness | Coherence | Ranking |
|---|---|---|---|---|
| ChatGLM (Eng) | 0.809 | 0.438 | 0.680 | 0.498 |
| BELLE (Eng) | 0.814 | 0.479 | 0.582 | 0.517 |
| ChatGPT (Eng) | 0.763 | 0.716 | 0.912 | 0.818 |
| ChatGLM (Chn) | 0.840 | 0.418 | 0.428 | 0.510 |
| BELLE (Chn) | 0.856 | 0.603 | 0.562 | 0.565 |
| ChatGPT (Chn) | 0.711 | 0.655 | 0.675 | 0.758 |

Retrieval accuracy exceeds 0.70 for every model, demonstrating MemoryBank's cross-model efficacy. ChatGPT exhibits superior correctness and coherence (reflecting underlying model strength), while BELLE attains the highest retrieval accuracy in Chinese due to its bilingual tuning.

5. Dialog Examples and Behavioral Case Studies

  • Psychological Companionship:
    • User expresses feeling lost post-breakup.
    • Baseline: General advice (“talk with friends or seek help”).
    • SiliconFriend: Contextually nuanced, empathic, and memory-referential (“…I remember you enjoy journaling…”).
  • Memory Recall:
    • Longitudinal tracking (e.g., recalling a book previously recommended, accurately identifying that heap sort was not previously discussed).
  • Personality-Tailored Suggestions:
    • “Linda” (introverted/ambitious): AI recommends a low-key, growth-oriented event (art-history lecture).
    • “Emily” (open-minded/curious): Suggests new experiences aligned with prior expressed interests (dance workshop).

6. Strengths, Limitations, and Future Directions

Strengths

  • Implements a biologically inspired, anthropomorphic long-term memory with selective forgetting and reinforcement.
  • Generalizes across both open and closed LLM architectures by decoupled, plug-and-play retrieval systems.
  • Bilingual dialog capabilities with empirically validated gains in empathy and personalized interaction.
  • Methodologically rigorous evaluation combining both qualitative and quantitative axes.

Limitations

  • The forgetting model uses only a single scalar strength $S_m$, and decay is uniform across memory types; more complex memory dynamics are not yet modeled.
  • The pruning threshold $\theta_{del}$ and update frequency $\Delta t$ require manual tuning.
  • Only text memory is supported; no direct handling of multimodal information.
  • Token budget can be exceeded with large memory—domain scaling requires further research.

Future Developments

  • Incorporation of advanced forgetting schedules (e.g., spacing effect, overlearning).
  • Hierarchical and topic-weighted memory indexing.
  • Integrating user feedback as reinforcement during memory updates.
  • Addition of multimodal (audio, image) episodic memory.
  • End-to-end co-training for retrieval and generation systems.

MemoryBank and SiliconFriend together constitute a substantive advance in the development of AI companions with robust, human-like long-term memory, closing the gap between stateless conversational agents and contextually adaptive, memory-driven assistants (Zhong et al., 2023).
