A Machine with Short-Term, Episodic, and Semantic Memory Systems

Published 5 Dec 2022 in cs.AI | (2212.02098v4)

Abstract: Inspired by the cognitive science theory of the explicit human memory systems, we have modeled an agent with short-term, episodic, and semantic memory systems, each of which is modeled with a knowledge graph. To evaluate this system and analyze the behavior of this agent, we designed and released our own reinforcement learning agent environment, "the Room", where an agent has to learn how to encode, store, and retrieve memories to maximize its return by answering questions. We show that our deep Q-learning based agent successfully learns whether a short-term memory should be forgotten, or rather be stored in the episodic or semantic memory systems. Our experiments indicate that an agent with human-like memory systems can outperform an agent without this memory structure in the environment.


Summary

The paper introduces a novel RL agent that integrates short-term, episodic, and semantic memory systems using knowledge graphs.
It employs a deep Q-learning framework with LSTM networks to decide whether each short-term observation is stored in episodic memory, stored in semantic memory, or forgotten.
Experimental results demonstrate that prefilled semantic memory significantly improves performance in the simulated 'Room' environment.

Introduction

The paper "A Machine with Short-Term, Episodic, and Semantic Memory Systems" (2212.02098) explores the integration of cognitive science insights into the construction of artificial reinforcement learning (RL) agents. By emulating human-like memory systems—specifically, short-term, episodic, and semantic memories—the researchers aim to improve question-answering capabilities of machines in dynamic environments. The proposed model encapsulates these memory systems within knowledge graphs to facilitate efficient storage and retrieval processes.

Methodology

The architecture of the proposed agent comprises three distinct memory systems, each represented as a knowledge graph (Figure 1):

Figure 1: The memory systems of the agent. The long-term (explicit) memory systems consist of episodic and semantic memory systems.

Short-Term Memory: Holds recent observations and is bounded in capacity. Decisions regarding the storage of these observations into long-term memory systems (either episodic or semantic) or discarding them entirely are learned behaviors.
Episodic Memory: Stores time-bound, individual-specific events that allow the agent to mimic human personal experiences. This memory is crucial for reconstructing specific event sequences.
Semantic Memory: Houses general world knowledge and commonly accepted facts. This system abstracts information into more generalized, entity-based knowledge, lacking temporal specifics.
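
The three systems described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, the semantic memory is assumed to keep an integer strength in place of a timestamp, and the eviction rules (oldest episode, weakest fact) are assumptions about what happens when a store is full. The generalization of specific entities into generic ones is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (head, relation, tail, timestamp) for short-term and episodic memories;
# (head, relation, tail, strength) for semantic memories (assumed layout).
Quadruple = Tuple[str, str, str, int]


@dataclass
class MemorySystems:
    capacity: int  # assumed limit for each long-term store
    short_term: List[Quadruple] = field(default_factory=list)
    episodic: List[Quadruple] = field(default_factory=list)
    semantic: List[Quadruple] = field(default_factory=list)

    def store_episodic(self, obs: Quadruple) -> None:
        # Assumed eviction rule: drop the oldest episode when full.
        if len(self.episodic) >= self.capacity:
            self.episodic.remove(min(self.episodic, key=lambda m: m[3]))
        self.episodic.append(obs)

    def store_semantic(self, obs: Quadruple) -> None:
        head, relation, tail, _ = obs  # the timestamp is discarded
        for i, (h, r, t, strength) in enumerate(self.semantic):
            if (h, r, t) == (head, relation, tail):
                # A known fact is reinforced instead of duplicated.
                self.semantic[i] = (h, r, t, strength + 1)
                return
        # Assumed eviction rule: drop the weakest fact when full.
        if len(self.semantic) >= self.capacity:
            self.semantic.remove(min(self.semantic, key=lambda m: m[3]))
        self.semantic.append((head, relation, tail, 1))

    def process(self, action: str) -> None:
        """Apply one action to the oldest short-term observation."""
        obs = self.short_term.pop(0)
        if action == "episodic":
            self.store_episodic(obs)
        elif action == "semantic":
            self.store_semantic(obs)
        elif action != "forget":
            raise ValueError(f"unknown action: {action}")
```

In this sketch the learned part of the agent is only the choice of the `action` string; the bookkeeping that follows each choice is fixed.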

The memory retrieval process employs decision rules that prioritize the recency of episodes and the strength of semantic memories, facilitating efficient retrieval for answering environment-related questions (Algorithm 1).
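
A retrieval rule of this kind might look like the following sketch. It assumes a question has the form (head, relation, ?), that episodic memory is consulted before semantic memory, and that heads are matched literally; the function name `answer` and the example entities are hypothetical, and the paper's Algorithm 1 may differ in detail.

```python
from typing import List, Optional, Tuple

Quadruple = Tuple[str, str, str, int]


def answer(
    head: str,
    relation: str,
    episodic: List[Quadruple],  # 4th slot: timestamp
    semantic: List[Quadruple],  # 4th slot: strength (assumed)
) -> Optional[str]:
    """Return the tail for the query (head, relation, ?), or None if unknown."""
    episodes = [m for m in episodic if m[0] == head and m[1] == relation]
    if episodes:
        # Among matching episodes, the most recent one wins.
        return max(episodes, key=lambda m: m[3])[2]
    facts = [m for m in semantic if m[0] == head and m[1] == relation]
    if facts:
        # Among matching facts, the strongest one wins.
        return max(facts, key=lambda m: m[3])[2]
    return None


# Made-up memories for illustration only.
episodic = [("laptop_1", "AtLocation", "desk", 3), ("laptop_1", "AtLocation", "sofa", 7)]
semantic = [("cup", "AtLocation", "sink", 2), ("cup", "AtLocation", "table", 4)]
print(answer("laptop_1", "AtLocation", episodic, semantic))  # sofa
print(answer("cup", "AtLocation", episodic, semantic))       # table
```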

Figure 2: An episodic memory and a semantic memory represented as a knowledge graph.

Reinforcement Learning Model

The agent's decision-making is powered by a deep Q-learning framework. Through interactions with a simulation environment called "the Room," the agent learns whether each short-term observation should be stored in episodic memory, stored in semantic memory, or forgotten. Each observation is represented as a quadruple (head, relation, tail, timestamp). The contents of each memory system are converted into a sequence of embeddings and processed by LSTM networks, whose outputs inform the decision (Figure 3).
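
The decision step can be sketched with the two standard ingredients of Q-learning: epsilon-greedy action selection and the one-step temporal-difference target. This is generic Q-learning, not code from the paper; the Q-values would come from the LSTM-based network, and the function names, `epsilon`, and `gamma` are illustrative assumptions.

```python
import random
from typing import Sequence

# The three memory-management actions named in the abstract.
ACTIONS = ("episodic", "semantic", "forget")


def select_action(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward: float, next_q_values: Sequence[float], gamma: float, done: bool) -> float:
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


rng = random.Random(0)
action = select_action([0.1, 0.7, 0.2], epsilon=0.0, rng=rng)
print(ACTIONS[action])  # semantic
```

During training, the network's estimate for the chosen action is regressed toward `td_target`, so that storing a memory is rewarded when it later helps answer a question.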

Figure 3: The Q-network diagram, where the short-term $\bm{M_o}$, episodic $\bm{M_e}$, and semantic $\bm{M_s}$ memory systems are given as the initial input.

Experimental Setup and Results

The simulation environment, "the Room," is a discrete-event system with multiple humans and objects, providing a varied and challenging setting for testing the agent's memory capabilities. The experiments show that the RL agent with human-like memory systems outperforms agents that lack such structured memory when answering both individual-specific and generalized questions.

Numerical results (Figure 4) indicate that an RL agent initialized with a prefilled semantic memory, which resembles prior world knowledge, learns more effectively and performs better in the test environment than an agent that starts with an empty semantic memory.

Figure 4: Training, validation, and test results of the agents with the memory capacity of 32.

Discussion

The introduction of a dual long-term memory model, episodic and semantic, results in a memory retrieval mechanism that parallels human cognitive processes. Episodic memories support context-specific recall, while semantic memories hold generalized knowledge that carries over across scenarios. Because the agent can answer questions about things it is not currently observing by drawing on stored memories, this dual system helps it cope with the partial observability inherent in many RL problems.

Conclusion

By integrating cognitive theories into algorithmic frameworks, the paper introduces an innovative RL architecture with distinct memory systems, leading to enhanced learning and problem-solving capabilities. Future research directions include extending the complexity of the environment, incorporating multimodal inputs, and employing other human-like memory representations to further emulate human cognitive functions. Continued exploration in this field may yield more robust and adaptable AI systems capable of handling a wider array of real-world applications.
