Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory
This presentation explores Chronos, a breakthrough system that enables conversational agents to remember and reason accurately across extended multi-month dialogues. By combining selective temporal event extraction with dense retrieval over raw conversation history, Chronos achieves state-of-the-art performance on challenging long-term memory benchmarks, outperforming prior systems that rely on either comprehensive knowledge graphs or purely unstructured retrieval. The talk examines how query-driven structuring and dual indexing resolve the fundamental trade-offs between recall precision and computational overhead in persistent AI agents.
Script
Conversational agents that remember yesterday's discussion but forget last month's preference change are fundamentally broken. Chronos solves this by giving agents temporally precise memory that spans months of dialogue without the crushing overhead of building massive knowledge graphs.
Current approaches force an impossible choice. Comprehensive knowledge graphs deliver temporal precision but drown retrieval in irrelevant context and indexing costs. Turn-level methods stay lightweight but can't answer questions like "When did my preferences change?" or "What happened between March and May?"
Chronos resolves this tension with a fundamentally different strategy.
Chronos maintains two parallel indexes. The event calendar structures only temporally relevant facts, converting phrases like last Tuesday or three weeks ago into precise datetime spans. The turn calendar keeps everything else intact. This dual approach delivers temporal selectivity without sacrificing contextual richness, and crucially, events are extracted dynamically based on the query, not exhaustively at ingestion time.
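To make the dual-index idea concrete, here is a minimal sketch, not the Chronos implementation. It assumes a turn calendar is simply a list of timestamped raw turns, and that event spans are resolved only when a query arrives. The names `Turn`, `Event`, `resolve_span`, and `query_events` are hypothetical, the phrase pattern covers only the two phrase shapes mentioned above, and "last Tuesday" is assumed to mean the most recent Tuesday strictly before the day the turn was spoken.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4}

# Hypothetical pattern: matches "last <weekday>" and "<n> week(s) ago" only.
TEMPORAL_PHRASE = re.compile(
    r"\blast (?:%s)\b|\b(?:\d+|%s) weeks? ago\b"
    % ("|".join(WEEKDAYS), "|".join(NUMBER_WORDS)),
    re.IGNORECASE,
)


@dataclass
class Turn:
    """One raw entry in the turn calendar: kept intact, never rewritten."""
    timestamp: datetime
    text: str


@dataclass
class Event:
    """One entry in the event calendar: a resolved [start, end) span."""
    start: datetime
    end: datetime
    source_text: str


def resolve_span(phrase: str, spoken_at: datetime) -> Optional[Tuple[datetime, datetime]]:
    """Turn a relative phrase into a one-day span anchored at the turn's timestamp."""
    words = phrase.lower().split()
    day = spoken_at.replace(hour=0, minute=0, second=0, microsecond=0)
    if len(words) == 2 and words[0] == "last" and words[1] in WEEKDAYS:
        # Assumption: most recent matching weekday strictly before `day`.
        back = (day.weekday() - WEEKDAYS.index(words[1])) % 7 or 7
        start = day - timedelta(days=back)
        return start, start + timedelta(days=1)
    if len(words) == 3 and words[1] in ("week", "weeks") and words[2] == "ago":
        n = int(words[0]) if words[0].isdigit() else NUMBER_WORDS.get(words[0])
        if n is None:
            return None
        start = day - timedelta(weeks=n)
        return start, start + timedelta(days=1)
    return None


def query_events(turns: List[Turn], start: datetime, end: datetime) -> List[Event]:
    """Extract events lazily, at query time, and keep those overlapping [start, end)."""
    hits = []
    for turn in turns:
        for match in TEMPORAL_PHRASE.finditer(turn.text):
            span = resolve_span(match.group(0), turn.timestamp)
            if span and span[0] < end and span[1] > start:
                hits.append(Event(span[0], span[1], turn.text))
    return hits


turn_calendar = [
    Turn(datetime(2024, 5, 15, 10, 30), "I switched to oat milk last Tuesday."),
    Turn(datetime(2024, 5, 15, 10, 31), "I also really enjoy hiking."),
]
may_events = query_events(turn_calendar, datetime(2024, 5, 1), datetime(2024, 6, 1))
```

In this toy setup the second turn never enters the event calendar but remains available in `turn_calendar` for ordinary retrieval, which mirrors the split between temporal selectivity and contextual richness described above. A real system would presumably use a model rather than a regular expression to detect and resolve temporal expressions.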
The performance gains are dramatic and consistent. On the LongMemEvalS benchmark, which tests knowledge update tracking, multi-session aggregation, and fine-grained temporal recall, Chronos Low achieves 92.60% accuracy, a 7.67 percentage point jump over the previous best practical system. Chronos High pushes further to 95.60%, setting new records across virtually every task category. The error rate reductions shown here reveal that temporal structuring matters most for queries requiring precise event ordering and state change tracking.
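The Chronos Low figures above can be unpacked with a little arithmetic. The talk does not state the previous best score directly, so the baseline below is only what the two quoted numbers imply (92.60 minus 7.67), and the variable names are illustrative.

```python
# Figures quoted in the talk for Chronos Low.
chronos_low_acc = 92.60   # accuracy, percent
gain_pp = 7.67            # gain over the previous best practical system, percentage points

# Implied (not stated) baseline accuracy and the corresponding error rates.
baseline_acc = chronos_low_acc - gain_pp
baseline_err = 100.0 - baseline_acc
chronos_low_err = 100.0 - chronos_low_acc

# Relative error reduction: the share of the baseline's errors that were eliminated.
relative_error_reduction = (baseline_err - chronos_low_err) / baseline_err

print(f"implied baseline accuracy: {baseline_acc:.2f}%")
print(f"error rate: {baseline_err:.2f}% -> {chronos_low_err:.2f}%")
print(f"relative error reduction: {relative_error_reduction:.1%}")
```

Under these assumptions the implied baseline is 84.93%, so the error rate falls from 15.07% to 7.40%, roughly halving the number of wrong answers.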
Chronos proves that selective, intelligent structuring beats exhaustive preparation. By extracting events only when queries demand them and maintaining dense access to raw context, the system achieves what previous architectures couldn't: temporal precision without computational collapse. This opens a path for conversational agents that remember accurately across months, not just minutes.
The future of AI agents depends on memory systems that scale temporally, not just contextually. Visit EmergentMind.com to explore more research and create your own video presentations.