KnowThyself: An Agentic Assistant for LLM Interpretability (2511.03878v1)

Published 5 Nov 2025 in cs.AI, cs.IR, cs.LG, and cs.MA

Abstract: We develop KnowThyself, an agentic assistant that advances LLM interpretability. Existing tools provide useful insights but remain fragmented and code-intensive. KnowThyself consolidates these capabilities into a chat-based interface, where users can upload models, pose natural language questions, and obtain interactive visualizations with guided explanations. At its core, an orchestrator LLM first reformulates user queries, an agent router then directs them to specialized modules, and the module outputs are finally contextualized into coherent explanations. This design lowers technical barriers and provides an extensible platform for LLM inspection. By embedding the whole process into a conversational workflow, KnowThyself offers a robust foundation for accessible LLM interpretability.
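
The three-stage flow described in the abstract (reformulate the query, route it to a specialized module, contextualize the output) can be illustrated with a minimal Python sketch. This is not the KnowThyself implementation: every name here (`reformulate`, `route`, `contextualize`, the `attention` and `attribution` modules) is hypothetical, the orchestrator LLM is replaced by simple string normalization, and the agent router is replaced by keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical sketch: none of these names come from KnowThyself itself.


@dataclass
class ModuleResult:
    module: str   # name of the module that produced the result
    payload: str  # placeholder for a visualization or analysis output


def reformulate(query: str) -> str:
    """Stand-in for the orchestrator LLM: here it only normalizes the text."""
    return " ".join(query.lower().split())


def attention_module(query: str) -> ModuleResult:
    """Hypothetical specialized module (assumed for illustration)."""
    return ModuleResult("attention", f"attention map for: {query}")


def attribution_module(query: str) -> ModuleResult:
    """Hypothetical specialized module (assumed for illustration)."""
    return ModuleResult("attribution", f"token attributions for: {query}")


MODULES: Dict[str, Callable[[str], ModuleResult]] = {
    "attention": attention_module,
    "attribution": attribution_module,
}


def route(query: str) -> Callable[[str], ModuleResult]:
    """Stand-in for the agent router: keyword matching, not an LLM decision."""
    for keyword, module in MODULES.items():
        if keyword in query:
            return module
    return attribution_module  # arbitrary default chosen for this sketch


def contextualize(query: str, result: ModuleResult) -> str:
    """Wrap the raw module output in a short explanation for the user."""
    return (
        f"Question: {query}\n"
        f"Module: {result.module}\n"
        f"Finding: {result.payload}"
    )


def answer(user_query: str) -> str:
    """Run the full pipeline: reformulate, route, execute, contextualize."""
    query = reformulate(user_query)
    module = route(query)
    result = module(query)
    return contextualize(query, result)
```

Calling `answer("Show the ATTENTION pattern")` would pass through all three stages in order. Keeping the modules in a plain dictionary is one way to reflect the extensibility the abstract mentions: under this sketch's assumptions, adding a module means adding one entry.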