Evaluation methodology for conversational capability under system prompt constraints
Develop a comprehensive, publicly available evaluation methodology to measure the conversational capability of large language models when their behavior is constrained by fixed system prompts that limit scope and behavior, enabling standardized assessment beyond ad hoc or general-purpose evaluation benchmarks.
References
However, we are unaware of any comprehensive, publicly known approach for evaluating this specifically when constrained by a system prompt that limits scope and behavior.
— Safeguarding System Prompts for LLMs
(2412.13426 - Jiang et al., 18 Dec 2024) in Section 5: Experimental Setup, Metrics — Conversational capability