
Evaluation methodology for conversational capability under system prompt constraints

Develop a comprehensive, publicly available methodology for evaluating the conversational capability of large language models when they operate under fixed system prompts that constrain their scope and behavior, enabling standardized assessment rather than ad hoc or general-purpose benchmarks.


Background

The authors emphasize that conversational capability under system prompt constraints is a distinct evaluation need, because prompts enforce specific scope and behavior that typical helpfulness/relevance metrics do not capture.

They note the absence of a comprehensive public approach and therefore adopt an MT-bench-inspired judge-LM scheme tailored to prompt adherence, underscoring the need for standardized, rigorous methods in this setting.
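The paper does not spell out its judging pipeline in this excerpt, but a minimal sketch of an MT-bench-style judge-LM evaluation under a fixed system prompt might look as follows. The rubric wording, the 1-10 scale, and the callable-based model interface are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' implementation) of an
# MT-bench-style judge-LM evaluation of conversational capability under a
# fixed system prompt. The `CallLLM` callables stand in for whatever chat
# client is actually used; they take a message list and return the reply text.

import re
from typing import Callable, Dict, List

Message = Dict[str, str]
CallLLM = Callable[[List[Message]], str]

JUDGE_RUBRIC = (
    "You are an impartial judge. Given a system prompt that constrains an "
    "assistant's scope and behavior, a user turn, and the assistant's reply, "
    "rate the reply from 1 to 10 for helpfulness within the allowed scope and "
    "adherence to the system prompt's constraints. "
    "Respond with a single line: Rating: <number>."
)

def judge_turn(system_prompt: str, user_turn: str, reply: str,
               judge: CallLLM) -> int:
    """Ask the judge model to score one assistant reply; parse the 1-10 rating."""
    judge_messages: List[Message] = [
        {"role": "system", "content": JUDGE_RUBRIC},
        {"role": "user", "content": (
            f"[System prompt under evaluation]\n{system_prompt}\n\n"
            f"[User turn]\n{user_turn}\n\n"
            f"[Assistant reply]\n{reply}"
        )},
    ]
    verdict = judge(judge_messages)
    match = re.search(r"Rating:\s*(\d+)", verdict)
    return int(match.group(1)) if match else 0

def evaluate_conversation(system_prompt: str, user_turns: List[str],
                          assistant: CallLLM, judge: CallLLM) -> float:
    """Run a multi-turn conversation under the fixed system prompt and
    return the mean judge score across turns."""
    history: List[Message] = [{"role": "system", "content": system_prompt}]
    scores: List[int] = []
    for turn in user_turns:
        history.append({"role": "user", "content": turn})
        reply = assistant(history)
        history.append({"role": "assistant", "content": reply})
        scores.append(judge_turn(system_prompt, turn, reply, judge))
    return sum(scores) / len(scores) if scores else 0.0
```

Averaging per-turn judge scores across a fixed set of in-scope and out-of-scope user turns is one plausible way to report a single conversational-capability number per system prompt; the actual aggregation and turn selection used by the authors may differ.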

References

However, we are unaware of any comprehensive, publicly known approach for evaluating this specifically when constrained by a system prompt that limits scope and behavior.

Safeguarding System Prompts for LLMs (2412.13426 - Jiang et al., 18 Dec 2024) in Section 5: Experimental Setup, Metrics — Conversational capability