Placement of Evaluation Scaffolding in the Harness
Determine whether generator–evaluator separation, sprint contracts, and post-hoc checks (building on Self-Refine) should be implemented inside the Claude Code harness (for example, as additional hook events) or outside it as a separate evaluation layer.
References
Against the permission pipeline and tool-orchestration layers analysed in \Cref{sec:auth,sec:turn}, two architectural questions remain open. First, the cited sources do not settle where the scaffolding described by \citet{anthropic2026harness} belongs. That scaffolding comprises generator--evaluator separation, sprint contracts, and post-hoc checks, and builds on the self-refine pattern of \citet{madaan2023selfrefine}. It could sit inside the harness (e.g., as additional hook events alongside the 27 documented in \Cref{sec:ext}) or outside it as a separate evaluation layer.
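
To make the two candidate placements concrete, the following is a minimal sketch, not a description of how Claude Code is implemented. Every name in it (`ToyHarness`, the `post_generation` event, `contract_evaluator`, `external_evaluation`) is a hypothetical stand-in introduced for illustration. None is drawn from the cited sources or from the Claude Code hook API. Placement A registers the evaluator as a hook that the harness invokes itself. Placement B leaves the harness unaware of evaluation and applies the same evaluator from a wrapping layer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Verdict:
    passed: bool
    reason: str


# Hypothetical evaluator signature: (sprint contract, generator output) -> Verdict
Evaluator = Callable[[str, str], Verdict]


def toy_generator(task: str) -> str:
    """Stand-in for the generator agent; a real one would call a model."""
    return f"result for {task}"


def contract_evaluator(contract: str, output: str) -> Verdict:
    """Stand-in evaluator: passes when the output mentions the contract term."""
    ok = contract in output
    return Verdict(ok, "contract term found" if ok else "contract term missing")


@dataclass
class ToyHarness:
    """Hypothetical harness with a hook registry keyed by event name."""
    hooks: Dict[str, List[Evaluator]] = field(default_factory=dict)

    def register(self, event: str, hook: Evaluator) -> None:
        self.hooks.setdefault(event, []).append(hook)

    def run(self, task: str, contract: str) -> Tuple[str, List[Verdict]]:
        output = toy_generator(task)
        # Placement A: evaluation happens inside the harness, as a hook event.
        verdicts = [h(contract, output) for h in self.hooks.get("post_generation", [])]
        return output, verdicts


def external_evaluation(
    harness: ToyHarness, task: str, contract: str, evaluator: Evaluator
) -> Tuple[str, Verdict]:
    # Placement B: the harness is unaware of evaluation; a separate layer
    # inspects the finished output after the run.
    output, _ = harness.run(task, contract)
    return output, evaluator(contract, output)


# Placement A: the evaluator is registered as a hook.
inside = ToyHarness()
inside.register("post_generation", contract_evaluator)
out_a, verdicts_a = inside.run("parse logs", "parse logs")

# Placement B: the same evaluator is applied from outside an unmodified harness.
out_b, verdict_b = external_evaluation(
    ToyHarness(), "parse logs", "unit tests", contract_evaluator
)
```

In this sketch the evaluator function is identical in both placements. What differs is who owns the call site, and therefore whether a verdict can influence the run in progress (A) or only be recorded after the run ends (B).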