Effectiveness of WriteBack-RAG without Labeled Supervision

Determine how effective WriteBack-RAG remains when labeled supervision is limited or absent by evaluating it in low-label and unsupervised settings, including scenarios where the utility and document gates are driven by an LLM-as-a-Judge rather than gold labels.

Background

WriteBack-RAG trains the knowledge base by using labeled examples to gate which queries and documents provide helpful retrieval signals before distilling them into persistent write-back documents. This supervision-driven design raises questions about how the method performs when labeled data are scarce or unavailable.

The authors note that the gating signals could potentially be replaced by an LLM-as-a-Judge in unsupervised or low-label scenarios, but the actual effectiveness of WriteBack-RAG under those conditions has not been established.
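To make the proposed substitution concrete, the sketch below shows what label-free gating might look like: both gates delegate their accept/reject decision to a pluggable judge callable instead of comparing against gold labels. This is an illustrative assumption, not the paper's implementation; the names `utility_gate`, `document_gate`, `select_writeback_candidates`, and the prompt strings are all hypothetical, and in practice the `Judge` would wrap an actual LLM call.

```python
# Hypothetical sketch: replacing WriteBack-RAG's label-driven gates with an
# LLM-as-a-Judge. All names and prompts here are illustrative assumptions,
# not taken from the paper.
from typing import Callable, List

# A Judge maps a natural-language prompt to a boolean verdict. In a real
# system this would call an LLM; here it is injectable so the gating logic
# can be tested with a stub.
Judge = Callable[[str], bool]


def utility_gate(query: str, answer: str, judge: Judge) -> bool:
    """Decide whether a (query, answer) pair is useful enough to distill.

    With gold labels, this decision would come from comparing the answer to
    the label; here it is delegated to the judge.
    """
    return judge(f"Query: {query}\nAnswer: {answer}\nIs this answer helpful?")


def document_gate(query: str, doc: str, judge: Judge) -> bool:
    """Decide whether a retrieved document actually supports the query."""
    return judge(f"Query: {query}\nDocument: {doc}\nDoes this document support the query?")


def select_writeback_candidates(
    query: str, answer: str, docs: List[str], judge: Judge
) -> List[str]:
    """Keep only documents that pass both gates, as candidates for
    distillation into persistent write-back documents."""
    if not utility_gate(query, answer, judge):
        return []  # the whole example is rejected, nothing is written back
    return [d for d in docs if document_gate(query, d, judge)]
```

Because the judge is injected as a plain callable, the same gating code can run with gold-label comparisons, an LLM-as-a-Judge, or a deterministic stub during testing, which is exactly the kind of swap the low-label evaluation would need.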

References

It relies on labeled training examples, so its effectiveness in low-label or unsupervised settings remains unclear (though the labels can be replaced by an LLM-as-a-Judge).

Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment (2603.25737 - Lu et al., 26 Mar 2026), Limitations section