Robustness of RoboArena to Adversarial Evaluators
Investigate the robustness of the RoboArena distributed, double-blind, pairwise robot policy evaluation framework to intentionally adversarial evaluators who attempt to tamper with results by providing random preference labels or misleading natural-language feedback, and develop methods to harden the evaluation process against such tampering.
References
While RoboArena's distributed, double-blind evaluation scheme gives it inherent robustness against influence by individual evaluators, we have not investigated its robustness to intentionally adversarial evaluators who try to tamper with evaluation results, for example by providing random preference ratings or intentionally misleading language feedback. Future work should investigate how distributed robot evaluation approaches can be hardened against such tampering.
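One simple hardening direction the excerpt suggests is screening for evaluators whose preference labels look random. The sketch below is a hypothetical illustration, not part of RoboArena: it flags evaluators whose agreement with the per-pair majority vote stays near chance level (a random labeler agrees with consensus about 50% of the time, while honest evaluators agree far more often). The function names (`consensus_labels`, `flag_low_agreement`), thresholds, and the toy simulation are all assumptions for the sake of the example.

```python
import random
from collections import defaultdict

def consensus_labels(votes):
    """Majority preference per policy pair.

    votes maps pair_id -> list of (evaluator_id, preference),
    where preference is 0 or 1 (which of the two policies won).
    """
    return {pair: int(sum(p for _, p in vs) * 2 >= len(vs))
            for pair, vs in votes.items()}

def flag_low_agreement(votes, min_votes=20, threshold=0.6):
    """Flag evaluators whose agreement rate with the per-pair majority
    falls below `threshold`. Random labeling hovers near 0.5, so it is
    separated from honest labeling once an evaluator has enough votes."""
    consensus = consensus_labels(votes)
    agree, total = defaultdict(int), defaultdict(int)
    for pair, vs in votes.items():
        for ev, p in vs:
            total[ev] += 1
            agree[ev] += int(p == consensus[pair])
    return {ev for ev in total
            if total[ev] >= min_votes and agree[ev] / total[ev] < threshold}

# Toy simulation: five honest evaluators report the true preference with
# 90% accuracy; one adversarial evaluator ("adv") labels at random.
random.seed(0)
votes = defaultdict(list)
for pair in range(200):
    truth = random.randint(0, 1)
    for ev in ["e1", "e2", "e3", "e4", "e5"]:
        p = truth if random.random() < 0.9 else 1 - truth
        votes[pair].append((ev, p))
    votes[pair].append(("adv", random.randint(0, 1)))

flagged = flag_low_agreement(votes)
print(flagged)
```

A per-pair majority is a crude consensus signal; a real defense would likely combine it with statistical tests on each evaluator's agreement rate and with robust rank aggregation, and would need care around pairs where honest evaluators genuinely disagree. Misleading language feedback is harder to screen automatically and is left open here, as in the excerpt.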