Unknown identities of 14 privately tested Chatbot Arena models

Identify the model providers behind the 14 privately tested Chatbot Arena models that the authors were unable to de-anonymize during their January–March 2025 scrape of Chatbot Arena battles. The codenames are kiwi, space, maxwell, luca, anonymous-engine-1, tippu, sky, pineapple, pegasus, dasher, dancer, blueprint, dry_goods, and prancer.

Background

To quantify the extent of private testing on Chatbot Arena, the authors scraped battles from January to March 2025 and used a de-anonymizing prompt to attribute private model aliases to providers. This procedure allowed them to assign many anonymous variants to major providers (e.g., Meta, Google, Amazon) and to tally provider-specific counts of private testing.
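The attribution-and-tally step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the keyword table, the function names, and the sample aliases and responses are all hypothetical, and it assumes each scraped record pairs a private alias with the model's reply to a de-anonymizing prompt.

```python
from collections import defaultdict

# Hypothetical keyword table (assumption): the source does not state
# how self-identification replies were matched to providers.
PROVIDER_KEYWORDS = {
    "Meta": ("meta", "llama"),
    "Google": ("google", "gemini", "gemma"),
    "Amazon": ("amazon", "nova"),
}


def attribute_provider(response):
    """Return the first provider whose keyword appears in the reply, else None."""
    text = response.lower()
    for provider, keywords in PROVIDER_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return provider
    return None


def tally_private_models(records):
    """Group aliases by attributed provider and list aliases never attributed.

    `records` is an iterable of (alias, response) pairs (assumed shape).
    """
    by_provider = defaultdict(set)
    seen = set()
    for alias, response in records:
        seen.add(alias)
        provider = attribute_provider(response)
        if provider is not None:
            by_provider[provider].add(alias)
    # An alias counts as assigned if any of its replies named a provider.
    assigned = set().union(*by_provider.values())
    counts = {p: sorted(aliases) for p, aliases in by_provider.items()}
    return counts, sorted(seen - assigned)


# Hypothetical sample records, not data from the paper.
sample = [
    ("alias-a", "I am Llama, trained by Meta."),
    ("alias-b", "I'm Gemini, a large language model from Google."),
    ("alias-c", "I am an AI assistant."),
]
counts, unassigned = tally_private_models(sample)
```

In this sketch, an alias whose replies never name a provider ends up in `unassigned`, which mirrors how a model that does not self-identify would stay unattributed. Plain substring matching is deliberately naive and would need stricter rules on real replies.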

However, 14 anonymous models remained unassigned because the authors could not de-anonymize them via model self-identification. Resolving these identities would complete the audit of who benefitted from private testing access and would refine the estimates of data asymmetry and leaderboard sampling advantages.

References

We also captured 14 other private models as part of our scraping but weren't able to de-anonymize them: kiwi, space, maxwell, luca, anonymous-engine-1, tippu, sky, pineapple, pegasus, dasher, dancer, blueprint, dry_goods, prancer.

The Leaderboard Illusion (arXiv:2504.20879, Singh et al., 29 Apr 2025), Appendix, "Encountered Private Models in Scraping" (Section app:private-scrape-models)