Disentangle the effects of reasoning training versus model size in Llama-4 Maverick vs. Scout

Determine whether the higher rate of AI-identity disclosure observed in Meta's Llama-4-Maverick-17B-128E-Instruct, relative to Llama-4-Scout-17B-16E-Instruct, is driven by reasoning training or by the larger total parameter count, given that both models share the same active parameter count.

Background

Within the Llama-4 family, Maverick differs from Scout in two ways at once: a larger total parameter count (128 experts versus 16 in the mixture-of-experts configuration) and the presence of reasoning training, while both models activate the same 17B parameters per token.

This confound prevents cleanly attributing the disclosure difference to either reasoning training or model size without further controlled experimentation, e.g., comparing against a model that varies only one of the two factors.
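Before any such controlled experiment, the raw disclosure-rate gap between the two models should itself be tested for significance. The sketch below is a minimal, hedged illustration using a standard two-proportion z-test; the disclosure counts are hypothetical placeholders, not figures from the audit.

```python
from math import sqrt, erf

def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Two-sided two-proportion z-test.

    x1, x2: number of trials with AI-identity disclosure per model.
    n1, n2: total probe prompts per model.
    Returns (z statistic, two-sided p-value) under the pooled null.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                     # pooled disclosure rate
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Normal-tail p-value via the error function: Phi(x) = 0.5*(1 + erf(x/sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical counts (NOT from the paper): Maverick disclosed in 420/1000
# probes, Scout in 350/1000.
z, p = two_proportion_z(420, 1000, 350, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A significant gap here would motivate, but not replace, the controlled comparison: only a model pair varying one factor at a time can attribute the effect to reasoning training or to total parameter count.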

References

"This dual difference prevents isolating whether reasoning training or model size drove the observed pattern in this model family."

Self-Transparency Failures in Expert-Persona LLMs: A Large-Scale Behavioral Audit (2511.21569 - Diep, 26 Nov 2025) in Section: Reasoning Training Shows Heterogeneous Effects on Self-Transparency