Ground-truth validation of CoV-based autonomy classification

Establish controlled experiments and ground-truth labeled datasets in which the operational mode of AI agents on Moltbook-like platforms (autonomous heartbeat scheduling versus human prompting) is known, to directly validate and calibrate the coefficient-of-variation-based temporal classification for detecting human influence.

Background

The authors classify agents as autonomous or human-influenced using the coefficient of variation (CoV, the standard deviation divided by the mean) of inter-post intervals, and they validate this approach primarily via a natural experiment created by a 44-hour platform shutdown. Content and ownership signals are shown to be largely orthogonal to temporal patterns.
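As a rough illustration of the temporal signal, the sketch below computes the CoV of the gaps between consecutive posts for one agent. The function name `interval_cov`, the use of the population standard deviation, and the timestamps-in-seconds input are assumptions made for this example; the paper's exact estimator and decision thresholds are not given here.

```python
from statistics import mean, pstdev


def interval_cov(timestamps):
    """Return the coefficient of variation (std / mean) of inter-post intervals.

    `timestamps` is an iterable of post times in seconds (hypothetical input
    format). The population standard deviation is used; this is an assumption,
    not necessarily the estimator used by the authors.
    """
    ts = sorted(timestamps)
    # Gaps between consecutive posts.
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps to measure variation")
    mu = mean(gaps)
    if mu == 0:
        raise ValueError("all posts share one timestamp; CoV is undefined")
    return pstdev(gaps) / mu


# A strictly periodic schedule has no variation in its gaps.
regular = interval_cov([0, 3600, 7200, 10800])

# Bursty posting produces gaps of very different sizes.
bursty = interval_cov([0, 60, 7200, 7300])
```

Under the working assumption that heartbeat scheduling produces near-constant gaps, `regular` comes out at zero while `bursty` is well above it; where to draw the line between the two is exactly what ground-truth data would need to settle.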

However, they explicitly state that they lack ground-truth labels for known human-prompted versus autonomous posts and therefore cannot cross-validate temporal classification against content features. Controlled experiments with known operational modes would provide a definitive validation and calibration framework for the temporal method.
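One way such a labeled dataset could be used for calibration is sketched below: given per-agent CoV values with known operational modes, scan candidate thresholds and keep the one with the best balanced accuracy. The function `calibrate_threshold`, the choice of balanced accuracy, the sample values, and the assumption that a higher CoV indicates human prompting are all illustrative and not taken from the paper.

```python
def calibrate_threshold(samples):
    """Pick the CoV threshold that best separates labeled agents.

    `samples` is a list of (cov, is_human_prompted) pairs from a hypothetical
    controlled experiment. Assumes higher CoV means human prompting.
    Returns (threshold, balanced_accuracy).
    """
    human = [cov for cov, label in samples if label]
    auto = [cov for cov, label in samples if not label]
    if not human or not auto:
        raise ValueError("need at least one example of each operational mode")

    values = sorted({cov for cov, _ in samples})
    # Candidate thresholds are midpoints between adjacent distinct CoV values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    if not candidates:
        raise ValueError("all CoV values are identical; nothing to calibrate")

    best_threshold, best_score = None, -1.0
    for t in candidates:
        true_pos_rate = sum(cov > t for cov in human) / len(human)
        true_neg_rate = sum(cov <= t for cov in auto) / len(auto)
        score = (true_pos_rate + true_neg_rate) / 2
        if score > best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score


# Made-up CoV values for agents whose mode is known by construction.
labeled = [(0.1, False), (0.2, False), (0.3, False), (1.0, True), (1.4, True)]
threshold, score = calibrate_threshold(labeled)
```

Balanced accuracy is used here so that an imbalance between autonomous and human-prompted agents in the experiment does not dominate the choice; a real study would also want held-out data to check that the calibrated threshold generalizes.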

References

Our classification framework is validated primarily through the natural experiment (44-hour shutdown) and the sliding window temporal dynamics, but lacks ground truth labels of known human-prompted versus autonomous posts. This signal independence means we cannot cross-validate temporal classification against content features; each signal provides complementary rather than convergent evidence.

— The Moltbook Illusion: Separating Human Influence from Emergent Behavior in AI Agent Societies (2602.07432 - Li, 7 Feb 2026) in Limitations
