Accuracy of clustering and classification of investors from level IV limit order book data

Determine how accurately investors can be clustered or classified into correct behavioral groups using investor-level limit order book (level IV) data, to enable reliable empirical analyses in agent-based modeling and behavioral finance.

Background

Agent-based models and behavioral finance research often require assigning real investors to behavioral categories (e.g., market makers, market takers, fundamentalists, chartists, noise traders). Investor-level labels are rarely available, however, and it is unclear how reliably they can be inferred from trading data. Investor-level limit order book (level IV) data offer a rich basis for profiling, but the reliability of clustering or classifying investors into the correct groups has not been established.

The paper constructs a synthetic limit order book (LOB) environment with ground-truth agent labels and uses it to evaluate supervised classification and unsupervised clustering across feature sets and noise levels, with the aim of gauging the practical limits of investor profiling. The open question is how much accuracy is achievable when only investor-level LOB data are available.
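
The evaluation idea described above can be illustrated with a minimal sketch. It is not the paper's simulator or method: the two agent groups, the two features (share of limit orders, cancellation rate), the group centres, the noise levels, and the choice of a nearest-centroid classifier and 2-means clustering are all assumptions made for illustration. The sketch only shows how ground-truth labels let both supervised and unsupervised accuracy be measured as noise grows.

```python
import random
from statistics import mean

# Hypothetical group centres in a 2-D feature space
# (share of limit orders, cancellation rate). Illustrative values only.
CENTRES = {"maker": (0.9, 0.6), "taker": (0.2, 0.1)}


def make_agents(n_per_group, noise, rng):
    """Return shuffled (features, label) pairs with Gaussian noise around each centre."""
    data = []
    for label, (cx, cy) in CENTRES.items():
        for _ in range(n_per_group):
            data.append(((rng.gauss(cx, noise), rng.gauss(cy, noise)), label))
    rng.shuffle(data)
    return data


def centroid(points):
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(point, cents):
    """Index of the centroid closest to point."""
    return min(range(len(cents)), key=lambda k: sq_dist(point, cents[k]))


def classify_accuracy(train, test):
    """Supervised: nearest-centroid classifier fitted on labelled training agents."""
    labels = list(CENTRES)
    cents = [centroid([x for x, y in train if y == lab]) for lab in labels]
    hits = sum(labels[nearest(x, cents)] == y for x, y in test)
    return hits / len(test)


def cluster_accuracy(data, iters=20):
    """Unsupervised: 2-means, scored under the better of the two cluster-to-label mappings."""
    points = [x for x, _ in data]
    first = points[0]
    # Deterministic initialisation: first point and the point farthest from it.
    cents = [first, max(points, key=lambda p: sq_dist(p, first))]
    for _ in range(iters):
        assign = [nearest(p, cents) for p in points]
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                cents[k] = centroid(members)
    assign = [nearest(p, cents) for p in points]
    agree = sum((a == 0) == (y == "maker") for a, (_, y) in zip(assign, data))
    return max(agree, len(data) - agree) / len(data)


def accuracy_by_noise(noise_levels=(0.05, 0.3, 1.0), n_per_group=200, seed=0):
    """Map each noise level to (classification accuracy, clustering accuracy)."""
    rng = random.Random(seed)
    results = {}
    for noise in noise_levels:
        data = make_agents(n_per_group, noise, rng)
        half = len(data) // 2
        results[noise] = (
            classify_accuracy(data[:half], data[half:]),
            cluster_accuracy(data),
        )
    return results


if __name__ == "__main__":
    for noise, (clf, clu) in accuracy_by_noise().items():
        print(f"noise={noise}: classification={clf:.2f}, clustering={clu:.2f}")
```

Because the labels are known by construction, any drop in either score can be attributed to the noise level or the feature set rather than to labelling error, which is the property a synthetic environment provides and real investor data typically do not.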

References

However, a fundamental question has remained open: How accurately can investors be clustered or classified into the correct groups with investor-level limit order book data (level IV data)?

Classifying and Clustering Trading Agents (2505.21662 - Wilinski et al., 27 May 2025) in Section 1 (Introduction)