Clustered Mallows Model (2403.12880v1)
Abstract: Rankings are a type of preference elicitation that arise in experiments where assessors arrange items, for example, in decreasing order of utility. Orderings of n items labelled {1,...,n} are permutations that reflect strict preferences. For a number of reasons, strict preferences can be an unrealistic assumption for real data. For example, when items share common traits, it may be reasonable to assign them equal ranks. Also, the decisions that form a ranking may differ in importance. With a large number of items, for example, an assessor may wish to rank a certain number of items at the top, rank other items at the bottom, and express indifference to all others. In addition, when aggregating opinions, a judging body might be decisive about some parts of the ranking but ambiguous about others. In this paper we extend the well-known Mallows model (MM; Mallows, 1957) to accommodate item indifference, a phenomenon that can arise for a variety of reasons, such as those mentioned above. The underlying grouping of similar items motivates the proposed Clustered Mallows Model (CMM). The CMM can be interpreted as a Mallows distribution for tied ranks where the ties are learned from the data. The CMM provides the flexibility to combine strict and indifferent relations, achieving a simpler and more robust representation of rank collections in the form of ordered clusters. Bayesian inference for the CMM belongs to the class of doubly-intractable problems, since the model's normalisation constant is not available in closed form. We overcome this challenge by sampling from the posterior with a version of the exchange algorithm (Murray et al., 2006). Real data analyses of food preferences and of Formula 1 race results are presented, illustrating the CMM in practical situations.
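For orientation, the following is a minimal sketch of the standard Mallows model that the CMM extends, not of the CMM itself. It assumes the Kendall distance, which the abstract does not specify, and the function names (`kendall_distance`, `log_normaliser`, `mallows_pmf`) are illustrative rather than taken from the paper. Under this assumed distance the normalisation constant has a known closed form, in contrast to the CMM, whose constant is described above as unavailable in closed form.

```python
import math
from itertools import permutations


def kendall_distance(r, s):
    """Count item pairs that the rank vectors r and s order differently."""
    n = len(r)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (r[i] - r[j]) * (s[i] - s[j]) < 0
    )


def log_normaliser(n, theta):
    """Log of Z(theta) = prod_{j=1}^{n} (1 - e^{-j*theta}) / (1 - e^{-theta}).

    This closed form holds for the Kendall distance (an assumption here).
    At theta = 0 the model is uniform, so Z = n!.
    """
    if theta == 0:
        return math.lgamma(n + 1)
    return sum(
        math.log((1 - math.exp(-j * theta)) / (1 - math.exp(-theta)))
        for j in range(1, n + 1)
    )


def mallows_pmf(r, rho, theta):
    """P(r | rho, theta) = exp(-theta * d(r, rho)) / Z(theta)."""
    return math.exp(-theta * kendall_distance(r, rho) - log_normaliser(len(r), theta))


# Usage: probabilities over all rankings of 4 items, centred at rho.
rho = (1, 2, 3, 4)
probs = {r: mallows_pmf(r, rho, 0.7) for r in permutations(rho)}
```

Here `r[i]` is the rank given to item `i`, `rho` is the central (consensus) ranking, and `theta` controls how sharply the mass concentrates around `rho`.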
- Noisy Monte Carlo: Convergence of Markov chains with approximate transition kernels. Statistics and Computing, 26:29–47.
- The pseudo-marginal approach for efficient Monte Carlo computations. Annals of Statistics, 37:697–725.
- Product partition models for change point problems. The Annals of Statistics, 20:260–279.
- Assessing a mixture model for clustering with the integrated classification likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:719–725.
- From "I like" to "I prefer" in collaborative filtering. In 2010 22nd IEEE International Conference on Tools with Artificial Intelligence, volume 2, pages 365–367.
- Towards preference relations in recommender systems. In Workshop on Preference Learning, European Conference on Machine Learning and Principle and Practice of Knowledge Discovery in Databases (ECML-PKDD 2010), volume 51.
- Learning to rank with nonsmooth cost functions. In Schölkopf, B., Platt, J., and Hoffman, T., editors, Advances in Neural Information Processing Systems, volume 19. MIT Press.
- Burges, C. J. (2010). From RankNet to LambdaRank to LambdaMART: An overview. Technical Report, Microsoft Research.
- Bayesian inference for exponential random graph models. Social Networks, 33:41–55.
- Use of nonnull models for rank statistics in bivariate, two-sample, and analysis of variance problems. Journal of the American Statistical Association, 86:188–200.
- A Bayesian Mallows approach to non-transitive pair comparison data: How human are sounds? The Annals of Applied Statistics, 13:492–519.
- Critchlow, D. (1985). Metric Methods for Analyzing Partially Ranked Data. Springer New York, NY.
- Probability models on rankings. Journal of Mathematical Psychology, 35:294–318.
- Bayesian aggregation of order-based rank data. Journal of the American Statistical Association, 109:1023–1039.
- Diaconis, P. (2009). The Markov Chain Monte Carlo revolution. Bulletin of the American Mathematical Society, 46:179–205.
- Doignon, J. (2023). The best-worst-choice polytope on four alternatives. Journal of Mathematical Psychology, 114:102769.
- Distance based ranking models. Journal of the Royal Statistical Society: Series B, 48:359–369.
- Analysis of Irish third-level college applications data. Journal of the Royal Statistical Society: Series A, 169:361–379.
- Hartigan, J. (1990). Partition models. Communications in Statistics - Theory and Methods, 19:2745–2756.
- Sampling and learning Mallows and generalized Mallows models under the Cayley distance. Methodology and Computing in Applied Probability, 20:1–35.
- Mallows and generalized Mallows model for matchings. Bernoulli, 25:1160–1188.
- Kamishima, T. (2003). Nantonac collaborative filtering: Recommendation based on order responses. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
- Bayesian Networks: An Introduction. John Wiley & Sons.
- A review on evolutionary algorithms in Bayesian network learning and inference tasks. Information Sciences, 233:109–125.
- An extended Mallows model for ranked data aggregation. Journal of the American Statistical Association, 115:730–746.
- Effective sampling and learning for Mallows models with pairwise-preference data. Journal of Machine Learning Research, 15:3783–3829.
- Mallows, C. L. (1957). Non-null ranking models. I. Biometrika, 44:114–130.
- Marden, J. (1995). Analyzing and Modeling Rank Data. Chapman and Hall/CRC, 1st edition.
- Models of best–worst choice and ranking among multiattribute options (profiles). Journal of Mathematical Psychology, 56:24–34.
- A formal and empirical comparison of two score measures for best–worst scaling. Journal of Choice Modelling, 21:15–24.
- Estimation and clustering with infinite rankings. In Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence, Arlington, Virginia, USA.
- An exponential model for infinite rankings. Journal of Machine Learning Research, 11:3481–3518.
- MCMC for doubly-intractable distributions. In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI-06).
- Plackett, R. L. (1975). The analysis of permutations. Journal of the Royal Statistical Society: Series C, 24:193–202.
- Bayesian Product Partition Models. John Wiley & Sons, Ltd.
- Variable selection for model-based clustering. Journal of the American Statistical Association, 101:168–178.
- Bayesian Networks: With Examples in R. CRC Press.
- BayesMallows: An R package for the Bayesian Mallows model. The R Journal, 12:324–342.
- An overview of composite likelihood methods. Statistica Sinica, pages 5–42.
- Probabilistic preference learning with the Mallows rank model. Journal of Machine Learning Research, 18:1–49.
- Partition–Mallows model and its inference for rank aggregation. Journal of the American Statistical Association, 118:343–359.