Scalable Learning of Item Response Theory Models (2403.00680v2)

Published 1 Mar 2024 in cs.LG, cs.DS, and stat.ML

Abstract: Item Response Theory (IRT) models aim to assess the latent abilities of $n$ examinees along with the latent difficulty characteristics of $m$ test items from categorical data that indicate the quality of their corresponding answers. Classical psychometric assessments are based on a relatively small number of examinees and items, say a class of $200$ students solving an exam comprising $10$ problems. More recent global large-scale assessments such as PISA, or internet studies, may involve significantly larger numbers of participants. Additionally, in the context of machine learning, where algorithms take the role of examinees and data analysis problems take the role of items, both $n$ and $m$ may become very large, challenging the efficiency and scalability of computations. To learn the latent variables in IRT models from large data, we leverage the similarity of these models to logistic regression, which can be approximated accurately using small weighted subsets called coresets. We develop coresets for their use in alternating IRT training algorithms, facilitating scalable learning from large data.
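The abstract's core idea — treat each alternating IRT step as (weighted) logistic regression and run the expensive step on a small weighted subset instead of all the data — can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the paper's construction: it uses the simpler Rasch model $P(X_{ij}=1)=\sigma(\theta_i-b_j)$, a crude importance score in place of the paper's sensitivity-based coreset, and plain alternating gradient steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Simulate Rasch-model responses: P(X_ij = 1) = sigmoid(theta_i - b_j),
# with n examinees (latent abilities theta) and m items (difficulties b).
n, m = 2000, 25
theta_true = rng.normal(size=n)
b_true = rng.normal(size=m)
X = (rng.uniform(size=(n, m))
     < sigmoid(theta_true[:, None] - b_true[None, :])).astype(float)

# Stand-in "coreset" over examinees: importance-sample S rows with
# probabilities q_i from a heuristic score, and reweight by 1/(S*q_i) so the
# weighted subset matches full-data sums in expectation.
S = 400
scores = np.abs(X.mean(axis=1) - 0.5) + 0.1   # crude sensitivity proxy
q = scores / scores.sum()
idx = rng.choice(n, size=S, replace=True, p=q)
w = 1.0 / (S * q[idx])

# Alternating gradient ascent on the joint log-likelihood.
theta, b = np.zeros(n), np.zeros(m)
lr_theta, lr_b = 4.0 / m, 4.0 / w.sum()
for _ in range(200):
    # Ability step: each examinee's row is a 1-D logistic-regression problem.
    R_full = X - sigmoid(theta[:, None] - b[None, :])
    theta += lr_theta * R_full.sum(axis=1)
    # Item step: sum gradients over the weighted subsample, not all n rows.
    R_sub = X[idx] - sigmoid(theta[idx][:, None] - b[None, :])
    b -= lr_b * (w[:, None] * R_sub).sum(axis=0)
    b -= b.mean()   # pin down the model's shift invariance
```

Even with only a fifth of the examinees entering the item-parameter step, the recovered abilities and difficulties correlate strongly with the simulated truth; a real coreset construction would replace the heuristic scores with provable sensitivity bounds so that the subset approximates the loss for every parameter setting, not just in expectation.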
