
Survey of Computerized Adaptive Testing: A Machine Learning Perspective (2404.00712v2)

Published 31 Mar 2024 in cs.LG, cs.AI, cs.CY, and cs.IR

Abstract: Computerized Adaptive Testing (CAT) provides an efficient and tailored method for assessing the proficiency of examinees, by dynamically adjusting test questions based on their performance. Widely adopted across diverse fields like education, healthcare, sports, and sociology, CAT has revolutionized testing practices. While traditional methods rely on psychometrics and statistics, the increasing complexity of large-scale testing has spurred the integration of machine learning techniques. This paper aims to provide a machine learning-focused survey on CAT, presenting a fresh perspective on this adaptive testing method. By examining the test question selection algorithm at the heart of CAT's adaptivity, we shed light on its functionality. Furthermore, we delve into cognitive diagnosis models, question bank construction, and test control within CAT, exploring how machine learning can optimize these components. Through an analysis of current methods, strengths, limitations, and challenges, we strive to develop robust, fair, and efficient CAT systems. By bridging psychometric-driven CAT research with machine learning, this survey advocates for a more inclusive and interdisciplinary approach to the future of adaptive testing.
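To make the adaptive loop described in the abstract concrete, the following is a minimal sketch of one classical selection rule: pick the unanswered question with the highest Fisher information at the current ability estimate. It assumes a two-parameter logistic (2PL) item response model and a grid-based posterior-mean ability estimate with a standard normal prior. Neither choice is stated in the abstract, and all function names and the `(a, b)` item representation are hypothetical, chosen only for illustration.

```python
import math

# Assumption: each item is a hypothetical (a, b) pair under a 2PL model,
# where a is discrimination and b is difficulty.

def prob_correct(theta, a, b):
    """2PL item response function: probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, item_bank, administered):
    """Return the index of the unadministered item with maximum information."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta, *item_bank[i]))

def estimate_theta(responses, item_bank, grid=None):
    """Posterior-mean ability on a grid, with a standard normal prior.

    responses is a list of (item_index, was_correct) pairs.
    """
    if grid is None:
        grid = [x / 10.0 for x in range(-40, 41)]
    weights = []
    for t in grid:
        log_w = -0.5 * t * t  # log of the unnormalized N(0, 1) prior
        for idx, correct in responses:
            p = prob_correct(t, *item_bank[idx])
            log_w += math.log(p if correct else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total
```

A test session would alternate the two steps: call `select_next_item` with the current estimate, record the examinee's response, then call `estimate_theta` on all responses so far. The machine learning approaches the survey covers can be read as replacing this fixed information criterion, the response model, or both with learned components.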
