Enhancing Essay Scoring with Adversarial Weights Perturbation and Metric-specific AttentionPooling (2401.05433v1)

Published 6 Jan 2024 in cs.CL and cs.AI

Abstract: The objective of this study is to improve automated feedback tools designed for English Language Learners (ELLs) through the utilization of data science techniques encompassing machine learning, natural language processing, and educational data analytics. Automated essay scoring (AES) research has made strides in evaluating written essays, but it often overlooks the specific needs of ELLs in language development. This study explores the application of BERT-related techniques to enhance the assessment of ELLs' writing proficiency within AES. To address the specific needs of ELLs, we propose the use of DeBERTa, a state-of-the-art neural language model, for improving automated feedback tools. DeBERTa, pretrained on large text corpora using self-supervised learning, learns universal language representations adaptable to various natural language understanding tasks. The model incorporates several innovative techniques, including adversarial training through Adversarial Weights Perturbation (AWP) and Metric-specific Attention Pooling (6AP), with a dedicated pooling head for each label in the competition. The primary focus of this research is to investigate the impact of hyperparameters, particularly the adversarial learning rate, on model performance. By refining the hyperparameter tuning process, including the influence of 6AP and AWP, the resulting models can provide more accurate evaluations of language proficiency and support tailored learning tasks for ELLs. This work has the potential to significantly benefit ELLs by improving their English language proficiency and facilitating their educational journey.
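The adversarial training component named in the abstract, Adversarial Weights Perturbation (AWP), admits a compact implementation. Below is a minimal PyTorch sketch of the general AWP recipe, assuming a standard fine-tuning loop; the class name, the default values for `adv_lr` and `adv_eps`, and the `adv_param` name filter are illustrative assumptions rather than the authors' code. `adv_lr` plays the role of the adversarial learning rate whose effect the paper investigates.

```python
import torch

class AWP:
    """Adversarial Weights Perturbation (sketch): after the clean backward
    pass, step the weights in the gradient (loss-ascent) direction inside a
    small eps-ball, run a second forward/backward pass, then restore."""

    def __init__(self, model, adv_lr=1e-3, adv_eps=1e-2, adv_param="weight"):
        self.model = model
        self.adv_lr = adv_lr        # adversarial learning rate (the paper's key hyperparameter)
        self.adv_eps = adv_eps      # relative radius bounding the perturbation
        self.adv_param = adv_param  # only perturb parameters whose name contains this substring
        self.backup = {}

    def attack(self):
        # Call after loss.backward(), so p.grad holds the clean gradients.
        e = 1e-6
        for name, p in self.model.named_parameters():
            if p.requires_grad and p.grad is not None and self.adv_param in name:
                self.backup[name] = p.data.clone()
                grad_norm = torch.norm(p.grad)
                if grad_norm != 0 and not torch.isnan(grad_norm):
                    # Relative ascent step, scaled by the weight norm ...
                    r = self.adv_lr * p.grad / (grad_norm + e) * torch.norm(p.data)
                    # ... clipped to an eps-ball around the original weights.
                    bound = self.adv_eps * (self.backup[name].abs() + e)
                    p.data = torch.min(torch.max(p.data + r, self.backup[name] - bound),
                                       self.backup[name] + bound)

    def restore(self):
        # Reinstate the clean weights before the optimizer step.
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}
```

A typical training step would then run the clean forward/backward, call `awp.attack()`, run a second forward/backward on the perturbed weights so the adversarial gradients accumulate on top of the clean ones, and call `awp.restore()` before `optimizer.step()`.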

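The Metric-specific Attention Pooling (6AP) idea, one learned attention-pooling head per scored label, can likewise be sketched. The following is a hedged PyTorch illustration assuming six regression targets over a transformer encoder's hidden states; the module names, the tanh-scored attention form, and the per-metric linear regressors are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Collapse token-level hidden states into one vector via a learned,
    mask-aware attention distribution over tokens."""

    def __init__(self, hidden_size):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
        scores = self.attn(hidden_states).squeeze(-1)           # (batch, seq_len)
        scores = scores.masked_fill(attention_mask == 0, -1e4)  # ignore padding tokens
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)   # (batch, seq_len, 1)
        return (weights * hidden_states).sum(dim=1)             # (batch, hidden)

class MetricSpecificHeads(nn.Module):
    """One attention-pooling head and one regressor per scored metric."""

    def __init__(self, hidden_size, num_metrics=6):
        super().__init__()
        self.pools = nn.ModuleList([AttentionPool(hidden_size) for _ in range(num_metrics)])
        self.regressors = nn.ModuleList([nn.Linear(hidden_size, 1) for _ in range(num_metrics)])

    def forward(self, hidden_states, attention_mask):
        # One pooled representation and one score per metric.
        preds = [reg(pool(hidden_states, attention_mask))
                 for pool, reg in zip(self.pools, self.regressors)]
        return torch.cat(preds, dim=-1)  # (batch, num_metrics)
```

In the full model these heads would sit on top of DeBERTa's final hidden states, with the six outputs trained jointly under a regression loss; letting each metric learn its own token weighting is what distinguishes this from a single shared pooling layer.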
