ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models (2406.18125v2)

Published 26 Jun 2024 in cs.CL, cs.AI, cs.CY, and cs.LG

Abstract: The increasing reliance on online recruitment platforms coupled with the adoption of AI technologies has highlighted the critical need for efficient resume classification methods. However, challenges such as small datasets, lack of standardized resume templates, and privacy concerns hinder the accuracy and effectiveness of existing classification models. In this work, we address these challenges by presenting a comprehensive approach to resume classification. We curated a large-scale dataset of 13,389 resumes from diverse sources and employed LLMs such as BERT and Gemma1.1 2B for classification. Our results demonstrate significant improvements over traditional machine learning approaches, with our best model achieving a top-1 accuracy of 92% and a top-5 accuracy of 97.5%. These findings underscore the importance of dataset quality and advanced model architectures in enhancing the accuracy and robustness of resume classification systems, thus advancing the field of online recruitment practices.
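
The abstract reports top-1 and top-5 accuracy. As a minimal sketch of what those metrics measure (this is not the authors' evaluation code; the function name `top_k_accuracy` and the toy scores and labels below are illustrative assumptions), a prediction counts as correct at top-k when the true category is among the k highest-scoring classes:

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical classifier scores for 4 resumes over 3 made-up job categories.
scores = [
    [0.7, 0.2, 0.1],  # true label 0: ranked first
    [0.1, 0.3, 0.6],  # true label 1: ranked second
    [0.2, 0.5, 0.3],  # true label 1: ranked first
    [0.5, 0.4, 0.1],  # true label 2: ranked third
]
labels = [0, 1, 1, 2]

print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 0.75
```

Top-k accuracy can never decrease as k grows, which is why the reported top-5 figure is at least as high as the top-1 figure.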

Authors (5)
  1. Ahmed Heakl (17 papers)
  2. Youssef Mohamed (14 papers)
  3. Noran Mohamed (2 papers)
  4. Ahmed Zaky (3 papers)
  5. Aly Elsharkawy (1 paper)
Citations (1)