Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies (2407.19816v2)

Published 29 Jul 2024 in cs.CL

Abstract: The labor market is undergoing rapid changes, with increasing demands on job seekers and a surge in job openings. Identifying essential skills and competencies from job descriptions is challenging due to varying employer requirements and the omission of key skills. This study addresses these challenges by comparing traditional encoder-based Named Entity Recognition (NER) methods with LLMs for extracting skills from Russian job vacancies. Using a labeled dataset of 4,000 job vacancies for training and 1,472 for testing, the performance of both approaches is evaluated. Results indicate that traditional NER models, especially the fine-tuned DeepPavlov RuBERT NER model, outperform LLMs across various metrics, including accuracy, precision, recall, and inference time. The findings suggest that traditional NER models provide more effective and efficient solutions for skill extraction, enhancing job requirement clarity and aiding job seekers in aligning their qualifications with employer expectations. This research contributes to the field of NLP and its application in the labor market, particularly in non-English contexts.
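The comparison of NER models and LLMs on precision and recall implies an entity-level evaluation of the extracted skill spans. As a minimal illustration (not the paper's actual evaluation code), exact-match precision, recall, and F1 over predicted versus gold skill spans can be computed as follows; the span format and example data are assumptions for the sketch:

```python
# Minimal sketch of entity-level evaluation for skill extraction.
# Each span is a (start, end, label) tuple; exact-match scoring only.
# Not the paper's code -- an illustration of the metric type reported.

def span_prf(gold, pred):
    """Return exact-match precision, recall, and F1 over entity spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)  # spans predicted exactly right
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical gold annotations vs. model output for one vacancy
gold = [(0, 6, "SKILL"), (10, 16, "SKILL"), (20, 30, "SKILL")]
pred = [(0, 6, "SKILL"), (10, 16, "SKILL"), (40, 45, "SKILL")]
p, r, f = span_prf(gold, pred)
print(round(p, 2), round(r, 2), round(f, 2))  # → 0.67 0.67 0.67
```

Exact-match scoring is strict: a predicted span with one token of boundary error counts as both a false positive and a false negative, which is one reason inference-time-cheap encoder models can still dominate on these metrics.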

Authors (7)
  1. Nikita Matkin (2 papers)
  2. Aleksei Smirnov (2 papers)
  3. Mikhail Usanin (1 paper)
  4. Egor Ivanov (1 paper)
  5. Kirill Sobyanin (1 paper)
  6. Sofiia Paklina (1 paper)
  7. Petr Parshakov (3 papers)