
LLM-TOPSIS Framework

Updated 6 February 2026
  • The paper presents the LLM-TOPSIS framework that integrates fine-tuned transformer models with fuzzy TOPSIS to rank candidate profiles using both structured and unstructured data.
  • The methodology converts NLP-derived proficiency labels into numerical scores and triangular fuzzy numbers, forming a decision matrix for comprehensive multi-criteria evaluation.
  • Empirical results demonstrate high performance with 91%+ accuracy and near-perfect alignment with expert rankings, underscoring the framework's potential in enhancing recruitment processes.

The LLM-TOPSIS framework is an integrated system that combines LLM-based natural language processing with a fuzzy extension of the Technique for Order Preference by Similarity to Ideal Solution (Fuzzy-TOPSIS), applied to the automated, multi-criteria ranking of personnel profiles in software engineering recruitment. The methodology operationalizes both structured expert knowledge and the nuanced, unstructured data found in LinkedIn profiles by using fine-tuned transformer models as scoring front-ends and a fuzzy multi-criteria decision-making (MCDM) backend operating on triangular fuzzy numbers (TFNs) (Hoque et al., 30 Jan 2026).

1. System Architecture and Workflow

The LLM-TOPSIS system ingests a set of $N$ LinkedIn profiles, each with four key textual fields: Experience, Skills, Education, and About (self-introduction). The primary workflow consists of the following steps:

  1. Fine-tuned DistilRoBERTa Multi-class Classification: For each field, a DistilRoBERTa model is fine-tuned to predict one of three proficiency labels—Poor, Fair, or Excellent.
  2. Label-to-Score Mapping: Predicted labels are mapped to numerical scores: 1–2 (Poor), 3 (Fair), 4–5 (Excellent).
  3. Matrix Construction: Scores are organized into a numeric decision matrix $V$ of shape $N \times 4$.
  4. Fuzzy TOPSIS Application: The decision matrix is transformed using TFNs for both criteria weights and candidate scores. Fuzzy-TOPSIS is then applied to produce an overall candidate ranking.

DistilRoBERTa acts as the quantitative interpreter of unstructured profile data, while the fuzzy-TOPSIS backend aggregates the resulting scores under explicit modeling of linguistic and subjective uncertainty.
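The label-to-score step above can be sketched as a simple lookup. This is illustrative only: the paper specifies ranges (1–2 for Poor, 4–5 for Excellent), so the single representative values below are assumptions.

```python
# Hypothetical label-to-score mapping; the paper maps Poor -> 1-2,
# Fair -> 3, Excellent -> 4-5, so the single values chosen here
# within those ranges are assumptions.
LABEL_TO_SCORE = {"Poor": 2, "Fair": 3, "Excellent": 5}

def profile_scores(predictions):
    """Map per-field proficiency labels to one row of the decision matrix V."""
    fields = ["Experience", "Skills", "Education", "About"]
    return [LABEL_TO_SCORE[predictions[f]] for f in fields]
```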

2. Mathematical Preliminaries and Notation

2.1 Triangular Fuzzy Numbers (TFNs)

A TFN is specified as $\tilde{a} = (l, m, u)$, where $l \leq m \leq u$ are the lower, modal, and upper bounds. The membership function $\mu_{\tilde{a}}(x)$ increases linearly from $l$ to $m$ and decreases linearly from $m$ to $u$.
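The piecewise-linear membership function translates directly into code; this is a standard TFN implementation, not code from the paper:

```python
def tfn_membership(x: float, l: float, m: float, u: float) -> float:
    """Membership degree of x in the triangular fuzzy number (l, m, u)."""
    if x < l or x > u:
        return 0.0                  # outside the support
    if x == m:
        return 1.0                  # the modal value has full membership
    if x < m:
        return (x - l) / (m - l)    # rising edge from l to m
    return (u - x) / (u - m)        # falling edge from m to u
```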

2.2 Linguistic-to-TFN Mapping

Candidate attribute labels and criteria weights are converted to TFNs via predefined mappings. For example, the translation from linguistic term to TFN is as follows:

Linguistic Term    TFN
Very Low           (0.0, 0.1, 0.3)
Low                (0.1, 0.3, 0.5)
Medium             (0.3, 0.5, 0.7)
High               (0.5, 0.7, 0.9)
Very High          (0.7, 0.9, 1.0)

Criteria weights are specified as $\tilde{w}_j = (l_j, m_j, u_j)$ for $j \in \{$Experience, Skills, Education, About$\}$, and candidate scores as $\tilde{x}_{ij} = (l_{ij}, m_{ij}, u_{ij})$ via interval or linguistic mappings.
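The mapping table translates directly into a lookup; the helper name `fuzzify_weights` is illustrative, not from the paper:

```python
# Linguistic-to-TFN lexicon, taken from the paper's mapping table.
LINGUISTIC_TFN = {
    "Very Low":  (0.0, 0.1, 0.3),
    "Low":       (0.1, 0.3, 0.5),
    "Medium":    (0.3, 0.5, 0.7),
    "High":      (0.5, 0.7, 0.9),
    "Very High": (0.7, 0.9, 1.0),
}

def fuzzify_weights(terms):
    """Convert linguistic criterion weights to a fuzzy weight vector."""
    return [LINGUISTIC_TFN[t] for t in terms]
```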

3. Fuzzy TOPSIS Computation

3.1 Fuzzy Decision Matrix and Weights

The fuzzy decision matrix is $\tilde{X} = [\tilde{x}_{ij}]_{N \times m}$ and the fuzzy weight vector $\tilde{w} = [\tilde{w}_j]_{1 \times m}$, with $m = 4$ criteria.

3.2 Fuzzy Normalization

Each criterion $j$ is normalized (for benefit attributes) as:

$\tilde{r}_{ij} = \dfrac{\tilde{x}_{ij}}{\max_i u_{ij}}$
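The benefit-criterion normalization divides each TFN in a column by that column's largest upper bound; a minimal sketch:

```python
def normalize_benefit(column):
    """Normalize a column of TFNs (l, m, u) by the column's max upper bound."""
    u_max = max(u for (_, _, u) in column)
    return [(l / u_max, m / u_max, u / u_max) for (l, m, u) in column]
```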

3.3 Weighted Normalized Decision Matrix

Elementwise fuzzy multiplication yields:

$\tilde{v}_{ij} = \tilde{r}_{ij} \otimes \tilde{w}_j$

where $(l_1, m_1, u_1) \otimes (l_2, m_2, u_2) = (l_1 l_2,\; m_1 m_2,\; u_1 u_2)$.
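The elementwise TFN product is a one-liner; note it is only a valid TFN operation for the nonnegative bounds used here:

```python
def fuzzy_mul(a, b):
    """Elementwise product of two TFNs (l, m, u), assuming nonnegative bounds."""
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2])
```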

3.4 Ideal Solutions

  • Fuzzy positive ideal: $\tilde{v}_j^+ = (\max_i l_{ij}',\; \max_i m_{ij}',\; \max_i u_{ij}')$
  • Fuzzy negative ideal: $\tilde{v}_j^- = (\min_i l_{ij}',\; \min_i m_{ij}',\; \min_i u_{ij}')$, where $(l_{ij}', m_{ij}', u_{ij}') = \tilde{v}_{ij}$.

3.5 Fuzzy Distance Measures

The vertex method computes distance between TFNs:

$d(\tilde{a}, \tilde{b}) = \sqrt{\tfrac{1}{3}\left[(l_a - l_b)^2 + (m_a - m_b)^2 + (u_a - u_b)^2\right]}$

For each candidate $i$:

$D_i^+ = \sum_{j=1}^{m} d(\tilde{v}_{ij}, \tilde{v}_j^+), \quad D_i^- = \sum_{j=1}^{m} d(\tilde{v}_{ij}, \tilde{v}_j^-)$
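The vertex distance and the per-candidate distance sums above can be sketched as:

```python
import math

def vertex_distance(a, b):
    """Vertex-method distance between two TFNs a = (l, m, u) and b = (l, m, u)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3)

def ideal_distances(row, v_pos, v_neg):
    """D_i^+ and D_i^- for one candidate's weighted normalized row."""
    d_pos = sum(vertex_distance(v, p) for v, p in zip(row, v_pos))
    d_neg = sum(vertex_distance(v, q) for v, q in zip(row, v_neg))
    return d_pos, d_neg
```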

3.6 Closeness Coefficient

The closeness coefficient is then

$CC_i = \dfrac{D_i^-}{D_i^+ + D_i^-}$

Defuzzification may be performed with the centroid method $C(\tilde{v}_{ij}) = (l + m + u)/3$. Higher $CC_i$ values indicate more preferred candidates.
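The closeness coefficient and centroid defuzzification, exactly as defined above:

```python
def closeness(d_plus, d_minus):
    """Closeness coefficient CC_i = D_i^- / (D_i^+ + D_i^-)."""
    return d_minus / (d_plus + d_minus)

def centroid(tfn):
    """Centroid defuzzification of a TFN (l, m, u)."""
    l, m, u = tfn
    return (l + m + u) / 3
```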

4. DistilRoBERTa LLM for Textual Attribute Scoring

The DistilRoBERTa LLM is fine-tuned separately per attribute (Experience, Skills, Education, About) on a dataset of 100 expert-labeled profiles, expanded to 10,000 samples per attribute via data augmentation (paraphrasing, synonym substitution). Key parameters include:

  • Model: distilroberta-base (6 layers, 82M parameters)
  • Training: 18 epochs, $1 \times 10^{-5}$ learning rate, batch size 16, max sequence length 256
  • Labels: 3 classes (Poor, Fair, Excellent)
  • Loss: cross-entropy with knowledge distillation from a RoBERTa teacher

The model predicts a class $y_{ij}$ for each profile field, which is mapped to a numeric score $s_{ij} \in [1, 5]$, then to a TFN either by a small symmetric interval around $s_{ij}$ or via a linguistic-to-TFN lexicon.
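A sketch of the symmetric-interval score-to-TFN step; the spread `delta = 0.5` and the division by 5 to normalize into $[0, 1]$ are assumptions, not values from the paper:

```python
def score_to_tfn(score, delta=0.5, scale=5.0):
    """Map a numeric score in [1, 5] to a TFN on [0, 1] via a small
    symmetric interval around the score (delta is an assumed spread)."""
    l = max(score - delta, 1.0) / scale   # clamp at the score floor
    u = min(score + delta, scale) / scale # clamp at the score ceiling
    return (l, score / scale, u)
```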

5. Algorithmic Summary

The LLM-TOPSIS ranking pipeline executes as follows:

  1. For each candidate $i = 1 \dots N$:
    • For each criterion $j \in \{$skill, exp, edu, about$\}$:
      • Compute the class label $y_{ij} = M_j(x_i^{(j)})$
      • Map the class to the numeric score $s_{ij}$, then to the TFN $\tilde{x}_{ij}$
    • Assemble $\tilde{x}_i = [\tilde{x}_{i1}, \dots, \tilde{x}_{i4}]$
  2. Construct the fuzzy decision matrix $\tilde{X}$
  3. Normalize $\tilde{X}$ and apply the fuzzy weights $\{\tilde{w}_j\}$
  4. Compute $\tilde{v}_j^+$, $\tilde{v}_j^-$, $D_i^+$, and $D_i^-$ for each candidate
  5. Compute $CC_i = D_i^- / (D_i^+ + D_i^-)$
  6. Rank candidates by descending $CC_i$
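Steps 2–6 of the summary above can be condensed into one function; a minimal sketch assuming all criteria are benefit criteria, as in the paper:

```python
import math

def fuzzy_topsis(matrix, weights):
    """Rank alternatives from a fuzzy decision matrix of TFN triples.

    matrix: N x m list of TFNs (l, m, u); weights: m TFNs.
    Returns (ranking of row indices by descending CC_i, list of CC_i).
    """
    n, m = len(matrix), len(weights)
    # Normalize each column by its max upper bound (benefit criteria).
    u_max = [max(matrix[i][j][2] for i in range(n)) for j in range(m)]
    # Weighted normalized matrix via elementwise TFN product.
    v = [[tuple(matrix[i][j][k] / u_max[j] * weights[j][k] for k in range(3))
          for j in range(m)] for i in range(n)]
    # Fuzzy positive and negative ideal solutions, componentwise.
    v_pos = [tuple(max(v[i][j][k] for i in range(n)) for k in range(3)) for j in range(m)]
    v_neg = [tuple(min(v[i][j][k] for i in range(n)) for k in range(3)) for j in range(m)]
    # Vertex-method distances to the ideals, summed over criteria.
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3)
    cc = []
    for i in range(n):
        d_pos = sum(dist(v[i][j], v_pos[j]) for j in range(m))
        d_neg = sum(dist(v[i][j], v_neg[j]) for j in range(m))
        cc.append(d_neg / (d_pos + d_neg))
    return sorted(range(n), key=lambda i: cc[i], reverse=True), cc
```

With one clearly dominant candidate, the dominant row receives $CC = 1$ and the dominated row $CC = 0$.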

6. Empirical Evaluation

6.1 DistilRoBERTa Classification Performance

  • Experience attribute: 91% accuracy (Precision = 0.95/1.00/0.99, Recall = 1.00/0.36/0.99 for Poor/Fair/Excellent)
  • Overall attribute: 91% accuracy ($F_1 \approx$ 1.00/0.87/0.85 for Poor/Fair/Excellent)

6.2 Fuzzy-TOPSIS Ranking Quality

Using DistilRoBERTa-generated scores:

  • Mean Average Precision (MAP): 0.99
  • Normalized Discounted Cumulative Gain (NDCG): 0.926
  • Mean Reciprocal Rank (MRR): > 0.999
  • Root Mean Square Error (RMSE): 0.043
  • Mean Absolute Error (MAE): 0.036
  • Cosine similarity: 0.983

Comparative analysis with human expert rankings yields cosine similarity of 0.981 and NDCG of 0.911. In a sample of 10 senior software engineering candidates, the system's rankings exhibited top-spot agreement with the expert panel and achieved cosine similarity > 0.98 with human rankings, indicating a high degree of alignment.
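The cosine-similarity figures reported above compare ranking-score vectors in the standard way; a minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```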

7. Significance and Future Prospects

The LLM-TOPSIS approach demonstrates the viability of combining transformer-based profile assessment with a fuzzy-logic MCDM framework for personnel selection tasks. Its capacity to encode and reason about subjectivity and imprecision in candidate evaluation is evidenced by empirical results: classification accuracy of ≥91% on key attributes and near-perfect concordance with human expert rankings. The framework enhances recruitment by supporting scalability, consistency, and bias minimization. Proposed future directions include dataset expansion, improved interpretability, and validation in live recruitment scenarios to assess practical impact and robustness (Hoque et al., 30 Jan 2026).
