HGNNLink: Heterogeneous GNN for Traceability

Updated 10 September 2025

HGNNLink is a heterogeneous graph neural network model that fuses textual embeddings and code dependencies to recover traceability links between requirements and code artifacts.
It employs transformer-based models like RoBERTa for requirements and GraphCodeBERT for code, using graph message passing to integrate both semantic and structural information.
Empirical evaluations on Java projects demonstrate that HGNNLink significantly outperforms text-only methods, establishing state-of-the-art performance in NL–PL traceability.

HGNNLink refers to a heterogeneous graph neural network-based model for software traceability link recovery, particularly focused on the requirements-to-code task, where the goal is to identify and recover traceability links between natural language artifacts (such as requirements) and programming language artifacts (such as source code files). While earlier methods in software traceability relied predominantly on textual similarity, HGNNLink incorporates additional code dependency information to bridge the semantic gap inherent to natural language–programming language (NL–PL) artifact pairs. The method is situated at the intersection of graph representation learning and software engineering, providing state-of-the-art performance on requirements-to-code traceability link recovery (TLR) prior to the integration of more extensive multi-strategy approaches (Zou et al., 6 Sep 2025).

1. Problem Context: Software Traceability and the Limitations of Textual Similarity

Software traceability link recovery is the process of identifying relationships (links) between related software engineering artifacts, typically from requirements to source code. The core challenge arises from the semantic gap between natural language texts (requirements) and programming language constructs (code files or methods). Previous models have relied primarily on textual similarity measures, such as cosine similarity over TF-IDF vectors or semantic similarity derived from pretrained NLP models. However, large-scale empirical evaluations demonstrate that, for NL–PL artifact pairs, lexical similarity is a weak proxy for true semantic correspondence because of code-specific vocabulary, syntax, and structure.
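The weakness of lexical matching can be illustrated with a small text-only baseline. The following is a minimal sketch, not the baseline used in the cited study: the tokenizer, the smoothed IDF formula, the requirement text, and both Java snippets are hypothetical and chosen only to show how a file that implements a requirement can share no vocabulary with it.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on non-alphanumeric characters (a deliberately naive tokenizer).
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()

def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict: term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF keeps weights positive even for terms found in every document.
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        })
    return vectors

def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical requirement and code snippets (not taken from any evaluated project).
requirement = "The user shall be able to reset a forgotten password by email"
code_files = {
    # Assumed to be the file that actually implements the requirement.
    "CredentialRecoveryService.java":
        'class CredentialRecoveryService { void sendToken(Account acct) '
        '{ mailer.dispatch(acct.getAddress(), tokenGen.next()); } }',
    # Assumed to be unrelated, but it happens to share the word "password".
    "PasswordFormatter.java":
        'class PasswordFormatter { String mask(String password) '
        '{ return password.replaceAll(".", "*"); } }',
}

vecs = tfidf_vectors([requirement] + list(code_files.values()))
scores = {name: cosine(vecs[0], v) for name, v in zip(code_files, vecs[1:])}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In this toy setup the implementing file scores zero because it shares no tokens with the requirement, while the unrelated file scores above zero on the strength of a single shared word.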

Recent work highlights the inadequacy of pure text-matching in the requirements-to-code scenario and motivates the use of structural and operational relationships (such as method calls, inheritance, or code dependency) to supplement artifact embeddings and message passing within a neural network framework (Zou et al., 6 Sep 2025).

2. Methodological Foundations of HGNNLink

HGNNLink operationalizes the TLR task using a heterogeneous graph neural network (HGNN), in which the nodes represent both requirements (NL) and code files (PL). The central paradigm is to encode both text-based and structural information in a unified graph representation.

Node Representation: Requirements and code files are encoded via respective transformers (e.g., RoBERTa for requirements, GraphCodeBERT for code artifacts), resulting in high-dimensional semantic node embeddings. These embeddings are then used as feature vectors for graph nodes.
Edge Construction: Edges are constructed in the heterogeneous graph according to two main principles:
- Textual Similarity: If the TF-IDF-based or transformer-based similarity between a requirement and a code file exceeds a threshold, an edge may be introduced.
- Code Dependency: Importantly, structural code dependencies (method calls, inheritance, imports) are also included as edges among code nodes, providing a means for the message passing to exploit the hierarchical and collaborative organization of code bases.
Graph Neural Network Architecture: The HGNN model propagates information via message passing along both text-similarity-based and dependency-based edges. Attention-based mechanisms or type-specific aggregators are often used to modulate contributions from different edge types, allowing the network to learn from both direct semantic overlap and indirect relational context.
Prediction Objective: The final prediction is made via inner-product or feed-forward layers over node representations, followed by a sigmoid or softmax activation to output the traceability confidence score for requirement–code pairs.
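The two edge-construction principles can be sketched as a small typed graph builder. This is an illustrative sketch only: the `HeteroGraph` class, the relation names `similar_to` and `depends_on`, the threshold value of 0.3, and the example artifacts are all assumptions rather than details reported for HGNNLink.

```python
from dataclasses import dataclass, field

@dataclass
class HeteroGraph:
    # Node ids grouped by node type.
    nodes: dict = field(default_factory=lambda: {"req": [], "code": []})
    # Edge lists grouped by (source type, relation, target type).
    edges: dict = field(default_factory=dict)

    def add_edge(self, src_type, relation, dst_type, src, dst):
        self.edges.setdefault((src_type, relation, dst_type), []).append((src, dst))

def build_graph(requirements, code_files, similarity, dependencies, threshold=0.3):
    # `similarity` maps (requirement, code file) -> precomputed score;
    # `dependencies` is a list of (dependent file, file it depends on) pairs.
    graph = HeteroGraph()
    graph.nodes["req"] = list(requirements)
    graph.nodes["code"] = list(code_files)
    # Textual-similarity edges: only pairs whose score exceeds the threshold.
    for r in requirements:
        for c in code_files:
            if similarity.get((r, c), 0.0) > threshold:
                graph.add_edge("req", "similar_to", "code", r, c)
    # Code-dependency edges among code nodes (calls, inheritance, imports).
    for dependent, dependency in dependencies:
        graph.add_edge("code", "depends_on", "code", dependent, dependency)
    return graph

# Hypothetical inputs.
similarity = {("R1", "LoginController"): 0.62, ("R1", "TokenStore"): 0.08}
dependencies = [("LoginController", "TokenStore")]
graph = build_graph(["R1"], ["LoginController", "TokenStore"], similarity, dependencies)
print(graph.edges)
```

Here `TokenStore` has no direct edge to `R1`, yet it stays reachable from the requirement through `LoginController`, which is the path that message passing can exploit.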

3. Algorithmic Contributions and Key Mechanisms

A core innovation of HGNNLink is the augmentation of basic textual similarity with signals derived from code dependencies:

Code Dependency Edge Augmentation: By incorporating edges corresponding to real code structure (function calls, class relationships), the model can leverage the fact that semantically related requirements often link to functionally related code artifacts—even when their text content is divergent. This enables the message-passing mechanism to propagate relevant context from connected code modules, increasing the model’s ability to bridge NL–PL semantic gaps.
Heterogeneous Graph Message Passing: The model treats requirement and code nodes as different types (with potentially different feature spaces and aggregation kernels), relying on the graph’s heterogeneity to tune information flow. Structural edges allow information from confirmed or strongly related code files to reach requirement nodes indirectly, compensating for weak direct text similarity.
Loss and Training: The network is trained via standard link prediction objectives, often with a binary cross-entropy (BCE) loss comparing the predicted link probabilities and ground-truth links extracted from the software project’s verified traceability data.
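The interplay between dependency-based propagation and the BCE objective can be illustrated numerically. The sketch below is a strong simplification and not the HGNNLink architecture: it replaces learned, type-specific aggregation with one round of fixed-weight mean aggregation, and the 2-dimensional embeddings, the `self_weight` of 0.5, and the labels are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def propagate(embeddings, incoming, self_weight=0.5):
    # One round of mean aggregation: each node mixes its own vector with the
    # average of the nodes that send it messages. Real HGNN layers learn
    # these weights; a fixed mix is used here only for illustration.
    updated = {}
    for node, vec in embeddings.items():
        senders = incoming.get(node, [])
        if not senders:
            updated[node] = list(vec)
            continue
        mean = [sum(embeddings[s][k] for s in senders) / len(senders)
                for k in range(len(vec))]
        updated[node] = [self_weight * x + (1.0 - self_weight) * m
                         for x, m in zip(vec, mean)]
    return updated

def bce(probs, labels, eps=1e-12):
    # Mean binary cross-entropy over candidate requirement-code pairs.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Hypothetical embeddings: file A resembles the requirement, file B does not,
# but A depends on B, so B receives a message from A.
req = [1.0, 0.0]
code = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
incoming = {"B": ["A"]}
labels = [1, 1]  # assume both A and B are true links

before = [sigmoid(dot(req, code[c])) for c in ("A", "B")]
code_after = propagate(code, incoming)
after = [sigmoid(dot(req, code_after[c])) for c in ("A", "B")]

print("scores before:", before, "BCE:", bce(before, labels))
print("scores after: ", after, "BCE:", bce(after, labels))
```

In this toy case the score of B rises once it absorbs context from A, and the loss against the assumed ground truth falls accordingly.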

4. Empirical Results and Comparative Evaluation

In benchmarking studies on twelve open-source Java projects, HGNNLink established itself as the prior state-of-the-art for requirements-to-code TLR, serving as the baseline for further improvements using multi-strategy augmentation (Zou et al., 6 Sep 2025). Across all projects, HGNNLink outperformed pure text-matching techniques and demonstrated that integrating code dependency achieves robust improvements in metrics such as F1-score.
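For reference, the F1-score over recovered links is the harmonic mean of precision and recall. The sketch below computes it for sets of requirement–code pairs; the link sets are hypothetical and do not come from the benchmark projects.

```python
def precision_recall_f1(predicted, gold):
    # Links are (requirement id, code file) pairs; duplicates are ignored.
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical ground-truth and predicted links.
gold = {("R1", "A.java"), ("R1", "B.java"), ("R2", "C.java"), ("R3", "D.java")}
predicted = {("R1", "A.java"), ("R2", "C.java"), ("R2", "D.java")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With two of three predictions correct and two of four true links recovered, precision is 2/3, recall is 1/2, and F1 is 4/7.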

A summary of performance as reported shows:

Model	Reported F1-Score Gain	Signals Used
HGNNLink	Significant (vs. text-only methods)	Code Dependency
HGT-All	+3.68% (vs. HGNNLink)	Dependency, Feedback, Fine-Grained Semantic
Gemini-All	+8.84% (vs. HGNNLink)	Multi-strategy Prompt

It was subsequently shown that combining additional auxiliary information—such as simulated user feedback links and fine-grained semantic relationships—offers further, statistically significant accuracy improvements, establishing the value of a multi-signal approach for NL–PL TLR.

5. Practical Implications and Limitations

HGNNLink demonstrates that the integration of domain-specific structural information into heterogeneous graph encoding is a powerful strategy for requirements-to-code traceability. By leveraging code dependencies, the method aligns with the operational realities of software systems, where code modules tend to be semantically related to requirements through both explicit and implicit architecture.

However, empirical results suggest that code dependency alone, while impactful, may not capture all the nuances of true artifact linkage, especially in cases with cross-cutting concerns or multiple layers of abstraction. The method is also dependent on the availability of reliable code dependency information and may be less effective in highly entangled, weakly modular, or non-object-oriented codebases.

6. Successors and Further Developments

Recent studies show that further integration of heterogeneous auxiliary signals, such as simulated user feedback (i.e., the annotation of true trace links) and fine-grained semantic matches, leads to statistically significant gains (e.g., a 3.68% F1-score increase with HGT-All and 8.84% with Gemini-All), confirming that software TLR relies on a confluence of heterogeneous data signals (Zou et al., 6 Sep 2025). These models implement additional edge types in the heterogeneous graph (for supervised GNNs) or supplement LLM prompts with structured, domain-specific cues (for unsupervised prompting), surpassing the original HGNNLink design.

This suggests that the next generation of code–requirements traceability systems should use integrated multi-strategy frameworks, combining graph-based learning on engineering structure, high-fidelity semantic encoding, and interaction feedback to resolve the inherent semantic gaps in NL–PL artifact tracing.

7. Mathematical Highlights and Architectures

The following expressions capture the embedding and prediction mechanisms central to HGNNLink:

Node Feature Encoding:
- Requirements (NL): $h_{r_i}^{(0)} = \mathrm{RoBERTa}(r_i)$
- Code (PL): $h_{c_j}^{(0)} = \mathrm{GraphCodeBERT}(c_j)$
Edge Attention in Heterogeneous GNN (HGT formalism):

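$\mathrm{ATT}\text{-}\mathrm{Head}_i (s, t) = \left( Q_i(t)^T \cdot W_\phi^{(ATT)} \cdot K_i(s) \right) \cdot \frac{ \mu(\tau(s), \phi, \tau(t)) }{ \sqrt{d_h} }$

where $Q_i$ and $K_i$ are type-specific query and key projections for head $i$, $W_\phi^{(ATT)}$ is the meta-relation matrix for edge type $\phi$, $\tau(\cdot)$ gives the type of a node, $d_h$ is the per-head dimension, and $\mu$ is a learnable prior that scales the score for each (source type, edge type, target type) triple. The scores for all source nodes of a target $t$ are then normalized with a softmax.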
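A single attention head can be traced by hand with a few small vectors. The sketch below follows the original HGT definition, in which the prior $\mu$ multiplies the score before division by $\sqrt{d_h}$; that placement, the 4-dimensional vectors, the identity meta-relation matrix, and the $\mu$ values are all assumptions made for illustration.

```python
import math

def matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_score(q_t, k_s, w_att, mu, d_h):
    # (q_t^T W k_s) scaled by the meta-relation prior mu and by sqrt(d_h).
    return dot(q_t, matvec(w_att, k_s)) * mu / math.sqrt(d_h)

def softmax(xs):
    # Numerically stable softmax over the source nodes of one target.
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

d_h = 4
identity = [[1.0 if i == j else 0.0 for j in range(d_h)] for i in range(d_h)]

# Hypothetical target: a code node with query projection q_t.
q_t = [1.0, 0.0, 1.0, 0.0]

# Hypothetical sources: (key projection, meta-relation matrix, prior mu).
sources = {
    "requirement via similar_to": ([1.0, 0.0, 0.0, 0.0], identity, 1.0),
    "code via depends_on":        ([0.0, 0.0, 1.0, 0.0], identity, 2.0),
}

scores = [attention_score(q_t, k, w, mu, d_h) for k, w, mu in sources.values()]
weights = softmax(scores)
for name, s, a in zip(sources, scores, weights):
    print(f"{name}: score={s:.2f} weight={a:.3f}")
```

Both sources match the query equally well before scaling, so the larger prior on the dependency relation alone gives that source the higher attention weight.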

Final Inner-Product Prediction:

$\hat{y}_{ij} = \sigma(r_i^T c_j)$

with $\sigma$ denoting the sigmoid function, and $r_i$ and $c_j$ the learned embeddings for requirement and code nodes.
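In practice, the inner-product score can be used to rank candidate code files for a given requirement. The following is a minimal sketch with hypothetical 3-dimensional embeddings and file names; in a trained model these vectors would come from the final GNN layer.

```python
import math

def predict_link(r_i, c_j):
    # y_hat = sigmoid(r_i^T c_j)
    z = sum(a * b for a, b in zip(r_i, c_j))
    return 1.0 / (1.0 + math.exp(-z))

def rank_candidates(r_i, code_embeddings):
    # Score every code file against one requirement, highest confidence first.
    scored = [(name, predict_link(r_i, emb)) for name, emb in code_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical learned embeddings.
r_1 = [0.5, -1.0, 2.0]
code_embeddings = {
    "A.java": [1.0, 0.0, 1.0],   # inner product  2.5
    "B.java": [0.0, 1.0, 0.0],   # inner product -1.0
    "C.java": [0.0, 0.0, 0.0],   # inner product  0.0
}

ranked = rank_candidates(r_1, code_embeddings)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A zero inner product maps to a confidence of 0.5, so files with positive alignment rank above that point and files with negative alignment rank below it.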

Summary

HGNNLink is a heterogeneous graph neural network-based approach targeting the software traceability link recovery problem, with specific focus on requirements-to-code tasks involving NL–PL artifact pairs (Zou et al., 6 Sep 2025). Its key distinction from purely textual methods is the incorporation of code dependency information in the construction and message passing of the heterogeneous graph. This augmentation yields improved traceability performance, although subsequent research shows that comprehensive multi-strategy integration using additional structural and semantic signals achieves even greater improvements. The method serves as an important baseline and demonstrates the power of structural signals over text-only matching for NL–PL TLR scenarios.


References (1)

Natural Language-Programming Language Software Traceability Link Recovery Needs More than Textual Similarity (2025)
