SmartPatchLinker: An Open-Source Tool to Linked Changes Detection for Code Review

Published 5 Apr 2026 in cs.SE | (2604.04045v1)

Abstract: In large software ecosystems, semantically related code changes, such as alternative solutions or overlapping modifications, are often discovered only days after submission, leading to duplicated effort and delayed reviews. We present SmartPatchLinker, a browser-based tool that supports the discovery of related patches directly within the code review interface. SmartPatchLinker is implemented as a lightweight Chrome extension with a local inference backend and integrates with Gerrit to retrieve and rank semantically linked changes when a reviewer opens a patch. The tool allows reviewers to configure the search scope, view ranked candidates with confidence indicators, and examine related work without leaving their workflow or relying on server-side installations. We perform both usefulness and usability evaluations to study how SmartPatchLinker can support reviewers during code review. SmartPatchLinker is open source, and its source code, Docker containers, and the replication package used in our evaluation are publicly available on GitHub at https://github.com/islem-kms/gerrit-chrome-extension . A video demonstrating the tool is also available online at https://drive.google.com/drive/folders/1MCcTj5OSlT7lHVBFMq5m9iatas2joaGb


Authors (4)

Summary

The paper introduces a lightweight tool that integrates SBERT-based semantic embedding to detect linked code changes directly within Gerrit.
It combines semantic similarity, file path metrics, and temporal features to outperform traditional baselines in Recall@K and MRR metrics.
Experiments on Qt, Android, and OpenStack, together with sub-two-minute end-to-end usage, suggest that the tool can reduce review latency.

SmartPatchLinker: An Open-Source Framework for Semantic Patch Linkage Detection in Code Review

Problem Statement and Motivations

Large-scale software ecosystems frequently encounter semantically related but independently submitted code changes, termed "linked changes." Failing to identify these links promptly during code review leads to redundant implementations, conflict-resolution overhead, and increased review latency. Prior approaches that rely on static similarity heuristics, such as TF-IDF on summaries and file-path matching, are ineffective in the face of semantic variation and alternative solution strategies. LLM-based solutions offer greater representational power, but they are typically realized as heavyweight, server-side bots, which are poorly suited to the seamless, privacy-sensitive, real-time interaction that reviewers need inside modern code review tools.

System Overview

SmartPatchLinker addresses these deficits via a browser-based tool that injects semantic patch linkage detection capabilities directly into the Gerrit code review interface. The core architecture decouples a lightweight Chrome extension UI from a local Python/Flask backend responsible for code analysis and inference.

Figure 1: SmartPatchLinker system architecture, showing seamless integration between Gerrit, the client-side Chrome extension, and the inference backend.

Upon page load, the extension automatically detects active Gerrit sessions, extracts patch context, and enables reviewers to configure analysis parameters such as temporal window and Top-K retrieval. The extracted context is securely communicated to the local backend, which executes candidate selection and similarity-based ranking. Predictions are rendered in situ, annotated with confidence indicators to support rapid reviewer judgment without the need to disrupt workflow or transfer sensitive data externally.

Figure 2: SmartPatchLinker's UI, showing reviewer interaction with time window, Top-K configuration, and result display within Gerrit's change view.

Model Architecture and Feature Engineering

SmartPatchLinker’s backend leverages Sentence-BERT (all-MiniLM-L6-v2), enabling deep semantic embedding of patch titles and descriptions. For each candidate pair within the specified temporal window, the model extracts a feature vector comprising:

SBERT-based semantic cosine similarity,
File path similarity metrics (longest common prefix/suffix, Jaccard index on file lists),
Temporal and structural meta-features (time delta, difference in file count).
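
The three feature groups listed above can be illustrated with a minimal sketch. It is not the tool's implementation: the `Patch` record, its field names, the component-wise reading of "longest common prefix/suffix", and the feature order are all assumptions made for illustration. The embeddings are taken as given vectors; in the described system they would come from the SBERT model.

```python
from dataclasses import dataclass
from datetime import datetime
from math import sqrt
from typing import List


@dataclass
class Patch:
    """Hypothetical patch record; the field names are assumptions."""
    embedding: List[float]  # text embedding, e.g. produced by an SBERT model
    files: List[str]        # paths touched by the patch
    created: datetime       # submission timestamp


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard index of two file lists, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def shared_components(p, q, from_end=False):
    """Count matching path components from the start (or end) of two paths."""
    x, y = p.split("/"), q.split("/")
    if from_end:
        x, y = x[::-1], y[::-1]
    count = 0
    for s, t in zip(x, y):
        if s != t:
            break
        count += 1
    return count


def best_shared(a, b, from_end=False):
    """Longest shared prefix (or suffix) over all file pairs of two patches."""
    return max((shared_components(p, q, from_end) for p in a for q in b), default=0)


def feature_vector(x, y):
    """Pairwise features in a fixed (assumed) order for a downstream classifier."""
    return [
        cosine(x.embedding, y.embedding),                        # semantic similarity
        jaccard(x.files, y.files),                               # file-list overlap
        best_shared(x.files, y.files),                           # common path prefix
        best_shared(x.files, y.files, from_end=True),            # common path suffix
        abs((x.created - y.created).total_seconds()) / 86400.0,  # time delta in days
        abs(len(x.files) - len(y.files)),                        # file-count difference
    ]
```

Keeping the vector order fixed matters because a classifier trained on these features expects the same column layout at inference time.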

Candidate selection is scoped by a reviewer-tunable window (default $\delta = 14$ days) to balance efficiency and recall. A Random Forest classifier, trained on labeled patch linkages, produces confidence scores. Top-K candidates are returned, prioritizing the most plausible semantic linkages.
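
The selection and ranking step described above might look roughly like the following sketch. The `Change` record, the `rank_linked` function, and the symmetric treatment of the time window are assumptions, not details confirmed by the paper. The `score` argument is a stand-in for the classifier's confidence (for example, a positive-class probability from a trained Random Forest); any callable returning a number works here.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical change record; field names are assumptions."""
    change_id: str
    created: datetime


def rank_linked(query, pool, score, window_days=14, top_k=5):
    """Return up to top_k (change_id, confidence) pairs, highest confidence first.

    Only changes created within window_days of the query (before or after,
    an assumption) are scored, which keeps the candidate set small.
    """
    window = timedelta(days=window_days)
    candidates = [
        c for c in pool
        if c.change_id != query.change_id
        and abs(c.created - query.created) <= window
    ]
    scored = [(c.change_id, score(query, c)) for c in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A narrower window scores fewer pairs and responds faster, at the risk of missing linked changes submitted further apart; a wider window trades speed for recall.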

Experimental Evaluation

SmartPatchLinker was evaluated on three major OSS ecosystems (Qt, Android, OpenStack), using datasets originally compiled by Wang et al. [wang2021automatic]. The experiments compare it against three baseline variants: text-only, file-location-only, and their static combination.

Quantitative results show that SmartPatchLinker outperforms the baselines in both Recall@K and MRR, especially at low K values, which matter most for interactive code review. For instance, on the Qt dataset with a 2-day window, SmartPatchLinker attains an MRR of 0.60 against 0.52 for the best baseline, and this margin persists across all projects and window settings.
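
For readers unfamiliar with the two metrics, the sketch below computes them under their standard definitions. The paper's exact variants may differ (for example, Recall@K is sometimes reported as a per-query hit rate), so the function names and definitions here are assumptions for illustration only.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k of the ranking."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs."""
    values = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(values) / len(values) if values else 0.0
```

Under this definition, an MRR of 0.60 means that the first correct link sits, on a reciprocal-rank average, between the first and second position of the returned list.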

Figure 3: Recall@K: SmartPatchLinker versus baselines, showing consistently higher recall across all K for Qt, Android, and OpenStack.

Notably, the model achieves high recall and ranks relevant linkages near the top even when lexical and path overlap is minimal, which supports the benefit of semantic embedding. The system’s suitability for interactive use is evidenced by sub-two-minute end-to-end usage, encompassing the detection, configuration, prediction, and inspection phases.

Workflow Integration and Usability

Reviewers access SmartPatchLinker via a Chrome extension popup, seamlessly configuring temporal and Top-K parameters before querying for linked changes. Results, including confidence and semantic badges, are displayed in the extension UI without leaving Gerrit.

Figure 4: Reviewer UI for setting the time window and Top-K retrieval count, tailoring results for immediate analysis.

Figure 5: Example Top-K results shown with percentage confidence, directly within the review session.

This workflow eliminates context-switching, supports dynamic exploration, and removes the administrative and privacy barriers typical of server-based or bot-style implementations.

Implications and Future Directions

SmartPatchLinker provides strong empirical evidence that semantic feature fusion and SBERT-based similarity yield substantive practical gains in early patch linkage detection. Its non-intrusive architecture and private local inference make it suited for organizations wary of code/data leakage. The significant recall improvements at low K suggest material reductions in duplicated effort and review latency for large engineering teams.

Possible future work includes cross-platform generalization (support for GitHub/GitLab), richer semantic reasoning across multi-branch and multi-repository linkages, and further integration with agentic or LLM-based assistants. Enhanced dependency summarization and alternative solution surfacing could further shift review dynamics toward more informed and context-aware decision-making. The practical fusion of real-time, privacy-preserving deployment with state-of-the-art semantic modeling positions SmartPatchLinker as a canonical approach for next-generation code review augmentation.

Conclusion

SmartPatchLinker advances the state of semantic patch linkage detection in code review through the fusion of SBERT-derived features and lightweight browser-native deployment. Its empirical results demonstrate marked superiority over traditional and hybrid baseline methods, particularly in scenarios with limited lexical or structural overlap. The tool's privacy-aware, workflow-preserving interaction model facilitates adoption in industrial review settings. Future extensions integrating LLM capabilities and supporting broader platforms present clear research and engineering opportunities for more contextually aware, automated code review support.

Reference: For full methodological details and supplementary resources, see "SmartPatchLinker: An Open-Source Tool to Linked Changes Detection for Code Review" (2604.04045).
