Quantify bug-label misclassification in CodeRed and UOwns and compare with open source

Determine the fraction of mislabeled 'bug'-type issues in the CodeRed dataset (39 proprietary projects) and the UOwns dataset (40 proprietary projects), and ascertain whether bug-label correctness in proprietary issue trackers is higher than in open-source projects.

Background

The study measures defect counts using Jira issues labeled as 'bug'. Prior work by Herzig et al. indicates that issues in open-source software projects are often mislabeled, which raises concerns about the validity of defect counts when issue labels are used as proxies for true defects.

For the two proprietary datasets analyzed, CodeRed (39 projects) and UOwns (40 projects), the authors explicitly state that they cannot assess the fraction of mislabeled issues. They suspect that label correctness is higher in proprietary projects than in open-source projects, but acknowledge that this remains unknown. Establishing the mislabeling rate and testing whether label correctness really differs between the two settings are both necessary to strengthen construct validity for defect-based analyses in proprietary contexts.
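
One possible way to approach both quantities, offered only as a sketch and not as the authors' method, is to manually re-label a random sample of 'bug' issues and then estimate the mislabeling fraction with a confidence interval. The Python below assumes such a sample exists. The function names (wilson_interval, two_proportion_z) and all counts in the usage lines are hypothetical placeholders, not measurements from CodeRed, UOwns, or any open-source dataset.

```python
import math


def wilson_interval(mislabeled, sample_size, z=1.96):
    """Wilson score interval for a mislabeling fraction.

    mislabeled  -- sampled 'bug' issues judged not to be true defects
    sample_size -- number of 'bug' issues that were manually re-labeled
    z           -- normal quantile (1.96 gives roughly 95% coverage)
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= mislabeled <= sample_size:
        raise ValueError("mislabeled must lie in [0, sample_size]")
    p = mislabeled / sample_size
    denom = 1 + z ** 2 / sample_size
    center = (p + z ** 2 / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)
    ) / denom
    return center - half, center + half


def two_proportion_z(mis_a, n_a, mis_b, n_b):
    """Pooled two-proportion z statistic for comparing two mislabeling rates.

    A negative value means sample A has the lower mislabeling rate.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a, p_b = mis_a / n_a, mis_b / n_b
    pooled = (mis_a + mis_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # both samples are all-correct or all-mislabeled
    return (p_a - p_b) / se


# Hypothetical counts, invented purely to show the calls.
low, high = wilson_interval(mislabeled=12, sample_size=200)
z_stat = two_proportion_z(mis_a=12, n_a=200, mis_b=60, n_b=200)
```

The Wilson interval is used here because it behaves sensibly when the observed fraction is close to 0, which is the case of interest if proprietary labels are in fact mostly correct. The z statistic only compares two samples; whether the open-source sample is comparable to the proprietary one is a separate validity question.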

References

We cannot assess the fraction of mislabelled issues in the CodeRed and UOwns datasets. We suspect that the label correctness is generally higher in proprietary projects, but we cannot know.

Increasing, not Diminishing: Investigating the Returns of Highly Maintainable Code (2401.13407 - Borg et al., 2024), in Threats to Validity, Construct validity (Section 7)