Using Language Models for Enhancing the Completeness of Natural-language Requirements (2302.04792v1)

Published 9 Feb 2023 in cs.SE

Abstract: [Context and motivation] Incompleteness in natural-language requirements is a challenging problem. [Question/problem] A common technique for detecting incompleteness in requirements is checking the requirements against external sources. With the emergence of language models such as BERT, an interesting question is whether language models are useful external sources for finding potential incompleteness in requirements. [Principal ideas/results] We mask words in requirements and have BERT's masked language model (MLM) generate contextualized predictions for filling the masked slots. We simulate incompleteness by withholding content from requirements and measure BERT's ability to predict terminology that is present in the withheld content but absent in the content disclosed to BERT. [Contribution] BERT can be configured to generate multiple predictions per mask. Our first contribution is to determine how many predictions per mask strike an optimal trade-off between effectively discovering omissions in requirements and the level of noise in the predictions. Our second contribution is devising a machine learning-based filter that post-processes predictions made by BERT to further reduce noise. We empirically evaluate our solution over 40 requirements specifications drawn from the PURE dataset [1]. Our results indicate that: (1) predictions made by BERT are highly effective at pinpointing terminology that is missing from requirements, and (2) our filter can substantially reduce noise from the predictions, thus making BERT a more compelling aid for improving completeness in requirements.
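The masked-prediction step described in the abstract can be sketched with Hugging Face's fill-mask pipeline. This is a minimal illustration, not the paper's implementation: the example requirement sentence and the choice of 5 predictions per mask are assumptions made here for demonstration; the paper empirically tunes the number of predictions per mask.

```python
# Hedged sketch of masked-word prediction with BERT's MLM head.
# The requirement text and top_k=5 are illustrative assumptions.
from transformers import pipeline

# Load BERT's masked language model; top_k controls how many
# candidate fillers are generated per masked slot.
fill_mask = pipeline("fill-mask", model="bert-base-uncased", top_k=5)

# A hypothetical requirement with one word masked out.
requirement = "The system shall encrypt all [MASK] transmitted over the network."
predictions = fill_mask(requirement)

# Each prediction carries the candidate token and a confidence score;
# in the paper's setup, such candidates are then filtered to reduce noise.
for p in predictions:
    print(f"{p['token_str']:>12}  score={p['score']:.3f}")
```

In the paper's pipeline, candidates like these are compared against terminology in withheld content to measure how well BERT surfaces missing concepts, with a learned filter pruning noisy predictions.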

Authors (3)
  1. Dipeeka Luitel (3 papers)
  2. Shabnam Hassani (6 papers)
  3. Mehrdad Sabetzadeh (28 papers)
Citations (10)
