A deep learning model for estimating story points (1609.00489v2)

Published 2 Sep 2016 in cs.SE, cs.LG, and stat.ML

Abstract: Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done for estimation in agile projects, especially estimating user stories or issues. Story points are the most common unit of measure used for estimating the effort involved in implementing a user story or resolving an issue. In this paper, we offer for the first time a comprehensive dataset for story points-based estimation that contains 23,313 issues from 16 open source projects. We also propose a prediction model for estimating story points based on a novel combination of two powerful deep learning architectures: long short-term memory and recurrent highway network. Our prediction system is end-to-end trainable from raw input data to prediction outcomes without any manual feature engineering. An empirical evaluation demonstrates that our approach consistently outperforms three common effort estimation baselines and two alternatives in both Mean Absolute Error and the Standardized Accuracy.

Summary

The paper introduces a novel LD-RNN model that estimates story points using deep learning architectures.
It curates a comprehensive dataset of over 23,000 issues from 16 open source projects for robust empirical evaluation.
The end-to-end model requires no manual feature engineering and achieves lower prediction error than standard baselines.

Analysis of Deep Learning for Agile Effort Estimation

The paper "A deep learning model for estimating story points" by Morakot Choetkiertikul et al. presents an approach to estimating software development effort in agile projects. Its primary contribution is the formulation and empirical evaluation of a deep learning model that estimates story points, which are widely used in agile methodologies to assess the relative effort required to implement a user story or resolve an issue.

Summary of Research Contributions

Data Curation: The paper introduces a dataset of 23,313 issues from 16 open source projects. The authors present it as the first comprehensive dataset for story point estimation at the issue level rather than the project level.
Proposed Model: The paper proposes the Long-Deep Recurrent Neural Network (LD-RNN), a prediction model that combines two deep learning architectures: Long Short-Term Memory (LSTM) networks and Recurrent Highway Networks (RHN). The combined model learns from the textual descriptions of issues to predict their story points.
End-to-End Approach: The LD-RNN model is trained end to end, from raw text inputs to predicted story points, without manual feature engineering. The model learns semantic representations of the issue text automatically, which can improve prediction accuracy while reducing manual overhead.
Empirical Evaluation: The empirical evaluation shows that the LD-RNN model outperforms three baseline estimators (Random Guessing, Mean Effort, and Median Effort) and two alternative models (LSTM with Random Forest and Bag-of-Words with Random Forest) on both Mean Absolute Error (MAE) and Standardized Accuracy (SA). Lower MAE and higher SA indicate better estimates.
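The two evaluation measures named above can be made concrete with a short sketch. MAE is the average absolute difference between actual and estimated story points, and SA compares a model's MAE against that of random guessing: SA = (1 - MAE / MAE_rguess) x 100. The code below is illustrative only. The story point values, the made-up model_preds, the function names, the use of 1000 random-guessing runs, and the choice to draw guesses from the test issues are all assumptions for this example, not data or code from the paper.

```python
import random
from statistics import mean, median


def mae(actual, predicted):
    """Mean Absolute Error between actual and predicted story points."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_guess_mae(actual, runs=1000, seed=0):
    """Average MAE of random guessing over many runs.

    Each issue is "estimated" with the actual story points of a
    different, randomly chosen issue (an assumed sampling scheme).
    """
    rng = random.Random(seed)
    n = len(actual)
    total = 0.0
    for _ in range(runs):
        guesses = []
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # never pick the issue itself
            guesses.append(actual[j])
        total += mae(actual, guesses)
    return total / runs


def standardized_accuracy(actual, predicted, runs=1000, seed=0):
    """SA = (1 - MAE / MAE_random_guess) * 100; higher is better."""
    baseline = random_guess_mae(actual, runs, seed)
    return (1.0 - mae(actual, predicted) / baseline) * 100.0


# Hypothetical story points, invented for illustration.
train_points = [1, 2, 3, 5, 8, 3, 2, 5]
test_points = [2, 3, 5, 8, 1]
model_preds = [2.5, 3.0, 4.0, 6.5, 1.5]  # made-up model outputs

# Mean Effort and Median Effort baselines: one constant estimate
# taken from previously seen issues.
mean_baseline = [mean(train_points)] * len(test_points)
median_baseline = [median(train_points)] * len(test_points)

for name, preds in [("model", model_preds),
                    ("mean effort", mean_baseline),
                    ("median effort", median_baseline)]:
    print(name, round(mae(test_points, preds), 3),
          round(standardized_accuracy(test_points, preds), 1))
```

An SA near zero means an estimator is no better than random guessing, which is why SA is reported alongside MAE rather than instead of it.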

Implications and Future Directions

The practical implications of this research are significant for agile teams seeking reliable effort estimation techniques. The automated, data-driven approach can complement existing expert-based methods by providing a consistent baseline estimate that experts can then refine with their own judgment.

Theoretically, this research extends the application of deep learning methods, particularly LSTM and RHN, in software engineering tasks, demonstrating their utility beyond traditional applications such as natural language processing and computer vision.
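For readers unfamiliar with the highway component, the following is a minimal sketch of a generic gated highway layer applied repeatedly with shared weights, placed between a pooled document vector and a linear regressor. It is not the authors' implementation: the mean pooling step, the tanh candidate, the gate convention, the vector size, the depth, the random weights, and every function name are assumptions chosen for illustration, and the per-word vectors stand in for LSTM outputs that are not computed here.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Compute W @ x + b for a list-of-lists matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def mean_pool(vectors):
    """Average per-word vectors into a single document vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def highway_layer(h, W_c, b_c, W_g, b_g):
    """One gated layer: gate * h + (1 - gate) * candidate."""
    candidate = [math.tanh(z) for z in affine(W_c, b_c, h)]
    gate = [sigmoid(z) for z in affine(W_g, b_g, h)]
    return [g * hi + (1.0 - g) * c
            for g, hi, c in zip(gate, h, candidate)]


def recurrent_highway(h, params, depth):
    """Apply the same highway layer `depth` times (shared weights)."""
    for _ in range(depth):
        h = highway_layer(h, *params)
    return h


def init_params(dim, rng):
    """Random, untrained weights; a real model would learn these."""
    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                for _ in range(dim)]
    return mat(), [0.0] * dim, mat(), [0.0] * dim


rng = random.Random(0)
dim = 4

# Stand-ins for per-word LSTM output vectors of one issue (hypothetical).
word_states = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]

doc_vector = mean_pool(word_states)
params = init_params(dim, rng)
deep_vector = recurrent_highway(doc_vector, params, depth=10)

# Final linear regressor mapping the deep vector to a story point estimate.
w_out = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
estimate = sum(w * v for w, v in zip(w_out, deep_vector))
print(deep_vector, estimate)
```

The gate lets each layer pass part of its input through unchanged, which is what makes stacking many layers practical, and sharing one set of weights across layers keeps the parameter count independent of depth.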

Future research could explore several avenues, such as:

Extending the dataset to include commercial projects, which might offer nuanced insights given the variability in team dynamics and project structures.
Developing adaptive models that can accommodate team changes over time, potentially integrating additional features characterizing team composition and dynamics.
Evaluating the integration of additional issue metadata, such as priority and type, which may further enhance model accuracy.
Conducting further experimental validation through trial deployments in various organizational settings to affirm the practical benefits of the LD-RNN approach.

Conclusion

This paper contributes to the software project management literature, particularly for agile methodologies. Its application of deep learning to story point estimation points toward more robust, adaptable, and scalable effort estimation models in software development, with potential gains in predictive accuracy and project planning.
