Papers
Topics
Authors
Recent
Search
2000 character limit reached

Learning Semantic Vector Representations of Source Code via a Siamese Neural Network

Published 26 Apr 2019 in cs.LG, cs.PL, cs.SE, and stat.ML | (1904.11968v1)

Abstract: The abundance of open-source code, coupled with the success of recent advances in deep learning for natural language processing, has given rise to a promising new application of machine learning to source code. In this work, we explore the use of a Siamese recurrent neural network model on Python source code to create vectors which capture the semantics of code. We evaluate the quality of embeddings by identifying which problem from a programming competition the code solves. Our model significantly outperforms a bag-of-tokens embedding, providing promising results for improving code embeddings that can be used in future software engineering tasks.

Citations (3)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.