
Modular Tree Network for Source Code Representation Learning (2104.00196v1)

Published 1 Apr 2021 in cs.SE

Abstract: Learning representations for source code is a foundation of many program analysis tasks. In recent years, neural networks have shown success in this area, but most existing models do not make full use of the unique structural information of programs. Although abstract syntax tree-based neural models can handle the tree structure in source code, they cannot capture the richness of the different types of substructures in programs. In this paper, we propose a modular tree network (MTN) which dynamically composes different neural network units into tree structures based on the input abstract syntax tree. Unlike previous tree-structured neural network models, MTN can capture the semantic differences between types of AST substructures. We evaluate our model on two tasks: program classification and code clone detection. Our model achieves the best performance compared with state-of-the-art approaches on both tasks, showing the advantage of leveraging more elaborate structural information of the source code.
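The core idea of dispatching a distinct composition unit per AST node type can be illustrated with a minimal sketch. This is a hypothetical toy illustration, not the authors' implementation: the "units" here are simple deterministic vector functions standing in for the trained neural modules, the node types and dispatch table are invented for the example, and leaf tokens are embedded with a hash for reproducibility.

```python
import hashlib

DIM = 4  # toy embedding dimension

def leaf_embedding(token):
    # Hypothetical leaf embedding: hash the token into a fixed-size vector in [0, 1].
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]

def unit_mean(child_vecs):
    # Toy stand-in for a trained module: element-wise mean of children.
    n = len(child_vecs)
    return [sum(v[i] for v in child_vecs) / n for i in range(DIM)]

def unit_max(child_vecs):
    # A different toy module: element-wise max of children.
    return [max(v[i] for v in child_vecs) for i in range(DIM)]

# Dispatch table: one "module" per AST node type (the modular-tree idea).
UNITS = {"If": unit_max, "Block": unit_mean, "BinOp": unit_mean}

def encode(node):
    """Encode an AST bottom-up.

    An internal node is a (type, children) pair; a leaf is a token string.
    The unit applied at each internal node is chosen by its node type.
    """
    if isinstance(node, str):
        return leaf_embedding(node)
    node_type, children = node
    child_vecs = [encode(child) for child in children]
    return UNITS.get(node_type, unit_mean)(child_vecs)

# Example AST for `if x > 0: return x` (node types invented for illustration).
tree = ("If", [("BinOp", ["x", ">", "0"]), ("Block", ["return x"])])
vec = encode(tree)  # fixed-size vector representing the whole tree
```

In MTN the per-type functions would be learned neural units composed dynamically along the input AST; the sketch only shows the type-driven dispatch and bottom-up composition.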

Authors (5)
  1. Wenhan Wang (22 papers)
  2. Ge Li (213 papers)
  3. Sijie Shen (8 papers)
  4. Xin Xia (171 papers)
  5. Zhi Jin (160 papers)
Citations (45)
