Measuring code similarity is fundamental for many software engineering tasks,
e.g., code search, refactoring and reuse. However, most existing techniques
focus on code syntactic similarity only, while measuring code functional
similarity remains a challenging problem. In this paper, we propose a
novel approach that encodes code control flow and
data flow into a semantic matrix in which each element is a
high dimensional sparse binary feature vector, and we design a new deep
learning model that measures code functional similarity based
on this representation.
By concatenating the hidden representations learned from a code pair, this new
model transforms the problem of detecting functionally similar code into
binary classification, which lets it effectively learn patterns shared by
functionally similar code with very different syntax.
We have implemented our approach, DeepSim, for Java programs and evaluated its
recall, precision and time performance on two large datasets of functionally
similar code. The experimental results show that DeepSim significantly
outperforms existing state-of-the-art techniques, such as DECKARD, RtvNN, CDLH,
and two baseline deep neural network models.
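
To make the pair-classification idea concrete, here is a minimal sketch of concatenating two hidden representations and scoring them with a binary classifier. It is not DeepSim's actual architecture: the single dense `tanh` encoder, the logistic output, the hidden size, the toy binary matrices, and the random (untrained) weights are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import random

HIDDEN = 4  # hypothetical hidden size, chosen only for this sketch


def make_weights(rows, cols, rng):
    """Create a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(matrix, enc_weights):
    """Flatten a binary feature matrix and map it to a hidden vector.

    Assumption: one dense layer with tanh stands in for the real encoder.
    """
    flat = [bit for row in matrix for bit in row]
    return [math.tanh(sum(w * x for w, x in zip(w_row, flat)))
            for w_row in enc_weights]


def pair_features(matrix_a, matrix_b, enc_weights):
    """Encode both snippets with the same weights and concatenate the results."""
    return encode(matrix_a, enc_weights) + encode(matrix_b, enc_weights)


def similarity_probability(matrix_a, matrix_b, enc_weights, clf_weights, clf_bias=0.0):
    """Binary classification on the concatenated pair: P(functionally similar)."""
    features = pair_features(matrix_a, matrix_b, enc_weights)
    logit = sum(w * f for w, f in zip(clf_weights, features)) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Toy 2 x 3 binary matrices standing in for the encoded flow features.
code_a = [[1, 0, 0], [0, 1, 0]]
code_b = [[1, 0, 1], [0, 1, 0]]

enc_weights = make_weights(HIDDEN, 6, rng)
clf_weights = [rng.uniform(-0.5, 0.5) for _ in range(2 * HIDDEN)]

prob = similarity_probability(code_a, code_b, enc_weights, clf_weights)
print(f"P(functionally similar) = {prob:.3f}")
```

With trained weights, a probability above a chosen threshold would label the pair as functionally similar. Here the weights are random, so the printed value only shows the data flow.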
Tue 6 Nov
13:30 - 15:00 | Deep Learning | Research Papers at Horizons 6-9F | Chair(s): David Rosenblum (National University of Singapore)
13:30 | 22m | Talk | Deep Learning Type Inference | Research Papers | Vincent J. Hellendoorn (University of California at Davis, USA), Christian Bird (Microsoft Research), Earl T. Barr, Miltiadis Allamanis (Microsoft Research, Cambridge)
13:52 | 22m | Talk | DeepSim: Deep Learning Code Functional Similarity | Research Papers
14:15 | 22m | Talk | Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces | Research Papers | Jordan Henkel (University of Wisconsin–Madison), Shuvendu K. Lahiri (Microsoft Research), Ben Liblit (University of Wisconsin–Madison), Thomas Reps (University of Wisconsin - Madison and GrammaTech, Inc.)
14:37 | 22m | Talk | MODE: Automated Neural Network Model Debugging via State Differential Analysis and Input Selection | Research Papers | Shiqing Ma (Purdue University, USA), Yingqi Liu (Purdue University, USA), Wen-Chuan Lee (Purdue University), Xiangyu Zhang (Purdue University), Ananth Grama (Purdue University, USA)