Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces (ESEC/FSE 2018 - Research Papers)

Sun 4 - Fri 9 November 2018 Lake Buena Vista, Florida, United States

Who

Jordan Henkel, Shuvendu K. Lahiri, Ben Liblit, Thomas Reps

Track

ESEC/FSE 2018 Research Papers

When

Tue 6 Nov 2018 14:15 - 14:37 at Horizons 6-9F, in the Deep Learning session. Chair(s): David Rosenblum

Abstract

With the rise of machine learning, there is a great deal of interest in treating programs as data to be fed to learning algorithms. However, programs do not start off in a form that is immediately amenable to most off-the-shelf learning techniques. Instead, it is necessary to transform the program to a suitable representation before a learning technique can be applied.

In this paper, we use abstractions of traces obtained from symbolic execution of a program as a representation for learning word embeddings. We trained a variety of word embeddings under hundreds of parameterizations, and evaluated each learned embedding on a suite of different tasks. In our evaluation, we obtain 93% top-1 accuracy on a benchmark consisting of over 19,000 API-usage analogies extracted from the Linux kernel. In addition, we show that embeddings learned from (mainly) semantic abstractions provide nearly triple the accuracy of those learned from (mainly) syntactic abstractions.
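To make the analogy benchmark concrete, the following is a minimal sketch of how an analogy query (a is to b as c is to ?) can be answered with vector arithmetic and scored by top-1 accuracy. It is not the authors' implementation: the 3-dimensional vectors are hand-made for illustration, the helper names (`cosine`, `solve_analogy`, `top1_accuracy`) are hypothetical, and real embeddings would be learned from the abstracted traces and have many more dimensions.

```python
import math

# Hypothetical, hand-made vectors for illustration only; learned embeddings
# would come from training on abstracted symbolic traces.
EMBEDDINGS = {
    "spin_lock":    [1.0, 0.0, 0.0],
    "spin_unlock":  [1.0, 1.0, 0.0],
    "mutex_lock":   [0.0, 0.0, 1.0],
    "mutex_unlock": [0.0, 1.0, 1.0],
    "kmalloc":      [-1.0, 0.0, 0.5],
    "kfree":        [-1.0, 1.0, 0.5],
}


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(embeddings, a, b, c):
    """Answer 'a is to b as c is to ?' with the word nearest to b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # The three query words are excluded from the candidate answers.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


def top1_accuracy(embeddings, analogies):
    """Fraction of (a, b, c, expected) analogies whose best answer is expected."""
    correct = sum(
        1 for a, b, c, expected in analogies
        if solve_analogy(embeddings, a, b, c) == expected
    )
    return correct / len(analogies)


# A toy benchmark; the benchmark described in the abstract has over 19,000 analogies.
ANALOGIES = [
    ("spin_lock", "spin_unlock", "mutex_lock", "mutex_unlock"),
    ("kmalloc", "kfree", "spin_lock", "spin_unlock"),
]

if __name__ == "__main__":
    print(solve_analogy(EMBEDDINGS, "spin_lock", "spin_unlock", "mutex_lock"))
    print(top1_accuracy(EMBEDDINGS, ANALOGIES))
```

Top-1 accuracy counts an analogy as solved only when the single nearest candidate is the expected answer, which is the stricter of the usual top-k measures.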

Jordan Henkel

University of Wisconsin–Madison

United States

Shuvendu K. Lahiri

Microsoft Research

Ben Liblit

University of Wisconsin–Madison

United States

Thomas Reps

University of Wisconsin–Madison and GrammaTech, Inc.

United States

Session Program

Tue 6 Nov
Displayed time zone: (GMT-05:00) Guadalajara, Mexico City, Monterrey

13:30 - 15:00	Deep Learning [Research Papers] at Horizons 6-9F. Chair(s): David Rosenblum (National University of Singapore)

13:30 22m Talk	Deep Learning Type Inference [Research Papers]. Vincent J. Hellendoorn (University of California at Davis, USA), Christian Bird (Microsoft Research), Earl T. Barr, Miltiadis Allamanis (Microsoft Research, Cambridge)
13:52 22m Talk	DeepSim: Deep Learning Code Functional Similarity [Research Papers]. Gang Zhao, Jeff Huang (Texas A&M University)
14:15 22m Talk	Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces [Research Papers]. Jordan Henkel (University of Wisconsin–Madison), Shuvendu K. Lahiri (Microsoft Research), Ben Liblit (University of Wisconsin–Madison), Thomas Reps (University of Wisconsin–Madison and GrammaTech, Inc.)
14:37 22m Talk	MODE: Automated Neural Network Model Debugging via State Differential Analysis and Input Selection [Research Papers]. Shiqing Ma (Purdue University, USA), Yingqi Liu (Purdue University, USA), Wen-Chuan Lee (Purdue University), Xiangyu Zhang (Purdue University), Ananth Grama (Purdue University, USA)