Oreo: Detection of Clones in the Twilight Zone (ESEC/FSE 2018 - Research Papers)

Sun 4 - Fri 9 November 2018 Lake Buena Vista, Florida, United States

Who

Vaibhav Saini, Farima Farmahinifarahani, Yadong Lu, Pierre Baldi, Crista Lopes

Track

ESEC/FSE 2018 Research Papers

When

Wed 7 Nov 2018 14:37 - 15:00 at Horizons 5 - Software Analysis II Chair(s): Myra Cohen

Abstract

Source code clones are categorized into four types of increasing difficulty of detection, ranging from purely textual (Type-1) to purely semantic (Type-4). Most clone detectors reported in the literature work well up to Type-3, which accounts for syntactic differences. In between Type-3 and Type-4, however, there lies a spectrum of clones that, although still exhibiting some syntactic similarities, are extremely hard to detect – the Twilight Zone. Most clone detectors reported in the literature fail to operate in this zone. We present Oreo, a novel approach to source code clone detection that not only detects Type-1 to Type-3 clones accurately, but is also capable of detecting harder-to-detect clones in the Twilight Zone. Oreo is built using a combination of machine learning, information retrieval, and software metrics. We evaluate the recall of Oreo on BigCloneBench, and perform manual evaluation for precision. Oreo has both high recall and precision. More importantly, it pushes the boundary in detection of clones with moderate to weak syntactic similarity in a scalable manner.
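
To make the idea of decreasing syntactic similarity concrete, here is a minimal sketch. It is not Oreo's method: the token-set Jaccard measure, the helper names `tokens` and `jaccard`, and the four Java snippets are all assumptions made up for illustration. It only shows how a purely token-based score drops as a pair moves from a whitespace-only copy, to a renamed copy, to a restructured version of the same computation.

```python
import re


def tokens(code: str) -> set[str]:
    """Split source text into a set of identifiers, numbers, and single symbols."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap: |A & B| / |A | B| (hypothetical similarity measure)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


# Hypothetical Java method used as the reference snippet.
original = "int sum(int[] a) { int s = 0; for (int x : a) { s += x; } return s; }"

# Same text, different layout only (a textual copy).
reformatted = (
    "int sum(int[] a) {\n"
    "    int s = 0;\n"
    "    for (int x : a) { s += x; }\n"
    "    return s;\n"
    "}"
)

# Same structure with identifiers renamed.
renamed = "int total(int[] v) { int t = 0; for (int e : v) { t += e; } return t; }"

# Same computation, restructured: weaker syntactic overlap with the original.
restructured = (
    "int total(int[] v) { int t = 0; int i = 0; "
    "while (i < v.length) { t = t + v[i]; i++; } return t; }"
)

for name, snippet in [
    ("reformatted", reformatted),
    ("renamed", renamed),
    ("restructured", restructured),
]:
    print(f"{name:>12}: {jaccard(original, snippet):.2f}")
```

The score falls at each step even though all four snippets compute the same sum, which is why a detector relying on surface token overlap alone would be expected to struggle as pairs approach the semantic end of the spectrum.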

Vaibhav Saini

University of California at Irvine, USA

Farima Farmahinifarahani

University of California at Irvine, USA

Yadong Lu

University of California at Irvine, USA

Pierre Baldi

University of California at Irvine, USA

Crista Lopes

Session Program

Wed 7 Nov
Displayed time zone: Guadalajara, Mexico City, Monterrey

13:30 - 15:00	Software Analysis II (Research Papers / Journal-First) at Horizons 5. Chair(s): Myra Cohen (Iowa State University)

13:30 22m Talk	A Systematic Evaluation of Static API-Misuse Detectors (Journal-First). Sven Amann (Technische Universität Darmstadt), Hoan Nguyen (Iowa State University), Sarah Nadi (University of Alberta), Tien N. Nguyen (University of Texas at Dallas), Mira Mezini (TU Darmstadt)
13:52 22m Talk	Do Android Taint Analysis Tools Keep Their Promises? (Research Papers). Felix Pauck (Paderborn University, Germany), Eric Bodden (Heinz Nixdorf Institut, Paderborn University and Fraunhofer IEM), Heike Wehrheim (Paderborn University)
14:15 22m Talk	Neural-Augmented Static Analysis of Android Communication (Research Papers). Jinman Zhao (University of Wisconsin-Madison, USA), Aws Albarghouthi (University of Wisconsin-Madison), Vaibhav Rastogi (University of Wisconsin-Madison, USA), Somesh Jha (University of Wisconsin, Madison), Damien Octeau (University of Wisconsin and Pennsylvania State University)
14:37 22m Talk	Oreo: Detection of Clones in the Twilight Zone (Research Papers). Vaibhav Saini (University of California at Irvine, USA), Farima Farmahinifarahani (University of California at Irvine, USA), Yadong Lu (University of California at Irvine, USA), Pierre Baldi (University of California at Irvine, USA), Crista Lopes