A Systematic Evaluation of Static API-Misuse Detectors (ESEC/FSE 2018 - Journal-First)

Sun 4 - Fri 9 November 2018 Lake Buena Vista, Florida, United States

Who

Sven Amann, Hoan Nguyen, Sarah Nadi, Tien N. Nguyen, Mira Mezini

Track

ESEC/FSE 2018 Journal-First

When

Wed 7 Nov 2018 13:30 - 13:52 at Horizons 5 - Software Analysis II Chair(s): Myra Cohen

Abstract

Application Programming Interfaces (APIs) often have usage constraints, such as restrictions on call order or call conditions. API misuses, i.e., violations of these constraints, may lead to software crashes, bugs, and vulnerabilities. Though researchers developed many API-misuse detectors over the last two decades, recent studies show that API misuses are still prevalent. Therefore, we need to understand the capabilities and limitations of existing detectors in order to advance the state of the art. In this paper, we present the first-ever qualitative and quantitative evaluation that compares static API-misuse detectors along the same dimensions, and with original author validation. To accomplish this, we develop MuC, a classification of API misuses, and MuBenchPipe, an automated benchmark for detector comparison, on top of our misuse dataset, MuBench. Our results show that the capabilities of existing detectors vary greatly and that existing detectors, though capable of detecting misuses, suffer from extremely low precision and recall. A systematic root-cause analysis reveals that, most importantly, detectors need to go beyond the naive assumption that a deviation from the most-frequent usage corresponds to a misuse and need to obtain additional usage examples to train their models. We present possible directions towards more-powerful API-misuse detectors.
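To make the "most-frequent usage" assumption mentioned in the abstract concrete, the following is a minimal toy sketch in Python. It is not MuBench, MuBenchPipe, or any of the evaluated detectors. The usage names, the call sequences, and the helpers `most_frequent_pattern` and `flag_deviations` are all hypothetical, chosen only to illustrate how treating every deviation from the dominant pattern as a misuse can flag a correct but uncommon usage alongside a real one.

```python
from collections import Counter

# Hypothetical usage examples: each entry is the ordered list of calls
# that one client makes on an iterator-like API (illustrative data only).
USAGES = {
    "a": ["hasNext", "next"],
    "b": ["hasNext", "next"],
    "c": ["hasNext", "next"],
    "d": ["next"],          # real misuse: next() without any guard
    "e": ["size", "next"],  # assumed correct: guarded by a size check instead
}


def most_frequent_pattern(usages):
    """Return the call sequence that occurs most often across all usages."""
    counts = Counter(tuple(calls) for calls in usages.values())
    return counts.most_common(1)[0][0]


def flag_deviations(usages):
    """Naive detector: flag every usage that differs from the dominant pattern."""
    pattern = most_frequent_pattern(usages)
    return sorted(name for name, calls in usages.items() if tuple(calls) != pattern)


flagged = flag_deviations(USAGES)
print(flagged)  # ['d', 'e']
```

In this toy setting, usage "d" is a true positive, while "e" is a false positive under the stated assumption that a size check is a valid alternative guard. With only five examples, the alternative guard is too rare to be learned as a pattern, which is one way to read the abstract's call for additional usage examples.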

DOI

https://doi.org/10.1109/TSE.2018.2827384

Sven Amann

Technische Universität Darmstadt

Germany

Hoan Nguyen

Iowa State University

Sarah Nadi

University of Alberta

Canada

Tien N. Nguyen

University of Texas at Dallas

United States

Mira Mezini

TU Darmstadt