To write code, developers stitch together patterns, like API protocols or data structure traversals. Discovering these patterns can identify inconsistencies in code or opportunities to replace them with an API or a language construct. We present coiling, a technique for automatically mining code for semantic idioms: surprisingly probable semantic patterns. We specialize coiling for loop idioms, semantic idioms of loops. First, we show that automatically identifiable patterns exist, in great numbers, with a large-scale empirical study of loops over 25 MLOC. We find that most loops in this corpus are simple and predictable: 90% have fewer than 15 LOC and 90% have no nesting and very simple control. Encouraged by this result, we then mine loop idioms over a second, buildable corpus. Over this corpus, we show that only 50 loop idioms cover 50% of the concrete loops. Our framework opens the door to data-driven tool and language design, discovering opportunities to introduce new API calls and language constructs. Loop idioms show that LINQ would benefit from an Enumerate operator. This is corroborated by the existence of a StackOverflow question with 542k views that requests precisely this feature.
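To make the Enumerate observation concrete, here is a minimal sketch in Python rather than C#. It assumes the idiom in question is a loop that maintains a running index alongside each element; the abstract does not spell out the idiom, so that reading is an assumption. The function names `indexed_manual` and `indexed_enumerate` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def indexed_manual(items: Iterable[T]) -> List[Tuple[int, T]]:
    """Pair each element with its position by threading a counter by hand.

    Assumption: this hand-written counter is the kind of recurring loop
    pattern that an Enumerate-style operator would replace.
    """
    result: List[Tuple[int, T]] = []
    i = 0
    for item in items:
        result.append((i, item))
        i += 1  # the counter update is easy to forget or misplace
    return result


def indexed_enumerate(items: Iterable[T]) -> List[Tuple[int, T]]:
    """Same result using Python's built-in enumerate."""
    return list(enumerate(items))
```

Both functions return the same pairs for any input. The second hands the counter bookkeeping to the language, which is the kind of replacement opportunity the abstract describes.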
Wed 7 Nov (displayed time zone: Guadalajara, Mexico City, Monterrey)
15:30 - 17:00 | Mining | Journal-First / Research Papers at Horizons 6-9F | Chair(s): Hridesh Rajan (Iowa State University)
15:30 | 22m | Talk | Finding Better Active Learners for Faster Literature Reviews | Journal-First
15:52 | 22m | Talk | Mining Semantic Loop Idioms | Journal-First | Miltiadis Allamanis (Microsoft Research, Cambridge), Earl T. Barr (University College London), Christian Bird (Microsoft Research), Prem Devanbu (University of California), Mark Marron (Microsoft Research), Charles Sutton (University of Edinburgh)
16:15 | 22m | Talk | NAR-Miner: Discovering Negative Association Rules from Code for Bug Detection | Research Papers | Pan Bian (Renmin University of China, China), Bin Liang (Renmin University of China, China), Wenchang Shi (Renmin University of China, China), Jianjun Huang (Renmin University of China, China), Yan Cai (Institute of Software, Chinese Academy of Sciences)
16:37 | 22m | Talk | Path-Based Function Embedding and Its Application to Error-Handling Specification Mining | Research Papers | Daniel DeFreez (University of California, Davis), Aditya V. Thakur (University of California, Davis), Cindy Rubio-González (University of California, Davis)