Source code is bimodal: it combines a formal, algorithmic channel and a natural language channel of identifiers and comments. In this work, we model the bimodality of code with name flows, an assignment flow graph augmented to track identifier names. Conceptual types are logically distinct types that do not always coincide with program types. Passwords and URLs are example conceptual types that can share the program type string. Our tool, RefiNym, is an unsupervised method that mines a lattice of conceptual types from name flows and reifies them into distinct nominal types. For string, RefiNym finds and splits conceptual types originally merged into a single type, reducing the number of same-type variables per scope from 8.7 to 2.2 while eliminating 21.9% of scopes that have more than one same-type variable in scope. This makes the code more self-documenting and frees the type system to prevent a developer from inadvertently assigning data across conceptual types.
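To make the notion of a conceptual type concrete, the following is a minimal sketch, not RefiNym's implementation or output. It assumes two hypothetical wrapper classes, `Password` and `Url`, that split the single program type `str` into distinct nominal types, plus a hypothetical `fetch` function that accepts only URLs. Python does not enforce annotations at run time, so the sketch adds an explicit `isinstance` check; a static checker would be expected to flag the same mistake from the annotation alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Password:
    """Hypothetical nominal type for strings that hold passwords."""
    value: str


@dataclass(frozen=True)
class Url:
    """Hypothetical nominal type for strings that hold URLs."""
    value: str


def fetch(target: Url) -> str:
    """Pretend to request a URL; reject any other conceptual type."""
    if not isinstance(target, Url):
        raise TypeError(f"expected Url, got {type(target).__name__}")
    return f"GET {target.value}"


def try_fetch(candidate: object) -> str:
    """Call fetch and report a cross-type assignment instead of raising."""
    try:
        return fetch(candidate)
    except TypeError as err:
        return f"rejected: {err}"


if __name__ == "__main__":
    print(try_fetch(Url("https://example.org")))
    print(try_fetch(Password("hunter2")))
```

With plain `str` values, both calls would be accepted; once the two conceptual types are separate nominal types, passing a password where a URL is expected is caught.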
Tue 6 Nov (displayed time zone: Guadalajara, Mexico City, Monterrey)
13:30 - 15:00 | Software Analysis I (Journal-First / Research Papers) at Horizons 5 | Chair(s): Sebastian Elbaum, University of Nebraska-Lincoln, USA
13:30 | 22m Talk | On Accelerating Source Code Analysis At Massive Scale | Journal-First
13:52 | 22m Talk | RefiNym: Using Names to Refine Types | Research Papers | Santanu Dash (University College London, UK), Miltiadis Allamanis (Microsoft Research, Cambridge), Earl T. Barr
14:15 | 22m Talk | Darwinian Data Structure Selection | Research Papers | Michail Basios (University College London), Lingbo Li (University College London, UK), Fan Wu (University College London, UK), Leslie Kanthan (University College London, UK), Earl T. Barr
14:37 | 22m Talk | Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity | Research Papers | Yue Li (Aarhus University, Denmark), Tian Tan (Aarhus University, Denmark), Anders Møller (Aarhus University), Yannis Smaragdakis (University of Athens)