Complementing Global and Local Contexts in Representing API Descriptions to Improve API Retrieval Tasks
When trained on API documentation and tutorials, Word2Vec
produces vector representations that can be used to estimate the relevance between
texts and API elements. However, existing Word2Vec-based approaches to
measuring document similarity aggregate the Word2Vec vectors of individual
words or APIs to build the representation of a document, as if the
words were independent. Thus, the semantics of API descriptions or code
fragments are not well represented. In this work, we conjecture that
we need a new model that fits API documentation better than
Word2Vec. We present D2Vec, a neural network model that considers two
complementary contexts to better capture the semantics of API
documentation. First, we connect the global context of the current API
topic under description to all the text phrases within the description
of that API. Second, the local orders of words and API elements in the
text phrases are maintained in computing the vector representations
for the APIs. We conducted an experiment to verify two intrinsic
properties of D2Vec's vectors: 1) similar words and relevant API
elements are projected into nearby locations; and 2) some vector
operations carry semantics. We demonstrate the usefulness and good
performance of D2Vec in three applications: API code search
(text-to-code retrieval), API tutorial fragment search (code-to-text
retrieval), and mining API mappings between software libraries
(code-to-code retrieval). Finally, we provide actionable insights and
implications for researchers on using our model in other
applications with other types of documents.
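
The limitation of the aggregation baseline described in the abstract can be illustrated with a minimal sketch. It is not the paper's D2Vec model: it only shows the averaging-style document representation the abstract criticizes, together with a cosine-similarity ranking for text-to-code retrieval. The embedding values, the token names, and the helper functions `average_vector`, `cosine`, and `rank_apis` are all hypothetical and chosen for illustration.

```python
import math

# Toy 3-dimensional vectors standing in for trained Word2Vec embeddings.
# All tokens and values are made up for illustration only.
EMBEDDINGS = {
    "read": [0.9, 0.1, 0.0],
    "file": [0.8, 0.2, 0.1],
    "lines": [0.7, 0.3, 0.0],
    "FileReader.read": [0.85, 0.15, 0.05],
    "socket": [0.0, 0.8, 0.3],
    "Socket.connect": [0.05, 0.85, 0.2],
}


def average_vector(tokens):
    """Build a document vector as the mean of its token vectors.

    Word order is ignored, which is the independence assumption
    the abstract points out.
    """
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        raise ValueError("no token has a known embedding")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_apis(query_tokens, api_names):
    """Rank candidate API elements by similarity to the averaged query."""
    query = average_vector(query_tokens)
    scored = [(cosine(query, EMBEDDINGS[name]), name) for name in api_names]
    return [name for _, name in sorted(scored, reverse=True)]


if __name__ == "__main__":
    print(rank_apis(["read", "file", "lines"],
                    ["FileReader.read", "Socket.connect"]))
```

Because the average does not depend on token order, `["read", "file"]` and `["file", "read"]` map to the same vector. A model that keeps local word order and ties each phrase to its API topic, as the abstract proposes, is meant to avoid exactly this loss of information.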
Thu 8 Nov (displayed time zone: Guadalajara, Mexico City, Monterrey)
13:30 - 15:00 | Software Maintenance II | Research Papers / Journal-First | Horizons 10-11 | Chair(s): Emerson Murphy-Hill (North Carolina State University)
13:30 | 22m | Talk | Automating Change-level Self-admitted Technical Debt Determination | Journal-First | Meng Yan; Xin Xia (Monash University); Emad Shihab (Concordia University); David Lo (Singapore Management University); Jianwei Yin; Xiaohu Yang
13:52 | 22m | Talk | Large-Scale Study of Substitutability in the Presence of Effects | Research Papers | Jackson Maddox (Iowa State University, USA); Yuheng Long (Iowa State University); Hridesh Rajan (Iowa State University)
14:15 | 22m | Talk | An Empirical Study on Crash Recovery Bugs in Large-Scale Distributed Systems | Research Papers | Yu Gao (Institute of Software, Chinese Academy of Sciences); Wensheng Dou (Institute of Software, Chinese Academy of Sciences); Feng Qin (Ohio State University, USA); Chushu Gao (Institute of Software, Chinese Academy of Sciences); Dong Wang (Institute of Software at Chinese Academy of Sciences, China); Jun Wei (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing); Ruirui Huang (Alibaba Group, China); Li Zhou (Alibaba Group, China); Yongming Wu (Alibaba Group, China)
14:37 | 22m | Talk | Complementing Global and Local Contexts in Representing API Descriptions to Improve API Retrieval Tasks | Research Papers | Thanh Nguyen (Iowa State University); Ngoc Tran; Hung Phan; Trong Nguyen (Iowa State University, USA); Linh Truong; Trong Nguyen (Iowa State University, USA); Hoan Anh Nguyen (Iowa State University, USA); Tien N. Nguyen (University of Texas at Dallas)