Considerable effort has gone into the discovery of API usage patterns or examples. However, how to enable programmers to search for discovered API usage examples using natural language queries is still a significant research problem. This paper presents an approach, referred to as Codepus, to facilitate the discovery of API usage examples based on mining comments in open source code while permitting searches using natural language queries. The approach includes two key features: API usage patterns as well as multiple keywords and tf-idf values are discovered by mining open source comments and code snippets; and a matchmaking function is devised for searching for API usage examples using natural language queries by aggregating scores related to semantic similarity, correctness, and the number of APIs. In a practical application, the proposed approach discovered 43,721 API usage patterns with 641,591 API usage examples from 15,814 open source projects. Experiment results revealed the following: (1) Codepus reduced the browsing time required for locating API usage examples by 46.5%, compared to the time required when using a web search engine. (2) The precision of Codepus is 91% when using eleven real-world frequently asked questions, which is superior to those of Gists and Open Hub.
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications