DOI: 10.1145/1137983.1137997
Article

MAPO: mining API usages from open source repositories

Published: 22 May 2006

ABSTRACT

To improve software productivity, developers constructing new software systems often reuse existing class libraries or frameworks by invoking their APIs. Those APIs, however, are often complex and poorly documented, which makes them hard to use in new client code. To learn how an API is used, developers may search the Web with a general search engine for relevant documents or code examples, or use a source code search engine to search open source repositories for source files that use the same API. The number of returned source files is often large, however, and it is difficult to learn API usages from so many results. To help developers understand API usages and write API client code more effectively, we have developed an API usage mining framework and its supporting tool, called MAPO (for Mining API usages from Open source repositories). Given a query that describes a method, class, or package of an API, MAPO leverages existing source code search engines to gather relevant source files and then applies data mining to them. The mining yields a short list of frequent API usages for developers to inspect. MAPO currently consists of five components: a code search engine, a source code analyzer, a sequence preprocessor, a frequent sequence miner, and a frequent sequence postprocessor. We have examined the effectiveness of MAPO using a set of various queries. The preliminary results show that the framework is practical for providing informative and succinct API usage patterns.
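The mining and post-processing steps described in the abstract can be illustrated with a small sketch. This is not MAPO's implementation: the abstract does not say which mining algorithm the tool uses, so the brute-force subsequence enumeration, the closed-pattern filter, the function names, and the sample call sequences below are all assumptions chosen for illustration. The input stands in for what a source code analyzer might produce, namely one sequence of API calls per method found by the search engine.

```python
from collections import Counter
from itertools import combinations


def subsequences(seq, max_len):
    """Distinct (not necessarily contiguous) subsequences of seq with 1..max_len items."""
    found = set()
    for length in range(1, min(max_len, len(seq)) + 1):
        for idx in combinations(range(len(seq)), length):
            found.add(tuple(seq[i] for i in idx))
    return found


def is_subsequence(small, big):
    """True if the items of small occur in big in the same order."""
    it = iter(big)
    return all(item in it for item in small)


def mine_frequent_usages(call_sequences, min_support, max_len=4):
    """Hypothetical stand-in for the 'frequent sequence miner' and 'postprocessor' steps."""
    # Mining: count in how many sequences each candidate subsequence occurs.
    counts = Counter()
    for seq in call_sequences:
        counts.update(subsequences(seq, max_len))
    frequent = {p: c for p, c in counts.items() if c >= min_support}

    # Post-processing (assumed): keep only closed patterns, i.e. drop a pattern
    # when a longer frequent pattern contains it and has the same support.
    closed = {
        p: c
        for p, c in frequent.items()
        if not any(
            len(q) > len(p) and frequent[q] == c and is_subsequence(p, q)
            for q in frequent
        )
    }
    # Most frequent first, then longer patterns first.
    return sorted(closed.items(), key=lambda pc: (-pc[1], -len(pc[0]), pc[0]))


# Made-up call sequences, as if extracted from three client methods.
sequences = [
    ["File.new", "FileReader.new", "BufferedReader.new",
     "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.new", "BufferedReader.new",
     "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.new", "BufferedReader.new", "BufferedReader.close"],
]

for pattern, support in mine_frequent_usages(sequences, min_support=2):
    print(support, " -> ".join(pattern))
```

Enumerating every subsequence is exponential in the sequence length, so this sketch only suits short sequences. The closed-pattern filter is what keeps the output short: with a support threshold of 2, the three sample sequences reduce to two patterns instead of every sub-pattern they share.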


Published in

MSR '06: Proceedings of the 2006 International Workshop on Mining Software Repositories
May 2006
191 pages
ISBN: 1595933972
DOI: 10.1145/1137983

Copyright © 2006 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States
