Article

MAPO: mining API usages from open source repositories

Authors:
Tao Xie, North Carolina State University, Raleigh, NC
Jian Pei, Simon Fraser University, Burnaby, BC, Canada

MSR '06: Proceedings of the 2006 international workshop on Mining software repositories, May 2006, Pages 54–57
https://doi.org/10.1145/1137983.1137997

Published: 22 May 2006

ABSTRACT

To improve software productivity, when constructing new software systems, developers often reuse existing class libraries or frameworks by invoking their APIs. Those APIs, however, are often complex and not well documented, posing barriers for developers to use them in new client code. To get familiar with how those APIs are used, developers may search the Web using a general search engine to find relevant documents or code examples. Developers can also use a source code search engine to search open source repositories for source files that use the same APIs. Nevertheless, the number of returned source files is often large. It is difficult for developers to learn API usages from a large number of returned results. In order to help developers understand API usages and write API client code more effectively, we have developed an API usage mining framework and its supporting tool called MAPO (for Mining API usages from Open source repositories). Given a query that describes a method, class, or package for an API, MAPO leverages the existing source code search engines to gather relevant source files and conducts data mining. The mining leads to a short list of frequent API usages for developers to inspect. MAPO currently consists of five components: a code search engine, a source code analyzer, a sequence preprocessor, a frequent sequence miner, and a frequent sequence post processor. We have examined the effectiveness of MAPO using a set of various queries. The preliminary results show that the framework is practical for providing informative and succinct API usage patterns.
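
The mining and post-processing steps named in the abstract can be illustrated with a minimal Python sketch. This is not MAPO's implementation: it uses brute-force subsequence enumeration in place of a dedicated frequent sequence miner, and the function names, the `min_support` and `max_len` parameters, and the `File.*` / `Log.info` call sequences are all hypothetical, chosen only for illustration. It assumes each gathered source file has already been reduced to an ordered list of API method calls.

```python
from collections import Counter
from itertools import combinations


def frequent_subsequences(sequences, min_support, max_len=3):
    """Return ordered (possibly non-contiguous) subsequences that occur
    in at least `min_support` of the input call sequences."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        for n in range(1, max_len + 1):
            # Index tuples from combinations() are increasing, so call order is kept.
            for idx in combinations(range(len(seq)), n):
                seen.add(tuple(seq[i] for i in idx))
        counts.update(seen)  # each sequence supports a pattern at most once
    return {p: c for p, c in counts.items() if c >= min_support}


def maximal_patterns(patterns):
    """Post-processing (an assumed strategy): drop any pattern that is a
    subsequence of another frequent pattern, to keep the result list short."""
    def is_subseq(a, b):
        it = iter(b)
        return all(x in it for x in a)  # consumes `it`, so order is enforced

    return {p: c for p, c in patterns.items()
            if not any(p != q and is_subseq(p, q) for q in patterns)}


# Hypothetical call sequences, one per mined method body.
call_sequences = [
    ["File.open", "File.read", "File.close"],
    ["File.open", "File.read", "Log.info", "File.close"],
    ["File.open", "File.write", "File.close"],
]

frequent = frequent_subsequences(call_sequences, min_support=2)
maximal = maximal_patterns(frequent)
print(maximal)
```

With these inputs, the only pattern that survives post-processing is the open, read, close sequence. Keeping only maximal patterns discards the higher support of shorter patterns such as open followed by close; a closed-pattern filter would retain them, so the choice depends on how short the final list should be.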

Index Terms

MAPO: mining API usages from open source repositories
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Software management
        Software maintenance
2. Software and its engineering
  1. Software creation and management
    1. Software post-development issues

Published in
MSR '06: Proceedings of the 2006 international workshop on Mining software repositories
May 2006
191 pages
ISBN:1595933972
DOI:10.1145/1137983
General Chairs:
Stephan Diehl, University Trier, Germany
Harald Gall, University of Zurich, Switzerland
Ahmed E. Hassan, Research in Motion RIM, Canada
Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
application programming interfaces
mining software repositories
program comprehension
Qualifiers
- Article
Article Metrics
- Total citations: 156
- Total downloads: 1,317
- Downloads (last 12 months): 37
- Downloads (last 6 weeks): 3