research-article

Mining library specifications using inductive logic programming

Authors:
Sriram Sankaranarayanan

NEC Labs America, Princeton, NJ, USA

NEC Labs America, Princeton, NJ, USA
View Profile

,
Franjo Ivančić

NEC Labs America, Princeton, NJ, USA

NEC Labs America, Princeton, NJ, USA
View Profile

,
Aarti Gupta

NEC Labs America, Princeton, NJ, USA

NEC Labs America, Princeton, NJ, USA
View Profile

ICSE '08: Proceedings of the 30th international conference on Software engineeringMay 2008Pages 131–140https://doi.org/10.1145/1368088.1368107

Published:10 May 2008Publication History

ICSE '08: Proceedings of the 30th international conference on Software engineering

Pages 131–140

ABSTRACT

Software libraries organize useful functionalities in order to promote modularity and code reuse. A typical library is used by client programs through an application programming interface (API) that hides its internals from the client. Typically, the rules governing the correct usage of the API are documented informally. In many cases, libraries may have complex API usage rules and unclear documentation. As a result, the behaviour of the library under some corner cases may not be well understood by the programmer. Formal specifications provide a precise understanding of the API behaviour.

We propose a methodology for learning interface specifications using Inductive Logic Programming (ILP). Our technique runs several unit tests on the library in order to generate relations describing the operation of the library. The data collected from these tests are used by an inductive learner to obtain rich Datalog/Prolog specifications. Such specifications capture essential properties of interest to the user. They may be used for applications such as reverse engineering the library internals or constructing checks on the application code to enforce proper API usage along with other properties of interest.

References

Acharya, M., Xie, T., Pei, J., and Xu, J. Mining API patterns as partial orders from source code: from usage scenarios to specifications. In ESEC-FSE '07 (2007), ACM Press, pp. 25--34. Google ScholarDigital Library
Alur, R., Černý, P., Madhusudan, P., and Nam, W. Synthesis of interface specifications for java classes. In Proc. ACM SIGPLAN symp. on Principles of prog. lang. (POPL) (2005), ACM Press, pp. 98--109. Google ScholarDigital Library
Ammons, G., Bodík, R., and Larus, J. R. Mining specifications. In POPL (2002), pp. 4--16. Google ScholarDigital Library
Bergadano, F., and Gunetti, D. Inductive Logic Programming: From Machine Learning to Software Engineering. MIT Press, 1996. Google ScholarDigital Library
Cadar, C., and Engler, D. R. Execution Generated Test Cases: How to make systems code crash itself. In SPIN (2005), vol. 3639 of LNCS, Springer-Verlag, pp. 2--23. Google ScholarDigital Library
Christodorescu, M., Kruegel, C., and Jha, S. Mining specifications of malicious behavior. In ESEC-FSE '07 (2007), ACM Press, pp. 5--14. Google ScholarDigital Library
Cohen, W. W. Recovering software specifications with inductive logic programming. In AAAI (1994), pp. 142--148. Google ScholarDigital Library
Csallner, C., and Smaragdakis, Y. Jcrasher: an automatic robustness tester for Java. Softw., Pract. Exper. 34, 11 (2004), 1025--1050. Google ScholarDigital Library
de Alfaro, L., and Henzinger, T. A. Interface automata. In ESEC / SIGSOFT FSE (2001), ACM Press, pp. 109--120. Google ScholarDigital Library
Ernst, M. D. Dynamically Discovering Likely Program Invariants. Ph.D., University of Washington Department of Computer Science and Engineering, Seattle, Washington, Aug. 2000. Google ScholarDigital Library
Godefroid, P., Klarlund, N., and Sen, K. DART: Directed Automated Random Testing. In ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI'05) (2005), pp. 213--223. Google ScholarDigital Library
Jiang, G., Chen, H., Ungureanu, C., and Yoshihira, K. Multi-resolution abnormal trace detection using varied-length N-grams and automata. In ICAC (2005), IEEE Computer Society, pp. 111--122. Google ScholarDigital Library
Jones, C. B. Software Development: A Rigorous Approach. 1980. Google ScholarDigital Library
Mitchell, T. M. Machine Learning. McGraw-Hill, 1997. Google ScholarDigital Library
Muggleton, S. Inverse resolution and Progol. New Generation Computing 13 (1995), 245--286.Google ScholarDigital Library
Pazzani, M., Brunk, C., and Silverstein, G. A knowledge-intensive approach to learning relational concepts. In Proc. Workshop on Machine Learning (1991), Morgan Kauffmann, pp. 432--436.Google ScholarCross Ref
Quinlan, J. R. Learning logical definitions from relations. Machine Learning 5 (1990), 239--266. Google ScholarDigital Library
Ramanathan, M. K., Grama, A., and Jagannathan, S. Static specification inference using predicate mining. In Proc. ACM SIGPLAN Prog. Lang. Design & Implementation (PLDI) (2007), ACM press, pp. 123--134. Google ScholarDigital Library
Sen, K., Marinov, D., and Agha, G. Cute: A concolic unit testing engine for c. In ESEC/FSE'05 (2005), ACM Press. Google ScholarDigital Library
Srinivasan, A. The Aleph manual. Available at http://www.comlab.ox.ac.uk/oucl/ research/areas/machlearn/Aleph/.Google Scholar
Whaley, J., and Lam, M. S. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In PLDI '04 (2004), ACM Press, pp. 131--144. Google ScholarDigital Library
Whaley, J., Martin, M. C., and Lam, M. S. Automatic extraction of object-oriented component interfaces. In ISSTA (2002), pp. 218--228. Google ScholarDigital Library
Yang, J., Evans, D., Bhardwaj, D., Bhat, T., and Das, M. Perracotta: Mining temporal API rules from imperfect traces. In Proc. of ICSE (2006). Google ScholarDigital Library

Index Terms

Mining library specifications using inductive logic programming
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Logical and relational learning
        Inductive logic learning
2. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program verification

Recommendations

Inductive equivalence in clausal logic and nonmonotonic logic programming

This paper provides a logical framework for comparing inductive capabilities among agents having different background theories. A background theory is called inductively equivalent to another background theory if the two theories induce the same ...
Read More
On temporal logic versus datalog
Logic and complexity in computer science

We provide a direct and modular translation from the temporal logics CTL, ETL, FCTL (CTL extended with the ability to express fairness) and the Modal µ-calculus to Monadic inf-Datalog with built-in predicates. We call it inf-Datalog because the ...
Read More
Complexity and expressive power of logic programming

This article surveys various complexity and expressiveness results on different forms of logic programming. The main focus is on decidable forms of logic programming, in particular, propositional logic programming and datalog, but we also mention ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ICSE '08: Proceedings of the 30th international conference on Software engineering
May 2008
558 pages
ISBN:9781605580791
DOI:10.1145/1368088
General Chair:
Wilhelm Schäfer
University of Paderborn
,
Program Chairs:
Matthew B. Dwyer
University of Nebraska
,
Volker Gruhn
University of Leipzig
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 10 May 2008
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
datalog
inductive logic programming
machine learning.
software specification
verification
Qualifiers
- research-article
Conference

Acceptance Rates
ICSE '08 Paper Acceptance Rate56of370submissions,15%Overall Acceptance Rate276of1,856submissions,15%
More

Upcoming Conference

ICSE 2025

2025 IEEE/ACM 46th International Conference on Software Engineering

April 26 - May 3, 2025

Ottawa , ON , Canada
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 25
  Total Citations
  View Citations
- 510
  Total Downloads
- Downloads (Last 12 months)3
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Mining library specifications using inductive logic programming

ICSE '08: Proceedings of the 30th international conference on Software engineering

ABSTRACT

References

Cited By

Index Terms

Recommendations

Inductive equivalence in clausal logic and nonmonotonic logic programming

On temporal logic versus datalog

Complexity and expressive power of logic programming