2013 | Book

Trustworthy Eternal Systems via Evolving Software, Data and Knowledge

Second International Workshop, EternalS 2012, Montpellier, France, August 28, 2012, Revised Selected Papers

Edited by: Alessandro Moschitti, Barbara Plank

Publisher: Springer Berlin Heidelberg

Book series: Communications in Computer and Information Science

About this book

This book constitutes the thoroughly refereed proceedings of the Second International Workshop on Trustworthy Eternal Systems via Evolving Software, Data and Knowledge, EternalS, held in Montpellier, France, in August 2012 and co-located with the 20th European Conference on Artificial Intelligence (ECAI 2012). The 10 revised full papers presented were carefully reviewed and selected from various submissions. The papers are organized into three main sections: natural language processing (NLP) for software systems, machine learning for software systems, and a roadmap for future research.

Table of Contents

Frontmatter

Machine Learning for Software Systems

Semantic and Algorithmic Recognition Support to Porting Software Applications to Cloud
Abstract
This paper presents a methodology, a technique, and an ongoing implementation aimed at supporting software porting (i.e., adapting software to be used in different execution environments) from the object-oriented domain towards Cloud Computing. The technique is based on a semantic representation of Cloud Application Programming Interfaces and on automated algorithmic concept recognition in source code, complemented by structure-based matchmaking techniques. In particular, the following techniques are composed and integrated: automatic recognition of the algorithms and algorithmic concepts implemented in the source code and of the calls to libraries and APIs performing actions and functionalities relevant to the target environment; comparison, through matchmaking, of the recognized concepts and APIs with those present in the functional ontology which describes the target API; and mapping of the source code excerpts and the source API calls to the target API calls and elements.
Beniamino Di Martino, Giuseppina Cretella
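As a rough illustration of the matchmaking step described in the abstract above, the following sketch (not the authors' implementation; all names such as Concept, FileWriter.write and BlobStore.put_object are invented) matches concepts recognized in source code against a tiny functional ontology of a target cloud API by simple feature overlap.

```python
# Minimal sketch (not the authors' implementation): matching concepts
# recognized in source code against a functional ontology describing a
# target cloud API, using simple feature-overlap matchmaking.
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    features: set = field(default_factory=set)  # algorithmic/functional traits

def match_score(src: Concept, tgt: Concept) -> float:
    """Jaccard overlap between the feature sets of two concepts."""
    if not src.features or not tgt.features:
        return 0.0
    return len(src.features & tgt.features) / len(src.features | tgt.features)

def best_mapping(recognized, ontology, threshold=0.5):
    """Map each concept recognized in the source code to the closest
    target-API concept, if the similarity exceeds a threshold."""
    mapping = {}
    for src in recognized:
        best = max(ontology, key=lambda tgt: match_score(src, tgt))
        if match_score(src, best) >= threshold:
            mapping[src.name] = best.name
    return mapping

# Toy example: a local file write recognized in the source code is mapped
# to a blob-store upload offered by the (hypothetical) target cloud API.
recognized = [Concept("FileWriter.write", {"persist", "bytes", "local"})]
ontology   = [Concept("BlobStore.put_object", {"persist", "bytes", "remote"}),
              Concept("Queue.send", {"message", "async"})]
print(best_mapping(recognized, ontology))  # {'FileWriter.write': 'BlobStore.put_object'}
```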
Machine Learning for Emergent Middleware
Abstract
Highly dynamic and heterogeneous distributed systems are challenging today’s middleware technologies. Existing middleware paradigms are unable to deliver on their most central promise, which is offering interoperability. In this paper, we argue for the need to dynamically synthesise distributed system infrastructures according to the current operating environment, thereby generating “Emergent Middleware” to mediate interactions among heterogeneous networked systems that interact in an ad hoc way. The paper outlines the overall architecture of Enablers underlying Emergent Middleware, and in particular focuses on the key role of learning in supporting such a process, spanning statistical learning to infer the semantics of networked system functions and automata learning to extract the related behaviours of networked systems.
Amel Bennaceur, Valérie Issarny, Daniel Sykes, Falk Howar, Malte Isberner, Bernhard Steffen, Richard Johansson, Alessandro Moschitti
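The sketch below is only one assumed way to picture an Emergent Middleware mediator: a learned mapping between operation names stands in for statistical learning of interface semantics, and a small hand-written state machine stands in for an automaton obtained by automata learning. None of the names or protocols come from the paper.

```python
# Minimal sketch (illustrative only): an "emergent" mediator that translates
# messages between two heterogeneous interfaces using (i) a learned mapping
# between operation names and (ii) a small finite-state model of the peer's
# behaviour. Both would be inferred automatically in the full approach.

# Operation correspondences (assumed output of statistical/semantic learning).
learned_mapping = {"getWeather": "fetchForecast", "ack": "confirm"}

# Behavioural model of the target system: state -> {accepted op: next state}
# (assumed output of automata learning).
behaviour = {
    "idle":    {"fetchForecast": "waiting"},
    "waiting": {"confirm": "idle"},
}

def mediate(trace, mapping=learned_mapping, model=behaviour, start="idle"):
    """Translate a client trace and check it is accepted by the target model."""
    state, translated = start, []
    for op in trace:
        tgt_op = mapping.get(op)
        if tgt_op is None or tgt_op not in model[state]:
            raise ValueError(f"cannot mediate {op!r} in state {state!r}")
        translated.append(tgt_op)
        state = model[state][tgt_op]
    return translated

print(mediate(["getWeather", "ack"]))  # ['fetchForecast', 'confirm']
```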
Security Oracle Based on Tree Kernel Methods
Abstract
The objective of software testing is to stress a program to reveal programming defects. Security testing is, more specifically, the branch of testing that aims to reveal defects that could lead to security problems. Most security testing approaches, however, have been mostly interested in the automatic generation of test cases that “try” to reveal a vulnerability, rather than in assessing whether test cases have actually “managed” to expose security issues.
In this paper, we cope with the latter problem. We investigate the feasibility of using tree kernel methods to implement a classifier able to evaluate whether a test case revealed a vulnerability, i.e. a security oracle for injection attacks. We compare six different variants of tree kernel methods in terms of their effectiveness in detecting attacks.
Andrea Avancini, Mariano Ceccato
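To make the tree-kernel idea concrete, the sketch below implements the classic Collins–Duffy subset-tree kernel over two toy "HTML" trees; the trained SVM oracle and the six kernel variants compared in the paper are not shown, and the trees are invented examples, not data from the paper.

```python
# Minimal sketch of a subset-tree kernel of the kind underlying tree-kernel
# SVM classifiers: it scores structural similarity between parse trees of
# HTML responses, so a trained classifier (omitted) could decide whether a
# test case exposed an injection. Trees are tuples: (label, child, child, ...).

def production(node):
    """A node's production: its label plus the labels of its children."""
    label, *children = node
    return (label, tuple(c[0] for c in children))

def delta(n1, n2, lam=0.4):
    """Number of common subset trees rooted at n1 and n2 (decayed by lam)."""
    if production(n1) != production(n2):
        return 0.0
    kids1, kids2 = n1[1:], n2[1:]
    if not kids1:              # identical leaf-level production
        return lam
    prod = lam
    for c1, c2 in zip(kids1, kids2):
        prod *= 1.0 + delta(c1, c2, lam)
    return prod

def tree_kernel(t1, t2, lam=0.4):
    """Sum delta over all node pairs of the two trees."""
    def nodes(t):
        yield t
        for c in t[1:]:
            yield from nodes(c)
    return sum(delta(a, b, lam) for a in nodes(t1) for b in nodes(t2))

# A benign page versus a page with an injected <script> element.
benign   = ("html", ("body", ("p",)))
injected = ("html", ("body", ("p",), ("script",)))
print(tree_kernel(benign, benign), tree_kernel(benign, injected))
```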

NLP for Software Systems

Robust Requirements Analysis in Complex Systems through Machine Learning
Abstract
Requirement Analysis (RA) is a relevant application for Semantic Technologies, focused on the extraction and exploitation of knowledge derived from technical documents. Language processing technologies are useful for the automatic extraction of concepts as well as norms (e.g. constraints on the use of devices) that play a key role in knowledge acquisition and design processes. A distributional method for training a kernel-based learning algorithm is proposed here as a cost-effective approach to the validation stage in the RA of Complex Systems, i.e. Naval Combat Systems. The targeted application of Requirement Identification and Information Extraction techniques is discussed in the realm of robust search processes that allow software functionalities to be suitably located within large collections of requirements written in natural language.
Francesco Garzoli, Danilo Croce, Manuela Nardini, Francesco Ciambra, Roberto Basili
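A much simplified stand-in for the kernel-based requirement classification described above might look as follows; this is not the paper's distributional kernel but a linear SVM over TF-IDF n-grams, and the tiny training set is invented for illustration.

```python
# Minimal sketch (assumed simplification): a linear SVM over TF-IDF features
# that flags requirement sentences expressing a norm (constraint) versus
# plain descriptions. The example sentences are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

requirements = [
    "The system shall log every operator command.",       # norm
    "The radar console displays contact tracks.",          # description
    "The operator must confirm weapon release manually.",  # norm
    "The combat system integrates several sensor feeds.",  # description
]
labels = ["norm", "description", "norm", "description"]

# Train the classifier on the labelled requirement sentences.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(requirements, labels)

# Classify a new requirement sentence.
print(clf.predict(["Users shall authenticate before accessing the console."]))
```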
Automatic Generation and Reranking of SQL-Derived Answers to NL Questions
Abstract
In this paper, given a relational database, we automatically translate a natural language question into an SQL query retrieving the correct answer. We exploit the structure of the DB to generate a set of candidate SQL queries, which we rerank with an SVM ranker based on tree kernels. In particular, we use linguistic dependencies in the natural language question and the DB metadata to build a set of plausible SELECT, WHERE and FROM clauses enriched with meaningful joins. Then, we combine all the clauses to get the set of all possible SQL queries, producing the candidate queries for answering the question. This approach can be applied recursively to deal with complex questions requiring nested queries. We sort the candidates by correctness scores obtained from a weighting scheme applied to the query generation rules. Then, we use an SVM ranker trained with structural kernels to reorder the list of question and query pairs, where both members are represented as syntactic trees. The F-measure of our model on standard benchmarks is in line with that of the best models (85% on the first question), which use external and expensive hand-crafted resources such as semantic interpretation. Moreover, we can provide a set of candidate answers with an answer recall of about 92% and 96% on the first 2 and 5 candidates, respectively.
Alessandra Giordani, Alessandro Moschitti
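The clause-composition step can be pictured as in the sketch below; the clause sets, weights and example question are invented rather than taken from the paper, and the tree-kernel reranking stage that follows in the paper is omitted.

```python
# Minimal sketch of composing candidate SQL queries from plausible SELECT,
# FROM and WHERE clauses and sorting them by a heuristic score. Weights
# stand in for the (hypothetical) confidences of the generation rules.
from itertools import product

question = "Which states border Texas?"

selects = [("SELECT border_info.border", 0.9), ("SELECT state.name", 0.4)]
froms   = [("FROM border_info", 0.8), ("FROM state", 0.5)]
wheres  = [("WHERE border_info.state_name = 'texas'", 0.9),
           ("WHERE state.name = 'texas'", 0.3)]

def candidates():
    # Combine every clause triple into a candidate query with a combined score.
    for (s, ws), (f, wf), (w, ww) in product(selects, froms, wheres):
        yield ws * wf * ww, f"{s} {f} {w}"

print("Question:", question)
for score, query in sorted(candidates(), reverse=True)[:3]:
    print(f"{score:.2f}  {query}")
```

In the full approach, this ranked list would then be reordered by an SVM using structural kernels over the question/query tree pairs.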
Assessment of Software Testing and Quality Assurance in Natural Language Processing Applications and a Linguistically Inspired Approach to Improving It
Abstract
Significant progress has been made in addressing the scientific challenges of biomedical text mining. However, the transition from a demonstration of scientific progress to the production of tools on which a broader community can rely requires that fundamental software engineering requirements be addressed. In this paper we characterize the state of biomedical text mining software with respect to software testing and quality assurance. Biomedical natural language processing software was chosen because it frequently makes the specific claim of offering production-quality services, rather than just research prototypes.
We examined twenty web sites offering a variety of text mining services. On each web site, we performed the most basic software test known to us and classified the results. Seven out of twenty web sites returned either bad results or the worst class of results in response to this simple test. We conclude that biomedical natural language processing tools require greater attention to software quality.
We suggest a linguistically motivated approach to granular evaluation of natural language processing applications, and show how it can be used to detect performance errors of several systems and to predict overall performance on specific equivalence classes of inputs.
We also assess the ability of linguistically-motivated test suites to provide good software testing, as compared to large corpora of naturally-occurring data. We measure code coverage and find that it is considerably higher when even small structured test suites are utilized than when large corpora are used.
K. Bretonnel Cohen, Lawrence E. Hunter, Martha Palmer
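A small, linguistically motivated test suite of the kind advocated above might look like the following pytest sketch; extract_gene_mentions is a hypothetical toy extractor (not a system from the paper), and the punctuation case is deliberately written to fail on it, illustrating how an equivalence-class suite flags a performance error.

```python
# Minimal sketch of a structured, equivalence-class test suite for an NLP
# component. The extractor is a toy stand-in for a text-mining service.
import pytest

def extract_gene_mentions(text: str) -> list[str]:
    """Toy extractor: returns whitespace tokens that look like gene symbols."""
    return [tok for tok in text.split() if tok.isupper() and tok.isalnum()]

# Each equivalence class pairs one representative input with its expected output.
CASES = {
    "empty input (the most basic test)": ("", []),
    "single mention":                    ("BRCA1 is studied.", ["BRCA1"]),
    # Intentionally fails on the toy extractor, flagging its punctuation handling.
    "mention followed by punctuation":   ("Mutations in TP53, too.", ["TP53"]),
    "no mention":                        ("the patient improved", []),
}

@pytest.mark.parametrize("text,expected", list(CASES.values()), ids=list(CASES.keys()))
def test_equivalence_class(text, expected):
    assert extract_gene_mentions(text) == expected
```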
Supporting Agile Software Development by Natural Language Processing
Abstract
Agile software development puts more emphasis on working programs than on documentation. However, this may cause complications from the management perspective when an overview of the progress achieved within a project needs to be provided. In this paper, we outline the potential for applying natural language processing (NLP) to support agile development. We point out that, using NLP, the artifacts created during agile software development activities can be traced back to the requirements expressed in user stories. This makes it possible to determine how far the project has progressed in terms of realized requirements.
Barbara Plank, Thomas Sauer, Ina Schaefer
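One simple way to approximate the proposed traceability, assumed here for illustration rather than taken from the paper, is TF-IDF cosine similarity between development artifacts and user stories; the stories, commit messages and threshold below are invented.

```python
# Minimal sketch (assumed approach): trace development artifacts back to
# user stories via TF-IDF cosine similarity, giving a rough view of which
# stories already have supporting work.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

stories = [
    "As a customer I want to reset my password via email.",
    "As an admin I want to export monthly usage reports.",
]
artifacts = [
    "Add password reset token generation and email template",
    "Fix login page typo",
]

vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(stories + artifacts)
sims = cosine_similarity(matrix[: len(stories)], matrix[len(stories):])

for i, story in enumerate(stories):
    best = sims[i].argmax()
    covered = sims[i, best] > 0.2   # assumed similarity threshold
    status = "covered" if covered else "open"
    print(f"{status:8} {story!r} <- {artifacts[best]!r}")
```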

Roadmap for Future Research

Anomaly Detection in the Cloud: Detecting Security Incidents via Machine Learning
Abstract
Cloud computing is now on the verge of being embraced as a serious usage model. However, while outsourcing services and workflows into the cloud provides indisputable benefits in terms of cost flexibility and scalability, there has been little progress in security (which can influence reliability), transparency and incident handling. The problem of applying existing security tools in the cloud is twofold. First, these tools do not consider the specific attacks and challenges of cloud environments, e.g., cross-VM side-channel attacks. Second, these tools focus on attacks and threats at only one layer of abstraction, e.g., the network, the service, or the workflow layer. Thus, the semantic gap between events and alerts at different layers is still an open issue. The aim of this paper is to present ongoing work towards a Monitoring-as-a-Service anomaly detection framework in a hybrid or public cloud. The goal of our framework is twofold. First, it closes the gap between incidents at different layers of cloud-sourced workflows, namely we focus on both the workflow and the infrastructure layers. Second, our framework tackles challenges stemming from cloud usage, like multi-tenancy. Our framework uses complex event processing rules and machine learning to detect incidents and populate user-specified metrics that can be used to assess the security status of the monitored system.
Matthias Gander, Michael Felderer, Basel Katt, Adrian Tolbaru, Ruth Breu, Alessandro Moschitti
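A minimal sketch of the metric-population idea follows, under assumed event fields and a simple deviation rule rather than the paper's complex event processing rules and learned models; all data is invented.

```python
# Minimal sketch (assumed design): correlate events from the workflow and
# infrastructure layers of a cloud-sourced workflow into a per-tenant metric
# and flag anomalies when an observation deviates strongly from its history.
from collections import defaultdict
from statistics import mean, pstdev

events = [
    {"tenant": "acme", "layer": "workflow",       "latency_ms": 120},
    {"tenant": "acme", "layer": "infrastructure", "latency_ms": 130},
    {"tenant": "acme", "layer": "workflow",       "latency_ms": 125},
    {"tenant": "acme", "layer": "infrastructure", "latency_ms": 900},  # suspicious spike
]

history = defaultdict(list)  # per-tenant latency history (the metric)

def process(event, k=3.0):
    """Flag an event whose latency exceeds the tenant's running mean by k sigmas."""
    series = history[event["tenant"]]
    anomalous = (
        len(series) >= 3
        and event["latency_ms"] > mean(series) + k * pstdev(series)
    )
    series.append(event["latency_ms"])
    return anomalous

for e in events:
    if process(e):
        print("ALERT:", e)
```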
Using Machine Learning and Information Retrieval Techniques to Improve Software Maintainability
Abstract
In this paper, we investigate some ideas based on Machine Learning, Natural Language Processing, and Information Retrieval to outline possible research directions in the field of software architecture recovery and clone detection. In particular, after presenting an extensive review of related work, we illustrate two proposals for addressing these issues, which represent hot topics in the field of Software Maintenance. Both proposals use Kernel Methods to exploit structural representations of source code and to automate the detection of clones and the recovery of the architecture actually implemented in a subject software system.
Anna Corazza, Sergio Di Martino, Valerio Maggio, Alessandro Moschitti, Andrea Passerini, Giuseppe Scanniello, Fabrizio Silvestri
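As a toy stand-in for the kernel-based structural comparison that such clone detection relies on, the sketch below compares the ASTs of two Python functions by the overlap of their parent/child node-type pairs; it is illustrative only and is not the authors' method.

```python
# Minimal sketch: crude structural similarity between Python functions,
# computed on their ASTs, as a stand-in for a structural kernel in clone
# detection. Renamed clones share almost all structure and score high.
import ast

def structure_pairs(source: str) -> set:
    """Collect (parent type, child type) pairs from the code's AST."""
    tree = ast.parse(source)
    pairs = set()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            pairs.add((type(parent).__name__, type(child).__name__))
    return pairs

def similarity(src_a: str, src_b: str) -> float:
    a, b = structure_pairs(src_a), structure_pairs(src_b)
    return len(a & b) / len(a | b)

f1 = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
f2 = "def acc(values):\n    r = 0\n    for v in values:\n        r += v\n    return r\n"  # renamed clone
f3 = "def greet(name):\n    return 'hi ' + name\n"

print(similarity(f1, f2), similarity(f1, f3))  # the clone pair scores much higher
```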
The EternalS Roadmap – Defining a Research Agenda for Eternal Systems
Abstract
Science, technology and business are increasingly dependent on software. This trend is driven by increasing system size, complexity, diversity and flexibility, and by the obligation for tailored integration of end-users, processes and evolving technologies. The complexity scale of current systems exceeds our current understanding of systems engineering, and the number of system parameters to be controlled as part of the overall design process exceeds the performance of the associated tools and techniques we are using. This leads to excessive software maintenance costs and to system degradation over its lifetime. The tools and techniques must evolve to take into account this increase in system, software and architecture scale and complexity. Software-intensive systems must be flexible to accommodate a range of requirements and operating conditions, and capable of evolving to allow these parameters to change over time. Software Engineering approaches to reusability and maintenance must cope with the dynamics and longevity of future software applications and infrastructures, e.g., for the Future Internet, e-commerce, e-health, and e-government. The EternalS project is developing a roadmap for the next two decades to inspire a research agenda for software and systems engineering to help address these issues. This paper presents some of the key issues outlined above, the roadmapping process and some of the key findings to date.
Robert Mullins
Backmatter
Metadata
Title: Trustworthy Eternal Systems via Evolving Software, Data and Knowledge
Edited by: Alessandro Moschitti, Barbara Plank
Copyright year: 2013
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-45260-4
Print ISBN: 978-3-642-45259-8
DOI: https://doi.org/10.1007/978-3-642-45260-4
