Skip to main content

2014 | Buch

Search-Based Software Engineering

6th International Symposium, SSBSE 2014, Fortaleza, Brazil, August 26-29, 2014. Proceedings

herausgegeben von: Claire Le Goues, Shin Yoo

Verlag: Springer International Publishing

Buchreihe : Lecture Notes in Computer Science

insite
SUCHEN

Über dieses Buch

This book constitutes the refereed proceedings of the 6th International Symposium on Search-Based Software Engineering, SSBSE 2014, held in Fortaleza, Brazil.

The 14 revised full papers presented together with 2 keynote addresses, 1 invited talk, 1 short paper, 3 papers of the graduate track, and 4 challenge track papers were carefully reviewed and selected from 51 submissions. Search Based Software Engineering (SBSE) studies the application of meta-heuristic optimization techniques to various software engineering problems, ranging from requirements engineering to software testing and maintenance.

Inhaltsverzeichnis

Frontmatter

Full Research Papers

On the Effectiveness of Whole Test Suite Generation
Abstract
A common application of search-based software testing is to generate test cases for all goals defined by a coverage criterion (e.g., statements, branches, mutants). Rather than generating one test case at a time for each of these goals individually, whole test suite generation optimizes entire test suites towards satisfying all goals at the same time. There is evidence that the overall coverage achieved with this approach is superior to that of targeting individual coverage goals. Nevertheless, there remains some uncertainty on whether the whole test suite approach might be inferior to a more focused search in the case of particularly difficult coverage goals. In this paper, we perform an in-depth analysis to study if this is the case. An empirical study on 100 Java classes reveals that indeed there are some testing goals that are easier to cover with the traditional approach. However, their number is not only very small in comparison with those which are only covered by the whole test suite approach, but also those coverage goals appear in small classes for which both approaches already obtain high coverage.
Andrea Arcuri, Gordon Fraser
Detecting Program Execution Phases Using Heuristic Search
Abstract
Understanding a program from its execution traces is extremely difficult because a trace consists of thousands to millions of events, such as method calls, object creation and destruction, etc. Nonetheless, execution traces can provide valuable information, once abstracted from their low-level events. We propose to identify feature-level phases based on events collected from traces of the program execution. We cast our approach in an optimization problem, searching through the dynamic information provided by the program’s execution traces to form a set of phases that minimizes coupling while maximizing cohesion. We applied and evaluated our search algorithms on different execution scenarios of JHotDraw and Pooka.
Omar Benomar, Houari Sahraoui, Pierre Poulin
On the Use of Machine Learning and Search-Based Software Engineering for Ill-Defined Fitness Function: A Case Study on Software Refactoring
Abstract
The most challenging step when adapting a search-based technique for a software engineering problem is the definition of the fitness function. For several software engineering problems, a fitness function is ill-defined, subjective, or difficult to quantify. For example, the evaluation of a software design is subjective. This paper introduces the use of a neural network-based fitness function for the problem of software refactoring. The software engineers evaluate manually the suggested refactoring solutions by a Genetic Algorithm (GA) for few iterations then an Artificial Neural Network (ANN) uses these training examples to evaluate the refactoring solutions for the remaining iterations. We evaluate the efficiency of our approach using six different open-source systems through an empirical study and compare the performance of our technique with several existing refactoring studies.
Boukhdhir Amal, Marouane Kessentini, Slim Bechikh, Josselin Dea, Lamjed Ben Said
Producing Just Enough Documentation: The Next SAD Version Problem
Abstract
Software architecture knowledge is an important asset in today’s projects, as it serves to share the main design decisions among the project stakeholders. Architectural knowledge is commonly captured by the Software Architecture Document (SAD), an artifact that is useful but can also be costly to produce and maintain. In practice, the SAD often fails to fulfill its mission of addressing the stakeholders’ information needs, due to factors such as: detailed or high-level contents that do not consider all stakeholders, outdated documentation, or documentation generated late in the lifecycle, among others. To alleviate this problem, we propose a documentation strategy that seeks to balance the stakeholders’ interests in the SAD against the efforts of producing it. Our strategy is cast as an optimization problem called ”the next SAD version problem” (NSVP) and several search-based techniques for it are discussed. A preliminary evaluation of our approach has shown its potential for exploring cost-benefit tradeoffs in documentation production.
J. Andres Diaz-Pace, Matias Nicoletti, Silvia Schiaffino, Santiago Vidal
A Multi-model Optimization Framework for the Model Driven Design of Cloud Applications
Abstract
The rise and adoption of the Cloud computing paradigm had a strong impact on the ICT world in the last few years; this technology has now reached maturity and Cloud providers offer a variety of solutions and services to their customers. However, beside the advantages, Cloud computing introduced new issues and challenges. In particular, the heterogeneity of the Cloud services offered and their relative pricing models makes the identification of a deployment solution that minimizes costs and guarantees QoS very complex. Performance assessment of Cloud based application needs for new models and tools to take into consideration the dynamism and multi-tenancy intrinsic of the Cloud environment. The aim of this work is to provide a novel mixed integer linear program (MILP) approach to find a minimum cost feasible cloud configuration for a given cloud based application. The feasibility of the solution is considered with respect to some non-functional requirements that are analyzed through multiple performance models with different levels of accuracy. The initial solution is further improved by a local search based procedure. The quality of the initial feasible solution is compared against first principle heuristics currently adopted by practitioners and Cloud providers.
Danilo Ardagna, Giovanni Paolo Gibilisco, Michele Ciavotta, Alexander Lavrentev
A Pattern-Driven Mutation Operator for Search-Based Product Line Architecture Design
Abstract
The application of design patterns through mutation operators in search-based design may improve the quality of the architectures produced in the evolution process. However, we did not find, in the literature, works applying such patterns in the optimization of Product Line Architecture (PLA). Existing works offer manual approaches, which are not search-based, and only apply specific patterns in particular domains. Considering this fact, this paper introduces a meta-model and a mutation operator to allow the design patterns application in the search-based PLA design. The model represents suitable scopes, that is, set of architectural elements that are suitable to receive a pattern. The mutation operator is used with a multi-objective and evolutionary approach to obtain PLA alternatives. Quantitative and qualitative analysis of empirical results show an improvement in the quality of the obtained solutions.
Giovani Guizzo, Thelma Elita Colanzi, Silvia Regina Vergilio
Mutation-Based Generation of Software Product Line Test Configurations
Abstract
Software Product Lines (SPLs) are families of software products that can be configured and managed through a combination of features. Such products are usually represented with a Feature Model (FM). Testing the entire SPL may not be conceivable due to economical or time constraints and, more simply, because of the large number of potential products. Thus, defining methods for generating test configurations is required, and is now a very active research topic for the testing community. In this context, mutation has recently being advertised as a promising technique. Mutation evaluates the ability of the test suite to detect defective versions of the FM, called mutants. In particular, it has been shown that existing test configurations achieving the mutation criterion correlate with fault detection. Despite the potential benefit of mutation, there is no approach which aims at generating test configurations for SPL with respect to the mutation criterion. In this direction, we introduce a search-based approach which explores the SPL product space to generate product test configurations with the aim of detecting mutants.
Christopher Henard, Mike Papadakis, Yves Le Traon
Multi-objective Genetic Optimization for Noise-Based Testing of Concurrent Software
Abstract
Testing of multi-threaded programs is a demanding work due to the many possible thread interleavings one should examine. The noise injection technique helps to increase the number of thread interleavings examined during repeated test executions provided that a suitable setting of noise injection heuristics is used. The problem of finding such a setting, i.e., the so called test and noise configuration search problem (TNCS problem), is not easy to solve. In this paper, we show how to apply a multi-objective genetic algorithm (MOGA) to the TNCS problem. In particular, we focus on generation of TNCS solutions that cover a high number of distinct interleavings (especially those which are rare) and provide stable results at the same time. To achieve this goal, we study suitable metrics and ways how to suppress effects of non-deterministic thread scheduling on the proposed MOGA-based approach. We also discuss a choice of a concrete MOGA and its parameters suitable for our setting. Finally, we show on a set of benchmark programs that our approach provides better results when compared to the commonly used random approach as well as to the sooner proposed use of a single-objective genetic approach.
Vendula Hrubá, Bohuslav Křena, Zdeněk Letko, Hana Pluháčková, Tomáš Vojnar
Bi-objective Genetic Search for Release Planning in Support of Themes
Abstract
Release planning is a mandatory part of incremental and iterative software development. For the decision about which features should be implemented next, the values of features need to be balanced with the effort and readiness of their implementation. Traditional planning looks at the sum of the values of individual and potentially isolated features. As an alternative idea, a theme is a meta-functionality which integrates a number of individual features under a joint umbrella. That way, possible value synergies from offering features in conjunction (theme-related) can be utilized.
In this paper, we model theme-based release planning as a bi-objective (search-based) optimization problem. Each solution of this optimization problem balances the preference between individual and theme-based planning objectives. We apply a two-stage solution approach. In Phase 1, the existing Non-dominated Sorting Genetic Algorithm-II (NSGA-II) is adapted. Subsequently, the problem of guiding the user in the set of non-dominated solutions is addressed in Phase 2. We propose and explore two alternative ways to select among the (potentially large number) Pareto-optimal solutions. The applicability and empirical analysis of the proposed approach is evaluated for two explorative case study projects having 50 resp. 25 features grouped around 8 resp. 5 themes.
Muhammad Rezaul Karim, Guenther Ruhe
Combining Stochastic Grammars and Genetic Programming for Coverage Testing at the System Level
Abstract
When tested at the system level, many programs require complex and highly structured inputs, which must typically satisfy some formal grammar. Existing techniques for grammar based testing make use of stochastic grammars that randomly derive test sentences from grammar productions, trying at the same time to avoid unbounded recursion. In this paper, we combine stochastic grammars with genetic programming, so as to take advantage of the guidance provided by a coverage oriented fitness function during the sentence derivation and evolution process. Experimental results show that the combination of stochastic grammars and genetic programming outperforms stochastic grammars alone.
Fitsum Meshesha Kifetew, Roberto Tiella, Paolo Tonella
Feature Model Synthesis with Genetic Programming
Abstract
Search-Based Software Engineering (SBSE) has proven successful on several stages of the software development life cycle. It has also been applied to different challenges in the context of Software Product Lines (SPLs) like generating minimal test suites. When reverse engineering SPLs from legacy software an important challenge is the reverse engineering of variability, often expressed in the form of Feature Models (FMs). The synthesis of FMs has been studied with techniques such as Genetic Algorithms. In this paper we explore the use of Genetic Programming for this task. We sketch our general workflow, the GP pipeline employed, and its evolutionary operators. We report our experience in synthesizing feature models from sets of feature combinations for 17 representative feature models, and analyze the results using standard information retrieval metrics.
Lukas Linsbauer, Roberto Erick Lopez-Herrejon, Alexander Egyed
A Robust Multi-objective Approach for Software Refactoring under Uncertainty
Abstract
Refactoring large systems involves several sources of uncertainty related to the severity levels of code smells to be corrected and the importance of the classes in which the smells are located. Due to the dynamic nature of software development, these values cannot be accurately determined in practice, leading to refactoring sequences that lack robustness. To address this problem, we introduced a multi-objective robust model, based on NSGA-II, for the software refactoring problem that tries to find the best trade-off between quality and robustness. We evaluated our approach using six open source systems and demonstrated that it is significantly better than state-of-the-art refactoring approaches in terms of robustness in 100% of experiments based on a variety of real-world scenarios. Our suggested refactoring solutions were found to be comparable in terms of quality to those suggested by existing approaches and to carry an acceptable robustness price. Our results also revealed an interesting feature about the trade-off between quality and robustness that demonstrates the practical value of taking robustness into account in software refactoring.
Mohamed Wiem Mkaouer, Marouane Kessentini, Slim Bechikh, Mel Ó Cinnéide
Towards Automated A/B Testing
Abstract
User-intensive software, such asWeb andmobile applications, heavily depends on the interactions with large and unknown populations of users. Knowing the preferences and behaviors of these populations is crucial for the success of this class of systems. A/B testing is an increasingly popular technique that supports the iterative development of userintensive software based on controlled experiments performed on live users. However, as currently performed, A/B testing is a time consuming, error prone and costly manual activity. In this paper, we investigate a novel approach to automate A/B testing. More specifically, we rephrase A/B testing as a search-based software engineering problem and we propose an initial approach that supports automated A/B testing through aspect-oriented programming and genetic algorithms.
Giordano Tamburrelli, Alessandro Margara
Random-Weighted Search-Based Multi-objective Optimization Revisited
Abstract
Weight-based multi-objective optimization requires assigning appropriate weights using a weight strategy to each of the objectives such that an overall optimal solution can be obtained with a search algorithm. Choosing weights using an appropriate weight strategy has a huge impact on the obtained solutions and thus warrants the need to seek the best weight strategy. In this paper, we propose a new weight strategy called Uniformly Distributed Weights (UDW), which generates weights from uniform distribution, while satisfying a set of user-defined constraints among various cost and effectiveness measures. We compare UDW with two commonly used weight strategies, i.e., Fixed Weights (FW) and Randomly-Assigned Weights (RAW), based on five cost/effectiveness measures for an industrial problem of test minimization defined in the context of Video Conferencing System Product Line developed by Cisco Systems. We empirically evaluate the performance of UDW, FW, and RAW in conjunction with four search algorithms ((1+1) Evolutionary Algorithm (EA), Genetic Algorithm, Alternating Variable Method, and Random Search) using the industrial case study and 500 artificial problems of varying complexity. Results show that UDW along with (1+1) EA achieves the best performance among the other combinations of weight strategies and algorithms.
Shuai Wang, Shaukat Ali, Arnaud Gotlieb

Short Papers

A New Learning Mechanism for Resolving Inconsistencies in Using Cooperative Co-evolution Model
Abstract
Many aspects of Software Engineering problems lend themselves to a coevolutionary model of optimization because software systems are complex and rich in potential population that could be productively coevolved. Most of these aspects can be coevolved to work better together in a cooperative manner. Compared with the simple and common used predator-prey co-evolution model, cooperative co-evolution model has more challenges that need to be addressed. One of these challenges is how to resolve the inconsistencies between two populations in order to make them work together with no conflict. In this position paper, we propose a new learning mechanism based on Baldwin effect, and introduce the learning genetic operators to address the inconsistency issues. A toy example in the field of automated architectural synthesis is provided to describe the use of our proposed approach.
Yongrui Xu, Peng Liang

Graduate Student Track Papers

Improving Heuristics for the Next Release Problem through Landscape Visualization
Abstract
The selection of the requirements to be included in the next release of a software is a complex task. Each customer has their needs, but it is usually impossible to fulfill all of them due to constraints such as budget availability. The Next Release Problem (NRP) aims to select the requirements that maximize customer’s benefit, while minimizing development effort. Visualizing the problem search space from a customer concentration perspective, we observed a recurring behavior in all instances analyzed. This work presents these findings and shows some initial results of a Hill Climbing algorithm modified to take advantage of this pattern. The modified algorithm was able to generate solutions that are statistically better than those generated by the original Hill Climbing.
Richard Fuchshuber, Márcio de Oliveira Barros
Machine Learning for User Modeling in an Interactive Genetic Algorithm for the Next Release Problem
Abstract
The Next Release Problem consists in selecting which requirements will be implemented in the next software release. For many SBSE approaches to the NRP, there is still a lack of ability to efficiently include the human opinion and its peculiarities in the search process. Most of these difficulties are due to the problem of the human fatigue. Thus, it is proposed the use of a machine learning technique to model the user and replace him in an Interactive Genetic Algorithm to the NRP. Intermediate results are able to show that an IGA can succesfully incorporate the user preferences in the final solution.
Allysson Allex Araújo, Matheus Paixão
Transaction Profile Estimation of Queueing Network Models for IT Systems Using a Search-Based Technique
Abstract
The software and hardware systems required to deliver modern Web based services are becoming increasingly complex. Management and evolution of the systems requires periodic analysis of performance and capacity to maintain quality of service and maximise efficient use of resources. In this work we present a method that uses a repeated local search technique to improve the accuracy of modelling such systems while also reducing the complexity and time required to perform this task. The accuracy of the model derived from the search-based approach is validated by extrapolating the performance to multiple load levels which enables system capacity and performance to be planned and managed more efficiently.
Shadi Ghaith, Miao Wang, Philip Perry, John Murphy

SBSE Challenge Track Papers

Less is More: Temporal Fault Predictive Performance over Multiple Hadoop Releases
Abstract
We investigate search based fault prediction over time based on 8 consecutive Hadoop versions, aiming to analyse the impact of chronology on fault prediction performance. Our results confound the assumption, implicit in previous work, that additional information from historical versions improves prediction; though G-mean tends to improve, Recall can be reduced.
Mark Harman, Syed Islam, Yue Jia, Leandro L. Minku, Federica Sarro, Komsan Srivisut
Babel Pidgin: SBSE Can Grow and Graft Entirely New Functionality into a Real World System
Abstract
Adding new functionality to an existing, large, and perhaps poorly-understood system is a challenge, even for the most competent human programmer. We introduce a ‘grow and graft’ approach to Genetic Improvement (GI) that transplants new functionality into an existing system. We report on the trade offs between varying degrees of human guidance to the GI transplantation process. Using our approach, we successfully grew and transplanted a new ‘Babel Fish’ linguistic translation feature into the Pidgin instant messaging system, creating a genetically improved system we call ‘Babel Pidgin’. This is the first time that SBSE has been used to evolve and transplant entirely novel functionality into an existing system. Our results indicate that our grow and graft approach requires surprisingly little human guidance.
Mark Harman, Yue Jia, William B. Langdon
Pidgin Crasher: Searching for Minimised Crashing GUI Event Sequences
Abstract
We present a search based testing system that automatically explores the space of all possible GUI event interleavings. Search guides our system to novel crashing sequences using Levenshtein distance and minimises the resulting fault-revealing UI sequences in a post-processing hill climb. We report on the application of our system to the SSBSE 2014 challenge program, Pidgin. Overall, our Pidgin Crasher found 20 different events that caused 2 distinct kinds of bugs, while the event sequences that caused them were reduced by 84% on average using our minimisation post processor.
Haitao Dan, Mark Harman, Jens Krinke, Lingbo Li, Alexandru Marginean, Fan Wu
Repairing and Optimizing Hadoop hashCode Implementations
Abstract
We describe how contract violations in JavaTM hashCode methods can be repaired using novel combination of semantics-preserving and generative methods, the latter being achieved via Automatic Improvement Programming. The method described is universally applicable. When applied to the Hadoop platform, it was established that it produces hashCode functions that are at least as good as the original, broken method as well as those produced by a widely-used alternative method from the ‘Apache Commons’ library.
Zoltan A. Kocsis, Geoff Neumann, Jerry Swan, Michael G. Epitropakis, Alexander E. I. Brownlee, Sami O. Haraldsson, Edward Bowles
Erratum: Repairing and Optimizing Hadoop hashCode Implementations
Abstract
In the original version, the name of the sixth author is incorrect. Instead of “Sami O. Haraldsson” it should be read as “Saemundur O. Haraldsson”.
Zoltan A. Kocsis, Geoff Neumann, Jerry Swan, Michael G. Epitropakis, Alexander E. I. Brownlee, Saemundur O. Haraldsson, Edward Bowles
Backmatter
Metadaten
Titel
Search-Based Software Engineering
herausgegeben von
Claire Le Goues
Shin Yoo
Copyright-Jahr
2014
Verlag
Springer International Publishing
Electronic ISBN
978-3-319-09940-8
Print ISBN
978-3-319-09939-2
DOI
https://doi.org/10.1007/978-3-319-09940-8