ABSTRACT
Two of the attractions of search-based software engineering (SBSE) derive from the nature of the fitness functions used to guide the search. These have proved to be highly robust (for a variety of different search algorithms) and have yielded insight into the nature of the search space itself, shedding light upon the software engineering problem in hand. This paper aims to exploit these two benefits of SBSE in the context of search-based module clustering. It presents empirical results that compare the robustness of two fitness functions used for software module clustering: MQ, which has been used exclusively for module clustering, and EVM, a clustering fitness function previously applied to time series and gene expression data. The results show that both metrics are relatively robust in the presence of noise, with EVM being the more robust of the two. The results may also yield some interesting insights into the nature of software graphs.
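The abstract names MQ and EVM without defining them. The following is a minimal sketch of how the two fitness functions are commonly formulated in the module clustering literature, not the authors' implementation. It assumes MQ is the sum over clusters of i / (i + j/2), where i counts edges inside a cluster and j counts edges crossing its boundary, and that EVM scores every pair of nodes within a cluster as +1 if they are linked and -1 otherwise. The function names `mq` and `evm` and the example graph are hypothetical.

```python
from itertools import combinations


def mq(edges, clusters):
    """Assumed Bunch-style MQ: sum over clusters of i / (i + j/2)."""
    # Map each node to the index of the cluster that contains it.
    label = {node: k for k, members in enumerate(clusters) for node in members}
    intra = [0] * len(clusters)  # edges with both ends in the cluster
    inter = [0] * len(clusters)  # edges with exactly one end in the cluster
    for a, b in edges:
        ka, kb = label[a], label[b]
        if ka == kb:
            intra[ka] += 1
        else:
            inter[ka] += 1
            inter[kb] += 1
    # A cluster with no internal edges contributes 0 by convention.
    return sum(i / (i + j / 2) for i, j in zip(intra, inter) if i > 0)


def evm(edges, clusters):
    """Assumed EVM: +1 per linked pair, -1 per unlinked pair, within clusters."""
    linked = {frozenset(edge) for edge in edges}  # edge direction is ignored
    score = 0
    for members in clusters:
        for a, b in combinations(sorted(members), 2):
            score += 1 if frozenset((a, b)) in linked else -1
    return score


# Hypothetical five-module graph: a triangle a-b-c attached to a chain c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
good = [{"a", "b", "c"}, {"d", "e"}]
single = [{"a", "b", "c", "d", "e"}]

print(mq(edges, good), evm(edges, good))
print(mq(edges, single), evm(edges, single))
```

Under these assumed definitions both functions prefer the two-cluster partition to the single cluster, but they penalise differently: MQ reacts to edges that cross cluster boundaries, while EVM reacts to missing edges inside a cluster.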
Index Terms
- An empirical study of the robustness of two module clustering fitness functions