Published in: Empirical Software Engineering, Issue 2/2010

1 April 2010

Improving automated requirements trace retrieval: a study of term-based enhancement methods

Authors: Xuchang Zou, Raffaella Settimi, Jane Cleland-Huang


Abstract

Automated requirements traceability methods that use Information Retrieval (IR) techniques to generate and maintain traceability links are often more efficient than traditional manual approaches. However, the traces they generate are imprecise, and significant human effort is needed to evaluate and filter the results. This paper investigates and compares three term-based enhancement methods that are designed to improve the performance of a probabilistic automated tracing tool. Empirical studies show that the enhancement methods can be effective in increasing the accuracy of the retrieved traces; however, the effectiveness of each method varies according to specific project characteristics. The analysis of these characteristics has led to the development of two new project-level metrics that can be used to predict the effectiveness of each enhancement method for a given data set. A procedure to automatically extract critical keywords and phrases from a set of traceable artifacts is also presented to enhance the automated trace retrieval algorithm. The procedure is tested on two new datasets.
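
To make the idea of term-based trace retrieval concrete, the following is a minimal sketch and not the probabilistic tool studied in the paper. It assumes a plain tf-idf weighting with cosine similarity in place of the paper's probabilistic model, and the `boosted_terms` and `boost` parameters are hypothetical stand-ins for a term-based enhancement that gives extra weight to critical keywords. All function names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs, boosted_terms=frozenset(), boost=2.0):
    """Build one tf-idf vector (dict: term -> weight) per document.

    Terms listed in `boosted_terms` have their weight multiplied by `boost`.
    Both parameters are assumptions made for this sketch only.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        vec = {}
        for term, tf in Counter(tokens).items():
            # Smoothed inverse document frequency, always positive.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            weight = tf * idf
            if term in boosted_terms:
                weight *= boost
            vec[term] = weight
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query, artifacts, boosted_terms=frozenset(), boost=2.0):
    """Return (artifact index, score) pairs sorted by decreasing similarity."""
    vectors = tfidf_vectors([query] + list(artifacts), boosted_terms, boost)
    query_vec, artifact_vecs = vectors[0], vectors[1:]
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(artifact_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    requirement = "The system shall encrypt user passwords"
    artifacts = [
        "class PasswordEncryptor encrypts user passwords",
        "class ReportPrinter prints monthly reports",
    ]
    for index, score in rank_candidates(requirement, artifacts, {"passwords"}):
        print(index, round(score, 3))
```

In a sketch like this, an analyst would still review the ranked candidate links, which is the human filtering effort the abstract refers to. A boost only reorders candidates that already share terms with the requirement; it cannot surface artifacts with no term overlap.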

Metadata
Title
Improving automated requirements trace retrieval: a study of term-based enhancement methods
Authors
Xuchang Zou
Raffaella Settimi
Jane Cleland-Huang
Publication date
1 April 2010
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 2/2010
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-009-9114-z
