Published in: Automated Software Engineering 3/2019

12 June 2019

An NLP approach for cross-domain ambiguity detection in requirements engineering

Authors: Alessio Ferrari, Andrea Esuli


Abstract

During requirements elicitation, different stakeholders with diverse backgrounds and skills need to communicate effectively to reach a shared understanding of the problem at hand. Linguistic ambiguity due to terminological discrepancies may occur between stakeholders who belong to different technical domains. If not properly addressed, ambiguity can create frustration and distrust during requirements elicitation meetings, and lead to problems at later stages of development. This paper presents a natural language processing approach to identify ambiguous terms between different domains, and rank them by ambiguity score. The approach is based on building domain-specific language models, one for each stakeholder's domain. Word embeddings from each language model are compared in order to measure the differences in how a term is used, thus estimating its potential ambiguity across the domains of interest. We evaluate the approach on seven potential elicitation scenarios involving five domains. In the evaluation, we compare the ambiguity rankings automatically produced with the ones manually obtained by the authors as well as by multiple annotators recruited through Amazon Mechanical Turk. The rankings produced by the approach lead to a maximum Kendall's Tau of 88%. However, for several elicitation scenarios, the application of the approach was unsuccessful in terms of performance. Analysis of the agreement among annotators and of the observed inaccuracies offers hints for further research on the relationship between domain knowledge and natural language ambiguity.
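To make the idea of ranking terms by an ambiguity score concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which measure is used to compare embeddings, so the score here (one minus cosine similarity) is an assumption, as is the premise that the vectors of the two domain models live in a comparable space. The toy vectors, the domain names `cs` and `med`, and the function `rank_by_ambiguity` are all hypothetical.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm


def rank_by_ambiguity(domain_a, domain_b):
    """Rank the terms shared by two domains, most ambiguous first.

    ASSUMPTION: ambiguity is scored as 1 - cosine similarity of the
    term's vectors in the two domains; the source does not specify this.
    """
    shared = domain_a.keys() & domain_b.keys()
    scores = {
        term: 1.0 - cosine_similarity(domain_a[term], domain_b[term])
        for term in shared
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up 3-dimensional embeddings for two hypothetical domain models.
cs = {"virus": [0.9, 0.1, 0.0], "table": [0.2, 0.8, 0.1], "patient": [0.1, 0.2, 0.9]}
med = {"virus": [0.0, 0.2, 0.9], "table": [0.3, 0.7, 0.1], "patient": [0.1, 0.3, 0.9]}

ranking = rank_by_ambiguity(cs, med)
for term, score in ranking:
    print(f"{term}: {score:.3f}")
```

In this toy data the vectors for "virus" point in very different directions in the two domains, so the term ends up at the top of the ranking, while "patient" is used almost identically and ends up at the bottom.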


Footnotes
3
In the pseudo-code, array indexes start from 1.
 
4
When n = 2, the variance is equivalent to the mean squared error.
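The relationship in this footnote can be checked numerically. The sketch below assumes the population variance and assumes that the two domains each contribute a list of scores for the same term (the lists `domain_a` and `domain_b` are made-up values, and both function names are hypothetical). Under those assumptions the variance of each pair of values is a quarter of their squared difference, so the averaged variance equals the mean squared error up to a constant factor and induces the same ranking.

```python
import math
from statistics import pvariance


def mean_pairwise_variance(a, b):
    """Average, over positions, of the population variance of (a[i], b[i])."""
    return sum(pvariance([x, y]) for x, y in zip(a, b)) / len(a)


def mean_squared_error(a, b):
    """Mean of the squared differences between corresponding entries."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical scores of one term in two domains (n = 2).
domain_a = [0.9, 0.4, 0.7]
domain_b = [0.5, 0.6, 0.1]

variance = mean_pairwise_variance(domain_a, domain_b)
mse = mean_squared_error(domain_a, domain_b)

# For two values x and y, the population variance is (x - y)^2 / 4.
print(math.isclose(variance, mse / 4))
```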
 
8
spaCy is also used to identify nouns through POS tagging. The authors of spaCy (Honnibal and Montani 2017) report a 97% accuracy for the pretrained POS tagger for English, which is in line with the state of the art (Manning 2011). Given the high accuracy, we consider the influence of POS tagger errors as negligible.
 
10
In this preliminary task, the authors annotated a group of 60 sentences for the (I3) Medical Software (CS, MED) and 60 sentences for the (M2) Medical Robot scenario (CS, EEN, MEN, MED)—the sentence sets were not included in the evaluation presented in this work. The annotation was followed by a discussion on the annotated data and evident cases of disagreement, which allowed the authors to have an initial common ground to select the examples.
 
11
The original Kendall’s Tau measure does not provide a policy to handle ties.
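One widely used tie-aware variant is Kendall's Tau-b, which shrinks the denominator by the number of tied pairs in each ranking. The footnote does not say which policy the authors adopted, so the following is only an illustrative sketch of Tau-b; the function name `kendall_tau_b` and the sample rankings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's Tau-b for two equal-length score lists, allowing ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1  # pair tied in x (possibly also in y)
        if dy == 0:
            ties_y += 1  # pair tied in y (possibly also in x)
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    denominator = sqrt((total_pairs - ties_x) * (total_pairs - ties_y))
    if denominator == 0:
        raise ValueError("Tau-b is undefined when one list is constant")
    return (concordant - discordant) / denominator


# Made-up rankings: the first contains a tie between two items.
automatic = [1, 2, 2, 3]
manual = [1, 2, 3, 4]
tau = kendall_tau_b(automatic, manual)
print(round(tau, 4))
```

With no ties in either list the denominator reduces to the total number of pairs, and the value coincides with the original Kendall's Tau.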
 
16
Preliminary experiments were performed in this direction, to obtain the currently adopted parameter values. However, we reckon that a systematic evaluation campaign is required, as there may be an optimal parameter selection for each domain group.
 
References
Berry, D.M., Kamsties, E.: Ambiguity in requirements specification. In: do Prado Leite, J.C.S., Doorn, J.H. (eds) Perspectives on Software Requirements. The Springer International Series in Engineering and Computer Science, vol. 753, pp. 7–44. Springer, Boston, MA (2004). https://doi.org/10.1007/978-1-4615-0465-8_2
Camacho-Collados, J., Pilehvar, T.: From word to sense embeddings: a survey on vector representations of meaning. arXiv preprint arXiv:1805.04032 (2018)
Chen, X., Liu, Z., Sun, M.: A unified model for word sense representation and disambiguation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1025–1035 (2014). https://doi.org/10.3115/v1/D14-1110
Dalpiaz, F., van der Schalk, I., Lucassen, G.: Pinpointing ambiguity and incompleteness in requirements engineering via information visualization and NLP. In: International Working Conference on Requirements Engineering: Foundation for Software Quality. Springer, pp. 119–135 (2018). https://doi.org/10.1007/978-3-319-77243-1_8
Evans, M.C., Bhatia, J., Wadkar, S., Breaux, T.D.: An evaluation of constituency-based hyponymy extraction from privacy policies. In: Requirements Engineering Conference (RE), 2017 IEEE 25th International. IEEE, pp. 312–321 (2017). https://doi.org/10.1109/RE.2017.87
Fernández, D.M., Wagner, S., Kalinowski, M., Felderer, M., Mafra, P., Vetrò, A., Conte, T., Christiansson, M.T., Greer, D., Lassenius, C., Männistö, T., Nayabi, M., Oivo, M., Penzenstadler, B., Pfahl, D., Prikladnicki, R., Ruhe, G., Schekelmann, A., Sen, S., Spinola, R., Tuzcu, A., de la Vara, J.L., Wieringa, R.: Naming the pain in requirements engineering. Empir. Softw. Eng. 22(5), 2298–2338 (2017). https://doi.org/10.1007/s10664-016-9451-7
Ferrari, A., Donati, B., Gnesi, S.: Detecting domain-specific ambiguities: an NLP approach based on Wikipedia crawling and word embeddings. In: 2017 IEEE 25th International Requirements Engineering Conference Workshops (REW). IEEE, pp. 393–399 (2017b). https://doi.org/10.1109/REW.2017.20
Ferrari, A., Esuli, A., Gnesi, S.: Identification of cross-domain ambiguity with language models. In: Groen, E.C., Harrison, R., Murukannaiah, P.K., Vogelsang, A. (eds) 5th International Workshop on Artificial Intelligence for Requirements Engineering, AIRE@RE 2018, Banff, AB, Canada, 21 Aug 2018. IEEE, pp. 31–38 (2018a). https://doi.org/10.1109/AIRE.2018.00011
Firth, J.R.: Selected Papers of JR Firth, 1952–59. Indiana University Press, Bloomington (1968)
Flekova, L., Gurevych, I.: Supersense embeddings: a unified model for supersense interpretation, prediction, and utilization. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, pp. 2029–2041 (2016). https://doi.org/10.18653/v1/P16-1191
Gacitua, R., Sawyer, P., Gervasi, V.: On the effectiveness of abstraction identification in requirements engineering. In: Proceedings of the 18th IEEE International Requirements Engineering Conference (RE’10). IEEE, pp. 5–14 (2010). https://doi.org/10.1109/RE.2010.12
Guo, J., Cheng, J., Cleland-Huang, J.: Semantically enhanced software traceability using deep learning techniques. In: 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). IEEE, pp. 3–14 (2017). https://doi.org/10.1109/ICSE.2017.9
Honnibal, M., Montani, I.: spaCy 2: Natural language understanding with bloom embeddings, convolutional neural networks and incremental parsing (2017) (to appear)
Lami, G., Gnesi, S., Fabbrini, F., Fusani, M.: The linguistic approach to the natural language requirements quality: Benefit of the use of an automatic tool. In: Proceedings 26th Annual NASA Goddard Software Engineering Workshop (SEW), vol. 00, p. 97 (2001). https://doi.org/10.1109/SEW.2001.992662
Lee, Y.K., Ng, H.T.: An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing-Volume 10. Association for Computational Linguistics, pp. 41–48 (2002). https://doi.org/10.3115/1118693.1118699
Lian, X., Rahimi, M., Cleland-Huang, J., Zhang, L., Ferrai, R., Smith, M.: Mining requirements knowledge from collections of domain documents. In: 2016 IEEE 24th International Requirements Engineering Conference (RE). IEEE, pp. 156–165 (2016). https://doi.org/10.1109/RE.2016.50
Maalej, W., Nabil, H.: Bug report, feature request, or simply praise? on automatically classifying app reviews. In: Proceedings of the 23rd IEEE International Requirements Engineering Conference, (RE’15). IEEE, pp. 116–125 (2015). https://doi.org/10.1109/RE.2015.7320414
Manning, C.D.: Part-of-speech tagging from 97% to 100%: is it time for some linguistics? In: International Conference on Intelligent Text Processing and Computational Linguistics. Springer, pp. 171–189 (2011)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013). arXiv:1310.4546
Miller, G.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
Pedersen, T.: A simple approach to building ensembles of naive bayesian classifiers for word sense disambiguation. In: Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, pp. 63–69. Association for Computational Linguistics. http://www.aclweb.org/anthology/A00-2009 (2000). Accessed 10 June 2019
Pohl, K., Rupp, C.: Requirements Engineering Fundamentals. Rocky Nook Inc, San Rafael (2011)
Quirchmayr, T., Paech, B., Kohl, R., Karey, H.: Semi-automatic software feature-relevant information extraction from natural language user manuals. In: Proceedings of the 23rd International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ’17), pp. 255–272, Springer (2017). https://doi.org/10.1007/978-3-319-54045-0_19
Raganato, A., Bovi, C.D., Navigli, R.: Neural sequence learning models for word sense disambiguation. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1156–1167 (2017). https://doi.org/10.18653/v1/D17-1120
Robeer, M., Lucassen, G., van der Werf, J.M.E., Dalpiaz, F., Brinkkemper, S.: Automated extraction of conceptual models from user stories via nlp. In: Proceedings of the 24th IEEE International Requirements Engineering Conference (RE’16), pp. 196–205. IEEE (2016). https://doi.org/10.1109/RE.2016.40
Sleimi, A., Sannier, N., Sabetzadeh, M., Briand, L., Dann, J.: Automated extraction of semantic legal metadata using natural language processing. In: 2018 IEEE 26th International Requirements Engineering Conference (RE), pp. 124–135. IEEE (2018). https://doi.org/10.1109/RE.2018.00022
Sultanov, H., Hayes, J.H.: Application of reinforcement learning to requirements engineering: requirements tracing. In: Proceedings of the 21st IEEE International Requirements Engineering Conference (RE’13), pp. 52–61. IEEE (2013). https://doi.org/10.1109/RE.2013.6636705
Taghipour, K., Ng, H.T.: Semi-supervised word sense disambiguation using word embeddings in general and specific domains. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 314–323 (2015). https://doi.org/10.3115/v1/N15-1035
Yuan, D., Richardson, J., Doherty, R., Evans, C., Altendorf, E.: Semi-supervised word sense disambiguation with neural models. arXiv preprint arXiv:1603.07012 (2016)
Metadata
Title
An NLP approach for cross-domain ambiguity detection in requirements engineering
Authors
Alessio Ferrari
Andrea Esuli
Publication date
12 June 2019
Publisher
Springer US
Published in
Automated Software Engineering / Issue 3/2019
Print ISSN: 0928-8910
Electronic ISSN: 1573-7535
DOI
https://doi.org/10.1007/s10515-019-00261-7
