Analysing anaphoric ambiguity in natural language requirements

  • Best Papers of RE'10: Requirements Engineering in a Multi-faceted World
Requirements Engineering

Abstract

Many requirements documents are written in natural language (NL). However, with the flexibility of NL comes the risk of introducing unwanted ambiguities in the requirements and misunderstandings between stakeholders. In this paper, we describe an automated approach to identify potentially nocuous ambiguity, which occurs when text is interpreted differently by different readers. We concentrate on anaphoric ambiguity, which occurs when readers may disagree on how pronouns should be interpreted. We describe a number of heuristics, each of which captures information that may lead a reader to favor a particular interpretation of the text. We use these heuristics to build a classifier, which in turn predicts the degree to which particular interpretations are preferred. We collected multiple human judgements on the interpretation of requirements exhibiting anaphoric ambiguity and showed how the distribution of these judgements can be used to assess whether a particular instance of ambiguity is nocuous. Given a requirements document written in natural language, our approach can identify sentences that contain anaphoric ambiguity, and use the classifier to alert the requirements writer to text that runs the risk of misinterpretation. We report on a series of experiments that we conducted to evaluate the performance of the automated system we developed to support our approach. The results show that the system achieves high recall with a consistent improvement on baseline precision subject to some ambiguity tolerance levels, allowing us to explore and highlight realistic and potentially problematic ambiguities in actual requirements documents.
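The idea of using the distribution of human judgements, together with an ambiguity tolerance level, to decide whether an instance is nocuous can be illustrated with a minimal Python sketch. It is not the system evaluated in the paper. The function names, the example antecedents, and the rule "nocuous when no candidate antecedent reaches the tolerance threshold" are assumptions made here for illustration only.

```python
from collections import Counter


def antecedent_certainty(judgements):
    """Return the fraction of judges who chose each antecedent candidate.

    `judgements` is a list with one entry per judge: the antecedent that
    judge selected for a given pronoun (hypothetical input format).
    """
    if not judgements:
        raise ValueError("at least one judgement is required")
    counts = Counter(judgements)
    total = len(judgements)
    return {candidate: n / total for candidate, n in counts.items()}


def is_nocuous(judgements, tolerance):
    """Assumed decision rule: the ambiguity is treated as nocuous when no
    single candidate is preferred by at least `tolerance` of the judges."""
    certainty = antecedent_certainty(judgements)
    return max(certainty.values()) < tolerance


# Thirteen hypothetical judges disagree about what a pronoun refers to.
votes = ["the system"] * 9 + ["the operator"] * 4
strict = is_nocuous(votes, tolerance=0.8)   # agreement demanded: 80%
lenient = is_nocuous(votes, tolerance=0.6)  # agreement demanded: 60%
```

With these votes the preferred reading is chosen by 9 of 13 judges (about 0.69), so the instance counts as nocuous under the stricter threshold and innocuous under the more lenient one. This shows why results are reported relative to a tolerance level.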

Notes

  1. As with programming, requirements are inherently an ill-conditioned problem; small changes in their interpretation can lead to hugely different behaviour of the developed systems. We do not address here the issue of estimating the effect of different interpretations caused by ambiguity.

  2. Plural: anaphora.

  3. Our examples are adapted (in many cases abbreviated) from our collection of requirements documents. We render anaphora in bold and underline antecedent candidates.

  4. http://research.it.uts.edu.au/re/.

  5. In requirements documents, the writer does not generally mark which sentences describe requirements. Anaphora instances are therefore collected from the whole text of the document rather than from specific requirements sentences.

  6. http://www.surveymonkey.com/s.aspx?sm=kbGtRdJJXqWabZFk28tJfw_3d_3d.

  7. Rogue judgments are errors made by judges through carelessness or by accident, rather than judgments that reflect a genuine difference of opinion.

  8. http://text0.mib.man.ac.uk:8080/scottpiao/sent_dectector.

  9. http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/tagger/.

  10. A headword is the main word of a phrase, and the other words in that phrase modify it. For example, for the noun phrase ‘the CCSDS parameter’, ‘parameter’ is the headword, and ‘CCSDS’ is a noun modifier for the headword.

  11. http://biotext.berkeley.edu/software.html.

  12. Proper names are names of persons, places, or certain special things. They are typically capitalized nouns, such as ‘London’ or ‘John Hunter’.

  13. http://nlp.stanford.edu/software/lex-parser.shtml.

  14. http://wordnet.princeton.edu/.

  15. http://www.cs.waikato.ac.nz/~ml/index.html.

  16. http://www.natcorp.ox.ac.uk/.

  17. http://verbs.colorado.edu/~mpalmer/projects/verbnet.html.

  18. http://sketchengine.co.uk/.

  19. In fact, as part of our future work, we intend to integrate the technology into the popular DOORS requirements management tool.

Acknowledgments

This work was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) as part of the MaTREx project (EP/F068859/1), and by the Science Foundation Ireland (SFI grant 03/CE2/I303_1). We are grateful to our research partners at Lancaster University for their input, and to Ian Alexander for his practical insights and guidance. We also wish to acknowledge the anonymous reviewers’ insightful comments and suggestions.

Author information

Corresponding author

Correspondence to Hui Yang.

About this article

Cite this article

Yang, H., de Roeck, A., Gervasi, V. et al. Analysing anaphoric ambiguity in natural language requirements. Requirements Eng 16, 163–189 (2011). https://doi.org/10.1007/s00766-011-0119-y

  • DOI: https://doi.org/10.1007/s00766-011-0119-y
