Published in: Soft Computing 8/2015

1 August 2015 | Methodologies and Application

Differential evolution-based feature selection technique for anaphora resolution

Authors: Utpal Kumar Sikdar, Asif Ekbal, Sriparna Saha, Olga Uryupina, Massimo Poesio

Abstract

In this paper, a differential evolution (DE)-based feature selection technique is developed for anaphora resolution in a resource-poor language, namely Bengali. We discuss the issues involved in adapting a state-of-the-art English anaphora resolution system to a resource-poor language such as Bengali. The performance of any anaphora resolver depends greatly on the quality of a highly accurate mention detector and on the use of appropriate features for anaphora resolution. We develop a number of models for mention detection based on machine learning and heuristics. In anaphora resolution there is no globally accepted metric for measuring performance, and the existing metrics, such as MUC, \(\mathrm{B}^{3}\), CEAF and BLANC, exhibit significantly different behaviors. Our proposed feature selection technique determines a near-optimal feature set by optimizing each of these evaluation metrics. Experiments show how a language-dependent system (designed primarily for English) can attain a reasonably good level of performance when re-trained and tested on a new language with a proper subset of features. Evaluation results yield F-measure values of 66.70, 59.47, 51.56, 33.08 and 72.75 % for MUC, \(\mathrm{B}^{3}\), CEAFM, CEAFE and BLANC, respectively.
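
The abstract does not give the encoding or parameter settings used in the paper, so the following is only a minimal sketch of how DE-based feature selection can work in general. It assumes the classic DE/rand/1/bin scheme, real-valued vectors thresholded at 0.5 to obtain a feature mask, and a hypothetical `toy_fitness` function standing in for one of the evaluation metrics (for example, the MUC F-measure of a resolver trained on the selected features). All names and parameter values here are illustrative assumptions.

```python
import random


def de_feature_selection(fitness, n_features, pop_size=20, generations=50,
                         f_weight=0.5, cr=0.9, seed=0):
    """Maximize fitness(mask) over boolean feature masks with DE/rand/1/bin."""
    rng = random.Random(seed)

    def to_mask(vec):
        # Assumption: a feature is selected when its gene exceeds 0.5.
        return [v > 0.5 for v in vec]

    # Random initial population of real-valued vectors in [0, 1].
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(to_mask(ind)) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantees one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < cr or j == j_rand:
                    # Mutation: base vector plus scaled difference vector.
                    v = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    v = min(1.0, max(0.0, v))  # clip back into [0, 1]
                else:
                    v = pop[i][j]  # binomial crossover keeps the target gene
                trial.append(v)
            # Greedy selection: the trial replaces the target if not worse.
            trial_score = fitness(to_mask(trial))
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=lambda k: scores[k])
    return to_mask(pop[best]), scores[best]


def toy_fitness(mask):
    # Hypothetical stand-in for a coreference metric: features 0, 2 and 5
    # help, every other selected feature costs half a point.
    useful = {0, 2, 5}
    hits = sum(1 for j in useful if mask[j])
    noise = sum(1 for j, on in enumerate(mask) if on and j not in useful)
    return hits - 0.5 * noise


best_mask, best_score = de_feature_selection(toy_fitness, n_features=8)
print(best_mask, best_score)
```

To optimize each metric separately, as the abstract describes, one would run such a search once per metric with the corresponding fitness function; in a real setting each fitness call would involve training and scoring the resolver, which dominates the cost.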

Metadata
Title
Differential evolution-based feature selection technique for anaphora resolution
Authors
Utpal Kumar Sikdar
Asif Ekbal
Sriparna Saha
Olga Uryupina
Massimo Poesio
Publication date
1 August 2015
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 8/2015
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-014-1397-3
