Towards Generalizable Place Name Recognition Systems: Analysis and Enhancement of NER Systems on English News from India

Authors:
Arda Akdemir, Koç University, Istanbul, Sariyer
Ali Hürriyetoğlu, Koç University, Istanbul, Sariyer
Erdem Yörük, Koç University, Istanbul, Sariyer
Burak Gürel, Koç University, Istanbul, Sariyer
Çağri Yoltar, Koç University, Istanbul, Sariyer
Deniz Yüret, Koç University, Istanbul, Sariyer

GIR'18: Proceedings of the 12th Workshop on Geographic Information Retrieval, November 2018, Article No.: 8, Pages 1–10, https://doi.org/10.1145/3281354.3281363

Published: 06 November 2018

ABSTRACT

Place name recognition is one of the key tasks in information extraction. In this paper, we address this task for English-language news from India. We first analyze the results obtained with available tools and corpora, and then train our own models to obtain better results. Most previous work on entity recognition for English uses similar corpora for both training and testing, yet we observe that performance drops significantly when the models are tested on different datasets. For this reason, we train various models on combinations of several corpora. Our results show that training on combined corpora improves the relative performance of these models, but more research in this area is still needed to obtain place name recognizers that generalize to any given dataset.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
2. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
    2. Retrieval tasks and goals
      1. Information extraction

Published in

GIR'18: Proceedings of the 12th Workshop on Geographic Information Retrieval
November 2018
37 pages
ISBN:9781450360340
DOI:10.1145/3281354

Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Entity Extraction
Machine Learning
Named Entity Recognition
Natural Language Processing
Place Name Recognition
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
GIR'18 Paper Acceptance Rate: 8 of 12 submissions, 67%
Overall Acceptance Rate: 46 of 61 submissions, 75%