Published in: GeoInformatica 4/2023

04-11-2022

A spatially-aware algorithm for location extraction from structured documents

Authors: Praval Sharma, Ashok Samal, Leen-Kiat Soh, Deepti Joshi


Abstract

Place names help locate and distinguish the geographic spaces where human activities and natural phenomena occur. Extracting place names at multiple spatial resolutions from text benefits several tasks, such as identifying the location of events, enriching gazetteers, and discovering connections between events and places. Most modern place name extraction approaches treat linguistic rules and lexical features as universal and ignore the patterns inherent in place names within specific geographic contexts. As a result, they lack the spatial awareness needed to identify place names effectively across different geographic contexts, especially lesser-known place names. In this research, we develop a novel Spatially-Aware Location Extraction (SALE) algorithm for place name extraction from structured documents that uses a hybrid approach comprising knowledge-driven and data-driven methods. We build a custom named entity recognition (NER) system based on the conditional random field (CRF) and train/fine-tune it using spatial features extracted from a dataset based on a given geographic region. SALE uses multiple pathways, including the spatially tuned NER, to enhance the efficacy of place name extraction. The experimental results using a large geographic region show that our algorithm outperforms well-known state-of-the-art place name recognizers.
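To make the idea of region-specific spatial features more concrete, the following is a minimal sketch of per-token feature extraction of the kind a CRF-based tagger could consume. It is not the SALE feature set, which the abstract does not specify. The gazetteer entries, the cue-word list, the feature names, and the functions `token_features` and `sentence_features` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical regional gazetteer: lower-cased place-name tokens for one region.
# A real system would load this from a region-specific dataset.
REGIONAL_GAZETTEER: Set[str] = {"lincoln", "omaha", "nebraska"}

# Hypothetical spatial cue words that often precede a place name.
SPATIAL_CUES: Set[str] = {"in", "at", "near", "from", "to"}


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for token i, mixing lexical and spatial cues."""
    word = tokens[i]
    # Use a sentinel when there is no previous token.
    prev_word = tokens[i - 1].lower() if i > 0 else "<BOS>"
    return {
        # Generic lexical features.
        "lower": word.lower(),
        "is_title": word.istitle(),
        "suffix3": word[-3:].lower(),
        # Region-specific (assumed) spatial features.
        "in_gazetteer": word.lower() in REGIONAL_GAZETTEER,
        "prev_is_spatial_cue": prev_word in SPATIAL_CUES,
    }


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


feats = sentence_features(["Floods", "hit", "farms", "near", "Omaha"])
for token_feats in feats:
    print(token_feats)
```

A list of feature dicts per sentence is a common input format for CRF toolkits, so a sketch like this could be paired with a sequence labeler and retrained per region by swapping the gazetteer and cue lists.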


Literature
1.
go back to reference Perko D, Jordan P, Komac B (2017) Exonyms and other geographical names. Acta Geogr Slov 57(1):99–107CrossRef Perko D, Jordan P, Komac B (2017) Exonyms and other geographical names. Acta Geogr Slov 57(1):99–107CrossRef
2.
go back to reference Jones CB, Abdelmoty AI, Finch D, Fu G, Vaid S (2004) The SPIRIT spatial search engine: Architecture, ontologies and spatial indexing. In: International Conference on Geographic Information Science Jones CB, Abdelmoty AI, Finch D, Fu G, Vaid S (2004) The SPIRIT spatial search engine: Architecture, ontologies and spatial indexing. In: International Conference on Geographic Information Science
3.
go back to reference Murphy AB (1998) Rediscovering the importance of geography. Chronicle of Higher Education Murphy AB (1998) Rediscovering the importance of geography. Chronicle of Higher Education
4.
go back to reference Kapur A (2019) Mapping place names of India. Routledge and CRC Press, New YorkCrossRef Kapur A (2019) Mapping place names of India. Routledge and CRC Press, New YorkCrossRef
5.
go back to reference Gao S, Li L, Li W, Janowicz K, Zhang Y (2017) Constructing gazetteers from volunteered big geo-data based on Hadoop. Comput Environ Urban Syst 61:172–186CrossRef Gao S, Li L, Li W, Janowicz K, Zhang Y (2017) Constructing gazetteers from volunteered big geo-data based on Hadoop. Comput Environ Urban Syst 61:172–186CrossRef
6.
go back to reference Leetaru KH (2011) Culturomics 2.0: Forecasting large-scale human behavior using global news media tone in time and space. First Monday 16(9):1–22 Leetaru KH (2011) Culturomics 2.0: Forecasting large-scale human behavior using global news media tone in time and space. First Monday 16(9):1–22
7.
go back to reference Chen H, Vasardani M, Winter S (2019) Clustering-based disambiguation of fine-grained place names from descriptions. GeoInformatica 23:449–472CrossRef Chen H, Vasardani M, Winter S (2019) Clustering-based disambiguation of fine-grained place names from descriptions. GeoInformatica 23:449–472CrossRef
8.
go back to reference Shi L, Wu Y, Liu L, Sun X, Jiang L (2018) Event detection and identification of influential spreaders in social media data streams. Big Data Min Anal 1(1):34–46CrossRef Shi L, Wu Y, Liu L, Sun X, Jiang L (2018) Event detection and identification of influential spreaders in social media data streams. Big Data Min Anal 1(1):34–46CrossRef
9.
go back to reference Laere OV, Quinn J, Schockaert S, Dhoedt B (2014) Spatially aware term selection for geotagging. IEEE Trans Knowl Data Eng 26(1):221–234CrossRef Laere OV, Quinn J, Schockaert S, Dhoedt B (2014) Spatially aware term selection for geotagging. IEEE Trans Knowl Data Eng 26(1):221–234CrossRef
10.
go back to reference Tobler W (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240CrossRef Tobler W (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240CrossRef
11.
go back to reference Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: International Conference on Machine Learning, ICML 2001 Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: International Conference on Machine Learning, ICML 2001
12.
go back to reference Krippendorff K (1980) Content analysis: an introduction to its methodology. Sage Publication, LondonMATH Krippendorff K (1980) Content analysis: an introduction to its methodology. Sage Publication, LondonMATH
13.
go back to reference Weiss AS (2019) Journalists and their perceptions of location: making meaning in the community. Journal Stud 21(3):352–369MathSciNet Weiss AS (2019) Journalists and their perceptions of location: making meaning in the community. Journal Stud 21(3):352–369MathSciNet
14.
go back to reference Goggin G, Martin F, Dwyer T (2015) Locative news. Journal Stud 16(1):41–59 Goggin G, Martin F, Dwyer T (2015) Locative news. Journal Stud 16(1):41–59
15.
go back to reference Nyre L, Bjørnestad S, Tessem B, Øie KV (2012) Locative journalism: Designing a location-dependent news medium for smartphones. Convergence 18(3):297–314CrossRef Nyre L, Bjørnestad S, Tessem B, Øie KV (2012) Locative journalism: Designing a location-dependent news medium for smartphones. Convergence 18(3):297–314CrossRef
16.
go back to reference Jansson A, Lindell J (2015) News media consumption in the transmedia age. Journal Stud 16(1):79–96 Jansson A, Lindell J (2015) News media consumption in the transmedia age. Journal Stud 16(1):79–96
17.
go back to reference Kadmon N (2001) Toponymy: The lore, laws and language of geographical names. Vantage Press Inc, New York Kadmon N (2001) Toponymy: The lore, laws and language of geographical names. Vantage Press Inc, New York
18.
go back to reference Tuan Y-F (1991) Language and the making of place: A narrative-descriptive approach. Ann Assoc Am Geogr 81(4):684–696MathSciNetCrossRef Tuan Y-F (1991) Language and the making of place: A narrative-descriptive approach. Ann Assoc Am Geogr 81(4):684–696MathSciNetCrossRef
19.
go back to reference Tuan Y-F (1977) Space and place: The perspective of experience. University of Minnesota Press, Minneapolis Tuan Y-F (1977) Space and place: The perspective of experience. University of Minnesota Press, Minneapolis
20.
go back to reference Basso KH (1988) “Speaking with names”: Language and landscape among the Western Apache. Cult Anthropol 3(2):99–130CrossRef Basso KH (1988) “Speaking with names”: Language and landscape among the Western Apache. Cult Anthropol 3(2):99–130CrossRef
21.
go back to reference Rose-Redwood RS, Alderman DH, Azaryahu M (2010) Geographies of toponymic inscription: New directions in critical place name studies. Prog Hum Geogr 34(4):453–470 Rose-Redwood RS, Alderman DH, Azaryahu M (2010) Geographies of toponymic inscription: New directions in critical place name studies. Prog Hum Geogr 34(4):453–470
22.
23.
go back to reference Ozdikis O, Ramampiaro H, Nørvag K (2018) Spatial statistics of term co-occurrences for location prediction of tweets. In: European Conference on Information Retrieval Ozdikis O, Ramampiaro H, Nørvag K (2018) Spatial statistics of term co-occurrences for location prediction of tweets. In: European Conference on Information Retrieval
24.
go back to reference Pritt SW (2012) Geolocation of photographs by means of horizon matching with digital elevation models. In: IEEE International Geoscience and Remote Sensing Symposium. Munich, Germany Pritt SW (2012) Geolocation of photographs by means of horizon matching with digital elevation models. In: IEEE International Geoscience and Remote Sensing Symposium. Munich, Germany
25.
go back to reference Amitay E, Har’El N, Sivan R, Soffer A (2004) Web-a-Where: Geotagging web content. In: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval Amitay E, Har’El N, Sivan R, Soffer A (2004) Web-a-Where: Geotagging web content. In: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
26.
go back to reference Larson RR (1996) Geographic information retrieval and spatial browsing. In: Smith LC and Gluck M (eds) GIS and libraries: Patrons, maps and spatial information. University of Illinois at Urbana-Champaign, Urbana, pp 81–124 Larson RR (1996) Geographic information retrieval and spatial browsing. In: Smith LC and Gluck M (eds) GIS and libraries: Patrons, maps and spatial information. University of Illinois at Urbana-Champaign, Urbana, pp 81–124
27.
go back to reference Purves RS, Clough P, Jones CB, Arampatzis A, Bucher B, Finch D, Fu G, Joho H, Syed KA, Vaid S, Yang B (2007) The design and implementation of SPIRIT: A spatially aware search engine for information retrieval on the Internet. Int J Geogr Inf Sci 21(7):717–745CrossRef Purves RS, Clough P, Jones CB, Arampatzis A, Bucher B, Finch D, Fu G, Joho H, Syed KA, Vaid S, Yang B (2007) The design and implementation of SPIRIT: A spatially aware search engine for information retrieval on the Internet. Int J Geogr Inf Sci 21(7):717–745CrossRef
28.
go back to reference DeLozier G, Baldridge J, London L (2015) Gazetteer-independent toponym resolution using geographic word profiles. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, Texas DeLozier G, Baldridge J, London L (2015) Gazetteer-independent toponym resolution using geographic word profiles. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, Texas
29.
go back to reference Yu J, Rafiei D (2016) Geotagging named entities in news and online documents. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, Indiana Yu J, Rafiei D (2016) Geotagging named entities in news and online documents. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, Indiana
30.
go back to reference Karimzadeh M, Huang W, Banerjee S, Wallgrün JO, Hardisty F, Pezanowski S, Mitra P, MacEachren AM (2013) GeoTxt: A web API to leverage place references in text. In: Proceedings of the 7th Workshop on Geographic Information Retrieval, Orlando, Florida Karimzadeh M, Huang W, Banerjee S, Wallgrün JO, Hardisty F, Pezanowski S, Mitra P, MacEachren AM (2013) GeoTxt: A web API to leverage place references in text. In: Proceedings of the 7th Workshop on Geographic Information Retrieval, Orlando, Florida
31.
go back to reference Hu Y, Mao H, McKenzie G (2018) A natural language processing and geospatial clustering framework for harvesting local place names from geotagged housing advertisements. Int J Geogr Inf Sci 33(4):714–738CrossRef Hu Y, Mao H, McKenzie G (2018) A natural language processing and geospatial clustering framework for harvesting local place names from geotagged housing advertisements. Int J Geogr Inf Sci 33(4):714–738CrossRef
32.
go back to reference Teitlery BE, Lieberman MD, Panozzoy D, Sankaranarayanan J, Samety H, Sperling J (2008) NewsStand: A new view on news. In: Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Irvine, California Teitlery BE, Lieberman MD, Panozzoy D, Sankaranarayanan J, Samety H, Sperling J (2008) NewsStand: A new view on news. In: Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Irvine, California
33.
go back to reference Grover C, Tobin R, Byrne K, Woollard M, Reid J, Dunn S, Ball J (2010) Use of the Edinburgh Geoparser for georeferencing digitized historical collections. Philos Trans R Soc Lond A Math Phys Eng Sci 368(1925):3875–3889 Grover C, Tobin R, Byrne K, Woollard M, Reid J, Dunn S, Ball J (2010) Use of the Edinburgh Geoparser for georeferencing digitized historical collections. Philos Trans R Soc Lond A Math Phys Eng Sci 368(1925):3875–3889
34.
go back to reference Lieberman MD, Samet H, Sankaranarayanan J (2010) Geotagging with local lexicons to build indexes for textually-specified spatial data. In: IEEE 26th International Conference on Data Engineering, ICDE 2010, Long Beach, California Lieberman MD, Samet H, Sankaranarayanan J (2010) Geotagging with local lexicons to build indexes for textually-specified spatial data. In: IEEE 26th International Conference on Data Engineering, ICDE 2010, Long Beach, California
35.
go back to reference Gelernter J, Balaji S (2013) An algorithm for local geoparsing of microtext. GeoInformatica 17:635–667CrossRef Gelernter J, Balaji S (2013) An algorithm for local geoparsing of microtext. GeoInformatica 17:635–667CrossRef
36.
go back to reference Scalia G, Francalanci C, Pernici B (2022) CIME: Context-aware geolocation of emergency-related posts. GeoInformatica 26:125–157CrossRef Scalia G, Francalanci C, Pernici B (2022) CIME: Context-aware geolocation of emergency-related posts. GeoInformatica 26:125–157CrossRef
37.
go back to reference Stokes N, Li Y, Moffat A, Rong J (2008) An empirical study of the effects of NLP components on geographic IR performance. Int J Geogr Inf Sci 22(3):247–264CrossRef Stokes N, Li Y, Moffat A, Rong J (2008) An empirical study of the effects of NLP components on geographic IR performance. Int J Geogr Inf Sci 22(3):247–264CrossRef
38.
go back to reference Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguisticae Investigationes 30(1):3–26 Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguisticae Investigationes 30(1):3–26
39.
go back to reference Leidner JL, Lieberman MD (2011) Detecting geographical references in the form of place names and associated spatial natural language. SIGSPATIAL Special 3(2):5–11CrossRef Leidner JL, Lieberman MD (2011) Detecting geographical references in the form of place names and associated spatial natural language. SIGSPATIAL Special 3(2):5–11CrossRef
40.
go back to reference Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, ACL 2005 Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, ACL 2005
41.
go back to reference Ratinov L, Roth D (2009) Design challenges and misconceptions in named entity recognition. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning, CoNLL 2009, Boulder, Colorado Ratinov L, Roth D (2009) Design challenges and misconceptions in named entity recognition. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning, CoNLL 2009, Boulder, Colorado
42.
go back to reference Marrero M, Urbano J, Sanchez-Cuadrado S, Morato J, Gomez-Berbis JM (2013) Named entity recognition: Fallacies, challenges and opportunities. Comput Stand Interfaces 35(5):482–489CrossRef Marrero M, Urbano J, Sanchez-Cuadrado S, Morato J, Gomez-Berbis JM (2013) Named entity recognition: Fallacies, challenges and opportunities. Comput Stand Interfaces 35(5):482–489CrossRef
43.
go back to reference Vilain M, Su J, Lubar S (2007) Entity extraction is a boring solved problem-or is it? In: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, New York Vilain M, Su J, Lubar S (2007) Entity extraction is a boring solved problem-or is it? In: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, New York
44.
go back to reference Mota C, Grishman R (2008) Is this NE tagger getting old? In: Proceedings of the Sixth International Conference on Language Resources and Evaluation, LREC 2008, Marrakech, Morocco Mota C, Grishman R (2008) Is this NE tagger getting old? In: Proceedings of the Sixth International Conference on Language Resources and Evaluation, LREC 2008, Marrakech, Morocco
45.
go back to reference Lingad J, Karimi S, Yin J (2013) Location extraction from disaster-related microblogs. In: Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil Lingad J, Karimi S, Yin J (2013) Location extraction from disaster-related microblogs. In: Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil
46.
go back to reference Grishman R, Sundheim B (1996) Message understanding conference-6: A brief history. In: COLING 1996: Proceedings of the 16th conference on Computational linguistics, Copenhagen, Denmark Grishman R, Sundheim B (1996) Message understanding conference-6: A brief history. In: COLING 1996: Proceedings of the 16th conference on Computational linguistics, Copenhagen, Denmark
47.
go back to reference Rau LF (1991) Extracting company names from text. In: The Seventh IEEE Conference on Artificial Intelligence Application, Miami Beach, Florida Rau LF (1991) Extracting company names from text. In: The Seventh IEEE Conference on Artificial Intelligence Application, Miami Beach, Florida
48.
go back to reference Black W, Rinaldi WJ, Mowatt F (1998) D FACILE:escription of the NE system used for MUC-7. In: Seventh Message Understanding Conference, MUC-7, Fairfax, Virginia Black W, Rinaldi WJ, Mowatt F (1998) D FACILE:escription of the NE system used for MUC-7. In: Seventh Message Understanding Conference, MUC-7, Fairfax, Virginia
49.
go back to reference Krupka GR, Hausman K (2005) IsoQuest inc.: Description of the NetOwl extractor system as used for MUC-7. In: Seventh Message Understanding Conference, MUC-7. Fairfax, Virginia Krupka GR, Hausman K (2005) IsoQuest inc.: Description of the NetOwl extractor system as used for MUC-7. In: Seventh Message Understanding Conference, MUC-7. Fairfax, Virginia
50.
go back to reference Humphreys K, Gaizauskas R, Azzam S, Huyck C, Mitchel B, Cunningham H, Wilks Y (1998) University of Sheffield: Description of the LaSIE-II system as used for MUC-7. In: Seventh Message Understanding Conference, MUC-7, Fairfax, Virginia Humphreys K, Gaizauskas R, Azzam S, Huyck C, Mitchel B, Cunningham H, Wilks Y (1998) University of Sheffield: Description of the LaSIE-II system as used for MUC-7. In: Seventh Message Understanding Conference, MUC-7, Fairfax, Virginia
51.
go back to reference Li J, Sun A, Han J, Li C (2020) A survey on deep learning for named entity recognition. IEEE Trans Knowl Data Eng 34(1):50–70 Li J, Sun A, Han J, Li C (2020) A survey on deep learning for named entity recognition. IEEE Trans Knowl Data Eng 34(1):50–70
52.
go back to reference Bikel DM, Schwartz R, Weischedel RM (1999) An algorithm that learns what’s in a name. Mach Learn 34:211–231CrossRefMATH Bikel DM, Schwartz R, Weischedel RM (1999) An algorithm that learns what’s in a name. Mach Learn 34:211–231CrossRefMATH
53.
go back to reference Zhou G, Jian S (2002) Named entity recognition using an HMM-based chunk tagger. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA Zhou G, Jian S (2002) Named entity recognition using an HMM-based chunk tagger. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA
54.
go back to reference Curran J, Clark S (2003) Language independent NER using a maximum entropy tagger. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 Curran J, Clark S (2003) Language independent NER using a maximum entropy tagger. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003
55.
go back to reference Sang EFTK, Meulder FD (2003) Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 Sang EFTK, Meulder FD (2003) Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In: Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003
56.
go back to reference Bikel DM, Miller S, Schwartz R, Weischedel R (1997) Nymble: A high-performance learning name-finder. In: Fifth Conference on Applied Natural Language Processing, Washington, DC, USA Bikel DM, Miller S, Schwartz R, Weischedel R (1997) Nymble: A high-performance learning name-finder. In: Fifth Conference on Applied Natural Language Processing, Washington, DC, USA
57.
go back to reference Borthwick A, Sterling J, Agichtein E, Grishman R (1999) Exploiting diverse knowledge sources via maximum entropy in named entity recognition. In: Sixth Workshop on Very Large Corpora Borthwick A, Sterling J, Agichtein E, Grishman R (1999) Exploiting diverse knowledge sources via maximum entropy in named entity recognition. In: Sixth Workshop on Very Large Corpora
58.
go back to reference Chieu HL, Ng HT (2002) Named entity recognition: A maximum entropy approach using global information. In: CoLING 2002 Chieu HL, Ng HT (2002) Named entity recognition: A maximum entropy approach using global information. In: CoLING 2002
59.
go back to reference Bender O, Och FJ, Ney H (2003) Maximum entropy models for named entity recognition. In: Proceedings of the seventh conference on Natural Language Learning at HLT-NAACL Bender O, Och FJ, Ney H (2003) Maximum entropy models for named entity recognition. In: Proceedings of the seventh conference on Natural Language Learning at HLT-NAACL
60.
go back to reference Toutanova K, Klein D, Manning CD, Singer Y (2003) Feature-rich part-of-speech tagging with a cyclic dependency network. In: Conference of the North American Chapter of the Association for Computational Linguistics & Human Language Technologies, NAACL-HLT 2003 Toutanova K, Klein D, Manning CD, Singer Y (2003) Feature-rich part-of-speech tagging with a cyclic dependency network. In: Conference of the North American Chapter of the Association for Computational Linguistics & Human Language Technologies, NAACL-HLT 2003
61.
go back to reference McCallum A, Li W (2003) Early results for NER with CRF, feature induction and word embeddings. In: Proceedings of the seventh conference on Natural Language Learning at HLT-NAACL McCallum A, Li W (2003) Early results for NER with CRF, feature induction and word embeddings. In: Proceedings of the seventh conference on Natural Language Learning at HLT-NAACL
62.
go back to reference Krishnan V, Manning CD (2006) An effective two-stage model for exploiting non-local dependencies in named entity recognition. In: 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia Krishnan V, Manning CD (2006) An effective two-stage model for exploiting non-local dependencies in named entity recognition. In: 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia
63.
go back to reference Asahara M, Matsumoto Y (2003) Japanese named entity extraction with redundant morphological analysis. In: Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics Asahara M, Matsumoto Y (2003) Japanese named entity extraction with redundant morphological analysis. In: Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics
64.
go back to reference Li Y, Bontcheva K, Cunningham H (2005) SVM based learning system for information extraction. In: Deterministic and Statistical Methods in Machine Learning, pp 319–339 Li Y, Bontcheva K, Cunningham H (2005) SVM based learning system for information extraction. In: Deterministic and Statistical Methods in Machine Learning, pp 319–339
65.
go back to reference Szarvas G, Farkas R, Kocsor A (2006) A multilingual named entity recognition system using boosting and C4.5 decision tree learning algorithms. Discovery Science, pp 267–278 Szarvas G, Farkas R, Kocsor A (2006) A multilingual named entity recognition system using boosting and C4.5 decision tree learning algorithms. Discovery Science, pp 267–278
66.
go back to reference Chiu JP, Nichols E (2016) Named entity recognition with bidirectional LSTM-CNNs. Trans Association Comput Linguistics 4:357–370CrossRef Chiu JP, Nichols E (2016) Named entity recognition with bidirectional LSTM-CNNs. Trans Association Comput Linguistics 4:357–370CrossRef
67.
go back to reference Huang Z, Xu W, Yu K (2015) Bidirectional LSTM-CRF models for sequence tagging. ArXiv, abs/1508.01991 Huang Z, Xu W, Yu K (2015) Bidirectional LSTM-CRF models for sequence tagging. ArXiv, abs/1508.01991
68.
go back to reference Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C (2016) Neural Architectures for Named Entity Recognition. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C (2016) Neural Architectures for Named Entity Recognition. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California
69.
go back to reference Ma X, Hovy E (2016) End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany Ma X, Hovy E (2016) End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany
70.
go back to reference Yang Z, Salakhutdinov R, Cohen WW (2017) Transfer learning for sequence tagging with hierarchical recurrent networks. In: ICLR 2017 Yang Z, Salakhutdinov R, Cohen WW (2017) Transfer learning for sequence tagging with hierarchical recurrent networks. In: ICLR 2017
71.
go back to reference Ji H, Grishman R (2006) Data selection in semi-supervised learning for name tagging. In: Proceedings of the Workshop on Information Extraction Beyond the Document, Sydney, Australia Ji H, Grishman R (2006) Data selection in semi-supervised learning for name tagging. In: Proceedings of the Workshop on Information Extraction Beyond the Document, Sydney, Australia
72.
go back to reference Turian J, Ratinov L-A, Bengio Y (2010) A simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics Turian J, Ratinov L-A, Bengio Y (2010) A simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
73.
go back to reference Alfonseca E, Manandhar S (2002) An unsupervised method for general named entity. In: Proceedings for International Conference on General WordNet Alfonseca E, Manandhar S (2002) An unsupervised method for general named entity. In: Proceedings for International Conference on General WordNet
74.
go back to reference Li D, Savova G, Kipper-Schuler K (2008) Conditional random fields and support vector machines for disorder named entity recognition in clinical texts. In: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, Columbus, Ohio Li D, Savova G, Kipper-Schuler K (2008) Conditional random fields and support vector machines for disorder named entity recognition in clinical texts. In: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, Columbus, Ohio
75.
go back to reference Ritter A, Clark S, Mausam, Etzioni O (2011) Named entity recognition in tweets: An experimental study. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK Ritter A, Clark S, Mausam, Etzioni O (2011) Named entity recognition in tweets: An experimental study. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK
76.
go back to reference Han J, Sun A, Cong G, Zhao WX, Ji Z, Phan MC (2018) Linking fine-grained locations in user comments. IEEE Trans Knowl Data Eng 30(1):59–72CrossRef Han J, Sun A, Cong G, Zhao WX, Ji Z, Phan MC (2018) Linking fine-grained locations in user comments. IEEE Trans Knowl Data Eng 30(1):59–72CrossRef
77.
go back to reference Rocktäschel T, Weidlich M, Leser U (2012) ChemSpot: A hybrid system for chemical named entity recognition. Bioinformatics 28(12):1633–1640CrossRef Rocktäschel T, Weidlich M, Leser U (2012) ChemSpot: A hybrid system for chemical named entity recognition. Bioinformatics 28(12):1633–1640CrossRef
78.
go back to reference Settles B (2004) Biomedical named entity recognition using conditional random fields and rich feature sets. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications, NLPBA/BioNLP, Geneva, Switzerland Settles B (2004) Biomedical named entity recognition using conditional random fields and rich feature sets. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications, NLPBA/BioNLP, Geneva, Switzerland
80.
go back to reference Baldwin T, Kordoni V, Villavicencio A (2009) Prepositions in applications: A survey and introduction to the special issue. Comput Linguistics 35(2):119–150CrossRef Baldwin T, Kordoni V, Villavicencio A (2009) Prepositions in applications: A survey and introduction to the special issue. Comput Linguistics 35(2):119–150CrossRef
83.
Chen SF, Goodman J (1999) An empirical study of smoothing techniques for language modeling. Comput Speech Lang 13(4):359–393
84.
Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: ACM SIGMOD International Conference on Management of Data, Washington DC, USA
85.
Sang EFTK (2002) Introduction to the CoNLL-2002 shared task: Language-independent named entity recognition. In: Proceedings for Conference on Natural Language Learning
86.
Friburger N, Maurel D (2002) Textual similarity based on proper names. In: Mathematical Formal Information Retrieval, MFIR 2002
87.
Gaizauskas R, Wakao T, Humphreys K, Cunningham H, Wilks Y (1995) University of Sheffield: Description of the LaSIE system as used for MUC-6. In: Message Understanding Conference-6 pp6
88.
Hammersley J, Clifford P (1971) Markov fields on finite graphs and lattices. Unpublished Paper
89.
Sha F, Pereira F (2003) Shallow parsing with conditional random fields. In: Proceedings of Human Language Technology, NAACL, Canada
90.
Harrower T (2010) Inside reporting: A practical guide to the craft of journalism. McGraw-Hill Education, New York
91.
Krippendorff K (2004) Content Analysis: An Introduction to its methodology. SAGE Publications, Thousand Oaks
92.
Artstein R, Poesio M (2008) Inter-coder agreement for computational linguistics. Comput Linguistics 34(4):555–596
93.
Scott WA (1955) Reliability of content analysis: The case of nominal scale coding. Pub Opin Q 19(3):321–325
94.
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46
95.
Davies M, Fleiss JL (1982) Measuring agreement for multinomial data. Biometrics 38(4):1047–1051
96.
Passonneau R (2006) Measuring agreement on set-valued items (MASI) for semantic and pragmatic annotation. In: Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy
97.
Krippendorff K, Craggs R (2016) The reliability of multi-valued coding of data. Commun Methods Meas 10(4):181–198
98.
Krippendorff K (2011) Agreement and information in the reliability of coding. Commun Methods Meas 5(2):93–112
99.
Burnard L (2000) Reference guide for the British National Corpus. Oxford University, Oxford, UK
101.
Chinchor NA (1998) Overview of MUC-7. In: Seventh Message Understanding Conference, MUC-7. Fairfax, Virginia
102.
Gardner M, Grus J, Neumann M, Tafjord O, Dasigi P, Liu NF, Peters M, Schmitz M, Zettlemoyer LS (2018) AllenNLP: A deep semantic natural language processing platform. In: Proceedings of Workshop for NLP Open Source Software, NLP-OSS, Melbourne, Australia
103.
Ripley BD. Modelling spatial patterns. J R Stat Soc Ser B 39:172–92
Metadata
Title
A spatially-aware algorithm for location extraction from structured documents
Authors
Praval Sharma
Ashok Samal
Leen-Kiat Soh
Deepti Joshi
Publication date
04-11-2022
Publisher
Springer US
Published in
GeoInformatica / Issue 4/2023
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-022-00482-1
