short-paper

Two-stage approach to named entity recognition using Wikipedia and DBpedia

Authors:
Seonghan Ryu, Hwanjo Yu, Gary Geunbae Lee
Pohang University of Science and Technology

IMCOM '17: Proceedings of the 11th International Conference on Ubiquitous Information Management and Communication
January 2017, Article No.: 73, Pages 1–4
https://doi.org/10.1145/3022227.3022299

Published: 05 January 2017

ABSTRACT

In natural language understanding, extracting named entity (NE) mentions from a given text and classifying those mentions into pre-defined NE types are important processes. Most NE recognition (NER) systems rely on resources such as a training corpus or an NE dictionary, but collecting these manually is laborious and time-consuming. This paper proposes a two-stage approach that implements NER using only Wikipedia and DBpedia. It also addresses technical problems in developing Korean NER. In experiments, the proposed method recognizes NEs in short question sentences with an error rate of 14.2%.
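
The abstract does not spell out what the two stages are. As a rough illustration only, the sketch below assumes they correspond to the two processes named in the first sentence: mention extraction, then type classification. It is not the authors' method: plain dictionary lookup is swapped in for whatever models the paper uses, and the `TITLE_TO_TYPE` table is a hypothetical toy stand-in for resources that would be derived from Wikipedia titles and DBpedia types. All function names are invented for this example.

```python
# Hypothetical toy resource: a real system would build this from
# Wikipedia article titles and DBpedia type assertions.
TITLE_TO_TYPE = {
    ("barack", "obama"): "PERSON",
    ("seoul",): "LOCATION",
    ("pohang", "university"): "ORGANIZATION",
}
MAX_LEN = max(len(key) for key in TITLE_TO_TYPE)


def extract_mentions(tokens):
    """Stage 1 (assumed): greedy longest-match spotting of known titles."""
    lowered = [t.lower() for t in tokens]
    spans, i = [], 0
    while i < len(lowered):
        # Try the longest candidate first, then shrink.
        for n in range(min(MAX_LEN, len(lowered) - i), 0, -1):
            if tuple(lowered[i:i + n]) in TITLE_TO_TYPE:
                spans.append((i, i + n))
                i += n
                break
        else:
            i += 1  # no title starts at this token
    return spans


def classify_mentions(tokens, spans):
    """Stage 2 (assumed): map each spotted mention to a pre-defined NE type."""
    results = []
    for start, end in spans:
        key = tuple(t.lower() for t in tokens[start:end])
        results.append((" ".join(tokens[start:end]), TITLE_TO_TYPE[key]))
    return results


def recognize(sentence):
    """Run both stages on a whitespace-tokenized sentence."""
    tokens = sentence.rstrip("?.!").split()
    return classify_mentions(tokens, extract_mentions(tokens))


print(recognize("Where was Barack Obama born?"))
```

Keeping the two stages as separate functions means either one could be replaced, for instance by a learned tagger or classifier, without touching the other. Whitespace tokenization is a simplification that would not carry over to Korean text.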

Index Terms

Two-stage approach to named entity recognition using Wikipedia and DBpedia
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Language models
  2. Information systems applications
    1. Collaborative and social computing systems and tools
      1. Wikis

Published in

IMCOM '17: Proceedings of the 11th International Conference on Ubiquitous Information Management and Communication
January 2017
746 pages
ISBN:9781450348881
DOI:10.1145/3022227

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
DBpedia
Wikipedia
information extraction
named entity recognition
question answering
Qualifiers
- short-paper
Acceptance Rates
IMCOM '17 Paper Acceptance Rate: 113 of 366 submissions, 31%
Overall Acceptance Rate: 213 of 621 submissions, 34%