Published in: Neural Computing and Applications 20/2020

04.04.2020 | S.I.: Applying Artificial Intelligence to the Internet of Things

A deep neural network-based model for named entity recognition for Hindi language

Authors: Richa Sharma, Sudha Morwal, Basant Agarwal, Ramesh Chandra, Mohammad S. Khan


Abstract

The aim of this work is to develop an efficient method for recognizing named entities in text, which in turn improves the performance of systems that rely on natural language processing (NLP). The performance of IoT-based devices such as Alexa and Cortana depends significantly on an efficient NLP model, and named entity recognition (NER) tools play an important role in increasing the capability of smart IoT devices to comprehend natural language. In general, NER is a two-step process: proper nouns are first identified in the text and are then classified into predefined entity categories such as person, location, measure, organization and time. NER is often performed as a subtask of natural language processing, where it increases the accuracy of the overall NLP task. In this paper, we propose a deep neural network architecture for named entity recognition in Hindi, a resource-scarce language, based on a convolutional neural network (CNN), a bidirectional long short-term memory (Bi-LSTM) neural network and a conditional random field (CRF). In the proposed approach, we first use the skip-gram word2vec model and the GloVe model to represent words as semantic vectors, which are then used in different deep neural network-based architectures. We use both character-level and word-level embeddings to represent the text, so that the representation includes fine-grained information. Because of the character-level embeddings, the proposed model is robust to out-of-vocabulary words. Experimental results show that the combination of Bi-LSTM, CNN and CRF performs better than the baseline methods, namely a recurrent neural network, long short-term memory and Bi-LSTM used individually.
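The abstract names a CRF as one component of the proposed architecture but gives no implementation details. As a minimal sketch of what such a component typically does at prediction time, assuming a linear-chain CRF with per-token emission scores (which in a Bi-LSTM-CNN-CRF model would come from the Bi-LSTM) and tag-to-tag transition scores, Viterbi decoding selects the highest-scoring tag sequence. The function name `viterbi_decode`, the two-tag set and all scores below are illustrative assumptions, not values or code from the paper.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at position t (a hypothetical
    stand-in for Bi-LSTM outputs); transitions[a][b] is the score of
    moving from tag a to tag b.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    backptr: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score into `cur`.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        backptr.append(pointers)
    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative data (assumed): a three-token span whose middle token is
# ambiguous when scored in isolation.
TAGS = ["O", "PER"]
EMISSIONS = [
    {"O": 0.0, "PER": 2.0},
    {"O": 1.0, "PER": 0.8},
    {"O": 0.0, "PER": 2.0},
]
# Transition scores that reward staying inside a person entity.
WITH_TRANSITIONS = {"O": {"O": 0.0, "PER": 0.0}, "PER": {"O": 0.0, "PER": 0.5}}
# All-zero transitions reduce decoding to a per-token argmax.
NO_TRANSITIONS = {"O": {"O": 0.0, "PER": 0.0}, "PER": {"O": 0.0, "PER": 0.0}}

crf_path = viterbi_decode(EMISSIONS, WITH_TRANSITIONS, TAGS)
greedy_path = viterbi_decode(EMISSIONS, NO_TRANSITIONS, TAGS)
```

With the assumed scores, the two decodings differ only on the middle token: the transition score lets the neighbouring tags influence it, which is the usual motivation for placing a CRF on top of per-token classifiers.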


Metadata
Title
A deep neural network-based model for named entity recognition for Hindi language
Authors
Richa Sharma
Sudha Morwal
Basant Agarwal
Ramesh Chandra
Mohammad S. Khan
Publication date
04.04.2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 20/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-04881-z
