
2018 | OriginalPaper | Chapter

Intelligent Text Mining Model for English Language Using Deep Neural Network

Authors: Shashi Pal Singh, Ajai Kumar, Hemant Darbari, Balvinder Kaur, Kanchan Tiwari, Nisheeth Joshi

Published in: Information and Communication Technology for Intelligent Systems (ICTIS 2017) - Volume 2

Publisher: Springer International Publishing

Abstract

Today, many sources supply information in massive amounts to meet demand over the internet, which creates a huge collection of heterogeneous data. This data can be categorized as either structured or unstructured.
In this paper we propose a tool that intelligently preprocesses unstructured data by segmenting the whole document into sentences, using deep learning concepts with word2vec [11] and a Recurrent Neural Network (RNN) [13]. In the first step we use word2vec, introduced by Tomas Mikolov and his team at Google, to generate vectors of the input text, which are then forwarded to the RNN. The RNN takes this series of vectors as input, and the trained Data Cleaning Recurrent Neural Network model performs the preprocessing task (including cleaning of missing, grammatically incorrect, and misspelled data) to produce structured results, which are then passed to an automatic summarization module to generate the desired summary.
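
The data flow described in the abstract (segment the document into sentences, turn each word into a vector, feed the vector sequence to a recurrent network) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the embedding size `DIM`, the regex-based sentence splitter, the hash-seeded `embed` function that stands in for a trained word2vec lookup, and the untrained random weights of the Elman-style recurrence. The sketch only shows how the stages connect; it does not implement the trained Data Cleaning RNN or the summarization module.

```python
import hashlib
import math
import random
import re

DIM = 8  # embedding size; an assumption, the abstract does not state one


def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def embed(word):
    """Placeholder for a word2vec lookup: a deterministic pseudo-random vector."""
    seed = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def make_rnn(hidden, seed=0):
    """Untrained input and recurrent weight matrices (hypothetical values)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(hidden)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]
    return w_in, w_rec


def rnn_encode(vectors, w_in, w_rec):
    """Elman-style recurrence: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    h = [0.0] * len(w_in)
    for x in vectors:
        h = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[j], x))
                + sum(wr * hp for wr, hp in zip(w_rec[j], h))
            )
            for j in range(len(h))
        ]
    return h


def encode_document(text, hidden=4):
    """Return (sentence, final hidden state) pairs for every sentence in text."""
    w_in, w_rec = make_rnn(hidden)
    return [
        (s, rnn_encode([embed(t) for t in tokenize(s)], w_in, w_rec))
        for s in split_sentences(text)
    ]
```

Calling `encode_document("Data is messy. We clean it!")` yields one hidden-state vector per sentence. In a real system the placeholder `embed` would be replaced by vectors from a trained word2vec model, and the recurrent weights would be learned on cleaning examples rather than drawn at random.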

Literature
1. Gandomi, A., Haider, M.: Beyond the hype: big data concepts, methods, and analytics. Int. J. Inf. Manage. 35(2), 137–144 (2015)
2. Kanimozhi, K.V., Venkatesan, M.: Unstructured data analysis-a survey. Int. J. Adv. Res. Comput. Commun. Eng. 4(3) (2015)
3. Chakraborty, G., Pagolu, M.K.: Analysis of Unstructured Data: Applications of Text Analytics and Sentiment Mining. Paper 1288-2014
4. Mihalcea, R., Tarau, P.: Text Rank: Bringing Order Into Texts
5. Brown, B., Chui, M., Manyika, J.: Are you ready for the era of Big Data? McKinsey Q. 4, 24–35 (2011)
8. Pawar, D.D., Bewoor, M.S., Patil, S.H.: Text rank: a novel concept for extraction based text summarization. Int. J. Comput. Sci. Inf. Technol. 5(3) (2014)
9. Mihalcea, R.: Graph-Based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization. Department of Computer Science, University of North Texas (July 2004)
10. Gupta, V., Lehal, G.S.: A survey of text summarization extractive techniques. J. Emerg. Technol. Web Intell. 2(3), 258–268 (2010)
12. Chanen, A.: Deep learning for extracting word-level meaning from safety report narratives. IEEE (2016). 978-1-5090-2149-9
14. Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: EMNLP (2013)
15. Hemalatha, I., Varma, G.P.S., Govardhan, A.: Preprocessing the informal text for efficient sentiment analysis. IJETTCS 1(2) (2012)
16. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: 25th International Conference on Machine Learning. Helsinki, Finland (2008)
17. You, L., Li, Y., Wang, Y., Zhang, J., Yang, Y.: A deep learning-based RNNs model for automatic security audit of short messages. In: IEEE 16th International Symposium on Communications and Information Technologies (ISCIT) (2016)
18. Ouyang, X., Zhou, P., Li, C.H., Liu, L.: Sentiment analysis using convolutional neural network. In: IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing, Pervasive Intelligence and Computing (2015)
19. Chen, K.-Y., Liu, S.-H., Chen, B., Wang, H.-M., Jan, E.-E., Hsu, W.-L., Chen, H.-H.: Extractive broadcast news summarization leveraging recurrent neural network language modeling techniques. IEEE Trans. Audio Speech Lang. Process. 23(8), 1322–1334 (2015)
Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-63645-0_54