2021 | OriginalPaper | Chapter

An Improved Dictionary Based Genre Classification Based on Title and Abstract of E-book Using Machine Learning Algorithms

Authors: Vrunda Thakur, Ankit C. Patel

Published in: Proceedings of Second International Conference on Computing, Communications, and Cyber-Security

Publisher: Springer Singapore

Abstract

The number of digital books, or e-books, is increasing day by day. Book assortment is the task of assigning a category, or a set of appropriate genres, to a book. The goal of this paper is to classify books into their related genres. Many existing approaches to text mining are available, such as Support Vector Machine (SVM) and Neural Text Categorizer (NTC). We applied existing machine learning algorithms to different datasets and implemented existing feature selection methods. In our proposed dictionary-based approach, we classified books by their attributes, such as title, description, genre, and author, using text mining. In the learning part, we created a dictionary of keywords from each book's description and title and then assigned genres to the keywords. In the classification part, we attributed genres to a book. To obtain the books for classification, we extracted a dataset from web pages using web scraping. Our proposed approach outperforms traditional approaches in terms of training time when large amounts of data are considered.
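
The two parts described in the abstract (learning a keyword-to-genre dictionary, then attributing genres to a book) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how keywords are extracted or how genres are scored, so the stop-word list, the tokenizer, the vote-counting rule, and the toy book records below are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list (assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "for", "is", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_dictionary(books):
    """Learning part: map each keyword to counts of the genres it co-occurs with."""
    dictionary = defaultdict(Counter)
    for book in books:
        keywords = set(tokenize(book["title"] + " " + book["description"]))
        for keyword in keywords:
            for genre in book["genres"]:
                dictionary[keyword][genre] += 1
    return dictionary


def classify(dictionary, title, description, top_k=1):
    """Classification part: sum the genre votes of known keywords, return top_k genres."""
    votes = Counter()
    for keyword in tokenize(title + " " + description):
        votes.update(dictionary.get(keyword, {}))
    return [genre for genre, _ in votes.most_common(top_k)]


# Toy, made-up training records (assumption; not taken from the paper's dataset).
training_books = [
    {"title": "The Dragon Throne",
     "description": "A wizard and a dragon fight for the kingdom.",
     "genres": ["Fantasy"]},
    {"title": "Murder at Midnight",
     "description": "A detective investigates a murder in London.",
     "genres": ["Mystery"]},
    {"title": "Stars Beyond",
     "description": "A spaceship crew explores a distant planet.",
     "genres": ["Science Fiction"]},
]

genre_dictionary = build_dictionary(training_books)
predicted = classify(genre_dictionary, "The Last Dragon",
                     "A young wizard seeks a lost kingdom.")
print(predicted)
```

In this sketch, "learning" is a single counting pass over the data with no iterative optimization, which is one plausible reading of why a dictionary-based method would train quickly on large datasets.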

Metadata
Title
An Improved Dictionary Based Genre Classification Based on Title and Abstract of E-book Using Machine Learning Algorithms
Authors
Vrunda Thakur
Ankit C. Patel
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-16-0733-2_23