2018 | OriginalPaper | Chapter

A Method for Topic Classification of Web Pages Using LDA-SVM Model

Authors: Yuliang Wei, Wei Wang, Bailing Wang, Bo Yang, Yang Liu

Published in: Proceedings of 2017 Chinese Intelligent Automation Conference

Publisher: Springer Singapore

Abstract

Rapid advances in computer and networking technologies have made the Internet the largest information medium in the world. Many companies want timely and effective access to information from the Internet, which calls for an efficient web page classification system. To meet these classification requirements, we use an LDA-SVM model for detailed web page category classification, and we discuss the impact of the number of topics K in LDA on classification. The experiments show that our method is efficient.

Metadata

Copyright Year: 2018

DOI: https://doi.org/10.1007/978-981-10-6445-6_64