
2015 | Original Paper | Book Chapter

A Novel Feature Selection Method Based on Category Distribution and Phrase Attributes

Written by: Yi Zheng, Weihong Han, Chengzhang Zhu

Published in: Trustworthy Computing and Services

Publisher: Springer Berlin Heidelberg


Abstract

A novel and effective feature selection method, CDPAB-FSM (category distribution and phrase attribute based feature selection method), was proposed for automatic Chinese web page classification. The method combined a phrase's distribution among categories with its distribution within each category. It also took the length and position of candidate phrases into account in order to distinguish feature phrases from unimportant ones. Experiments showed that CDPAB-FSM was well suited to feature selection for Chinese web page classification and achieved better classification results than TF-IDF.
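The abstract does not give the CDPAB-FSM formula, so the following is only an illustrative sketch of the general idea it describes, not the authors' method. It scores each phrase by multiplying four factors: how concentrated the phrase is in one category, how widely it occurs within that category, a length weight, and a position weight. The function name `score_phrases`, the `title_boost` parameter, the logarithmic length weight, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def score_phrases(docs, title_boost=2.0):
    """Score candidate phrases for feature selection.

    docs: list of (category, title_phrases, body_phrases) tuples.
    Returns a dict mapping phrase -> score.
    Hypothetical scoring; the weights are NOT taken from the paper.
    """
    categories = sorted({cat for cat, _, _ in docs})
    docs_per_cat = defaultdict(int)
    # doc_freq[phrase][category] = number of documents containing the phrase
    doc_freq = defaultdict(lambda: defaultdict(int))
    in_title = defaultdict(bool)

    for cat, title, body in docs:
        docs_per_cat[cat] += 1
        for phrase in set(title) | set(body):
            doc_freq[phrase][cat] += 1
        for phrase in title:
            in_title[phrase] = True

    scores = {}
    for phrase, per_cat in doc_freq.items():
        # Within-category distribution: fraction of a category's
        # documents that contain the phrase.
        coverage = {c: per_cat.get(c, 0) / docs_per_cat[c] for c in categories}
        best = max(coverage.values())
        # Among-category distribution: share of the total coverage that
        # falls in the best category (1.0 = seen in one category only).
        concentration = best / sum(coverage.values())
        # Phrase attributes (assumed forms): longer phrases and phrases
        # that appear in a title receive a higher weight.
        length_weight = math.log2(1 + len(phrase))
        position_weight = title_boost if in_title[phrase] else 1.0
        scores[phrase] = concentration * best * length_weight * position_weight
    return scores


# Toy corpus (invented for illustration): two categories, pre-segmented phrases.
docs = [
    ("sports", ["足球"], ["足球", "比赛", "新闻"]),
    ("sports", [], ["足球", "新闻"]),
    ("finance", ["股票"], ["股票", "新闻"]),
    ("finance", [], ["股票市场", "新闻"]),
]
scores = score_phrases(docs)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy corpus, a phrase that occurs in every category scores lower than a phrase confined to one category, which is the behaviour the abstract attributes to using category distribution. A real evaluation would need Chinese word segmentation and a labelled web page corpus, neither of which is shown here.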


Metadata
Title
A Novel Feature Selection Method Based on Category Distribution and Phrase Attributes
Written by
Yi Zheng
Weihong Han
Chengzhang Zhu
Copyright year
2015
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-47401-3_4