2014 | OriginalPaper | Chapter

Reducing Effects of Class Imbalance Distribution in Multi-class Text Categorization

Authors: Part Pramokchon, Punpiti Piamsa-nga

Published in: Recent Advances in Information and Communication Technology

Publisher: Springer International Publishing

In multi-class text classification, when the number of entities in each class is highly imbalanced, feature ranking methods usually perform poorly because the larger classes dominate the classifier and the smaller ones tend to be ignored. This research attempts to solve the problem by splitting the larger classes into several smaller subclasses according to their proximity, using k-means clustering; the feature scoring measure is then computed over all subclasses instead of the main classes. This cluster-based feature scoring method is proposed to reduce the influence of skewed class distributions. Compared with feature sets selected from the main classes and from ground-truth subclasses, the experimental results show that the feature set selected by the proposed method achieves a significant improvement in classifying an imbalanced corpus, the RCV1v2 dataset.
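
The idea in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the feature scoring measure, so a maximum chi-square score over classes is assumed here; the rule for choosing the number of subclasses (`ceil(n / max_size)`), the farthest-point initialisation of k-means, and all function names are hypothetical.

```python
from collections import Counter

def sqdist(u, v):
    """Squared Euclidean distance between two dense vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, k, iters=20):
    """Lloyd's k-means; returns one cluster index per vector."""
    # Assumption: deterministic farthest-point initialisation for reproducibility.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sqdist(v, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid.
        assign = [min(range(k), key=lambda c: sqdist(v, centroids[c]))
                  for v in vectors]
        # Update step: mean of members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def split_large_classes(vectors, labels, max_size):
    """Relabel documents so classes larger than max_size become subclasses."""
    new_labels = list(labels)
    for cls, n in Counter(labels).items():
        if n <= max_size:
            continue
        idx = [i for i, y in enumerate(labels) if y == cls]
        k = -(-n // max_size)  # ceil(n / max_size): hypothetical choice of k
        sub = kmeans([vectors[i] for i in idx], k)
        for i, s in zip(idx, sub):
            new_labels[i] = f"{cls}#{s}"
    return new_labels

def chi2_max_scores(vectors, labels):
    """Per term: max over classes of chi-square(term presence, class)."""
    n = len(vectors)
    scores = []
    for t in range(len(vectors[0])):
        best = 0.0
        for cls in set(labels):
            a = sum(1 for v, y in zip(vectors, labels) if v[t] > 0 and y == cls)
            b = sum(1 for v, y in zip(vectors, labels) if v[t] > 0 and y != cls)
            c = sum(1 for v, y in zip(vectors, labels) if v[t] == 0 and y == cls)
            d = n - a - b - c
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores.append(best)
    return scores

# Toy corpus (made up for illustration): 3 terms, one large and one small class.
docs = [[1, 0, 0]] * 3 + [[0, 1, 0]] * 3 + [[0, 0, 1]] * 2
labels = ["big"] * 6 + ["small"] * 2

sub_labels = split_large_classes(docs, labels, max_size=3)
main_scores = chi2_max_scores(docs, labels)      # scored against main classes
sub_scores = chi2_max_scores(docs, sub_labels)   # scored against subclasses
print(sub_labels, main_scores, sub_scores)
```

In this toy setting, scoring against the subclass labels instead of the main class labels changes the ranking of the terms, which is the effect the abstract describes. A real experiment would use sparse document vectors and a tuned number of clusters per class.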

Metadata
Title
Reducing Effects of Class Imbalance Distribution in Multi-class Text Categorization
Authors
Part Pramokchon
Punpiti Piamsa-nga
Copyright Year
2014
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-06538-0_26
