
2019 | OriginalPaper | Book chapter

BSIL: A Brain Storm-Based Framework for Imbalanced Text Classification

Authors: Jiachen Tian, Shizhan Chen, Xiaowang Zhang, Zhiyong Feng

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Neural networks are not always effective on imbalanced datasets in text classification, because most of them are designed under the assumption of balanced data. In this paper, we present BSIL, a novel framework built on brain storm optimization (BSO) that improves the capability of neural networks in imbalanced text classification. In BSIL, BSO's simulation of the human brainstorming process is used to sample imbalanced datasets in a reasonable way. First, we present an approach that generates multiple relatively balanced subsets of an imbalanced dataset by applying scrambling segmentation and global random sampling. Second, we introduce a parallel method to train a classifier on each subset efficiently. Finally, we propose a decision-making layer that accepts the “suggestions” of all classifiers in order to achieve the most reliable prediction result. The experimental results show that BSIL combined with CNN, RNN, and self-attention models performs better than those models alone in imbalanced text classification.
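
The three steps in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of scrambling segmentation, global random sampling, or the decision-making layer, so everything below is an assumption made for illustration: the function names `balanced_subsets` and `majority_vote` are hypothetical, the segmentation is a plain shuffle-and-slice of the majority class, global random sampling is not reproduced, and the decision layer is replaced by a simple majority vote.

```python
import random
from collections import Counter


def balanced_subsets(majority, minority, seed=0):
    """Shuffle the majority class, cut it into minority-sized segments,
    and pair every segment with the full minority class.

    Assumption: this stands in for "scrambling segmentation"; the
    paper's actual procedure may differ.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    size = max(1, len(minority))
    segments = [shuffled[i:i + size] for i in range(0, len(shuffled), size)]
    return [segment + list(minority) for segment in segments]


def majority_vote(predictions):
    """Return the label that most classifiers predicted.

    Assumption: a plain vote stands in for the decision-making layer.
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: 10 majority-class texts, 3 minority-class texts.
    majority = [(f"neg text {i}", "neg") for i in range(10)]
    minority = [(f"pos text {i}", "pos") for i in range(3)]

    subsets = balanced_subsets(majority, minority)
    for index, subset in enumerate(subsets):
        print(index, Counter(label for _, label in subset))

    # One prediction per classifier, each trained on one subset.
    print(majority_vote(["pos", "neg", "pos", "pos"]))
```

Each subset could then be handed to its own classifier, for example in separate worker processes, which is one plausible reading of the parallel training step. The last segment may be smaller than the others when the majority class does not divide evenly.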

Metadata
Title
BSIL: A Brain Storm-Based Framework for Imbalanced Text Classification
Authors
Jiachen Tian
Shizhan Chen
Xiaowang Zhang
Zhiyong Feng
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-32236-6_5
