DOI: 10.1145/3357384.3357891

A Semantics Aware Random Forest for Text Classification

Published: 03 November 2019

ABSTRACT

Random Forest (RF) classifiers are well suited to the high-dimensional, noisy data encountered in text classification. An RF model comprises a set of decision trees, each trained on a random subset of the features. Given an instance, the RF prediction is obtained by majority voting over the predictions of all the trees in the forest. However, different test instances take different values for the features used in the trees, so the trees should contribute differently to the prediction. Traditional RFs do not account for this diverse contribution of the trees. Many approaches model the diverse contributions by selecting a subset of trees for each instance, and this paper follows that line of work. It proposes a Semantics Aware Random Forest (SARF) classifier. SARF extracts the features the trees use to generate their predictions and selects the subset of predictions for which those features are relevant to the predicted classes. We evaluated SARF's classification performance on 30 real-world text datasets and assessed its competitiveness with state-of-the-art ensemble selection methods. The results demonstrate the superior performance of the proposed approach in textual information retrieval and initiate a new direction of research: utilising the interpretability of classifiers.
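To make the recipe in the abstract concrete: read off the features each tree actually tests on its decision path for a given instance, keep only the votes of trees whose tested features are relevant to the class they predict, and take a majority among the kept votes. The Python/scikit-learn sketch below illustrates that recipe only; the relevance test (`vote_is_relevant` and the `class_relevant_feats` mapping) is an assumed placeholder, not the paper's actual semantic-relevance criterion, and the fallback to a plain majority vote is likewise an assumption.

```python
# A minimal sketch of per-instance tree selection in the spirit of SARF,
# reconstructed from the abstract only. The relevance test and the fallback
# behaviour are hypothetical, not the paper's method.
import numpy as np
from collections import Counter

def features_on_decision_path(tree, x):
    """Indices of the features a single tree actually tests on instance x (2D)."""
    node_ids = tree.decision_path(x).indices
    feats = tree.tree_.feature[node_ids]
    return set(feats[feats >= 0])  # internal nodes only; leaves are marked -2

def vote_is_relevant(feats, cls, class_relevant_feats):
    """Hypothetical relevance test: keep a tree's vote only if at least one
    feature it tested is relevant to the class it predicted.
    class_relevant_feats maps each class label to a set of feature indices,
    e.g. derived from class-term association scores."""
    return bool(feats & class_relevant_feats.get(cls, set()))

def sarf_predict(forest, x, class_relevant_feats):
    """Predict a label for one instance by majority vote over selected trees."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    kept, all_votes = [], []
    for tree in forest.estimators_:
        # Sub-estimators of a fitted forest predict encoded class indices;
        # map them back to the original labels.
        pred = forest.classes_[int(tree.predict(x)[0])]
        all_votes.append(pred)
        if vote_is_relevant(features_on_decision_path(tree, x), pred,
                            class_relevant_feats):
            kept.append(pred)
    votes = kept if kept else all_votes  # assumed fallback: plain majority vote
    return Counter(votes).most_common(1)[0][0]
```

With a fitted `RandomForestClassifier` `rf` and a `class_relevant_feats` mapping built from, say, class-term weights, `sarf_predict(rf, x_test, class_relevant_feats)` would replace the plain `rf.predict` for a single instance.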


Published in
          CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
          November 2019
          3373 pages
ISBN: 9781450369763
DOI: 10.1145/3357384

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 3 November 2019


          Qualifiers

          • research-article

          Acceptance Rates

CIKM '19 paper acceptance rate: 202 of 1,031 submissions, 20%
Overall acceptance rate: 1,861 of 8,427 submissions, 22%
