
An information-theoretic graph-based approach for feature selection

Published in Sādhanā.

Abstract

Feature selection is a critical research problem in data science. The need for it has become more pressing with the advent of high-dimensional data sets, especially in text, image and micro-array applications. In this paper, a graph-theoretic approach with step-by-step visualization is proposed for supervised feature selection. The mutual information criterion is used to evaluate the relevance of each feature with respect to the class. A graph-based representation of the input data set, named the feature information map (FIM), is created, with the vertices representing less informative features highlighted. Among the more informative features, inter-feature similarity is measured, and edges are drawn between features with high similarity. Finally, a minimal vertex cover is computed on the connected vertices to identify a subset of features with low mutual similarity. Experiments on standard data sets show that the proposed method outperforms competing algorithms on most of them. The proposed algorithm also makes the novel contribution of rendering a visualization of features in terms of relevance and redundancy.
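The pipeline the abstract describes (relevance filtering by mutual information, similarity edges between informative features, and vertex-cover-based pruning) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the names (`fim_select`, `mi_threshold`, `sim_threshold`), the choice of mutual information normalized by the smaller entropy as the inter-feature similarity measure, the threshold values, and the greedy max-degree vertex-cover heuristic are all assumptions. The complement of the cover is taken as the selected, mutually dissimilar subset, which is one plausible reading of the abstract.

```python
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def fim_select(features, labels, mi_threshold=0.05, sim_threshold=0.8):
    """features: dict mapping feature name -> list of discrete values.
    1) keep features whose MI with the class exceeds mi_threshold;
    2) connect pairs of informative features whose similarity (here,
       MI normalized by the smaller entropy -- an assumed measure)
       exceeds sim_threshold;
    3) greedily approximate a minimum vertex cover (max-degree
       heuristic) and return its complement: a subset whose members
       share no high-similarity edge."""
    informative = [f for f in features
                   if mutual_information(features[f], labels) > mi_threshold]
    edges = set()
    for f, g in combinations(informative, 2):
        h_min = min(mutual_information(features[f], features[f]),  # H(f)
                    mutual_information(features[g], features[g]))  # H(g)
        sim = mutual_information(features[f], features[g]) / max(h_min, 1e-12)
        if sim > sim_threshold:
            edges.add((f, g))
    cover, uncovered = set(), set(edges)
    while uncovered:
        # pick the vertex covering the most remaining edges
        v = max(informative, key=lambda f: sum(f in e for e in uncovered))
        cover.add(v)
        uncovered = {e for e in uncovered if v not in e}
    return [f for f in informative if f not in cover]
```

The greedy max-degree heuristic stands in for whichever minimum-vertex-cover approximation the paper actually employs; exact minimum vertex cover is NP-hard, so some approximation is needed at this step.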

Figure 1
Figure 2




Corresponding author

Correspondence to Basabi Chakraborty.


About this article


Cite this article

Das, A.K., Kumar, S., Jain, S. et al. An information-theoretic graph-based approach for feature selection. Sādhanā 45, 11 (2020). https://doi.org/10.1007/s12046-019-1238-2

