2014 | OriginalPaper | Book chapter

RF-SEA-Based Feature Selection for Data Classification in Medical Domain

Written by: S. Sasikala, S. Appavu alias Balamurugan, S. Geetha

Published in: Intelligent Computing, Networking, and Informatics

Publisher: Springer India

Abstract

Dimensionality reduction is an essential problem in data analysis that has received significant attention from several disciplines. It comprises two types of methods: feature extraction and feature selection. In this paper, we introduce a simple method for supervised feature selection in data classification tasks. The proposed hybrid feature selection (HFS) mechanism, RF-SEA (ReliefF-Shapley ensemble analysis), combines filter and wrapper models for dimension reduction. In the first stage, the filter model ranks the features by their ReliefF (RF) scores between classes and then selects the features most relevant to the classes using a threshold. In the second stage, the Shapley ensemble algorithm evaluates the contribution of each feature in the ranked subset to the classification task. Principal component analysis (PCA) is carried out as a preprocessing step before both stages. Experiments with several medical datasets show that the proposed approach can detect completely irrelevant features and remove redundant features without significantly hurting the performance of the classification algorithm. The experimental results also show that RF-SEA obtains better classification performance than either the Shapley-value-based method or the ReliefF (RF)-based method used alone.
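
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified two-class Relief (one nearest hit and one nearest miss per sample) in place of full ReliefF, exact Shapley values with leave-one-out 1-nearest-neighbour accuracy as a stand-in value function, a hypothetical threshold of 0.0, and a made-up toy dataset. The PCA preprocessing step is omitted. All function names are illustrative.

```python
import itertools
import math

def relief_scores(X, y):
    """Simplified Relief: one nearest hit and one nearest miss per sample."""
    n, d = len(X), len(X[0])
    # Feature ranges, used to normalise per-feature differences to [0, 1].
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        h = min(hits, key=lambda k: dist(X[i], X[k]))
        m = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that separate classes, penalise within-class spread.
            w[j] += (diff(X[i], X[m], j) - diff(X[i], X[h], j)) / n
    return w

def loo_accuracy(X, y, feats):
    """Assumed value function: leave-one-out 1-NN accuracy on a feature subset."""
    if not feats:
        # With no features, fall back to the majority-class rate.
        return max(y.count(c) for c in set(y)) / len(y)
    correct = 0
    for i in range(len(X)):
        nn = min((k for k in range(len(X)) if k != i),
                 key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in feats))
        correct += y[nn] == y[i]
    return correct / len(X)

def shapley_values(feats, value):
    """Exact Shapley values; enumerates all subsets, so only for few features."""
    p = len(feats)
    phi = {f: 0.0 for f in feats}
    for f in feats:
        others = [g for g in feats if g != f]
        for r in range(p):
            weight = math.factorial(r) * math.factorial(p - r - 1) / math.factorial(p)
            for S in itertools.combinations(others, r):
                phi[f] += weight * (value(S + (f,)) - value(S))
    return phi

# Toy data (made up): features 0 and 1 track the class, feature 2 is noise.
X = [[0.0, 0.1, 0.5], [0.1, 0.0, 0.9], [0.2, 0.2, 0.1],
     [0.9, 1.0, 0.4], [1.0, 0.8, 0.8], [0.8, 0.9, 0.0]]
y = [0, 0, 0, 1, 1, 1]

# Stage 1 (filter): rank by Relief score and keep features above the threshold.
scores = relief_scores(X, y)
threshold = 0.0  # hypothetical cut-off
ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
kept = [j for j in ranked if scores[j] > threshold]

# Stage 2 (wrapper): Shapley contribution of each kept feature.
phi = shapley_values(kept, lambda S: loo_accuracy(X, y, S))
print("Relief scores:", scores)
print("Kept features:", kept)
print("Shapley values:", phi)
```

On this toy data the noise feature receives a negative Relief score and is filtered out, while the two informative features share the accuracy gain equally because each one alone already separates the classes. Exact Shapley enumeration grows exponentially with the number of features, which is one reason a filter stage ahead of it is useful; for larger subsets a sampling-based approximation would be needed.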

Metadata
Title
RF-SEA-Based Feature Selection for Data Classification in Medical Domain
Written by
S. Sasikala
S. Appavu alias Balamurugan
S. Geetha
Copyright year
2014
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-1665-0_59