Published in: Knowledge and Information Systems 2/2019

05.09.2018 | Regular Paper

Graph clustering-based discretization approach to microarray data

Authors: Kittakorn Sriwanna, Tossapon Boongoen, Natthakan Iam-On


Abstract

Several data mining techniques require discrete data, and learning on discrete domains often performs better than learning on continuous data. Multivariate discretization transforms continuous data into discrete data while taking correlations among attributes into account. Given the benefit of this idea, many multivariate discretization algorithms have been proposed. However, only a few discretization algorithms apply directly to microarray (gene expression) data, which is high-dimensional and imbalanced, and, despite the appeal of the approach, no multivariate method has been put forward for microarray data analysis. According to recently published research, the graph clustering-based discretization methods of splitting and merging (GraphS and GraphM) usually achieve superior results compared with many well-known discretization algorithms. In this paper, GraphS and GraphM are extended with an alpha parameter, the ratio between the similarity of gene expressions (distance) and the similarity of the class labels. Moreover, the extensions consider three similarity measures (cosine similarity, Euclidean distance, and Pearson correlation) in order to determine the most suitable pairwise similarity measure. An evaluation on 20 real microarray datasets with 4 classifiers suggests that the two proposed methods based on cosine similarity, GraphM(C) and GraphS(C), outperform 9 state-of-the-art discretization algorithms on three classification performance measures (ACC, AUC, Kappa) and on running time.
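
To illustrate how an alpha parameter could balance expression similarity against class-label similarity, here is a minimal Python sketch. The abstract does not state the formula, so everything specific here is an assumption rather than the authors' definition: the convex combination `alpha * expr_sim + (1 - alpha) * label_sim`, the 0/1 label agreement, the `1 / (1 + d)` mapping from Euclidean distance to a similarity, and all function names.

```python
import numpy as np


def cosine_similarity(x, y):
    """Cosine of the angle between two expression vectors, in [-1, 1]."""
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(np.dot(x, y) / denom) if denom > 0 else 0.0


def euclidean_similarity(x, y):
    """Euclidean distance mapped to (0, 1] via 1 / (1 + d) (assumed mapping)."""
    return float(1.0 / (1.0 + np.linalg.norm(x - y)))


def pearson_similarity(x, y):
    """Pearson correlation coefficient, in [-1, 1]."""
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.linalg.norm(xc) * np.linalg.norm(yc)
    return float(np.dot(xc, yc) / denom) if denom > 0 else 0.0


# Hypothetical registry of the three measures named in the abstract.
MEASURES = {
    "cosine": cosine_similarity,
    "euclidean": euclidean_similarity,
    "pearson": pearson_similarity,
}


def combined_similarity(x_i, x_j, label_i, label_j, alpha=0.5, measure="cosine"):
    """Blend expression similarity and label agreement (assumed form).

    alpha = 1 uses only the expression values; alpha = 0 uses only the labels.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    expr_sim = MEASURES[measure](x_i, x_j)
    label_sim = 1.0 if label_i == label_j else 0.0
    return alpha * expr_sim + (1.0 - alpha) * label_sim
```

For example, `combined_similarity([1, 0], [1, 0], "tumor", "normal", alpha=0.5)` blends a cosine similarity of 1 with a label agreement of 0. Under these assumptions, a pairwise matrix of such values could serve as the edge weights of the graph that is later split or merged.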


Metadata
Title
Graph clustering-based discretization approach to microarray data
Authors
Kittakorn Sriwanna
Tossapon Boongoen
Natthakan Iam-On
Publication date
05.09.2018
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-018-1249-z
