Published in: Knowledge and Information Systems 2/2018

27-10-2017 | Regular Paper

Estimation of incomplete values in heterogeneous attribute large datasets using discretized Bayesian max–min ant colony optimization

Authors: Sivaraj Rajappan, DeviPriya Rangasamy

Abstract

Datasets keep growing in size, and missing values in them pose a serious challenge to data analysts. Although researchers have developed various techniques for handling missing values in different kinds of datasets, little effort has gone into handling missing values in the mixed attributes of large datasets. This paper proposes novel strategies for dealing with this issue. The significant attributes (covariates) required for imputation are first selected using the gain ratio measure, which reduces the computational complexity. Because analysing continuous attributes during imputation is complex, they are first discretized using a novel methodology called Bayesian classifier-based discretization. The missing values are then imputed using the Bayesian max–min ant colony optimization algorithm, which hybridizes ant colony optimization (ACO) with Bayesian principles. A local search technique is also introduced into the ACO implementation to improve its exploitative capability. The proposed methodology is applied to real datasets with missing rates ranging from 5 to 50%, and the experimental results show that the proposed discretization and imputation algorithms produce better results than the existing methods.
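
The abstract names gain ratio as the covariate-selection measure but gives no formula. As a minimal sketch, assuming the standard definition (information gain divided by split information) and complete cases only, the measure could be computed as below. The function names `entropy` and `gain_ratio` and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(attribute, target):
    """Gain ratio of a discrete attribute with respect to a discrete target.

    Assumes both sequences have equal length and contain no missing values.
    """
    n = len(target)
    # Group the target values by attribute value.
    groups = {}
    for a, y in zip(attribute, target):
        groups.setdefault(a, []).append(y)
    # Weighted entropy of the target after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(target) - conditional
    # Split information is the entropy of the attribute itself.
    split_info = entropy(attribute)
    # A constant attribute has zero split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one fully informative and one uninformative attribute.
target = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
scores = {
    "informative": gain_ratio(informative, target),
    "uninformative": gain_ratio(uninformative, target),
}
```

Attributes would then be ranked by score and the top-ranked ones kept as covariates. How the paper sets the cut-off is not stated in the abstract.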

Metadata
Title
Estimation of incomplete values in heterogeneous attribute large datasets using discretized Bayesian max–min ant colony optimization
Authors
Sivaraj Rajappan
DeviPriya Rangasamy
Publication date
27-10-2017
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2018
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-017-1123-4
