Published in: International Journal of Machine Learning and Cybernetics 1/2023

27-06-2022 | Original Article

Merits of Bayesian networks in overcoming small data challenges: a meta-model for handling missing data

Authors: Hanen Ameur, Hasna Njah, Salma Jamoussi

Abstract

The abundant availability of data in the Big Data era has helped achieve significant advances in machine learning. However, many datasets are incomplete in various respects, such as values, labels, annotations and records. Once the records that introduce ambiguity are discarded, the exploitable data shrinks to a small, sometimes ineffective, portion. Making the most of this small portion is burdensome because it usually yields overfitted models. In this paper, we propose a new taxonomy of data missingness in the machine learning context, along with a new meta-model for addressing the missing data problem in real and open data. Our methodology relies on an H2S kernel whose ultimate goal is the effective learning of a generalized Bayesian network from small input datasets. Our contributions are motivated by the strong probabilistic foundation of Bayesian networks, on the one hand, and by the effectiveness of ensemble learning, on the other. The highlights of our kernel are a new strategy for learning multiple Bayesian network structures and a novel technique for the weighted fusion of Bayesian network structures. To harness the richness of the merged network in terms of knowledge, we propose four H2S-derived systems that address the impacts of missing values and records through annotation, balancing, missing-value imputation and data over-sampling. We combine these systems into a meta-model and perform a step-by-step experimental study. The obtained results showcase the efficiency of our contributions in dealing with multi-class problems and extremely small datasets.
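The weighted fusion of Bayesian network structures described in the abstract can be illustrated at a high level as weighted edge voting over an ensemble of candidate structures. The sketch below is an assumption-laden illustration of that general idea, not the authors' actual H2S kernel: the function name, the voting threshold, and the tie-breaking rule are all hypothetical choices for the example.

```python
# Illustrative sketch (not the authors' H2S kernel): fuse several candidate
# Bayesian network structures, each represented as a set of directed edges,
# into one consensus structure by weighted edge voting. The weights would
# typically come from each candidate network's score on the data.

def fuse_structures(structures, weights, threshold=0.5):
    """Fuse candidate DAGs (edge sets) by weighted edge voting.

    structures: list of sets of directed edges (u, v)
    weights:    per-structure confidence weights (assumed: network scores)
    threshold:  keep an edge whose normalized weighted vote exceeds this
    """
    total = sum(weights)
    vote = {}
    for edges, w in zip(structures, weights):
        for e in edges:
            vote[e] = vote.get(e, 0.0) + w / total
    # Keep well-supported edges; when both orientations (u, v) and (v, u)
    # pass the threshold, keep only the better-supported direction so the
    # fused graph avoids trivial two-node cycles.
    fused = set()
    for (u, v), s in vote.items():
        if s > threshold and s >= vote.get((v, u), 0.0):
            fused.add((u, v))
    return fused

# Three candidate structures over variables A, B, C, with assumed scores
s1 = {("A", "B"), ("B", "C")}
s2 = {("A", "B"), ("A", "C")}
s3 = {("A", "B"), ("B", "C")}
fused = fuse_structures([s1, s2, s3], weights=[0.9, 0.6, 0.8])
print(sorted(fused))  # [('A', 'B'), ('B', 'C')] — ('A', 'C') fails the vote
```

In this toy run, the edge (A, C) is supported only by the weakest candidate (normalized vote ≈ 0.26), so it is pruned, while (A, B) and (B, C) survive. A full system would also need to check the fused graph for larger cycles before using it as a Bayesian network structure.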

Footnotes
2
The Survey network can be downloaded in several formats from the BNLEARN repository: https://www.bnlearn.com/bnrepository/discrete-small.html#survey.
 
3
These datasets are available online via the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets.php.
 
Metadata
Title
Merits of Bayesian networks in overcoming small data challenges: a meta-model for handling missing data
Authors
Hanen Ameur
Hasna Njah
Salma Jamoussi
Publication date
27-06-2022
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 1/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-022-01577-9
