
2018 | OriginalPaper | Book chapter

Missing Data Imputation by LOLIMOT and FSVM/FSVR Algorithms with a Novel Approach: A Comparative Study

Authors: Fatemeh Fazlikhani, Pegah Motakefi, Mir Mohsen Pedram

Published in: Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations

Publisher: Springer International Publishing


Abstract

Missing values are an inherent part of collecting data sets for real-world problems. They introduce considerable ambiguity into data analysis when those data sets are processed. Methods that can handle missing data are therefore critical in many fields for providing accurate, efficient and valid analysis.
In this paper, we propose a novel preprocessing approach that estimates and imputes missing values in data sets using the LOLIMOT and FSVM/FSVR algorithms, which are state-of-the-art algorithms. Classification accuracy is the measure used to compare the precision and efficiency of the presented approach with some other well-known methods. The results obtained show that the proposed approach is the most accurate of the methods compared.
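
The evaluation idea in the abstract (impute first, then judge each imputation method by the accuracy of a downstream classifier) can be illustrated with a minimal sketch. This is not the authors' method: an ordinary least-squares model stands in for LOLIMOT and FSVR, a nearest-centroid rule stands in for the classifier, and the data are synthetic. All function names and parameters below are hypothetical and chosen only for illustration.

```python
import numpy as np

def impute_mean(X):
    """Baseline: replace each NaN with the mean of its column's observed values."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def impute_by_regression(X, target):
    """Fill NaNs in column `target` with predictions of a linear model.

    The model is fitted on rows where `target` is observed and uses the
    remaining columns as inputs. Assumption: those columns are complete.
    Plain least squares is a stand-in here, not LOLIMOT or FSVR.
    """
    X = X.copy()
    missing = np.isnan(X[:, target])
    others = np.delete(X, target, axis=1)
    A = np.column_stack([others, np.ones(len(X))])  # inputs plus intercept
    coef, *_ = np.linalg.lstsq(A[~missing], X[~missing, target], rcond=None)
    X[missing, target] = A[missing] @ coef
    return X

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier (a simple stand-in classifier)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_test).mean())

# Synthetic data (assumption): the third column is an exact linear
# function of the first two, and the class label depends on it.
rng = np.random.default_rng(0)
n = 200
X_full = rng.normal(size=(n, 3))
X_full[:, 2] = 2.0 * X_full[:, 0] - X_full[:, 1]
y = (X_full[:, 2] > 0).astype(int)

# Remove roughly 30% of the third column at random.
X_missing = X_full.copy()
mask = rng.random(n) < 0.3
X_missing[mask, 2] = np.nan

# Impute with each method, then compare downstream classification accuracy.
train = np.arange(n) < 150
test = ~train
candidates = {
    "mean": impute_mean(X_missing),
    "regression": impute_by_regression(X_missing, target=2),
}
for name, X_imp in candidates.items():
    acc = nearest_centroid_accuracy(X_imp[train], y[train], X_imp[test], y[test])
    print(f"{name} imputation: accuracy = {acc:.3f}")
```

Because the synthetic column is exactly linear in the others, the regression stand-in recovers the removed values almost perfectly. Real data sets would not behave this way, which is where nonlinear estimators such as those named in the abstract become relevant.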


Metadata
Title
Missing Data Imputation by LOLIMOT and FSVM/FSVR Algorithms with a Novel Approach: A Comparative Study
Authors
Fatemeh Fazlikhani
Pegah Motakefi
Mir Mohsen Pedram
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-91476-3_46