Published in: Cluster Computing 3/2016

1 September 2016

An examination of on-line machine learning approaches for pseudo-random generated data

Authors: Jia Zhu, Chuanhua Xu, Zhixu Li, Gabriel Fung, Xueqin Lin, Jin Huang, Changqin Huang


Abstract

A pseudo-random generator is an algorithm that generates a sequence of objects which is determined by a truly random seed but is not itself truly random. Such generators are widely used in applications such as cryptography and simulation. In this article, we examine currently popular machine learning algorithms in combination with various on-line algorithms on pseudo-random generated data, in order to find out which machine learning approach is most suitable for on-line prediction on this kind of data. To further improve prediction performance, we propose a novel sample-weighted algorithm that takes the generalization error in each iteration into account. We perform an intensive evaluation on real Baccarat data generated by casino machines and on random numbers generated by a popular Java program, two typical examples of pseudo-random generated data. The experimental results show that support vector machines and k-nearest neighbors perform better than the other approaches on the evaluation data set, both with and without the sample-weighted algorithm.
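
The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the generator below is a 48-bit linear congruential generator modelled on the constants documented for `java.util.Random`, and the learner is a plain perceptron evaluated in a test-then-train (on-line) fashion. The function names `lcg_bits` and `prequential_accuracy`, the window size, and the seed are assumptions made for illustration only, and the sample-weighted algorithm proposed in the article is not reproduced here.

```python
from typing import List

# Constants of the 48-bit linear congruential generator documented for
# java.util.Random (used here only as an illustrative generator).
MULTIPLIER = 0x5DEECE66D
ADDEND = 0xB
MASK = (1 << 48) - 1


def lcg_bits(seed: int, count: int) -> List[int]:
    """Return `count` pseudo-random bits; the sequence is fixed by `seed`."""
    state = (seed ^ MULTIPLIER) & MASK  # initial seed scrambling
    bits = []
    for _ in range(count):
        state = (state * MULTIPLIER + ADDEND) & MASK
        bits.append(state >> 47)  # highest of the 48 state bits
    return bits


def prequential_accuracy(bits: List[int], window: int = 8) -> float:
    """On-line perceptron: predict each bit from the previous `window` bits,
    score the prediction, and only then update the weights."""
    weights = [0.0] * (window + 1)  # last weight acts as the bias
    correct = 0
    total = 0
    for t in range(window, len(bits)):
        # Map bits {0, 1} to {-1, +1} and append a constant bias input.
        features = [2 * b - 1 for b in bits[t - window:t]] + [1]
        score = sum(w * x for w, x in zip(weights, features))
        prediction = 1 if score >= 0 else 0
        target = bits[t]
        correct += prediction == target
        total += 1
        if prediction != target:  # perceptron update on mistakes only
            sign = 1 if target == 1 else -1
            weights = [w + sign * x for w, x in zip(weights, features)]
    return correct / total if total else 0.0


if __name__ == "__main__":
    stream = lcg_bits(seed=2016, count=5000)  # hypothetical seed
    print(f"prequential accuracy: {prequential_accuracy(stream):.3f}")
```

Calling `lcg_bits` twice with the same seed returns the same sequence, which is the sense in which the data is determined by the seed rather than truly random. The test-then-train loop is one common way to score an on-line learner, since every example is used for evaluation before it is used for training.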


Metadata
Publisher
Springer US
Published in
Cluster Computing, Issue 3/2016
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-016-0586-5
