nach oben

Progress in Artificial Intelligence

Erschienen in:

16.02.2017 | Regular Paper

The effect of human thought on data: an analysis of self-reported data in supervised learning and neural networks

verfasst von: Justin Lovinger, Iren Valova

Erschienen in: Progress in Artificial Intelligence | Ausgabe 3/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

While scientific applications can gather consistent data from the natural world, psychological, sociological, and even economic applications rely on data provided by people. Since the majority of machine learning is aimed at improving the lives of people, human input is essential for useful results. In this paper, we explore datasets where input and target attributes are provided by people taking surveys. Every survey dataset, generated from human input, is reliable and self-consistent according to Cronbach’s alpha. One expects a reliable questionnaire to provide effective data for learning. It is this expectation that our analysis finds false, when applied to supervised learning. Both statistical analysis and application of several supervised learning architectures, with a focus on neural networks, are utilized to provide insight into data gathered through human input.

Vorheriger Artikel MR-DIS: democratic instance selection for big data by MapReduce

Nächster Artikel Preference stability along time: the time cohesiveness measure

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Lovinger, J.: Clever surveys. https://www.cleversurveys.com/. Accessed 30 Dec 2016

Frénay, B., Verleysen, M.: Classification in the presence of label noise: a survey. IEEE Trans. Neural Netw. Learn. Syst. 25(5), 845–869 (2014)CrossRef

Basu, M., Ho, T.K.: Data Complexity in Pattern Recognition. Springer Science & Business Media, Berlin (2006)CrossRefMATH

Sáez, J.A., Krawczyk, B., Woźniak, M.: Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets. Pattern Recogn. 57, 164–178 (2016)CrossRef

Brodley, C.E., Friedl, M.A.: Identifying mislabeled training data. J. Artif. Intell. Res. 11, 131–167 (1999)MATH

Smith, M.R., Martinez, T.: Improving classification accuracy by identifying and removing instances that should be misclassified. In: The 2011 International Joint Conference on Neural Networks (IJCNN), pp. 2690–2697. IEEE (2011)

Sánchez, J.S., Barandela, R., Marqués, A.I., Alejo, R., Badenas, J.: Analysis of new techniques to obtain quality training sets. Pattern Recogn. Lett. 24(7), 1015–1022 (2003)CrossRef

Barandela, R., Gasca, E.: Decontamination of training samples for supervised pattern recognition methods. In: Ferri, F.J., Inesta, J.M., Amin, A., Pudil, P. (eds.) Advances in Pattern Recognition, pp. 621–630. Springer, Berlin (2000)

Jiang, Y., Zhou, Z.-H.: Editing training data for knn classifiers with neural network ensemble. In: Advances in Neural Networks–ISNN 2004, pp. 356–361. Springer, Berlin (2004)

10.

Bootkrajang, J., Kabán, A.: Multi-class classification in the presence of labelling errors. In: ESANN, Citeseer (2011)

11.

Harhoff, D., Körting, T.: Lending relationships in Germany–empirical evidence from survey data. J. Bank. Finance 22(10), 1317–1353 (1998)CrossRef

12.

De Vaus, D.: Surveys in Social Research. Routledge, London (2013)

13.

Thompson, D.F.: Deliberative democratic theory and empirical political science. Annu. Rev. Polit. Sci. 11, 497–520 (2008)CrossRef

14.

van Kampen, D.: The 5-dimensional personality test (5dpt): relationships with two lexically based instruments and the validation of the absorption scale. J. Personal. Assess. 94(1), 92–101 (2012)CrossRef

15.

Burisch, M.: Approaches to personality inventory construction: a comparison of merits. Am. Psychol. 39(3), 214 (1984)CrossRef

16.

Reyes-Ortiz, J.-L., Anguita, D., Ghio, A., Parra, X.: Human activity recognition using smartphones data set. UCI Machine Learning Repository (2013)

17.

Aha, D.W.: Heart disease data set. UCI Machine Learning Repository (1988)

18.

Gonyea, R.M.: Self-reported data in institutional research: review and recommendations. New Dir. Inst. Res. 127, 73 (2005)

19.

Harrison, L.D.: The validity of self-reported data on drug use. J. Drug Issues 25(1), 91–111 (1995)CrossRef

20.

van Poppel, M.N.M., de Vet, H.C.W., Koes, B.W., Smid, T., Bouter, L.M.: Measuring sick leave: a comparison of self-reported data on sick leave and data from company records. Occup. Med. 52(8), 485–490 (2002)CrossRef

21.

Wang, S.: Classification with incomplete survey data: a hopfield neural network approach. Comput. Oper. Res. 32(10), 2583–2594 (2005)CrossRefMATH

22.

Lu, C., Li, X.-W., Pan, H.-B.: Application of extension neural network for classification with incomplete survey data. In: First International Conference on Innovative Computing, Information and Control, 2006. ICICIC’06, vol. 3, pp. 190–193. IEEE (2006)

23.

Tagliaferri, R., Longo, G., Milano, L., Acernese, F., Barone, F., Ciaramella, A., De Rosa, R., Donalek, C., Eleuteri, A., Raiconi, G., et al.: Neural networks in astronomy. Neural Netw. 16(3), 297–319 (2003)CrossRef

24.

Hagan, M.T., Demuth, H.B., Beale, M.H., et al.: Neural Network Design. Pws Pub, Boston (1996)

25.

Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)

26.

Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: International Conference on Artificial Intelligence and Statistics, pp. 315–323 (2011)

27.

Tóth, L.: Phone recognition with deep sparse rectifier neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6985–6989. IEEE (2013)

28.

Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of ICML, vol. 30, p. 1 (2013)

29.

Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010, pp. 177–186. Springer, Berlin (2010)

30.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)MathSciNetMATH

31.

Lowe, D., Broomhead, D.: Multivariable functional interpolation and adaptive networks. Complex Syst. 2, 321–355 (1988)MathSciNetMATH

32.

Broomhead, D.S., Lowe, D.: Radial basis functions, multi-variable functional interpolation and adaptive networks. Technical report, DTIC Document (1988)

33.

Tan, Y., Wang, J., Zurada, J.M.: Nonlinear blind source separation using a radial basis function network. IEEE Trans. Neural Netw. 12(1), 124–134 (2001)CrossRef

34.

Kohonen, T.: Self-organized formation of topologically correct feature maps. Biol. Cybern. 43(1), 59–69 (1982)MathSciNetCrossRefMATH

35.

Kohonen, T.: The self-organizing map. Proc. IEEE 78(9), 1464–1480 (1990)CrossRef

36.

Kalteh, A.M., Hjorth, P., Berndtsson, R.: Review of the self-organizing map (SOM) approach in water resources: analysis, modelling and application. Environ. Model. Softw. 23(7), 835–845 (2008)CrossRef

37.

Mao, K.Z., Tan, K.-C.: Probabilistic neural-network structure determination for pattern classification. IEEE Trans. Neural Netw. 11(4), 1009–1016 (2000)CrossRef

38.

Gao, M., Tian, J.: Web classification mining based on radial basic probabilistic neural network. In: 2009 First International Workshop on Database Technology and Applications, pp. 586–589. IEEE (2009)

39.

Specht, D.F.: Probabilistic neural networks. Neural Netw. 3(1), 109–118 (1990)CrossRef

40.

Ho, T.K.: Random decision forests. In: Proceedings of the Third International Conference on Document Analysis and Recognition, 1995, vol. 1, pp. 278–282. IEEE (1995)

41.

Díaz-Uriarte, R., De Andres, S.A.: Gene selection and classification of microarray data using random forest. BMC Bioinf. 7(1), 1 (2006)CrossRef

42.

Rodriguez-Galiano, V.F., Ghimire, B., Rogan, J., Chica-Olmo, M., Rigol-Sanchez, J.P.: An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 67, 93–104 (2012)CrossRef

43.

Utgoff, P.E.: Incremental induction of decision trees. Mach. Learn. 4(2), 161–186 (1989)

44.

Schmid, H.: Probabilistic part-of-speech tagging using decision trees. In: Proceedings of the International Conference on New Methods in Language Processing (1994)

45.

Pradhan, B.: A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using gis. Comput. Geosci. 51, 350–365 (2013)CrossRef

46.

Milborrow, S.: Titanic decision tree. https://en.wikipedia.org/wiki/Decision_tree_learning#/media/File:CART_tree_titanic_survivors.png

47.

Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. Rev. 5(1), 3–55 (2001)MathSciNetCrossRef

48.

Quinlan, R.J.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)

49.

scikit-learn: http://scikit-learn.org

50.

TheStoat: Linux distributions. https://www.cleversurveys.com/surveys/survey-5763859801440256_5629499534213120/responses/-linux-distributions. Accessed 30 Dec 2016

51.

Lewis, R., Goldberg, L.R.: The structure of phenotypic personality traits. Am. Psychol. 48(1), 26 (1993)CrossRef

52.

Costa, P.T., McCrae, R.R.: The revised neo personality inventory (neo-pi-r). SAGE Handb. Personal. Theory Assess. 2, 179–198 (2008)

53.

Turiano, N.A., Mroczek, D.K., Moynihan, J., Chapman, B.P.: Big 5 personality traits and interleukin-6: evidence for healthy neuroticism in a us population sample. Brain Behav. Immun. 28, 83–89 (2013)CrossRef

54.

PaintingInAir: What pet should i get? https://www.cleversurveys.com/surveys/survey-5709198289534976_5629499534213120/responses/-what-pet-should-i-get. Accessed 30 Dec 2016

55.

AvinaDiviri: Alcoholic drink predictor. https://www.cleversurveys.com/surveys/survey-6024271184789504_5668600916475904/responses/-alcoholic-drink-predictor. Accessed 30 Dec 2016

56.

Marshall, M.: UCI machine learning repository (1988)

57.

Mangasarian, O.L., Setiono, R., Wolberg, W.H.: Pattern recognition via linear programming: theory and application to medical diagnosis (1990)

58.

Tavakol, M., Dennick, R.: Making sense of Cronbach’s alpha. Int. J. Med. Educ. 2, 53 (2011)CrossRef

59.

Bland, J.M., Altman, D.G.: Statistics notes: Cronbach’s alpha. Bmj 314(7080), 572 (1997)CrossRef

Titel: The effect of human thought on data: an analysis of self-reported data in supervised learning and neural networks
verfasst von: Justin Lovinger
Iren Valova
Publikationsdatum: 16.02.2017
Verlag: Springer Berlin Heidelberg
Erschienen in: Progress in Artificial Intelligence / Ausgabe 3/2017
Print ISSN: 2192-6352
Elektronische ISSN: 2192-6360
DOI: https://doi.org/10.1007/s13748-017-0118-4

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Weitere Artikel der Ausgabe 3/2017

A genetic algorithm approach to customizing a glucose model based on usual therapeutic parameters

Preference stability along time: the time cohesiveness measure

Application of ensemble classifiers in accurate diagnosis of myocardial ischemia conditions

MR-DIS: democratic instance selection for big data by MapReduce

Handling swarm of UAVs based on evolutionary multi-objective optimization

Comparing multi-objective metaheuristics for solving a three-objective formulation of multiple sequence alignment

Premium Partner