
2019 | OriginalPaper | Chapter

Crowd-Powered Systems to Diminish the Effects of Semantic Drift

Authors: Saulo D. S. Pedro, Estevam R. Hruschka Jr.

Published in: Hybrid Artificial Intelligent Systems

Publisher: Springer International Publishing

Abstract

The Internet and the social Web have made it possible to acquire information to feed a growing number of Machine Learning (ML) applications. They have also drawn attention to crowdsourcing approaches, which are commonly applied to problems that are easy for humans but difficult for computers to solve, and which give rise to crowd-powered systems. In this work, we consider the issue of semantic drift in a bootstrap learning algorithm and propose the novel idea of a crowd-powered approach to diminish its effects. To put this idea to the test, we built a hybrid of the Coupled Pattern Learner (CPL), a bootstrap learning algorithm that extracts contextual patterns from unstructured text, and SSCrowd, a component that enables conversation between learning systems and Web users. The hybrid actively and autonomously seeks human supervision by asking people to take part in the knowledge acquisition process, thus using the intelligence of the crowd to improve the learning capabilities of CPL. We take advantage of the ease with which humans understand language in unstructured text, and we show the results of using a hybrid crowd-powered approach to diminish the effects of semantic drift.
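The idea in the abstract can be illustrated with a toy sketch. It is not the authors' CPL or SSCrowd implementation: the function `bootstrap`, the two-word context patterns, the example sentences, and the `ask_crowd` callback (a stand-in for human answers) are all assumptions made for illustration. The sketch shows how one wrong instance can pull a bootstrap learner away from its target category, and how a human check on each candidate can stop that.

```python
from typing import Callable, Iterable, Set, Tuple


def bootstrap(sentences: Iterable[str], seeds: Set[str],
              ask_crowd: Callable[[str], bool], rounds: int = 2) -> Set[str]:
    """Toy bootstrap learner for one category (hypothetical, not CPL).

    Each round learns the two words that follow a known instance as a
    context pattern, collects new words seen before a learned pattern,
    and promotes only the candidates that ask_crowd approves.
    """
    sentences = [s.split() for s in sentences]
    instances = set(seeds)
    for _ in range(rounds):
        # Step 1: learn context patterns from the known instances.
        patterns: Set[Tuple[str, str]] = set()
        for words in sentences:
            for i in range(len(words) - 2):
                if words[i] in instances:
                    patterns.add((words[i + 1], words[i + 2]))
        # Step 2: collect unknown words that occur before a learned pattern.
        candidates = set()
        for words in sentences:
            for i in range(len(words) - 2):
                if (words[i + 1], words[i + 2]) in patterns and words[i] not in instances:
                    candidates.add(words[i])
        # Step 3: promote only candidates approved by the (simulated) crowd.
        instances |= {c for c in candidates if ask_crowd(c)}
    return instances


# Made-up corpus for the category "city"; "Everest" is the drift trigger.
SENTENCES = [
    "Paris is located in France",
    "Berlin is located in Germany",
    "Everest is located in Nepal",
    "Everest was climbed in 1953",
    "K2 was climbed in 1954",
]

# Without supervision every candidate is accepted and the category drifts.
unsupervised = bootstrap(SENTENCES, {"Paris"}, lambda c: True)

# Simulated human answers: only "Berlin" is confirmed as a city.
supervised = bootstrap(SENTENCES, {"Paris"}, lambda c: c in {"Berlin"})

print(sorted(unsupervised))
print(sorted(supervised))
```

In the unsupervised run, accepting "Everest" teaches the learner the pattern "was climbed", which then admits "K2". In the supervised run the rejected candidate never contributes a pattern, so the category stays on cities.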


Metadata
Title
Crowd-Powered Systems to Diminish the Effects of Semantic Drift
Authors
Saulo D. S. Pedro
Estevam R. Hruschka Jr.
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-29859-3_59
