ABSTRACT
One problem facing players of competitive games is negative, or toxic, behavior. League of Legends, the largest eSport game, uses a crowdsourcing platform called the Tribunal to judge whether a reported toxic player should be punished. The Tribunal is a two-stage system: players who directly observe toxic behavior file reports, and human experts then review the aggregated reports. While this system has successfully dealt with the vague nature of toxic behavior by applying majority rule over many votes, it naturally requires tremendous cost, time, and human effort. In this paper, we propose a supervised learning approach for predicting crowdsourced decisions on toxic behavior, using a large-scale labeled data collection: over 10 million user reports involving 1.46 million toxic players, together with the corresponding crowdsourced decisions. Our results show good performance in detecting overwhelming-majority cases and in predicting the crowdsourced decisions on them. We also demonstrate good portability of our classifier across regions. Finally, we estimate the practical implications of our approach: potential cost savings and victim protection.
Index Terms
- STFU NOOB!: predicting crowdsourced decisions on toxic behavior in online games