ABSTRACT
Search systems in online social media sites are frequently used to find information about ongoing events and people. For topics with multiple competing perspectives, such as political events or political candidates, bias in the top-ranked results significantly shapes public opinion. However, bias does not emerge from an algorithm alone. It is important to distinguish between the bias that arises from the data that serves as the input to the ranking system and the bias that arises from the ranking system itself. In this paper, we propose a framework to quantify these distinct biases and apply it to politics-related queries on Twitter. We found that both the input data and the ranking system contribute significantly to the bias in the search results, in varying amounts and in different ways. We discuss the consequences of these biases and possible mechanisms for signaling them in the interfaces of social media search systems.
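The distinction between input bias and ranking bias can be illustrated with a minimal sketch. It is not the framework proposed in the paper, only one possible reading of the distinction, under these assumptions: every item has a hypothetical bias score in [-1, 1]; input bias is the mean score of all items available to the ranker; output bias is the mean score of the top-k items shown; and ranking bias is the difference between the two. The function names and the sample scores are invented for illustration.

```python
from statistics import mean

def input_bias(pool_scores):
    """Mean bias of every item the ranking system could have returned."""
    return mean(pool_scores)

def output_bias(ranked_scores, k):
    """Mean bias of the top-k items actually shown to the user."""
    return mean(ranked_scores[:k])

def ranking_bias(pool_scores, ranked_scores, k):
    """Bias attributed to the ranker (assumed here): output bias minus input bias."""
    return output_bias(ranked_scores, k) - input_bias(pool_scores)

# Hypothetical scores: +1 leans one way, -1 the other, 0 is neutral.
pool = [1.0, -1.0, 1.0, -1.0, 0.0, 0.0, 1.0, 1.0]
# The same items in the order a hypothetical ranker returned them.
ranked = [1.0, 1.0, 1.0, 0.0, 1.0, 0.0, -1.0, -1.0]

print(input_bias(pool))               # bias already present in the input data
print(output_bias(ranked, 4))         # bias seen in the top 4 results
print(ranking_bias(pool, ranked, 4))  # portion added by the ranking
```

In this toy case the input data already leans one way, and the ranking increases that lean in the top results, so both sources contribute to the bias a user sees.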