2016 | Book

Social Informatics

8th International Conference, SocInfo 2016, Bellevue, WA, USA, November 11-14, 2016, Proceedings, Part I

About this book

The two-volume set LNCS 10046 and 10047 constitutes the proceedings of the 8th International Conference on Social Informatics, SocInfo 2016, held in Bellevue, WA, USA, in November 2016.
The 36 full papers and 39 poster papers presented in this volume were carefully reviewed and selected from 120 submissions. They are organized in topical sections named: networks, communities, and groups; politics, news, and events; markets, crowds, and consumers; and privacy, health, and well-being.

Table of Contents

Frontmatter

Networks, Communities, and Groups

Frontmatter
How Well Do Doodle Polls Do?

Web-based Doodle polls, where respondents indicate their availability for a collection of times provided by the poll initiator, are an increasingly common way of selecting a time for an event or meeting. Yet group dynamics can markedly influence an individual’s response, and thus the overall solution quality. Via theoretical worst-case analysis, we analyze certain common behaviors of Doodle poll respondents, including when participants are either more generous with or more protective of their time, showing that deviating from one’s “true availability” can have a substantial impact on the overall quality of the selected time. We show perhaps counter-intuitively that being more generous with your time can lead to inferior time slots being selected, and being more protective of your time can lead to superior time slots being selected. We also bound the improvement and degradation of outcome quality under both types of behaviors.

Danya Alrawi, Barbara M. Anthony, Christine Chung
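
The counter-intuitive effect described in the abstract above can be illustrated with a toy example. The sketch below is not the authors' model: it assumes, purely for illustration, that each respondent has a numeric utility per time slot, marks a slot as available when the utility reaches a personal threshold, and that the poll selects the slot with the most approvals (earliest slot on ties). All names (`approvals`, `winning_slot`, `social_welfare`) and the numbers are hypothetical.

```python
def approvals(utilities, threshold):
    """Turn per-respondent slot utilities into yes/no votes.

    A respondent approves a slot when its utility is at least `threshold`.
    Lowering the threshold models a respondent being more generous.
    """
    return [[u >= threshold for u in row] for row in utilities]


def winning_slot(votes):
    """Return the index of the slot with the most approvals (earliest on ties)."""
    counts = [sum(column) for column in zip(*votes)]
    return max(range(len(counts)), key=lambda slot: (counts[slot], -slot))


def social_welfare(utilities, slot):
    """Total utility of all respondents for the chosen slot."""
    return sum(row[slot] for row in utilities)


# Hypothetical utilities: rows are respondents, columns are slots A and B.
UTILITIES = [
    [1.0, 0.4],
    [1.0, 0.4],
    [0.0, 0.4],
]

truthful = winning_slot(approvals(UTILITIES, threshold=0.5))
generous = winning_slot(approvals(UTILITIES, threshold=0.3))
print(truthful, social_welfare(UTILITIES, truthful))
print(generous, social_welfare(UTILITIES, generous))
```

In this toy instance, approving only well-liked slots selects slot A, while approving the mediocre slot B as well makes B win on approval count even though its total utility is lower.
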
Bring on Board New Enthusiasts! A Case Study of Impact of Wikipedia Art + Feminism Edit-A-Thon Events on Newcomers

The success of online production communities such as Wikipedia relies heavily on a continuous stream of newcomers to offset the inevitably high turnover and to bring on board new sources of ideas and labor. However, these communities have been struggling to attract newcomers, especially from a diverse population of users. In this work, we conducted a case study on how organizing an offline co-located event over a short period of time contributes to involving newcomers in the online community. We present results of our multiple-source quantitative analysis of the Wikipedia Art+Feminism edit-a-thon as a case of such events. The results of our analysis show that such offline events are successful in attracting a large number of newcomers; however, retention of the newcomers remains a challenge.

Rosta Farzan, Saiph Savage, Claudia Flores Saviaga
The Social Dynamics of Language Change in Online Networks

Language change is a complex social phenomenon, revealing pathways of communication and sociocultural influence. But, while language change has long been a topic of study in sociolinguistics, traditional linguistic research methods rely on circumstantial evidence, estimating the direction of change from differences between older and younger speakers. In this paper, we use a data set of several million Twitter users to track language changes in progress. First, we show that language change can be viewed as a form of social influence: we observe complex contagion for phonetic spellings and “netspeak” abbreviations (e.g., lol), but not for older dialect markers from spoken language. Next, we test whether specific types of social network connections are more influential than others, using a parametric Hawkes process model. We find that tie strength plays an important role: densely embedded social ties are significantly better conduits of linguistic influence. Geographic locality appears to play a more limited role: we find relatively little evidence to support the hypothesis that individuals are more influenced by geographically local social ties, even in their usage of geographical dialect markers.

Rahul Goel, Sandeep Soni, Naman Goyal, John Paparrizos, Hanna Wallach, Fernando Diaz, Jacob Eisenstein
On URL Changes and Handovers in Social Media

Social media sites (e.g. Twitter and Pinterest) allow users to change the name of their accounts. A change in the account name results in a change in the URL of the user’s homepage. We develop an algorithm that extracts such changes from streaming data and discover that a large number of social media accounts are performing synchronous and collaborative URL changes. We identify various types of URL changes such as handover, exchange, serial handover and loop exchange. All such behaviors are likely to be automated behavior and, thus, indicate accounts that are either already involved in malicious activities or being prepared to do so. In this paper, we focus on URL handovers where a URL is released by a user and claimed by another user. We find interesting associations between handovers and temporal, textual and network behaviors of users. We show several anomalous behaviors from suspicious users for each of these associations. We identify that URL handovers are instantaneous automated operations. We further investigate the benefits of URL handovers, and identify that handovers are strongly associated with reusable internal links and successful avoidance of suspension by the host site. Our handover detection algorithm, which makes such analysis possible, scales to millions of posts (e.g. tweets, pins) and is shared publicly online.

Hossein Hamooni, Nikan Chavoshi, Abdullah Mueen
Comment-Profiler: Detecting Trends and Parasitic Behaviors in Online Comments

Can we detect anomalies and abuse among users of commenting platforms? Commenting has become a significant activity and specialized platforms provide commenting capability to many popular websites, such as Huffington Post. These platforms have become a new type of online social interaction, but have received very little attention. We conduct an extensive study on 19M comments from Disqus, one of the largest commenting platforms. Our work consists of two thrusts: (a) we identify features and patterns of commenting behavior, and (b) we detect peculiar and parasitic users. First, we study and evaluate features of user behavior that capture different aspects: user-user interaction (“social”), user-article interaction (“engagement”), and temporal properties. We also develop a method, which we call DownTimeFinder, to determine users’ downtime (think night-time) in their daily behavior, which helps identify three major groups of users based on their utilization (3, 9, 15 h of up-time). Second, we identify surprising and abnormal behaviors using our features. Interestingly, we find: (a) two tightly collaborative groups of size at least 29 users that seem to be promoting the same ideas, (b) 38 users with behavior that points to spamming and trolling activities, and (c) 19 different instances where Disqus is used as a chat room. The goal of our work is to highlight commenting platforms as an ignored, but information-rich, online activity.

Tai-Ching Li, Abdullah Mueen, Michalis Faloutsos, Huy Hang
On Profiling Bots in Social Media

The popularity of social media platforms such as Twitter has led to the proliferation of automated bots, creating both opportunities and challenges in information dissemination, user engagement, and quality of service. Past work on profiling bots has focused largely on malicious bots, with the assumption that these bots should be removed. In this work, however, we find many bots that are benign, and propose a new, broader categorization of bots based on their behaviors. This includes broadcast, consumption, and spam bots. To facilitate comprehensive analyses of bots and how they compare to human accounts, we develop a systematic profiling framework that includes a rich set of features and a classifier bank. We conduct extensive experiments to evaluate the performance of different classifiers under varying time windows, identify the key features of bots, and draw inferences about bots in a larger Twitter population. Our analysis encompasses more than 159K bot and human (non-bot) accounts on Twitter. The results provide interesting insights on the behavioral traits of both benign and malicious bots.

Richard J. Oentaryo, Arinto Murdopo, Philips K. Prasetyo, Ee-Peng Lim
A Diffusion Model for Maximizing Influence Spread in Large Networks

Influence spread is an important phenomenon that occurs in many social networks. Influence maximization is the corresponding problem of finding the most influential nodes in these networks. In this paper, we present a new influence diffusion model, based on pairwise factor graphs, that captures dependencies and directions of influence among neighboring nodes. We use an augmented belief propagation algorithm to efficiently compute influence spread on this model so that the direction of influence is preserved. Due to its simplicity, the model can be used on large graphs with high-degree nodes, making the influence maximization problem practical on large, real-world graphs. Using large Flixster and Epinions datasets, we provide experimental results showing that our model predictions match well with ground-truth influence spreads, far better than other techniques. Furthermore, we show that the influential nodes identified by our model achieve significantly higher influence spread compared to other popular models. The model parameters can easily be learned from basic, readily available training data. In the absence of training, our approach can still be used to identify influential seed nodes.

Tu-Thach Quach, Jeremy D. Wendt
Lightweight Interactions for Reciprocal Cooperation in a Social Network Game

The construction of reciprocal relationships requires cooperative interactions during the initial meetings. However, cooperative behavior with strangers is risky because the strangers may be exploiters. In this study, based on the analysis of a social network game (SNG), we show that people increase the likelihood that strangers will cooperate by using lightweight non-risky interactions in risky situations. They can construct reciprocal relationships in this manner. The interactions involve low-cost signaling because they are not generated at any cost to the senders and recipients. Theoretical studies show that low-cost signals are not guaranteed to be reliable because senders of low-cost signals can lie at any time. However, people used low-cost signals to construct reciprocal relationships in an SNG, which suggests the existence of mechanisms for generating reliable, low-cost signals in human evolution.

Masanori Takano, Kazuya Wada, Ichiro Fukuda
Continuous Recipe Selection Model Based on Cooking History

Thousands of different recipes are posted on recipe sites by consumers who often refer to them when they cook. Such users occasionally select new recipes. In this paper, we propose for users a recipe selection model composed of both preference and challenging viewpoints to appropriately predict recipes that users are more likely to cook next in continuous cooking behaviors. The occurrence probability of the challenging behaviors of each user is estimated from past cooking sequences, and recipe scores are calculated by incorporating preference and challenging viewpoints. Our experimental evaluations using actual cooking histories demonstrate the high prediction performance of our method. We clarified the estimation efficiency of users who tackle challenging recipes.

Shuhei Yamamoto, Noriko Kando, Tetsuji Satoh

Politics, News, and Events

Frontmatter
Examining Community Policing on Twitter: Precinct Use and Community Response

A number of high-profile incidents have highlighted tensions between citizens and police, bringing issues of police-citizen trust and community policing to the forefront of the public’s attention. Efforts to mediate this tension emphasize the importance of promoting interaction and developing social relationships between citizens and police. This strategy – a critical component of community policing – may be employed in a variety of settings, including social media. While the use of social media as a community policing tool has gained attention from precincts and law enforcement oversight bodies, the ways in which police are expected to use social media to meet these goals remain an open question. This study seeks to explore how police are currently using social media as a community policing tool. It focuses on Twitter – a functionally flexible social media space – and considers whether and how law enforcement agencies are co-negotiating norms of engagement within this space, as well as how the public responds to the behavior of police accounts.

Nina Cesare, Emma S. Spiro, Hedwig Lee, Tyler McCormick
The Dynamics of Group Risk Perception in the US After Paris Attacks

This paper examines how the public perceived immigrant groups as a potential risk, and how such risk perception changed after the attacks that took place in Paris on November 13, 2015. The study utilizes the Twitter conversations associated with different political leanings in the U.S., and a mixed-methods approach that integrates both quantitative and qualitative analyses. Risk perception profiles of Muslim, Islam, Latino, and immigrant were quantitatively constructed, based on how these groups/issues were morally judged as risk. Discourse analysis was conducted on how risk narratives were constructed before and after the event. The study reveals that the groups/issues differed by how they were perceived as a risk or at risk across political leanings, and how the risk perception was related to in- and out-group biases. The study has important implications for how different communities conceptualize, perceive, and respond to danger, especially in the context of terrorism.

Wen-Ting Chung, Kai Wei, Yu-Ru Lin, Xidao Wen
Determining the Veracity of Rumours on Twitter

While social networks can provide an ideal platform for up-to-date information from individuals across the world, they have also proved to be a place where rumours fester and accidental or deliberate misinformation often emerges. In this article, we aim to support the task of making sense of social media data, and specifically, seek to build an autonomous message-classifier that filters relevant and trustworthy information from Twitter. For our work, we collected about 100 million public tweets, including users’ past tweets, from which we identified 72 rumours (41 true, 31 false). We considered over 80 trustworthiness measures including the authors’ profile and past behaviour, the social network connections (graphs), and the content of tweets themselves. We ran modern machine-learning classifiers over those measures to produce trustworthiness scores at various time windows from the outbreak of the rumour. Such time-windows were key as they allowed useful insight into the progression of the rumours. From our findings, we identified that our model was significantly more accurate than similar studies in the literature. We also identified critical attributes of the data that give rise to the trustworthiness scores assigned. Finally, we developed a software demonstration that provides a visual user interface to allow the user to examine the analysis.

Georgios Giasemidis, Colin Singleton, Ioannis Agrafiotis, Jason R. C. Nurse, Alan Pilgrim, Chris Willis, D. V. Greetham
PicHunt: Social Media Image Retrieval for Improved Law Enforcement

First responders are increasingly using social media to identify and reduce crime for the well-being and safety of society. Images shared on social media that hurt the religious, political, communal and other sentiments of people often instigate violence and create law and order situations in society. This results in the need for first responders to inspect the spread of such images and the users propagating them on social media. In this paper, we present a comparison between different hand-crafted features and a Convolutional Neural Network (CNN) model for retrieving similar images; the CNN model outperforms state-of-the-art hand-crafted features. We propose an Open-Source-Intelligent (OSINT) real-time image search system, robust enough to retrieve modified images, that allows first responders to analyze the current spread of images, the sentiments expressed, and details of the users propagating such content. The system also saves officials the time of manually analyzing the content by reducing the search space by 67% on average.

Sonal Goel, Niharika Sachdeva, Ponnurangam Kumaraguru, A. V. Subramanyam, Divam Gupta
TwitterNews+: A Framework for Real Time Event Detection from the Twitter Data Stream

In recent years, substantial research efforts have gone into investigating different approaches to the detection of events in real time from the Twitter data stream. Most of these approaches, however, suffer from a high computational cost and are not evaluated using a publicly available corpus, thus making it difficult to properly compare them. In this paper, we propose a scalable event detection system, TwitterNews+, to detect and track newsworthy events in real time. TwitterNews+ uses a novel approach to cluster event related tweets from Twitter with a significantly lower computational cost compared to the existing state-of-the-art approaches. Finally, we evaluate the effectiveness of TwitterNews+ using a publicly available corpus and its associated ground truth data set of newsworthy events. The result of the evaluation shows a significant improvement, in terms of recall and precision, over the baselines we have used.

Mahmud Hasan, Mehmet A. Orgun, Rolf Schwitter
Uncovering Topic Dynamics of Social Media and News: The Case of Ferguson

Looking at the dynamics of news content and social media content can help us understand the increasingly complex dynamics of the relationship between the media and the public surrounding noteworthy news events. Although topic models such as latent Dirichlet allocation (LDA) are valuable tools, they are a poor fit for analyses in which some documents, like news articles, tend to incorporate multiple topics, while others, like tweets, tend to be focused on just one. In this paper, we propose Single Topic LDA (ST-LDA) which jointly models news-type documents as distributions of topics and tweets as having a single topic; the model improves topic discovery in news and tweets within a unified topic space by removing noisy topics that conventional LDA tends to assign to tweets. Using ST-LDA, we focus on the unrest in Ferguson, Missouri after the fatal shooting of Michael Brown on August 9, 2014, looking in particular at the topic dynamics of tweets in and out of the St. Louis area, and at differences and relationships between topic coverage in news and tweets.

Lingzi Hong, Weiwei Yang, Philip Resnik, Vanessa Frias-Martinez
Identifying Partisan Slant in News Articles and Twitter During Political Crises

In this paper, we are interested in understanding the interrelationships between mainstream and social media in forming public opinion during mass crises, specifically with regard to how events are framed in the mainstream news and on social networks and how the language used in those frames may allow us to infer political slant and partisanship. We study the linguistic choices for political agenda setting in mainstream and social media by analyzing a dataset of more than 40M tweets and more than 4M news articles from the mass protests in Ukraine during 2013–2014, known as “Euromaidan”, and the post-Euromaidan conflict between Russian, pro-Russian and Ukrainian forces in eastern Ukraine and Crimea. We design a natural language processing algorithm to analyze at scale the linguistic markers which point to a particular political leaning in online media and show that political slant in news articles and Twitter posts can be inferred with a high level of accuracy. These findings allow us to better understand the dynamics of partisan opinion formation during mass crises and the interplay between mainstream and social media in such circumstances.

Dmytro Karamshuk, Tetyana Lokot, Oleksandr Pryymak, Nishanth Sastry
Predicting Poll Trends Using Twitter and Multivariate Time-Series Classification

Social media outlets, such as Twitter, provide invaluable information for understanding the social and political climate surrounding particular issues. Millions of people who vary in age, social class, and political beliefs come together in conversation. However, making inferences from these tweets poses challenges. Using tweets from the 2016 U.S. Presidential campaign, this work addresses one main research question: can changes in a political candidate’s poll score trends be predicted accurately from tweets created during their campaign? The novelty of this work is that we formulate the problem as a multivariate time-series classification problem, which fits the temporal nature of tweets, rather than as a traditional attribute-based classification. Features that represent various aspects of support for (or against) a candidate are tracked on an hour-by-hour basis. Together these form multivariate time-series. One commonly used approach to this problem is based on the majority voting scheme. This method assumes the univariate time-series from different features have equal importance. To alleviate this issue, a weighted shapelet transformation model is proposed. Extensive experiments on over 12 million tweets between November 2015 and January 2016 related to the four primary candidates (Bernie Sanders, Hillary Clinton, Donald Trump and Ted Cruz) indicate that the multivariate time-series approach outperforms traditional attribute-based approaches.

Tom Mirowski, Shoumik Roychoudhury, Fang Zhou, Zoran Obradovic
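
The contrast the abstract above draws between majority voting and a weighted scheme can be sketched as follows. This is only an illustration of the voting step, not the weighted shapelet transformation itself: it assumes each univariate time-series has already produced a trend label (+1 for rising, -1 for falling), and the function names and weights are hypothetical.

```python
def majority_vote(predictions):
    """Combine per-series labels (+1 or -1), treating every series equally."""
    return 1 if sum(predictions) >= 0 else -1


def weighted_vote(predictions, weights):
    """Combine per-series labels, scaling each vote by its series weight."""
    score = sum(w * p for w, p in zip(weights, predictions))
    return 1 if score >= 0 else -1


# Hypothetical labels from three feature series and made-up importance weights.
PREDICTIONS = [1, -1, -1]
WEIGHTS = [0.7, 0.2, 0.1]

print(majority_vote(PREDICTIONS))           # two of three series say "falling"
print(weighted_vote(PREDICTIONS, WEIGHTS))  # the heavily weighted series dominates
```

When one series is far more informative than the others, the two schemes can disagree, which is the situation equal-importance voting cannot handle.
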
Inferring Population Preferences via Mixtures of Spatial Voting Models

Understanding political phenomena requires measuring the political preferences of society. We introduce a model based on mixtures of spatial voting models that infers the underlying distribution of political preferences of voters with only voting records of the population and political positions of candidates in an election. Beyond offering a cost-effective alternative to surveys, this method projects the political preferences of voters and candidates into a shared latent preference space. This projection allows us to directly compare the preferences of the two groups, which is desirable for political science but difficult with traditional survey methods. After validating the aggregated-level inferences of this model against results of related work and on simple prediction tasks, we apply the model to better understand the phenomenon of political polarization in the Texas, New York, and Ohio electorates. Taken at face value, inferences drawn from our model indicate that the electorates in these states may be less bimodal than the distribution of candidates, but that the electorates are comparatively more extreme in their variance. We conclude with a discussion of limitations of our method and potential future directions for research.

Alison Nahm, Alex Pentland, Peter Krafft
Contrasting Public Opinion Dynamics and Emotional Response During Crisis

We propose an approach for contrasting the spatiotemporal dynamics of public opinions expressed toward targeted entities, a task also known as stance detection, in Russia and Ukraine during crisis. Our analysis relies on a novel corpus constructed from posts on the VKontakte social network, centered on local public opinion of the ongoing Russian-Ukrainian crisis, along with newly annotated resources for predicting expressions of fine-grained emotions including joy, sadness, disgust, anger, surprise and fear. Akin to prior work on sentiment analysis, we align traditional public opinion polls with aggregated automatic predictions of sentiments for contrastive geo-locations. We report interesting observations on emotional response and stance variations across geo-locations. Some of our findings contradict stereotypical misconceptions imposed by media; for example, we found posts from Ukraine that do not support Euromaidan but support Putin, and posts from Russia that are against Putin but in favor of the USA. Furthermore, we are the first to demonstrate contrastive stance variations over time across geo-locations using a storyline visualization technique (the storyline visualization is available at http://www.cs.jhu.edu/~svitlana/).

Svitlana Volkova, Ilia Chetviorkin, Dustin Arendt, Benjamin Van Durme
Social Politics: Agenda Setting and Political Communication on Social Media

Social media play an increasingly important role in political communication. Various studies investigated how individuals adopt social media for political discussion, to share their views about politics and policy, or to mobilize and protest against social issues. Yet, little attention has been devoted to the main actors of political discussions: the politicians. In this paper, we explore the topics of discussion of U.S. President Obama and the 50 U.S. State Governors using Twitter data and agenda-setting theory as a tool to describe the patterns of daily political discussion, uncovering the main topics of attention and interest of these actors. We examine over one hundred thousand tweets produced by these politicians and identify seven macro-topics of conversation, finding that Twitter represents a particularly appealing vehicle of conversation for American opposition politicians. We highlight the main motifs of political conversation of the two parties, discovering that Republican and Democrat Governors are more or less similarly active on Twitter but exhibit different styles of communication. Finally, by reconstructing the networks of occurrences of Governors’ hashtags and keywords related to political issues, we observe that Republicans and Democrats form two tight yet polarized cores, with a strongly different shared agenda on many issues of discussion.

Xinxin Yang, Bo-Chiuan Chen, Mrinmoy Maity, Emilio Ferrara

Markets, Crowds, and Consumers

Frontmatter
Preference-Aware Successive POI Recommendation with Spatial and Temporal Influence

There have been vast advances and rapid growth in location-based social networking (LBSN) services in recent years. Point of Interest (POI) recommendation is one of the most important applications in LBSN services. POI recommendation provides users with personalized location recommendations. It helps users to explore new locations and filter out uninteresting places that do not match their interests. But traditional POI recommendation cannot suggest where a user may go the next day or next hour based on their current location or status. In this paper, we consider the task of personalized successive POI recommendation: recommending to a user the very next location they might be interested in visiting, based on their current location. Multiple factors influence users to choose a POI, such as the user’s categorical preferences, temporal activities and location preferences, the popularity of a POI, as well as the sequential patterns of a user. In this work, we define a unified framework that takes all these factors into consideration to build a better successive POI recommendation model. We evaluate our system with a real-world dataset collected from Foursquare. Experimental results show that our proposed framework works better than other baseline approaches.

Madhuri Debnath, Praveen Kumar Tripathi, Ramez Elmasri
Event Participation Recommendation in Event-Based Social Networks

Event-based Social Networks (EBSN) have experienced rapid growth in recent years. The task of event participation recommendation is to recommend a list of users who are most likely to participate in a new event. Due to the nature of new events and the severe data sparsity in EBSN, traditional recommender systems do not work well for event participation recommendation. In this paper, we first conduct a study of Meetup users to understand the major factors impacting their event participation decisions. We then develop a sliding-window based machine-learning model that effectively combines user features from multiple channels to recommend users to new events. Through evaluation using the Meetup dataset, we demonstrate that our model can capture the short-term consistency of user preferences and outperforms the traditional popularity-based and nearest-neighbor based recommendation models. Our model is suitable for real-time recommendation on practical EBSN platforms.

Hao Ding, Chenguang Yu, Guangyu Li, Yong Liu
An Effective Approach to Finding a Context Path in Review Texts Using Pathfinder Scaling

Customer reviews feature opinions or sentiments that a review writer has given, and these opinions or sentiments have an impact on the reader. Identifying and presenting word associations that indicate a sentiment orientation and semantics can aid in selecting the best review for providing the information customers are seeking. In this paper, we attempted to discover the context structure and the context path presenting explicit semantics in review texts. To this end, we extracted word co-occurrences and converted them to a cosine adjacency matrix. Then a co-word network was constructed by applying Pathfinder scaling. Finally, we measured the context score and presented context paths from the context structure in the review texts. Our results show that a compound noun is easy to detect by network analysis. The extracted context paths keep intact the sentiment polarity derived from the review texts. The evaluative expression for a certain aspect of a product or service is clearer and more specific within the context path. Furthermore, it is not necessary to train reference words to detect the sentiment orientations.

Erin Hea-Jin Kim, SuYeon Kim
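
The first step named in the abstract above, turning word co-occurrences into a cosine adjacency matrix, can be sketched in a few lines. This is one plausible reading, not the authors' procedure: it assumes co-occurrence is counted per review sentence and that the adjacency value is the cosine similarity between two words' co-occurrence vectors. Pathfinder scaling is not implemented here, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence(sentences):
    """Count how many sentences each unordered word pair appears in together."""
    counts = Counter()
    for tokens in sentences:
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def cosine_matrix(sentences):
    """Return (vocabulary, matrix) of cosine similarities between co-occurrence rows."""
    vocab = sorted({word for tokens in sentences for word in tokens})
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    # Symmetric co-occurrence count matrix.
    counts = [[0.0] * n for _ in range(n)]
    for (a, b), c in cooccurrence(sentences).items():
        counts[index[a]][index[b]] = counts[index[b]][index[a]] = float(c)
    norms = [sqrt(sum(x * x for x in row)) for row in counts]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] and norms[j]:
                dot = sum(x * y for x, y in zip(counts[i], counts[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return vocab, sim


# Hypothetical tokenized review sentences.
REVIEWS = [["battery", "life", "good"], ["battery", "life", "short"]]
vocab, sim = cosine_matrix(REVIEWS)
print(vocab)
```

In this tiny example, "good" and "short" never co-occur but share the same neighbors ("battery", "life"), so their co-occurrence vectors point in the same direction.
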
How to Find Accessible Free Wi-Fi at Tourist Spots in Japan

We propose a method that uses social media data to find spots at tourist attractions that do not have accessible Free Wi-Fi. Although it is an important issue for the government to determine where they should install Free Wi-Fi equipment, it involves a high human cost. We focused on the difference in usage of social network services (SNSs) to find where there was a lack of Free Wi-Fi. We posed two simple hypotheses: (1) photos uploaded to Flickr, a batch-time SNS, reflect the popularity of attractions from the travelers’ perspective, and (2) posts on Twitter, a real-time SNS, reflect the communications environment. Differences in the distributions of posts in these SNSs indicate the gap between needs and the current status of communications infrastructures. Experimental results obtained from fieldwork in the Yokohama area clarified that our method could locate places that were popular with tourists but did not have Free Wi-Fi equipment installed.

Keisuke Mitomi, Masaki Endo, Masaharu Hirota, Shohei Yokoyama, Yoshiyuki Shoji, Hiroshi Ishikawa

Privacy, Health and Wellbeing

Frontmatter
Mobile Communication Signatures of Unemployment

The mapping of populations’ socio-economic well-being is highly constrained by the logistics of censuses and surveys. Consequently, spatially detailed changes across scales of days, weeks, or months, or even year to year, are difficult to assess; thus the speed at which policies can be designed and evaluated is limited. However, recent studies have shown the value of mobile phone data as an enabling methodology for demographic modeling and measurement. In this work, we investigate whether indicators extracted from mobile phone usage can reveal information about the socio-economic status of microregions such as districts (i.e., average spatial resolution <2.7 km). For this we examine anonymized mobile phone metadata combined with beneficiary records from an unemployment benefit program. We find that aggregated activity, social, and mobility patterns strongly correlate with unemployment. Furthermore, we construct a simple model to produce accurate reconstruction of district level unemployment from their mobile communication patterns alone. Our results suggest that reliable and cost-effective economic indicators could be built based on passively collected and anonymized mobile phone data. With similar data being collected every day by telecommunication services across the world, survey-based methods of measuring community socioeconomic status could potentially be augmented or replaced by such passive sensing methods in the future.

Abdullah Almaatouq, Francisco Prieto-Castrillo, Alex Pentland
Identifying Stereotypes in the Online Perception of Physical Attractiveness

Stereotypes can be viewed as oversimplified ideas about social groups. They can be positive, neutral or negative. The main goal of this paper is to identify stereotypes for female physical attractiveness in images available on the Web. We look at search engines as possible sources of stereotypes. We conducted experiments on Google and Bing by querying the search engines for beautiful and ugly women. We then collected images and extracted information about the faces. We propose a methodology and apply it to analyze photos gathered from search engines to understand how race and age manifest in the observed stereotypes and how they vary according to countries and regions. Our findings demonstrate the existence of stereotypes for female physical attractiveness, in particular negative stereotypes about black women and positive stereotypes about white women in terms of beauty. We also found negative stereotypes associated with older women in terms of physical attractiveness. Finally, we have identified patterns of stereotypes that are common to groups of countries.

Camila Souza Araújo, Wagner Meira Jr., Virgilio Almeida
Analysing RateMyProfessors Evaluations Across Institutions, Disciplines, and Cultures: The Tell-Tale Signs of a Good Professor

Can we tell a good professor from their students’ comments? And are there differences between what is considered to be a good professor by different student groups? We use a large corpus of student evaluations collected from the RateMyProfessors website, covering different institutions, disciplines, and cultures, and perform several comparative experiments and analyses aimed at answering these two questions. Our results indicate that (1) we can reliably distinguish good professors from poor professors with an accuracy of over 90%, and (2) we can separate the evaluations made for good professors by different groups with accuracies in the range of 71–89%. Furthermore, a qualitative analysis performed using topic modeling highlights the aspects of interest for different student groups.

Mahmoud Azab, Rada Mihalcea, Jacob Abernethy
Detecting Coping Style from Twitter

Coping styles are psychological and behavioral strategies people use to deal with stressful situations. They may be adaptive (helping to reduce stressors), or maladaptive (which tend to reduce symptoms without addressing the underlying problem). Some coping styles, particularly maladaptive ones, are tied to specific conditions. This study explores whether coping style can be predicted by analyzing user behavior on Twitter. Our results show that a combination of text analysis and behavioral information can be used to build a classifier that can accurately determine whether individuals use primarily adaptive or maladaptive coping styles. Furthermore, we show this can be predicted using a small feature set of psycholinguistic measures, which directly map to core elements of coping as identified in the psychological literature. In addition to the results contributing to the literature on individual attribute prediction, information about coping strategies is useful for understanding more complex psychological phenomena (like addiction and PTSD). Understanding such attributes is of growing interest to the research community, and our results add a tool to support further work in that area. Our results may also be useful in contributing to personalization, especially in health-related topics, and to a personal analysis tool to guide people toward building healthier coping styles if their current actions are maladaptive.

Jennifer Golbeck
User Privacy Concerns with Common Data Used in Recommender Systems

Recommender systems, and personalization algorithms more broadly, have become an integral part of modern e-commerce, streaming, and social media services. Collaborative filtering in particular leverages users’ ratings to compute new items of interest. The algorithms that drive them use a variety of data, from user ratings to measures of social relationships. As a field, we have built more effective, accurate algorithms with the available data. However, recommender systems are often opaque to users, and users’ privacy concerns about the data these algorithms use are unknown. In this project, we administered a survey to nearly 1,000 subjects to gauge their opinions about privacy issues tied to a variety of common personal data points used in making recommendations and the ways that data is used. We found that data collected within an application is generally of low concern, while the use of social data and data obtained from third parties is often considered a privacy violation. Furthermore, users expressed discomfort with their data being used anonymously to help personalize content for others, a common practice in collaborative filtering. We discuss the survey results and implications for creating privacy-respecting recommender systems.

Jennifer Golbeck
How a User’s Personality Influences Content Engagement in Social Media

Social media presents an opportunity for people to share content that they find to be significant, funny, or notable. No single piece of content will appeal to all users, but are there systematic variations between users that can help us better understand information propagation? We conducted an experiment exploring social media usage during disaster scenarios. Combining electroencephalogram (EEG), personality surveys, and prompts to share social media, we show how personality not only drives willingness to engage with social media, but also helps to determine what type of content users find compelling. As expected, extroverts are more likely to share content. In contrast, one of our central results is that individuals with depressive personalities are the most likely cohort to share informative content, like news or alerts. Because personality and mood will generally be highly correlated between friends via homophily, our results may be an important factor in understanding social contagion.

Nathan O. Hodas, Ryan Butner, Court Corley
Semi-supervised Knowledge Extraction for Detection of Drugs and Their Effects

New Psychoactive Substances (NPS) are drugs that lie in a grey area of legislation: since they are not internationally and officially banned, their trade may not be prosecutable. The phenomenon is exacerbated by the fact that NPS can be easily sold and bought online. Here, we consider large corpora of textual posts, published on online forums specialized in drug discussions, plus a small set of known substances and associated effects, which we call seeds. We propose a semi-supervised approach to knowledge extraction, applied to the detection of drugs (comprising NPS) and effects from the corpora under investigation. Based on the very small set of initial seeds, the work highlights how a contrastive approach and context deduction are effective in detecting substances and effects from the corpora. Our promising results, which feature an F1 score close to 0.9, pave the way for shortening the detection time of new psychoactive substances, once these are discussed and advertised on the Internet.

Fabio Del Vigna, Marinella Petrocchi, Alessandro Tommasi, Cesare Zavattari, Maurizio Tesconi
Using Social Media to Measure Student Wellbeing: A Large-Scale Study of Emotional Response in Academic Discourse

Student resilience and emotional wellbeing are essential for both academic and social development. Earlier studies on tracking students’ happiness in academia showed that many of them struggle with mental health issues. For example, a 2015 study at the University of California Berkeley found that 47% of graduate students suffer from depression, following a 2005 study that showed 10% had considered suicide. This is the first large-scale study that uses signals from social media to evaluate students’ emotional wellbeing in academia. This work presents fine-grained emotion and opinion analysis of 79,329 tweets produced by students from 44 universities. The goal of this study is to qualitatively evaluate and compare emotions and sentiments emanating from students’ communications across different academic discourse types and across universities in the U.S. We first build novel predictive models to categorize academic discourse types generated by students into personal, social, and general categories. We then apply emotion and sentiment classification models to annotate each tweet with Ekman’s six emotions (joy, fear, sadness, disgust, anger, and surprise) and three opinion types (positive, negative, and neutral). We found that emotions and opinions expressed by students vary across discourse types and universities, and correlate with survey-based data on student satisfaction, happiness and stress. Moreover, our results provide novel insights on how students use social media to share academic information, emotions, and opinions that pertain to students’ academic performance and emotional well-being.

Svitlana Volkova, Kyungsik Han, Courtney Corley
EmojiNet: Building a Machine Readable Sense Inventory for Emoji

Emoji are a contemporary and extremely popular way to enhance electronic communication. Without rigid semantics attached to them, emoji symbols take on different meanings based on the context of a message. Thus, like the word sense disambiguation task in natural language processing, machines also need to disambiguate the meaning or ‘sense’ of an emoji. In a first step toward achieving this goal, this paper presents EmojiNet, the first machine readable sense inventory for emoji. EmojiNet is a resource enabling systems to link emoji with their context-specific meaning. It is automatically constructed by integrating multiple emoji resources with BabelNet, which is the most comprehensive multilingual sense inventory available to date. The paper discusses its construction, evaluates the automatic resource creation process, and presents a use case where EmojiNet disambiguates emoji usage in tweets. EmojiNet is available online for use at http://emojinet.knoesis.org.

Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran
Backmatter
Metadata
Title
Social Informatics
Edited by
Emma Spiro
Yong-Yeol Ahn
Copyright Year
2016
Electronic ISBN
978-3-319-47880-7
Print ISBN
978-3-319-47879-1
DOI
https://doi.org/10.1007/978-3-319-47880-7
