About this book

The volume presents, in a synergistic manner, significant theoretical and practical contributions in the area of social media reputation and authorship measurement, visualization, and modeling. The book justifies and proposes contributions to a future agenda for understanding the requirements for making social media authorship more transparent. Building on work presented in a previous volume of this series, Roles, Trust, and Reputation in Social Media Knowledge Markets, this book discusses new tools, applications, services, and algorithms that are needed for authoring content in a real-time publishing world. These insights may help people who interact and create content through social media better assess their potential for knowledge creation. They may also assist in analyzing audience attitudes, perceptions, and behavior in informal social media or in formal organizational structures. In addition, the volume includes several chapters that analyze the higher order ethical, critical thinking, and philosophical principles that may be used to ground social media authorship. Together, the perspectives presented in this volume help us understand how social media content is created and how its impact can be evaluated.

The chapters demonstrate thought leadership through new ways of constructing social media experiences and making traces of social interaction visible. Transparency in Social Media aims to help researchers and practitioners design services, tools, or methods of analysis that encourage a more transparent process of interaction and communication on social media. Knowing who has added what content and with what authority to a specific online social media project can help the user community better understand, evaluate and make decisions and, ultimately, act on the basis of such information.

Table of Contents

Frontmatter

Overtures to Transparency in Social Media

Frontmatter

Introduction

Abstract
As engagement with social media has become a dominant information acquisition and dissemination experience, the nature of the collection, production, and consumption of information has also changed. One of the most significant changes is the lowering of cost and technological barriers for sharing knowledge or opinions. User-generated content dominates social media. This challenges traditional methods of collecting, disseminating and evaluating information. Because much of the information exchanged on social media is created or vetted by individuals or corporations whose identities, motives, or abilities are poorly known or simply unknown, we need new tools, theories, and practical strategies for evaluating the quality of the content and the credibility of its authors. Modelling the provenance and impact of authorship on social media is of crucial importance for explaining the emergence and impact of human motivations on social media content generation. Research on presenting, visualizing and explaining the social context of any given user in a social medium information exchange is equally important. In brief, researchers and practitioners need to create theories, methods and tools that make the authorship and dissemination process more transparent. We need new ways to understand at a glance who creates or disseminates specific units of content, in what context, and, if possible, why.
Sorin Adam Matei, Elisa Bertino, Martha Russell

Socio-Computational Frameworks, Tools and Algorithms for Supporting Transparent Authorship in Social Media Knowledge Markets

Abstract
This chapter presents a summary of the themes, topics, methods and case studies presented at the Kredible.net workshop on Reputation, Trust and Authority, held at Stanford University in 2013. The workshop brought together scholars from a variety of disciplines to explore how—amidst the Internet’s enormous volume of content and relationships—certain topics, concepts and individuals rise to prominence, develop strong reputations, gain followers and establish credibility and trust. The projects presented in this paper explore the emergence of social roles, the creation of value, and the perception of credibility and trustworthiness in online information. They combine social science insights into the structure and nature of online interaction with advances in computational science, data visualization, graph analysis and natural language processing. The methods and results presented in this paper offer innovative statistical strategies, models, and methodologies for navigating the large and complex data sets produced by online content.
Karina Alexanyan, Sorin Adam Matei, Martha Russell

Assessing Provenance and Pathways in Social Media: Case Studies, Methods, and Tools

Frontmatter

Robust Aggregation of Inconsistent Information: Concepts and Research Directions

Abstract
Improving the reliability of information obtained from multiple sources in an automated way is today a critical need, as decision-making processes and other applications heavily depend on such information. The sources can be sensors used in a wireless sensor network, multiple resources on the Internet or a large number of users of a web service or a social network. Examples of such a problem include finding a reliable rating of a product or a movie, the accurate current price of a stock or a reliable assessment of the prevailing market analysts’ sentiment about a stock. The problem arises from the fact that multiple sources often provide inconsistent information, due to differences of opinion, human or hardware errors, or outdated data; most importantly, sources might even maliciously supply false information with an express intent to deceive. While it is clear that no aggregation procedure can strictly guarantee the accuracy of the output, in practice we must seek “the best” answer possible: the one that in the given circumstances minimizes the error or the likelihood of an error of the aggregate value. Such procedures are needed for both numerical and non-numerical data and, given the vast amount of data becoming available, should operate without a need for human judgment or an external “gold standard”. In this paper we provide a survey of solutions to this problem based on iterative filtering approaches that take into account not only the information but also the information sources and are able to assess the credibility of the information and the trustworthiness of the information sources. We also discuss future work and open research directions.
Aleksandar Ignjatovic, Mohsen Rezvani, Mohammad Allahbakhsh, Elisa Bertino
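
The iterative filtering idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name iterative_filtering, the reciprocal weighting rule, and the smoothing term eps are all assumptions chosen for illustration. The sketch alternates between estimating each item's value as a trust-weighted mean and re-scoring each source by how far its reports fall from that estimate.

```python
def iterative_filtering(readings, iterations=20, eps=0.1):
    """Jointly estimate item values and source trust (illustrative sketch).

    readings[s][i] is the value that source s reports for item i.
    eps is a hypothetical smoothing term that keeps one source from
    absorbing all of the weight when its distance approaches zero.
    """
    n_sources = len(readings)
    n_items = len(readings[0])
    weights = [1.0] * n_sources  # start by trusting every source equally
    estimate = [0.0] * n_items

    for _ in range(iterations):
        # Step 1: aggregate each item as a trust-weighted mean.
        total = sum(weights)
        estimate = [
            sum(weights[s] * readings[s][i] for s in range(n_sources)) / total
            for i in range(n_items)
        ]
        # Step 2: re-score each source by its mean squared distance
        # from the current aggregate (closer means more trusted).
        weights = [
            1.0 / (eps + sum((readings[s][i] - estimate[i]) ** 2
                             for i in range(n_items)) / n_items)
            for s in range(n_sources)
        ]

    total = sum(weights)
    trust = [w / total for w in weights]  # normalize so trust sums to 1
    return estimate, trust


# Three broadly consistent raters and one that disagrees on every item.
ratings = [
    [4.0, 3.0, 5.0],
    [4.2, 3.1, 4.8],
    [3.9, 2.9, 5.1],
    [1.0, 5.0, 1.0],
]
values, trust = iterative_filtering(ratings)
print(values, trust)
```

In this toy input the fourth source ends up with the lowest trust score, so the aggregate stays close to what the three consistent sources report. No external gold standard is consulted at any step.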

Weaponized Crowdsourcing: An Emerging Threat and Potential Countermeasures

Abstract
The crowdsourcing movement has spawned a host of successful efforts that organize large numbers of globally-distributed participants to tackle a range of tasks, including crisis mapping (e.g., Ushahidi), translation (e.g., Duolingo), and protein folding (e.g., Foldit). Alongside these specialized systems, we have seen the rise of general-purpose crowdsourcing marketplaces like Amazon Mechanical Turk and Crowdflower that aim to connect task requesters with task workers, toward creating new crowdsourcing systems that can intelligently organize large numbers of people. However, these positive opportunities have a sinister counterpart: what we dub “Weaponized Crowdsourcing”. Already we have seen the first glimmers of this ominous new trend, including large-scale “crowdturfing”, wherein masses of cheaply paid shills can be organized to spread malicious URLs in social media (Grier, Thomas, Paxson, & Zhang, 2010; Lee & Kim, 2012), form artificial grassroots campaigns (“astroturf”) (Gao et al., 2010; Lee, Caverlee, Cheng, & Sui, 2013), spread rumor and misinformation (Castillo, Mendoza, & Poblete, 2011; Gupta, Lamba, Kumaraguru, & Joshi, 2013), and manipulate search engines. A recent study finds that 90 % of tasks on many crowdsourcing platforms are for crowdturfing (Wang et al., 2012), and our initial research (Lee, Tamilarasan, & Caverlee, 2013) shows that most malicious tasks in crowdsourcing systems target either online communities (56 %) or search engines (33 %). Unfortunately, little is known about Weaponized Crowdsourcing as it manifests in existing systems, or about its ramifications for the design and operation of emerging socio-technical systems. Hence, this chapter focuses on key research questions related to Weaponized Crowdsourcing and outlines the potential for building new preventative frameworks that maintain the information quality and integrity of online communities in the face of this rising challenge.
James Caverlee, Kyumin Lee

The Structures of Twitter Crowds and Conversations

Abstract
Social media promises to provide access to a vast variety of human interactions, important and trivial. More than traditional electronic media or interpersonal contact, social media allows people to find one another and interact based on common interests rather than physical proximity. Billions of people have embraced these tools, entering social media spaces to exchange trillions of messages. Social media interactions may not be as rich as face-to-face interactions, but they offer access to a wide range of people and topics. Success has led to new problems, as social media offers too many contacts, too many interactions, and poor tools for filtering and gaining an overview of the larger landscape of communication. Social media is created and consumed through tools that limit the observer’s view to individual messages or short chains of messages and replies. The leaf and the branch of social media are visible, but not the tree or the forest. The result is an information and interaction deluge. The overwhelming amount of data and the limited ways to understand it can be seen as a negative consequence of social media. For many ordinary users social media is an incomprehensible torrent. Proposed solutions, such as automatic filters that select relevant information for us, are often seen as worse than the problem they are meant to solve. “Filter bubbles” can trap users in homogeneous collections of information, causing them to lose sight of the larger range of discussions and content. Social media is inherently a social network, meaning that people use it to create collections of connections that have an emergent form, structure and shape. Interfaces to social media, however, lack insights into the nature, topology, and size of the networks they present. Access to social media network information is of academic and practical interest.
Social Network Analysis (SNA) offers a powerful method to conceptualize, analyze and visualize social media—leading to new applications and user interfaces that help users make their own decisions about content relevance and the credibility of other users. Social media can be much more useful for users, and the information in it can be more easily evaluated, if its underlying network structure is made more visible and comprehensible.
Marc A. Smith, Itai Himelboim, Lee Rainie, Ben Shneiderman

Visible Effort: Visualizing and Measuring Group Structuration Through Social Entropy

Abstract
A theoretically grounded learning feedback tool suite, the Visible Effort MediaWiki extension, is proposed for optimizing online group learning activities by measuring the amount of equality and the emergence of social structure in groups that participate in Computer-Mediated Collaboration (CMC). Building on social entropy theory, which is drawn from Shannon’s Mathematical Theory of Communication, Visible Effort captures levels of CMC unevenness and group structure and visualizes them on wiki Web pages through background colors, charts, and tabular data. Visual information provides users feedback on how balanced and equitable collaboration is within their online group, helping them to maintain it within optimal levels. We present the theoretical and practical implications of Visible Effort and the measures behind it, and illustrate its capabilities by describing a quasi-experimental teaching activity (use scenario) in tandem with a detailed discussion of the theoretical justification, methodological underpinning, and technological capabilities of the approach.
Sorin Adam Matei, Robert Bruno, Pamela L. Morris
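
As a rough illustration of the entropy-based measure mentioned in the abstract above, the sketch below computes a normalized Shannon entropy over per-member contribution counts. The exact formula used by Visible Effort is not given here, so the function name normalized_entropy, the use of raw counts as input, and the normalization by the maximum possible entropy are assumptions.

```python
import math


def normalized_entropy(contributions):
    """Return Shannon entropy of contribution shares, scaled to [0, 1].

    contributions is a list of non-negative counts, one per group member
    (for example, edits or words contributed). A value near 1 means the
    work is spread evenly; a value near 0 means a few members dominate.
    """
    total = sum(contributions)
    n = len(contributions)
    if total <= 0 or n < 2:
        return 0.0  # entropy is not informative for empty or one-person groups

    # Members with zero contributions add nothing to the sum (0 * log 0 := 0).
    shares = [c / total for c in contributions if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)

    # Divide by log2(n), the entropy of a perfectly even split among n members.
    return max(0.0, entropy / math.log2(n))


even_group = normalized_entropy([10, 10, 10, 10])
skewed_group = normalized_entropy([30, 5, 3, 2])
print(even_group, skewed_group)
```

A feedback tool could map such a score to a background color or chart, which is the kind of visual cue the abstract describes, although the thresholds for "optimal" levels would have to come from the chapter itself.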

Stepwise Segmented Regression Analysis: An Iterative Statistical Algorithm to Detect and Quantify Evolutionary and Revolutionary Transformations in Longitudinal Data

Abstract
This chapter proposes an iterative statistical approach, based on the principles of stepwise and segmented regression, to detect and quantify evolutionary trends and revolutionary changes (breakpoints) in long-term processes. The resulting stepwise segmented regression analysis was initially developed to assess especially complex social systems such as behavioral changes across the Wikipedia editorial community.
Unlike most existing breakpoint detection tools, stepwise segmented regression can detect multiple revolutionary moments occurring in sequence, including those that result in continuous and discontinuous line segments. It is also less sensitive to random noise and heteroscedasticity than tools based upon model selection criteria like BIC, and its model may be expanded to include exponential terms, although such exponential terms should be used with caution. Finally, its use of stepwise-based iteration limits its computational complexity, making it a reasonable choice to examine longer processes with many data points. In sum, this flexible and robust regression-based approach may be used in a far wider range of contexts than any existing breakpoint detection tool, making it ideal for evaluating unknown or complicated social and natural scientific processes.
Brian C. Britt
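
To make the notion of a breakpoint concrete, the sketch below shows plain segmented regression with a single breakpoint, found by exhaustive search. It is deliberately simpler than the stepwise procedure the chapter proposes: it does not iterate to find multiple breakpoints, test significance, or handle exponential terms. The names fit_line and best_breakpoint and the min_size parameter are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, sse), where sse is the sum of squared errors.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_breakpoint(xs, ys, min_size=3):
    """Try every split index and keep the one with the lowest total error.

    Returns (k, total_sse, left_fit, right_fit), where the second segment
    starts at index k, or None if the series is too short to split.
    """
    best = None
    for k in range(min_size, len(xs) - min_size + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[1]:
            best = (k, total, left, right)
    return best


# Synthetic series: slope 1 for the first ten points, then a jump and slope 3.
xs = list(range(20))
ys = [float(x) if x < 10 else 3.0 * x - 5.0 for x in xs]
k, total_sse, left_fit, right_fit = best_breakpoint(xs, ys)
print(k, left_fit[0], right_fit[0])
```

Because the two segments are fitted independently, this sketch allows discontinuous line segments at the breakpoint. An exhaustive search like this grows costly when many breakpoints are allowed, which is one motivation for the stepwise iteration the abstract mentions.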

Towards Bottom-Up Decision Making and Collaborative Knowledge Generation in Urban Infrastructure Projects Through Online Social Media

Abstract
Evolution of civil infrastructure from a technical artifact into an engineering system and a national asset over the past century has created a new discourse for development, construction, and management of infrastructure, which more and more emphasizes soft and subjective aspects of the system. Modern civil infrastructure is a complex system composed of the physical network of assets together with the social network of actors/users, and their interactions through the operational processes of the system (Lukszo & Bouwmas, 2005). This defines a sociotechnical system whose behavior cannot be studied without respect to the associated agents and the related social/institutional infrastructure. This system will be governed by organizational policies as well as social norms and standards. Such a definition for civil infrastructure has improved the role of the society from customers and end users of a service into stakeholders who may influence specifications of the system. This new role introduces new opportunities and challenges to domain decision makers. On one hand, it creates great opportunities for social engagement. Technical and professional decision makers can distill the distributed knowledge of public communities (referred to as non-expert or non-mainstream knowledge by Brabham & Sanchez, 2010) to reinforce the decision making procedure. On the other hand, given the diversity of interests and technical sophistications involved, an active participation of the public may result in a chaotic nature for the decision process.
Mazdak Nik-Bakht, Tamer E. El-Diraby

Biometric-Based User Authentication and Activity Level Detection in a Collaborative Environment

Abstract
In recent years, research on behavioral biometrics has led to a set of reliable, efficient, and versatile tools for enabling user authentication and discretionary access control to a secure resource or system. This chapter presents a new direction of biometrics research focused not only on behavior-mediated user authentication, but also on individual activity detection and collaborative behavior analysis during meeting room activities. The discussion is grounded in the use of different physiological and behavioral biometrics, such as gait, face, and voice, to construct a multi-modal and unobtrusive intelligent meeting room system. The designed system should be able to automatically recognize participants and analyze collaborative behavioral patterns by tracking user location, detecting certain activities, tracking individual mood and contribution level, and analyzing group activities.
Faisal Ahmed, Marina Gavrilova

Improving Transparency Through Documentation and Curation

Frontmatter

In the Flow: Evolving from Utility Based Social Medium to Community Peer

Abstract
In a broad sense, a social medium is an online interaction space. Most commonly known online interaction spaces are infrastructures that allow members to interact around one or more nexuses of interaction using one or more modes of interaction. Success or failure of an online interaction space depends on how effectively the nexuses and modes meet the needs of the intended community of users. nanoHUB.org is described as an online interaction space that was designed largely by considering how members of the intended community could satisfy the “needs of the one” through chosen nexuses and modes. Based on satisfying several acute needs of individuals, the nanoHUB online interaction space grew into a large community that is beginning to behave more as a social unit than as a group of individuals. The primary nexus of interaction, a simulation tool, was chosen as an active rather than passive nexus (i.e. consuming from the nexus creates new information in the process). The active nexus more easily facilitates the design of features where the social medium itself can consume from the nexus and produce novel information useful to its community of users. In effect, it can become more than an infrastructural platform. It can become a member of its own community.
Michael G. Zentner, Lynn K. Zentner, Dwight McKay, Swaroop Samek, Nathan Denny, Sabine Brunswicker, Gerhard Klimeck

Ostinato: The Exploration-Automation Cycle of User-Centric, Process-Automated Data-Driven Visual Network Analytics

Abstract
Network analysis is a valuable method for investigating and mapping the social structure driving phenomena and sharing the findings with others. The interactive visual analytics approach transforms data into views that allow the visual exploration of the structures and processes of networks represented by data, thereby increasing the transparency of editorial processes on social media as well as networked structures in innovation ecosystems and other phenomena. Although existing tools have opened many new exploratory opportunities, new tools in development promise investigators even greater freedom to interact with the data, refine and analyze the data, and explore alternative explanations for networked processes. This chapter presents the Ostinato Model, an iterative, user-centric, process-automated model for data-driven visual network analytics. The Ostinato Model simultaneously supports the automation of the process and enables interactive and transparent exploration. The model has two phases: Data Collection and Refinement, and Network Construction and Analysis. The Data Collection and Refinement phase is further divided into Entity Index Creation, Web/API Crawling, Scraping, and Data Aggregation. The Network Construction and Analysis phase is composed of Filtering in Entities, Node and Edge Creation, Metrics Calculation, Node and Edge Filtering, Entity Index Refinement, Layout Processing and Visual Properties Configuration. A cycle of exploration and automation characterizes the model and is embedded in each phase.
Jukka Huhtamäki, Martha G. Russell, Neil Rubens, Kaisa Still

Visual Analytics of User Influence and Location-Based Social Networks

Abstract
Social media have evolved as an important source of information and situational awareness in crisis and emergency management. As the number of messages generated and diffused through social networks in times of crisis increases exponentially, locating reliable and critical information in a timely manner is crucial, especially for decision makers. In such scenarios, identifying influential users in social networks, detecting anomalous information diffusion patterns, and locating corresponding geographical coordinates are often instrumental in providing important information and helping analysts make decisions in a timely manner. We describe a visual analytics framework focusing on identifying influential users and anomalous information diffusion based on dynamic social networks using Twitter data. We also demonstrate a visual analytics approach that allows users to analyze a large volume of social media data to detect and examine abnormal events within a Location-Based Social Network (LBSN). Our statistical models for extracting user topics and evaluating their anomaly scores are applied to facilitate exploration and perception of Twitter semantics. The framework provides highly interactive filtering and geo-location mapping to help categorize different topics, detect influential users and anomalous information in specific events, and investigate the underlying spatiotemporal patterns.
Jiawei Zhang, Junghoon Chae, Shehzad Afzal, Abish Malik, Dennis Thom, Yun Jang, Thomas Ertl, Sorin Adam Matei, David S. Ebert

Transparency, Control, and Content Generation on Wikipedia: Editorial Strategies and Technical Affordances

Abstract
The sparse nature of Wikipedia’s main content interface, dominated by clearly laid out content, neatly organized into information boxes, and structured into headings and subheadings, projects an image of a simple and flexible content management system. Even though the process of social production that undergirds Wikipedia is rife with conflict, power struggles, revert wars, content transactions, and coordination efforts, not to mention vandalism, the article pages on Wikipedia shun information gauges that highlight the social nature of the contributions. Rather, they are characterized by a “less is more” ideology of design, which aims to maximize readability and to encourage future contributions. The tools for discerning the social dynamics that lead to the creation of any given page are buried deep in the structure of the interface. Often they are created and maintained by voluntary contributors, who host the information on their own servers. Why the design choices made for the Wikipedia interface hide rather than highlight the true nature of these social dynamics remains a continuing source of puzzlement.
Closer investigation reveals that the deceptively simple nature of the interface is in fact a method to attract new collaborators and to establish content credibility. As Wikipedia has matured, its public notoriety demands a new approach to the manner in which Wikipedia reflects the rather complex process of authorship on its content pages. This chapter discusses a number of visualizations designed to support this goal and examines why they have not yet been adopted into the Wikipedia interface. The ultimate aim of the chapter is to highlight that in an era of socially constructed knowledge the debate about the desirability of visualizing the process by which knowledge is produced on social media should be about more than “responsive interfaces” and maximizing contributions. The ethical implications of knowing who is responsible for producing the content are important and should be made visible in collaborative knowledge production projects.
Sorin Adam Matei, Jeremy Foote

Transparency in Social Media: Ethical and Critical Dimensions

Frontmatter

Truth Telling and Deception in the Internet Society

Abstract
It is argued that mass deployment and use of high-capacity electronic communication has caused the breakdown of a human institution so elementary that nobody in the past bothered to make laws protecting it: the everyday give-and-take of speech. It has done this by vastly increasing people’s power to deceive without, at the same time, increasing their power to detect deception. The reason the imbalance is so problematic is that speech is an economic act, something done by the speaker to achieve some gain or personal benefit and involving exchange of value with the listener. Speech is therefore not inherently truthful. It becomes truthful, if at all, only when the speaker is afraid of being branded as a liar or a fool if found out, but, at the same time, not too afraid to speak out in the first place. Modern electronic means have provided speakers with a huge arsenal of tools to evade personal responsibility for what they say, thus badly unbalancing the economics of speech. Fortunately, the cure is simple: one need only create institutions that circumscribe this power in just the right amount so as to restore balance. In this paper I describe experiments I have conducted in creating such institutions using cohorts of students in coursework. I have found that a carefully crafted electronic interface that enforces the right amount of transparency, together with a small amount of human intervention, can have a startlingly positive effect on behavior. No policing is required. Creativity and intellectual cooperation bloom automatically and massively when healthy economic interactions among the people in the cohort are restored.
Robert B. Laughlin

Embedding Privacy and Ethical Values in Big Data Technology

Abstract
The phenomenon now commonly referred to as “Big Data” holds great promise and opportunity as a potential source of solutions to many societal ills ranging from cancer to terrorism; but it might also end up as “…a troubling manifestation of Big Brother, enabling invasions of privacy, decreased civil freedoms (and) increased state and corporate control” (Boyd & Crawford, 2012, p. 664). Discussions about the use of Big Data are widespread as “(d)iverse groups argue about the potential benefits and costs of analyzing genetic sequences, social media interactions, health records, phone logs, government records, and other digital traces left by people” (Boyd & Crawford, 2012, p. 662). This chapter attempts to establish guidelines for the discussion and analysis of ethical issues related to Big Data in research, particularly with respect to privacy. In doing so, it adds new dimensions to the agenda setting goal of this volume. It is intended to help researchers in all fields, as well as policy-makers, to articulate their concerns in an organized way, and to specify relevant issues for discussion, policy-making and action with respect to the ethics of Big Data. On the basis of our review of scholarly literature and our own investigations with big and small data, we have come to recognize that privacy and the great potential for privacy violations constitute major concerns in the debate about Big Data. Furthermore, our approach and our recommendations are generalizable to other ethical considerations inherent in Big Data as we illustrate in the final section of the chapter.
Michael Steinmann, Julia Shuster, Jeff Collmann, Sorin Adam Matei, Rochelle E. Tractenberg, Kevin FitzGerald, Gregory J. Morgan, Douglas Richardson

Critical Thinking and Socio-Technical Methods for Ascertaining Credibility Online

Abstract
SM: Howard, I would like to start with the deep past of online sociability, online social interaction of the 1980s—when you were a member of the Well, the famous bulletin board that pioneered the idea of online community. We all read your book Virtual Community, Homesteading on the Electronic Frontier, and we remember vividly the justification you offered for writing the book and for theorizing about virtual communities—which was that, at the time, people thought of the idea of getting together with other people online as weird, as nerdy, as disturbing, even as a form of social deviance; that nothing good can come out of it, that it is just an adulteration of social life, a weakening of social ties. Was the fear of online interaction that intense in the 1980s?
Howard Rheingold, Sorin Adam Matei