Social structure of Facebook networks

https://doi.org/10.1016/j.physa.2011.12.021

Abstract

We study the social structure of Facebook “friendship” networks at one hundred American colleges and universities at a single point in time, and we examine the roles of user attributes (gender, class year, major, high school, and residence) at these institutions. We investigate the influence of common attributes at the dyad level in terms of assortativity coefficients and regression models. We then examine larger-scale groupings by detecting communities algorithmically and comparing them to network partitions based on user characteristics. We thereby examine the relative importance of different characteristics at different institutions, finding, for example, that common high school is more important to the social organization of large institutions and that the importance of common major varies significantly between institutions. Our calculations illustrate how microscopic and macroscopic perspectives give complementary insights on the social organization at universities and suggest future studies to investigate such phenomena further.

Highlights

► We study Facebook networks from one hundred American colleges and universities.
► We examine the roles of user attributes and algorithmically detect communities.
► We examine the relative importance of different characteristics at different institutions.
► Microscopic and macroscopic perspectives give complementary insights on social organization.

Introduction

Since their introduction, social networking sites (SNSs) such as Friendster, MySpace, Facebook, Orkut, LinkedIn, and myriad others have attracted hundreds of millions of users, many of whom have integrated SNSs into their daily lives to communicate with friends, send e-mails, solicit opinions or votes, organize events, spread ideas, find jobs, and more [1]. Facebook, an SNS launched in February 2004, now overwhelms numerous aspects of everyday life, and it has become an immensely popular societal obsession [1], [2], [3], [4]. Facebook members can create self-descriptive profiles that include links to the profiles of their “friends”, who may or may not be offline friends. Facebook requires that anybody who wants to be added as a friend have the relationship confirmed, so Facebook friendships define a network (graph) of reciprocated ties (undirected edges) that connect individual users. (In this article, we use the words “edge” and “link” interchangeably.)

The emergence of SNSs such as Facebook and MySpace has revolutionized the availability of social and demographic data, which has in turn had a significant impact on the study of social networks [1], [5], [6]. It is possible to acquire very large data sets from SNSs, though of course the population online and actively using SNSs is a biased sample of the broader population. Services like Facebook also contain large quantities of demographic data, as many users now voluntarily reveal voluminous amounts of detailed personal information. An especially exciting aspect of studying SNSs is that they provide an opportunity to examine social organization at unprecedented levels of size and detail, and they also provide new venues to test sampling effects [7]. One can investigate the structure of an SNS like Facebook to examine it as a network in its own right, and ideally one can also try to take one step further and infer interesting insights regarding the offline social networks that an SNS imperfectly parallels. Most people tend to draw their Facebook friends from their real-life social networks [1], so it is not entirely unreasonable to use a Facebook network as a proxy for an offline social network. (Of course, as noted by Hogan [8], one does need to be aware of significant limitations when taking such a leap of faith.)

Social scientists, information scientists, and physical scientists have all jumped on the SNS data bandwagon [9]. It would be impossible to exhaustively cite all of the research in this area, so we only highlight a few results; additional references can be found in the review by Boyd and Ellison [1]. Boyd [10], [11] also conducted an empirical study of Facebook and MySpace, concluding that Facebook tends to appeal to a more elite and educated cross section than MySpace. The company RapLeaf [12] has compiled global demographics on the age and gender usage of numerous SNSs. Other recent studies have investigated the manifestation on SNSs of race and ethnicity [13], religion [14], gender [15], [16], and national identity [17]. Other research has illustrated that online friendship networks can be exploited to improve shopper recommendation systems on websites such as Amazon [18]. (Presumably, this is becoming increasingly prominent in practice.)

Several papers have attempted to increase understanding of how SNS friendships form. For example, Kumar et al. [19] examined preferential attachment models of SNS growth, concluding that it is important to consider different classes of users. Lampe et al. [20] explored the relationship between profile elements and number of Facebook friends, and other scholars have examined the importance of geography [21] and online message activity [22] to online friendship formation. Other papers have established the existence of strong correlations between network participation and website activity, including the motivation of people to join particular groups [23], the recommendations of online groups [24], online messages and friendship formation [22], interaction activity versus sense of belonging [25], and the role of explicit ideological relationship designations in affecting voting behavior [26], [27]. Lewis et al. [3] used Facebook data for an entire class of freshmen at an unnamed, private American university to conduct a quantitative study of social networks and cultural preferences. The same data set was also used to examine user privacy settings on Facebook [28].

In the present paper, we study the complete Facebook networks of 100 American colleges and universities from a single-day snapshot in September 2005. This paper is a sequel to our previous research on 5 of these institutions [29], in which we developed some of the methodology that we employ here. In September 2005, one needed a .edu e-mail address to become a member of Facebook. We thus ignore links between nodes at different institutions and study the Facebook networks of the 100 institutions as 100 separate networks. For each network, we have categorical data encompassing the gender, major, class year, high school, and residence (e.g., dormitory, House, or fraternity) of the users. We examine homophily and community structure (network partitions that are obtained algorithmically) for each of the networks and compare the community structure to partitions based on the given categorical data. We thereby compare and contrast the organizations of the 100 different Facebook networks, which arguably allows us to compare and contrast the organizations of the underlying university social networks to which they provide an imperfect counterpart. In addition to the inherent interest of these Facebook networks, our investigation is important for subsequent use of these networks (which were formed via ostensibly the same generative mechanism) as benchmark examples for numerous types of computations, such as new community detection methods.
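To make the comparison between an algorithmically obtained partition and an attribute-based partition concrete, the following is a minimal sketch that assumes a plain Rand index as the similarity measure. The text above does not state which measure is used, so the choice of measure, the function name `rand_index`, and the toy labels are all illustrative assumptions.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two partitions agree.

    labels_a[i] and labels_b[i] are the group labels of node i in the two
    partitions (for example, detected community and class year).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same nodes")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two nodes")

    total_pairs = comb(n, 2)
    # Pairs placed together in both partitions.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each partition separately.
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Pairs separated in both partitions (inclusion-exclusion).
    neither = total_pairs - same_a - same_b + both
    return (both + neither) / total_pairs


# Hypothetical toy data: six users, not taken from the Facebook data set.
communities = [0, 0, 0, 1, 1, 1]
class_year = [2006, 2006, 2007, 2008, 2008, 2008]
print(rand_index(communities, class_year))
```

A value of 1 means the two partitions group every pair of users identically. A raw Rand index is sensitive to the number and sizes of groups, so a practical comparison would likely need some normalization against chance.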

The remainder of this paper is organized as follows. We first discuss the Facebook data and present the methods that we used for testing homophily at the dyad level and demographic organization at the community level. We then present and discuss results on the largest connected components of the networks, student-only subnetworks, and single-gender subnetworks. Finally, we summarize and discuss our findings.

Data

The data, sent directly to us by Adam D’Angelo of Facebook, consists of the complete set of users (nodes) from the Facebook networks at each of 100 American institutions (which we enumerate in Table A.1) and all of the “friendship” links between those users’ pages as they existed on one particular day in September 2005. Each institution in the data is additionally identified by a number appearing as part of its name that appears to correspond to the order in which each institution “joined”

Methods

We study each network at both the dyad level and the community level. We first consider homophily [32], [33], [34], which we quantify by assortativity coefficients using the available categorical data. For some of the smaller networks, we additionally perform independent logistic regression on node pairs to obtain the log odds contributions to edge presence between two nodes that have the same categorical-data value. We similarly fit exponential random graph models (ERGMs) [35], [36], [37], [38]
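As an illustration of quantifying homophily with an assortativity coefficient, the sketch below assumes Newman's standard definition for a categorical attribute, r = (tr e - ||e^2||) / (1 - ||e^2||), where e is the normalized mixing matrix of edge ends by category. The excerpt does not spell out the formula, so this definition, the function name `attribute_assortativity`, and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict


def attribute_assortativity(edges, attribute):
    """Assortativity coefficient for a categorical node attribute.

    edges: iterable of (u, v) pairs for an undirected graph.
    attribute: dict mapping each node to its category.
    """
    mixing = defaultdict(float)  # mixing[(x, y)]: edge ends joining x to y
    ends = 0
    for u, v in edges:
        x, y = attribute[u], attribute[v]
        # Count each undirected edge in both directions so that the
        # mixing matrix is symmetric.
        mixing[(x, y)] += 1
        mixing[(y, x)] += 1
        ends += 2
    if ends == 0:
        raise ValueError("graph has no edges")

    # Fraction of edge ends that connect nodes of the same category.
    trace = sum(w for (x, y), w in mixing.items() if x == y) / ends
    # Fraction of edge ends attached to each category.
    marginal = defaultdict(float)
    for (x, _), w in mixing.items():
        marginal[x] += w / ends
    # Same-category fraction expected if edge ends were paired at random.
    expected = sum(a * a for a in marginal.values())
    if expected == 1.0:
        raise ValueError("only one category present; coefficient undefined")
    return (trace - expected) / (1.0 - expected)


# Hypothetical toy network: four users and three friendships.
toy_edges = [(1, 2), (3, 4), (2, 3)]
toy_residence = {1: "A", 2: "A", 3: "B", 4: "B"}
print(attribute_assortativity(toy_edges, toy_residence))
```

The coefficient equals 1 when every friendship joins users with the same attribute value and is near 0 when ties are no more likely within a category than random pairing of edge ends would give. This sketch does not handle users with missing attribute values, which real profile data would require.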

Results

We now use the methods outlined in the previous section to study the Facebook networks. We first follow the order of presentation above and then make some observations in combinations. Complete results are available in the tables in the Supplementary Data.

Conclusions

We have studied the social structure of Facebook “friendship” networks at one hundred American institutions at a single point in time (using data from September 2005). To compare the organizations of the 100 institutions using categorical data, we considered both microscopic and macroscopic perspectives. In particular, calculating assortativity coefficients and regression-model coefficients based on observed ties allows one to examine homophily at the local level, and algorithmic community

Acknowledgements

We thank Adam D’Angelo and Facebook for providing the data used in this study. We also acknowledge Sandra González-Bailón, Eric Kelsic, and Fred Stutzman for useful discussions. We thank Christina Frost for her help with developing some of the employed graph visualization code (available at http://netwiki.amath.unc.edu/VisComms) [54]. We would also like to thank James Fowler, Jim Moody, and Tom Snijders for their helpful insight. ALT was funded by the NSF through the UNC AGEP (NSF HRD-0450099)

References (70)

  • V. Krebs, Social network analysis software & services for organizations, communities, and their consultants, 2008.,...
  • M. Kurant, M. Gjoka, C.T. Butts, A. Markopoulou, Walking on a graph with a magnifying glass: stratified sampling via...
  • B. Hogan, A comparison of on and offline networks through the Facebook API, Working Paper, 2009. Available at:...
  • S. Rosenbloom, On Facebook, scholars link up with data, New York Times, 17 December...
  • D.M. Boyd, Viewing American class divisions through Facebook and Myspace, Apophenia Blog Essay, June 24, 2007....
  • D.M. Boyd, White flight in networked publics? How race and class shaped American teen engagement with Myspace and Facebook
  • V. Sodera, Rapleaf study reveals gender and age data of social network users (press release). Available at:...
  • R. Gajjala, Shifting frames: race, ethnicity, and intercultural communication in online social networking and virtual work
  • R. Nyland, C. Near, Jesus is my friend: religiosity as a mediating factor in internet social networking use, Paper...
  • N.W. Geidner, C.A. Fook, M.W. Bell, Masculinity and online social networks: male self-identification on Facebook.com,...
  • L. Hjorth et al., Being there and being here: gendered customising of mobile 3G practices through a case study in Seoul, Convergence (2005)
  • S. Fragoso, WTF a crazy Brazilian invasion, in: F. Sudweeks, H. Hrachovec (Eds.), Proceedings of CATaC 2006, Murdoch...
  • R. Zheng, F. Provost, A. Ghose, Social network collaborative filtering, Working paper CeDER-8-08. Center for Digital...
  • R. Kumar et al., Structure and evolution of online social networks
  • C. Lampe et al., A familiar Face(book): profile elements as signals in an online social network
  • D. Liben-Nowell et al., Geographic routing in social networks, Proceedings of the National Academy of Sciences (2005)
  • S.A. Golder et al., Rhythms of social interaction: messaging within a massive online network
  • L. Backstrom et al., Group formation in large social networks: membership, growth, and evolution
  • E. Spertus et al., Evaluating similarity measures: a large-scale study in the Orkut social network
  • A. Chin, M. Chignell, Identifying active subgroups within online communities, in: Proceedings of the Centre for...
  • M. Brzozowski et al., Friends and foes: ideological social networking
  • T. Hogg et al., Multiple relationship types in online communities and social networks
  • K. Lewis et al., The taste for privacy: an analysis of college student privacy settings in an online social network, Journal of Computer-Mediated Communication (2008)
  • A.L. Traud et al., Comparing community structure to characteristics in online collegiate social networks, SIAM Review (2011)