2011 | Book

Social Network Data Analytics

Edited by: Charu C. Aggarwal

Publisher: Springer US

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

Social network analysis applications have advanced tremendously in the last few years, due in part to the growing tendency of users to interact with each other on the internet. Social networks are organized as graphs, and the data on social networks takes the form of massive streams, which are mined for a variety of purposes.

Social Network Data Analytics covers an important niche in the social network analytics field. This edited volume, contributed by prominent researchers in this field, presents a wide selection of topics on social network data mining such as Structural Properties of Social Networks, Algorithms for Structural Discovery of Social Networks and Content Analysis in Social Networks. This book is also unique in focussing on the data analytical aspects of social networks in the internet scenario, rather than the traditional sociology-driven emphasis prevalent in the existing books, which do not focus on the unique data-intensive characteristics of online social networks. Emphasis is placed on simplifying the content so that students and practitioners benefit from this book.

This book targets advanced level students and researchers concentrating on computer science as a secondary text or reference book. Data mining, database, information security, electronic commerce and machine learning professionals will find this book a valuable asset, as well as primary associations such as ACM, IEEE and Management Science.

Table of Contents

Frontmatter

Chapter 1. An Introduction to Social Network Data Analytics

Abstract

The advent of online social networks has been one of the most exciting events in this decade. Online social networks such as Twitter, LinkedIn, and Facebook have become increasingly popular. In addition, a number of multimedia networks such as Flickr have also seen an increasing level of popularity in recent years. Many such social networks are extremely rich in content, and they typically contain a tremendous amount of content and linkage data which can be leveraged for analysis. The linkage data is essentially the graph structure of the social network and the communications between entities, whereas the content data contains the text, images and other multimedia data in the network. The richness of this network provides unprecedented opportunities for data analytics in the context of social networks. This book provides a data-centric view of online social networks, a topic which has been missing from much of the literature. This chapter provides an overview of the key topics in this field, and their coverage in this book.

Charu C. Aggarwal

Chapter 2. Statistical Properties of Social Networks

Abstract

In this chapter we describe patterns that occur in the structure of social networks, represented as graphs. We describe two main classes of properties: static properties, which describe the structure of snapshots of graphs, and dynamic properties, which describe how the structure evolves over time. These properties may be for unweighted or weighted graphs, where weights may represent multi-edges (e.g. multiple phone calls from one person to another), or edge weights (e.g. monetary amounts between a donor and a recipient in a political donation network).

Mary McGlohon, Leman Akoglu, Christos Faloutsos
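
As a minimal sketch of the static, weighted-graph view described in this abstract, the snippet below treats repeated phone calls as multi-edges and derives per-node degree summaries from them. The function name `degree_summaries`, the toy call log, and the choice of out-degree as the statistic are illustrative assumptions, not taken from the chapter.

```python
from collections import Counter

def degree_summaries(calls):
    """Summarise a directed multigraph given as (caller, callee) pairs.

    Repeated pairs are treated as multi-edges, so the multiplicity of a
    pair acts as the weight of that edge.
    """
    edge_weight = Counter(calls)   # (caller, callee) -> number of calls
    out_degree = Counter()         # caller -> number of distinct callees
    out_weight = Counter()         # caller -> total number of calls made
    for (caller, _callee), weight in edge_weight.items():
        out_degree[caller] += 1
        out_weight[caller] += weight
    # Degree distribution: how many nodes have each out-degree.
    degree_hist = Counter(out_degree.values())
    return edge_weight, out_degree, out_weight, degree_hist

# Hypothetical call log used only for illustration.
toy_calls = [("p", "q"), ("p", "q"), ("p", "r"),
             ("q", "r"), ("r", "p"), ("r", "p"), ("r", "p")]
edge_weight, out_degree, out_weight, degree_hist = degree_summaries(toy_calls)
```

Keeping the unweighted degree and the weighted degree separate makes it possible to compare the two for the same node, which is the kind of distinction the abstract draws between unweighted and weighted graphs.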

Chapter 3. Random Walks in Social Networks and their Applications: A Survey

Abstract

A wide variety of interesting real-world applications, e.g. friend suggestion in social networks, keyword search in databases, and web-spam detection, can be framed as ranking entities in a graph. To obtain a ranking, we need a graph-theoretic measure of similarity. Ideally this should capture the information hidden in the graph structure. For example, two entities are similar if there are lots of short paths between them. Random walks have proven to be a simple, yet powerful mathematical tool for extracting information from the ensemble of paths between entities in a graph. Since real-world graphs are enormous and complex, ranking using random walks is still an active area of research. The research in this area spans from new applications to novel algorithms and mathematical analysis, bringing together ideas from different branches of statistics, mathematics and computer science. In this book chapter, we describe different random walk based proximity measures, their applications, and existing algorithms for computing them.

Purnamrita Sarkar, Andrew W. Moore
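
To make the idea of a random walk based proximity measure concrete, here is a minimal sketch of one such measure, random walk with restart (also known as personalized PageRank), computed by power iteration. The function name, the restart probability of 0.15, the fixed iteration count, and the toy graph are all assumptions chosen for illustration; the chapter itself may define and compute its measures differently.

```python
def random_walk_with_restart(adj, source, restart=0.15, iters=100):
    """Approximate random-walk-with-restart scores relative to `source`.

    adj maps each node to a list of its neighbours. At every step the
    walker jumps back to `source` with probability `restart`, otherwise
    it moves to a uniformly chosen neighbour of its current node.
    """
    scores = {v: 0.0 for v in adj}
    scores[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        for u, neighbours in adj.items():
            if not neighbours:
                # Dangling node: return its walking mass to the source.
                nxt[source] += (1 - restart) * scores[u]
                continue
            share = (1 - restart) * scores[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        # Restart mass from every node flows back to the source.
        nxt[source] += restart * sum(scores.values())
        scores = nxt
    return scores

# Hypothetical undirected toy graph: a triangle a-b-c with a tail c-d-e.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
proximity_to_a = random_walk_with_restart(toy_graph, "a")
```

Nodes connected to the source by many short paths accumulate more probability mass, so sorting nodes by score gives a ranking of the kind the abstract describes, for example for friend suggestion.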

Chapter 4. Community Discovery in Social Networks: Applications, Methods and Emerging Trends

Abstract

Data sets originating from many different real world domains can be represented in the form of interaction networks in a very natural, concise and meaningful fashion. This is particularly true in the social context, especially given recent advances in Internet technologies and Web 2.0 applications leading to a diverse range of evolving social networks. Analysis of such networks can result in the discovery of important patterns and potentially shed light on important properties governing the growth of such networks.

It has been shown that most of these networks exhibit a strong modular nature or community structure. An important research agenda thus is to identify communities of interest and study their behavior over time. Given the importance of this problem there has been significant activity within this field, particularly over the last few years. In this article we survey the landscape and attempt to characterize the principal methods for community discovery (and related variants) and identify current and emerging trends as well as crosscutting research issues within this dynamic field.

S. Parthasarathy, Y. Ruan, V. Satuluri

Chapter 5. Node Classification in Social Networks

Abstract

When dealing with large graphs, such as those that arise in the context of online social networks, a subset of nodes may be labeled. These labels can indicate demographic values, interests, beliefs or other characteristics of the nodes (users). A core problem is to use this information to extend the labeling so that all nodes are assigned a label (or labels).

In this chapter, we survey classification techniques that have been proposed for this problem. We consider two broad categories: methods based on iterative application of traditional classifiers using graph information as features, and methods which propagate the existing labels via random walks. We adopt a common perspective on these methods to highlight the similarities between different approaches within and across the two categories. We also describe some extensions and related directions to the central problem of node classification.

Smriti Bhagat, Graham Cormode, S. Muthukrishnan
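
The second category mentioned in this abstract, propagating existing labels through the graph, can be sketched as follows. This is a simplified label propagation scheme under stated assumptions: seed labels are clamped, every other node repeatedly averages the label distributions of its neighbours, and the iteration count is fixed. The function name and the toy path graph are hypothetical, and the chapter's own formulations may differ.

```python
def propagate_labels(adj, seeds, iters=50):
    """Extend a partial labeling to all nodes by neighbour averaging.

    adj maps each node to a list of neighbours; seeds maps some nodes to
    known labels. Seed nodes keep their label throughout. Nodes that no
    label mass ever reaches are assigned None.
    """
    labels = sorted(set(seeds.values()))
    dist = {
        v: {l: 1.0 if seeds.get(v) == l else 0.0 for l in labels}
        for v in adj
    }
    for _ in range(iters):
        new = {}
        for v, neighbours in adj.items():
            if v in seeds or not neighbours:
                new[v] = dist[v]  # clamp seeds, leave isolated nodes alone
                continue
            new[v] = {
                l: sum(dist[u][l] for u in neighbours) / len(neighbours)
                for l in labels
            }
        dist = new
    result = {}
    for v in adj:
        best = max(labels, key=lambda l: dist[v][l])
        result[v] = best if dist[v][best] > 0.0 else None
    return result

# Hypothetical path graph u1 - u2 - ... - u6 with the two endpoints labeled.
toy_path = {
    "u1": ["u2"],
    "u2": ["u1", "u3"],
    "u3": ["u2", "u4"],
    "u4": ["u3", "u5"],
    "u5": ["u4", "u6"],
    "u6": ["u5"],
}
toy_seeds = {"u1": "x", "u6": "y"}
extended = propagate_labels(toy_path, toy_seeds)
```

Averaging over neighbours is equivalent to asking which seed label a random walk started at the node is most likely to hit first, which is why this family of methods is described in terms of random walks.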

Chapter 6. Evolution in Social Networks: A Survey

Abstract

There is much research on social network analysis but only recently did scholars turn their attention to the volatility of social networks. An abundance of questions emerged. How does a social network evolve – can we find laws and derive models that explain its evolution? How do communities emerge in a social network and how do they expand or shrink? What is a community in an evolving network – can we claim that two communities seen at two distinct timepoints are the same one, even if they have next to no members in common? Research advances have different perspectives: some scholars focus on how evolution manifests itself in a social network, while others investigate how individual communities evolve as new members join and old ones become inactive. There are methods for discovering communities and capturing their changes in time, and methods that consider a community as a smoothly evolving constellation and thus build and adapt models upon that premise. This survey organizes advances on evolution in social networks into a common framework and gives an overview of these different perspectives.

Myra Spiliopoulou

Chapter 7. A Survey of Models and Algorithms for Social Influence Analysis

Abstract

Social influence is the behavioral change of a person because of the perceived relationship with other people, organizations and society in general. Social influence has been a widely accepted phenomenon in social networks for decades. Many applications have been built around the implicit notion of social influence between people, such as marketing, advertisement and recommendations. With the exponential growth of online social network services such as Facebook and Twitter, social influence can for the first time be measured over a large population. In this chapter, we survey the research on social influence analysis with a focus on the computational aspects. First, we present statistical measurements related to social influence. Second, we describe the literature on social similarity and influences. Third, we present the research on social influence maximization which has many practical applications including marketing and advertisement.

Jimeng Sun, Jie Tang

Chapter 8. A Survey of Algorithms and Systems for Expert Location in Social Networks

Abstract

Given a particular task and a set of candidates, one often wants to identify the right expert (or set of experts) that can perform the given task. We call this problem the expert-location problem and we survey its different aspects as they arise in practice. For example, given the activities of candidates within a context (e.g., authoring a document, answering a question), we first describe methods for evaluating the level of expertise for each of them. Often, experts are organized in networks that correspond to social networks or organizational structures of companies. We next devote part of the chapter to describing algorithms that compute the expertise level of individuals by taking into account their position in such a network. Finally, complex tasks often require the collective expertise of more than one expert. In such cases, it is more realistic to require a team of experts that can collaborate towards a common goal. We describe algorithms that identify effective expert teams within a network of experts. The chapter is a survey of different algorithms for expertise evaluation and team identification. We highlight the basic algorithmic problems and give some indicative algorithms that have been developed in the literature. We conclude the chapter by providing a comprehensive overview of real-life systems for expert location.

Theodoros Lappas, Kun Liu, Evimaria Terzi

Chapter 9. A Survey of Link Prediction in Social Networks

Abstract

Link prediction is an important task for analyzing social networks which also has applications in other domains such as information retrieval, bioinformatics and e-commerce. There exist a variety of techniques for link prediction, ranging from feature-based classification and kernel-based methods to matrix factorization and probabilistic graphical models. These methods differ from each other with respect to model complexity, prediction performance, scalability, and generalization ability. In this article, we survey some representative link prediction methods by categorizing them by the type of the models. We largely consider three types of models: first, the traditional (non-Bayesian) models which extract a set of features to train a binary classification model; second, the probabilistic approaches which model the joint probability among the entities in a network by Bayesian graphical models; and finally, the linear algebraic approach which computes the similarity between the nodes in a network by rank-reduced similarity matrices. We discuss various existing link prediction models that fall in these broad categories and analyze their strengths and weaknesses. We conclude the survey with a discussion on recent developments and future research directions.

Mohammad Al Hasan, Mohammed J. Zaki
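
For the first, feature-based category in this abstract, a minimal sketch of the feature extraction step might look like the following. It computes two widely used topological features, the common-neighbour count and the Jaccard coefficient, for every pair of nodes that is not yet linked. These two features, the function name, and the toy friendship graph are assumptions for illustration; the abstract does not say which features the chapter uses, and training the binary classifier is left out.

```python
from itertools import combinations

def candidate_link_features(adj):
    """Compute simple topological features for all non-adjacent node pairs.

    adj maps each node to the set of its neighbours (undirected graph).
    Returns {(u, v): (common_neighbour_count, jaccard_coefficient)}.
    """
    features = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, so not a prediction candidate
        common = adj[u] & adj[v]
        union = adj[u] | adj[v]
        jaccard = len(common) / len(union) if union else 0.0
        features[(u, v)] = (len(common), jaccard)
    return features

# Hypothetical friendship graph used only for illustration.
toy_friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"dan"},
}
pair_features = candidate_link_features(toy_friends)
```

In a full pipeline, each feature tuple would become one row of a training set, labeled by whether the link appears in a later snapshot of the network.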

Chapter 10. Privacy in Social Networks: A Survey

Abstract

In this chapter, we survey the literature on privacy in social networks. We focus both on online social networks and online affiliation networks. We formally define the possible privacy breaches and describe the privacy attacks that have been studied. We present definitions of privacy in the context of anonymization together with existing anonymization techniques.

Elena Zheleva, Lise Getoor

Chapter 11. Visualizing Social Networks

Abstract

With today's ubiquity and popularity of social network applications, the ability to analyze and understand large networks in an efficient manner becomes critically important. However, as networks become larger and more complex, reasoning about social dynamics via simple statistics is not a feasible option. To overcome these limitations, we can rely on visual metaphors. Visualization nowadays is no longer a passive process that produces images from a set of numbers. Recent years have witnessed a convergence of social network analytics and visualization, coupled with interaction, that is changing the way analysts understand and characterize social networks. In this chapter, we discuss the main goal of visualization and how different metaphors are aimed towards elucidating different aspects of social networks, such as structure and semantics. We also describe a number of methods where analytics and visualization are interwoven towards providing a better comprehension of social structure and dynamics.

Carlos D. Correa, Kwan-Liu Ma

Chapter 12. Data Mining in Social Media

Abstract

The rise of online social media is providing a wealth of social network data. Data mining techniques provide researchers and practitioners with the tools needed to analyze large, complex, and frequently changing social media data. This chapter introduces the basics of data mining, reviews social media, discusses how to mine social media data, and highlights some illustrative examples with an emphasis on social networking sites and blogs.

Geoffrey Barbier, Huan Liu

Chapter 13. Text Mining in Social Networks

Abstract

Social networks are rich in various kinds of content such as text and multimedia. The ability to apply text mining algorithms effectively in the context of text data is critical for a wide variety of applications. Social networks require text mining algorithms for a wide variety of applications such as keyword search, classification, and clustering. While search and classification are well known applications for a wide variety of scenarios, social networks have a much richer structure both in terms of text and links. Much of the work in the area uses either purely the text content or purely the linkage structure. However, many recent algorithms use a combination of linkage and content information for mining purposes. In many cases, it turns out that the use of a combination of linkage and content information provides much more effective results than a system which is based purely on either of the two. This paper provides a survey of such algorithms, and the advantages observed by using such algorithms in different scenarios. We also present avenues for future research in this area.

Charu C. Aggarwal, Haixun Wang

Chapter 14. Integrating Sensors and Social Networks

Abstract

A number of sensor applications in recent years collect data which can be directly associated with human interactions. Some examples of such applications include GPS applications on mobile devices, accelerometers, or location sensors designed to track human and vehicular traffic. Such data lends itself to a variety of rich applications in which one can use the sensor data in order to model the underlying relationships and interactions. It also leads to a number of challenges, since such data may often be private, and it is important to be able to perform the mining process without violating the privacy of the users. In this chapter, we provide a broad survey of the work in this important and rapidly emerging field. We also discuss the key problems which arise in the context of this important field and the corresponding solutions.

Charu C. Aggarwal, Tarek Abdelzaher

Chapter 15. Multimedia Information Networks in Social Media

Abstract

The popularity of personal digital cameras and online photo/video sharing communities has led to an explosion of multimedia information. Unlike traditional multimedia data, many new multimedia datasets are organized in a structural way, incorporating rich information such as semantic ontology, social interaction, community media, and geographical maps, in addition to the multimedia contents themselves. Studies of such structured multimedia data have resulted in a new research area, which is referred to as Multimedia Information Networks. Multimedia information networks are closely related to social networks, but especially focus on understanding the topics and semantics of the multimedia files in the context of network structure. This chapter reviews different categories of recent systems related to multimedia information networks, summarizes the popular inference methods used in recent works, and discusses the applications related to multimedia information networks. We also discuss a wide range of topics including public datasets, related industrial systems, and potential future research directions in this field.

Liangliang Cao, GuoJun Qi, Shen-Fu Tsai, Min-Hsuan Tsai, Andrey Del Pozo, Thomas S. Huang, Xuemei Zhang, Suk Hwan Lim

Chapter 16. An Overview of Social Tagging and Applications

Abstract

Social tagging on online portals has become a trend. It has emerged as one of the best ways of associating metadata with web objects. With the increase in the kinds of web objects becoming available, collaborative tagging of such objects is also developing along new dimensions. This popularity has led to a vast literature on social tagging. In this survey paper, we summarize different techniques employed to study various aspects of tagging. Broadly, we discuss properties of tag streams, tagging models, tag semantics, generating recommendations using tags, visualizations of tags, applications of tags, integration of different tagging systems and problems associated with tagging usage. We discuss topics like why people tag, what influences the choice of tags, how to model the tagging process, kinds of tags, different power laws observed in the tagging domain, how tags are created and how to choose the right tags for recommendation. Metadata generated in the form of tags can be efficiently used to improve web search, for web object classification, for generating ontologies, for enhanced browsing etc. We discuss these applications and conclude with thoughts on future work in the area.

Manish Gupta, Rui Li, Zhijun Yin, Jiawei Han

Backmatter

Title: Social Network Data Analytics
Edited by: Charu C. Aggarwal
Publisher: Springer US
Electronic ISBN: 978-1-4419-8462-3
Print ISBN: 978-1-4419-8461-6
DOI: https://doi.org/10.1007/978-1-4419-8462-3