2014 | Book

State of the Art Applications of Social Network Analysis

About this book

Social network analysis increasingly bridges the discovery of patterns in diverse areas of study as data becomes more available and more complex. Yet the construction of huge networks from large data often requires entirely different approaches for analysis, including graph theory, statistics, machine learning, and data mining. This work covers frontier studies on social network analysis and mining from different perspectives such as social network sites, financial data, e-mails, forums, academic research funds, XML technology, blog content, community detection and clique finding, prediction of users’ behavior, privacy in social network analysis, mobility from a spatio-temporal point of view, agent technology, and political parties in parliament. These topics will be of interest to researchers and practitioners from different disciplines including, but not limited to, social sciences and engineering.

Table of Contents

Frontmatter
A Randomized Approach for Structural and Message Based Private Friend Recommendation in Online Social Networks
Abstract
The growth of online social networks has opened new doors for various business applications, such as promoting a new product to customers. In addition, friend recommendation is an important tool for recommending potential candidates as friends to users in order to enhance the development of the entire network structure. Existing friend recommendation methods utilize social network structure and/or user profile information. However, these techniques are no longer applicable if the privacy of users is taken into consideration. In this chapter, we first propose a two-phase private friend recommendation protocol for recommending friends to a given target user based on the network structure as well as on the real message interaction between users. Our protocol computes the recommendation scores of all users who are within a radius of h from the target user in a privacy-preserving manner. We then address some implementation details and point out an inherent security issue in current online social networks due to the message flow information. To mitigate this issue and provide better security, we propose an extended version of the protocol using a randomization technique. In addition, we show the practical applicability of our approach through empirical analysis based on different parameters.
Bharath K. Samanthula, Wei Jiang
Context Based Semantic Relations in Tweets
Abstract
Twitter, a popular social networking platform, provides a medium for people to share information and opinions with their followers. In such a medium, a flash event finds an immediate response. However, one concept may be expressed in many different ways. Because of users’ different writing conventions, acronym usages, language differences, and spelling mistakes, there may be variations in the content of postings even if they are about the same event. Analyzing semantic relationships and detecting these variations have several use cases, such as event detection, and making recommendations to users while they are posting tweets. In this work, we apply semantic relationship analysis methods based on term co-occurrences in tweets, and evaluate their effect on detection of daily events from Twitter. The results indicate higher accuracy in clustering, earlier event detection and more refined event clusters.
Ozer Ozdikis, Pinar Senkul, Halit Oguztuzun
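The abstract above builds on term co-occurrences in tweets. As a rough illustration only, the sketch below counts how often two terms appear in the same tweet. The whitespace tokenizer, the function name `cooccurrence_counts`, and the sample tweets are assumptions made for this example; the chapter's actual relationship measures are not described in the abstract.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(tweets):
    """Count how often each unordered pair of terms shares a tweet."""
    counts = Counter()
    for text in tweets:
        # Assumption: naive lowercase whitespace tokenization.
        terms = sorted(set(text.lower().split()))
        # Sorting makes each pair appear in one canonical order.
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical sample tweets about the same event.
tweets = [
    "earthquake hits city",
    "quake hits city center",
    "earthquake in city",
]
pairs = cooccurrence_counts(tweets)
print(pairs[("city", "hits")])        # tweets 1 and 2 share this pair
print(pairs[("city", "earthquake")])  # tweets 1 and 3 share this pair
```

Terms such as "earthquake" and "quake" never co-occur here, yet both co-occur with "city" and "hits"; shared context of this kind is one signal a semantic-relationship method could exploit.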
Fast Exact and Approximate Computation of Betweenness Centrality in Social Networks
Abstract
Social networks have proven in the last few years to be a powerful and flexible concept for representing and analyzing data emerging from social interactions and social activities. The study of these networks can thus provide a deeper understanding of many emergent global phenomena. The amount of data available in the form of social networks is growing by the day. This poses many computationally challenging problems for their analysis. In fact, many analysis tools suitable for small to medium sized networks are inefficient for large social networks. The computation of the betweenness centrality index (BC) is a well established method for network data analysis, and it is also important as a subroutine in more advanced algorithms, such as the Girvan-Newman method for graph partitioning. In this chapter we present a novel approach for the computation of the betweenness centrality, which considerably speeds up Brandes’ algorithm (the current state of the art) in the context of social networks. Our approach exploits the natural sparsity of the data to algebraically (and efficiently) determine the betweenness of those nodes forming trees (tree-nodes) in the social network. Moreover, for the residual network, which is often of much smaller size, we directly modify Brandes’ algorithm so that we can remove the nodes already processed and perform the computation of the shortest paths only for the residual nodes. We also give a fast sampling-based algorithm that computes an approximation of the betweenness centrality values of the residual network while returning the exact value for the tree-nodes. This algorithm improves in speed and precision over current state of the art approximation methods.
Tests conducted on a sample of publicly available large networks from the Stanford repository show that, for the exact algorithm, speed improvements of a factor ranging between 2 and 5 are possible on several such graphs when the sparsity, measured by the ratio of tree-nodes to the total number of nodes, is in a medium range (30–50 %). For some large networks from the Stanford repository and for a sample of social networks provided by Sistemi Territoriali with high sparsity (80 % and above), tests show that our algorithm, named SPVB (for Shortest Path Vertex Betweenness), consistently runs between one and two orders of magnitude faster than the current state of the art exact algorithm.
Miriam Baglioni, Filippo Geraci, Marco Pellegrini, Ernesto Lastres
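The sparsity measure mentioned in the abstract above (the ratio of tree-nodes to all nodes) can be illustrated with a small sketch. It assumes that tree-nodes can be found by repeatedly stripping nodes of degree 1 from an undirected graph; this peeling rule, the function name `tree_nodes`, and the sample graph are assumptions for illustration, and the code is not the SPVB algorithm itself.

```python
from collections import deque


def tree_nodes(adjacency):
    """Return the nodes removed by repeatedly stripping degree-1 nodes."""
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    removed = set()
    queue = deque(v for v, d in degree.items() if d == 1)
    while queue:
        v = queue.popleft()
        # Skip stale queue entries whose degree has changed since queuing.
        if v in removed or degree[v] != 1:
            continue
        removed.add(v)
        for u in adjacency[v]:
            if u not in removed:
                degree[u] -= 1
                if degree[u] == 1:
                    queue.append(u)
    return removed


# Hypothetical graph: a triangle 0-1-2, a tail 2-3-4, and a leaf 5 on node 0.
graph = {
    0: [1, 2, 5],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4],
    4: [3],
    5: [0],
}
peeled = tree_nodes(graph)
sparsity = len(peeled) / len(graph)
print(sorted(peeled), sparsity)
```

Only the triangle survives the peeling, so a shortest-path computation in the style of Brandes' algorithm would need to run on three residual nodes rather than six.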
An Agent-Based Modeling Framework for Social Network Simulation
Abstract
Agent-based modeling has been frequently adopted as a research tool in the fields of social and political sciences. Although social network analysis has recently generated a new wave of interest in many different research fields, software instruments specifically created for agent-based social network simulation are still missing. Restricting the field of interest specifically to social network models and simulations, instead of supporting general agent-based ones, allows for the creation of easier to use, more focused tools. In this work, we propose an agent-based modeling framework for simulations over social networks. The models are written in a purposely developed, domain-specific language that helps in mapping social-network concepts to agent-based ones. Our framework is created to deal with large simulations and to work effortlessly with other social network analysis toolkits.
Enrico Franchi
Early Stage Conversation Catalysts on Entertainment-Based Web Forums
Abstract
In this chapter we examine the interest around a number of television series broadcast on a weekly basis. We show that through analysis of initial conversation between fans or users of dedicated web forums we can provide a description of the greatest period of interest (or peak). We then focus our attention on this peak with an ultimate goal of characterising episodes as a function of the actions and qualities of the people that take part in early conversation about this episode. We find that early interaction trends have strong similarities with the overall conversation patterns, and contain the majority of information provided by influential members of the community. This observation has important implications for the rapid generation of meta-data which may be used during later broadcast and re-runs for description and valuation of episodes.
James Lanagan, Nikolai Anokhin, Julien Velcin
Predicting Users Behaviours in Distributed Social Networks Using Community Analysis
Abstract
Prediction of user behaviour in Social Networks is important for many applications, ranging from marketing to social community management. This chapter is devoted to the analysis of the propensity of a user to stop using a social platform in the near future. This problem is called churn prediction and has been extensively studied in telecommunication networks. We first present a novel algorithm to accurately detect overlapping local communities in social graphs. This algorithm outperforms the state of the art methods and is able to deal with pathological cases which can occur in real networks. It is then shown how, using graph attributes extracted from the user’s local community, it is possible to design efficient methods to predict churn. Because the data of real large social networks is generally distributed across many servers, we show how to compute the different local social circles, using distributed data and in parallel on Hadoop HBase. Experiments are presented on one of the largest French social blog platforms, Skyrock, where millions of teenagers interact daily.
Blaise Ngonmang, Emmanuel Viennet, Maurice Tchuente
What Should We Protect? Defining Differential Privacy for Social Network Analysis
Abstract
Privacy of social network data is a growing concern that threatens to limit access to this valuable data source. Analysis of the graph structure of social networks can provide valuable information for revenue generation and social science research, but unfortunately, ensuring this analysis does not violate individual privacy is difficult. Simply anonymizing graphs or even releasing only aggregate results of analysis may not provide sufficient protection. Differential privacy is an alternative privacy model, popular in data-mining over tabular data, that uses noise to obscure individuals’ contributions to aggregate results and offers a very strong mathematical guarantee that individuals’ presence in the data-set is hidden. Analyses that were previously vulnerable to identification of individuals and extraction of private data may be safely released under differential-privacy guarantees. We review two existing standards for adapting differential privacy to network data and analyze the feasibility of several common social-network analysis techniques under these standards. Additionally, we propose out-link privacy and partition privacy, novel standards for differential privacy over network data, and introduce powerful private algorithms for common network analysis techniques that were infeasible to privatize under previous differential privacy standards.
Christine Task, Chris Clifton
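The abstract above describes differential privacy as using noise to obscure individuals' contributions to aggregate results. A minimal sketch of that idea is the standard Laplace mechanism applied to a count query, shown below. The function names, the default sensitivity of 1, and the example values are assumptions for illustration; the abstract does not say which mechanisms the chapter uses, and the sensitivity of a network statistic depends on the privacy standard chosen.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical query: a true count of 100, released many times to show
# that the noise is zero-mean while each single release is perturbed.
rng = random.Random(0)
samples = [private_count(100, epsilon=0.5, rng=rng) for _ in range(20000)]
mean_estimate = sum(samples) / len(samples)
print(round(mean_estimate, 1))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy. In practice, each release spends privacy budget, so repeating a query as in this demonstration would weaken the guarantee.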
Complex Network Analysis of Research Funding: A Case Study of NSF Grants
Abstract
Funding from government agencies has been the driving force for research and educational institutions, particularly in the United States. The government provides billions of dollars every year to lead research initiatives that will shape the future. In this chapter, we analyze the funds distributed by the United States National Science Foundation (NSF), a major source of academic research funding, to understand the collaboration patterns among researchers and institutions. Using complex network analysis, we interpret the collaboration patterns at researcher, institution, and state levels by constructing the corresponding networks based on the number of collaborative grants in different time frames. Additionally, we analyze these networks for small, medium, and large projects in order to observe collaboration at different funding levels. We further analyze the directorates to identify the differences in collaboration trends between disciplines. Sample networks can be found at http://www.cse.unr.edu/~mgunes/NSFCollaborationNetworks/.
Hakan Kardes, Abdullah Sevincer, Mehmet Hadi Gunes, Murat Yuksel
A Density-Based Approach to Detect Community Evolutionary Events in Online Social Networks
Abstract
With the advent of Web 2.0/3.0 supported social media, Online Social Networks (OSNs) have emerged as one of the popular communication tools for interacting with similar interest groups around the globe. Due to the increasing popularity of OSNs and exponential growth in the number of their users, a significant amount of research effort has been directed towards analyzing user-generated data available on these networks, and as a result various community mining techniques have been proposed by different research groups. However, most of the existing techniques treat the OSN users as a fixed set, which is not always true in a real scenario; rather, OSNs are dynamic in the sense that many users join and leave the network on a regular basis. Considering such dynamism, this chapter presents a density-based community mining method, OCTracker, for tracking overlapping community evolution in online social networks. The proposed approach adapts a preliminary community structure to dynamic changes in social networks using a novel density-based approach for detecting overlapping community structures, and automatically detects evolutionary events including birth, growth, contraction, merge, split, and death of communities over time. Unlike other density-based community detection methods, the proposed method does not require the neighborhood threshold parameter to be set by the users; rather, it automatically determines it for each node locally. Evaluation results on various datasets reveal that the proposed method is computationally efficient and naturally scales to large social networks.
Muhammad Abulaish, Sajid Yousuf Bhat
@Rank: Personalized Centrality Measure for Email Communication Networks
Abstract
Email communication patterns have long been used to derive the underlying social network structure. By looking at who is talking to whom and how often, researchers have disclosed interesting patterns, hinting at the social roles and importance of actors in such networks. Email communication analysis has previously been applied to discover cliques and fraudulent activities (e.g. the Enron email network), to observe information dissemination patterns, and to identify key players in communication networks. In this chapter we use a large dataset of email communication within a constrained community to discover the importance of actors in the underlying network as perceived independently by each actor. We base our method on a simple notion of implicit importance: people are more likely to quickly respond to emails sent by people whom they perceive as important. We propose several methods for building the social network from the email communication data and we introduce various weighting schemes which correspond to different perceptions of importance. We compare the rankings to observe the stability of our method. We also compare the results with an a priori assessment of actors’ importance to verify our method. The resulting ranking can be used both in aggregated form as a global centrality measure and as a personalized ranking that reflects individual perception of other actors’ importance.
Paweł Lubarski, Mikołaj Morzy
Twitter Sentiment Analysis: How to Hedge Your Bets in the Stock Markets
Abstract
The emerging interest of trading companies and hedge funds in mining the social web has created new avenues for intelligent systems that make use of public opinion in driving investment decisions. It is well accepted that in high-frequency trading, investors track memes rising in microblogging forums to account for public behavior as an important feature when making short-term investment decisions. We investigate the complex relationship between tweet board literature (such as bullishness, volume, and agreement) and financial market instruments (such as volatility, trading volume, and stock prices). We have analyzed Twitter sentiments for more than 4 million tweets between June 2010 and July 2011 for DJIA, NASDAQ-100 and 11 other big cap technological stocks. Our results show high correlation (up to 0.88 for returns) between stock prices and Twitter sentiments. Further, using Granger’s Causality Analysis, we have validated that the movement of stock prices and indices is greatly affected in the short term by Twitter discussions. Finally, we have implemented Expert Model Mining System (EMMS) to demonstrate that our forecasted returns give a high value of R-square (0.952) with low Maximum Absolute Percentage Error (MaxAPE) of 1.76 % for Dow Jones Industrial Average (DJIA). We introduce and validate performance of market monitoring elements derived from public mood that can be exploited to retain a portfolio within limited risk state during typical market conditions.
Tushar Rao, Saket Srivastava
The Impact of Measurement Time on Subgroup Detection in Online Communities
Abstract
More and more communities use internet based services and infrastructure for communication and collaboration. All these activities leave digital traces that are of interest for research as real world data sources that can be processed automatically or semi-automatically. Since productive online communities (such as open source developer teams) tend to support the establishment of ties between actors who work on or communicate about the same or similar objects, social network analysis is a frequently used research methodology in this field. A typical application of Social Network Analysis (SNA) techniques is the detection of cohesive subgroups of actors (also called “community detection”). We were particularly interested in methods that allow for the detection of overlapping clusters, which is the case with the Clique Percolation Method (CPM) and Link Community detection (LC). We have used these two methods to analyze data from some open source developer communities (mailing lists and log files) and have compared the results for varied time windows of measurement. The influence of the time span of data capturing/aggregation can be compared to photography: A certain minimal window size is needed to get a clear image with enough “light” (i.e. dense enough interaction data), whereas for very long time spans the image will be blurred because subgroup membership will indeed change during the time span (corresponding to a moving target). In this sense, our target parameter is “resolution” of subgroup structures. We have identified several indicators for good resolution. In general, this value will vary for different types of communities with different communication frequency and behavior. Following our findings, an explicit analysis and comparison of the influence of time window for different communities may be used to better adjust analysis techniques for the communities at hand.
Sam Zeini, Tilman Göhnert, Tobias Hecking, Lothar Krempel, H. Ulrich Hoppe
Spatial and Temporal Evaluation of Network-Based Analysis of Human Mobility
Abstract
The availability of massive network and mobility data from diverse domains has fostered the analysis of human behavior and interactions. This data availability leads to challenges in the knowledge discovery community. Several different analyses have been performed on the traces of human trajectories, such as understanding the real borders of human mobility or mining social interactions derived from mobility and vice versa. However, the data quality of the digital traces of human mobility has a dramatic impact on the knowledge that it is possible to mine, and this issue has not been thoroughly tackled in the literature so far. In this chapter, we mine and analyze with complex network techniques a large dataset of human trajectories, a GPS dataset from more than 150k vehicles in Italy. We build a multiresolution spatial grid and we map the trajectories to several complex networks, by connecting the different areas of our region of interest. We also analyze different temporal slices of the network, obtaining a dynamic perspective over its evolution. We analyze the structural properties of the temporal and geographical slices and their human mobility predictive power. The result is a significant advancement in our understanding of the data transformation process that is needed to connect mobility with social network analysis and mining.
Michele Coscia, Salvatore Rinzivillo, Fosca Giannotti, Dino Pedreschi
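The abstract above describes mapping trajectories onto a multiresolution spatial grid and connecting the resulting areas into networks. The sketch below shows one possible reading of that step. The square cells measured in degrees, the names `cell_of` and `trajectory_network`, and the sample coordinates are all assumptions for illustration, since the abstract does not specify how the grid is constructed.

```python
import math
from collections import Counter


def cell_of(lat, lon, cell_size_deg):
    """Map a GPS point to the index of the grid cell containing it."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))


def trajectory_network(trajectories, cell_size_deg):
    """Count directed moves between distinct grid cells."""
    edges = Counter()
    for points in trajectories:
        cells = [cell_of(lat, lon, cell_size_deg) for lat, lon in points]
        for src, dst in zip(cells, cells[1:]):
            if src != dst:  # ignore movement inside a single cell
                edges[(src, dst)] += 1
    return edges


# Hypothetical trips given as (latitude, longitude) points.
trips = [
    [(43.72, 10.43), (43.77, 11.26)],
    [(43.75, 10.45), (43.78, 10.48)],
    [(43.73, 10.41), (43.79, 11.22)],
]
fine = trajectory_network(trips, 0.1)    # finer grid resolution
coarse = trajectory_network(trips, 1.0)  # coarser grid resolution
print(fine)
print(coarse)
```

Running the same trips through several cell sizes yields one network per resolution, which allows structural properties to be compared across spatial scales.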
An Ant Based Particle Swarm Optimization Algorithm for Maximum Clique Problem in Social Networks
Abstract
In recent years, social network services have provided a suitable platform for analyzing the activity of users in social networks. In online social networks, interaction between users plays a key role in social network analysis. One important type of social structure is a fully connected relation between some users, which is known as a clique structure. Finding a maximum clique is therefore essential for the analysis of certain groups and communities in social networks. This paper proposes a new hybrid method that uses an ant colony optimization algorithm and a particle swarm optimization algorithm to find a maximum clique in social networks. In the proposed method, the pheromone update process is improved by particle swarm optimization in order to attain better results. Simulation results on popular standard social network benchmarks show a relative improvement of the proposed algorithm over the standard ant colony optimization algorithm.
Mohammad Soleimani-pouri, Alireza Rezvanian, Mohammad Reza Meybodi
XEngine: An XML Search Engine for Social Groups
Abstract
We introduce in this chapter an XML user-based Collaborative Filtering system called XEngine. The framework of XEngine categorizes social groups based on ethnic, cultural, religious, demographic, or other characteristics. XEngine outputs ranked lists of content items, taking into account not only the initial preferences of the user, but also the preferences of the various social groups, to which the user belongs. The user’s social groups are inferred implicitly by the system without involving the user. XEngine constructs the social groups and identifies their preferences dynamically on the fly. These preferences are determined from the preferences of the social groups’ member users using a group modeling strategy. XEngine can be used for various practical applications, such as Internet or other businesses that market preference-driven products. We experimentally compared XEngine with three existing systems. Results showed marked improvement.
Kamal Taha
Size, Diversity and Components in the Network Around an Entrepreneur: Shaped by Culture and Shaping Embeddedness of Firm Relations
Abstract
The network around an entrepreneur is conceptualized as having structural properties of size, diversity and a configuration of components. The Global Entrepreneurship Monitor has surveyed 61 countries with 88,562 entrepreneurs who reported networking with advisors. Cluster analysis of their relations revealed five components: a private network of advice relations with spouse, parents, other family and friends; a work-place network of boss, coworkers, starters and mentors; a professional network of accountants, lawyers, banks, investors, counselors and researchers; a market network of competitors, collaborators, suppliers and customers; and an international network of advice relations with persons abroad and persons who have come from abroad. Entrepreneurs’ networking is unfolding in a culture of traditionalism versus secular-rationalism. Traditionalism is hypothesized to reduce diversity and size of networks and specifically reduce networking in the public sphere, but to enhance networking in the private sphere. Cultural effects on networking are tested as macro-to-micro effects on networking in two-level mixed linear models with fixed effects of traditionalism and individual-level variables and random effects of country. We find that traditionalism reduces diversity and overall networking and specifically networking in the work-place, professions, market and internationally, but enhances private networking. These cultural effects are larger than effects of attributes of the entrepreneur. The personal network around the entrepreneur provides an embedding of the business relations around the entrepreneurs’ firm which are especially facilitated by the entrepreneur’s networks in the public sphere.
Maryam Cheraghi, Thomas Schott
Content Mining of Microblogs
Abstract
With the emergence of Web 2.0, internet users can share their content with other users through social networks. In this chapter, microbloggers’ content is evaluated with respect to how well it reflects their categories. Each microblogger’s category, which is one of four categories (economy, sport, entertainment, or technology), is taken from the wefollow.com application. 3337 RSS news feeds, whose category labels are the same as those of the microbloggers’ contributions, are used as training data for classification. Unlike in similar studies, if a feature of a microblog does not appear as a feature in the RSS news feeds, it is omitted, so that abbreviations and nonsense words in microblogs can be eliminated. In this study, two types of users’ contributions are taken as test data: those of normal microbloggers and those of bots. Classification results show that bots provide more categorical content than normal users.
M. Özgür Cingiz, Banu Diri
Backmatter
Metadata
Title
State of the Art Applications of Social Network Analysis
Edited by
Fazli Can
Tansel Özyer
Faruk Polat
Copyright year
2014
Electronic ISBN
978-3-319-05912-9
Print ISBN
978-3-319-05911-2
DOI
https://doi.org/10.1007/978-3-319-05912-9
