01 December 2016 | Research | Issue 1/2016 | Open Access

Applied Network Science 1/2016

Towards a standard sampling methodology on online social networks: collecting global trends on Twitter

Journal:
Applied Network Science, Issue 1/2016
Authors:
C. A. Piña-García, Carlos Gershenson, J. Mario Siqueiros-García
Important notes

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

The content extraction tool was programmed by C. A. Piña-García. All authors helped to write the literature review and to collect data. C. A. Piña-García wrote the majority of the paper with assistance from Carlos Gershenson and J. Mario Siqueiros-García. All authors read and approved the final manuscript.

Abstract

One of the most significant current challenges in large-scale online social networks is to establish a concise and coherent method for collecting and summarizing data. Sampling the content of an Online Social Network (OSN) plays an important role as a knowledge discovery tool.
Current sampling methods must cope with the lack of a full sampling frame, i.e., they operate under an imposed condition of limited data access. Another key aspect is the huge amount of data generated by users of social networking services such as Twitter, perhaps the most influential microblogging service, which produces approximately 500 million tweets per day. Because the size of Twitter is itself difficult to measure, analyzing the entire network is infeasible and sampling is unavoidable.
We believe there is a clear need for a new methodology to collect information on social networks (social mining). This paper introduces a set of random strategies that could be considered a reliable alternative for gathering global trends on Twitter. This research aims to present some initial ideas on how suitable random walks are for extracting information or global trends.
The main purpose of this study is to propose a suitable methodology for an efficient collection process via three random strategies: Brownian, Illusion and Reservoir. These strategies are applied through a Metropolis-Hastings Random Walk (MHRW). We show that interesting insights can be obtained by sampling emerging global trends on Twitter. The study also provides descriptive statistics and graphical descriptions from the preliminary experiments.
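To make the MHRW idea concrete, the following is a minimal sketch of the textbook Metropolis-Hastings Random Walk on an undirected graph, not the authors' implementation. It assumes the graph fits in memory as an adjacency dictionary; the names `mhrw_sample` and `toy_graph` are hypothetical, and no Twitter API access is modelled. At each step a neighbour is proposed uniformly at random and accepted with probability min(1, deg(current)/deg(candidate)), which corrects the bias of a plain random walk toward high-degree nodes so that, in the long run, nodes are visited approximately uniformly.

```python
import random
from collections import Counter


def mhrw_sample(graph, start, steps, rng=None):
    """Return the list of nodes visited by an MHRW of `steps` steps.

    `graph` maps each node to a list of its neighbours (undirected,
    connected, no isolated nodes). This is an illustrative assumption,
    not the data structure used in the paper.
    """
    rng = rng or random.Random()
    current = start
    visited = []
    for _ in range(steps):
        neighbours = graph[current]
        candidate = rng.choice(neighbours)  # uniform proposal
        # Accept with probability min(1, deg(current) / deg(candidate)).
        accept_prob = min(1.0, len(neighbours) / len(graph[candidate]))
        if rng.random() < accept_prob:
            current = candidate
        # On rejection the walk stays put; the repeat still counts as a sample.
        visited.append(current)
    return visited


# Hypothetical toy graph: hub "a" with four neighbours, plus one edge b-c.
toy_graph = {
    "a": ["b", "c", "d", "e"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
    "e": ["a"],
}

if __name__ == "__main__":
    walk = mhrw_sample(toy_graph, "a", 20000, random.Random(0))
    frequencies = Counter(walk)
    for node in sorted(toy_graph):
        print(node, round(frequencies[node] / len(walk), 3))
```

On the toy graph, a plain random walk would visit the hub "a" far more often than the leaves; with the acceptance step, the visit frequencies should all approach 1/5. How the Brownian, Illusion and Reservoir strategies plug into this walk is not specified in the abstract and is deliberately left out of the sketch.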