Abstract
Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together, these systems show the richness of content that can be extracted from Twitter.