research-article

Processing and visualizing the data in tweets

Published: 11 January 2012

Abstract

Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad-hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.



Published in

ACM SIGMOD Record, Volume 40, Issue 4
December 2011
60 pages
ISSN: 0163-5808
DOI: 10.1145/2094114

Copyright © 2012 Authors

Publisher

Association for Computing Machinery
New York, NY, United States
