DOI: 10.1145/2619112.2619115
tutorial

On the Correlation Between Textual Content and Geospatial Locations in Microblogs

Published: 22 June 2014

ABSTRACT

Microblogs allow users to publish geo-tagged posts: short textual messages assigned to a geographic location. Users send posts from places they visit and discuss an idiosyncratic mixture of personal and general topics. Thus, it is reasonable to assume that the locations and the textual content of a user's posts are distinctive and identify the posting user, to some extent. This raises the question of whether there is a correlation between the locations of posts and their content. Are users who are similar from the geospatial perspective (i.e., who send messages from nearby locations) also similar from the textual perspective (i.e., who send messages with similar textual content)? Do posts with similar content have a spatial distribution similar to that of any random set of posts? We present a study that focuses on these questions. We provide statistical tests to examine the correlation between textual content and geospatial locations in tweets. We show that although there is some correlation between locations and textual content, the two provide different similarity measures, and combining them to identify users by their posts outperforms methods that use only locations or only textual content.
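The first question in the abstract (are geospatially similar users also textually similar?) can be illustrated with a small sketch. The abstract does not say which statistical tests or similarity measures the study uses, so everything below is an assumption made for illustration: users are summarized by the arithmetic mean of their post coordinates and a bag-of-words vector, geospatial similarity is the negated great-circle distance, textual similarity is cosine similarity, and the two are compared with a Spearman rank correlation over all user pairs. The function names and the input layout are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cosine_similarity(c1, c2):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(c1[t] * c2[t] for t in c1.keys() & c2.keys())
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


def location_text_correlation(users):
    """users: dict mapping user id -> list of ((lat, lon), text) posts.

    Assumption: each user is reduced to a coordinate mean and a word-count
    vector. This is an illustrative simplification, not the paper's method.
    """
    profiles = {}
    for uid, posts in users.items():
        lat = sum(p[0][0] for p in posts) / len(posts)
        lon = sum(p[0][1] for p in posts) / len(posts)
        words = Counter(w for _, text in posts for w in text.lower().split())
        profiles[uid] = ((lat, lon), words)

    geo_sim, text_sim = [], []
    for u, v in combinations(sorted(profiles), 2):
        # Negate distance so that larger values mean "more similar".
        geo_sim.append(-haversine_km(profiles[u][0], profiles[v][0]))
        text_sim.append(cosine_similarity(profiles[u][1], profiles[v][1]))
    return spearman(geo_sim, text_sim)
```

A coefficient near 1 would mean that user pairs who post from nearby places also tend to post similar text; a value near 0 would mean the two similarity measures carry largely independent information, which is the situation in which combining them is most useful for identifying users.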

  • Published in

    GeoRich'14: Proceedings of Workshop on Managing and Mining Enriched Geo-Spatial Data
    June 2014
    54 pages
    ISBN: 9781450329781
    DOI: 10.1145/2619112

    Copyright © 2014 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • tutorial
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 25 of 50 submissions, 50%
