ABSTRACT
Microblogs allow users to publish geo-tagged posts, that is, short textual messages assigned to a geographic location. Users send posts from places they visit and discuss an idiosyncratic mixture of personal and general topics. It is therefore reasonable to assume that the locations and the textual content of a user's posts are distinctive and identify the posting user, to some extent. This raises the question of whether there is a correlation between the locations of posts and their content. Are users who are similar from the geospatial perspective (i.e., who send messages from nearby locations) also similar from the textual perspective (i.e., who send messages with similar textual content)? Do posts with similar content have a spatial distribution similar to that of any random set of posts? We present a study that focuses on these questions. We provide statistical tests to examine the correlation between textual content and geospatial locations in tweets. We show that although there is some correlation between locations and textual content, the two provide different similarity measures, and combining them to identify users by their posts outperforms methods that use only locations or only textual content.
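The first question above, whether geospatially similar users are also textually similar, can be illustrated with a small sketch. It is not the study's procedure: the abstract does not specify the similarity measures or tests, so everything here is an assumption chosen for illustration. Each user is reduced to a crude centroid of post coordinates and a bag of words, pairwise distance is great-circle distance, textual similarity is cosine similarity of term counts, and the association is measured with Spearman's rank correlation. The function names and the `users` input layout are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def centroid(points):
    """Arithmetic mean of coordinates (assumption: adequate for small areas)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def cosine(a, b):
    """Cosine similarity of two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def location_text_correlation(users):
    """users: hypothetical dict mapping user id -> list of (lat, lon, text).

    Returns the Spearman correlation between pairwise user distance and
    pairwise textual similarity. A negative value means that users who
    are closer together tend to write more similar text.
    """
    profiles = {}
    for user, posts in users.items():
        center = centroid([(lat, lon) for lat, lon, _ in posts])
        terms = Counter(w for _, _, text in posts for w in text.lower().split())
        profiles[user] = (center, terms)
    distances, similarities = [], []
    for u, v in combinations(sorted(profiles), 2):
        distances.append(haversine_km(profiles[u][0], profiles[v][0]))
        similarities.append(cosine(profiles[u][1], profiles[v][1]))
    return spearman(distances, similarities)
```

On real data, a coefficient near zero would suggest that location and text carry largely independent signals, which is consistent with the abstract's claim that combining them helps identification. A realistic analysis would also need stemming, stop-word removal, and a significance test, none of which this sketch attempts.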