01-12-2016 | Original Article

Where has this tweet come from? Fast and fine-grained geolocalization of non-geotagged tweets

Authors: Pavlos Paraskevopoulos, Themis Palpanas

Published in: Social Network Analysis and Mining | Issue 1/2016

Abstract

The growing use of social networks in recent years has made an abundance of information on many aspects of everyday social activity available online, with Twitter being the most prominent and timely source. This has led to a proliferation of tools and applications that help end users and large-scale event organizers better plan and manage their activities. When analyzing information that originates from social networks, an important aspect is its geolocalization, i.e., the geographic coordinates associated with it, which several applications require (e.g., detecting trending venues or traffic jams). Unfortunately, only a very small percentage of Twitter posts are geotagged, which significantly restricts the applicability and utility of such applications. In this work, we address this problem by proposing a framework for geolocating tweets that are not geotagged. Our solution is general: it estimates the location from which a post was generated by exploiting similarities in content between that post and a set of geotagged tweets, as well as their time-evolution characteristics. Contrary to previous approaches, our framework aims to provide accurate geolocation estimates at a fine grain (i.e., within a city). An experimental evaluation with real data demonstrates the efficiency and effectiveness of our approach.
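To make the content-similarity idea concrete, the following is a minimal sketch and not the authors' framework. It assumes that geotagged tweets have already been assigned to hypothetical location cells (the `cell_id` labels), builds a bag-of-words profile per cell, and assigns a non-geotagged tweet to the cell with the highest cosine similarity. The function names, the naive whitespace tokenizer, and the choice of cosine similarity are all illustrative assumptions; the time-evolution characteristics mentioned in the abstract are deliberately left out.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_cell_profiles(geotagged):
    """Aggregate term counts per cell from (cell_id, text) pairs.

    The cell_id values are hypothetical labels for fine-grained areas
    of a city; how cells are defined is an assumption of this sketch.
    """
    profiles = defaultdict(Counter)
    for cell_id, text in geotagged:
        profiles[cell_id].update(tokenize(text))
    return profiles


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_cell(text, profiles):
    """Return (best_cell, score) for a non-geotagged tweet.

    Returns (None, 0.0) when the tweet shares no terms with any cell.
    """
    query = Counter(tokenize(text))
    best_cell, best_score = None, 0.0
    for cell_id, profile in profiles.items():
        score = cosine(query, profile)
        if score > best_score:
            best_cell, best_score = cell_id, score
    return best_cell, best_score


# Toy data, invented purely for illustration.
geotagged = [
    ("stadium", "great match tonight at the stadium"),
    ("station", "train delayed again at the central station"),
]
profiles = build_cell_profiles(geotagged)
cell, score = estimate_cell("watching the match at the stadium", profiles)
```

In this toy example the query shares more terms with the "stadium" profile than with the "station" profile, so `cell` is `"stadium"`. A realistic system would also need stop-word handling, term weighting, and a temporal component, none of which this sketch attempts.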

Footnotes
1. For the rest of this paper, we will use the terms geotagged and geolocalized interchangeably.
2. This paper extends and improves on our earlier results (Paraskevopoulos and Palpanas 2015).
3. Earlier studies have shown that techniques and models built for geotagged data indeed generalize to non-geotagged data, since geotagged and non-geotagged tweets have similar data characteristics (Han et al. 2014).
4. We note that the QL results reported here are much better than those reported in our earlier study (Paraskevopoulos and Palpanas 2015). This is due to the different experimental setup (i.e., sliding windows) that we now use for all algorithms, which resulted in an increased number of windows with a high number of tweets, leading to higher execution times and better models.

Literature
Abdelhaq H, Sengstock C, Gertz M (2013) Eventweet: online localized event detection from twitter. In: Proceedings of the VLDB Endowment, vol 6, no 12
Ajao O, Hong J, Liu W (2015) A survey of location inference techniques on twitter. J Inf Sci 41(6):855–864
Balduini M, Bocconi S, Bozzon A, Della Valle E, Huang Y, Oosterman J, Palpanas T, Tsytsarau M (2014) A case study of active, continuous and predictive social media analytics for smart city. In: ISWC workshop on semantics for smarter cities (S4SC)
Balduini M, Della Valle E, Dell'Aglio D, Tsytsarau M, Palpanas T, Confalonieri C (2013) Social listening of city scale events using the streaming linked data framework. In: ISWC
Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022
Chang Hw, Lee D, Eltaher M, Lee J (2012) @phillies tweeting from philly? Predicting twitter user locations with spatial word usage. In: ASONAM
Cheng Z, Caverlee J, Lee K (2010) You are where you tweet: a content-based approach to geo-locating twitter users. In: CIKM
Crooks A, Croitoru A, Stefanidis A, Radzikowski J (2013) #Earthquake: Twitter as a distributed sensor system. Trans GIS 17(1):124–147
Earle PS, Bowden DC, Guy M (2012) Twitter earthquake detection: earthquake monitoring in a social world. Ann Geophys 54(6):708–715
Eisenstein J, O’Connor B, Smith NA, Xing EP (2010) A latent variable model for geographic lexical variation. In: EMNLP
Frias-Martinez V, Soto V, Hohwald H, Frias-Martinez E (2012) Characterizing urban landscapes using geolocated tweets. In: SocialCom-PASSAT
Han B, Cook P, Baldwin T (2014) Text-based twitter user geolocation prediction. J Artif Intell Res 49:451–500
Hossain N, Hu T, Feizi R, Zheng D, White AM, Luo J, Kautz H (2016) Precise localization of homes and activities: detecting drinking-while-tweeting patterns in communities. In: Tenth international AAAI conference on web and social media, Cologne, Germany, May 17-20, 2016, pp 587–590
Ikawa Y, Enoki M, Tatsubori M (2012) Location inference using microblog messages. In: Proceedings of the 21st international conference companion on World Wide Web. ACM, pp 687–690
Kinsella S, Murdock V, O’Hare N (2011) I’m eating a sandwich in glasgow: modeling locations with tweets. In: SMUC
Li C, Sun A (2014) Fine-grained location extraction from tweets with temporal awareness. In: SIGIR
Malmi E, Do TMT, Gatica-Perez D (2013) From foursquare to my square: learning check-in behavior from multiple sources. In: ICWSM
Mathioudakis M, Koudas N (2010) Twittermonitor: trend detection over the twitter stream. In: SIGMOD
Murdock V (2011) Your mileage may vary: on the limits of social media. SIGSPATIAL Spec 3:62–66
Paradesi SM (2011) Geotagging tweets using their content. In: FLAIRS conference
Paraskevopoulos P, Dinh TC, Dashdorj Z, Palpanas T, Serafini L (2013) Identification and characterization of human behavior patterns from mobile phone data. In: NetMob
Paraskevopoulos P, Palpanas T (2015) Fine-grained geolocalisation of non-geotagged tweets. In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining 2015. ACM, pp 105–112
Paraskevopoulos P, Pellegrini G, Palpanas T (2016) When a tweet finds its place: fine-grained tweet geolocalisation. In: International workshop on data science for social good (SoGood), in conjunction with the European conference on machine learning and principles and practice of knowledge discovery (ECML PKDD)
Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes twitter users: real-time event detection by social sensors. In: WWW
Schulz A, Hadjakos A, Paulheim H, Nachtwey J, Mühlhäuser M (2013) A multi-indicator approach for geolocalization of tweets. In: ICWSM
Serdyukov P, Murdock V, Van Zwol R (2009) Placing flickr photos on a map. In: SIGIR
Tsytsarau M, Amer-Yahia S, Palpanas T (2013) Efficient sentiment correlation for large-scale demographics. In: SIGMOD
Tsytsarau M, Palpanas T (2014) Nia: system for news impact analytics. In: KDD workshop on interactive data exploration and analytics (IDEA)
Tsytsarau M, Palpanas T (2012) Survey on mining subjective data on the web. Data Min Knowl Discov 24:478–514
Tsytsarau M, Palpanas T, Castellanos M (2014) Dynamics of news events and social media reaction. In: SIGKDD
Tsytsarau M, Palpanas T, Denecke K (2010) Scalable discovery of contradictions on the web. In: WWW
Tsytsarau M, Palpanas T, Denecke K (2011) Scalable detection of sentiment-based contradictions. In: DiversiWeb, WWW
Van Canneyt S, Van Laere O, Schockaert S, Dhoedt B (2012) Using social media to find places of interest: a case study. In: SIGSPATIAL (GEOCROWD)
Yuan Q, Cong G, Ma Z, Sun A, Thalmann NM (2013) Who, where, when and what: discover spatio-temporal topics for twitter users. In: SIGKDD
Zafarani R, Liu H (2015) Evaluation without ground truth in social media research. Commun ACM 58(6):54–60
Metadata
Title
Where has this tweet come from? Fast and fine-grained geolocalization of non-geotagged tweets
Authors
Pavlos Paraskevopoulos
Themis Palpanas
Publication date
01-12-2016
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2016
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-016-0400-7
