Published in: GeoInformatica 3/2018

15 March 2017

Strategies for combining Twitter users geo-location methods

Authors: Silvio Ribeiro Jr, Gisele L. Pappa

Abstract

Twitter has become a major player in the social media scene, with over half a billion users and over 500 million tweets published daily. With such abundant data, researchers saw an opportunity to use it for monitoring events and tracking epidemics. In this type of application, knowing the location of the user is essential. However, most of the location information self-reported by users is difficult to process, and barely 1% of all published tweets are geolocated. Hence, user location inference is often performed by analyzing publicly available information from the user's profile and tweets. In this work, we evaluate and compare 16 approaches for user location inference based on different information sources, including interaction networks and the text of tweets. We show that methods working with the user friendship network obtain higher accuracy and recall than the other methods. Based on these results, we examine how far pairs of methods agree on the predicted location and on the users they cover. We find that most methods disagree in their inferences while covering different sets of users. These results open up an opportunity to combine different methods in order to improve location accuracy and user recall. We propose four methods for combining the outputs of the evaluated methods. Two of them, one based on a weighted voting scheme (GAVe) and another based on a meta decision tree, cover at least 98% of the users in the dataset while locating 75% of them within 100 km of their real location.
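
To make the combination idea concrete, the following is a minimal Python sketch of a weighted vote over the outputs of several location inference methods, together with the two measures quoted in the abstract: user coverage and the share of users located within 100 km. It is an illustration under stated assumptions, not the authors' GAVe implementation. The method names, weights, users, and city coordinates are hypothetical; the abstract does not say how GAVe sets its weights, so they are fixed by hand here. The sketch also assumes that accuracy is computed over covered users only, which the abstract leaves open.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def weighted_vote(predictions, weights):
    """Combine one user's per-method predictions ({method: city or None}).

    Each method that made a prediction adds its weight to the predicted city.
    Returns the city with the highest total, or None if no method covers the user.
    Ties go to the city that was seen first.
    """
    scores = defaultdict(float)
    for method, city in predictions.items():
        if city is not None:
            scores[city] += weights.get(method, 0.0)
    return max(scores, key=scores.get) if scores else None

def evaluate(predicted, truth, coords, radius_km=100.0):
    """Return (coverage, accuracy): share of users with a prediction, and share
    of those covered users placed within radius_km of their true city."""
    covered = {u: c for u, c in predicted.items() if c is not None}
    coverage = len(covered) / len(truth)
    hits = sum(haversine_km(coords[c], coords[truth[u]]) <= radius_km
               for u, c in covered.items())
    accuracy = hits / len(covered) if covered else 0.0
    return coverage, accuracy

# Hypothetical toy data: approximate city coordinates, hand-picked weights.
COORDS = {
    "Belo Horizonte": (-19.92, -43.94),
    "Sao Paulo": (-23.55, -46.63),
    "Rio de Janeiro": (-22.91, -43.17),
}
WEIGHTS = {"network": 0.5, "text": 0.3, "profile": 0.25}
TRUTH = {"u1": "Belo Horizonte", "u2": "Sao Paulo",
         "u3": "Rio de Janeiro", "u4": "Sao Paulo"}
METHOD_OUTPUTS = {
    "u1": {"network": "Belo Horizonte", "text": "Sao Paulo", "profile": None},
    "u2": {"network": "Rio de Janeiro", "text": "Sao Paulo", "profile": "Sao Paulo"},
    "u3": {"network": None, "text": "Belo Horizonte", "profile": None},
    "u4": {"network": None, "text": None, "profile": None},
}

combined = {u: weighted_vote(p, WEIGHTS) for u, p in METHOD_OUTPUTS.items()}
coverage, accuracy = evaluate(combined, TRUTH, COORDS)
print(f"coverage={coverage:.2f}, accuracy within 100 km={accuracy:.2f}")
```

In this toy setup, user u2 shows why combining can help: the two lower-weighted methods agree and together outvote the single higher-weighted one. User u3 is covered only because one method made a prediction, which is how combining methods that cover different users raises coverage.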

Metadata
Title
Strategies for combining Twitter users geo-location methods
Authors
Silvio Ribeiro Jr
Gisele L. Pappa
Publication date
15 March 2017
Publisher
Springer US
Published in
GeoInformatica, Issue 3/2018
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-017-0296-z
