Published in: GeoInformatica 4/2016

01.10.2016

Automatic targeted-domain spatiotemporal event detection in twitter

Authors: Ting Hua, Feng Chen, Liang Zhao, Chang-Tien Lu, Naren Ramakrishnan

Abstract

Twitter has become an important data source for detecting events, especially for tracking detailed information about events in a specific domain. Previous studies on targeted-domain Twitter information extraction have used supervised learning techniques to identify domain-related tweets; however, the need for extensive manual labeling makes these supervised systems extremely expensive to build and maintain. Moreover, most existing work fails to consider spatiotemporal factors, which are essential attributes of targeted-domain events. In this paper, we propose a semi-supervised method for Automatic Targeted-domain Spatiotemporal Event Detection (ATSED) in Twitter. Given a targeted domain, ATSED first learns tweet labels from historical data and then detects ongoing events from real-time Twitter data streams. Specifically, an efficient label generation algorithm is proposed to automatically recognize tweet labels from domain-related news articles, a customized classifier is created for Twitter data analysis by utilizing tweets' distinguishing features, and a novel multinomial spatial-scan model is provided to identify the geographical locations of detected events. Experiments on 305 million tweets demonstrate the effectiveness of this new approach.
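The abstract does not spell out the multinomial spatial-scan model, so the following is only a minimal sketch of the general idea behind a multinomial scan statistic, not the authors' method. It scores each candidate region by a generalized log-likelihood ratio that compares the category mix of tweets inside the region with the mix outside it. The grid cells, the counts, and the function names `xlogy`, `multinomial_llr`, and `best_region` are all hypothetical, and candidate regions are formed here by brute-force enumeration that ignores geographic adjacency.

```python
import math
from itertools import combinations


def xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def multinomial_llr(inside, total):
    """Log-likelihood ratio for 'category mix inside differs from outside'.

    inside: per-category tweet counts within the candidate region.
    total:  per-category tweet counts over the whole study area.
    """
    n_in, n_all = sum(inside), sum(total)
    n_out = n_all - n_in
    if n_in == 0 or n_out == 0:
        return 0.0  # region is empty or covers everything: nothing to compare
    llr = 0.0
    for c_in, c_all in zip(inside, total):
        c_out = c_all - c_in
        llr += xlogy(c_in, c_in / n_in)      # alternative: inside proportions
        llr += xlogy(c_out, c_out / n_out)   # alternative: outside proportions
        llr -= xlogy(c_all, c_all / n_all)   # null: one shared set of proportions
    return llr


def best_region(cells, max_size=2):
    """Scan all groups of up to max_size cells and return (region, llr).

    Assumption: adjacency is ignored; any group of cells is a candidate.
    """
    total = [sum(col) for col in zip(*cells.values())]
    best = (None, 0.0)
    for size in range(1, max_size + 1):
        for region in combinations(sorted(cells), size):
            inside = [sum(col) for col in zip(*(cells[c] for c in region))]
            score = multinomial_llr(inside, total)
            if score > best[1]:
                best = (region, score)
    return best


# Hypothetical counts per grid cell: (domain-related tweets, other tweets).
cells = {"A": (50, 50), "B": (5, 95), "C": (4, 96), "D": (6, 94)}
region, score = best_region(cells)
```

With these made-up counts, cell `A` has a far higher share of domain-related tweets than the rest, so the scan selects it. The statistic is two-sided: a region with an unusually low share would also score highly, so a practical detector would likely add a direction check and a significance test (for example, Monte Carlo replication) before reporting an event.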

Metadata
Title
Automatic targeted-domain spatiotemporal event detection in twitter
Authors
Ting Hua
Feng Chen
Liang Zhao
Chang-Tien Lu
Naren Ramakrishnan
Publication date
01.10.2016
Publisher
Springer US
Published in
GeoInformatica / Issue 4/2016
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-016-0263-0
