Published in: GeoInformatica 4/2016

01-10-2016

Automatic targeted-domain spatiotemporal event detection in twitter

Authors: Ting Hua, Feng Chen, Liang Zhao, Chang-Tien Lu, Naren Ramakrishnan

Abstract

Twitter has become an important data source for detecting events, especially for tracking detailed information about events in a specific domain. Previous studies on targeted-domain Twitter information extraction have used supervised learning techniques to identify domain-related tweets; however, the need for extensive manual labeling makes these supervised systems extremely expensive to build and maintain. Moreover, most existing work fails to consider spatiotemporal factors, which are essential attributes of targeted-domain events. In this paper, we propose a semi-supervised method for Automatic Targeted-domain Spatiotemporal Event Detection (ATSED) in Twitter. Given a targeted domain, ATSED first learns tweet labels from historical data and then detects ongoing events from real-time Twitter data streams. Specifically, an efficient label generation algorithm is proposed to automatically recognize tweet labels from domain-related news articles, a customized classifier is created for Twitter data analysis by utilizing tweets’ distinguishing features, and a novel multinomial spatial-scan model is provided to identify the geographical locations of detected events. Experiments on 305 million tweets demonstrate the effectiveness of this new approach.
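
The abstract does not spell out the multinomial spatial-scan model, so the following is only a minimal sketch of the generic multinomial scan statistic from the spatial-scan literature, not the ATSED model itself. It assumes that tweets have already been classified into categories (here: event-related vs. background) and aggregated into grid cells; the cell names, candidate regions, and the functions `multinomial_llr` and `best_region` are hypothetical and chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def _term(count: float, total: float) -> float:
    """count * log(count / total), using the convention 0 * log(0) = 0."""
    return count * math.log(count / total) if count > 0 else 0.0


def multinomial_llr(inside: Sequence[int], overall: Sequence[int]) -> float:
    """Log-likelihood ratio of a generic multinomial scan statistic.

    inside[k]  : number of category-k tweets inside the candidate region
    overall[k] : number of category-k tweets in the whole study area
    The alternative hypothesis allows different category distributions
    inside and outside the region; the null uses one shared distribution.
    """
    n_in = sum(inside)
    n_all = sum(overall)
    n_out = n_all - n_in
    if n_in == 0 or n_out == 0:
        return 0.0  # region is empty or covers everything: nothing to compare
    alt = sum(_term(c, n_in) + _term(C - c, n_out)
              for c, C in zip(inside, overall))
    null = sum(_term(C, n_all) for C in overall)
    return alt - null


def best_region(cell_counts: Dict[str, Sequence[int]],
                regions: Dict[str, List[str]]) -> Tuple[str, float]:
    """Score every candidate region and return the highest-scoring one."""
    n_cat = len(next(iter(cell_counts.values())))
    overall = [sum(v[k] for v in cell_counts.values()) for k in range(n_cat)]
    scored = {}
    for name, cells in regions.items():
        inside = [sum(cell_counts[c][k] for c in cells) for k in range(n_cat)]
        scored[name] = multinomial_llr(inside, overall)
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical per-cell counts: [event-related tweets, background tweets].
cells = {"A": [9, 1], "B": [1, 9], "C": [1, 9]}
# Hypothetical candidate regions, each a list of grid cells.
candidates = {"A": ["A"], "B": ["B"], "A+B": ["A", "B"]}
region, score = best_region(cells, candidates)
print(region, round(score, 3))
```

In this toy input, cell A has a category mix that differs sharply from the rest of the area, so it receives the highest score. The statistic responds to any difference in category distribution, not only to an excess of event-related tweets, and a practical system would normally assess significance with a randomization test, which is omitted here.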

Metadata
Title
Automatic targeted-domain spatiotemporal event detection in twitter
Authors
Ting Hua
Feng Chen
Liang Zhao
Chang-Tien Lu
Naren Ramakrishnan
Publication date
01-10-2016
Publisher
Springer US
Published in
GeoInformatica / Issue 4/2016
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-016-0263-0
