
22 February 2018

Weakly Supervised and Online Learning of Word Models for Classification to Detect Disaster Reporting Tweets

Authors: Girish Keshav Palshikar, Manoj Apte, Deepak Pandita

Published in: Information Systems Frontiers | Issue 5/2018


Abstract

Social media has quickly established itself as an important means by which people, NGOs and governments spread information during natural or man-made disasters, mass emergencies and crisis situations. Given this role, real-time analysis of social media content to locate, organize and use valuable information for disaster management is crucial. In this paper, we propose self-learning algorithms that, with minimal supervision, construct a simple bag-of-words model of the information expressed in news about various natural disasters. Such a model is human-understandable, human-modifiable and usable in a real-time scenario. Since tweets are a different category of document from news, we next propose a model transfer algorithm, which refines the model learned from news by analyzing a large unlabeled corpus of tweets. We show empirically that model transfer improves the predictive accuracy of the model and that our model learning algorithm outperforms several state-of-the-art semi-supervised learning algorithms. Finally, we present an online algorithm that learns weights for the words in the model, and we demonstrate the efficacy of the model with word weights.
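To make the idea of a weighted bag-of-words model concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details are not given in the abstract. It assumes a hand-picked seed vocabulary, a fixed decision threshold, and a perceptron-style update as a stand-in for online weight learning. The class name `WeightedWordModel`, its parameters, and the example tweets are all hypothetical.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


class WeightedWordModel:
    """Bag-of-words scorer with perceptron-style online weight updates.

    Illustrative assumption only: the paper's actual update rule is not
    described in the abstract.
    """

    def __init__(self, seed_words, threshold=1.0, learning_rate=0.5):
        # Every seed word starts with weight 1.0; other words count as 0.0.
        self.weights = defaultdict(float, {w: 1.0 for w in seed_words})
        self.threshold = threshold
        self.learning_rate = learning_rate

    def score(self, text):
        # Sum the weights of the distinct words that occur in the text.
        return sum(self.weights.get(w, 0.0) for w in set(tokenize(text)))

    def predict(self, text):
        # True means "disaster-reporting" under this toy model.
        return self.score(text) >= self.threshold

    def update(self, text, label):
        # Change weights only when the current prediction is wrong.
        if self.predict(text) == label:
            return
        step = self.learning_rate if label else -self.learning_rate
        for w in set(tokenize(text)):
            self.weights[w] += step


if __name__ == "__main__":
    model = WeightedWordModel({"earthquake", "flood", "cyclone"})
    print(model.predict("Massive flood hits the coastal town"))
    # One labeled example arrives: a false positive for the seed word "flood".
    model.update("flood of emails in my inbox today", False)
    print(model.predict("flood of emails in my inbox today"))
```

Because the model is just a dictionary from words to weights, a person can inspect it and edit it directly, which is the property the abstract highlights. A single mistake-driven update can also lower the weight of a useful word such as "flood", so a practical system would need many labeled or pseudo-labeled examples.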



Title: Weakly Supervised and Online Learning of Word Models for Classification to Detect Disaster Reporting Tweets
Authors: Girish Keshav Palshikar, Manoj Apte, Deepak Pandita
Publication date: 22 February 2018
Publisher: Springer US
Published in: Information Systems Frontiers, Issue 5/2018
Print ISSN: 1387-3326
Electronic ISSN: 1572-9419
DOI: https://doi.org/10.1007/s10796-018-9830-2
