Abstract
Cities are composed of complex systems with physical, cyber, and social components. Existing work on extracting and understanding city events relies mainly on technology-enabled infrastructure to observe and record events. In this work, we propose an approach that leverages citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel process for creating training data for sequence labeling models. This automatic process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare the automated annotation process to a state-of-the-art tool that requires manually created training data and show that it achieves comparable performance on annotation tasks. We then present an aggregation algorithm for extracting events from annotated text. We carry out a comprehensive evaluation of event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over 4 months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.
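To make the idea of automatically created training data more concrete, the following is a minimal sketch of dictionary-based annotation, not the authors' implementation. It assumes a BIO tagging scheme, and the phrase lists, the label names (`LOCATION`, `EVENT`), and the function `bio_annotate` are all hypothetical and introduced only for illustration. Tokens that match a known location or event term receive a label, and everything else is tagged `O`.

```python
from typing import Dict, List, Tuple

# Hypothetical instance-level domain knowledge (assumption for illustration):
# lowercase token tuples for city locations and possible event terms.
KNOWLEDGE: Dict[str, List[Tuple[str, ...]]] = {
    "LOCATION": [("bay", "bridge"), ("market", "street")],
    "EVENT": [("accident",), ("road", "closure")],
}


def bio_annotate(
    tokens: List[str],
    phrases_by_label: Dict[str, List[Tuple[str, ...]]],
) -> List[Tuple[str, str]]:
    """Tag tokens with BIO labels by matching them against known phrases."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]

    # Try longer phrases first so a multi-token match is not split
    # by a shorter phrase that overlaps it.
    candidates = sorted(
        (
            (phrase, label)
            for label, phrases in phrases_by_label.items()
            for phrase in phrases
        ),
        key=lambda item: len(item[0]),
        reverse=True,
    )

    for phrase, label in candidates:
        n = len(phrase)
        for i in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if window_free and tuple(lowered[i:i + n]) == phrase:
                tags[i] = "B-" + label          # first token of the match
                for j in range(i + 1, i + n):   # remaining tokens of the match
                    tags[j] = "I-" + label

    return list(zip(tokens, tags))


if __name__ == "__main__":
    tweet = "Accident on Bay Bridge causing delays".split()
    for token, tag in bio_annotate(tweet, KNOWLEDGE):
        print(f"{token}\t{tag}")
```

Token and tag pairs produced this way could serve as training input for a sequence labeling model such as a conditional random field. The naive window scan is used here for readability. A multi-pattern string matching algorithm would scale better to large dictionaries.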
Index Terms
- Extracting City Traffic Events from Social Streams