research-article

Extracting City Traffic Events from Social Streams

Published: 10 July 2015

Abstract

Cities are composed of complex systems with physical, cyber, and social components. Current work on extracting and understanding city events relies mainly on technology-enabled infrastructure to observe and record events. In this work, we propose an approach that leverages citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel process for creating training data for sequence labeling models; this automatic process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over 4 months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.
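
To make the idea of dictionary-driven annotation concrete, the following is a minimal sketch of how instance-level domain knowledge could be turned into sequence labels. It is not the authors' implementation: the BIO tag scheme, the function name `bio_annotate`, the label names `LOCATION` and `EVENT`, and the tiny gazetteers are all assumptions chosen for illustration. Each token receives a `B-`, `I-`, or `O` tag according to the longest matching gazetteer phrase that starts at that token.

```python
from typing import Dict, List, Tuple

# Hypothetical gazetteers standing in for "instance-level domain knowledge".
# Phrases are stored as lowercase token tuples; the entries are illustrative only.
GAZETTEERS: Dict[str, List[Tuple[str, ...]]] = {
    "LOCATION": [("bay", "bridge"), ("highway", "101")],
    "EVENT": [("accident",), ("road", "closure")],
}


def bio_annotate(
    tokens: List[str],
    gazetteers: Dict[str, List[Tuple[str, ...]]],
) -> List[Tuple[str, str]]:
    """Assign BIO tags by longest-match lookup against the gazetteers."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        best_len, best_label = 0, None
        # Find the longest phrase (from any gazetteer) that starts at position i.
        for label, phrases in gazetteers.items():
            for phrase in phrases:
                n = len(phrase)
                if n > best_len and tuple(lowered[i:i + n]) == phrase:
                    best_len, best_label = n, label
        if best_label is None:
            i += 1  # no phrase starts here; the token keeps its "O" tag
            continue
        tags[i] = "B-" + best_label
        for j in range(i + 1, i + best_len):
            tags[j] = "I-" + best_label
        i += best_len  # skip past the matched phrase
    return list(zip(tokens, tags))


if __name__ == "__main__":
    for token, tag in bio_annotate(
        "Accident on Bay Bridge causing delays".split(), GAZETTEERS
    ):
        print(f"{token}\t{tag}")
```

Token and tag pairs produced this way could serve as training input for a sequence labeling model without manual annotation. A real system would also need proper tweet tokenization and handling of spelling variants, which this sketch omits.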



    • Published in

      ACM Transactions on Intelligent Systems and Technology, Volume 6, Issue 4
      Regular Papers and Special Section on Intelligent Healthcare Informatics
      August 2015
      419 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/2801030
      • Editor: Yu Zheng

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 10 July 2015
      • Accepted: 1 January 2015
      • Revised: 1 November 2014
      • Received: 1 April 2014


      Qualifiers

      • research-article
      • Research
      • Refereed
