2017 | Book

Seeing Cities Through Big Data

Research, Methods and Applications in Urban Informatics

Edited by: Piyushimita (Vonu) Thakuriah, Nebiyou Tilahun, Moira Zellner

Publisher: Springer International Publishing

Book series: Springer Geography

About this book

This book introduces the latest thinking on the use of Big Data in the context of urban systems, including research and insights on human behavior, urban dynamics, resource use, sustainability and spatial disparities. In these areas Big Data promises improved planning, management and governance across urban sectors (e.g., transportation, energy, smart cities, crime, housing, urban and regional economies, public health, public engagement, urban governance and political systems). The book also covers Big Data’s utility in decision-making and in the development of indicators to monitor economic and social activity, urban sustainability, transparency, livability, social inclusion, place-making, accessibility and resilience.

Table of contents

Frontmatter
Introduction to Seeing Cities Through Big Data: Research, Methods and Applications in Urban Informatics
Abstract
The chapters in this book were first presented in a 2-day workshop on Big Data and Urban Informatics held at the University of Illinois at Chicago in 2014. The workshop, sponsored by the National Science Foundation, brought together approximately 150 educators, practitioners and students from 91 different institutions in 11 countries. Participants represented a variety of academic disciplines including Urban Planning, Computer Science, Civil Engineering, Economics, Statistics, and Geography and provided a unique opportunity for discussions by urban social scientists and data scientists interested in the use of Big Data to address urban challenges. The papers in this volume are a selected subset of those presented at the workshop and have gone through a peer-review process.
Piyushimita (Vonu) Thakuriah, Nebiyou Y. Tilahun, Moira Zellner
Big Data and Urban Informatics: Innovations and Challenges to Urban Planning and Knowledge Discovery
Abstract
Big Data is the term being used to describe a wide spectrum of observational or “naturally-occurring” data generated through transactional, operational, planning and social activities that are not specifically designed for research. Due to the structure and access conditions associated with such data, their use for research and analysis becomes significantly complicated. New sources of Big Data are rapidly emerging as a result of technological, institutional, social, and business innovations. The objective of this background paper is to describe emerging sources of Big Data, their use in urban research, and the challenges that arise with their use. To a certain extent, Big Data in the urban context has become narrowly associated with sensor (e.g., Internet of Things) or socially generated (e.g., social media or citizen science) data. However, there are many other sources of observational data that are meaningful to different groups of urban researchers and user communities. Examples include privately held transactions data, confidential administrative micro-data, data from arts and humanities collections, and hybrid data consisting of synthetic or linked data.
The emerging area of Urban Informatics focuses on the exploration and understanding of urban systems by leveraging novel sources of data. The major potential of Urban Informatics research and applications is in four areas: (1) improved strategies for dynamic urban resource management, (2) theoretical insights and knowledge discovery of urban patterns and processes, (3) strategies for urban engagement and civic participation, and (4) innovations in urban management, and planning and policy analysis. Urban Informatics utilizes Big Data in innovative ways by retrofitting or repurposing existing urban models and simulations that are underpinned by a wide range of theoretical traditions, as well as through data-driven modeling approaches that are largely theory agnostic, although these divergent research approaches are starting to converge in some ways. The paper surveys the kinds of urban problems being considered by going from a data-poor environment to a data-rich world and the ways in which such enquiries have the potential to enhance our understanding, not only of urban systems and processes overall, but also contextual peculiarities and local experiences. The paper concludes by commenting on challenges that are likely to arise in varying degrees when using Big Data for Urban Informatics: technological, methodological, theoretical/epistemological, and the emerging political economy of Big Data.
Piyushimita (Vonu) Thakuriah, Nebiyou Y. Tilahun, Moira Zellner
Erratum to: Planning for the Change: Mapping Sea Level Rise and Storm Inundation in Sherman Island Using 3Di Hydrodynamic Model and LiDAR
Yang Ju, Wei-Chen Hsu, John D. Radke, William Fourt, Wei Lang, Olivier Hoes, Howard Foster, Gregory S. Biging, Martine Schmidt-Poolman, Rosanna Neuhausler, Amna Alruheil, William Maier

Analytics of User-Generated Content

Frontmatter
Using User-Generated Content to Understand Cities
Abstract
Understanding urban dynamics is crucial for a number of domains, but it can be expensive and time consuming to gather necessary data. The rapid rise of social media has given us a new and massive source of geotagged data that can be transformative in terms of how we understand our cities. In this position paper, we describe three opportunities in using geotagged social media data: to help city planners, to help small businesses, and to help individuals adapt to their city better. We also sketch some possible research projects to help map out the design space, as well as discuss some limitations and challenges in using this kind of data.
Dan Tasse, Jason I. Hong
Developing an Interactive Mobile Volunteered Geographic Information Platform to Integrate Environmental Big Data and Citizen Science in Urban Management
Abstract
A significant technical gap exists between the large amount of complex scientific environmental big data and the limited accessibility of these datasets. Mobile platforms are becoming increasingly important channels through which citizens can receive and report information. Mobile devices can be used to report Volunteered Geographic Information (VGI), which can be useful data in environmental management. This paper evaluates the strengths, weaknesses, opportunities, and threats of selected real cases: “Field Photo,” “CoCoRaHS,” “OakMapper,” “What’s Invasive!”, “Leafsnap,” “U.S. Green Infrastructure Reporter”, and “Nebraska Wetlands”. Based on these case studies, the results indicate that active, loyal and committed users are key to ensuring the success of citizen science projects. Online and off-line activities should be integrated to promote the effectiveness of public engagement in environmental management. There is an urgent need to bring complex environmental big data to citizens’ everyday mobile devices, which will then allow them to participate in urban environmental management. A technology framework is provided to improve existing mobile-based environmental engagement initiatives.
Zhenghong Tang, Yanfu Zhou, Hongfeng Yu, Yue Gu, Tiantian Liu
CyberGIS-Enabled Urban Sensing from Volunteered Citizen Participation Using Mobile Devices
Abstract
Environmental pollution has a significant impact on citizens’ health and wellbeing in urban settings. While a variety of sensors have been integrated into today’s urban environments for measuring pollution factors such as air quality and noise, setting up sensor networks or employing surveyors to collect urban pollution datasets remains costly and may have legal implications. An alternative approach is based on the notion of volunteered citizens as sensors for collecting, updating and disseminating urban environmental measurements using mobile devices. A Big Data scenario emerges as large-scale crowdsourcing activities tend to generate sizable and unstructured datasets with near real-time updates. Conventional computational infrastructures are inadequate for handling such Big Data, for example, designing a “one-fits-all” database schema to accommodate diverse measurements, or dynamically generating pollution maps based on visual analytical workflows.
This paper describes a CyberGIS-enabled urban sensing framework to facilitate the volunteered participation of citizens in sensing environmental pollution using mobile devices. Since CyberGIS is based on advanced cyberinfrastructure and characterized as high performance, distributed, and collaborative GIS, the framework enables interactive visual analytics for big urban data. Specifically, this framework integrates a MongoDB cluster for data management (without requiring a predefined schema), a MapReduce approach to extracting and aggregating sensor measurements, and a scalable kernel smoothing algorithm using a graphics processing unit (GPU) for rapid pollution map generation. We demonstrate the functionality of this framework through a use case scenario of mapping noise levels, where an implemented mobile application is used for capturing geo-tagged and time-stamped noise level measurements as engaged users move around in urban settings.
Junjun Yin, Yizhao Gao, Shaowen Wang

Challenges and Opportunities of Urban Big Data

Frontmatter
The Potential for Big Data to Improve Neighborhood-Level Census Data
Abstract
The promise of “big data” for those who study cities is that it offers new ways of understanding urban environments and processes. Big data exists within broader national data economies; these data economies have changed in ways that are both poorly understood by the average data consumer and of significant consequence for the application of data to urban problems. For example, high resolution demographic and economic data from the United States Census Bureau since 2010 has declined by some key measures of data quality. For some policy-relevant variables, like the number of children under 5 in poverty, the estimates are almost unusable. Of the 56,204 census tracts for which a childhood poverty estimate was available, 40,941 (72.8 % of tracts) had a margin of error greater than the estimate in the 2007–2011 American Community Survey (ACS). For example, the ACS indicates that Census Tract 196 in Brooklyn, NY has 169 children under 5 in poverty ±174 children, suggesting somewhere between 0 and 343 children in the area live in poverty. While big data is exciting and novel, basic questions about American cities are all but unanswerable in the current data economy. Here we highlight the potential for data fusion strategies, leveraging novel forms of big data and traditional federal surveys, to develop useable data that allows effective understanding of intra-urban demographic and economic patterns. This paper outlines the methods used to construct neighborhood-level census data and suggests key points of technical intervention where “big” data might be used to improve the quality of neighborhood-level statistics.
Seth E. Spielman
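The figures quoted in the abstract above can be checked with a little arithmetic. The sketch below recomputes the 72.8 % share and the implied range for Census Tract 196. The function name `moe_interval` is hypothetical, and clipping the lower bound at zero is an assumption made here because a count of children cannot be negative.

```python
def moe_interval(estimate, margin_of_error):
    """Return (low, high) for a count estimate, clipping the low end at zero."""
    low = max(0, estimate - margin_of_error)
    high = estimate + margin_of_error
    return low, high


# Figures taken from the abstract (2007-2011 ACS).
tracts_with_estimate = 56_204
tracts_moe_exceeds_estimate = 40_941

share = tracts_moe_exceeds_estimate / tracts_with_estimate
print(f"{share:.1%} of tracts have a margin of error larger than the estimate")

low, high = moe_interval(169, 174)
print(f"Tract 196: between {low} and {high} children under 5 in poverty")
```

When the margin of error exceeds the estimate, the lower bound collapses to zero, which is why such estimates say little about whether any children in the tract live in poverty.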
Big Data and Survey Research: Supplement or Substitute?
Abstract
The increasing availability of organic Big Data has prompted questions regarding its usefulness as an auxiliary data source that can enhance the value of design-based survey data, or possibly serve as a replacement for it. Big Data’s potential value as a substitute for survey data is largely driven by recognition of the potential cost savings associated with a transition from reliance on expensive and often slow-to-complete survey data collection to reliance on far less-costly and readily available Big Data sources. There may be, of course, serious methodological costs of doing so. We review and compare the advantages and disadvantages of survey-based vs. Big Data-based methodologies, concluding that each data source has unique qualities and that future efforts to find ways of integrating data obtained from varying sources, including Big Data and survey research, are most likely to be fruitful.
Timothy P. Johnson, Tom W. Smith
Big Spatio-Temporal Network Data Analytics for Smart Cities: Research Needs
Abstract
Increasingly, location-aware sensors in urban transportation networks are generating a wide variety of data which has spatio-temporal network semantics. Examples include temporally detailed roadmaps, GPS tracks, traffic signal timings, and vehicle measurements. These datasets, which we collectively call Big Spatio-Temporal Network (BSTN) Data, have value addition potential for several smart-city use-cases including navigation services which recommend eco-friendly routes. However, BSTN data pose significant computational challenges regarding the assumptions of the current state-of-the-art analytic-techniques used in these services. This article attempts to put forth some potential research directions towards addressing the challenges of scalable analytics on BSTN data. Two kinds of BSTN data are considered here, viz., the vehicle measurement big data and the travel-time big data.
Venkata M. V. Gunturi, Shashi Shekhar
A Review of Heteroscedasticity Treatment with Gaussian Processes and Quantile Regression Meta-models
Abstract
For regression problems, the general practice is to consider a constant variance of the error term across all data. This aims to simplify an often complicated model and relies on the assumption that this error is independent of the input variables. This property is known as homoscedasticity. On the other hand, in the real world, this is often a naive assumption, as we are rarely able to exhaustively include all true explanatory variables for a regression. While Big Data is bringing new opportunities for regression applications, ignoring this limitation may lead to biased estimators and inaccurate confidence and prediction intervals.
This paper aims to study the treatment of non-constant variance in regression models, also known as heteroscedasticity. We apply two methodologies: (1) integrating the conditional variance within the regression model itself, and (2) treating the regression model as a black box and using a meta-model that analyzes the error separately. We compare the performance of both approaches using two heteroscedastic data sets.
Although accounting for heteroscedasticity in data increases the complexity of the models used, we show that it can greatly improve the quality of the predictions, and more importantly, it can provide a proper notion of uncertainty or “confidence” associated with those predictions. We also discuss the feasibility of the solutions in a Big Data context.
Francisco Antunes, Aidan O’Sullivan, Filipe Rodrigues, Francisco Pereira
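The black-box-plus-meta-model idea described in the abstract above can be illustrated with a toy example. The sketch below is not the authors' method: it swaps in plain least squares for both the regression and the meta-model (rather than Gaussian processes or quantile regression), and the synthetic data, function names and parameters are all made up for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(0)
# Synthetic (made-up) data: the noise standard deviation grows with x.
xs = [random.uniform(1.0, 10.0) for _ in range(500)]
ys = [2.0 * x + random.gauss(0.0, 0.5 * x) for x in xs]

# Step 1: the "black box" regression, fitted as if the variance were constant.
slope, intercept = fit_line(xs, ys)

# Step 2: a meta-model fitted to the absolute residuals of the black box.
abs_residuals = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
spread_slope, spread_intercept = fit_line(xs, abs_residuals)


def predict_with_spread(x):
    """Point prediction plus an input-dependent typical absolute error."""
    return slope * x + intercept, spread_slope * x + spread_intercept


for x in (2.0, 9.0):
    prediction, spread = predict_with_spread(x)
    print(f"x={x}: prediction={prediction:.2f}, typical |error|={spread:.2f}")
```

The meta-model leaves the point predictions unchanged. It only adds an error estimate that varies with the input, which a constant-variance model cannot provide.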

Changing Organizational and Educational Perspectives with Urban Big Data

Frontmatter
Urban Informatics: Critical Data and Technology Considerations
Abstract
Cities around the world are investing significant resources toward making themselves smarter. In most cases, investments focus on leveraging data through emerging technologies that enable more real-time, automated, predictive, and intelligent decision-making by agents (humans) and objects (devices) within the city. Increasing the connectivity between the various systems and sub-systems of the city through integrative data and information management is also a critical undertaking towards making cities more intelligent. In this chapter, we frame cities as platforms. Specifically, we focus on how data and technology management is critical to the functioning of a city as an agile, adaptable, and scalable platform. The objective of this chapter is to raise your awareness of critical data and technology considerations that still need to be addressed if we are to realize the full potential of urban informatics.
Rashmi Krishnamurthy, Kendra L. Smith, Kevin C. Desouza
Digital Infomediaries and Civic Hacking in Emerging Urban Data Initiatives
Abstract
This paper assesses non-traditional urban digital infomediaries who are pushing the agenda of urban Big Data and Open Data. Our analysis identified a mix of private, public, non-profit and informal infomediaries, ranging from very large organizations to independent developers. Using a mixed-methods approach, we identified four major groups of organizations within this dynamic and diverse sector: general-purpose ICT providers, urban information service providers, open and civic data infomediaries, and independent and open source developers. A total of nine types of organizations are identified within these four groups.
We align these nine organizational types along five dimensions that account for their mission and major interests, products and services, as well as the activities they undertake: techno-managerial, scientific, business and commercial, urban engagement, and openness and transparency. We discuss urban ICT entrepreneurs, and the role of informal networks involving independent developers, data scientists and civic hackers in a domain that historically involved professionals in the urban planning and public management domains.
Additionally, we examine convergence in the sector by analyzing overlaps in their activities, as determined by a text mining exercise of organizational webpages. We also consider increasing similarities in products and services offered by the infomediaries, while highlighting ideological tensions that might arise given the overall complexity of the sector, and differences in the backgrounds and end-goals of the participants involved. There is much room for creation of knowledge and value networks in the urban data sector and for improved cross-fertilization among bodies of knowledge.
Piyushimita (Vonu) Thakuriah, Lise Dirks, Yaye Mallon Keita
How Should Urban Planners Be Trained to Handle Big Data?
Abstract
Historically, urban planners have been educated and trained to work in a data-poor environment. Urban planning students take courses in statistics, survey research, and projection and estimation that are designed to fill in the gaps in this environment. For decades they have learned how to use census data, which is comprehensive on several basic variables but is only collected once per decade and so is almost always out of date. More detailed population characteristics are based on a sample and are only available in aggregated form for larger geographic areas.
But new data sources, including distributed sensors, infrastructure monitoring, remote sensing, social media and cell phone tracking records, can provide much more detailed, individual, real time data at disaggregated levels that can be used at a variety of scales. We have entered a data rich environment, where we can have data on systems and behaviors for more frequent time increments and with a greater number of observations on a greater number of factors (The Age of Big Data, The New York Times, 2012; Now you see it: simple visualization techniques for quantitative analysis, Berkeley, 2009). Planners are still being trained in methods that are suitable for a data poor environment (J Plan Educ Res 6:10–21, 1986; Analytics over large-scale multidimensional data: the big data revolution!, 101–104, 2011; J Plan Educ Res 15:17–33, 1995). In this paper we suggest that visualization, simulation, data mining and machine learning are the appropriate tools to use in this new environment and we discuss how planning education can adapt to this new data rich landscape. We will discuss how these methods can be integrated into the planning curriculum as well as planning practice.
Steven P. French, Camille Barchers, Wenwen Zhang
Energy Planning in a Big Data Era: A Theme Study of the Residential Sector
Abstract
With a focus on planning for urban energy demand, this chapter re-conceptualizes the general planning process in the big data era based on the improvements that non-linear modeling approaches provide over mainstream traditional linear approaches. First, it demonstrates the challenges that conventional linear methodologies face in modeling the complexities of residential energy demand. Suggesting a non-linear modeling schema for analyzing household energy demand, the paper develops its discussion around the repercussions of using non-linear modeling in energy policy and planning. Planners and policy-makers are not often equipped with the tools needed to translate complex scientific outcomes into policies. To fill this gap, this chapter proposes modifications to the traditional planning process that will enable planning to benefit from the abundance of data and advances in analytical methodologies in the big data era. The conclusion section introduces short-term implications of the proposed process for energy planning (and planning, in general) in the big data era around three topics: tool development, data infrastructures, and planning education.
Hossein Estiri

Urban Data Management

Frontmatter
Using an Online Spatial Analytics Workbench for Understanding Housing Affordability in Sydney
Abstract
In 2007 the world’s population became more urban than rural, and, according to the United Nations, this trend is set to continue for the foreseeable future. As more people move to urban localities, predominantly cities, additional pressures on services, infrastructure and housing are affecting the overall quality of life of city dwellers. City planners, policy makers and researchers more generally need access to tools and to diverse and distributed data sets to help tackle these challenges.
In this paper we focus on the online analytical AURIN (Australian Urban Research Infrastructure Network) workbench, which provides a data driven approach for informing such issues. The workbench provides machine to machine (programmatic) online access to large scale distributed and heterogeneous data resources from the definitive data providers across Australia. This includes a rich repository of data which can be used to understand housing affordability in Australia. For example there is more than 20 years of longitudinal housing data nationwide, with information on each housing sales transaction at the property level. For the first time researchers can now systematically access this ‘big’ housing data resource to run spatial-statistical analysis to understand the driving forces behind a myriad of issues facing cities, including housing affordability which is a significant issue across many of Australia’s cities.
Christopher Pettit, Andrew Tice, Bill Randolph
A Big Data Mashing Tool for Measuring Transit System Performance
Abstract
This research aims to develop software tools to support the fusion and analysis of large, passively collected data sources for the purpose of measuring and monitoring transit system performance. This study uses San Francisco as a case study, taking advantage of the automated vehicle location (AVL) and automated passenger count (APC) data available on the city transit system. Because the AVL-APC data are only available on a sample of buses, a method is developed to expand the data to be representative of the transit system as a whole. In the expansion process, the General Transit Feed Specification (GTFS) data are used as a measure of the full set of scheduled transit service.
The data mashing tool reports and tracks transit system performance in these key dimensions:
  • Service Provided: vehicle trips, service miles;
  • Ridership: boardings, passenger miles, passenger hours, wheelchairs served, bicycles served;
  • Level-of-service: speed, dwell time, headway, fare, waiting time;
  • Reliability: on-time performance, average delay; and
  • Crowding: volume-capacity ratio, vehicles over 85 % of capacity, passenger hours over 85 % of capacity.
An important characteristic of this study is that it provides a tool for analyzing the trends over significant time periods—from 2009 through the present. The tool allows data for any two time periods to be queried and compared at the analyst’s request, and puts the focus specifically on the changes that occur in the system, and not just observing current conditions.
Gregory D. Erhardt, Oliver Lock, Elsa Arcaute, Michael Batty
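The crowding measures listed above reduce to simple ratios. The sketch below assumes a hypothetical record layout (`VehicleObservation`) and made-up passenger loads. It takes the volume-capacity ratio as passenger load divided by vehicle capacity and uses the 85 % threshold named in the abstract. It is not the tool's actual implementation.

```python
from dataclasses import dataclass

CROWDING_THRESHOLD = 0.85  # the 85 % of capacity threshold named in the abstract


@dataclass
class VehicleObservation:
    """Hypothetical record: one vehicle observed at its peak load point."""
    route: str
    passenger_load: int
    capacity: int


def volume_capacity_ratio(obs):
    """Passenger load divided by vehicle capacity."""
    return obs.passenger_load / obs.capacity


def share_over_threshold(observations, threshold=CROWDING_THRESHOLD):
    """Fraction of observed vehicles whose load exceeds the threshold."""
    crowded = sum(1 for o in observations if volume_capacity_ratio(o) > threshold)
    return crowded / len(observations)


# Made-up sample data for illustration only.
sample = [
    VehicleObservation("A", 30, 60),
    VehicleObservation("A", 54, 60),
    VehicleObservation("B", 45, 60),
    VehicleObservation("B", 58, 60),
]
print(f"{share_over_threshold(sample):.0%} of vehicles are over 85% of capacity")
```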
Developing a Comprehensive U.S. Transit Accessibility Database
Abstract
This paper discusses the development of a national public transit job accessibility evaluation framework, focusing on lessons learned, data source evaluation and selection, calculation methodology, and examples of accessibility evaluation results. The accessibility evaluation framework described here builds on methods developed in earlier projects, extended for use on a national scale and at the Census block level. Application on a national scale involves assembling and processing a comprehensive national database of public transit network topology and travel times. This database incorporates the computational advancement of calculating accessibility continuously for every minute within a departure time window of interest. This increases computational complexity, but provides a very robust representation of the interaction between transit service frequency and accessibility at multiple departure times.
Andrew Owen, David M. Levinson
Seeing Chinese Cities Through Big Data and Statistics
Abstract
China has historically been an agricultural nation. China’s urbanization rate was reported to be 18 % in 1978 when it began its economic reforms. It has now become the second largest economy in the world. Urbanization in China increased dramatically in support of this economic growth, tripling to 54 % by the end of 2013. At the same time, many major urban problems also surfaced, including environmental degradation, lack of affordable housing, and traffic congestion. Economic growth will continue to be China’s central policy in the foreseeable future. Chinese cities are seriously challenged to support continuing economic growth with a high quality of life for their residents, while addressing the existing big city diseases. The term “Smart City” began to appear globally around 2008. Embracing the concept allows China to downscale its previous national approach to a more manageable city level. By the end of 2013, China had designated at least 193 locations to be smart city test sites; a national urbanization plan followed in March 2014. The direction of urban development and major challenges are identified in this paper. Some of them are global in nature, and some are unique to China. The nation will undoubtedly continue to build its smarter cities in the coming years. The first integrated public information service platform was implemented for several test sites in 2013. It provides a one-stop center for millions of card-carrying residents to use a secure smart card to perform previously separate city functions and consolidate data collection. The pioneering system is a real work in progress and helps to lay the foundation for building urban informatics in China. This paper also discusses the evolving research needs and data limitations, observes a smart city in progress, and makes some comparisons with the U.S. and other nations.
Jeremy S. Wu, Rui Zhang

Urban Knowledge Discovery Applied to Different Urban Contexts

Frontmatter
Planning for the Change: Mapping Sea Level Rise and Storm Inundation in Sherman Island Using 3Di Hydrodynamic Model and LiDAR
Abstract
In California, one of the greatest concerns of global climate change is sea level rise (SLR) associated with extreme storm events. Several studies have statically mapped SLR and storm inundation, but their dynamics have been less studied. This study argues that it is important to conduct dynamic simulation with high resolution data, and employs a 3Di hydrodynamic model to simulate the inundation of Sherman Island, California. The big data, a high resolution digital surface model (DSM) from Light Detection and Ranging (LiDAR), was used to model the ground surface. The results include a series of simulated inundations, which show that when the sea level rises more than 1 m, there are major impacts on Sherman Island. In all, this study serves as a fine database for better planning, management, and governance to understand future scenarios.
Yang Ju, Wei-Chen Hsu, John D. Radke, William Fourt, Wei Lang, Olivier Hoes, Howard Foster, Gregory S. Biging, Martine Schmidt-Poolman, Rosanna Neuhausler, Amna Alruheil, William Maier
The Impact of Land-Use Variables on Free-Floating Carsharing Vehicle Rental Choice and Parking Duration
Abstract
Carsharing is an innovative transportation mobility solution, which offers the benefits of a personal vehicle without the burden of ownership. Free-floating carsharing service remains a relatively new concept and has gained popularity because it offers a flexible one-way auto rental option that charges usage by the minute. Traditionally, carsharing services require returning the rented vehicle to the same location where it was rented within a specified rental duration. Since free-floating service is a very new addition to the overall transportation system, the empirical research is still very limited. This study focuses on identifying the impact of land-use variables on free-floating carsharing vehicle rental choice and parking duration of car2go services in Austin, Texas on a typical weekday during the 9:00 AM to 12:00 PM off-hour period. Two different methodological approaches, namely a logistic regression model approach and a duration model technique, are used for this purpose. The results of this study indicate that demographic variables, the carsharing parking policy, and the number of transit stops all affect the usage of free-floating carsharing vehicles.
Mubassira Khan, Randy Machemehl
Dynamic Agent Based Simulation of an Urban Disaster Using Synthetic Big Data
Abstract
This paper illustrates how synthetic big data can be generated from standard administrative small data. Small areal statistical units are decomposed into households and individuals using a GIS buildings data layer. Households and individuals are then profiled with socio-economic attributes and combined with an agent based simulation model in order to create dynamics. The resultant data is ‘big’ in terms of volume, variety and versatility. It allows for different layers of spatial information to be populated and embellished with synthetic attributes. The data decomposition process involves moving from a database describing only hundreds or thousands of spatial units to one containing records of millions of buildings and individuals over time. The method is illustrated in the context of a hypothetical earthquake in downtown Jerusalem. Agents interact with each other and their built environment. Buildings are characterized in terms of land-use, floor-space and value. Agents are characterized in terms of income and socio-demographic attributes and are allocated to buildings. Simple behavioral rules and a dynamic house pricing system inform residential location preferences and land use change, yielding a detailed account of urban spatial and temporal dynamics. These techniques allow for the bottom-up formulation of the behavior of an entire urban system. Outputs relate to land use change, change in capital stock and socio-economic vulnerability.
A. Yair Grinberger, Michal Lichter, Daniel Felsenstein
Estimation of Urban Transport Accessibility at the Spatial Resolution of an Individual Traveler
Abstract
Accessibility, particularly for public transport users, is an important consideration in sustainable mobility policies. Various accessibility measures have been suggested in the literature, most at the coarse, aggregate spatial resolution of zones or neighborhoods. Based on recently available Big Urban GIS data, our aim is to measure accessibility from the viewpoint of an individual traveler who traverses the transportation network from one building as origin to another at the destination. We estimate transport accessibility by car and by public transport based on mode-specific travel times and corresponding paths, including walking and waiting. A computational application that is based on the intensive querying of relational database management systems is developed to construct high-resolution accessibility maps for an entire metropolitan area. It is tested and implemented in a case study involving the evaluation of a new light rail line in the metropolitan area of Tel Aviv. The results show that accessibility estimates depend essentially on spatial resolution: high-resolution representations of the trip enable unbiased estimates. Specifically, we demonstrate that the contribution of the light rail transit (LRT) line to accessibility is overrated at low resolutions and for longer journeys. The new approach and fast computational method can be employed for investigating the distributional effects of transportation infrastructure investments and, further, for interactive planning of the urban transport network.
Itzhak Benenson, Eran Ben-Elia, Yodan Rofe, Amit Rosental
Modeling Taxi Demand and Supply in New York City Using Large-Scale Taxi GPS Data
Abstract
Data from taxicabs equipped with Global Positioning Systems (GPS) are collected by many transportation agencies, including the Taxi and Limousine Commission in New York City. The raw data sets are too large and complex to analyze directly with many conventional tools, but when the big data are appropriately processed and integrated with Geographic Information Systems (GIS), sophisticated demand models and visualizations of vehicle movements can be developed. These models are useful for providing insights about the nature of travel demand as well as the performance of the street network and the fleet of vehicles that use it. This paper demonstrates how big data collected from GPS in taxicabs can be used to model taxi demand and supply, using 10 months of taxi trip records from New York City. The resulting count models are used to identify locations and times of day when there is a mismatch between the availability of taxicabs and the demand for taxi service in the city. The findings are useful for making decisions about how to regulate and manage the fleet of taxicabs and other transportation systems in New York City.
Ci Yang, Eric J. Gonzales
Detecting Stop Episodes from GPS Trajectories with Gaps
Abstract
Given increased access to streams of data collected by location acquisition technologies, the potential of GPS trajectory data is waiting to be realized in various application domains relevant to urban informatics, namely understanding travel behavior, estimating carbon emissions from vehicles, and building healthy and sustainable cities. Partitioning GPS trajectories into meaningful elements is crucial to improving the performance of further analysis. We propose a method for detecting a stay point (where an individual stays for a while) using a density-based spatial clustering algorithm in which a temporal criterion and gaps are also taken into account. The proposed method fills gaps using linear interpolation and identifies a stay point that meets two criteria (spatial density and time duration). To evaluate the proposed method, we compare the number of stay points detected by the proposed method with the number identified by manual inspection. Evaluation is performed on 9 weeks of trajectory data. Results show that clustering-based stay point detection combined with gap treatment can reliably detect stop episodes. Further, a comparison of the method with and without gap treatment indicates that gap treatment improves the performance of clustering-based stay point detection.
Sungsoon Hwang, Christian Evans, Timothy Hanke

Emergencies and Crisis

Frontmatter
Using Social Media and Satellite Data for Damage Assessment in Urban Areas During Emergencies
Abstract
Environmental hazards pose a significant threat to urban areas due to their potentially catastrophic consequences for people, property and the environment. Remote sensing has become the de facto standard for observing the Earth and its environment through the use of air-, space-, and ground-based sensors. Despite the quantity of remote sensing data available, gaps are often present due to the specific limitations of the instruments, their carrier platforms, or as a result of atmospheric interference. Massive amounts of data are generated from social media, and it is possible to mine these data to fill the gaps in remote sensing observations.
A new methodology is described that uses social networks to augment remote sensing imagery of transportation infrastructure conditions during emergencies. The capability is valuable in situations where environmental hazards such as hurricanes or severe weather affect very large areas. This research presents an application of the proposed methodology during the 2013 Colorado floods, with a special emphasis on Boulder County and the City of Boulder. Real-time data collected from social media, such as Twitter, are fused with remote sensing data for transportation damage assessment. Data collected from social media can provide information when remote sensing data are lacking or unavailable.
Guido Cervone, Emily Schnebele, Nigel Waters, Martina Moccaldi, Rosa Sicignano

Health and Well-Being

Frontmatter
‘Big Data’: Pedestrian Volume Using Google Street View Images
Abstract
Responding to the widespread and growing interest in walkable, transit-oriented development and healthy communities, many recent studies in planning and public health are concentrating on improving the pedestrian environment. There is, however, inadequate research on pedestrian volume and movement. In addition, methods of collecting detailed data on pedestrian activity have been insufficient and inefficient. Google Street View provides panoramic views along many streets of the U.S. and around the world. This study introduces planners to an image-based machine learning method for detecting pedestrian activity from Google Street View images. The detection results are shown to resemble the pedestrian counts collected by field work. In addition to recommending an alternative method for collecting pedestrian count data more consistently and objectively for future research, this study also stimulates discussion of the use of ‘big data’ for planning and design.
Li Yin, Qimin Cheng, Zhenfeng Shao, Zhenxin Wang, Laiyun Wu
Learning from Outdoor Webcams: Surveillance of Physical Activity Across Environments
Abstract
Publicly available, outdoor webcams continuously view the world and share images. These cameras include traffic cams, campus cams, ski-resort cams, etc. The Archive of Many Outdoor Scenes (AMOS) is a project aiming to geolocate, annotate, archive, and visualize these cameras and images to serve as a resource for a wide variety of scientific applications. The AMOS dataset has archived over 750 million images of outdoor environments from 27,000 webcams since 2006. Our goal is to utilize the AMOS image dataset and crowdsourcing to develop reliable and valid tools to improve physical activity assessment via online, outdoor webcam capture of global physical activity patterns and urban built environment characteristics.
This project’s grand scale-up of capturing physical activity patterns and built environments is a methodological step forward in advancing a real-time, non-labor-intensive assessment using webcams, crowdsourcing, and eventually machine learning. The combined use of webcams capturing outdoor scenes every 30 minutes and crowdsourcing providing the labor of annotating the scenes allows for accelerated public health surveillance related to physical activity across numerous built environments. The ultimate goal of this public health and computer vision collaboration is to develop machine learning algorithms that will automatically identify and calculate physical activity patterns.
J. Aaron Hipp, Deepti Adlakha, Amy A. Eyler, Rebecca Gernes, Agata Kargol, Abigail H. Stylianou, Robert Pless
Mapping Urban Soundscapes via Citygram
Abstract
In this paper we summarize efforts in exploring non-ocular spatio-temporal energies through strategies that focus on the collection, analysis, mapping, and visualization of soundscapes. Our research aims to contribute to multimodal geospatial research by embracing the idea of time-variant, poly-sensory cartography to better understand urban ecological questions. In particular, we report on our work on scalable infrastructural technologies critical for capturing urban soundscapes and creating what can be viewed as dynamic soundmaps. The research presented in this paper is developed under the Citygram project umbrella (Proceedings of the conference on digital humanities, Hamburg, 2012; International computer music conference proceedings (ICMC), Perth, pp 11–17, 2013; International computer music conference proceedings, Athens, Greece, 2014b; Workshop on mining urban data, 2014c; International computer music conference proceedings (ICMC), Athens, Greece, 2014d; INTER-NOISE and NOISE-CON congress and conference proceedings, Institute of Noise Control Engineering, pp 2634–2640, 2014) and includes a cost-effective prototype sensor network, remote sensing hardware and software, database interaction APIs, soundscape analysis software, and visualization formats. Noise pollution, which is New Yorkers’ number one complaint as quantified by the city’s 311 non-emergency hotline, is also discussed as one of the focal research areas.
Tae Hong Park

Social Equity and Data Democracy

Frontmatter
Big Data and Smart (Equitable) Cities
Abstract
Elected officials and bureaucrats claim that Big Data is dramatically changing city hall by allowing more efficient and effective decision-making. This has sparked a rise in the number of “Offices of Innovation” that collect, manage, use and share Big Data, in major cities throughout the U.S. This paper seeks to answer two questions. First, is Big Data changing how decisions are made in city hall? Second, is Big Data being used to address social equity and how? This study examines Offices of Innovation that use Big Data in five major American cities: New York, Chicago, Boston, Philadelphia, and Louisville, focusing specifically on three dimensions of Big Data and social equity: data democratization, digital access and literacy, and promoting equitable outcomes. Furthermore, this study highlights innovative practices that address social problems in order to provide directions for future research and practice on the topic of Big Data and social equity.
Mai Thi Nguyen, Emma Boundy
Big Data, Small Apps: Premises and Products of the Civic Hackathon
Abstract
Connections and feedback among urban residents and the responsive city are critical to Urban Informatics. One of the main modes of interaction between the public and Big Data streams is the ever-expanding suite of urban-focused smartphone applications. Governments are joining the app trend by hosting civic hackathons focused on app development. For all the attention and effort spent on app production and hackathons, however, a closer examination reveals a glaring irony of the Big Data age: to date, the results have been remarkably small in both scope and users. In this paper, we critically analyze the structure of The White House Hackathon, New York City BigApps, and the National Day of Civic Hacking, which are three recent, high-publicity hackathons in the United States. We propose a taxonomy of civic apps, analyze hackathon models and results against the taxonomy, and evaluate how the hackathon structure influences the apps produced. In particular, we examine problem definitions embedded in the different models and the issue of sustaining apps past the hackathon. We question the effectiveness of apps as the interface between urban data and urban residents, asking who is represented by and participates in the solutions offered by apps. We determine that the transparency, collaboration and innovation that hackathons aspire to are not yet fully realized, leading to the question: can civic Big Data lead to big impacts?
Sara Jensen Carr, Allison Lassiter
Metadata
Title
Seeing Cities Through Big Data
Edited by
Piyushimita (Vonu) Thakuriah
Nebiyou Tilahun
Moira Zellner
Copyright Year
2017
Electronic ISBN
978-3-319-40902-3
Print ISBN
978-3-319-40900-9
DOI
https://doi.org/10.1007/978-3-319-40902-3
