Urban cycles and mobility patterns: Exploring and predicting trends in a bicycle-based public transport system

https://doi.org/10.1016/j.pmcj.2010.07.002

Abstract

This paper provides an analysis of human mobility data in an urban area using the amount of available bikes in the stations of the community bicycle program Bicing in Barcelona. Based on data sampled from the operator’s website, it is possible to detect temporal and geographic mobility patterns within the city. These patterns are applied to predict the number of available bikes for any station some minutes/hours ahead. The predictions could be used to improve the bicycle program and the information given to the users via the Bicing website.

Introduction

Public bike sharing services have become increasingly popular in the past few years. A growing list of cities that provide such services can be found on the Bike sharing world map. Since 2007 the city of Barcelona has operated one of the largest bike sharing systems, called Bicing, with about 6000 bikes distributed over about 400 stations across the entire city. The system has been very successful, with more than 180 000 subscribers in 2009 according to a recent study performed by Barcelona’s city council (Lopez [1]). However, the same study also reports the results of a customer satisfaction survey, which still show room for improvement. The two biggest problems detected, both of which cause user frustration, are (a) the impossibility of finding a bike when a user wants to start his/her journey because the station is empty and (b) the impossibility of leaving the bike at the user’s destination because the station is full. Without oversizing the system, there are basically two ways to solve these problems: informing users in advance about the best places to pick up or leave bikes, and improving the redistribution of bikes from full to empty stations.

In this study we aim to contribute to the solution of these problems via the analysis of cyclic mobility patterns which lead to short term predictions of the number of available bikes in the stations. Such predictions would allow us to improve the current web-service of Bicing and in turn increase users’ satisfaction with the system. Once this type of information is available, users may use mobile devices to access it. Knowledge of those patterns could lead to an optimization of the Bicing system itself, allowing the operator to predict shortage or overflow of bicycles in certain stations well in advance and adapt its redistribution schedule accordingly on the fly.

Furthermore, we intend to show that this type of data also allows us to infer the activity cycles of Barcelona’s population as well as the spatio-temporal distribution of their displacements. Such knowledge may be interesting for city planners and may also represent a cheap way to compare the activity cycles between different cities.

To achieve these goals we use spatio-temporal data obtained by web-mining the Bicing website. The data corresponds to the number of bicycles available to users at a given moment in each of the approximately 400 stations.

The rest of the paper is organized as follows. We first review related work on the subject in Section 1.1 and give a more detailed description of the Bicing system in Section 1.2. Afterwards we describe details of the data retrieval (Section 1.3) and basic quantities of the collected data (Section 1.4). In the results part of the article we first describe the activity patterns in some stations in Section 2 and then take a global view, analyzing the activity cycle of the entire city measured by the number of bicycles in the stations (Section 2.2) and the spatial distribution of their variation (Section 2.3). Then, in Section 3, we apply the findings to predict future activity. Finally, we present the conclusions in Section 4.

Human mobility patterns have received considerable attention in recent studies. However, it is not straightforward to obtain data that allow large-scale studies, mostly due to privacy issues. Notable exceptions where the authors were able to overcome those difficulties include the use of geotagged photos [2] and location data of mobile phones [3], [4], [5], as well as the analysis of the circulation of individual banknotes [6] and civil aviation traffic [7] to reconstruct geo-spatial data of human displacements at different distance scales.

Most of these studies deal with the trajectories of individuals, but often (as in the case of our data) only aggregate spatio-temporal data is available (e.g. the number of persons at time x in place y). An example of a study with this type of data can be found in [3]. It uses aggregate mobile phone usage data to construct activity cycles for different locations, showing clear differences between working day and weekend patterns, and characterizes certain areas within the city by means of a cluster analysis. Our study shows how such results can also be obtained via web-mining techniques from bike sharing websites.

A similar yet less extensive study which does not include activity prediction has been performed by Froehlich et al. [8].

Prediction of Bicing activity is a problem related to traffic congestion control, which has been analyzed traditionally for vehicular traffic. See for example [9] for a review on this subject. Related problems have also been investigated in the context of web-server traffic congestion where time series analysis techniques, especially the auto-regressive integrated moving average model or variants are widely used [10], [11], although other function approximation techniques, spanning from linear fits [12] to recurrent neural networks [13], have been applied as well to obtain predictions. Here we use a technique, based on activity cycles, more related to Kaltenbrunner et al. [14] where different patterns reflecting a website’s activity cycle were used to predict the number of comments a news item would receive. We also implement time series analysis methods [15] in the form of an Auto-Regressive Moving Average (ARMA) model.

For the case in which data in the form of individual trajectories is available, a recent study by Song et al. [5] explored the possibility (and limits) of predicting a person’s position using his/her previous mobility data.

Bicing is an urban community bicycle program, managed and maintained in partnership by the city council of Barcelona and the Clear Channel Communications Corporation. Bicing is mainly oriented to cover small and medium daily routes of users within the Barcelona city area.

Users register by paying a fixed fee for a yearly subscription and receive an RFID card that allows unlimited usage throughout the year. The first half hour of each use is free, and subsequent half-hour intervals are charged at 0.30 euros up to a maximum of 2 h. Exceeding this period is penalized with 3 euros per hour. There are approximately 400 stations distributed throughout the city. Each station has a fixed number of slots, each of which is either empty (without a bicycle), occupied (holding a bicycle) or out of service, because either the slot itself or the bicycle it contains is marked as damaged. Whenever a subscriber needs a bicycle, he/she must take one from a station with occupied slots, travel to the destination station, and leave it there in a free slot. The system registers every time a user takes or parks a bike. Bicycles can be withdrawn from the stations from Monday to Friday between 5:00 and 24:00. On Saturday and Sunday the service is open 24 h. Outside of these time windows bicycles can only be returned, not withdrawn.
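The tariff described above can be expressed as a small function. The following is a minimal sketch, not the operator's billing logic: it assumes that every started half hour (or hour) is charged in full and that the 3 euro per hour penalty is added on top of the half-hour charges, neither of which is stated in the text. The name `usage_fee_cents` is hypothetical, and amounts are kept in euro cents to avoid floating-point rounding.

```python
def usage_fee_cents(minutes: int) -> int:
    """Fee in euro cents for a single trip of the given duration.

    Assumptions (not confirmed by the source): started intervals are
    charged in full, and the penalty is added to the regular charges.
    """
    if minutes < 0:
        raise ValueError("duration must be non-negative")
    if minutes <= 30:
        return 0  # the first half hour is free

    # Half-hour intervals between minute 30 and minute 120, 0.30 euros each.
    charged_minutes = min(minutes, 120) - 30
    half_hours = -(-charged_minutes // 30)  # ceiling division
    fee = half_hours * 30

    # Beyond 2 h: penalty of 3 euros per (started) hour.
    if minutes > 120:
        extra_hours = -(-(minutes - 120) // 60)
        fee += extra_hours * 300
    return fee
```

Under these assumptions a 45 minute trip costs one half-hour interval, and any trip of up to 2 h costs at most three intervals.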

There are two cases in which the system does not allow a user to complete his/her route:

  1. The origin station does not have any available bicycles.

  2. The destination station does not have any empty slots to park in.

When either of these situations occurs, users needing a free slot or a bicycle have to choose between waiting at the station, going to another station or taking other means of transportation. In order to reduce these types of situations, there are trucks which move bicycles from highly loaded stations to empty ones. However, in practice users do not wait for these trucks, since they neither follow a fixed schedule nor guarantee a maximum response time to fix problems at a station.

To allow users to plan their routes in advance, the Bicing system provides a map of stations on its website, where users can check the status of the stations (number of available bikes and empty slots) close to their departure and arrival points. However, this information reflects only the specific moment at which the user queries the system. The service provides neither a history of previous station loads nor the expected load of the destination station at the time the user will reach it.

The Bicing website provides an information service for users through the Google Maps API. It shows a map of Barcelona overlaid with small markers indicating station positions and the number of available bicycles and free slots for every station. Data is inserted into the map using JavaScript code with a string variable that contains a KML geospatial annotation document. This KML document defines the following information for each station:

  1. station name

  2. graphic icon to be inserted in the map

  3. latitude and longitude

  4. number of available bicycles

  5. number of free slots.
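The fields listed above could be extracted from such a document along the following lines. This is only a sketch: the source does not show the actual KML layout used by the Bicing website, so the structure of `SAMPLE_KML` (station counts stored in `ExtendedData` entries named `bikes` and `free_slots`), the station names and the helper `parse_stations` are all assumptions made for illustration. The icon field is ignored here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Namespace of KML 2.2; the real documents may declare a different version.
KML_NS = {"k": "http://www.opengis.net/kml/2.2"}

# Hypothetical document layout, invented for this example.
SAMPLE_KML = """
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Station A</name>
      <Point><coordinates>2.1800,41.3900,0</coordinates></Point>
      <ExtendedData>
        <Data name="bikes"><value>7</value></Data>
        <Data name="free_slots"><value>13</value></Data>
      </ExtendedData>
    </Placemark>
    <Placemark>
      <name>Station B</name>
      <Point><coordinates>2.1700,41.4000,0</coordinates></Point>
      <ExtendedData>
        <Data name="bikes"><value>0</value></Data>
        <Data name="free_slots"><value>21</value></Data>
      </ExtendedData>
    </Placemark>
  </Document>
</kml>
"""


@dataclass
class StationStatus:
    name: str
    lat: float
    lon: float
    bikes: int
    free_slots: int


def parse_stations(kml_text: str) -> list:
    """Return one StationStatus per Placemark in the (assumed) KML layout."""
    root = ET.fromstring(kml_text.strip())
    stations = []
    for pm in root.iterfind(".//k:Placemark", KML_NS):
        name = pm.findtext("k:name", default="", namespaces=KML_NS).strip()
        # KML stores coordinates as "longitude,latitude[,altitude]".
        coords = pm.findtext("k:Point/k:coordinates", default="", namespaces=KML_NS)
        lon, lat = coords.strip().split(",")[:2]
        data = {
            d.get("name"): d.findtext("k:value", default="0", namespaces=KML_NS)
            for d in pm.iterfind("k:ExtendedData/k:Data", KML_NS)
        }
        stations.append(
            StationStatus(name, float(lat), float(lon),
                          int(data["bikes"]), int(data["free_slots"]))
        )
    return stations
```

Calling `parse_stations(SAMPLE_KML)` yields one record per station, which is the form in which the samples can then be stored.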

In order to analyze the dynamics of station loads, we collected these KML documents every two minutes from May 15th, 2008 onwards, parsing them and storing all the relevant information (station name, location, available bicycles and free slots) in a MySQL database. As the Bicing network changes from time to time, new stations are added automatically to the database when they first appear in the KML files collected from the Bicing website.
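The storage step can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` module instead of MySQL, and the table layout and the function names `init_db` and `store_snapshot` are assumptions, not the schema used in the study. A station is inserted the first time its name appears, which mirrors the automatic addition of new stations described above.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) station and sample tables."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS station (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            lat  REAL,
            lon  REAL
        );
        CREATE TABLE IF NOT EXISTS sample (
            station_id INTEGER NOT NULL REFERENCES station(id),
            ts         TEXT    NOT NULL,  -- ISO 8601 timestamp of the snapshot
            bikes      INTEGER NOT NULL,
            free_slots INTEGER NOT NULL,
            PRIMARY KEY (station_id, ts)
        );
    """)


def store_snapshot(conn: sqlite3.Connection, ts: str, rows) -> None:
    """Store one snapshot; rows are (name, lat, lon, bikes, free_slots)."""
    for name, lat, lon, bikes, free_slots in rows:
        # Unknown stations are added on first sight; known ones are left as is.
        conn.execute(
            "INSERT OR IGNORE INTO station (name, lat, lon) VALUES (?, ?, ?)",
            (name, lat, lon),
        )
        (station_id,) = conn.execute(
            "SELECT id FROM station WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR REPLACE INTO sample (station_id, ts, bikes, free_slots) "
            "VALUES (?, ?, ?, ?)",
            (station_id, ts, bikes, free_slots),
        )
    conn.commit()
```

A collector would call `store_snapshot` once per downloaded document, i.e. every two minutes in the setup described above.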

Due to a problem in the Bicing web-service, data after the 3rd of July was updated only once or twice a day and could not be used for our study. We therefore base our results on the data collected during the 7 weeks between 12:00, May 15th and 12:00, July 3rd, 2008. We also initially did not collect data during Bicing’s closing hours on weekdays between 0:01 and 5:00, which restricts our analysis further to the time window between 5:00 and 24:00.

In total, we collected data from 377 stations with a total of approximately 8700 free slots (three stations, which never contained any bicycles, were omitted from the analysis). The number of slots per station varies between 15 and 39, and the maximum number of bicycles in the stations observed in our data was 3657.


Activity cycles

Having described the data used in this study, we analyze it in this and the following sections. We start with an analysis of the activity cycles that can be obtained from the number of bicycles available at the different stations. First, we focus on the local cycles, one for every station. We will later aggregate these cycles to infer activity cycles of Barcelona’s population in Section 2.2. When taking into account the geographic distribution these cycles allow us to

Prediction of activity

In this section, we present initial results on the prediction of bicycles or free slots at a given station at a given time. We compare several simple prediction models, and establish evaluation measurements as well as a baseline with which other (more complex) models can be compared. We then present a more advanced time series analysis technique that can use information not only from the given station but also its surroundings.
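To make the idea of a simple baseline concrete, the sketch below implements two elementary predictors and an error measure. The excerpt does not say which models or evaluation measures the paper uses, so the choices here (a persistence predictor, a mean daily cycle predictor, the mean absolute error) and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

MINUTES_PER_DAY = 24 * 60


def mean_cycle(history, bin_minutes=10):
    """Average number of bikes per time-of-day bin.

    history: iterable of (minute_of_day, bikes) samples of one station.
    Returns a dict mapping bin index -> mean number of bikes in that bin.
    """
    bins = defaultdict(list)
    for minute, bikes in history:
        bins[(minute % MINUTES_PER_DAY) // bin_minutes].append(bikes)
    return {b: mean(values) for b, values in bins.items()}


def predict_persistence(current_bikes):
    """Baseline 1: assume the station load does not change."""
    return current_bikes


def predict_from_cycle(cycle, target_minute, fallback, bin_minutes=10):
    """Baseline 2: historical mean at the target time of day.

    Falls back to the given value if no history exists for that bin.
    """
    target_bin = (target_minute % MINUTES_PER_DAY) // bin_minutes
    return cycle.get(target_bin, fallback)


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predictions and observations."""
    return mean(abs(p - o) for p, o in zip(predicted, observed))
```

More elaborate models can then be judged by how much they reduce the error relative to these two predictors for a given prediction horizon.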

Conclusions

We have shown that mining usage data from community bicycle services allows us to infer the activity cycles of a large city’s population as well as the spatio-temporal distribution of their displacements. There are clear patterns of user behavior by station and type of day. Visualization of the average daily variation in activity allows us to observe that stations with similar behavior also often correspond to adjacent areas in the map revealing residential, university and leisure areas. The

Acknowledgements

We thank Fabien Girardin for pointing out some references and four anonymous referees for helping to improve this manuscript.

References (16)

  • A. Lopez, El transporte publico individual de barcelona, in: II Jornadas de la Bicicleta Publica. Sevilla, Spain, 2009....
  • F. Girardin et al., Digital footprinting: uncovering tourists with user-generated content, IEEE Pervasive Computing (2008)
  • J. Reades et al., Cellular census: explorations in urban data collection, IEEE Pervasive Computing (2007)
  • M.C. Gonzalez et al., Understanding individual human mobility patterns, Nature (2008)
  • C. Song et al., Limits of predictability in human mobility, Science (2010)
  • D. Brockmann et al., The scaling laws of human travel, Nature (2006)
  • L. Hufnagel et al., Forecast and control of epidemics in a globalized world, Proceedings of the National Academy of Sciences (2004)
  • J. Froehlich, J. Neumann, N. Oliver, Measuring the pulse of the city through shared bicycle programs, in: Proceedings...
There are more references available in the full text version of this article.
