Neuronal Networks to Analyze Accessibility and GPS Data: A Case Study of Santo Domingo

  • Open Access
  • 2026
  • OriginalPaper
  • Book chapter

Abstract

This chapter examines how different levels of accessibility influence travel behavior in Santo Domingo, using traveled distance as the dependent variable. The study is based on GPS data collected through a smartphone app and considers four types of accessibility: peak and off-peak accessibility of public transport stations, and peak and off-peak neighborhood accessibility. Neural network models were developed to analyze the data, with peak-hour and off-peak-hour variables used for modeling. The study underlines the importance of distance in recognizing travel patterns and the challenges of access to and egress from public transport stations in developing countries. It finds that the northern and northwestern areas of Gran Santo Domingo are the most accessible, and that most zones with high accessibility are significant commercial or industrial areas with a nearby metro station. Models 5, 3, and 1 proved to be the best for modeling, with Model 5 providing the most reliable results. The analysis also shows that the type of public transport station is a key factor in the distance traveled by motorized modes.

1 Introduction

Several factors influence the use of public transport in developing countries. For example, distance to stations or stops can be a primary deterrent to using mass public transport modes (e.g., metro and train). At the same time, short-distance trips are more likely to be undertaken by informal public transport (PT) modes (e.g., public cars or shared taxis). Studies have shown that distance can have both positive and negative effects on the choice of destination (Xue & Zhang, 2020); distance is therefore critical to any trip, and modeling it is a priority. Distance has also been analyzed as a basis for travel pattern recognition, considering its correlation with activity complexity (Joh et al., 2001). Although travel time and travel cost are essential representations of the level of service, distance is often used as a measure of friction. Access to and egress from public transport stations are significant challenges in developing countries. In the Dominican Republic, gathering mobility data is a challenge for the transportation authorities, and monitoring daily trip volumes remains a pending task. The government invests significantly in infrastructure. Therefore, it is important to develop data collection methods to analyze accessibility and monitor travel patterns, which will encourage services more closely tailored to specific demand.
This study examines how different levels of accessibility influence travel patterns, using traveled distance (per trip) as the dependent variable. The paper is based on GPS data collected in Santo Domingo with a smartphone app named Inertia. Four types of accessibility are developed in the present paper: off-peak and peak accessibility of public transport (station accessibility) and off-peak and peak neighborhood accessibility. The dependent variable is the trip distance retrieved from the GPS data collected with the mobile phone application. Additionally, neural network models were created, using the peak-hour and off-peak-hour variables for modeling.

2 Literature Review

Accessibility has been explained in the literature as a composition of spatial, temporal, transport, and individual components. Accessibility studies have used GPS data and measurements of access and egress. At the same time, GPS data offer substantial advantages for travel demand models that have not yet been exploited in accessibility models.

2.1 Transport Component, Spatial Component, and Station Class

There is a long-standing discussion on accessibility measures and impedance functions, e.g., negative power, Gaussian, and log-logistic. Certain functions allow a better behavioral representation, such as the S-shaped conventional logistic function (Geurs et al., 2016). A distance decay function captures the fact that shorter trips are more likely to be undertaken than longer trips. Similarly, closer facilities are more likely to be chosen than farther ones. In the present paper, a distance decay function is estimated based on the observed trips.
Classifying public transport nodes or stations recognizes that specific nodes are better positioned in the transport network than others. This classification can be used to infer the population, jobs, or other amenities reachable from specific nodes or zones. Similarly, to represent station class and the spatial components of accessibility, the study by Luo et al. (2017) combines stops at a micro level rather than aggregating stops at a zone level. In the present paper, we incorporate population as a spatial component, and we develop a cluster indicator of public transport station class.

2.2 Neural Networks

The concept of a “neural network” describes a collection of loosely related models that share common characteristics, including a large parameter space and a flexible structure, and that are inspired by studies of the brain's functioning (IBM, n.d.). In this study, we apply neural networks to model distance from variables such as speed and time, among others.

3 Data Description

In the Dominican Republic, public transport can be divided into organized public transport and informal public transport. Organized public transport refers to the buses in the OMSA network and adjacent corridors with fixed stops.
Our study used OpenStreetMap data and road networks. We enriched this data by adjusting the speed per link to the speed values collected by the Inertia app. The travel times between OD pairs were calculated with the Network Analyst tool in ArcGIS.
We worked with 743 trips made by a total of 458 users, of whom more than 63 undertook more than ten trips. The dependent variable in the neural network models is distance. The independent variables were selected as follows:
  • Time and Speed: retrieved from the Inertia app.
  • Neighborhood peak and off-peak accessibilities: these variables are calculated to determine the connectivity between neighborhood centers during peak and off-peak hours.
  • PT availability: variable showing which trips have public transport stop buffers nearby.
  • Mode as motorized or non-motorized transport: this variable indicates whether the trip was undertaken by a motorized or a non-motorized mode.
  • Station peak and off-peak accessibilities: these variables are linked to the accessibilities of a station during peak and off-peak hours.
  • Station type: a cluster analysis was developed to classify the PT stations. It used the public transport service variables (OMSA, Santo Domingo Metro, and Teleferico, the cable car) and the number of trips detected during peak hours.
We estimated the following models:
  • Model 1: includes the explanatory variables neighborhood peak accessibility, time, and speed.
  • Model 2: includes the explanatory variables neighborhood off-peak accessibility, time, and speed.
  • Model 3: includes the explanatory variables neighborhood peak accessibility, time, PT availability, motorized and non-motorized, and speed.
  • Model 4: includes the explanatory variables neighborhood off-peak accessibility, time, PT availability, motorized and non-motorized, and speed.
  • Model 5: includes the explanatory variables neighborhood peak accessibility, time, PT availability, motorized and non-motorized, speed, and station type.

4 Analytical Framework

4.1 Modeling Accessibility

In this paper, we measure the accessibility of access/egress modes and main modes as zone-to-zone and station-to-zone accessibility, respectively.
Based on the literature review, we selected the log-logistic function for this study. This function yields smoother changes in probability, which resemble the behavioral effects of increasing distances or travel times. The following function was used to estimate the probabilities (F) based on the travel times (tij) from zone i to zone j (Eq. 1):
$$F\left(t_{ij}\right)=\frac{1}{1+\exp\left(a+b\,\ln t_{ij}\right)}$$
(1)
where a and b are parameters estimated with a log-logistic regression, carried out in the statistical software package SPSS. The values of a and b are then used to calculate the accumulated accessibility value as follows (Eq. 2):
$$A_{i}=\sum_{j} F\left(t_{ij}\right)\,P_{j}$$
(2)
Neighborhood accessibility is the sum of the population (Pj) reachable from origin i (neighborhood centroid) at each destination j (neighborhood centroid), weighted by the probability that the impedance function assigns to the travel time. The speeds obtained from the GPS data are incorporated into the accessibility calculation to account for the area's real congestion levels and impedances. Additionally, the speed measurements were broken down into peak and off-peak hours to calculate peak and off-peak accessibility; the peak hours are from 6:00 a.m. to 9:00 a.m. and from 5:00 p.m. to 7:30 p.m.
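As an illustration, Eqs. 1 and 2 and the peak-hour windows can be sketched in Python. This is a minimal sketch, not the procedure used in the study: the parameter values, zone names, populations, travel times, and the 50% peak-hour slowdown are hypothetical, and treating the peak-hour endpoints as inclusive is an assumption.

```python
import math


def log_logistic(t, a, b):
    """Eq. 1: F(t_ij) = 1 / (1 + exp(a + b * ln t_ij)), for t > 0."""
    return 1.0 / (1.0 + math.exp(a + b * math.log(t)))


def accessibility(travel_times, population, a, b):
    """Eq. 2: A_i = sum over j of F(t_ij) * P_j, for every origin i."""
    return {
        origin: sum(log_logistic(t, a, b) * population[dest]
                    for dest, t in row.items())
        for origin, row in travel_times.items()
    }


def is_peak(hour, minute=0):
    """True inside 6:00-9:00 a.m. or 5:00-7:30 p.m. (endpoints included)."""
    minutes = hour * 60 + minute
    return 360 <= minutes <= 540 or 1020 <= minutes <= 1170


# Hypothetical parameters and zones, for illustration only.
A, B = -4.0, 1.5
POPULATION = {"z1": 1000, "z2": 2500, "z3": 400}
OFF_PEAK_MIN = {
    "z1": {"z2": 10.0, "z3": 20.0},
    "z2": {"z1": 10.0, "z3": 15.0},
    "z3": {"z1": 20.0, "z2": 15.0},
}
# Assume congestion makes every peak-hour travel time 50% longer.
PEAK_MIN = {i: {j: 1.5 * t for j, t in row.items()}
            for i, row in OFF_PEAK_MIN.items()}

off_peak_accessibility = accessibility(OFF_PEAK_MIN, POPULATION, A, B)
peak_accessibility = accessibility(PEAK_MIN, POPULATION, A, B)
```

With a positive b, F decreases as travel time grows, so under these assumptions the longer peak-hour travel times give each zone a lower accessibility value than in the off-peak case.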
For station accessibility, the origin zones i are the stations, and the destination zones j are the (remaining) zones of residential areas. The impedance curve is based on the measured travel times of the non-motorized modes and is calculated per mode.
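One simple way to see how an impedance curve of this form can be estimated from measured travel times is to linearize Eq. 1. The sketch below is an assumption-laden illustration rather than the SPSS log-logistic regression used in the study: it applies ordinary least squares to the rearranged equation, and the input `shares` (the observed share of trips associated with each travel time) and all numeric values are hypothetical.

```python
import math


def fit_log_logistic(times, shares):
    """Least-squares estimate of a and b in F(t) = 1 / (1 + exp(a + b ln t)).

    Rearranging Eq. 1 gives ln((1 - F) / F) = a + b * ln t, which is linear
    in ln t, so a and b follow from simple linear regression.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log((1.0 - f) / f) for f in shares]  # requires 0 < f < 1
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical check: generate shares from known parameters and recover them.
TRUE_A, TRUE_B = -4.0, 1.5
times_min = [2.0, 5.0, 10.0, 15.0, 20.0, 30.0, 45.0]
shares = [1.0 / (1.0 + math.exp(TRUE_A + TRUE_B * math.log(t)))
          for t in times_min]
a_hat, b_hat = fit_log_logistic(times_min, shares)
```

Because the synthetic shares lie exactly on the curve, the fit returns the generating parameters up to rounding error. With real trip data the points would scatter around the line, and a separate fit per mode would match the per-mode calculation described above.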

4.2 Modeling Neural Networks

For the neural models, we used three types of partitions, following IBM (n.d.):
  • The training sample: comprises the data records used to train the neural network. It must be larger than the test sample.
  • The test sample: an independent set of data records used to track errors during training in order to prevent overtraining.
  • The holdout sample: provides an honest estimate of the model's predictive ability.
In our case, a multilayer perceptron (MLP) was used, a powerful and flexible model that can learn complex non-linear relationships between input and output variables. The hyperbolic tangent was used as the activation function in the network.
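The roles of the three partitions and of the hyperbolic tangent can be illustrated with a very small MLP written in plain Python. This is a sketch under stated assumptions, not the software procedure used in the study: the 70/20/10 split, the four hidden units, the stochastic gradient descent settings, the early-stopping rule, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random


def split_partitions(records, train=0.7, test=0.2, seed=0):
    """Shuffle records and split them into training, test and holdout samples."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


class TinyMLP:
    """One hidden layer of tanh units followed by a single linear output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self.hidden(x))) + self.b2

    def sse(self, data):
        """Sum of squared errors over (inputs, target) pairs."""
        return sum((self.predict(x) - y) ** 2 for x, y in data)

    def train_epoch(self, data, lr=0.05):
        """One pass of stochastic gradient descent on the squared error."""
        for x, y in data:
            h = self.hidden(x)
            err = sum(w * hk for w, hk in zip(self.w2, h)) + self.b2 - y
            for k, hk in enumerate(h):
                # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2
                grad_z = err * self.w2[k] * (1.0 - hk * hk)
                self.w2[k] -= lr * err * hk
                self.b1[k] -= lr * grad_z
                for i, xi in enumerate(x):
                    self.w1[k][i] -= lr * grad_z * xi
            self.b2 -= lr * err


def fit(model, train_set, test_set, max_epochs=300, patience=10):
    """Train until the test-sample error stops improving (early stopping)."""
    best, waited = float("inf"), 0
    for _ in range(max_epochs):
        model.train_epoch(train_set)
        test_error = model.sse(test_set)
        if test_error < best:
            best, waited = test_error, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return model


# Synthetic trips (not the study data): inputs are time and speed scaled to
# (0, 1]; the target is their product, a stand-in for trip distance.
rng = random.Random(42)
records = []
for _ in range(200):
    time, speed = rng.uniform(0.1, 1.0), rng.uniform(0.1, 1.0)
    records.append(([time, speed], time * speed))

train_set, test_set, holdout_set = split_partitions(records)
model = TinyMLP(n_inputs=2, n_hidden=4)
holdout_before = model.sse(holdout_set)
fit(model, train_set, test_set)
holdout_after = model.sse(holdout_set)
```

The training sample updates the weights, the test sample only decides when to stop, and the holdout sample is touched only to compare the error before and after training, which is why it gives the least biased view of predictive ability.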

5 Results

We conducted a cluster analysis on public transport stations, categorizing them into two distinct groups:
  • Group 1: This group shows higher average values for the OMSA and Teleferico services and lower average values for the Metro service and the number of trips detected during peak hours within the catchment area.
  • Group 2: This group shows higher average values for the Metro service and the number of trips detected during peak hours and lower average values for the OMSA and Teleferico services within the catchment area.
The catchment area is the area covered by a buffer of 1200 m around the station. The service variables were measured by the lines of the different modes of transport, not by stops, that fall within this catchment area. We can conclude that Group 2 contains the main public transportation and the greater flow of vehicles. This suggests that these stations are the busiest, with Group 2 having a more advantageous position than Group 1 due to higher vehicle traffic.
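A two-group station classification of this kind can be sketched with k-means on standardized station variables. The chapter does not state which clustering algorithm was used, so k-means is a stand-in chosen here for illustration, and the station names and values below are hypothetical, not the study data.

```python
import math


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k=2, iterations=50):
    """Plain k-means; the first k points serve as deterministic start centers."""
    centers = [list(points[i]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels


# Hypothetical stations: [OMSA lines, Metro lines, Teleferico lines,
# trips detected during peak hours] within the catchment area.
STATIONS = {
    "S1": [4, 0, 1, 12],
    "S2": [1, 2, 0, 55],
    "S3": [5, 0, 1, 9],
    "S4": [0, 2, 0, 61],
    "S5": [3, 0, 0, 15],
    "S6": [1, 1, 0, 48],
}

labels = kmeans(standardize(list(STATIONS.values())), k=2)
station_group = dict(zip(STATIONS, labels))
```

In this toy data the bus- and cable-car-oriented stations (S1, S3, S5) fall into one group and the metro-oriented stations with more peak-hour trips (S2, S4, S6) into the other, mirroring the Group 1 and Group 2 profiles described above.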
In terms of neighborhood accessibility, the north and northwest areas of Gran Santo Domingo (GSD) have the highest accessibility in the region. These areas are generally low-income, except for the northwest region, which has middle-to-high incomes. Most zones with higher accessibility tend to be significant commercial or industrial development areas with a nearby metro station. In contrast, areas with low accessibility are located in the southern part of GSD and are predominantly residential. In terms of station accessibility, the zones with the highest accessibility are located in the center and east of GSD, whereas the least accessible stations are located in the north or nearly outside the GSD (Fig. 1).
Fig. 1. Neighborhood and station accessibility. Source: own elaboration
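We worked with five models; their relative errors are shown in Table 1:
Table 1. Neural networks: relative error by partition.

| Relative error | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| Training | 0.059 | 0.059 | 0.059 | 0.059 | 0.013 |
| Test | 0.132 | 0.132 | 0.132 | 0.132 | 0.014 |
| Holdout | 0.068 | 0.130 | 0.016 | 0.195 | 0.011 |
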
Models 5, 3, and 1 show the lowest relative errors for the holdout partition, which is a positive result for modeling. This suggests that the “Station type” and “Neighborhood peak accessibility” variables contribute positively to modeling the distance traveled. Model 4 is not recommended because of its substantial holdout relative error; consequently, the use of “Neighborhood off-peak accessibility” is not advised.
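The comparison can be reproduced from the holdout values of Table 1. The chapter does not give the formula for the relative error; the sketch below assumes the definition commonly used for scale-dependent variables in IBM SPSS neural network output, namely the model's sum of squared errors divided by that of a null model that always predicts the mean. The function name and the toy inputs are hypothetical.

```python
def relative_error(observed, predicted):
    """Model SSE divided by the SSE of a null model predicting the mean.

    Assumed definition; values near 0 indicate a good fit, values near 1
    indicate a model no better than the mean.
    """
    mean = sum(observed) / len(observed)
    sse_model = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sse_null = sum((o - mean) ** 2 for o in observed)
    return sse_model / sse_null


# Holdout relative errors as reported in Table 1.
HOLDOUT = {
    "Model 1": 0.068,
    "Model 2": 0.130,
    "Model 3": 0.016,
    "Model 4": 0.195,
    "Model 5": 0.011,
}

# Rank the models from lowest to highest holdout relative error.
ranking = sorted(HOLDOUT, key=HOLDOUT.get)
```

Sorting by holdout error places Models 5, 3, and 1 first and Model 4 last, which matches the ordering discussed above.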

6 Conclusions

This paper presented a method for investigating accessibility in a developing country, the Dominican Republic, a topic that has not been studied in depth in developing countries. The study highlighted the relationship between distance traveled, accessibility, and transport infrastructure such as the metro, which enhanced accessibility in some areas. Our analysis illustrated the interaction between distance, peak and off-peak accessibility, time, and speed. Models 5, 3, and 1 emerged as the best for modeling. Models that integrated peak-hour data provided more reliable results. Furthermore, public transport station type is a key factor in the distance traveled by motorized modes.
We extend our gratitude to the Idelisa Bonnelly Scholarship Program of the Ministry of Youth of the Dominican Republic for their support.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Authors
Amparo Isabel Álvarez Poyó
Lissy La Paix Puello
Copyright year
2026
DOI
https://doi.org/10.1007/978-3-032-06763-0_83
References
Geurs, K.T., La Paix, L., Van Weperen, S.: A multi-modal network approach to model public transport accessibility impacts of bicycle-train integration policies. Eur. Transp. Res. Rev. 8(4), 25 (2016). https://doi.org/10.1007/s12544-016-0212-x
Joh, C.-H., Arentze, T., Timmermans, H.: Pattern recognition in complex activity travel patterns: comparison of Euclidean distance, signal-processing theoretical, and multidimensional sequence alignment methods. Transp. Res. Rec. J. Transp. Res. Board 1752(1), 16–22 (2001). https://doi.org/10.3141/1752-03
Luo, D., Cats, O., Van Lint, H.: Constructing transit origin-destination matrices with spatial clustering. Transp. Res. Rec. J. Transp. Res. Board 2652(1), 39–49 (2017). https://doi.org/10.3141/2652-05
Xue, L., Zhang, Y.: The effect of distance on tourist behavior: a study based on social media data. Ann. Tour. Res. 82, 102916 (2020). https://doi.org/10.1016/j.annals.2020.102916