Dieses Kapitel vertieft sich in die komplexe Welt der urbanen Netzwerkanalyse und konzentriert sich auf die Anwendung spektraler Graphenanalyse zur Identifizierung von Clustern innerhalb urbaner Netzwerke. Im Mittelpunkt der Studie steht Beirut, eine Stadt, die durch ihr komplexes Stadtgefüge mit Gittern, radialen Schnittpunkten und topografisch beeinflussten Wegen gekennzeichnet ist. Die Forschung verwendet spektrales Clustern, eine Methode des unbeaufsichtigten maschinellen Lernens, um das Zirkulationsnetz der Stadt zu analysieren, das aus OpenStreetMap-Daten abgeleitet wurde. Die Analyse zeigt, dass spektrale Clusterbildung effektiv Nachbarschaften und Industriezonen identifizieren kann, mit Ergebnissen, die eng mit historischen Trennlinien und mentalen Karten der Stadt übereinstimmen. In diesem Kapitel wird die Eigen-Gap-Methode zur Bestimmung der optimalen Anzahl von Clustern vorgestellt, die Rechenleistung und topologische Sensitivität berücksichtigt. Vergleiche mit historischen Daten wie der Grünen Linie aus dem libanesischen Bürgerkrieg und anthropologischen Studien über die mentalen Landkarten der Fahrer von Lebensmittellieferungen unterstreichen die Robustheit spektraler Clusterbildung. Zusätzlich vergleicht das Kapitel die Spektralanalyse mit Werkzeugen zur Analyse urbaner Netzwerke und zeigt das Potenzial für spektrale Clusterbildung auf, um Lastmuster von Fußgängern zu replizieren. Die detaillierte Fallstudie von Beirut und die innovativen methodischen Ansätze machen dieses Kapitel zu einer fesselnden Lektüre für diejenigen, die sich für fortgeschrittene urbane Analysetechniken interessieren.
KI-Generiert
Diese Zusammenfassung des Fachinhalts wurde mit Hilfe von KI generiert.
Abstract
This work advances previous research [anonymized] into the application of spectral analysis to urban morphology in two ways. First, it refines cluster selection process and second, it utilizes weighted aggregation of cluster boundaries to more subtly reflect the porosity of neighborhood borders and the multi-scalar formations that characterize urban space, rather than searching for a single optimal solution.
The primary case study is that of Beirut, a city fragmented not only by a history of sectarian conflict but equally by piecemeal development and lack of a clear planning directive. The contradictions and multiplicity of the city’s contrasting regions provides a robust, compelling, and challenging object of study.
Finally, the paper concludes with comparisons between the outcomes calculated and previous studies that approached the definition of neighborhood boundaries and connections empirically from the perspective of bottom-up resident accounts [1] and computationally with regard to pedestrian traffic and walkability [2].
1 Introduction and Background
Urban analysis and particularly, the analysis of urban network morphologies has expanded greatly with the spread of computational tools that enhance the precision of analysis, but also increase the subtlety of research inquiry, enabling fine-grained, close reading of urban patterns even at large scales. This paper presents developments in experimentation with such a tool for identifying clusters within urban networks that would indicate close relationship, high connectivity, and neighborhood qualities. In the absences of a ‘ground truth’ subdivision of urban networks, particularly in mature cities that have developed over a long period of time, we try to judge the quality of the predictions through comparison with other qualitative and quantitative methods. We have selected for this analysis the city of Beirut as it has seen both gradual and rapid growth, as well as ruptures, reconfiguration, and partial sutures. The fabric is a mixture of grids, radial intersections, and topographically inflected pathways.
Spectral clustering is a method of unsupervised machine learning developed primarily within graph theory with additional applications, such as in image processing, where a graph representation is possible. It operates by first representing the graph as a square matrix, L, where there is a row and column for every vertex in the graph. When two vertices, i and j are connected (or adjacent), the entry for the corresponding matrix elements Lij and Lji indicate a connection with a value of -1 in the simplest case, or by a fluctuating value if the connections are differently weighted. Along the diagonal (that is, the case of a vertex’s self-adjacency) is the negative of the value of the sum of the adjacency values, so that every row or column will sum to zero. If the graph is unweighted, this will be the number of links connected to the given node. The resulting matrix is known by the name, Laplacian. By computing the eigenvalues and eigenvectors of the Laplacian, it is possible to achieve an embedding of the initial graph that can be clustered using conventional convex methods like k-means++ [3] but which is informed by the topological connectivity of the network.
Anzeige
It is for this sensitivity to topology that we were initially drawn to spectral clustering as a mechanism for urban analysis. We have covered the construction of the Laplacian matrix, proximity weighting, and eigenvector decomposition in more detail in [4] and will omit it here. Further background can be readily found in [5, 6]. In comparison to our earlier work, which was concerned simply with outlining a process and demonstrating a basic applicability to urban studies, this study will address a single urban situation, the with more detail and attempt to benchmark its conclusions against analyses of the urban fabric that were produced by other methods.
2 Methods
The graph data analyzed here is a simple circulation network based on street or path centerlines, where coincident nodes indicate intersections, including both vehicular and pedestrian modes. This data was originally downloaded from OpenStretMap and simplified to remove redundant points, extraneous subdivisions, and to balance the distribution of nodes (for example, fluid curves with a high number of nodes were reduced to simpler polylines). This prevents accidents of data entry that do not represent significant features of the urban network from unduly biasing the outcomes. The area of study is the urban core of Beirut within the primary ring road. The final graph (Fig. 1) comprises 6600 nodes and 8643 edges.
In [4] we suggested using the ‘elbow method’ to determine the optimal number of clusters, k, for which to calculate. This technique compares the spread of each cluster for a series of possible numbers and identifies the point of diminishing returns, that is when an increase in cluster count stops yielding equally increasing improvements on the outcome. Since then, working on larger, continuous urban environments as opposed to fragments has revealed three shortcomings in this method. Firstly, this process requires computing all cluster options first, as it identifies the most suitable number based on the outcomes. This is computationally expensive on large networks. Secondly, this method tends to yield low cluster counts because the difference between outcomes at large values of k is inherently reduced. This is not necessarily a problem on small extracts, but at the scale of the city, larger values need to be considered. Finally, while the elbow method is frequently used in other fields, urban networks are globally characterized by the fact that they occur in a real space, across a nearly two-dimensional surface, with connections spread out relatively evenly (urban networks are not scale-free, nor do they have nodes with exponentially large numbers of edges). This means that the number of clusters was more influenced by the overall layout of the networks’ shape than by its topology. Rectangular extracts, for example, more often than not returned five clusters distributed in a quincuncial pattern.
Fig. 1.
The geodata after preparation: 6600 nodes, 8643 edges. The edges are coded by length, a distribution plot, at left, shows the relative lengths.
In place of this we are now recommending using the ‘eigen gap’ method. Specifically, this uses the list of eigenvalues, sorted from low to high (a typical output format of many linear algebra code libraries) and identifies the largest gaps between sequential values (Fig. 2). This approach addresses all three points in one move: the eigenvalues are computed one time before clustering regardless of the number of clusters desired, so there is almost no additional calculation needed; the eigen gap method regularly returns a high number of clusters (and diverse alternatives); and the results are given by the topological properties rather than the shape of the boundary.
Anzeige
From the chart shown in Fig. 2, we notice that the best solution is to calculate for 21 clusters, but also that there is a group of three values in 21, 17, and 22 that are close in dimension to one another and notably larger than the rest. We would a priori expect that there could be several similar candidates given that there is no isolation of clusters in the real-world urban network, and therefore alternate boundaries could be equally valid. For the analytical tool to be useful, however, we would like to find that the best candidates are not completely dissimilar to one another. In Fig. 3 we first plot the results for 21 clusters on the Beirut mode. A visual analysis confirms that the clusters do, in fact, largely coincide with identifiable neighborhoods (or the merger of two adjacent) and industrial zones. The scatterplot of silhouette values by cluster lets us evaluate clusters on an individual level. The AUB campus (top left of the map, bright blue, 3rd column of the plot) is a particularly well-defined cluster, which is to be expected as it has limited connections to the city at large and a high density of internal paths. Similarly, Karantina, (second from the right, medium blue) a neighborhood well-known as isolated from the rest of the city, is also strongly defined. The lowest valued clusters are located in the center, and again, this is an expected result, not necessarily indicating poor outcomes, but the result of having many adjacent clusters that are competing for territory.
Fig. 2.
The ordered eigenvalues can be used to identify appropriate cluster counts. Here, the best candidate is for k = 21 clusters. Note that the plot of eigen gaps is not a smooth curve and the value of k = 20 just before the peak is a rather poor candidate.
To develop this assessment further, we next compare the first result with the next two candidates, k = 22, and k = 17. Largely, the initial boundaries are maintained with only an addition of the Sioufi neighborhood as a distinct cluster when increasing from 21 to 22. When reducing from 21 to 17, three pairs of clusters were merged, almost intact into single clusters at Downtown, the Port, and in the south-central part of the map. Additionally, the central cluster of Bachoura-Monot split into its two component neighborhoods and these merged into their opposite neighbors. Apart from these changes, the rest of the borders were virtually unchanged. This suggests an incredibly robust solution and a coherence in the model.
Following on these results, we have also proposed a metric for the persistence of dividing edges across different scenarios, measured by iterating through the list of possible cluster counts and identifying all the edges that span the border between two identified neighborhoods, that is, having endpoints in different clusters. We have weighted the outcomes toward the scenarios recommended by the eigen gap method proportionate to those gap values (Fig. 4). Thus, the results for k = 21, 17, 22 received the strongest weight, followed by 28, 8, 5, and so on. Values that were below the trivial solution of one single cluster were ignored in this calculation.
Fig. 3.
Above: The result of clustering central Beirut with the suggested k = 21. Labels above and below indicate the congruence of these clusters with neighborhoods. Below: A comparison of the impact on cluster borders for the next two candidates, k = 22 (blue) and 17 (orange).
Fig. 4.
Persistent boundary edges in thicker line weight. This analysis also reveals the inverse—gaps of grey that were consistently gathered in the same cluster.
3 Comparisons
In this section, we will make comparisons to three different external sources, the first is historical related to the schism of Beirut during the Lebanese civil war, the second is anthropological gathered from mental maps, and the third an alternate method of computational analysis of pedestrian traffic.
3.1 Green Line
During the height of the civil war, the demarcation line between factions in Beirut was referred to as the “Green Line” because, as a no-man’s land, it became overgrown with wild plants. This was a fuzzy edge more than a strict boundary, but generally followed Damascus Road north through the city and across downtown. Many of the buildings along the green line were heavily damaged during the conflict and extensive demolition and reconstruction followed [7]. The traces of this boundary are still evident walking around the city, but because of Solidere’s rebuilding, we were uncertain if the division would manifest in a spectral analysis. Incredibly, a simple bipartite clustering (Fig. 5) produced a division that largely traces this division, especially along Damascus Road, itself, and then taking a path that crosses downtown passing some of the most prominent urban relics of the war, the Burj El Murr and old Holiday Inn. This result was surprising as there is not an especially strong emphasis on these divisions in either Fig. 3 or Fig. 4.
Fig. 5.
Division into two clusters closely replicates the rift of the civil war Green Line.
The overall silhouette score for a 2-cluster scenario is quite low despite it having a reasonably good eigen gap value (it is the 7th best option in Fig. 2). This will be partly due to the overall size of the clusters, which reduces the cohesion component (the closeness of the cluster). There is also an exclave sector of the blue cluster at top-right within the port area. This sometimes occurs when the k value is artificially low and an outlying area is so detached from its immediate context that it finds itself in the range of another cluster. It is not a desirable result but is usually easy to detect.
3.2 Seeing the City as a Delivery Driver
Within the ‘Refugees as City-Makers’ research project of Beirut Urban Lab, the neighborhoods of Beirut were studied regarding their placement and boundaries as understood in the mental maps of the food delivery drivers [1]. Collected through interviews the identity of the neighborhoods was spatialized through the landmarks that the drivers could list within each neighborhood. The results were not a set of exclusive regions but overlapped or were even nested within larger regions. In comparison with a spectral analysis (Fig. 6), there are in fact, many similarities (ignoring a few clear outliers, like the port extending all the way back inland along the east side of the city) in terms of location and the general boundaries. One of the key differences is the range of sizes. The delivery drivers’ map has a handful of very small areas (e.g. Patriarcat) and some extremely large ones, particularly Achrafieh, which consumes almost the entirety of East Beirut (here rendered in a very light transparency for legibility of the regions behind it).
Fig. 6.
Partial redrawing of the map ‘Delivery Drivers’ Accounts of Neighborhoods’ overlayed with 22-cluster division. Neighborhoods with notably similar definitions are bolded.
3.3 Urban Network Analysis
Finally, we compare to the results of a study co-executed at MIT and AUB [2] on the Walkability and Pedestrian Conditions in Beirut. This study used Urban Network Analysis Toolbox to predict pedestrian load by simulating origin-destination trips between different functional locations, calibrated with observed pedestrian counts. Here, we see some significant departures in what constitutes an urban cluster. Intersections, and especially the radial intersections accumulate great significance as routing points for a large number of trips. The downtown area at place de l'Étoile similarly sees an increased role. Research indicates that spectral analysis can be comparable to random-walk clustering when the Laplacian matrix is constructed correctly. This would have to be explored in parallel with a weighting factor that replicated the assumed pedestrian loads or it is unlikely to produce anything like these patterns.
4 Conclusion and Future Work
This paper endeavored to better uncover the applications of spectral clustering for urban graphs and to construct comparison with other studies in order to judge the appropriateness of its use in urban studies. The results were a significant improvement on earlier research but also suggest new options. Future work will examine wither there is room for more nested scales of belonging than only the cluster as suggested in comparison 2—possibly through the construction of hyperedges around immediate moments (convex spaces)—and testing of more external sources of weighting in the network—demographic, functional, or other built environment data—as suggested in comparison 3.3.
Acknowledgements
All images courtesy the author. Per §2, urban network data was sourced from Open Street Map: openstreetmap.org/copyright.. The figure–ground map in Figs. 3, 5, and 6 was sourced from the Beirut Built Environment Database: beiruturbanlab.com/en/Details/561. Both are distributed under the Open Database License (ODbL).
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.