
MAPSIA: Automatic Pavement Distress Detection for Optimal Road Maintenance Planning

  • Open Access
  • 2026
  • OriginalPaper
  • Book chapter

Abstract

This chapter addresses the application of deep learning-based computer vision systems to automatic pavement distress detection, focusing on the YOLOv5 architecture. The study compiles a diverse dataset of 7099 geo-tagged images covering 13 types of road distress, captured under varied conditions. It compares the YOLOv5 sub-models to identify the most efficient and accurate model for real-time detection. The research introduces a pavement condition index to quantify the impact of the detected distresses and integrates this index into a visualization tool for strategic maintenance planning. The results highlight the effectiveness of YOLOv5l in detecting the different distress types, with a significant reduction in false positives achieved through a rule-based filtering method. The study concludes with a case study that demonstrates the practical application of the system in identifying and prioritizing road maintenance needs, ultimately aiming to optimize road management and maintenance efforts.

1 Introduction

Road infrastructure deteriorates in its physical and functional condition due to long-term heavy traffic, aging of materials, and harsh environmental changes [1]. The extent and level of service of the road network directly influence a country’s economy, with paved road density ranging from 40 km per million inhabitants in low-income economies to 8550 km per million in high-income economies. In the USA, 68% of roads are in poor condition, costing an extra $61B annually, while upgrades would require $130B per year [2]. In addition, vehicle speed decreases by 55% and carbon emissions increase by 2.49% on very poor roads compared to roads in excellent condition [3]. Proactive pavement condition monitoring at shorter intervals could promote small-scale repairs, extending asphalt lifespan, saving up to 80% of pavement rehabilitation costs [4], and lessening the negative environmental impact of rehabilitation, with a life cycle cost reduction of 37% [2]. Thus, a pavement management system is crucial for optimized decision-making on preservation prioritization, methods, timing, and location [5].
These precedents, combined with traditional visual inspections by pavement engineers, which are time-consuming and inefficient given the extensive road networks, create a knowledge gap [6]. Automation through Deep Learning-based (DL) Computer Vision (CV) systems emerges as a pivotal solution. The pipeline of such systems comprises image acquisition, DL-driven road distress recognition, a road surface distress index for impact quantification, assessment through a visualization tool, and strategic conservation decisions. State-of-the-art studies primarily focus on the design of the architecture (processing), mainly for crack-type defects and using private datasets, with no effort put into designing an index to evaluate a study area.
This full-spectrum study compiled frames of various distress categories (repairs, sewage, crack-type, particle detachment) using a speed-agnostic acquisition device, establishing a dataset for DL model benchmarking. After framing the problem as an object detection CV task, we compared YOLOv5 DL architectures, evaluating detection rates per distress type and computational cost to select the best model. To refine post-processing, we implemented a filter method based on visual inspection. Using the optimal model’s outputs, we integrated a pavement condition index into visualization software, devising a strategic maintenance system through a case study analysis.

2 Methodology

2.1 Dataset

The dataset, annotated by pavement specialists, contains 7099 geo-tagged images (640 × 640 × 3) with 13 distress types. The images were captured under various lighting and weather conditions using a drone camera mounted on a vehicle. The train/valid/test split is 70/15/15. The following nominal encoding has been established: D1 (block cracking), D2 (alligator cracking), D3 (diagonal crack), D4 (longitudinal crack), D5 (irregular crack), D6 (transverse crack), D7 (D4 between lanes), D8 (patch), D9 (pothole), D10 (sewer), D11 (manhole), D12 (raveling), and D13 (sealed crack).
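The encoding and the split can be written down as a small sketch. The class names come from the text above; the random shuffling, the seed, and the helper name `split_dataset` are assumptions made for illustration, since the chapter does not state how images were assigned to each subset.

```python
import random

# Nominal encoding of the 13 distress types, as listed in the text.
DISTRESS_CLASSES = {
    "D1": "block cracking",
    "D2": "alligator cracking",
    "D3": "diagonal crack",
    "D4": "longitudinal crack",
    "D5": "irregular crack",
    "D6": "transverse crack",
    "D7": "longitudinal crack between lanes",
    "D8": "patch",
    "D9": "pothole",
    "D10": "sewer",
    "D11": "manhole",
    "D12": "raveling",
    "D13": "sealed crack",
}


def split_dataset(items, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle items and split them into train/valid/test subsets.

    Hypothetical helper: a seeded random split is an assumption,
    not the procedure documented in the chapter.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(ratios[0] * len(items))
    n_valid = round(ratios[1] * len(items))
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test


# Image identifiers 0..7098 stand in for the 7099 geo-tagged images.
train, valid, test = split_dataset(range(7099))
print(len(train), len(valid), len(test))
```

With 7099 images and a 70/15/15 ratio, this rounding yields 4969, 1065, and 1065 images respectively.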

2.2 Metrics

Precision (P) is the ratio of correctly detected instances to the total predicted positives. Recall (R) is the ratio of correctly detected instances to all actual positives. Mean Average Precision (mAP), derived from the area under the P-R curve, serves as our benchmark for assessing the YOLOv5 family. A higher mAP indicates superior model performance.
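These definitions can be illustrated with a minimal sketch. It assumes one common convention: per-class Average Precision computed as the area under the precision envelope (all-point interpolation), and mAP as the mean over classes. The chapter does not state which interpolation or IoU thresholds were used, so the function names and that choice are illustrative only.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(recalls, precisions):
    """Area under the P-R curve (all-point interpolation).

    `recalls` must be sorted in ascending order, with `precisions`
    holding the precision measured at each recall level.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision to its right,
    # which makes the curve monotonically non-increasing.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall levels.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)


p, r = precision_recall(tp=8, fp=2, fn=2)
ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(p, r, ap)
```

In this toy case, 8 correct detections with 2 false alarms and 2 misses give P = R = 0.8, and the two-point curve has an area of 0.75.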

2.3 Experimental Setup

Object detection is a CV task where multiple instances of a given class are detected and located within an image. Recent object detectors are DL-based architectures that receive raw images and output the following data per detected element: class (distress type), confidence probability (how confident the detector is that the object is in that location) and the coordinates of the bounding box.
In this research, the YOLOv5 [7] (You Only Look Once, version 5) DL architecture was selected to automatically identify road defects from images. YOLOv5 is divided into three blocks: backbone, neck, and head. The backbone is Cross-Stage Partial Darknet53, which performs feature extraction, enhances feature expression, and improves running speed. The neck is a Path Aggregation Network, which up-samples the feature maps produced by the backbone's successive convolutional down-sampling to generate new feature maps at different scales for detecting small, medium, and large objects. The head, made of convolutional layers, predicts the confidence score, the class, and the bounding box. YOLOv5 was adopted for its high detection rate and swift inference, crucial for optimizing real-time image flow from the acquisition camera to the pavement management tool. The specifications of the dataset can be found in Sect. 2.1.
First, a comparative analysis was conducted between the YOLOv5 sub-models (n, s, m, l, x), which share the same architecture but differ in complexity, in order to balance accuracy against prediction time. All of them share the configuration shown in Table 1.
Table 1. YOLOv5 family setup. Augmentation techniques are CA (classical augmentations: flipping, translation, rotation, scaling, shearing and perspective) and Mosaic. The loss is the weighted sum of the three terms.

| Parameter               | Value       |
|-------------------------|-------------|
| Learning rate           | 0.01        |
| Optimizer               | SGD         |
| Momentum                | 0.937       |
| Weight decay            | 0.0005      |
| Augmentation            | CA + Mosaic |
| Epochs                  | 300         |
| Classification loss     | Focal       |
| Objectness loss         | Focal       |
| Regression loss         | CIoU        |
| Confidence threshold    | 0.25        |
| IoU threshold           | 0.45        |
| Early-stopping patience | 7           |
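Table 1 lists CIoU as the regression loss. As a reminder of what that term measures, the sketch below computes the Complete IoU between two boxes following its published definition (IoU, minus a normalized center-distance penalty, minus an aspect-ratio penalty). It is a plain-Python illustration with hypothetical function names, not the YOLOv5 implementation, and boxes are assumed to be given as (x1, y1, x2, y2).

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain IoU.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter + eps
    iou = inter / union

    # Squared distance between box centers.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    # Squared diagonal of the smallest box enclosing both.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred_box, target_box):
    """Regression loss: 0 for a perfect match, larger for worse boxes."""
    return 1.0 - ciou(pred_box, target_box)


print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike plain IoU, the loss keeps growing as two non-overlapping boxes move apart, which gives the regression a useful signal even when the prediction misses the target entirely.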
After identifying the best model, a filter method was designed to minimize illogical distress detections and reduce false positives. Visual assessment showed that isolated crack detections (D3-D7) were nested within meshed cracks (D1/D2), which is consistent with D1/D2 being aggregates of D3-D7 and potentially amplifies ambiguity. Smaller D1/D2 detections nested inside larger D1/D2 detections were also observed. Additionally, single cracks were occasionally detected as multiple distinct crack classes, likely due to geometric distortion induced by the camera perspective. In response to these observations, a post-processing mechanism based on Intersection over Union (IoU) and confidence scores was implemented to reduce such detections.
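The three observations above suggest rules of the following kind. This is only a sketch under stated assumptions: the chapter does not give the exact rules or thresholds, so the `Detection` record, the containment test used for nesting, and the values `contain_thr=0.9` and `iou_thr=0.5` are all hypothetical.

```python
from dataclasses import dataclass

MESHED = {"D1", "D2"}                      # meshed cracks
ISOLATED = {"D3", "D4", "D5", "D6", "D7"}  # isolated cracks


@dataclass(frozen=True)
class Detection:
    cls: str     # distress code, e.g. "D4"
    conf: float  # detector confidence in [0, 1]
    box: tuple   # (x1, y1, x2, y2) in pixels


def area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a, b):
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def containment(inner, outer):
    """Fraction of `inner` that lies inside `outer` (assumed nesting test)."""
    a = area(inner)
    return intersection(inner, outer) / a if a > 0 else 0.0


def filter_detections(dets, contain_thr=0.9, iou_thr=0.5):
    """Drop crack detections that the rules below consider redundant."""
    kept = []
    for d in dets:
        drop = False
        for o in dets:
            if o is d:
                continue
            # Rules 1 and 2: a crack box nested inside a larger meshed-crack box.
            nested = (d.cls in MESHED | ISOLATED and o.cls in MESHED
                      and area(o.box) > area(d.box)
                      and containment(d.box, o.box) >= contain_thr)
            # Rule 3: one crack reported under several isolated classes;
            # only the most confident label survives.
            duplicate = (d.cls in ISOLATED and o.cls in ISOLATED
                         and d.cls != o.cls
                         and iou(d.box, o.box) >= iou_thr
                         and o.conf > d.conf)
            if nested or duplicate:
                drop = True
                break
        if not drop:
            kept.append(d)
    return kept


detections = [
    Detection("D2", 0.8, (0, 0, 100, 100)),
    Detection("D4", 0.6, (10, 10, 30, 90)),    # nested in the D2 box
    Detection("D1", 0.5, (50, 50, 90, 90)),    # smaller meshed box inside D2
    Detection("D4", 0.7, (200, 0, 220, 100)),
    Detection("D6", 0.4, (201, 0, 221, 100)),  # same crack, weaker label
    Detection("D9", 0.9, (20, 20, 40, 40)),    # pothole: not affected
]
filtered = filter_detections(detections)
print([d.cls for d in filtered])
```

Non-crack classes such as potholes pass through untouched, since the observations in the text concern only the crack families.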
Based on the refined model outputs, the following area-weighted pavement condition index is proposed:
$$ \mathrm{Index}^{\mathrm{image}} = \min\left( \sum\nolimits_{i = 1}^{N} A_{i} \, \alpha_{i}^{k} \, c_{i} ,\; 1 \right) $$
(1)
For each image, the index is the sum, over the N detections, of the product of the area ($A_i$), the confidence score ($\alpha_i$), and a hyperparameter emphasizing small yet critical defects ($c_i$), capped at 1. The index therefore ranges from 0 to 1, where a higher value indicates poorer pavement condition. Finally, a tool was devised to analyze a road segment (case study).
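Eq. (1) can be evaluated with a few lines of code. The sketch below makes several assumptions that the chapter does not spell out: areas are expressed as a fraction of the image area, the exponent k defaults to 1, and the weights c_i in the example are placeholder values.

```python
def image_index(detections, k=1.0):
    """Area-weighted pavement condition index for one image, following Eq. (1).

    `detections` is an iterable of (area, confidence, c) tuples, where
    `area` is assumed to be the box area as a fraction of the image,
    `confidence` is the detector score, and `c` is the per-defect
    hyperparameter. The value of `k` is a placeholder.
    """
    total = sum(area * (confidence ** k) * c for area, confidence, c in detections)
    return min(total, 1.0)  # cap the index at 1


# Two hypothetical detections: a larger, milder defect and a small, critical one.
example = [
    (0.10, 0.9, 2.0),
    (0.05, 0.8, 4.0),
]
print(round(image_index(example), 2))
```

Here the small defect contributes almost as much as the larger one (0.16 versus 0.18) because of its higher weight, which is the behaviour the hyperparameter is meant to produce.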

3 Results and Discussion

Fig. 1.
mAP by road distress type for the different YOLOv5 sub-models.
For the isolated crack family (D3-D7), YOLOv5l generally exhibits the best performance. These results are noteworthy since these classes have a similar appearance but different geometry. The best-detected objects, regardless of the model, are D10/D11, which belong to the sewer family, likely due to their distinctive appearance against the pavement background. Regarding the repair family, D8 performance is model-agnostic, while for D13, despite it being a minor defect, YOLOv5l/x outperform the rest. Concerning the meshed crack subset, YOLOv5l performs slightly better, and YOLOv5x excels in D2. The simpler architectures n/s/m, with inference times of 3.7–5.5 ms/image and sizes of 3.9/14.4/42.2 MB, generally produce inferior results. Given the comparable detection performance of YOLOv5l/x, YOLOv5l was selected to meet the inference demands of recording at 30 frames per second at a vehicle speed of 120 km/h (l: 8.7 ms, x: 17.3 ms) (Figs. 1, 2 and 3).
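The timing argument can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text (30 frames per second, 120 km/h, 8.7 ms and 17.3 ms per image); the variable names are illustrative, and the hardware on which the inference times were measured is not stated in the chapter.

```python
FPS = 30          # recording rate from the text
SPEED_KMH = 120   # vehicle speed from the text
INFERENCE_MS = {"YOLOv5l": 8.7, "YOLOv5x": 17.3}  # per-image times from the text

frame_budget_ms = 1000 / FPS              # time available per frame
metres_per_frame = SPEED_KMH / 3.6 / FPS  # road distance covered between frames

report = {}
for name, t_ms in INFERENCE_MS.items():
    report[name] = {
        "max_fps": 1000 / t_ms,                 # frames the model could process per second
        "headroom_ms": frame_budget_ms - t_ms,  # slack left within one frame period
    }

print(f"budget per frame: {frame_budget_ms:.1f} ms, {metres_per_frame:.2f} m travelled")
for name, r in report.items():
    print(f"{name}: up to {r['max_fps']:.0f} fps, {r['headroom_ms']:.1f} ms headroom")
```

At 30 frames per second the budget is about 33.3 ms per frame, during which the vehicle advances roughly 1.11 m. On the quoted timings YOLOv5l could process about 115 images per second and YOLOv5x about 58, so YOLOv5l leaves more slack, which may matter on slower embedded hardware.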
Fig. 2.
False positives of YOLOv5l before and after applying the rule-based filtering.
A false positive is a detection with no corresponding annotation. After applying the rule-based filtering derived from visual inspection of the detections, false positives for defects D4/D6, which had the highest false positive rates, dropped by about 27%. Meanwhile, the reduction for the other defects ranged between 20% and 45%. Thus, this method enhances the reliability of road defect detection results in line with human interpretation.
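Counting false positives per class, and the relative reduction after filtering, can be sketched as follows. The matching criterion (same class and IoU of at least 0.5 with a not yet matched annotation) is an assumption, since the chapter does not state how detections were paired with annotations; the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_false_positives(detections, annotations, iou_thr=0.5):
    """Count detections without a matching annotation, per class.

    Both inputs are lists of (cls, box). Each annotation can be matched
    at most once; detections are assumed to be sorted by confidence.
    """
    unmatched = list(annotations)
    false_positives = {}
    for cls, box in detections:
        match = next(
            (ann for ann in unmatched if ann[0] == cls and iou(box, ann[1]) >= iou_thr),
            None,
        )
        if match is None:
            false_positives[cls] = false_positives.get(cls, 0) + 1
        else:
            unmatched.remove(match)
    return false_positives


def relative_reduction(before, after):
    """Fractional drop in false positives, e.g. 0.27 for a 27% reduction."""
    return (before - after) / before


annotations = [("D4", (0, 0, 10, 100))]
detections = [("D4", (1, 0, 11, 100)), ("D6", (0, 0, 10, 100)), ("D4", (50, 0, 60, 100))]
print(count_false_positives(detections, annotations))
```

In this toy example the first D4 box matches the annotation, while the D6 box (wrong class) and the second D4 box (no overlap) each count as one false positive.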
Fig. 3.
Distribution of detections by category of defect for the case study and geospatial representation. Image produced using Leaflet.js (JavaScript), an open source software.
Fig. 4.
Specific section with pavement distress detections by defect category, detection confidence, and contribution to distress index. Color-index equivalence: green (<0.2), blue (0.2–0.4), yellow (0.4–0.6), orange (0.6–0.8), red (>0.8). Image produced using Leaflet.js (JavaScript), an open source software.
Training YOLOv5l takes around 2.3 h. To achieve real-time detection and reduce image storage, the trained model could be deployed on a nano AI computer connected to the camera.
Longitudinal cracking is the most common defect (19.64%), but potholes and alligator cracking are more concerning for road maintenance (8.78% and 10.58%, respectively). The urban center of Santander, the airport area, and the Guarnizo industrial park have the highest number of pavement defects. Images with detections and color-coded index-based markers allow identifying concerning areas, such as the one in Fig. 4 with recent maintenance (3 D8s) and 2 D2s (severe damage). Thus, this pavement management tool enables strategic maintenance planning.
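The color coding of the markers follows the color-index equivalence given in the caption of Fig. 4. A minimal sketch of that mapping is shown below; the function name is hypothetical, and the handling of values that fall exactly on a boundary (0.2, 0.4, 0.6, 0.8) is an assumption, since the caption does not specify it.

```python
def index_to_color(index):
    """Map a pavement condition index in [0, 1] to a marker color.

    Bins follow the Fig. 4 caption: green (<0.2), blue (0.2-0.4),
    yellow (0.4-0.6), orange (0.6-0.8), red (>0.8). Boundary values
    are assigned to the bin that starts at them, except 0.8, which
    stays orange because red is defined as strictly above 0.8.
    """
    if not 0.0 <= index <= 1.0:
        raise ValueError("index must lie in [0, 1]")
    if index < 0.2:
        return "green"
    if index < 0.4:
        return "blue"
    if index < 0.6:
        return "yellow"
    if index <= 0.8:
        return "orange"
    return "red"


for value in (0.05, 0.3, 0.5, 0.7, 0.95):
    print(value, index_to_color(value))
```

A visualization layer could call such a function once per geo-tagged image to pick the marker color shown on the map.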

4 Conclusions

A Computer Vision system has been designed to collect road images and automatically detect 13 types of pavement distress. After analyzing the YOLOv5 models, YOLOv5l was selected for its high recognition rate and reduced inference time. Filtering the detections with our post-processing pipeline reduced the false positives. A pavement condition index was designed to identify the most affected areas. MAPSIA is a full-scale system that will enable more efficient road management and conservation.

Acknowledgments

This work has received funding under grant TED2021-129749B-I00.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Authors
Saúl Cano-Ortiz
Lara Lloret Iglesias
Pablo Martinez Ruiz del Árbol
Daniel Castro-Fresno
Pedro Lastra-González
Carlos Real-Gutiérrez
Eugenio Sainz-Ortiz
Copyright year
2026
DOI
https://doi.org/10.1007/978-3-032-06763-0_10

References

1. Ai, D., Jiang, G., Lam, S.K., He, P., Li, C.: Computer vision framework for crack detection of civil infrastructure—a review. Eng. Appl. Artif. Intell. 117, 105478 (2023). https://doi.org/10.1016/j.engappai.2022.105478
2. El Hakea, A.H., Fakhr, M.W.: Recent computer vision applications for pavement distress and condition assessment. Autom. Constr. 146, 104664 (2023). https://doi.org/10.1016/j.autcon.2022.104664
3. Setyawan, A., Kusdiantoro, I.: The effect of pavement condition on vehicle speeds and motor vehicles emissions. Procedia Eng. 125, 424–430 (2015). https://doi.org/10.1016/j.proeng.2015.11.111
4. Kheradmandi, N., Mehranfar, V.: A critical review and comparative study on image segmentation-based techniques for pavement crack detection. Constr. Build. Mater. 321, 126162 (2022). https://doi.org/10.1016/j.conbuildmat.2021.126162
5. Sholevar, N., Golroo, A., Esfahani, S.R.: Machine learning techniques for pavement condition evaluation. Autom. Constr. 136(2021), 104190 (2022). https://doi.org/10.1016/j.autcon.2022.104190
6. Cano-Ortiz, S., Pascual-Muñoz, P., Castro-Fresno, D.: Machine learning algorithms for monitoring pavement performance. Autom. Constr. 139(2021), 104309 (2022). https://doi.org/10.1016/j.autcon.2022.104309
7. Jocher, G.: YOLOv5 by Ultralytics (2020). https://doi.org/10.5281/zenodo.3908559