Published in: Innovative Infrastructure Solutions 1/2024

Open Access 01.01.2024 | Practice-oriented Paper

Smart monitoring of road pavement deformations from UAV images by using machine learning

Authors: Heba Basyouni Ibrahim, Mahmoud Salah, Fawzi Zarzoura, Mahmoud El-Mewafi



Abstract

Monitoring road pavement deformation is a central maintenance task, particularly for potholes and cracks, which are the most common types of pavement surface distress. To make pavement inspections more effective, non-destructive remote sensing data are increasingly used to detect pavement distress. This article presents an approach for extracting surface cracks from unmanned aerial vehicle (UAV) images using machine learning, focusing on the data pre-treatment processes. The objective of this study is to evaluate the effectiveness of decision tree (DT) classification in detecting cracks. Model performance is evaluated on two primary criteria: model validation and testing. The study also examines the extent to which post-classification operations, edge detection, and morphological processes affect crack identification and classification accuracy. The digital orthomosaic was generated using a technique commonly referred to as backward projection. The study uses a fusion of gray-level co-occurrence matrix (GLCM) attribute data and RGB images. Cracks are detected using a classification tree (CT)-based approach with an overall classification rate of 86%. Morphological processing using the closed image then achieved an overall classification rate of 96%. The Canny edge detection algorithm proved effective for detecting cracks from UAV images, providing valuable decision support for road maintenance.

Introduction

Road networks are an important part of the modern world and contribute significantly to civilization. Without adequate road networks, people cannot conduct their commerce and other activities. Wear is the main reason asphalt surfaces break down over time: pavement deterioration accelerates with the fourth power of the axle loads of the vehicles using it [1]. Early pavement damage includes four kinds of cracks: longitudinal, transverse, alligator, and block. If these initial defects are not addressed, potholes emerge and make the roadway less safe. Repairing potholes, for example, may cost more than repairing cracks. Pavement monitoring and evaluation are therefore critical for keeping repair costs low and the roadway in good condition. Various kinds of asphalt pavement damage can arise during its lifespan; climate change and traffic congestion are two factors that contribute to these problems. Field surveys are both time-consuming and costly. Remote sensing offers a variety of methods with high temporal and spatial resolution that are particularly helpful for assessing transportation systems. These non-destructive approaches use unmanned aerial vehicles (UAVs), satellites, airplanes, and ground vehicles [2].
Radar is commonly used to monitor landslides and find surface cracks. However, because the geology of arid and semiarid areas is complex, the ground surface varies considerably and soil crusts and vegetation are irregularly distributed. Satellite imagery can also be used to find cracks, but it is not very accurate. UAVs, on the other hand, have clear benefits, such as high efficiency, flexibility, high resolution, and low operating cost. Their precision can reach a few centimeters, which makes them well suited to collecting information about surface cracks [3].
Edge detection, threshold segmentation, supervised classification, and manual image interpretation are the most common ways to find surface cracks in UAV images. In addition, some researchers are using newer machine learning methods to extract cracks in buildings and roads [4]. Current crack extraction methods rely primarily on UAV photographs, which always contain some incorrect pixels because of the complexity of the background. These pixels distort the estimation and evaluation of crack attributes and do not provide accurate data for studying how surface cracks form and how hazardous they are. At the same time, much recent research has focused on finding surface cracks in UAV images, while few studies address how to derive crack feature information. Crack feature information has not been carefully explained in terms of what it is, how it is calculated, and how it is used to prepare an image [5].
Dadrasjavan et al. [6] stated that crack detection is commonly performed with airborne systems and, more recently, UAV-based imaging systems, because satellite imagery is limited in spatial resolution. Crack detection strategies in image analysis typically involve four stages: preprocessing, segmentation, classification, and enhancement. In the preprocessing stage, basic enhancement methods such as noise reduction, smoothing, sharpening, and edge detection are applied, and more advanced approaches are used to eliminate misleading and obstructive objects such as cars, foliage, shadows, signs, and markings. In the segmentation stage, candidate crack primitives are extracted mostly by assessing how closely they resemble edge elements. Classification techniques are then applied to delineate the crack region, and the results are finally improved, mostly by morphological filtering.
Ersoz et al. [7] introduced a method for automatically detecting cracks by training artificial neural networks. The algorithm achieved a classification success rate of 79.9% when distinguishing between crack and non-crack classes, and 73.3% when distinguishing among three categories: cracks, non-cracks, and white lines. The method was found to be effective for monitoring steel pavement surfaces; for concrete surfaces, however, its effectiveness decreases significantly.
Kim et al. [8] introduced a crack detection technique that combines various image analysis algorithms, using both imaging and ultrasonic sensors for distance measurement. The study primarily examined individual cracks, with the main aim of quantifying their length and width. The proposed approach detected cracks wider than 0.1 mm, with a maximum length prediction error of 7.3%.
Cubero et al. [9] introduced a method for detecting cracks by utilizing edge detection and morphological procedures. Following the preprocessing of geographical data to reduce noise and smooth the data, Canny edge detection is implemented in this study. Morphological closing is used to improve the results by removing gaps. Ultimately, by employing a decision tree, an impressive success rate of 88% is attained.
Ersoz et al. [7] introduced a technique for detecting cracks in a low-traffic road stretch. The method involves two steps, segmentation and classification, applied to photographs. After thresholding, the remaining regions were processed with a closing operator. An SVM classifier then differentiates between crack and non-crack regions based on their geometric characteristics. An accuracy of 95% was attained when inspecting concrete surfaces at low flight altitude.
Saad et al. [10] evaluated road surface ruts and potholes using a UAV. The workflow comprised site exploration and design, data collection, data processing, and data analysis. Information on rut and pothole extraction accuracy was collected at different flight heights, and structure-from-motion photogrammetric software was used to analyze the data. Accuracy was evaluated solely by comparing real and measured values for the sampled ruts and potholes. Low altitude outperformed high altitude. The study concluded that ruts and potholes can be extracted from multirotor UAV images and measured precisely, improving road condition monitoring.
Pan et al. [11] applied machine learning (ML) techniques to multispectral drone images in order to differentiate between intact and damaged asphalt. For both RGB and multispectral images, SVM, ANN, and RF techniques were analyzed and compared. RF had the highest precision of the three techniques when applied to multispectral images. The spatial accuracy of the asphalt images was found to be a crucial factor in the effectiveness of the classifier and its feature set.
Koch et al. [12] examined asphalt pavement photographs for potholes. Field data gathering using approaches like those used in past studies was laborious and costly. Using a MATLAB prototype, researchers examined 120 pothole, fracture, and patch photographs from earlier investigations. They additionally employed fast-speed fish-eye vehicle photographs. Algorithms identified roadway faults after camera identification. These algorithms detected 86% of potholes.
Zhang et al. [5] proposed a method for extracting cracks by training machine learning algorithms on UAV image data, together with methods for estimating crack feature information. Their strategy successfully avoids interference from both soil crust and vegetation. By adding the dispersion rate, the approach can better characterize the regional distribution of cracks. The calculated crack feature information is close to field survey results, indicating that the approach is reliable for quantifying crack feature information.
The study in [9] presented a comprehensive methodology for identifying and classifying pavement cracks. The approach uses imagery obtained from road-survey vehicles and offers a rapid and efficient solution. Morphological operations are applied to reduce noise, and cracks are extracted by dynamic thresholding. The evaluation reported a success rate of 95%.
Zhang et al. [13] suggested using 3D laser-based scanning and an automatic defect identification strategy instead of traditional 2D and subjective methods to find both small (cracks) and large (deformations) faults in asphalt that cannot be seen. In order to locate potential cracks and the places of support for all distortions, a sparse processing approach was developed. With a fault identification rate of over 98%, the algorithm was able to accurately pinpoint the precise location and classify the data on the defects.
This work is primarily concerned with identifying and evaluating road defects, including cracks and potholes on road surfaces. Detection uses point clouds, digital surface models (DSMs), and orthomosaics derived from UAV images. The main goal and contribution of this work is to examine road condition using a 3D model created from UAV data at a reasonable cost and a suitable flight height. The flight height and the type of aircraft are important factors in image accuracy, because the output pixel size depends on the height. To improve crack detection accuracy, the methodology combines a machine learning model with gray-level co-occurrence matrix (GLCM) feature extraction. The GLCM attributes were derived from the images before classification with the decision tree technique, which is expected to greatly improve the accuracy of defect detection. Various techniques were employed to identify cracks in the pavement image, including enhancement, Canny edge detection, morphological operations such as opening and closing, and the connection method, as illustrated in Fig. 2. Enhancement increases the contrast of each image, facilitating conversion from a grayscale image to a binary image by thresholding. Morphological operations improve the delineation of cracks. The remainder of the study is organized as follows. First, the study area is introduced. Next, the proposed methodology is explained. The results are then examined and analyzed. Finally, the conclusions are summarized in the concluding section.

Materials and methods

Data sources and methodology of research

Source of data

The study was conducted in Mansoura City, Egypt. Figure 1 displays the orthomosaic image from the unmanned aerial vehicle (UAV). Table 1 lists the parameters of the UAV image information.
Table 1
Parameters of the unmanned aerial vehicle (UAV) image information

| Parameter | Value |
| --- | --- |
| Type of data | Multispectral image |
| Date of the flight | 1 Mar 2020 |
| Absolute accuracy | Horizontal 3 cm, vertical 5 cm |
| UAV model | DJI Phantom 4 RTK |
| Focal length | 8 mm |
| Altitude range | 0–33 feet (0–10 m) |
| Flight height | 100 m |

Methodology for crack detection proposal

This article presents an innovative technique for extracting surface cracks from unmanned aerial vehicle (UAV) images using machine learning. Figure 2 depicts the workflow. First, a UAV flight was carried out to capture the details of the study area. The aerial photographs were then processed with photogrammetric software. Processing starts with image alignment, followed by generation of a dense point cloud and mesh. The final stages produce a digital surface model (DSM), a digital terrain model (DTM), and an orthomosaic. Image attributes are then created, and the UAV image and attributes are merged into one file. Cracks are detected and extracted from the UAV images using machine learning, and the accuracy of the results is then assessed (Table 2).
Table 2
Complete range of potential attributes derived from the UAV subset image
https://static-content.springer.com/image/art%3A10.1007%2Fs41062-023-01315-2/MediaObjects/41062_2023_1315_Tab2_HTML.png

The process of data pre-processing

Initial processing

The use of UAVs for generating photogrammetric data has become a widely preferred technology. Consequently, a diverse array of software applications has been developed to support flight planning, monitoring, and interaction between the controller and the UAV. Automatic flight software operates without human intervention, except in situations involving safety and security. Flight plans are formulated by choosing the region to be mapped. The image overlap and flight altitude are then set according to the required data resolution. To improve the precision of image alignment and obtain a more complete point cloud, it is advisable to include oblique images alongside the nadir (vertical) photographs. The flying height is a crucial parameter in drone terrain mapping because it directly affects the reliability and precision of the generated DEMs and orthomosaic images. The flight height in this study was set at 100 m, which was considered appropriate for the purpose of the study and yielded a pixel size of 2 cm × 2 cm.
The quality of pavement surface photographs is typically diminished due to several factors, including variations in lighting circumstances such as sunny or cloudy weather, the presence of random grainy textures, uneven lighting, irregular shadows, pavement markings, watermarks, tire marks, oil stains, and so on.
These factors greatly influence crack detection by image processing. The primary objective of image preprocessing is to mitigate their adverse effects and thereby improve the overall efficacy of image processing. The study used Gaussian spatial filtering and the top-hat transform to preprocess the pavement photographs.
The UAV photographs obtained during the research period were processed with the Pix4DMapper program (available at www.pix4d.com) to produce an orthomosaic, a single large image. During the photogrammetric calculation, the captured images are combined to produce a digital orthographic image. The structure from motion (SfM) method begins by identifying tie points through image matching, which is usually carried out automatically by point detection and matching algorithms. A self-calibrating bundle block adjustment is then carried out, first to determine the exterior and interior orientation parameters of the images and then to determine the three-dimensional coordinates of the tie points. Next, a dense matching method is used to produce a digital elevation model of the region. The digital orthomosaic was created by a method known as backward projection.

Attribute generation

Crack extraction from UAV photographs has the advantages of high resolution, maneuverability, efficiency, and low operating cost. However, because of the complexity of the background, crack extraction techniques based on UAV photographs always include some incorrect pixels. Cutting the image increases the number of photographs, but it is a necessary step because it simplifies the background of the UAV image before crack extraction and re-splicing. It is logical to search for cracks only in the sub-images that actually contain cracks, so machine learning is used to classify the sub-images into two groups. The sequential stages of the methodology are depicted in Fig. 2. First, the UAV image is divided into smaller images in ENVI (Environment for Visualizing Images). Second, features of each sub-image are extracted for individual pixels and used as input to the classification trees. A set of 24 candidate attributes was chosen. Because of how the texture formulae derived from the grey-level co-occurrence matrix (GLCM) [14] are constructed, several of them are strongly correlated [15]. In this research, only 8 of the 24 attributes were found to be uncorrelated with each other, and these were used for classification. The attributes were calculated from the GLCM as described by [16]. A 3 × 3-pixel window was employed to construct the GLCM and calculate texture. According to [17], large window sizes often decrease the producer’s accuracy.
This accuracy is the ratio of pixels correctly classified in a specific class to the total number of pixels that should be classified in that class. The decrease is primarily attributed to between-class differences at class borders: edge pixels, characterized by high texture values, tend to be misclassified. Moreover, for a window of M × M pixels, a strip (M−1)/2 pixels wide around the image remains empty. The conventional remedy is to fill these edge pixels with the nearest texture estimate. According to [18], edge effects can be challenging in classification.
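To make the texture calculation concrete, the following is a minimal sketch of a symmetric, normalised GLCM for a single 3 × 3 window, together with three commonly used texture measures. The function names, the horizontal one-pixel offset, the three grey levels, and the choice of measures are illustrative assumptions; the study's 24 attributes are not reproduced here.

```python
from collections import Counter


def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalised co-occurrence matrix for one window.

    The offset (dx, dy) and the number of grey levels are assumptions
    chosen for illustration only.
    """
    counts = Counter()
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[(i, j)] += 1
                counts[(j, i)] += 1  # count each pair in both directions
    total = sum(counts.values())
    return [[counts[(i, j)] / total for j in range(levels)] for i in range(levels)]


def texture_features(p):
    """Contrast, homogeneity and angular second moment of a normalised GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells),
        "angular_second_moment": sum(p[i][j] ** 2 for i, j in cells),
    }


def border_width(m):
    """Width of the unfilled strip left by an M x M window: (M - 1) / 2."""
    return (m - 1) // 2


# Hypothetical 3 x 3 window quantised to three grey levels (0, 1, 2).
window = [[0, 0, 1],
          [0, 0, 1],
          [2, 2, 2]]
p = glcm(window, levels=3)
features = texture_features(p)
```

For the 3 × 3 window used in the study, `border_width(3)` gives a one-pixel strip around the image that has to be filled from the nearest texture estimate.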
The shaded cells in Table 2 indicate attributes that are not correlated and can be used for classification. Combining the DSM subset images and the attributes generated from them in one classifier is therefore expected to improve classification in the second step. Figure 3a–h depicts the attributes derived from a sample of subset images that contain cracks in the UAV image. According to [19], incorporating texture attributes is expected to reduce misclassifications that arise from similarities between different features.
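One simple way to arrive at a set of mutually uncorrelated attributes is a greedy filter on pairwise Pearson correlation, sketched below. This is an assumption about how such a selection could be done, not the procedure used in the study; the threshold `max_abs_r`, the attribute names, and the sample values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_uncorrelated(attributes, max_abs_r=0.9):
    """Keep an attribute only if it is weakly correlated with all kept ones.

    max_abs_r is a hypothetical cut-off; the source does not state one.
    """
    kept = []
    for name, values in attributes.items():
        if all(abs(pearson(values, attributes[k])) < max_abs_r for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-pixel attribute samples.
attributes = {
    "mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "variance": [2.0, 4.1, 5.9, 8.0, 10.1],  # almost a multiple of "mean"
    "contrast": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = select_uncorrelated(attributes)
```

In this toy input, "variance" is dropped because it is nearly proportional to "mean", while "contrast" is kept.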

Classification tree (decision tree)

In this study, classification trees are used to obtain a per-pixel classification of the data into two primary classes: cracks (C) and non-cracks (G). The notion of decision trees was developed by [20] in their seminal work published in 1984. Waske [21] provides a concise introduction, whereas [22] offers a comprehensive description. A classification tree is a univariate, non-parametric method built by binary recursive partitioning. In this iterative process, a heterogeneous set of training data containing several classes is progressively divided into more homogeneous subsets by applying a binary splitting rule. The hierarchical subdivision forms a tree structure that can then be used to classify other, similar datasets. A classification tree consists of nodes connected by branches and has three main elements: the root node, which is the starting point of the tree; the non-terminal nodes, which are connected to the root node and other internal nodes through branches; and the terminal nodes, each of which represents a set of pixels assigned to a specific class. The classification rules are derived from training samples. Each split in the tree is typically derived statistically: the rule T at node N is chosen mainly according to a measure of 'impurity'. According to Waske, when every sample in the set at node N belongs to the same class, the node is pure and has an impurity of 0. Conversely, if the classes are evenly distributed among the samples, the impurity is high.
If the if–then condition on an attribute value at a given node is satisfied, the left branch is followed; otherwise, the right branch is selected. The process continues until a node becomes pure, that is, it contains only pixels of a single class, and is designated a terminal node. In the present study, the process started from the root node and used training data to partition pixels into groups according to a binary split rule. When the pixels in a group belonged to the same class, so that the impurity fell to zero, they were aggregated into a terminal node. Otherwise, a non-terminal node was designated and the division continued.
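The splitting step can be illustrated with a short sketch that scores candidate thresholds on a single attribute by weighted impurity. The Gini index is used here as the impurity measure, which is an assumption, since the text does not name the measure; the attribute values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0 for a pure node, larger when classes are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Find the rule 'value <= threshold' with the lowest weighted impurity.

    Samples that satisfy the rule go to the left branch, the rest to the
    right branch. Returns (threshold, weighted_impurity).
    """
    best = (None, gini(labels))
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2  # midpoint between neighbouring attribute values
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical training pixels: one texture attribute and a class label,
# with C = crack and G = non-crack as in the text.
contrast = [0.1, 0.2, 0.3, 0.8, 0.9]
labels = ["G", "G", "G", "C", "C"]
threshold, impurity = best_split(contrast, labels)
```

Here the split between 0.3 and 0.8 separates the two classes completely, so both child nodes are pure and would become terminal nodes.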

The process of pruning decision trees

Pruning is a technique used to prevent overfitting in classification trees. Overfitting happens when the tree captures excessive detail or noise from the training data, resulting in inaccurate predictions. According to [23], pruning can reduce tree size by up to 25%, which makes the tree more efficient. Several pruning techniques have been developed, including reduced-error pruning and minimum-description-length pruning. Numerous studies have compared pruning methods and found minimal variation in performance. The trees were pruned using a tenfold cross-validation procedure, as described in the “Testing and validation of the model” section. In this study, the decision trees were used in their original form to produce the classification results, without conversion into production rules. According to [24], converting a decision tree with at most 92 nodes into a set of production rules improves the average error rate only marginally, by no more than 0.1%, which is insignificant in practical terms. Figure 4 shows a representative example of the study’s pruned classification tree, created by applying the decision tree to data from a subset of test regions.
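As a rough illustration of the tenfold procedure, the sketch below only shows how training samples might be partitioned into ten folds; fitting and pruning the tree on each split is not shown. The function name, the shuffling seed, and the sample count are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # fixed seed for repeatability
    folds = [indices[i::k] for i in range(k)]   # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical example: 25 training pixels split into ten folds.
splits = list(k_fold_indices(25, k=10))
```

Each sample appears in exactly one test fold, so every candidate pruned tree can be scored on data it was not trained on.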

A machine learning-based approach for crack extraction method

Post-classification smoothing

Several factors degrade pavement surface photographs, including variations in brightness (e.g., sunny or cloudy conditions), randomly grainy texture, uneven lighting, shifting shadows, pavement markings, and watermarks. These factors substantially influence crack detection by image processing. Post-classification smoothing mitigates their adverse effects and thereby improves the overall efficacy of image processing. The present study applies Gaussian filtering, median filtering, and standard deviation filtering as post-classification smoothing techniques to the acquired pavement images.
$$G_{{\left( {n,m} \right)}} = \frac{1}{{2\pi \sigma^{2} }}e^{{ - \frac{{n^{2} + m^{2} }}{{2\sigma^{2} }}}}$$
(1)
where n and m are pixel indexes, σ is the standard deviation of the Gaussian filter, and \(G_{(n,m)}\) is the mask element value. The mask is convolved with the original image to create the filtered image. The detector’s noise sensitivity depends on the value of σ (here σ = 4).
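A minimal sketch of how a discrete mask could be built from Eq. 1 is given below. The mask radius and the final normalisation to unit sum are assumptions for illustration; only σ = 4 is taken from the text.

```python
from math import exp, pi


def gaussian_mask(radius, sigma):
    """Sample Eq. 1 on a (2*radius + 1)^2 grid and normalise to unit sum.

    Normalising keeps the mean image brightness unchanged after
    convolution; this step and the radius are illustrative assumptions.
    """
    raw = [[exp(-(n * n + m * m) / (2 * sigma ** 2)) / (2 * pi * sigma ** 2)
            for m in range(-radius, radius + 1)]
           for n in range(-radius, radius + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


mask = gaussian_mask(radius=2, sigma=4.0)
```

The largest weight sits at the centre of the mask (n = m = 0) and the weights fall off symmetrically with distance.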
$$k = {\text{medfilt}}2\left( q \right)$$
(2)
The matrix q is median filtered using the standard 3-by-3 neighborhood, and k is the output of the median filter.
If the input image q is of an integer class, all output values are returned as integers. If the number of pixels in the neighborhood, M × N, is even, some median values may not be whole numbers; in these cases, the fractional parts are discarded. Logical input is handled in the same way.
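The 3-by-3 median filter of Eq. 2 can be sketched in plain Python as follows. Zero padding at the image border is used here on the assumption that it mirrors the default behaviour of `medfilt2`; the input matrix is hypothetical.

```python
def median_filter_3x3(q):
    """3-by-3 median filter; pixels outside the image are treated as 0."""
    rows, cols = len(q), len(q[0])
    k = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbourhood = [
                q[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in range(r - 1, r + 2)
                for cc in range(c - 1, c + 2)
            ]
            k[r][c] = sorted(neighbourhood)[4]  # middle of nine values
    return k


# Hypothetical image with a single noisy pixel (99).
q = [[10, 10, 10, 10],
     [10, 99, 10, 10],
     [10, 10, 10, 10],
     [10, 10, 10, 10]]
k = median_filter_3x3(q)
```

The isolated outlier is replaced by the value of its neighbours, while the zero padding pulls the corner pixels towards 0, which is why values near the border of a median-filtered image should be read with care.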

Canny edge detection

Edge detection is a fundamental image processing technique that identifies points in a digital image where the brightness changes abruptly. Regions with a significant variation in brightness are commonly referred to as edges or boundaries. This study uses edge detection to identify candidate crack edge areas first. Morphological operations are then applied only to those areas, which substantially improves detection efficiency. Conventional edge detection methods include the Sobel, Prewitt, Laplacian of Gaussian (LoG), and Canny operators. The Canny detector first computes intensity gradients and determines the regions of the image with the most pronounced gradients. Non-maximum suppression is then applied to thin the edges by eliminating extraneous pixels that may not be part of an edge. Pixels are classified as edges if their intensity gradient exceeds an upper threshold. A pixel whose gradient lies between the two thresholds is accepted only if it is adjacent to a pixel that exceeds the upper threshold. The equations are given below:
$$G = \sqrt{G_{x}^{2} + G_{y}^{2}}$$
(3)
$$\theta = \operatorname{atan2}\left(G_{y}, G_{x}\right)$$
(4)
$$\text{BW} = \text{edge}\left(I, \text{'Canny'}\right)$$
(5)
The intensity gradient indicates the edge direction in the image, which the Canny detector follows horizontally, vertically, and diagonally. G is the gradient magnitude, \(G_{x}\) is the gradient in the first direction, \(G_{y}\) is the gradient in the second direction, atan2 is the two-argument arctangent operator, and \(\theta\) is the gradient direction. Equations 3 and 4 yield the gradient magnitude and direction; I is the image after applying smoothing filters; and BW is the Canny output.
Finally, edge identification is completed by suppressing all weak edges that are not connected to strong edges. The Canny operator is known for its low error rate: it is designed to detect edges accurately while minimizing false positive responses, and it localizes the detected boundary close to the actual boundary. It is therefore among the most rigorously defined approaches and offers robust and dependable detection.
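Two of the steps described above, the gradient computation of Eqs. 3 and 4 and the double-threshold linking of weak edges, can be sketched as follows. The Sobel kernels, the threshold values, and the toy inputs are assumptions for illustration; smoothing and non-maximum suppression are omitted, so this is not a complete Canny detector.

```python
from math import atan2, degrees, hypot

# Sobel kernels, assumed here as the derivative filters.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient(img, r, c):
    """Gradient magnitude (Eq. 3) and direction in degrees (Eq. 4)
    at an interior pixel (r, c)."""
    gx = sum(SOBEL_X[i][j] * img[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    gy = sum(SOBEL_Y[i][j] * img[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    return hypot(gx, gy), degrees(atan2(gy, gx))


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) linked to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[m >= high for m in row] for row in magnitude]
    stack = [(r, c) for r in range(rows) for c in range(cols) if edges[r][c]]
    while stack:
        r, c = stack.pop()
        for rr in range(max(r - 1, 0), min(r + 2, rows)):
            for cc in range(max(c - 1, 0), min(c + 2, cols)):
                if not edges[rr][cc] and magnitude[rr][cc] >= low:
                    edges[rr][cc] = True
                    stack.append((rr, cc))
    return edges


# Hypothetical image with a vertical step edge.
img = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9]]
g, theta = gradient(img, 1, 1)

# Hypothetical gradient magnitudes: one strong pixel and two weak ones.
edges = hysteresis([[12, 5, 0, 0, 5]], low=4, high=10)
```

In the second example, the weak pixel next to the strong one is kept, while the isolated weak pixel at the end of the row is discarded.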

Morphological opening and closing

Mathematical morphology is a fundamental technique in low-level image processing [25]. In grayscale morphology, the top-hat and bottom-hat transforms are obtained by subtracting one image from another in combination with opening and closing operations. The two transforms serve similar purposes and differ in the objects they target: the top-hat transform is commonly used to detect and enhance light objects against a dark background, while the bottom-hat transform is typically used to detect and enhance dark objects against a light background.
The top-hat operator of function I is mathematically interpreted as the difference between I and its opening operation.
$$T_{{{\text{tophat}}}} \left( I \right) = I - \left( {I \circ b} \right)$$
(6)
The bottom-hat operator of function I is expressed as the result of subtracting I from the closing operation.
$$T_{{{\text{bothat}}}} \left( I \right) = \left( {I \bullet b} \right) - I$$
(7)
Here, I represents the input image and b denotes the structuring element. The size of b is determined primarily by the conversion ratio between pixels and real-world dimensions and by the typical dimensions of cracks. The standard dimension of b is 150 mm.
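A minimal Python sketch of the two transforms on a one-dimensional intensity profile follows. It assumes a flat structuring element implemented as a sliding minimum/maximum window that is truncated at the borders. The function names, the sample profile, and the ground sampling distance of 10 mm per pixel are hypothetical and do not come from the study.

```python
def element_size_px(size_mm, gsd_mm_per_px):
    """Convert a structuring-element size in mm to an odd number of pixels."""
    size = max(1, round(size_mm / gsd_mm_per_px))
    return size if size % 2 == 1 else size + 1


def erode(signal, size):
    """Grayscale erosion: sliding minimum over a flat window of odd size."""
    half, n = size // 2, len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(signal, size):
    """Grayscale dilation: sliding maximum over a flat window of odd size."""
    half, n = size // 2, len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def top_hat(signal, size):
    """Signal minus its opening (erosion followed by dilation)."""
    opened = dilate(erode(signal, size), size)
    return [s - o for s, o in zip(signal, opened)]


def bottom_hat(signal, size):
    """Closing (dilation followed by erosion) minus the signal."""
    closed = erode(dilate(signal, size), size)
    return [c - s for c, s in zip(closed, signal)]


# Hypothetical profile across a pavement row: one dark crack pixel (60)
# on a bright background (200).
profile = [200, 200, 200, 60, 200, 200, 200]
print(bottom_hat(profile, 3))  # highlights the dark crack
print(top_hat(profile, 3))     # no bright detail to highlight
print(element_size_px(150, 10))
```

The bottom-hat output is non-zero only at the dark crack pixel, while the top-hat output stays at zero because the profile contains no narrow bright feature.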
The opening operator is a robust combination of erosion and dilation. When an image is opened, the objects within it are separated. Dilation enlarges the objects in an image, while erosion shrinks them. Opening an image smooths its contours, breaks narrow connections, and removes thin protrusions. The notation A ∘ B denotes erosion followed by dilation, that is, the opening of an image A by a structuring element B.
$$A \circ B = \left( {A \ominus B} \right) \oplus B$$
(8)
Applying dilation and erosion in the reverse order yields a second robust operator, closing. When an image is closed, nearby objects are joined. Like opening, closing smooths contours. In addition, it fuses narrow breaks and long thin gaps, eliminates small holes, and fills gaps in the contour. The notation A • B denotes dilation followed by erosion, that is, the closing of an image A by a structuring element B.
$$A \bullet B = \left( {A \oplus B} \right) \ominus B$$
(9)
Closing is thus the erosion of a dilated image. It joins fragmented components and bridges gaps within objects. One or more dilations are applied first, followed by an erosion.
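The behaviour of opening and closing on a binary image can be sketched in a few lines of Python. The example assumes a 3×3 square structuring element and ignores pixels outside the image border; the helper names and the two small test images are hypothetical.

```python
def _window(image, r, c):
    """Return the in-bounds values of the 3x3 window centred on (r, c)."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]


def dilate(image):
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[int(any(_window(image, r, c))) for c in range(len(image[0]))]
            for r in range(len(image))]


def erode(image):
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[int(all(_window(image, r, c))) for c in range(len(image[0]))]
            for r in range(len(image))]


def opening(image):
    """Erosion followed by dilation."""
    return dilate(erode(image))


def closing(image):
    """Dilation followed by erosion."""
    return erode(dilate(image))


# A thin horizontal crack with a one-pixel break in the middle.
crack = [[0] * 7, [0] * 7, [1, 1, 1, 0, 1, 1, 1], [0] * 7, [0] * 7]

# A solid 3x3 object plus one isolated noise pixel.
noisy = [[0] * 7,
         [0, 1, 1, 1, 0, 0, 0],
         [0, 1, 1, 1, 0, 1, 0],
         [0, 1, 1, 1, 0, 0, 0],
         [0] * 7]

print(closing(crack)[2])  # the break in the crack is bridged
print(opening(noisy)[2])  # the isolated pixel is removed
```

Closing bridges the break in the crack without thickening it, whereas opening removes the isolated pixel and leaves the 3×3 object intact.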

Results and analysis

Assessment of model

Testing and validation of the model

The underlying concept of a classification tree is the recursive division of high-dimensional data into progressively smaller partitions, with the aim of increasing the purity of each partition with respect to class membership. The first step is to grow an overly large tree, using a splitting criterion to produce the most effective partitions. Such large trees typically fit the training data set very well. However, their ability to generalize is limited, resulting in a low rate of correct classification for new patterns. Figure 5 shows a visually complex tree in which a set of rules, such as “X6 < 0.0965483,” assigns each pattern to one of the 15 terminal nodes for classification. To assign an input pattern for a given observation, the classification tree starts at the top node and applies the corresponding rule. If the pattern satisfies the rule, the tree follows the left path; otherwise, it follows the right path.
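The traversal described above can be sketched as follows. Only the rule “X6 < 0.0965483” is taken from the text; the `Node` structure, the single-split tree, and the leaf labels (including which branch leads to the crack class) are hypothetical and serve only to show the mechanics.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A decision node (feature, threshold, children) or a leaf (label)."""
    feature: Optional[int] = None      # 0-based index, so X6 is index 5
    threshold: Optional[float] = None
    left: Optional["Node"] = None      # followed when the rule holds
    right: Optional["Node"] = None     # followed otherwise
    label: Optional[str] = None        # set only on terminal nodes


def classify(node: Node, pattern: Sequence[float]) -> str:
    """Walk from the top node to a terminal node and return its label."""
    while node.label is None:
        if pattern[node.feature] < node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# Hypothetical one-rule tree; the real tree in Fig. 5 has 15 terminal nodes.
tree = Node(feature=5, threshold=0.0965483,
            left=Node(label="crack"), right=Node(label="non-crack"))

print(classify(tree, [0.0, 0.0, 0.0, 0.0, 0.0, 0.05]))
print(classify(tree, [0.0, 0.0, 0.0, 0.0, 0.0, 0.50]))
```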
Breiman [20] proposed iteratively removing branches from the overly large tree to obtain a nested sequence of smaller subtrees. The optimal tree within this sequence is determined by the misclassification rate, which is estimated from either an independent test sample or cross-validation. A smaller subtree may yield a lower error, because some decision rules in the full tree may harm rather than help classification. The tree test results in Fig. 6 were obtained by ten-fold cross-validation: the data were partitioned into ten subsets, and in each fold 10% of the data was reserved for validation while the remaining 90% was used for training. First, the resubstitution error was determined, that is, the percentage of the original observations misclassified by the various subtrees of the initial tree. Cross-validation was then used to estimate the error for trees of different sizes. As Fig. 6 shows, the resubstitution error is overly optimistic: it keeps decreasing as the tree size increases. The cross-validation results, however, indicate that beyond a certain threshold, increasing the tree size increases the error rate. One option is therefore to select the tree with the lowest cross-validation error. Although this choice is acceptable, a simpler tree is preferable if it yields comparable results.
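The ten-fold partitioning can be sketched in Python as below. The function name, the fixed random seed, and the sample count of 100 are assumptions for illustration; the study does not report how its folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]   # k nearly equal subsets
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# With 100 samples, each fold validates on 10% and trains on 90%.
for train, validation in k_fold_indices(100, k=10):
    print(len(train), len(validation))
```

Every sample is used for validation exactly once, so the averaged validation error is an estimate of the error on unseen data, unlike the resubstitution error computed on the training data itself.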
The criterion used here is to select the simplest tree whose cost is within one standard error of the minimum. The cutoff value was calculated as the minimum cost plus one standard error, and the best level is the one that gives the smallest tree under this cutoff (a best level of 0 represents the unpruned tree). The tree was then pruned to this best level, and the estimated misclassification cost was calculated. The pruned classification tree is shown in Fig. 7. The misclassification cost is 1% for this crack data set.
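The one-standard-error rule can be sketched as follows. The cost, standard-error, and leaf-count values are invented for illustration (only the 15 terminal nodes of the unpruned tree come from the text), and `best_pruning_level` is a hypothetical helper rather than a function used in the study.

```python
def best_pruning_level(costs, std_errors, n_leaves):
    """Return the pruning level of the smallest tree within one SE of the minimum.

    Index 0 corresponds to the unpruned tree; higher levels are smaller trees.
    """
    min_level = min(range(len(costs)), key=lambda i: costs[i])
    cutoff = costs[min_level] + std_errors[min_level]
    candidates = [i for i, cost in enumerate(costs) if cost <= cutoff]
    return min(candidates, key=lambda i: n_leaves[i])


# Hypothetical cross-validation results for six pruning levels.
costs = [0.14, 0.13, 0.10, 0.11, 0.20, 0.35]
std_errors = [0.03, 0.03, 0.02, 0.02, 0.04, 0.05]
n_leaves = [15, 11, 8, 5, 3, 1]

level = best_pruning_level(costs, std_errors, n_leaves)
print(level, n_leaves[level])
```

In this made-up example the minimum cost occurs at level 2, but level 3 is chosen because its cost lies under the cutoff and its tree has fewer terminal nodes.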

Results analysis for cracks detection

Post-classification smoothing

The first step of every edge detection algorithm is to reduce image noise by applying a filtering algorithm. Several filtering algorithms exist, among which Gaussian filtering is widely regarded as the most suitable choice for various types of images. After filtering, edge detection is performed on the image with the Canny technique. In this study, the Canny edge detection algorithm was used to identify a wide range of edges within the images. The detector identifies edges by finding the local maxima of the gradient of the function f(x, y), where the gradient is calculated using the derivative of a Gaussian filter. Gaussian, median, and standard deviation filtering were applied; the results are depicted in Fig. 8.
If the gradient value of a pixel exceeds the high threshold, the pixel is classified as an edge pixel. Conversely, if the value is below the low threshold, the pixel is not classified as an edge pixel. A comparison of several algorithms on different input data indicated that the Canny edge detection algorithm consistently yields the best results. The outcomes of applying the Canny algorithm to a sample of images, shown in Fig. 9, support this finding.
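The double-threshold step, together with the rule that weak edges survive only when connected to strong edges, can be sketched in Python as below. The 8-connectivity, the threshold values, and the gradient values are assumptions chosen for illustration, not parameters reported in the study.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (> high) and weak pixels (>= low) connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every strong edge pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] > high:
                edges[r][c] = 1
                queue.append((r, c))
    # Grow outwards through 8-connected neighbours that reach the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc] and magnitude[nr][nc] >= low):
                    edges[nr][nc] = 1
                    queue.append((nr, nc))
    return edges


# Hypothetical gradient magnitudes: one strong pixel (9) and two weak pixels (5).
gradient = [[0, 0, 0, 0, 0],
            [9, 5, 1, 5, 1],
            [0, 0, 0, 0, 0]]
result = hysteresis(gradient, low=4, high=8)
for row in result:
    print(row)
```

The weak pixel next to the strong one is kept, while the isolated weak pixel further along the row is suppressed.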

Morphological operations

In morphological image processing, an image is reconstructed by applying dilation, erosion, opening, and closing operations for a finite number of iterations. In this study, a closing operation was used to detect cracks, and a filled image was used to detect patches and potholes, as shown in Fig. 10. Closing merges narrow discontinuities, removes small holes, and fills gaps in the contour. It works by systematically assigning pixel values in the boundary regions of the objects in an image. The operator is a powerful tool built from the combination of the dilation and erosion operations.

Final cracks, potholes and patches extraction

The main focus of this work was the detection of cracks. To this end, the classified image was converted into a closed image in order to isolate the cracks. Pixels classified as crack were assigned the binary value one, and non-crack pixels were assigned the binary value zero. Finally, the images were cropped to display only the cracks. The result is a binary image that shows the identified cracks and potholes without extraneous visual elements or missing data. Figure 11 shows a representative example.
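The binarization and cropping steps can be sketched as follows. The class code `2` for cracks, the helper names, and the interpretation of cropping as a bounding-box cut around the crack pixels are assumptions for this example.

```python
def to_binary_mask(label_image, crack_label):
    """Map crack pixels to 1 and every other class to 0."""
    return [[1 if value == crack_label else 0 for value in row]
            for row in label_image]


def crop_to_cracks(mask):
    """Crop the mask to the bounding box of its non-zero pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return []  # no crack pixels at all
    return [row[cols[0]:cols[-1] + 1] for row in mask[rows[0]:rows[-1] + 1]]


# Hypothetical classified image: 1 = pavement, 2 = crack.
labels = [[1, 1, 1, 1],
          [1, 2, 2, 1],
          [1, 1, 2, 1],
          [1, 1, 1, 1]]
mask = to_binary_mask(labels, crack_label=2)
print(crop_to_cracks(mask))
```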

Accuracy assessment

Accuracy assessment is a crucial final phase of the classification process. It was carried out on the images derived from the classification tree, the Canny edge detector, the filled image, and the morphologically closed image to determine whether the classification was adequate. The classified images from machine learning were compared with the images acquired by the unmanned aerial vehicle (UAV). A total of one hundred randomly chosen points within the study area were used for the assessment. The reference column of the accuracy assessment was then completed from the reference image data. Cohen’s kappa coefficient was also calculated from the error matrix. The kappa statistic indicates how well the classification performs compared with random assignment. The accuracy assessment results are given in Tables 3, 4, 5, 6, and 7.
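The accuracy measures reported in the tables can be derived from an error matrix as sketched below. The paper does not list the underlying point counts; the 2×2 matrix used here is an illustrative assumption, chosen so that 100 points reproduce the percentages given for the classification tree in Table 3. The function name and the dictionary layout are also hypothetical.

```python
def accuracy_report(matrix):
    """Compute accuracy measures from a square error matrix.

    matrix[i][j] is the number of points classified as class i whose
    reference class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    correct = [matrix[i][i] for i in range(k)]
    observed = sum(correct) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return {
        "users": [100 * d / t for d, t in zip(correct, row_totals)],
        "producers": [100 * d / t for d, t in zip(correct, col_totals)],
        "overall": 100 * observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Illustrative counts for (crack, non-crack); not reported in the paper.
matrix = [[46, 7],
          [7, 40]]
report = accuracy_report(matrix)
print([round(v, 2) for v in report["users"]])
print(round(report["overall"], 2), round(report["kappa"], 4))
```

User's accuracy divides the correctly classified points by the row total (points assigned to a class), whereas producer's accuracy divides them by the column total (reference points of that class).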
Table 3
Accuracy assessment for classification tree
| Classification | Accuracy | Crack % | Non-crack % |
|---|---|---|---|
| Classification tree | User’s accuracy | 86.79 | 85.1 |
| | Producer’s accuracy | 86.79 | 85.1 |
| | Overall accuracy | 86 | |
| | Kappa coefficient | 0.7190 | |
Table 4
Accuracy assessment for pruning classification tree
| Classification | Accuracy % | Crack % | Non-crack % |
|---|---|---|---|
| Pruning classification tree | User’s accuracy | 98 | 91.5 |
| | Producer’s accuracy | 92.2 | 87.8 |
| | Overall accuracy | 90 | |
| | Kappa coefficient | 0.8 | |
Table 5
Accuracy assessment for Canny edge detection
| Classification | Accuracy % | Crack % | Non-crack % |
|---|---|---|---|
| Canny edge detection | User’s accuracy | 92.45 | 93.47 |
| | Producer’s accuracy | 94.23 | 91.5 |
| | Overall accuracy | 93 | |
| | Kappa coefficient | 0.8597 | |
Table 6
Accuracy assessment for final crack extraction
| Classification | Accuracy % | Crack % | Non-crack % |
|---|---|---|---|
| Closed image | User’s accuracy | 98.11 | 93.6 |
| | Producer’s accuracy | 94.4 | 95.65 |
| | Overall accuracy | 96.00 | |
| | Kappa coefficient | 0.9195 | |
Table 7
Accuracy assessment for pothole extraction
| Classification | Accuracy % | Pothole % | Non-pothole % |
|---|---|---|---|
| Filled image | User’s accuracy | 98.11 | 89.36 |
| | Producer’s accuracy | 91.22 | 97.67 |
| | Overall accuracy | 94.00 | |
| | Kappa coefficient | 0.8790 | |
A comparison of the different stages that were performed on the image to extract cracks is shown in Fig. 12.
Compared with the reference data, the classification tree detected cracks with well-defined edges at an overall accuracy of 86%, as shown in Table 3.
After pruning the classification tree, the overall accuracy rose to 90% and the kappa coefficient to 0.8, as shown in Table 4.
Among the algorithms compared on the various input data, the Canny edge detection algorithm consistently yielded the best results. This is supported by the accuracy assessment in Table 5, where the overall accuracy is 93% and the kappa coefficient is 0.8597.
Finally, the cracks and potholes were extracted using morphological processing (closed and filled images), which proved effective: the overall accuracy was 96% for cracks and 94% for potholes, and the kappa coefficient was 0.9195 for cracks and 0.8790 for potholes, as shown in Tables 6 and 7.

Conclusions

This research presents a novel approach for extracting surface cracks from unmanned aerial vehicle (UAV) imagery using machine learning techniques. The proposed method comprises data preparation, extraction of relevant features, decision tree classification, post-classification smoothing filters, and morphological operations to find potholes and cracks. The reliability of the method was demonstrated by the accuracy assessment, from which the following conclusions can be drawn. The main source of error in extracting surface cracks from a UAV image is the presence of complex background elements, such as plants and soil crust. Using machine learning for sub-image classification, crack extraction, and subsequent re-splicing is a viable way to reduce these errors. The overall accuracy of the classification tree reached 86%. The images were smoothed after classification using three filters (median, Gaussian, and standard deviation). The Gaussian filter performed best and was then used for edge detection with the Canny algorithm. Morphological processing was then applied to the images (closed and filled images) and proved highly accurate, with a kappa coefficient of 0.9195 for closed images in extracting cracks and 0.8790 for filled images in extracting potholes; the overall accuracy reached 96% and 94%, respectively. This article therefore improved the accuracy of detecting road deformations by adding image features before classification and by applying improvements after classification. For comparison, [5] reported an overall accuracy of 89.50% in crack detection, and [13] achieved an accuracy of 98% using 3D laser-based scanning.
Our study therefore outperformed previous machine learning-based studies. Ground-based measurements, which were conducted in the field, achieved results very close to ours, but they suffer from many problems and drawbacks. This technique offers valuable data support for subsequent studies on the development and risk assessment of surface cracks.
Cracks are a prevalent geological phenomenon with potentially disastrous consequences. Repairing cracks on a scientific basis requires the rational classification and assessment of crack severity levels. This is what our study achieved.
In summary, UAV platforms equipped with multispectral remote sensors are a significant asset for assessing and monitoring asphalt pavement quality.

Acknowledgements

Many thanks to the MATLAB software, which helped in analyzing the images and obtaining accurate and very fast results compared with traditional methods.

Declarations

Conflict of interest

All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version. This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue. The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

Ethics approval

Not applicable.
No informed consent is required.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
1. Hadjidemetriou GM, Christodoulou SE, Vela PA (2016) Automated detection of pavement patches utilizing support vector machine classification. In: 18th Mediterranean electrotechnical conference (MELECON), IEEE, pp. 1–5
2. Diamanti N, Redman D (2012) Field observations and numerical models of GPR response from vertical pavement cracks. J Appl Geophys 81:106–116
3. Nex F, Remondino F (2014) UAV for 3D mapping applications: a review. Appl Geomat 6:1–15
4. Verschuuren M, De Vylder J, Catrysse H, Robijns J, Philips W, De Vos WH (2017) Accurate detection of dysmorphic nuclei using dynamic programming and supervised classification. PLoS ONE 12(1):e0170688
5. Zhang F, Hu Z, Fu Y, Yang K, Wu Q, Feng Z (2020) A new identification method for surface cracks from UAV images based on machine learning in coal mining areas. Rem Sens 12(10):1571
6. Dadrasjavan F, Zarrinpanjeh N, Ameri A (2019) Automatic crack detection of road pavement based on aerial UAV imagery
7. Ersoz AB, Pekcan O, Teke T (2017) Crack identification for rigid pavements using unmanned aerial vehicles. In: IOP conference series: materials science and engineering, IOP Publishing, pp 12101
8. Kim H, Lee J, Ahn E, Cho S, Shin M, Sim S-H (2017) Concrete crack identification using a UAV incorporating hybrid image processing. Sensors 17(9):2052
9. Cubero-Fernandez A, Rodriguez-Lozano FJ, Villatoro R, Olivares J, Palomares JM (2017) Efficient pavement crack detection and classification. EURASIP J Imag Video Process 2017:1–11
10. Saad AM, Tahar KN (2019) Identification of rut and pothole by using multirotor unmanned aerial vehicle (UAV). Measurement 137:647–654
11. Pan Y, Zhang X, Sun M, Zhao Q (2017) Object-based and supervised detection of potholes and cracks from the pavement images acquired by UAV. Int Arch Photogramm Remote Sens Spat Inf Sci 42:209–217
12. Koch C, Brilakis I (2011) Pothole detection in asphalt pavement images. Adv Eng Inform 25(3):507–515
13. Zhang K, Cheng HD, Zhang B (2018) Unified approach to pavement crack and sealed crack detection using preclassification based on transfer learning. J Comput Civ Eng 32(2):4018001
14. Haralick RM (1979) Statistical and structural approaches to texture. Proc IEEE 67(5):786–804
15. Clausi DA (2002) An analysis of co-occurrence texture statistics as a function of grey level quantization. Can J Remote Sens 28(1):45–62
16. Förstner W, Gülch E (1987) A fast operator for detection and precise location of distinct points, corners and centres of circular features. In: Proceeding. ISPRS intercommission conference on fast processing of photogrammetric data, Interlaken, pp 281–305
17. Ferro CJS (1998) Scale and texture in digital image classification. West Virginia University
18. Hall-Beyer M (2017) GLCM texture: a tutorial v. 3.0 March 2017
19. Myeong S, Nowak DJ, Hopkins PF, Brock RH (2001) Urban cover mapping using digital high-spatial resolution aerial imagery. Urban Ecosyst 5:243–256
20.
21. Waske B (2007) Classifying multisensor remote sensing data: concepts, algorithms and applications. Citeseer
22. Safavian SR, Landgrebe D (1991) A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 21(3):660–674
23. Esposito F, Malerba D, Semeraro G, Kay J (1997) A comparative analysis of methods for pruning decision trees. IEEE Trans Patt Anal Mach Intell 19(5):476–491
24. Quinlan JR (1987) Simplifying decision trees. Int J Man Mach Stud 27(3):221–234
25. Gonzalez RC, Woods RE, Hall PP (2008) Digital image processing third edition pearson international edition prepared by pearson education. J Biomed Opt 14:29901
Metadata
Title: Smart monitoring of road pavement deformations from UAV images by using machine learning
Authors: Heba Basyouni Ibrahim, Mahmoud Salah, Fawzi Zarzoura, Mahmoud El-Mewafi
Publication date: 01.01.2024
Publisher: Springer International Publishing
Published in: Innovative Infrastructure Solutions, Issue 1/2024
Print ISSN: 2364-4176
Electronic ISSN: 2364-4184
DOI: https://doi.org/10.1007/s41062-023-01315-2
