
Identifying Lead Water Service Lines Using Ultrasonic Stress Wave Propagation and 1D-Convolutional Neural Network

  • Open Access
  • September 1, 2025


Abstract

This study investigates a novel method for identifying lead water service lines using ultrasonic stress wave propagation and 1D convolutional neural networks (1D-CNNs). The research addresses the urgent need for a non-invasive, cost-effective technology to distinguish lead from non-lead service lines, as mandated by the U.S. Environmental Protection Agency. The study presents a comprehensive approach that includes field tests at nearly 250 houses in 20 U.S. cities, signal processing techniques, and the development of a 1D-CNN model. The results show that the model can classify service line materials accurately, with an overall accuracy of 80.5% in blind tests. The study also discusses challenges and potential improvements, such as expanding the dataset and developing separate models for public and private service lines. The results indicate that this technology has considerable potential for broad application and offers a practical solution to a critical environmental health problem.
K. I. M. Iqbal made the major contribution to this study.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Lead service lines (LSLs) are a significant source of lead contamination in drinking water, posing serious health risks to both children and adults [4, 66]. Cornwell et al. (2016) [17] reported that between 15 and 22 million Americans currently receive drinking water through LSLs. Furthermore, recent service line (SL) surveys in states such as Michigan, Illinois, Wisconsin, and Indiana revealed that 10–15% of service line materials are unknown, with the possibility that some may contain lead [29]. In response, the United States Environmental Protection Agency (EPA) has mandated the replacement of all LSLs [4, 5], with the majority of these lines to be removed within the next ten years, as outlined in the revisions to the Lead and Copper Rule (LCR) [52]. However, water utilities are struggling to develop accurate inventories to educate consumers and efficiently implement an LSL replacement strategy.
Conventional approaches for identifying service lines, such as relying on historical records, water quality sampling (which does not pinpoint the source of contamination and may overlook lead service lines), tap card information, service line installation dates, and records of recent repairs, are not always effective [14]. While potholing or full excavation can accurately identify service line materials, these approaches are time-consuming and costly, and they disrupt regular activities and transportation in the neighborhood [14]. According to the American Water Works Association [52], there is currently no commercially available non-invasive, rapid, or cost-effective technology capable of accurately identifying or predicting the material of underground service lines, including whether they contain lead, without excavation. Thus, finding a convenient technology that does not require excavation and can at least differentiate between lead and non-lead service lines has become a critical need for water utilities.
Several non-destructive evaluation techniques are being researched to identify service line materials; each has advantages, but each also has limitations. The Pittsburgh (PA) Water and Sewer Authority and the City of Tucson (AZ) used CCTV to visually inspect potentially exposed service lines near the curb stop box. However, these lines are often covered with debris or soil, making it difficult to capture clear imagery of the service line [14, 36]. Another group used magnet tests to identify pipe materials, but magnets only adhere to steel or galvanized pipes, not lead, copper, or plastic [6, 14, 36, 52]. A technology based on electrical conductivity, tested in Boston on various pipe materials, showed that lead has lower conductivity than galvanized steel and copper, while plastic has the lowest. Ensuring proper connections in real-world settings is challenging, however, especially when dealing with corroded pipes, which is commonly the case for steel [14, 36, 41]. Ground Penetrating Radar (GPR), a non-destructive method using high-frequency electromagnetic pulses, is suitable for measuring the diameter of buried service lines. The Philadelphia Water Department used GPR to identify pipes with a ½-inch diameter as potentially lead, but wider application is limited by the variation in pipe diameters across different locations [6, 14, 22, 25, 36, 58]. X-ray fluorescence spectrometry, which excites photon emissions with characteristic energy levels or wavelengths, identifies materials by their distinct energy responses. It requires the probe to be inserted inside the pipe, however, which may disturb pipe scales and potentially increase lead levels in the drinking water [14, 77]. Researchers have also applied data-driven artificial intelligence (AI) to various classification problems, including predicting lead versus non-lead service lines [3, 16]. One group trained a machine learning model for this purpose using annotated service line data from the City of Flint, Michigan, census block-level data, and water quality sampling from 25,000 tests as key features [3, 16].
This study proposes a method to determine the water SL material through ultrasonic stress wave propagation, a non-invasive technique that eliminates the need for excavation and works without requiring access to the home. According to this approach, waves can be generated within water pipelines by applying excitation at the curb stop valve, as illustrated in Fig. 1. As stress waves travel along the pipe, a portion of their energy dissipates into the surrounding soil [55, 56]. Leaked waves propagate through the soil and reach the surface at the bulk speed of the soil medium. Piezoelectric sensors placed on the surface of the surrounding medium can detect these vibrations as soon as the leaked waves reach the surface [7, 13, 40, 54–56, 60].
Fig. 1
Schematic of the experimental process and wave propagation (approximate) through the service line and soil medium
The theoretical principles of guided waves underlying this technique are detailed in a recent publication by the authors [40]. That research demonstrated that the travel time of guided waves, which depends on material type, could serve as a tool to distinguish between service line materials. Moreover, the technology proposed in this study (Fig. 1) generates both transient and steady-state waves when an impact is applied, exciting the resonant frequencies at which the service line vibrates. These vibrations are captured by sensors placed on the surface. Analyzing the frequency response functions (FRFs) of the recorded signals provides valuable information about these resonant frequencies, offering a more comprehensive understanding of the system [23, 69, 72]. Since different pipelines resonate at distinct frequencies based on their material and geometric properties, the FRF can serve as a potential tool for distinguishing service line materials.
While a full theoretical or simulation-based model is beyond the scope of this study, there are fundamental physical differences between service line materials, particularly in density and material stiffness, that suggest their vibrational responses (both transient and steady state) should differ. For example, lead has a relatively high density (\(\sim 11.34\) g/cm³) and relatively low stiffness resulting in a low wave speed (\(\sim 1200\) m/s), while materials like copper and steel exhibit higher wave speeds [40]. These differences can influence how stress waves propagate and dissipate, potentially leading to distinct frequency response characteristics. Although the exact patterns may be affected by geometry and surrounding soil conditions, it is expected that the resulting FRF signals retain material-specific characteristics that differentiate one pipe material from another.
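As a rough plausibility check on the quoted wave speed, the thin-rod relation \(c = \sqrt{E/\rho}\) can be evaluated for lead. The Young's modulus \(E \approx 16\) GPa used here is an assumed handbook value, not a figure from this study, and the rod relation ignores the pipe geometry, the water inside the pipe, and the surrounding soil:

$$\begin{aligned} c_{\text{lead}} \approx \sqrt{\frac{E}{\rho}} = \sqrt{\frac{16 \times 10^{9}\ \text{Pa}}{11340\ \text{kg/m}^3}} \approx 1.19 \times 10^{3}\ \text{m/s} \end{aligned}$$

The same relation with typical assumed values for copper (\(E \approx 120\) GPa, \(\rho \approx 8960\) kg/m³) gives roughly 3700 m/s, in line with the higher wave speeds noted above for copper and steel.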
To test this hypothesis under real-world conditions, the proposed technology was applied to the service lines of nearly 250 houses across 20 cities in the US. Several real-world challenges were encountered during field testing, including inconsistent burial depth of service lines and variations in soil compactness between the public and private sides. Additionally, surface types varied, with asphalt typically found on the roadside and grass or concrete surfaces on the home side. These factors posed challenges for the proposed approach, as attenuation and wave speed vary significantly across different mediums [7, 49, 50, 54, 61, 76]. It was concluded that classifying material type by typical guided wave features (wave speed, attenuation) was impractical and time-consuming and led to largely inaccurate predictions with the proposed technology. This limitation motivated the development of an alternative data-driven approach that could leverage the FRF signals more effectively, particularly for classifying lead vs. non-lead materials.
To address this, the study focused on Deep Learning (DL), a prominent branch of artificial intelligence that has been increasingly used in various academic fields, including computer vision, speech recognition, natural language processing and also in structural health monitoring (SHM) [11, 12, 19, 21, 33, 59, 63, 64, 73]. A key advantage of deep learning is its ability to automatically extract complex features from input data by utilizing deep hidden layers and non-linear activation functions at each layer [48, 74]. DL techniques have widely been explored in vibration-based structural condition monitoring, such as machine fault detection [39, 82], wind turbine fault detection [35, 37], and data anomaly detection [12]. In SHM systems, particularly those utilizing ultrasonic guided waves, DL has been used for tasks like acoustic emission source localization, damage detection in structures, and corrosion mapping of metal plates [19, 24, 28, 32, 57, 67, 80].
Among DL algorithms, Convolutional Neural Networks (CNNs) have become particularly popular, demonstrating superior performance in computer vision applications such as object detection in large image databases and facial recognition [18, 27, 38, 68]. CNNs integrate feature extraction and weight determination into a single learning process, allowing the network to optimize features during training [33, 34, 62]. Their sparse connections and shared weights make CNNs computationally efficient, enabling them to handle large inputs more effectively than traditional fully connected Multi-Layer Perceptrons (MLPs) [26, 33, 62]. Parameter sharing allows feature detectors to scan across all regions of a training example, efficiently identifying common features [19]. Additionally, CNNs are robust to small transformations in input data, such as translation, scaling, and distortion, and can adapt to different input sizes, making them highly versatile [10]. In the context of service line material classification, deep CNNs offer significant potential by identifying complex patterns in the data collected from service lines made of distinct materials. These models can be trained to distinguish between lead and non-lead service lines, providing a powerful tool for classification tasks.
However, conventional 2D CNNs are best suited to 2D data such as images or video frames. In contrast, data like ECG signals, raw vibrational data, bearing fault detection sensor data, and voltage/current data from specialized power electronics are typically 1-dimensional and remain so even after some signal processing [1, 2, 8, 30, 31, 39, 43–46]. Converting this 1D data into 2D images for conventional methods can introduce challenges and reduce accuracy [19, 71, 83]. To address this limitation, researchers have proposed 1D CNNs, which are specifically designed to work directly with 1D data while retaining architectural similarities to 2D CNNs, with some key variations [10, 43, 44]. Recent studies have shown that 1D CNNs offer certain advantages over their deep 2D counterparts when handling 1D signal repositories [1, 2, 8, 9, 30, 31, 39, 46]. FRF data represents the distribution of a service line's response coefficients over a certain frequency range; vectors containing these FRF values can be used as input to 1D-CNN models to classify lead versus non-lead service lines. While other recent studies have utilized raw time-acceleration signals for CNN-based damage detection [19, 70], our preliminary results showed very limited success when using raw data. Therefore, this study focuses on FRF representations and shows their effectiveness in distinguishing material types.
A large dataset is required for deep learning to work efficiently, especially for multiclass classification problems. Although the study collected signals from galvanized steel, copper, plastic, brass, and lead pipes using the proposed technology, the dataset was not large enough to develop a multiclass classification model at this stage. Additionally, since the immediate priority for water utilities is to differentiate between lead and non-lead pipes to expedite lead pipe removal, the collected data was divided into two categories: lead vs. non-lead. Consequently, the study focused on a binary classification task.
The primary objective of this study is therefore to process the collected signals to obtain FRF coefficients and to use them to develop a 1D-CNN model that identifies lead vs. non-lead water service lines. The following sections discuss the experimental setup and the stress wave technology, the signal processing steps used to obtain the FRF, the details of the 1D-CNN architecture used in this study, and the results of the model, including predictions on unseen data. The paper concludes with a discussion of future tasks and potential areas for improvement.

2 Basic Theory

2.1 Description of Stress Wave Propagation Technology

In typical water service line systems, the water main is located under the roadway, with service lines connecting the main to the house (see Fig. 1). A shut-off valve, known as the curb-stop valve, is positioned between the two, controlling water flow from the main to the house. This valve is accessible via a shut-off rod commonly used by water utilities. The section from the valve to the water main is referred to as the public/company service line, while the section from the valve to the house is the private owner’s service line. The service lines for the same house (public and private) can be made of the same or different materials. Therefore, a tool that can distinguish between both service lines separately is ideal. The proposed technology’s key advantage is its ability to identify service line materials of these two segments independently, reducing the need for digging.
To generate a stress wave in service lines, the study used an aluminum extension rod (which can be replaced by the steel shut-off rod used by water utilities) to reach the top of the curb stop valve (see Fig. 2). An instrumented hammer was then used to strike the extension rod, generating stress waves that travel through the rod and into the service lines. As a result, the waves produce both transient and steady-state vibrations. These vibrations are captured by three PZT transducers (accelerometers) placed at nearly equal distances of 3 feet from the extension rod and from each other. To ensure continuous contact between the accelerometers and the surface, plumber's putty was used on asphalt or concrete surfaces, while steel spikes were used for grass or soft soil. Data acquisition was carried out using a National Instruments data acquisition (DAQ) system with four independent channels: the first three channels collect data from the accelerometers, and the fourth stores the hammer impact time history. The DAQ was connected to a host computer running a dedicated MATLAB app that records the signals at 51,200 samples per second. For each service line tested, 15-20 sets of signals were recorded to ensure repeatability. Figure 2 illustrates how stress waves are generated using an instrumented hammer and an extension rod. With this approach, the technology can test up to 40 service lines or 15-20 houses per day.
Fig. 2
Field testing of the proposed technology on water service lines in Philadelphia (A1, A2, and A3 indicate the positions of three accelerometers)

2.2 The Fundamentals of FRF

The Frequency Response Function (FRF) is a frequency-based linear transfer function consisting of real and imaginary components derived from the Fourier transform of the applied excitation and the system’s responses. One of the key advantages of using FRF data is that it provides more comprehensive insights into the vibration responses of service lines. In this study, the hammer generates the applied excitations, while the service line responses are recorded by sensors placed on the soil surface.
The FRF for the data from all three sensors is calculated as follows:
$$\begin{aligned} \text{FRF} = \frac{\text{FFT of output signal}}{\text{FFT of input signal}} \end{aligned}$$
(1)
where the input signal is the interpolated hammer data, and the output signals are the data from Sensor 1, Sensor 2, and Sensor 3.
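A minimal NumPy sketch of Eq. (1) is given below for illustration. The function name `compute_frf`, the handling of near-zero input bins, and the synthetic signals are assumptions made for this example and are not taken from the study's MATLAB workflow.

```python
import numpy as np

def compute_frf(hammer, sensor, fs):
    """Return (freqs, H) with H = FFT(sensor) / FFT(hammer), one-sided spectrum."""
    hammer = np.asarray(hammer, dtype=float)
    sensor = np.asarray(sensor, dtype=float)
    if hammer.shape != sensor.shape:
        raise ValueError("input and output signals must have the same length")
    X = np.fft.rfft(hammer)   # spectrum of the excitation (input)
    Y = np.fft.rfft(sensor)   # spectrum of the response (output)
    freqs = np.fft.rfftfreq(hammer.size, d=1.0 / fs)
    # Assumption: bins where the input spectrum is numerically zero are set to 0
    # rather than divided; the source does not describe this case.
    H = np.zeros_like(Y)
    ok = np.abs(X) > 1e-12 * np.abs(X).max()
    H[ok] = Y[ok] / X[ok]
    return freqs, H

# Synthetic check: the "sensor" is the hammer pulse scaled by 0.5 and delayed 3 samples.
fs = 51_200
hammer = np.zeros(1024)
hammer[10:14] = [0.2, 1.0, 0.6, 0.1]
sensor = 0.5 * np.roll(hammer, 3)
freqs, H = compute_frf(hammer, sensor, fs)
```

For this synthetic pair the FRF magnitude is 0.5 at every frequency, and the delay appears only in the phase, which is why both the real and imaginary parts carry information.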

2.3 The Basics of CNN Algorithm

A typical CNN generally includes the input layer, convolution layer, pooling layer, fully connected, and output layers [33, 47, 48, 74].
  • Input Layer: The input layer of a CNN is designed to handle multidimensional data in a standardized format, meaning the data must be normalized before being fed into the network. Standardizing the input features enhances both the algorithm’s efficiency and learning performance.
  • Convolution Layer: In the convolutional layer, the convolution kernel performs convolutions on the output from the previous layer and applies a nonlinear activation function to generate the output features. The output of each layer is the result of convolving multiple input features, and its mathematical model is expressed as follows:
    $$\begin{aligned} y_i^{l+1}(j) = \textbf{K}_i^l * \textbf{x}^l(j) + b_i^l \end{aligned}$$
    (2)
    where \(\textbf{K}_i^l\) indicates the weights of the i-th filter kernel at layer l, \(b_i^l\) indicates the bias of the i-th filter kernel at layer l, \(\textbf{x}^l(j)\) indicates the j-th local region at layer l, and \(y_i^{l+1}(j)\) indicates the input of the j-th neuron in frame i of layer \(l + 1\). The notation \(*\) indicates the dot product of the kernel and the local region.
  • Activation Function: The nonlinear activation function enables the training algorithm to learn complex, nonlinear features from the input data beyond what is possible in the linear region. This study used the rectified linear unit (ReLU) activation function, which accelerates the learning process by setting gradients to 1 for positive inputs and 0 for negative inputs or zeros.
  • Pooling Layer: The pooling layer is primarily used to reduce the number of parameters in the neural network by downsampling large matrices, which decreases the computational load and helps prevent overfitting. In practice, max pooling is commonly applied, where the maximum value within a spatial region is taken as the output.
  • Fully Connected Layer: The output from the final pooling layer is flattened into a one-dimensional vector, which serves as the input to the fully connected layer. This layer establishes complete connections between its input and output, enabling it to integrate the localized information learned by the convolutional and pooling layers.
  • Output Layer: The output layer typically employs either the softmax or sigmoid classifier to generate classification labels. Softmax, a generalization of logistic regression, is widely used for multi-class classification problems, while the sigmoid classifier is commonly applied in binary classification tasks. The sigmoid function outputs a value between 0 and 1 that is interpreted as a class probability: the closer the value is to 1, the higher the confidence in that class prediction. The sigmoid function is expressed as:
    $$\begin{aligned} \sigma (z_0(j)) = \frac{1}{1 + e^{-z_0(j)}} \end{aligned}$$
    (3)
    where \(z_0(j)\) indicates the logit of the j-th neuron at the output layer, that is, the weighted sum of inputs from the previous layer.
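The layer operations listed above can be traced on a toy signal. The following NumPy sketch applies Eq. (2), ReLU, max pooling, and the sigmoid of Eq. (3) in sequence. The kernel values, biases, pooling size, and output weights are hypothetical numbers chosen for readability, not parameters from this study.

```python
import numpy as np

def conv1d(x, kernels, biases):
    """Eq. (2): dot product of each kernel with each local region, plus bias (stride 1, no padding)."""
    width = kernels.shape[1]
    regions = np.stack([x[j:j + width] for j in range(len(x) - width + 1)])
    return regions @ kernels.T + biases      # shape (n_regions, n_kernels)

def relu(z):
    return np.maximum(z, 0.0)

def max_pool1d(a, size):
    """Non-overlapping max pooling along the first axis; a trailing remainder is dropped."""
    n = (a.shape[0] // size) * size
    return a[:n].reshape(-1, size, a.shape[1]).max(axis=1)

def sigmoid(z):
    """Eq. (3)."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0])
kernels = np.array([[1.0, 0.0, -1.0],     # hypothetical slope detector
                    [0.5, 0.5, 0.5]])     # hypothetical local average
biases = np.array([0.0, -1.0])

features = max_pool1d(relu(conv1d(x, kernels, biases)), size=2)
# features == [[0.0, 2.0], [2.0, 2.5], [2.0, 0.5]]

w_out = np.full(features.size, 0.1)        # hypothetical output-layer weights
p = sigmoid(features.ravel() @ w_out - 0.5)  # probability-like score in (0, 1)
```

Each column of `features` is the pooled response of one kernel, so the flattened vector passed to the output neuron mixes information from both feature maps.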

3 Methodology

3.1 Field Testing

The study applied the proposed technology in actual field settings to gather responses from service lines. With support from American Water, Philadelphia Water Department, Pittsburgh Water and Sewer Authority, AQUA, DC Water, and Water Research Foundation, the study was conducted on the service lines of nearly 250 houses in different states and cities. A summary of the field testing, including the locations, dates, and number of service lines tested at each site, is provided in Table 1.
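Table 1
Field testing summary

| Serial | Date (MM-DD-YYYY) | Location/City | Public service lines tested | Private service lines tested | Total |
|---|---|---|---|---|---|
| 01 | 10-01-2021 | A | 9 | 9 | 18 |
| 02 | 12-14-2021 | B | 12 | 11 | 23 |
| 03 | 04-08-2022 | C | 17 |  | 17 |
| 04 | 06-02-2022 | D | 11 | 8 | 19 |
| 05 | 06-03-2022 | E | 4 | 14 | 18 |
| 06 | 06-22-2022 | F | 10 | 9 | 19 |
| 07 | 06-23-2022 | G | 13 | 13 | 26 |
| 08 | 08-25-2022 | H | 2 | 12 | 14 |
| 09 | 09-30-2022 | I | 11 | 10 | 21 |
| 10 | 10-18-2022 | J | 8 | 8 | 16 |
| 11 | 12-07-2022 | K | 11 | 11 | 22 |
| 12 | 12-08-2022 | L | 4 | 4 | 8 |
| 13 | 03-28-2023 | M | 11 | 3 | 14 |
| 14 | 06-14-2023 | N | 12 | 12 | 24 |
| 15 | 08-09-2023 | O | 9 | 2 | 11 |
| 16 | 08-10-2023 | P | 7 | 8 | 15 |
| 17 | 08-17-2023 | Q | 9 | 12 | 21 |
| 18 | 08-29-2023 | R | 10 | 10 | 20 |
| 19 | 12-01-2023 | S | 7 | 9 | 16 |
| 20 | 12-05-2023 | T | 11 | 11 | 22 |
| 21 | 12-14-2023 | U | 8 | 8 | 16 |
| 22 | 07-22-2024 | V | 10 | 9 | 19 |
| 23 | 07-23-2024 | W | 10 | 10 | 20 |
| Total |  |  | 216 | 203 | 419 |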
During field testing, challenges were encountered that do not typically arise in controlled lab environments. For instance, at some houses, the curb-stop was covered with dirt, soil, or even water, preventing the extension rod from making direct contact with the curb-stop valve. In such uncommon cases, delivering enough energy to the service lines to generate the necessary vibrations was difficult and required dirt removal with special tools available to utility crews. On the private side, a mix of surface types was sometimes observed, which is fairly common.
Ensuring the accelerometers are positioned directly above the service lines is crucial for obtaining representative signals. In most cases, water utility personnel were familiar with the service line layout, which made it easier to align the accelerometers with the direction of the lines. When the layout was unknown, a commercially available line tracer was used to determine the approximate location, and the accelerometers were then positioned accordingly.
Fig. 3
(a) Unsynchronized data from hammer strike #1, (b) Unsynchronized data from hammer strike #10, (c) Data from hammer strike #1 after synchronization, (d) Data from hammer strike #10 after synchronization

3.2 Signal Processing and Extracting FRF

One of the key challenges in applying deep learning techniques is extracting meaningful features from time-domain vibration response signals, making signal processing an essential component of this type of research. This study’s primary input for the DL models will be the FRF data. Hence, after collecting data with the proposed technology in field conditions, signal processing tasks are carried out to extract the FRF coefficients before training the deep neural networks. The signal processing in this study involves three significant steps: synchronizing all the data, extrapolating clipped regions of the impact (hammer) data, and obtaining the FFT of the input and output signals to extract the FRF, which is then used in the DL models.

3.2.1 Synchronizing Data

The study collects signals in a five-second window at a rate of 51,200 samples per second using a dedicated MATLAB app. First, the raw data is reviewed, and signals are discarded to ensure data quality if a double hammer strike is detected, if there is noise or a spike in the sensor data before the hammer strike, or if any external noise occurs within the signal window.
Secondly, for the hammer data, the time step of its maximum value is identified, and a threshold value of 0.05 multiplied by that maximum value is used to determine the start of the hammer impact. From this start point, 1,000 data points before and 10,500 data points after are selected for further analysis. This process is repeated for every signal collected on that particular service line (typically 10-15 signals are collected per service line). However, close observation revealed a slight offset between each of these signals. These offsets were determined using the well-known cross-correlation algorithm, which measures the similarity between a vector x and lagged copies of a vector y as a function of the lag.
For cross-correlation, only the hammer data from each strike is considered, and the first hammer strike is used as the reference (vector x) to compare with the others and obtain the lags. Once the lag was determined, it was applied to finalize the correct window to trim the signal from that strike. Zero-padding was used to maintain consistent data length, and this process was repeated for the remaining signals. Figures 3(c) and 3(d) show the data after synchronization. After synchronization, the mean of the first 50 data points from the entire dataset is subtracted to eliminate initial noise, particularly related to DC offsets.
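The synchronization steps described above can be sketched as follows. This is an illustrative NumPy version, not the study's MATLAB code: the helper names are hypothetical, the onset is taken as the first sample at or before the peak that reaches 5% of the peak (one reading of the threshold rule), and the window is assumed to include the onset sample itself.

```python
import numpy as np

def impact_onset(hammer, fraction=0.05):
    """Index of the first sample, at or before the peak, that reaches fraction * peak."""
    peak = int(np.argmax(hammer))
    above = np.flatnonzero(hammer[: peak + 1] >= fraction * hammer[peak])
    return int(above[0])

def lag_to_reference(reference, other):
    """Lag in samples by which `other` trails `reference`, from the cross-correlation peak."""
    xcorr = np.correlate(other, reference, mode="full")
    return int(np.argmax(xcorr)) - (len(reference) - 1)

def cut_window(signal, start, pre, post):
    """Samples start-pre .. start+post, zero-padded where the window leaves the record."""
    out = np.zeros(pre + post + 1)
    lo, hi = start - pre, start + post + 1
    src_lo, src_hi = max(lo, 0), min(hi, len(signal))
    out[src_lo - lo: src_hi - lo] = signal[src_lo:src_hi]
    return out

# Two synthetic hammer records: the same pulse, the second one 7 samples later.
pulse = np.array([0.0, 0.3, 1.0, 0.4, 0.0])
strike_1 = np.zeros(200)
strike_1[50:55] = pulse
strike_2 = np.zeros(200)
strike_2[57:62] = pulse

start = impact_onset(strike_1)                # onset in the reference strike
lag = lag_to_reference(strike_1, strike_2)    # offset of the second strike
w1 = cut_window(strike_1, start, pre=20, post=100)
w2 = cut_window(strike_2, start + lag, pre=20, post=100)
```

After shifting the window of the second strike by the estimated lag, the two trimmed hammer records coincide sample for sample; the same window would then be applied to the three sensor channels of that strike.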
Since the impacts were generated manually, the raw signals initially recorded from each strike exhibited slight variations in energy. Prior to computing the FRFs, the raw time-domain signals from the same service line were analyzed using cross-correlation to quantify consistency across multiple strikes and reject anomalous data sets. The correlation peak for different strikes at the same location was found to range between 0.92 and 0.98 (see Fig. 4), indicating that the impacts generated in each strike for a particular SL were quite repeatable.
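One way to obtain a correlation peak on a 0 to 1 scale is to normalize the cross-correlation by the signal energies. The sketch below assumes this normalization and assumes the DC offset has already been removed; the source does not state which normalization was used, and the function name and test signals are illustrative.

```python
import numpy as np

def correlation_peak(a, b):
    """Peak of the energy-normalized cross-correlation of two signals (1.0 = same shape)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    xcorr = np.correlate(a, b, mode="full")
    return float(xcorr.max() / np.sqrt(np.dot(a, a) * np.dot(b, b)))

# Synthetic decaying burst, repeated with a shift, a different amplitude, and added noise.
t = np.arange(300)
burst = np.exp(-t / 60.0) * np.sin(2 * np.pi * t / 25.0)
s1 = np.zeros(500)
s1[50:350] = burst
s2 = np.zeros(500)
s2[58:358] = burst

rng = np.random.default_rng(0)
peak_clean = correlation_peak(s1, 3.0 * s2)                           # shifted, scaled copy
peak_noisy = correlation_peak(s1, s2 + 0.3 * rng.standard_normal(500))  # noisy copy
```

Because of the normalization, a shifted or rescaled copy of the same waveform still scores 1, while noise or a change in waveform shape lowers the peak, which is the property that makes it usable for rejecting anomalous strikes.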
Fig. 4
Cross-correlation among sensors from different hammer impacts
Fig. 5
Extrapolation of Hammer Impact Data

3.2.2 Extrapolating Impact Hammer Data

Stress waves were generated by an instrumented hammer (PCB 086C03 modal impact hammer) with a measurement range of ±5 V that can measure forces up to ±2225 N. However, field testing sometimes required hard impacts to produce sufficient vibrations in the service lines for the surface sensors to detect. In these cases, the hammer exceeded its measurement range, causing the recorded values to be clipped (see Fig. 5(a)).
Fig. 6
FFT of Hammer data and sensors signal
Since the data was recorded at 51,200 samples per second, there was sufficient information to extrapolate the clipped sections. This extrapolation is crucial, as the hammer signal serves as the input to the service lines, and an accurate FFT of the signal requires the entire waveform. Two widely used algorithms, makima and spline extrapolation, were tested on the clipped regions. The spline extrapolation provided a more accurate representation of the missing data (see Fig. 5(b)) and was used to reconstruct the clipped portions of the hammer signal.
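The reconstruction of a clipped pulse can be illustrated with SciPy's `CubicSpline`, fitted through the unclipped samples and evaluated at the clipped ones. This is a sketch under assumptions: the study used MATLAB's spline routine, the clip level is passed in as a parameter because its value in recorded units is not given, and the half-sine test pulse is synthetic.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def reconstruct_clipped(signal, clip_level):
    """Replace samples at or above clip_level with a cubic spline through the unclipped samples."""
    signal = np.asarray(signal, dtype=float)
    clipped = signal >= clip_level
    if not clipped.any():
        return signal.copy()
    idx = np.arange(signal.size)
    spline = CubicSpline(idx[~clipped], signal[~clipped])
    repaired = signal.copy()
    repaired[clipped] = spline(idx[clipped])   # fill only the clipped samples
    return repaired

# Synthetic half-sine impact with a true peak of 1.0, recorded by a channel that clips at 0.8.
n = np.arange(41)
true_pulse = np.concatenate([np.zeros(10), np.sin(np.pi * n / 40), np.zeros(10)])
recorded = np.minimum(true_pulse, 0.8)
repaired = reconstruct_clipped(recorded, clip_level=0.8)
```

The spline restores a rounded peak above the clip level using only the rising and falling flanks; how close it comes to the true peak depends on how much of the pulse was lost and on how smooth the real impact is.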
At this stage, the raw, synchronized time-domain sensor data was ready to be used in DL models for training and testing. The signal length from each sensor was set to 20 data points before the impact and 6,500 data points after the impact, totaling 6,521 points per signal. This range was chosen for two reasons: first, it excludes noise from surrounding sources, such as traffic, as well as end reflections; second, a longer signal would require additional computational resources for DL training. Initially, the raw signals were used directly to train the DL model, which was then tested and applied to unseen data. Although the model trained successfully, it performed poorly in classifying lead vs. non-lead cases in both the test and unseen datasets. A possible cause of this underperformance was the insufficient amount of raw data for effective model training. In response, the study explored further signal processing.

3.2.3 Frequency Response Function

The time-domain signals were transformed into the frequency domain for further analysis, using the Fast Fourier Transform (FFT) algorithm for all four signals obtained from each hammer strike. The FFT results of the hammer, sensor 1, sensor 2, and sensor 3 are shown in Fig. 6. The centroidal frequency was also calculated and displayed in the FFT plot.
The input spectrum, generated by the hammer excitation, was broadband. In contrast, the output signals (sensor responses) showed that most of the signal energy was concentrated in the low-frequency region (up to 1,000 Hz), with very little energy in the high-frequency region (see Fig. 6). The low-frequency region represents the steady-state vibrations of the service lines, while the higher-frequency region also captures the transient vibration modes. The frequency range was limited to 6,000 Hz, as the PCB PZT transducers (accelerometers) can accurately pick up vibrations within this range with a ±5% error. Beyond this range, the error increases significantly, and the sensors would experience resonant vibrations.
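A centroidal frequency can be computed from the one-sided FFT as a magnitude-weighted mean frequency. The sketch below assumes that definition (the source does not state whether amplitude or power weighting was used); the function name and test tone are illustrative.

```python
import numpy as np

def centroid_frequency(signal, fs):
    """Magnitude-weighted mean frequency of the one-sided spectrum (assumed definition)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))

# Check on a pure 400 Hz tone that fits an integer number of cycles in the record.
fs = 51_200
t = np.arange(5120) / fs
tone = np.sin(2 * np.pi * 400.0 * t)
fc = centroid_frequency(tone, fs)
```

For a single tone the centroid sits at the tone frequency; for a broadband hammer spectrum it moves toward the middle of the excited band, which is why it is a convenient one-number summary of where the energy lies.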
Fig. 7
Absolute FRF values vs frequency for all three sensors
Fig. 8
Representative Absolute FRF curves from the first sensor for four service lines of different materials: copper, galvanized steel, plastic, and lead (all from public-side segments)
In Fig. 7, the absolute FRF coefficients for Sensors 1, 2, and 3 are displayed for frequencies up to 6,000 Hz. While the FFT plots previously showed slight energy in the high-frequency regions, the FRF plots more clearly reveal the high-frequency vibrations detected by the sensors.
Fig. 9
Real and Imaginary parts of FRF data used as input for 1D-CNN Models
To provide insight into the potential of FRF-based classification, Fig. 8 presents representative absolute FRF responses from four different service lines (copper, galvanized steel, plastic, and lead). These FRFs were extracted from the first sensor data of each service line. While the selected samples are not from the same location or identical soil conditions, they were all collected from the roadside (public) segments. Visual differences in the FRF responses are evident, particularly near the resonant frequencies. However, due to the influence of varying geometries, depths, and environmental conditions, these differences are not always apparent. This further underscores the need for a robust pattern-recognition algorithm, such as a deep learning model, to detect complex features that can generalize across these variabilities and accurately distinguish lead from non-lead service lines.

3.2.4 Preparing FRF data for Input into 1D-CNN Models

Before the FFT was applied, each raw signal consisted of 6,521 data points sampled at 51,200 Hz, which corresponds to a Nyquist frequency of 25,600 Hz. The FFT yields 6,521 Fourier coefficients. Because the spectrum of a real-valued signal is conjugate-symmetric, only the first half was retained, giving 3,260 points that cover frequencies from 0 to 25,600 Hz. However, because the sensors capture frequencies accurately (within ±5%) only up to 6,000 Hz, all content above 6,000 Hz was discarded in this study.
Instead of using absolute FRF coefficients, the study treated the real and imaginary parts of the FRF coefficients separately. The real part up to 6,000 Hz contains 763 data points, as does the imaginary part. Combined, the FRF data from each sensor have a total length of 1,526 points (see Fig. 9), and this vector serves as the input to the 1D-CNN models. The FRF input is therefore less than a quarter of the length of the 6,521-point raw signal, which reduces the computational resources needed to train the deep learning models.
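The assembly of this input vector can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `frf_to_input` and the synthetic FRF are hypothetical, and the number of retained bins (763) is taken from the text instead of being recomputed, since the exact count depends on indexing conventions the paper does not spell out.

```python
# Sketch: build the 1,526-point input vector from one sensor's complex FRF.
# Assumption: `frf` holds the one-sided complex FRF coefficients in order of
# increasing frequency. All names are illustrative, not from the paper.

FS_HZ = 51_200      # sampling rate stated in the text
N_SAMPLES = 6_521   # raw signal length stated in the text
N_KEEP = 763        # bins retained up to about 6,000 Hz, as stated in the text


def frf_to_input(frf):
    """Concatenate the real and imaginary parts of the first N_KEEP bins."""
    kept = frf[:N_KEEP]
    real_part = [z.real for z in kept]
    imag_part = [z.imag for z in kept]
    return real_part + imag_part


# Frequency spacing between adjacent FFT bins (about 7.85 Hz).
delta_f = FS_HZ / N_SAMPLES

# Synthetic stand-in for a measured FRF with 3,260 one-sided coefficients.
synthetic_frf = [complex(k, -k) for k in range(3_260)]
x = frf_to_input(synthetic_frf)
print(len(x), round(delta_f, 2))
```

The real block occupies the first 763 positions and the imaginary block the last 763, so a convolutional filter sees each part as a contiguous stretch of the signal.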

3.3 Proposed 1D-CNN Model

This study used a 1D-CNN, a variant of the conventional CNN designed for processing one-dimensional signals. In a 1D-CNN, the input is a one-dimensional vector, and 1D filters (rather than 2D convolutional filters) are applied to extract relevant features from the data. Consequently, each feature map produced by the convolutional and pooling layers is also one-dimensional. In addition, 1D-CNNs require fewer parameters than comparable 2D-CNNs, which makes them more computationally efficient.
Since the FRF data prepared for this study are one-dimensional, a 1D-CNN was selected as the model. Its convolutional layers can extract local features, such as resonant peaks in the FRF data, which is particularly useful because different pipes vibrate distinctly at various frequencies. Because both the real and the imaginary coefficients make up the FRF input, the CNN can identify such patterns in the imaginary components as well.
Based on the fundamental principles of CNNs, the specific structure of the 1D-CNN used in this study is illustrated in Fig. 10. It consists of four convolutional layers, four max-pooling layers, two fully connected layers, and an output layer. The non-linear ReLU activation function is applied in each convolutional and fully connected layer, while the sigmoid activation function is used in the output layer to classify lead versus non-lead service lines. Table 2 provides the kernel sizes of the convolution and max-pooling layers, along with the activation functions and padding information.
Fig. 10
The architecture of the 1D-CNN used in this study (plotted using NN-SVG [51])
In addition to the main CNN structure detailed in Table 2, several techniques were incorporated to enhance performance. Batch normalization was applied before the activation function in each of the four convolutional layers to reduce internal covariate shift. To tackle overfitting, L2-norm regularization and dropout were used [47, 75]. L2-regularization was applied to all layers, including the Conv1D and dense layers. It helps prevent overfitting by adding a penalty on the weights to the loss function, which keeps the weights from becoming too large. Dropout was employed to further improve the model's generalization ability. Dropout reduces co-adaptation between neurons by randomly deactivating nodes during training, so that their contributions are ignored for that training step. In this study, a dropout rate of 0.5 was used after each layer.
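Table 2
Structure of the 1D CNN Model
| Layer # | Layer Name | Filter Number | Filter Size | Padding | Activation |
| --- | --- | --- | --- | --- | --- |
| Layer 1 | conv1d_1 | 32 | 3 | Same | ReLU |
|  | max_pooling1d_1 |  | 2 |  |  |
| Layer 2 | conv1d_2 | 64 | 3 | Same | ReLU |
|  | max_pooling1d_2 |  | 2 |  |  |
| Layer 3 | conv1d_3 | 128 | 3 | Same | ReLU |
|  | max_pooling1d_3 |  | 2 |  |  |
| Layer 4 | conv1d_4 | 256 | 3 | Same | ReLU |
|  | max_pooling1d_4 |  | 2 |  |  |
| FC Layer 1 | flatten |  |  |  |  |
|  | dense_1 | 256 |  |  | ReLU |
| FC Layer 2 | dense_2 | 128 |  |  | ReLU |
| Output Layer | output_layer | 1 |  |  | Sigmoid |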
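To make the layer sizes in Table 2 concrete, the following sketch traces the sequence length and a rough parameter count through the network. It is a back-of-the-envelope aid, not the authors' implementation: it assumes a single input channel, a pooling stride equal to the pool size, no padding in the pooling layers, and it ignores batch-normalization parameters. None of these details are stated in the paper.

```python
# Sketch: trace sequence length and trainable weights through Table 2.
# Assumptions (not from the paper): one input channel, pooling stride equal
# to the pool size, unpadded pooling, batch-normalization parameters ignored.

INPUT_LENGTH = 1_526
CONV_FILTERS = [32, 64, 128, 256]   # conv1d_1 ... conv1d_4
KERNEL_SIZE = 3
POOL_SIZE = 2
DENSE_UNITS = [256, 128, 1]         # dense_1, dense_2, output_layer


def trace_model():
    length, channels, params = INPUT_LENGTH, 1, 0
    for filters in CONV_FILTERS:
        # "Same" padding keeps the length; weights plus one bias per filter.
        params += KERNEL_SIZE * channels * filters + filters
        channels = filters
        length //= POOL_SIZE        # max pooling halves the length (floor)
    flat_size = length * channels   # size of the flattened feature vector
    features = flat_size
    for units in DENSE_UNITS:
        params += features * units + units
        features = units
    return length, flat_size, params


final_length, flat_size, n_params = trace_model()
print(final_length, flat_size, n_params)
```

Under these assumptions the length shrinks from 1,526 to 95 after the four pooling stages, and nearly all of the weights sit in the first dense layer, which is one reason regularization matters for a dataset of this size.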
Fig. 11
(a) Number of data collected for each SL’s material types and (b) the corresponding total FRF data obtained after signal processing steps
Moreover, mini-batches were used instead of traditional batch learning. Batch learning processes all training examples in every forward and backward pass, which leads to high computational cost, whereas mini-batches update the weights from a small subset of examples at a time. Plain mini-batch gradient descent, however, can oscillate strongly as the process nears the optimum. To reduce this oscillation, the ADAM optimizer was employed. ADAM combines momentum (a running average of past gradients) with the RMSProp-style adaptive per-parameter learning rate, enabling faster convergence with modest memory requirements.
The hyperparameters of the 1D-CNN were optimized using random search. They include the number of epochs (passes over the entire training set), the mini-batch size (set to 32), the learning rate (final value of 1e-6), the L2-regularization factor (0.001), the dropout rate (0.5), the filter size, and the number of filters. Since the algorithm was designed to solve the binary classification problem of lead vs non-lead, binary cross-entropy was used as the loss function.
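The optimizer's update rule can be illustrated on a toy problem. The sketch below is a generic scalar Adam implementation, not the training code used in the study. The paper does not report the moment-decay settings, so `beta1`, `beta2`, and `eps` are the commonly used defaults, and the learning rate and quadratic objective are chosen only for demonstration.

```python
import math


def adam_minimize(grad_fn, w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient `grad_fn`."""
    m, v = 0.0, 0.0                              # first and second moments
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # momentum-like average
        v = beta2 * v + (1 - beta2) * g * g      # RMSProp-like average
        m_hat = m / (1 - beta1 ** t)             # bias corrections
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy objective f(w) = (w - 3)^2 with gradient 2 (w - 3); minimum at w = 3.
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
print(round(w_final, 3))
```

Dividing by the running root-mean-square of the gradient keeps the step size bounded by roughly `lr`, which is what damps the oscillations described above.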

3.4 Data Preparation

3.4.1 Dataset Description

After conducting field testing at the first 14 locations, the study developed its initial 1D-CNN models for training, testing, and predicting new field-tested data. Data from a total of 243 service lines across these 14 locations were used. As mentioned earlier, approximately 10-15 strikes were generated per service line, and with three sensors, this resulted in 30-45 signals per service line. After the signal processing steps, the FRF data from these signals, each consisting of 1,526 points, were used as input for the model.
The 243 service lines comprised 89 copper, 29 galvanized steel, 41 plastic, and 84 lead service lines. In some locations, only two sensors were placed because access for a third sensor was limited, which caused slight variations in the total number of signals. The number of SLs of each material and the corresponding number of FRF signals obtained are shown in Fig. 11 (a) and (b). In total, 8,925 FRF signals were collected from these service lines. The FRF data were saved in CSV files, which were used as input for the 1D-CNN model. Since the study focuses on classifying lead versus non-lead service lines, SLs made from materials other than lead were grouped together as non-lead, while lead SLs formed a separate group.

3.4.2 Data Preprocessing

The FRF data from each sensor exhibited variations in magnitude, and since CNN models perform better when the data are on the same scale, normalization was necessary. Commonly used techniques include min-max normalization and z-score standardization [42]. For CNNs, min-max normalization, which scales data to a range of 0 to 1, typically works well and was used here [42]. For each FRF dataset, the maximum and minimum values were identified and used for scaling (see Fig. 10, input FRF data). To prevent data leakage, normalization was performed separately for the training, validation, and testing datasets, as well as for the unseen datasets.
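Min-max scaling itself is a one-line transformation, sketched below. The sketch reads "each FRF dataset" as each individual FRF vector, which is an assumption about the authors' procedure. The function name and the guard for a constant signal are likewise illustrative.

```python
def min_max_scale(signal):
    """Scale a sequence of numbers linearly so it spans the range 0 to 1."""
    lo, hi = min(signal), max(signal)
    span = hi - lo
    if span == 0:
        # A constant signal carries no shape information; map it to zeros.
        return [0.0 for _ in signal]
    return [(value - lo) / span for value in signal]


scaled = min_max_scale([-2.0, 0.0, 6.0])
print(scaled)  # [0.0, 0.25, 1.0]
```

Because the minimum and maximum come only from the vector being scaled, no statistics from one subset can leak into another under this reading.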

3.4.3 Data Distribution

After the datasets were scaled with min-max normalization, the FRF data from 80% of the service lines were allocated to training, 10% to validation, and the remaining 10% to testing. To prevent the model from memorizing individual service lines, all FRF data from a given service line were assigned to a single subset, so the service lines used for validation and testing contributed no signals to training. Additionally, the FRF data of at least one service line from each city/location were included in the test and validation datasets to ensure that these datasets reflected the same geometric and environmental distributions as the training data. The data preparation steps and distribution are illustrated in the flow chart in Fig. 12.
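A split that keeps every service line in exactly one subset can be sketched as follows. The identifiers `SL000` to `SL242` and the function name are hypothetical, and the sketch does not enforce the additional per-city constraint described above, so it is a simplification rather than the authors' procedure.

```python
import random


def split_by_service_line(line_ids, train=0.8, val=0.1, seed=0):
    """Assign whole service lines (not single signals) to train/val/test."""
    ids = sorted(set(line_ids))
    random.Random(seed).shuffle(ids)         # reproducible shuffle
    n_train = round(train * len(ids))
    n_val = round(val * len(ids))
    train_ids = set(ids[:n_train])
    val_ids = set(ids[n_train:n_train + n_val])
    test_ids = set(ids[n_train + n_val:])    # remainder goes to testing
    return train_ids, val_ids, test_ids


# Hypothetical identifiers for the 243 service lines in the dataset.
all_ids = [f"SL{i:03d}" for i in range(243)]
train_ids, val_ids, test_ids = split_by_service_line(all_ids)
print(len(train_ids), len(val_ids), len(test_ids))
```

Each FRF signal then follows the subset of the service line it came from, which is what prevents near-duplicate signals from the same pipe appearing on both sides of the split.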
Fig. 12
Data preparation and distribution steps for 1D-CNN Models

3.4.4 Handling Imbalanced Data

Figure 11(b) shows that the Lead and Non-lead FRF data are imbalanced. In the first trial, after allocating 80% of the FRF data for training as described in Section 3.4.3, there were 2,190 training samples for Lead and 3,004 for Non-lead. Before training the model, it was necessary to address this class imbalance. Several techniques are commonly used to handle imbalanced data, including random oversampling, under-sampling, SMOTE, and class weights [15, 20, 53, 65, 78, 79, 81]. In this study, the class-weight balancing strategy was applied. This approach assigns a higher weight to the minority class (here, lead) and a lower weight to the majority class (non-lead) to balance the learning process. The class weights for the Lead and Non-lead classes are calculated as follows:
$$\begin{aligned} W_j = \frac{N}{k N_j} \end{aligned}$$
(4)
where \(W_j\) is the weight of class j, N is the total number of FRF data, k is the total number of classes, and \(N_j\) is the number of FRF data in class j. Class weights were applied during CNN training to emphasize the underrepresented minority class. They are incorporated into the loss function, where they increase the penalty for misclassifying the minority class and thereby keep the model from becoming biased toward the majority class.
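Applying Eq. (4) to the first-trial training counts quoted above (2,190 lead and 3,004 non-lead samples) gives the weights computed in this sketch. The weighted binary cross-entropy shown afterwards is one standard way such weights enter a loss function. The function names and the clipping constant are illustrative and not taken from the paper.

```python
import math


def class_weights(counts):
    """Eq. (4): W_j = N / (k * N_j) for every class j."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}


def weighted_bce(y_true, p_pred, w_pos, w_neg, eps=1e-7):
    """Mean binary cross-entropy with per-class weights (1 = lead)."""
    loss = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)      # avoid log(0)
        if y == 1:
            loss += -w_pos * math.log(p)
        else:
            loss += -w_neg * math.log(1.0 - p)
    return loss / len(y_true)


weights = class_weights({"lead": 2190, "non_lead": 3004})
print({label: round(w, 3) for label, w in weights.items()})
# {'lead': 1.186, 'non_lead': 0.865}
```

With these weights, a misclassified lead sample contributes roughly 37% more to the loss than a misclassified non-lead sample.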

4 Results and Discussion

This section presents the results obtained using the trained 1D-CNN model. First, performance evaluation metrics and a prediction criterion specifically customized for this study are introduced. Then, the accuracy and loss curves for the training and validation data are discussed. Finally, the model was used to predict service line materials at locations with unknown SLs, and its performance was evaluated against the ground truth provided by the water utilities. Prior to selecting the 1D-CNN model, preliminary experiments were conducted using a Multi-Layer Perceptron (MLP) with the same FRF input data. However, the MLP struggled to achieve good training performance and showed relatively lower overall accuracy. Based on these observations, the 1D-CNN was selected as the preferred architecture due to its superior performance.

4.1 Performance Evaluation Metrics

Figure 13 illustrates the possible predictions for an unknown service line (SL) and the actions that water utilities will take based on these predictions. If the actual material is lead and the model predicts lead, this is a true positive (TP), and the utilities will excavate the location. However, if the model incorrectly predicts non-lead for an actual lead line (false negative (FN)), the utility will miss a lead service line, which is the error they most want to avoid.
On the other hand, if the actual material is not lead and the model correctly predicts this (true negative (TN)), the utilities will not excavate. If the model predicts lead when the material is not lead (false positive (FP)), the utilities will excavate but find no lead present. While false positives are undesirable, utilities may prefer them over false negatives. Therefore, the primary objective of the evaluation criteria is to minimize false negatives while accepting that some false positives may occur.
Fig. 13
Flowchart illustrating the prediction outcomes and the corresponding decisions water utilities would make
Another aspect of this study, as discussed in the data preparation section, is that each service line instrumented with three sensors yields approximately 30 or more FRF signals. When predicting the testing data, all FRF signals from a particular service line are considered, with the expectation that they will be assigned to the same class. In practice, however, not all FRF signals from a given service line receive the same prediction. In some cases, most signals are assigned to one class while the rest are assigned to the other. In a few instances, the predictions were evenly split between the two classes.
To resolve this challenge, a specific evaluation strategy was devised to streamline the classification process:
  • If the majority of predictions for a particular service line fall into one class, that class is selected as the final prediction.
  • If the predictions are evenly split between the two classes, the predictions from the first two sensors are prioritized, and the majority class among these is chosen.
  • If the split remains even after this step, the study predicts the service line as Lead since utilities prefer more false positives over false negatives.
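The three rules above can be written as a short decision function. The sketch assumes that "the first two sensor predictions" means all predictions from Sensors 1 and 2, and that each prediction is stored with its sensor index. Both the data layout and the function name are hypothetical.

```python
def classify_service_line(predictions):
    """Combine per-signal predictions into one label for a service line.

    `predictions` is a list of (sensor_index, label) pairs, where label is
    "lead" or "non_lead" and sensor_index is 1, 2, or 3.
    """
    def tally(items):
        lead = sum(1 for _, label in items if label == "lead")
        return lead, len(items) - lead

    # Rule 1: simple majority over all signals.
    lead, non_lead = tally(predictions)
    if lead != non_lead:
        return "lead" if lead > non_lead else "non_lead"

    # Rule 2: on a tie, use the majority among Sensors 1 and 2 only.
    lead, non_lead = tally([p for p in predictions if p[0] in (1, 2)])
    if lead != non_lead:
        return "lead" if lead > non_lead else "non_lead"

    # Rule 3: still tied, so err on the side of a false positive.
    return "lead"


tied = [(1, "non_lead"), (2, "non_lead"), (3, "lead"), (3, "lead")]
print(classify_service_line(tied))  # non_lead
```

The final fallback encodes the cost asymmetry discussed earlier: an unnecessary excavation is preferred over a missed lead line.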
After that, several performance metrics are used to assess the effectiveness of the 1D-CNN classifier models in identifying lead versus non-lead service lines. These include accuracy, precision, recall, and F1-score. A brief explanation of these metrics, aligned with the specific objectives of this study, is provided below:
  • Accuracy is a common metric that evaluates how often a model's predictions are correct. It is calculated by dividing the number of correct predictions (TP and TN) by the total number of instances (TP, TN, FP, FN) in the dataset, providing an overall view of the model's performance in classifying both lead and non-lead service lines.
    $$\begin{aligned} \text {Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \end{aligned}$$
    (5)
  • Precision measures the fraction of positive predictions (TP and FP) that are actually positive (TP).
    $$\begin{aligned} \text {Precision} = \frac{TP}{TP + FP} \end{aligned}$$
    (6)
  • Recall measures the fraction of actual positives (TP and FN) that the model correctly identifies (TP). A higher recall value indicates fewer false negatives, which is crucial for this study.
    $$\begin{aligned} \text {Recall} = \frac{TP}{TP + FN} \end{aligned}$$
    (7)
  • F1-score is the harmonic mean of precision and recall.
    $$\begin{aligned} \text {F1-score} = \frac{2 \times \text {Precision} \times \text {Recall}}{\text {Precision} + \text {Recall}} \end{aligned}$$
    (8)

4.2 Accuracy and Loss Curves with k-fold Cross Validation

The changes in training and validation accuracy, as well as the comparison between training and validation loss, over each epoch, are illustrated in Fig. 14. In the initial epochs, validation accuracy was higher than training accuracy. However, after 100 epochs, validation accuracy fluctuated without significant improvement, while training accuracy steadily increased. From the loss curve, it is observed that after around 150 epochs, validation loss decreased at a slower rate compared to training loss, with the gap between them widening as training continued. After seeing no improvement in validation accuracy for 800 epochs, early stopping was used to terminate the training, indicating the model was overfitting. Despite using both L2-regularization and dropout, overfitting persisted.
Fig. 14
(a) Accuracy and (b) Loss performance on both training and validation data
Fig. 15
Confusion matrices of 1D-CNN for (a) Validation and (b) Test data
In the early stages, batch sizes of 8, 16, and 32 were tested to see whether this behavior changed, but it remained consistent. Several callback functions were employed throughout the experiments. The early stopping callback halted training when validation accuracy no longer improved or validation loss no longer decreased. Additionally, the model checkpoint callback saved the best-performing model, based on validation accuracy, after each epoch. Although checkpointing was in place, the saved weights were not used to resume training. Instead, each model was retrained independently from the beginning, with occasional adjustments to the hyperparameters. The best-performing model from each training run, as determined by validation accuracy, was then used for testing on the unseen data.
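The interplay of early stopping and checkpointing can be illustrated without any deep learning framework. In the sketch below, the function name, the accuracy history, and the patience value are all hypothetical. It monitors validation accuracy only, whereas the study's callbacks also tracked validation loss.

```python
def best_epoch_with_early_stopping(val_accuracy, patience):
    """Return (checkpointed_epoch, stopping_epoch) for an accuracy history.

    The checkpoint is the epoch with the best validation accuracy so far.
    Training stops once `patience` epochs pass without a new best value.
    """
    best_epoch, best_value = -1, float("-inf")
    for epoch, value in enumerate(val_accuracy):
        if value > best_value:
            best_epoch, best_value = epoch, value    # save a checkpoint
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch                 # stop early
    return best_epoch, len(val_accuracy) - 1         # ran to the end


history = [0.60, 0.70, 0.72, 0.71, 0.70, 0.69, 0.73]
print(best_epoch_with_early_stopping(history, patience=3))  # (2, 5)
```

In this toy history, training halts at epoch 5 and the model from epoch 2 is kept, so the later improvement at epoch 6 is never reached. That trade-off is why the patience value is usually set generously.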
The study also applied k-fold cross-validation to assess potential performance improvements. The training and validation data were split into multiple folds, maintaining a ratio of 8:1 for service lines. Although this approach led to a slight improvement, the overall performance remained largely unchanged. This outcome is likely due to the nature of the data and the limited dataset, which may not have been sufficient for the deep learning model to generalize effectively. Further field testing and additional data collection could help mitigate the overfitting issues.
Table 3
Predictions of unknown SLs using the 1D-CNN Model
| Location | Private: Actual | Private: 1D-CNN Prediction | Private: Result | Public: Actual | Public: 1D-CNN Prediction | Public: Result |
| --- | --- | --- | --- | --- | --- | --- |
| O101 |  | * |  | Lead | Lead | Correct |
| O102 |  | * |  | Not Lead | Not Lead | Correct |
| O103 |  | * |  | Lead | Lead | Correct |
| O104 |  | * |  | Lead | Lead | Correct |
| O105 |  | * |  | Not Lead | Not Lead | Correct |
| O106 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| O107 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| O108 |  | * |  | Not Lead | Not Lead | Correct |
| O109 |  | * |  | Not Lead | Lead | Not Correct |
| M102 |  | * |  | Not Lead | Not Lead | Correct |
| M103 |  | * |  | Not Lead | Lead | Not Correct |
| M104 |  | * |  | Not Lead | Not Lead | Correct |
| M111 | Lead | Not Lead | Not Correct |  | * |  |
| P101 | Not Lead | Lead | Not Correct |  | * |  |
| P102 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| P103 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| P104 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| P105 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| P106 | Not Lead | Lead | Not Correct | Not Lead | Not Lead | Correct |
| P107 | Not Lead | Lead | Not Correct | Not Lead | Lead | Not Correct |
| P108 | Not Lead | Not Lead | Correct | Not Lead | Not Lead | Correct |
| Q101 | Not Lead | Not Lead | Correct |  | * |  |
| Q102 | Not Lead | Not Lead | Correct |  | * |  |
| Q103 | Not Lead | Not Lead | Correct |  | * |  |
| Q104 | Not Lead | Not Lead | Correct |  | * |  |
| Q105 | Not Lead | Not Lead | Correct |  | * |  |
| Q106 | Not Lead | Not Lead | Correct |  | * |  |
| Q107 | Not Lead | Not Lead | Correct |  | * |  |
| Q109 | Not Lead | Not Lead | Correct |  | * |  |
| Q110 | Lead | Lead | Correct |  | * |  |
| Q111 | Lead | Not Lead | Not Correct |  | * |  |
| Q112 | Lead | Lead | Correct |  | * |  |

\* SL was not accessible or couldn't be tested using the technology

4.3 Evaluating the Classification Performance

All metrics were calculated for the testing data. The confusion matrices obtained from the validation and test data are shown in Fig. 15.
Of the 23 validation service lines (SLs), eight were lead and 15 were non-lead, with varying amounts of FRF data for each. After applying the evaluation strategy to each SL, the confusion matrix for the validation data was generated (see Fig. 15 (a)). The overall accuracy on the validation dataset was 73.9%. For lead cases, the model achieved a recall of 75.0% and a precision of 60.0%, resulting in an F1-score of 66.7%. For non-lead cases, the precision was 85%, the recall was 73%, and the F1-score was 79%.
For the independent testing dataset, 20 SLs were evaluated, of which seven were lead and 13 were non-lead. After applying the evaluation strategy, the confusion matrix for the test data was generated (see Fig. 15 (b)). The overall testing accuracy was 80%, which is slightly higher than for the validation data. For lead cases, the recall was 86% and the precision 66.7%, yielding an F1-score of 75%. For non-lead cases, the precision was above 90%, while the recall was lower at 76.9%, resulting in an F1-score of 83.3%.
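The reported percentages can be cross-checked with Eqs. (5) to (8). The confusion-matrix counts in the sketch below are not read from Fig. 15. They are back-calculated from the class totals and the precision and recall values quoted in the text, so they should be treated as an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score with lead as the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts inferred from the text: 8 lead + 15 non-lead (validation),
# 7 lead + 13 non-lead (test). These are assumptions, not figure data.
validation = binary_metrics(tp=6, fp=4, fn=2, tn=11)
test = binary_metrics(tp=6, fp=3, fn=1, tn=10)

for name, result in (("validation", validation), ("test", test)):
    print(name, {key: round(100 * value, 1) for key, value in result.items()})
```

Under these counts, the validation accuracy of 17/23 (73.9%) and the test accuracy of 16/20 (80%) match the values reported above, as do the lead-class precision, recall, and F1-scores.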
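Table 4
Summary of Predictions using 1D-CNN Model
| Material | Side | Correct | Total | Percent Correct |
| --- | --- | --- | --- | --- |
| Lead | Private | 2 | 4 | 50% |
| Lead | Public | 3 | 3 | 100% |
| Lead | Private and Public Combined | 5 | 7 | 71.43% |
| Not Lead | Private | 15 | 18 | 83% |
| Not Lead | Public | 13 | 16 | 81.25% |
| Not Lead | Private and Public Combined | 28 | 34 | 82.35% |

Overall accuracy = (5+28)/41 = 80.49%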

4.4 Evaluation of Prediction Performance on Blind Testing

After evaluating the performance metrics on the test data, the next step was to assess the model’s generalization capability by predicting service line materials from entirely new locations through real-world blind testing. In these cases, the actual material types were not disclosed in advance. The trained model was used to make predictions, which were later compared with the ground truth provided by the utilities. This created a true blind testing scenario under real operational conditions. A total of 41 service lines, from both public and private sides, across four different cities were included in this evaluation, as shown in Table 3. After completing the prediction for a given location, the newly validated data from that city were incorporated into the dataset. The model was then retrained using the same 8:1:1 data distribution strategy for training, validation, and testing. This updated model was subsequently used to predict service lines at the next location.
Table 3 presents the actual material types provided by the water utilities, with "Lead" indicating lead service lines and "Not Lead" for all other materials, alongside the predictions made by the 1D-CNN model. The comparison results are shown in the "Results" column, indicating whether the predictions were correct or incorrect.
Table 4 shows that the developed 1D-CNN model achieved an overall accuracy of 80.5%. Specifically, the model correctly identified 5 out of 7 lead cases, resulting in a recall of 71.4% for lead service lines. For non-lead cases, it correctly predicted 28 out of 34, a recall of 82.35% for non-lead materials. Notably, the model identified all three lead cases on the public side, whereas on the private side it correctly identified 50% of the lead cases. This suggests that the model generalized well for lead cases on the public side but requires further refinement to improve predictions on the private side. One potential explanation is the variety of surface types encountered on the private side during field testing, which may have introduced inconsistencies. The public side, by contrast, often featured more consistent conditions, such as asphalt surfaces. For non-lead cases, the private and public sides showed similar levels of accuracy.
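The percentages in this paragraph follow directly from the counts in Table 4, as the short tally below shows. The dictionary layout and variable names are illustrative. Only the counts come from the table.

```python
# (correct, total) per material and side, copied from Table 4.
BLIND_TEST = {
    ("lead", "private"): (2, 4),
    ("lead", "public"): (3, 3),
    ("non_lead", "private"): (15, 18),
    ("non_lead", "public"): (13, 16),
}


def percent_correct(cells):
    """Pool several (correct, total) pairs into one percentage."""
    correct = sum(c for c, _ in cells)
    total = sum(t for _, t in cells)
    return 100.0 * correct / total


lead_recall = percent_correct(
    [v for (material, _), v in BLIND_TEST.items() if material == "lead"])
non_lead_recall = percent_correct(
    [v for (material, _), v in BLIND_TEST.items() if material == "non_lead"])
overall_accuracy = percent_correct(list(BLIND_TEST.values()))

print(round(lead_recall, 2), round(non_lead_recall, 2), round(overall_accuracy, 2))
# 71.43 82.35 80.49
```

With only seven lead lines in the blind set, each additional correct or missed lead prediction shifts the lead recall by about 14 percentage points, which is worth keeping in mind when comparing the public and private sides.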

5 Conclusion and Future Work

This study explored ultrasonic stress wave propagation technology combined with deep learning to identify lead vs non-lead water service lines under real-world conditions. A 1D-CNN model was developed using frequency response function (FRF) data as input. The raw data were collected from nearly 20 U.S. cities, representing over 400 service lines. These data were analyzed using signal processing techniques, including synchronization, extrapolation of the excitation signal, and calculation of the FRF values. A revised evaluation strategy was proposed to handle inconsistent predictions across multiple signals from the same service line, improving reliability. The following summary can be drawn from this study:
  • Although the model trained effectively, overfitting was observed with the validation data, and this issue persisted despite implementing various strategies. The overfitting problem could be addressed with the addition of more field-testing data.
  • Using this evaluation strategy, the model achieved an overall accuracy of 80% on the test data and 74% on the validation data. In this study, predicting a lead service line as non-lead was considered the critical error. Therefore, a high recall for lead service lines was prioritized, and it was found to be 75% for the validation data and 86% for the test data.
  • The trained model and evaluation strategy were then applied to 41 blind-tested service lines, which included both public and private segments. The model correctly identified all three lead service lines on the public side, while on the private side, it identified two out of four lead service lines. For non-lead service lines, accuracy ranged between 81% and 83% for both public and private sides. The lower performance on the private side may be attributed to varying surface types and conditions encountered during field testing. The overall accuracy of the blind testing was 80.5%.
These results suggest that the stress wave technology proposed in this study, combined with signal processing and the 1D-CNN model, has the potential for wide implementation in distinguishing lead from non-lead service lines without excavation. However, several areas for improvement should be considered in future studies:
  • While this study developed a binary classification system for lead vs. non-lead materials, utilities would ideally prefer a more detailed breakdown of non-lead materials, such as copper, galvanized steel, and plastic. Achieving this would require collecting more data from field tests across all material categories.
  • Developing separate models for public and private service lines could improve prediction accuracy. This approach would be more practical, as surface conditions on the two sides often differ significantly.
  • Future research should focus on expanding the dataset and exploring alternative model architectures to further improve classification performance and model generalization.
  • Moreover, simulations will be conducted to better understand how variations in material properties influence FRF responses, thereby strengthening the physical interpretation of signal-based classification.

Acknowledgements

The work was supported by the Coulter-Drexel Translational Research Partnership Program, with Dr. Jaya Ghosh as the program director. The opinions expressed in this paper are solely those of the authors, and the Coulter-Drexel Program does not necessarily concur with, endorse, or adopt the findings, conclusions, and recommendations reported in the manuscript. The authors would like to thank the researchers of American Water, Suzanne G Chiavari, Zia Bukhari and Shaoqing Ge, for the many invaluable discussions. The authors are also grateful to Dr. Mustafa Furkan, Husain Ibrahaim and Fatmah Hasan for their help during the field data collection.

Declarations

’Not applicable’
YES

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Title
Identifying Lead Water Service Lines Using Ultrasonic Stress Wave Propagation and 1D-Convolutional Neural Network
Authors
K. I. M. Iqbal
John DeVitis
Kurt Sjoblom
Charles N. Haas
Ivan Bartoli
Publication date
1 September 2025
Publisher
Springer US
Published in
Journal of Nondestructive Evaluation / Issue 3/2025
Print ISSN: 0195-9298
Electronic ISSN: 1573-4862
DOI
https://doi.org/10.1007/s10921-025-01236-3
1.
Zurück zum Zitat Abdeljaber, O., Avci, O., Kiranyaz, S., et al.: Real-time vibration-based structural damage detection using one-dimensional convolutional neural networks. Journal of sound and vibration 388, 154–170 (2017)CrossRef
2.
Zurück zum Zitat Abdeljaber, O., Avci, O., Kiranyaz, M.S., et al.: 1-d cnns for structural damage detection: Verification on a structural health monitoring benchmark data. Neurocomputing 275, 1308–1317 (2018)CrossRef
3.
Zurück zum Zitat Abernethy, J., Chojnacki, A., Farahi, A., et al.: Activeremediation: The search for lead pipes in flint, michigan. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 5–14 (2018)
4.
Zurück zum Zitat Agency UEP.: Integrated science assessment for lead. office of research and development. EPA/600/R–10/075F (2013)
5.
Zurück zum Zitat Agency UEP.: National primary drinking water regulations: Lead and copper rule revisions. Fed Regist 84(219), (2021)
6.
Zurück zum Zitat Aminpour, P., Sjoblom, K., Bartoli, I. Identification of water pipe material based on stress wave propagation: Numerical investigations. Mater. Eval. 79(8) (2021)
7.
Zurück zum Zitat Aristegui, C., Lowe, M.J., Cawley, P.: Guided waves in fluid-filled pipes surrounded by different fluids. Ultrasonics 39(5), 367–375 (2001)CrossRef
8.
Zurück zum Zitat Avci, O., Abdeljaber, O., Kiranyaz, S., et al.: Structural damage detection in real time: implementation of 1d convolutional neural networks for shm applications. In: Structural Health Monitoring & Damage Detection, Volume 7: Proceedings of the 35th IMAC, A Conference and Exposition on Structural Dynamics 2017, Springer, pp 49–54 (2017)
9.
Zurück zum Zitat Avci, O., Abdeljaber, O., Kiranyaz, S., et al.: Wireless and real-time structural damage detection: A novel decentralized method for wireless sensor networks. J. Sound Vib. 424, 158–172 (2018)CrossRef
10.
Zurück zum Zitat Avci, O., Abdeljaber, O., Kiranyaz, S., et al.: A review of vibration-based damage detection in civil structures: From traditional methods to machine learning and deep learning applications. Mech. Syst. Signal Process. 147(107), 077 (2021)
11.
Zurück zum Zitat Azimi, M., Pekcan, G.: Structural health monitoring using extremely compressed data through deep learning. Comput. Aided Civ. Infrastruct. Eng. 35(6), 597–614 (2020)CrossRef
12.
Zurück zum Zitat Bao, Y., Tang, Z., Li, H., et al.: Computer vision and deep learning-based data anomaly detection method for structural health monitoring. Struct. Health. Monit. 18(2), 401–421 (2019)CrossRef
13.
Zurück zum Zitat Boaratti, MFG., Ting, DKS., et al.: Measurement of stress waves propagation velocities in solid media using wavelet transforms. In: International Nuclear Atlantic Conference, XIV ENFIR, Aug (2005)
14.
Zurück zum Zitat Bukhari, Z., Ge, S., Chiavari, S., et al.: Lead service line identification techniques. Water Res. Found. Proj. (4693) (2020)
15.
Zurück zum Zitat Chawla, N.V., Bowyer, K.W., Hall, L.O., et al.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)CrossRef
16.
Zurück zum Zitat Chojnacki, A., Dai, C., Farahi, A., et al.: A data science approach to understanding residential water contamination in flint. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1407–1416 (2017)
17.
Zurück zum Zitat Cornwell, D.A., Brown, R.A., Via, S.H.: National survey of lead service line occurrence. J. Am. Water Works Assoc. 108(4), E182–E191 (2016)CrossRef
18.
Zurück zum Zitat Coşkun M, Uçar A, Yildirim Ö, et al (2017) Face recognition based on convolutional neural network. In: 2017 International Conference on Modern Electrical and Energy Systems (MEES), IEEE, pp 376–379
19.
Zurück zum Zitat Cui, R., Azuara, G., Lanza di Scalea, F., et al.: Damage imaging in skin-stringer composite aircraft panel by ultrasonic-guided waves using deep learning with convolutional neural network. Struct. Health Monit. 21(3), 1123–1138 (2022)CrossRef
20.
Zurück zum Zitat Cui, Y., Jia, M., Lin, TY., et al.: Class-balanced loss based on effective number of samples. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 9268–9277 (2019)
21.
Zurück zum Zitat Dang, H.V., Tatipamula, M., Nguyen, H.X.: Cloud-based digital twinning for structural health monitoring using deep learning. IEEE Trans. Ind. Inform. 18(6), 3820–3830 (2021)CrossRef
22.
Zurück zum Zitat DanieIs, JJ.: Fundamentals of ground penetrating radar. In: 2nd EEGS Symposium on the Application of Geophysics to Engineering and Environmental Problems, European Association of Geoscientists & Engineers, pp cp–213 (1989)
23.
Zurück zum Zitat Das, S., Roy, K.: A state-of-the-art review on frf-based structural damage detection: development in last two decades and way forward. Int. J. Struct. Stab. Dyn. 22(02):2230,001 (2022)
24.
De Marchi, L.: Sparse signal processing and deep learning for guided waves NDT and SHM. In: Proceedings of Meetings on Acoustics, AIP Publishing (2019)
25.
Deb, A.K., Hasit, Y.J., Grablutz, F.M.: Innovative techniques for locating lead service lines. AWWA Res. Found. Am. Water Works Assoc. (1995)
26.
Driss, S.B., Soua, M., Kachouri, R., et al.: A comparison study between MLP and convolutional neural network models for character recognition. In: Real-Time Image and Video Processing 2017, SPIE, pp 32–42 (2017)
27.
Du, J.: Understanding of object detection based on CNN family and YOLO. In: Journal of Physics: Conference Series, IOP Publishing, p 012029 (2018)
28.
Ebrahimkhanlou, A., Salamone, S.: Single-sensor acoustic emission source localization in plate-like structures using deep learning. Aerospace 5(2), 50 (2018)
29.
Michigan Department of Environment, Great Lakes, and Energy (EGLE): Michigan lead and copper rules service line material notification requirements. Drinking Water and Environmental Health Division (2020)
30.
Eren, L.: Bearing fault detection by one-dimensional convolutional neural networks. Math. Probl. Eng. 2017(1), 8617315 (2017)
31.
Eren, L., Ince, T., Kiranyaz, S.: A generic intelligent bearing fault diagnosis system using compact adaptive 1D CNN classifier. J. Signal Process. Syst. 91(2), 179–189 (2019)
32.
Ewald, V., Groves, R.M., Benedictus, R.: DeepSHM: A deep learning approach for structural health monitoring based on guided Lamb wave technique. In: Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems 2019, SPIE, pp 84–99 (2019)
33.
Goodfellow, I.: Deep learning (2016)
34.
Gu, J., Wang, Z., Kuen, J., et al.: Recent advances in convolutional neural networks. Pattern Recognit. 77, 354–377 (2018)
35.
Guo, J., Liu, C., Cao, J., et al.: Damage identification of wind turbine blades with deep convolutional neural networks. Renew. Energy 174, 122–133 (2021)
36.
Hao, T., Rogers, C., Metje, N., et al.: Condition assessment of the buried utility service infrastructure. Tunn. Undergr. Space Technol. 28, 331–344 (2012)
37.
Helbing, G., Ritter, M.: Deep learning for fault detection in wind turbines. Renew. Sustain. Energy Rev. 98, 189–198 (2018)
38.
Hu, G., Yang, Y., Yi, D., et al.: When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp 142–150 (2015)
39.
Ince, T., Kiranyaz, S., Eren, L., et al.: Real-time motor fault detection by 1-D convolutional neural networks. IEEE Trans. Ind. Electron. 63(11), 7067–7075 (2016)
40.
Iqbal, K., Hasan, F., Sjoblom, K., et al.: Buried service line material characterization using stress wave propagation: Numerical and experimental investigations. J. Nondestruct. Eval. 43(1), 12 (2024)
41.
Jallouli, A.: Evaluation of lead pipe detection by electrical resistance measurement (2020)
42.
Kim, Y.S., Kim, M.K., Fu, N., et al.: Investigating the impact of data normalization methods on predicting electricity consumption in a building using different artificial neural network models. Sustainable Cities and Society, p 105570 (2024)
43.
Kiranyaz, S., Ince, T., Gabbouj, M.: Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Trans. Biomed. Eng. 63(3), 664–675 (2015)
44.
Kiranyaz, S., Ince, T., Hamila, R., et al.: Convolutional neural networks for patient-specific ECG classification. In: 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, pp 2608–2611 (2015)
45.
Kiranyaz, S., Ince, T., Gabbouj, M.: Personalized monitoring and advance warning system for cardiac arrhythmias. Sci. Rep. 7(1), 9270 (2017)
46.
Kiranyaz, S., Gastli, A., Ben-Brahim, L., et al.: Real-time fault detection and identification for MMC using 1-D convolutional neural networks. IEEE Trans. Ind. Electron. 66(11), 8760–8771 (2018)
47.
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inform. Process. Syst. 25 (2012)
48.
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
49.
Leinov, E., Lowe, M.J., Cawley, P.: Investigation of guided wave propagation and attenuation in pipe buried in sand. J. Sound Vib. 347, 96–114 (2015)
50.
Leinov, E., Lowe, M.J., Cawley, P.: Ultrasonic isolation of buried pipes. J. Sound Vib. 363, 225–239 (2016)
51.
LeNail, A.: NN-SVG: Publication-ready neural network architecture schematics. J. Open Source Softw. 4(33), 747 (2019)
52.
Liggett, J., Baribeau, H., Deshommes, E., et al.: Service line material identification: Experiences from North American water systems. J. Am. Water Works Assoc. 114(1), 8–19 (2022)
53.
Ling, C.X., Sheng, V.S.: Cost-sensitive learning and the class imbalance problem. Encycl. Mach. Learn. 2011, 231–235 (2008)
54.
Long, R., Vine, K., Lowe, M., et al.: The effect of soil properties on acoustic wave propagation in buried iron water pipes. In: AIP Conference Proceedings, American Institute of Physics, pp 1310–1317 (2002)
55.
Long, R., Cawley, P., Lowe, M.: Acoustic wave propagation in buried iron water pipes. Proceedings of the Royal Society of London Series A: Mathematical, Physical and Engineering Sciences 459(2039), 2749–2770 (2003)
56.
Long, R., Lowe, M., Cawley, P.: Attenuation characteristics of the fundamental modes that propagate in buried iron water pipes. Ultrasonics 41(7), 509–519 (2003)
57.
Melville, J., Alguri, K.S., Deemer, C., et al.: Structural damage detection using deep learning of ultrasonic guided waves. In: AIP Conference Proceedings, AIP Publishing (2018)
58.
Metje, N., Atkins, P., Brennan, M., et al.: Mapping the underworld – state-of-the-art review. Tunn. Undergr. Space Technol. 22(5–6), 568–586 (2007)
59.
Mohtasham Khani, M., Vahidnia, S., Ghasemzadeh, L., et al.: Deep-learning-based crack detection with applications for the structural health monitoring of gas turbines. Struct. Health Monit. 19(5), 1440–1452 (2020)
60.
Muggleton, J., Brennan, M., Pinnington, R.: Wavenumber prediction of waves in buried pipes for water leak detection. J. Sound Vib. 249(5), 939–954 (2002)
61.
Oelze, M.L., O’Brien, W.D., Darmody, R.G.: Measurement of attenuation and speed of sound in soils. Soil Sci. Soc. Am. J. 66(3), 788–796 (2002)
62.
O’Shea, K.: An introduction to convolutional neural networks (2015). arXiv preprint arXiv:1511.08458
63.
Pal, J., Sikdar, S., Banerjee, S.: A deep-learning approach for health monitoring of a steel frame structure with bolted connections. Struct. Control Health Monit. 29(2) (2022)
64.
Pathak, N.: Bridge health monitoring using CNN. In: 2020 International Conference on Convergence to Digital World-Quo Vadis (ICCDW), IEEE, pp 1–4 (2020)
65.
Pereira, J., Saraiva, F.: Convolutional neural network applied to detect electricity theft: A comparative study on unbalanced data handling techniques. Int. J. Electr. Power Energy Syst. 131, 107085 (2021)
66.
National Toxicology Program: NTP monograph on health effects of low-level lead. Hypertension 160, 95 (2012)
67.
Rautela, M., Senthilnath, J., Moll, J., et al.: Combined two-level damage identification strategy using ultrasonic guided waves and physical knowledge assisted machine learning. Ultrasonics 115, 106451 (2021)
68.
Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2016)
69.
Ruiz, D.V., de Bragança, C.S.C., Poncetti, B.L., et al.: Vibration-based structural damage detection strategy using FRFs and machine learning classifiers. In: Structures, Elsevier, p 105753 (2024)
70.
Ruiz, D.V., de Bragança, C.S.C., Poncetti, B.L., et al.: Structural damage detection for a small population of nominally equal beams using PSO-optimized convolutional neural networks. Mech. Syst. Signal Process. 225, 112276 (2025)
71.
Ruiz, J.T., Pérez, J.D.B., Blázquez, J.R.B.: Arrhythmia detection using convolutional neural models. In: Distributed Computing and Artificial Intelligence, 15th International Conference 15, Springer, pp 120–127 (2019)
72.
Sampaio, R., Maia, N., Silva, J.: Damage detection using the frequency-response-function curvature method. J. Sound Vib. 226(5), 1029–1042 (1999)
73.
Sarkar, S., Reddy, K.K., Giering, M.: Deep learning for structural health monitoring: A damage characterization application. In: Annual Conference of the PHM Society (2016)
74.
Schmidhuber, J.: Deep learning in neural networks: An overview. Neural Netw. 61, 85–117 (2015)
75.
Srivastava, N., Hinton, G., Krizhevsky, A., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
76.
Su, N., Han, Q., Yang, Y., et al.: Analysis of longitudinal guided wave propagation in a liquid-filled pipe embedded in porous medium. Appl. Sci. 11(5), 2281 (2021)
77.
Tighe, M., Bielski, M., Wilson, M., et al.: A sensitive XRF screening method for lead in drinking water. Anal. Chem. 92(7), 4949–4953 (2020)
78.
Van Hulse, J., Khoshgoftaar, T.M., Napolitano, A.: Experimental perspectives on learning from imbalanced data. In: Proceedings of the 24th International Conference on Machine Learning, pp 935–942 (2007)
79.
Wang, C., Xin, C., Xu, Z.: A novel deep metric learning model for imbalanced fault diagnosis and toward open-set classification. Knowledge-Based Systems 220, 106925 (2021)
80.
Wang, X., Lin, M., Li, J., et al.: Ultrasonic guided wave imaging with deep learning: Applications in corrosion mapping. Mech. Syst. Signal Process. 169, 108761 (2022)
81.
Xing, Z., Zhao, R., Wu, Y., et al.: Intelligent fault diagnosis of rolling bearing based on novel CNN model considering data imbalance. Appl. Intell. 52(14), 16281–16293 (2022)
82.
Xu, X., Cao, D., Zhou, Y., et al.: Application of neural network algorithm in fault diagnosis of mechanical intelligence. Mech. Syst. Signal Process. 141, 106625 (2020)
83.
Zhang, W., Peng, G., Li, C.: Bearings fault diagnosis based on convolutional neural networks with 2-D representation of vibration signals as input. In: MATEC Web of Conferences, EDP Sciences, p 13001 (2017)
