Published in: Neural Computing and Applications 2/2023

Open Access 11.11.2022 | Review

Application of streaming analytics for Artificial Lift systems: a human-in-the-loop approach for analysing clustered time-series data from progressive cavity pumps

Authors: Fahd Saghir, M. E. Gonzalez Perdomo, Peter Behrenbruch



Abstract

Assessing real-time performance of Artificial Lift Pumps is a prevalent time-series problem to tackle for natural gas operators in Eastern Australia. Multiple physics, data-driven, and hybrid approaches have been investigated to analyse or predict pump performance. However, these methods present a challenge in running compute-heavy algorithms on streaming time-series data. As there is limited research on novel approaches to tackle multivariate time-series analytics for Artificial Lift systems, this paper introduces a human-in-the-loop approach, where petroleum engineers label clustered time-series data to aid in streaming analytics. We rely on our recently developed novel approach of converting streaming time-series data into heatmap images to assist with real-time pump performance analytics. During this study, we were able to automate the labelling of streaming time-series data, which helped petroleum and well surveillance engineers better manage Artificial Lift Pumps through machine learning supported exception-based surveillance. The streaming analytics system developed as part of this research used historical time-series data from three hundred and fifty-nine (359) coal seam gas wells. The developed method is currently used by two natural gas operators, where the operators can accurately detect ten (10) performance-related events and five (5) anomalous events. This paper serves a two-fold purpose; first, we describe a step-by-step methodology that readers can use to reproduce the clustering method for multivariate time-series data. Second, we demonstrate how a human-in-the-loop approach adds value to the proposed method and achieves real-world results.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

The State of Queensland is home to approximately nine thousand natural gas wells [1], where energy operators depend on positive displacement pumps to produce hydrocarbons from these geographically dispersed Coal Seam Gas (CSG) assets. As the natural gas supplied from these wells is critical to sustaining energy demand in domestic and international markets, operators need to avoid unplanned downtime caused by pump failures. To monitor pump performance, data acquisition and control systems are deployed across the entire fleet of CSG wells, where they gather and transmit time-series data from pumps and well sensors. Depending on the natural gas operating company, a petroleum engineer may be assigned to manage anywhere from fifty to a hundred wells. They monitor streaming time-series data to evaluate pump performance and anticipate any failure that may disrupt gas production. However, monitoring, analysing, and mitigating issues on a well-by-well basis is a tedious task, and most often, critical pump events are either missed or ignored [2]. Most importantly, CSG producers are looking to add several hundred wells in the coming years to sustain global energy demand, which will only exacerbate the real-time pump performance analysis issue. This is where machine learning-assisted pump performance analysis can improve pump life.

1.1 Drawbacks of time-series analysis methods used for artificial lift systems

Generally, time-series analysis of Artificial Lift systems is based on fuzzy logic [3, 4], physics-based models [5] or machine learning-based [6–9] pattern recognition methods. However, such methods present a drawback when assessing an Artificial Lift system's performance, as they identify events without context, which may or may not impact pump performance. Moreover, these methods rely on labelled or known events, and any new or outlier events go undetected. Furthermore, it is rare to find labelled datasets for Artificial Lift applications. In most cases, the assistance of subject matter experts (SMEs) is required to label data sets for improved failure prediction results [10]. However, labelling patterns in raw time-series data is challenging for SMEs, as each pump presents a different performance profile, where the same anomaly or event may have very different characteristics, such as the amplitude and length of an event.

1.2 Limitations of time-series clustering methods

In a recently published paper benchmarking eight (8) well-known time-series clustering methods [11], the authors set the following limitations on their evaluation:
1.
Uniform length time-series: The benchmarked methods mentioned in the paper above were tested on time-series data of uniform length over a pre-defined time window. However, time-series data from industrial sensors mostly have non-uniform lengths.
 
2.
Known number of clusters: The datasets tested to benchmark the clustering methods had a known number of clusters (or k values). As our previous publications have demonstrated [12–14], it is impossible to pre-define a set number of clusters for industrial time-series data, especially when dealing with data gathered from Artificial Lift Systems.
 
Another notable study on deep time-series-based clustering [15] mentions similar or related drawbacks, which will be discussed later in the Related Works section.

1.3 A practical approach for streaming time-series analysis of artificial lift systems

To address the drawbacks and limitations mentioned above, we propose a human-assisted approach to labelling clustered time-series data that can be utilized for running streaming performance analytics of positive displacement pumps.
Our research has two unique parts. First, we define a streamlined process to cluster multivariate time-series data. This process is based on our previous research work, where we convert multivariate time-series data into performance heatmap images [14]. These images are then converted to unlabelled clusters based on the methodology defined later in the paper.
Second, to assist with the cluster labelling process, we developed a cluster analysis tool for engineers, where they could apply their petroleum domain expertise to label events of interest. Through this tool, petroleum engineers can combine their expertise with streaming analytics and automate the process of labelling events of interest, allowing them to manage Artificial Lift Systems proactively. Furthermore, petroleum engineers from two CSG operating companies currently use the cluster analysis tool developed as part of this research for their daily analysis of approximately five hundred PCP wells.

2 Overview of Coal Seam Gas production

In eastern Australia, natural gas is predominantly produced through CSG production, where coal seams are depressurized through a dewatering process that allows gas to escape from coal cleats and flow to the surface. Positive displacement pumps are installed in CSG wells, which produce water to the surface and, in the process, depressurize the coal seams. In the oil and gas industry, such pumps are referred to as Artificial Lift Pumps, and a network of these pumps collectively forms an Artificial Lift System. In Fig. 1 (Left), we see how water is displaced from the bottom of the well through the Production Tubing, and gas is produced via the production casing.
A salient characteristic of CSG wells is that they have a shorter production life span, usually ten (10) years, compared to conventional gas-producing wells. This lifespan is shown in Fig. 1 (right), with three (3) distinct stages, where a large quantity of water is produced initially to depressurize the coal seams, followed by a production stage with an increase in gas production. Finally, gas rates decline towards the end of the production lifecycle.
As gas production depletes quickly, CSG producers in Queensland must periodically drill and add new wells to maintain natural gas supplies. Hence, many CSG wells are dotted across Queensland, and this geographical spread and density are shown in Fig. 2.

2.1 Progressive Cavity Pumps

As in any positive displacement pump, a rotor and a stator work in tandem to push liquid through and achieve vertical hydraulic lift. Figure 3 shows the various components of a Progressive Cavity Pump (PCP) assembly installed in a producing well. The rotor and elastomer assembly are designed such that the cavities between them push the fluid through when the rotor is operational.
Equations (1) and (2) show the correlation between speed, flow and torque. Time-series trends of these three parameters provide the necessary operational details of PCPs over their lifetime. Hence, the multivariate time-series analysis of our study will focus on these three parameters.
The correlation between flow and speed is shown in Eq. (1) [18].
$$q_{th} = s \omega$$
(1)
where qth = theoretical flow, s = pump displacement, ω = rotational speed.
The correlation between torque and speed is shown in Eq. (2) [18].
$$T_{{{\text{pr}}}} = \frac{{P_{{{\text{pmo}}}} E_{{{\text{pt}}}} }}{C\omega }$$
(2)
where Tpr = polished rod torque, Ppmo = prime mover power, Ept = power transmission efficiency, C = constant, ω = rotational speed
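The two correlations above translate directly into code. The sketch below is illustrative only; the function and variable names are ours, and consistent (but unspecified) units are assumed:

```python
def theoretical_flow(displacement, speed):
    """Eq. (1): theoretical flow q_th = s * omega."""
    return displacement * speed

def polished_rod_torque(prime_mover_power, transmission_efficiency, constant, speed):
    """Eq. (2): T_pr = (P_pmo * E_pt) / (C * omega)."""
    return (prime_mover_power * transmission_efficiency) / (constant * speed)

# Example: a pump with displacement 0.5 running at 200 rpm
q = theoretical_flow(0.5, 200)  # -> 100.0
```

Time-series trends of these quantities, computed per sample, are what the rest of the paper analyses.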

2.2 Data gathering from CSG wells

As CSG wells are located in remote and geographically dispersed areas, operators must utilize Supervisory Control and Data Acquisition (SCADA) Systems to control wells through Wireless Telemetry. Ultra-high frequency (UHF) or microwave radios transmit data from CSG wells to a central control room. Figure 4 shows a layout of a typical CSG well site. The Remote Telemetry Unit (RTU) installed at each well site is responsible for recording data from multiple sensors and forwarding it to a central SCADA system. The data are stored and historized in data servers and delivered onwards to a corporate Historian database. It is important to note that data transferred via SCADA systems may not always have a fixed transmit rate; hence, data reporting is in most cases asynchronous, and time windows are not of identical length. Some SCADA systems use a report-by-exception approach, where data are only transmitted when a critical data point changes based on a pre-set percentage change. The report-by-exception method also produces data of unequal time windows.
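The report-by-exception behaviour can be illustrated with a small deadband filter. This is a hypothetical sketch of the concept, not the operators' actual RTU or SCADA logic:

```python
def report_by_exception(samples, deadband_pct=5.0):
    """Yield (timestamp, value) pairs only when the value moves by more than
    deadband_pct relative to the last transmitted value."""
    last = None
    for ts, value in samples:
        if last is None or abs(value - last) > abs(last) * deadband_pct / 100.0:
            last = value
            yield ts, value

readings = [(0, 100.0), (1, 101.0), (2, 110.0), (3, 111.0), (4, 95.0)]
sent = list(report_by_exception(readings))
# sent -> [(0, 100.0), (2, 110.0), (4, 95.0)]
```

Because only the points outside the deadband are transmitted, the resulting time windows have unequal lengths, which is exactly the irregularity the heatmap approach must tolerate.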
3 Related works

Unlike univariate time-series data, applying anomaly detection and clustering methods to multivariate time-series data is a complex task which requires additional interpretation and insights [19]. In this section, we will further shed light on research gaps in multivariate time-series based anomaly detection and clustering methods. Furthermore, our previous work on Symbolic Aggregation Approximation (SAX)-based performance heatmap conversion [14] will be discussed to demonstrate why this novel approach provides a better basis for a human-in-the-loop approach when clustering multivariate time-series data. Finally, we will discuss why a human-in-the-loop approach adds value to the time-series analysis process proposed in this paper.

3.1 Neural net-based anomaly detection

Neural nets have become a popular choice to solve anomaly detection problems in time-series data. One approach proposes using a fully connected convolutional network, U-Net, to identify anomalies in multivariate time-series data [20]. This method treats a fixed-length multivariate time-series snapshot as a multi-channel image. A U-Net segmentation technique is applied to obtain a final convolution layer corresponding to an augmentation map. The last layer includes the anomaly identification classes for the time-series snapshot, and each anomaly class is considered a mutually exclusive event. However, there are two drawbacks to this anomaly detection approach. Firstly, as the U-Net architecture accepts a fixed number of samples as input in a time window, the time-series data must be resized based on up-sampling or down-sampling techniques. Second, as each anomaly is a mutually exclusive event, it is difficult to segregate anomalies of interest from a routine change in process behaviour.
Another neural net-based anomaly detection approach proposes a Multi-Scale Convolutional Recurrent Encoder–Decoder (MSCRED) method [21]. This method converts multivariate time-series data into signature matrices based on the pairwise inner-product of time-series data streams. The matrices are encoded using a fully connected convolutional encoder. A Convolutional Long Short-Term Memory (ConvLSTM) network is used to extract the hidden layer of each encoder stage, which is added to a convolutional decoder to produce a reconstructed signature matrix. The difference between the original signature and the reconstructed matrix is labelled as the residual signature matrix. This matrix defines a loss function that helps the model detect anomalies in multivariate time-series data. The residual signature matrix also helps determine the duration of anomaly events in time-series data based on small, medium, and large time-window duration.
Although the MSCRED methodology is novel in its approach, there are two limitations to using it for multivariate time-series analysis. First, identifying anomaly events depends on the time-window duration; therefore, the durations of the small, medium and large time windows have to be tuned based on the properties of the time-series data and the application where it will be applied. Second, this approach does not consider the state of the process from time zero (t0), when the process was initiated for the first time. This restriction, therefore, fails to observe any changes in pump mechanical degradation, which can provide additional insights into time-series-based performance analysis.

3.2 Neural net-based time-series clustering

Multiple research papers have recently been published on neural net-based time-series clustering methods [15, 22–24], for both univariate and multivariate data sets. These novel research methods extract feature matrices which are fed to a neural net architecture to extract low-dimensional embeddings. The embeddings are then used to cluster the time-series data with a known clustering method, primarily the k-means method, which means the number of clusters must be known beforehand.
Although our approach is similar, we do not need to know the number of clusters beforehand. Most importantly, our low-dimensional embeddings are based on the novel approach of SAX-derived time-series performance heatmap images.

3.3 Converting time-series data into performance heatmap images

This section provides an overview of how the SAX-based performance heatmaps are created for improved understanding of Artificial Lift Performance analysis and, more importantly, how these images provide contextual clustering of multivariate time-series data.

3.3.1 Expanding window technique

To understand how PCPs operate in CSG operations, it is essential to look at their performance from the day they are initiated into operation for dewatering wells. For this purpose, we use the expanding window technique shown in Fig. 5, which evaluates the multivariate data in the expansion stride based on the elapsed pump performance. By doing so, the exploratory data analysis methods utilized for performance analysis can capture the varying mechanical dynamics in the PCP through the pump's life.
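The expanding window technique can be sketched as a simple generator, where every window starts at pump start-up (t0) and grows by one stride at a time (names and the stride value are illustrative):

```python
def expanding_windows(series, stride):
    """Yield ever-growing slices of `series`: [0:stride], [0:2*stride], ...
    Every window starts at pump start-up (t0), so each stride re-evaluates
    the pump's entire elapsed performance."""
    for end in range(stride, len(series) + stride, stride):
        yield series[:end]

data = list(range(10))
windows = list(expanding_windows(data, 4))
# 3 windows: data[0:4], data[0:8], data[0:10]
```

In contrast to a sliding window, no early-life data are ever discarded, which lets the analysis capture gradual mechanical changes over the pump's life.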

3.3.2 Symbolic aggregation approximation (SAX)-based performance heatmaps for PCPs

Performance heatmaps help capture the temporal variation and time-window-based impact of multiple sensor readings in a single image [12]. By converting time-series data into performance heatmaps, it is possible to visualize the sequential variation in sensor readings while understanding the influence of change in sensor values between time windows. Furthermore, the performance heatmap approach is exempt from some of the shortcomings of other time-series-based image conversion techniques.
While other time-series to image conversion methods require a fixed sampling rate for each analysed time window to produce consistent images, the performance heatmap technique overcomes this deficiency by converting sensor values into Symbolic Aggregation Approximation (SAX) symbols [27]. The SAX symbols obtained through the conversion of time-series data are transformed into a symbol matrix and then converted to a performance heatmap; an example of this SAX-based time-series image conversion is given in [12]. Figure 6 shows a 1-h time-series trend of flow, speed and torque converted to a performance heatmap.
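The SAX symbolization step underlying the heatmaps can be sketched as follows. This is a minimal illustration of SAX itself (z-normalization, piecewise aggregation, Gaussian breakpoints), not the full heatmap pipeline from [12, 14], and the alphabet size is an assumed value:

```python
from statistics import NormalDist, mean, pstdev

def sax(series, n_segments, alphabet="abc"):
    """Minimal SAX: z-normalise, piecewise-aggregate into n_segments,
    then map each segment mean to a symbol via Gaussian breakpoints."""
    mu, sd = mean(series), pstdev(series) or 1.0  # guard against flat series
    z = [(x - mu) / sd for x in series]
    seg = len(z) // n_segments
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    k = len(alphabet)
    # Breakpoints split the standard normal into k equiprobable regions
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)

# A ramp from low to high maps to low -> high symbols
print(sax([1, 1, 1, 5, 5, 5, 9, 9, 9], 3))  # -> "abc"
```

Because the symbols depend on segment means rather than a fixed sampling rate, windows of unequal length can still be mapped to a consistent symbol matrix.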
Moreover, most image conversion techniques [28] are developed for univariate time-series data. Although some techniques convert multivariate data into images [29], they mostly rely on converting univariate data into images and then either stack them horizontally or vertically to create a single 2D image.

3.3.3 Majority and anomaly heatmap images

Once the performance heatmaps are created, they can be split into majority and anomaly event images. Table 1 shows the time-based colour code used to label majority, variation and anomaly events in a performance heatmap. In this study, we will only focus on majority and anomaly events, as the variation events are events in transition that are not significant in deducing any abnormal behaviour of the PCPs.
Table 1
Color code for performance heatmaps based on the counts of SAX symbols in a 1-day window
Figure 7 shows how a performance heatmap is split into majority and anomaly event images. We will use these images to create our unsupervised image cluster library for time-series data. Each image has a dimension of 48 × 48 × 3 (6912 pixels).
Figure 8 provides a breakdown of the majority and anomaly heatmaps where we have three (3) parameters (flow, torque and speed) represented in their respective columns. The position of the coloured boxes provides the state of each parameter, which can either be low, medium or high. These states can be used in cluster labelling processing to group various time-series clusters into similar performance and anomalous event categories.

3.4 Advantages of a human-in-the-loop approach for data labelling

Multivariate time-series data collected from industrial processes are seldom labelled, and hence, extracting any meaningful information from such data requires input from domain experts [19, 25]. This holds true for CSG operations, where the geophysical dynamics between coal seams and the pump require inference from domain experts for correct interpretation of normal and abnormal behaviour. By adding human inference to unlabelled data sets, it becomes easier for domain experts to accept the results generated by machine learning solutions [26].
In our methodology, we rely on cluster labelling from petroleum engineers to correctly identify various performance states of PCPs used in Artificial Lift Systems. The SAX performance heatmaps provide petroleum engineers with a visual context as to why different clusters are identified based on the majority and anomaly heatmaps. In the next section, we will cover in detail the methodology to cluster SAX-derived performance heatmaps and how petroleum engineers label these clusters via a cluster analysis tool.

4 Methodology

In this section, we will discuss the methodology shown in Fig. 9, which is used to cluster unlabelled time-series data:
1.
Develop a Performance Heatmap-specific Auto-Encoder: This step will be used as the first dimensionality reduction method to reduce the size of the images, which will lower memory usage and improve calculation times on the computer used for conducting the experiments.
 
2.
Embedding-based Dimensionality Reduction: In this step, we will experiment with various dimensionality reduction methods (DRMs) and reduce the dimension of auto-encoded images to a 2-dimensional plane, allowing the performance heatmap groupings to be visualized in an XY plot.
 
3.
Hierarchical Density-Based Spatial Clustering (HDBSCAN): We will use HDBSCAN to identify various clusters within the 2-dimensional plane of various DRMs in step 2.
 
4.
Cluster Labelling: Once the images have been clustered, we will assign a label to known historical events from a selected number of wells using a cluster analysis tool. Once these events are labelled, we will use an automated cluster labelling pipeline to identify events on real-time data.
 
Assumptions
Our methodology works under the following assumptions:
1.
Availability of Key Data Variables: To analyse PCP performance, the time-series data should have flow, torque and speed variables. These three variables are needed to produce the SAX performance heatmaps required for the clustering process. For other multivariate time-series applications, key variables should be defined based on the process being analysed.
 
2.
Domain Expertise: Petroleum engineers using the cluster analysis tool should have relevant experience in their field to properly label SAX performance heatmap clusters.
 
3.
Data Completeness: The multivariate data set used for the clustering process must cover the entire operation cycle of a PCP in CSG operation, i.e. the time-series data from beginning to end-of-life of PCPs. This will help capture various performance heatmaps over the life cycle of CSG wells.
 
Experiment tracking setup
We used Weights & Biases [30] for experiment tracking and visualizations to develop insights for this paper. The Weights & Biases application allows automated tracking of machine learning experiments through Code Sweeps. Through this, multiple combinations of model training, hyperparameter tuning and clustering results can be captured and visualized to obtain the best results for machine learning projects. Figure 10 provides an overview of how multiple sweep experiments can be recorded and visualized to provide actionable insights into the effect of different layer properties for a deep auto-encoder (DAE). All coding for these experiments was done using Python 3.7 and necessary statistics, computer vision and machine learning libraries suited for this Python version.
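A grid sweep of the kind tracked in these experiments can be declared as a configuration dictionary. The keys below follow the Weights & Biases sweep convention, while the parameter values mirror Table 2; the authors' exact sweep configuration is not shown in the paper:

```python
# Grid sweep over the 2-layer DAE channel sizes from Table 2 (6 combinations)
sweep_config = {
    "method": "grid",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "layer1": {"values": [16, 32]},
        "layer2": {"values": [4, 8, 16]},
        "epochs": {"value": 100},
        "loss": {"value": "binary_crossentropy"},
    },
}
# With wandb installed, this would be launched via:
#   sweep_id = wandb.sweep(sweep_config, project="pcp-heatmaps")
#   wandb.agent(sweep_id, function=train)
n_runs = len(sweep_config["parameters"]["layer1"]["values"]) * \
         len(sweep_config["parameters"]["layer2"]["values"])  # -> 6
```

A grid method enumerates every layer1 × layer2 combination, which matches the six (6) sweep runs reported below.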

4.1 I. Auto-encoder-based dimensionality reduction

This section will look at selecting the optimal auto-encoder (AE) to reduce the latent representation of the performance heatmaps. The SAX-based performance heatmaps have a dimension of 48 × 48 × 3 pixels (6912 pixels in total). To evaluate clustering results through an X–Y scatter plot, the data must ultimately be represented in a two-dimensional space. We will utilize an AE approach to minimize pixel dimensionality. Furthermore, reducing dimensionality allows us to examine many images due to reduced processing memory requirements, which improves the overall clustering analysis of the Performance Heatmaps.

4.1.1 i Deep auto-encoder (DAE)

We will start the experiment by developing a DAE that reduces the performance heatmap to fewer dimensions. Figure 11 shows a fully connected neural network with an input, hidden and output layer. These layers form a DAE, where the hidden layer is the reduced latent representation of the input layer. The output layer is the input layer reconstruction based on the hidden layer's interpretation. To gauge the performance of the DAE, we track the validation loss (val_loss), where the lowest value determines the best performing DAE architecture.
Step 1: 2-layer DAE sweep run
In Step 1, we begin the experiment by evaluating a two-layer DAE to gauge the performance of val_loss over different channel sizes. The parameters for the first Sweep Run are as follows:
The settings shown in Table 2 run six (6) sweep experiments and measure the val_loss for different layer combinations. Figure 12 shows that for a two-layer deep neural network, a 16-channel layer2 produces the minimum val_loss compared to an eight or four-channel layer2. As shown in Fig. 13, the decoded image for a 16 × 8-channel DAE configuration does not accurately represent the original image. However, as shown in Fig. 14, the decoded image for a 16 × 16-channel DAE configuration is a more accurate copy of the original image. Table 3 confirms that sweep-4, where layer1 and layer2 are sixteen-channel each, produces the best val_loss for a two-layer DAE.
Table 2
Parameter settings for a 2-layer DAE sweep experiment
Parameter      Settings
Layer 1        [16, 32]
Layer 2        [4, 8, 16]
Train images   134,346
Test images    33,587
Loss function  Binary cross entropy
Epochs         100
Table 3
val_loss results from the 2-layer DAE sweep run
Name     layer1  layer2  loss      val_loss
sweep-6  16      4       0.043735  0.057785
sweep-5  16      8       0.021559  0.023302
sweep-4  16      16      0.021424  0.02292
sweep-3  32      4       0.064341  0.093155
sweep-2  32      8       0.032088  0.032394
sweep-1  32      16      0.021083  0.023502
Step 2: 3-layer DAE sweep run
In this step, we will add a third layer to the DAE and attempt further dimensionality reduction, trying dimensions 8, 4 and 2 for the third layer. Table 4 shows the setup for the Sweep experiment used in this step.
Table 4
Parameter settings for a 3-layer DAE sweep experiment
Parameter      Settings
Layer 1        [16]
Layer 2        [16]
Layer 3        [2, 4, 8]
Train images   134,346
Test images    33,587
Loss function  Binary cross entropy
Epochs         100
Figure 15 provides an overview of the three-layer DAE Sweep experiment. A 16 × 16 × 8 DAE produces the minimum val_loss, and the results of this layer configuration are shown in Fig. 16. Results from all three (3) sweep runs are summarized in Table 5.
Table 5
val_loss results from the 3-layer DAE sweep run
Name     layer1  layer2  layer3  loss      val_loss
sweep-3  16      16      2       0.067277  0.074296
sweep-2  16      16      4       0.057862  0.084062
sweep-1  16      16      8       0.022665  0.022379
Step 3: 4-layer DAE sweep run
In this step, we will experiment with dimensions 8, 4 and 2 in the fourth layer of the DAE. Table 6 shows the setup for this sweep experiment.
Table 6
Parameter settings for a 4-layer DAE sweep experiment
Parameter      Settings
Layer 1        [16]
Layer 2        [16]
Layer 3        [8]
Layer 4        [2, 4, 8]
Train images   134,346
Test images    33,587
Loss function  Binary cross entropy
Epochs         100
Figure 17 shows that further dimensionality reduction to 2 or 4 channels increases the val_loss; hence, further reduction from 8 channels is not feasible. However, an eight (8) channel fourth layer does improve the overall val_loss of the DAE from 0.02292 (Table 5) to 0.022079 (Table 7). Results from the four-layer DAE are shown in Fig. 18, which validates that reducing dimensionality below 8 channels is not feasible with a DAE. Hence, our final DAE configuration is 16 × 16 × 8 × 8 for reducing the time-series heatmaps from 6912 pixels (48 × 48 × 3) to 8 dimensions.
Table 7
val_loss results from the 4-layer DAE sweep run
Name     layer1  layer2  layer3  layer4  loss      val_loss
sweep-3  16      16      8       2       0.05247   0.068188
sweep-2  16      16      8       4       0.056899  0.069199
sweep-1  16      16      8       8       0.020568  0.022079

4.1.2 ii. Convolutional auto-encoder

To see if the val_loss and dimensions can be reduced further, we will use a four-layer convolutional auto-encoder (CAE) architecture. Table 8 shows the Sweep experiment parameters investigated to see if the CAE can reduce the image to 8 or fewer dimensions while improving val_loss.
Table 8
Parameter settings for a 4-layer CAE sweep experiment
Parameter      Settings
Layer 1        [16]
Layer 2        [16]
Layer 3        [8]
Layer 4        [2, 4, 8]
Train images   134,346
Test images    33,587
Loss function  Binary cross entropy
Epochs         100
As shown in Fig. 19, the CAE val_loss for fewer than 8 dimensions in the fourth layer (i.e. 4 or 2 dimensions) is relatively high. However, the 16 × 16 × 8 × 8 CAE configuration further reduces the val_loss compared to the DAE. Table 9 shows the comparison between the DAE and CAE val_loss. Based on this result, we will use a 16 × 16 × 8 × 8 CAE to encode the 48 × 48 × 3 major and anomaly event images to 8 dimensions. The final CAE architecture to encode the images is shown in Fig. 20.
Table 9
val_loss comparison between the 4-layer DAE and CAE
Encoder type  layer1  layer2  layer3  layer4  loss      val_loss
DAE           16      16      8       8       0.020568  0.022079
CAE           16      16      8       8       0.0215    0.02042
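The final 16 × 16 × 8 × 8 configuration can be sketched in Keras. This is our reconstruction from the reported channel sizes only; the paper does not specify kernel sizes, strides, pooling, or activations, so those choices below are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Encoder: 48x48x3 heatmap -> 16-16-8-8 channel stages, halving the
# spatial size each stage (48 -> 24 -> 12 -> 6 -> 3), then an
# 8-dimensional latent vector.
inputs = tf.keras.Input(shape=(48, 48, 3))
x = inputs
for ch in (16, 16, 8, 8):
    x = layers.Conv2D(ch, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2, padding="same")(x)
latent = layers.Dense(8, name="latent")(layers.Flatten()(x))

# Decoder: mirror of the encoder, reconstructing the 48x48x3 image.
x = layers.Dense(3 * 3 * 8, activation="relu")(latent)
x = layers.Reshape((3, 3, 8))(x)
for ch in (8, 8, 16, 16):
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(ch, 3, activation="relu", padding="same")(x)
outputs = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

cae = tf.keras.Model(inputs, outputs)
cae.compile(optimizer="adam", loss="binary_crossentropy")
```

The sigmoid output pairs with the binary cross entropy loss used throughout the sweeps; after training, the `latent` layer output provides the 8-dimensional encodings passed to the DRM step.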

4.2 II. High-density dimensionality reduction

We have demonstrated that the time-series-based images can be reduced to a latent size of eight (8) dimensions with a convolutional autoencoder. However, to provide a visual distribution context to time-series image clustering, we need to reduce the number of dimensions to two (2), and this can be achieved by utilizing high-density dimensionality reduction techniques. For this paper, we will experiment with three (3) methods: t-distributed stochastic neighbour embedding (t-SNE) [31], uniform manifold approximation and projection (UMAP) [32], and minimum-distortion embedding (MDE) [33]. These methods take high-density multi-dimensional points and assign them to a two-dimensional map.

4.2.1 i t-Distributed stochastic neighbour embedding (t-SNE)

t-SNE determines the conditional probability of high-dimensional data points by computing the Euclidean distance between the points. The probability represents similarities between two points and determines if these points could be picked as neighbours [31]. The probability \({p}_{j\mid i}\) is represented by Eq. (3) [31], where \({x}_{i}\) and \({x}_{j}\) are the data points being compared for similarity. Figure 21 depicts the t-SNE distribution for various numbers of major heatmap images. This distribution provides abstract localization with no recognizable high-density areas for the images.
$$p_{j\mid i} = \frac{\exp \left( - \parallel x_{i} - x_{j} \parallel^{2} / 2\sigma_{i}^{2} \right)}{\sum_{k \ne i} \exp \left( - \parallel x_{i} - x_{k} \parallel^{2} / 2\sigma_{i}^{2} \right)}$$
(3)
where \({p}_{j\mid i}\) = conditional probability.
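In practice, the 8-dimensional encodings can be projected to two dimensions with scikit-learn's t-SNE implementation. The sketch below uses random stand-in data and assumed hyperparameters:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for CAE-encoded heatmaps: 200 points in 8 dimensions
encodings = rng.normal(size=(200, 8))

# Perplexity controls the effective neighbourhood size used when
# computing the conditional probabilities of Eq. (3)
embedding = TSNE(n_components=2, perplexity=30, init="random",
                 random_state=0).fit_transform(encodings)
print(embedding.shape)  # (200, 2)
```

The resulting 2-D coordinates are what is plotted in the t-SNE distribution figures.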

4.2.2 ii. Uniform manifold approximation and projection (UMAP)

Although t-SNE and UMAP share some clustering similarities [32], UMAP differentiates itself by creating high- and low-dimensional similarities for the distances between two points. Equations (4) and (5) [32] provide an overview of how these dimensionalities are calculated.
$$v_{j\mid i} = \exp \left[ \frac{ - \left( d\left( x_{i}, x_{j} \right) - \rho_{i} \right) }{ \sigma_{i} } \right]$$
(4)
where \({v}_{j\mid i}\)= high dimensional similarities, \({\sigma }_{i}\) = normalizing factor, \({\rho }_{i}\) = distance to the nearest neighbour
$$w_{ij} = \left( {1 + a\parallel y_{i} - y_{j} \parallel_{2}^{2b} } \right)^{ - 1}$$
(5)
where \(w_{ij}\) = low-dimensional similarities.
Figure 22 shows the UMAP distribution for various numbers of heatmap images. As highlighted in Fig. 23, high-density groupings are visible within the overall high-dimensional structure.
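Equations (4) and (5) can be evaluated directly for a single pair of points. In the sketch below, σ, ρ, a and b are assumed values; UMAP fits a and b per dataset, and ρ and σ per point:

```python
import numpy as np

def high_dim_similarity(xi, xj, rho, sigma):
    """Eq. (4): v_{j|i} = exp[-(d(x_i, x_j) - rho_i) / sigma_i]."""
    d = np.linalg.norm(xi - xj)
    return np.exp(-(d - rho) / sigma)

def low_dim_similarity(yi, yj, a=1.577, b=0.895):
    """Eq. (5): w_ij = (1 + a * ||y_i - y_j||^(2b))^-1."""
    return 1.0 / (1.0 + a * np.linalg.norm(yi - yj) ** (2 * b))

xi, xj = np.array([1.0, 0.0]), np.array([0.0, 0.0])
v = high_dim_similarity(xi, xj, rho=1.0, sigma=1.0)  # d == rho, so v == 1.0
w = low_dim_similarity(np.array([0.0, 0.0]), np.array([0.0, 0.0]))  # -> 1.0
```

Because ρ is the distance to the nearest neighbour, every point's nearest neighbour receives full similarity (v = 1), which is what gives UMAP its locally connected structure.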

4.2.3 iii. Minimum-distortion embedding (MDE)

As the name suggests, the MDE dimensionality reduction method (DRM) embeds items as vectors whose pairwise distances are penalized by distortion functions. Equation (6) [33] shows how the average distortion of an embedding is calculated; the method seeks the embedding that minimizes it. Like UMAP, similar items will have vectors near each other, and dissimilar items will have vectors far apart.
$$E(X) = \frac{1}{\left| \mathcal{E} \right|} \sum_{\left( i,j \right) \in \mathcal{E}} f_{ij} \left( d_{ij} \right)$$
(6)
where \(E(X)\) = average distortion of the embedding \(X\), \({d_{ij}} = {\left\| {{x_i} - {x_j}} \right\|_2}\) = distance between embedded items i and j, \({f}_{ij}\) = distortion functions, \(\mathcal{E}\) = set of item pairs.
Figure 24 depicts the distribution of the embeddings for various numbers of major heatmap images. Again, a concentrated mass in the centre represents similar vectors, and dissimilar vectors are spread around the concentrated group.
Figure 25b shows a zoomed-in view of the concentrated mass of the similar vectors, and Fig. 25c shows how this mass further consists of neighbourhoods of high-density areas.
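A minimal sketch of Eq. (6) follows: the average distortion of an embedding over a set of pairs, with a quadratic penalty standing in for the distortion functions \({f}_{ij}\) (MDE supports many other choices; this one is purely illustrative).

```python
import numpy as np

def average_distortion(X, edges, f):
    """E(X) from Eq. (6): mean distortion over the set of pairs.

    X: (n, m) embedding vectors; edges: list of (i, j) index pairs;
    f: distortion function applied to d_ij = ||x_i - x_j||_2.
    """
    distances = [np.linalg.norm(X[i] - X[j]) for i, j in edges]
    return sum(f(d) for d in distances) / len(edges)

# Toy embedding: a quadratic penalty rewards coincident pairs (zero
# distortion) and penalises distant ones.
X = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
E_X = average_distortion(X, [(0, 1), (0, 2)], lambda d: d ** 2)
```

An MDE solver would adjust the vectors in `X` to drive this average down; here we only evaluate it.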

4.3 III. Hierarchical density-based spatial clustering (HDBSCAN)

In this step, we will use HDBSCAN to conduct unsupervised clustering of the major heatmap images. The HDBSCAN algorithm is a density-based clustering method, where a simplified cluster tree is produced from which significant clusters are extracted [34].
Based on the experiment run, we see that the number of clusters in the t-SNE 2D distribution kept increasing as more images were added. In contrast, UMAP and MDE produced very similar cluster counts as the number of images increased. These results are shown in Fig. 26.
The experiment parameters are set as per Table 10. The cluster size determines the minimum samples in a cluster for it to be considered unique, and the sample size determines the critical mass within the cluster neighbourhood [35]. As per Table 10, our cluster size is 5 and the sample size is 200, meaning that a cluster must contain at least 5 points to be considered unique, and once a cluster grows beyond 200 points it becomes a core point.
Table 10
Parameter settings for the HDBSCAN unsupervised clustering experiment
Wells: [10, 20, 30, 40, 50, 60, 70]
Cluster size: [5]
Sample size: [200]
Dimensionality reduction: [t-SNE, UMAP, MDE]
Figure 27 shows the minimum and maximum clusters produced by each DRM method. Based on this experiment run, we will discard t-SNE from any further assessment. Also, the UMAP clustering with HDBSCAN provides the narrowest distribution of clusters across the number of images used in this experiment. Based on this information, we will take an in-depth look at the UMAP and MDE clusters to understand how the images are sorted in the 2D plane and confirm which DRM method provides us with a usable cluster distribution.

4.3.1 i. Clustering analysis

To understand the cluster formation in the MDE and UMAP reduction methods, we need to look at how the number of clusters and outliers varies with different combinations of cluster size and sample size parameters in HDBSCAN. To do this, we will run a clustering experiment with the parameters shown in Table 11.
Table 11
Parameter settings for the HDBSCAN unsupervised clustering experiment
Wells: [70]
Cluster size: [2, 5, 10, 15, 25]
Sample size: [5, 10, 25, 50, 100, 200]
Dimensionality reduction: [UMAP, MDE]
Figure 28 shows how cluster size and sample size impact the cluster and outlier counts in the MDE and UMAP reduction methods. Clustering the UMAP distribution provides consistent cluster counts, with zero outliers in most cases. Moreover, running an independent UMAP clustering experiment as per Table 12 shows that a sample size of less than 30 produces the most consistent results, where the outliers are minimised and the number of clusters stays below 1000. The details of the independent UMAP clustering experiment are shown in Table 13 and Fig. 29. In Fig. 30a, we see that the MDE method produces a large spread of outliers beyond the core cluster area when the sample size is 5 and the cluster size is 2. However, in Fig. 30b we observe that the UMAP method produces 996 clusters with 0 outliers with the same sample and cluster size settings. Hence, the UMAP DRM produces the most consistent number of HDBSCAN-derived clusters when the sample size is set to 5.
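The trade-off reported in Table 13 can be summarised programmatically. The snippet below condenses the rows with a cluster size of 5 (values taken directly from the table) and confirms that sample sizes below 30 keep the outlier count minimal while the cluster count stays near 996.

```python
# Rows of Table 13 for cluster size 5:
# sample size -> (number of clusters, number of outliers)
sweep = {
    5: (996, 0),
    10: (996, 0),
    25: (996, 18),
    50: (996, 110),
    100: (1005, 1118),
    200: (1061, 4414),
}

# Sample sizes whose outlier count stays minimal (threshold of 30 is
# an illustrative cut-off, not a value from the paper).
stable = sorted(s for s, (_, outliers) in sweep.items() if outliers < 30)
```

With this summary, `stable` contains only the sample sizes 5, 10, and 25, matching the observation that sample sizes under 30 give the most consistent clustering.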
Table 12
Parameter settings for the HDBSCAN unsupervised clustering experiment
Wells: [70]
Cluster size: [2, 5, 10, 15, 25]
Sample size: [5, 10, 25, 50, 100, 200]
Dimensionality reduction: [UMAP]
Table 13
HDBSCAN clustering sweep results showing the effect of sample size and cluster size on UMAP clusters

Cluster size | Sample size | Number of clusters | Number of outliers
2 | 5 | 996 | 0
15 | 5 | 995 | 0
25 | 5 | 994 | 0
5 | 5 | 996 | 0
10 | 5 | 996 | 0
5 | 10 | 996 | 0
2 | 10 | 996 | 0
10 | 10 | 996 | 0
25 | 10 | 994 | 0
15 | 10 | 995 | 0
10 | 25 | 994 | 0
25 | 25 | 993 | 0
5 | 25 | 996 | 18
15 | 25 | 993 | 0
2 | 25 | 996 | 18
5 | 50 | 996 | 110
2 | 50 | 999 | 116
25 | 50 | 989 | 128
15 | 50 | 993 | 69
10 | 50 | 995 | 115
15 | 100 | 991 | 1179
25 | 100 | 977 | 1002
5 | 100 | 1005 | 1118
10 | 100 | 998 | 1096
2 | 100 | 1016 | 1136
15 | 200 | 1006 | 4774
10 | 200 | 1031 | 4541
5 | 200 | 1061 | 4414
25 | 200 | 972 | 4976
2 | 200 | 1078 | 4402
The 996 clusters with 0 outliers from the UMAP distribution will be used to identify time-series events. Although 996 clusters were identified in the UMAP distribution, we will use the time-series labelling methodology to generalize the cluster grouping.

4.3.2 ii. Analysing the UMAP and HDBSCAN clusters for Performance Heatmap grouping

To understand the cluster formation in the UMAP reduction methods, we will investigate two cluster areas, as shown in Fig. 31. The performance image grouping for major heatmaps, as shown in Fig. 32a and b, depicts that similar images are grouped in their respective high-density areas. We will use the assigned cluster numbers to label PCP performance events and identify any cluster repetition patterns.
Using the experiment steps explained in the previous sections, we get a UMAP and HDBSCAN cluster layout for anomaly heatmaps, as shown in Fig. 33. For the anomaly heatmaps, we get 98 clusters and 0 outliers. Investigating Cluster Area 1, we see the groupings created in the identified dense area. Like major heatmaps, we will use these cluster numbers to identify abnormal and anomalous PCP performance events.

4.4 IV Cluster labelling

After numbering the major and anomaly heatmap clusters in the previous step, we will now use a cluster labelling tool to add context to the cluster numbers. The cluster labelling tool, developed using PowerBI, provides an intuitive approach to identifying clusters. As part of the cluster labelling process, depicted in Fig. 34, we will use pre-identified event dates marked by production and Artificial Lift engineers to label events of interest. Furthermore, we will also discuss how cluster labelling can help track the progression of major heatmap clusters over the lifetime of a well and depict the degradation of a PCP.

4.4.1 i. Cluster labelling tool

The cluster labelling tool has three (3) areas, shown in Fig. 35. The major and anomaly heatmap cluster areas, shown in Fig. 35a and b, respectively, present the UMAP distribution of clusters for the well being analysed for labelling the time-series data. Figure 35c shows the time-series trend with a days filter to browse periods where abnormal behaviour or activity of interest may have occurred during PCP operations. Such periods can then be used to place clusters in categories that identify abnormal or anomalous PCP behaviour.
In Fig. 36, we look at a flow disturbance event on Day 84 of PCP operation. Two major heatmap clusters (44, 240) and two anomaly heatmap clusters (87, 90) were observed on Day 83. Upon selecting the area of flow disturbance on the time-series trend, we observed that both anomaly heatmap clusters, 87 and 90, are prevalent and relate to the flow column, as shown in Fig. 8. Furthermore, when the petroleum engineer selects the abnormal behaviour period (red dotted area in Fig. 37), only major heatmap cluster 240 is visible, which indicates that the pump is in a high flow and high torque state. Hence, by looking at this grouping, we can state that on Day 84 the PCP saw flow anomaly events while in a high flow, high torque state. Using this methodology, we can group the major and anomaly heatmap clusters into various states of PCP operations.
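The labelling step above amounts to a lookup from cluster numbers to engineer-assigned context. The sketch below follows the Day 84 example (clusters 240, 87, 90); the lookup tables and function are hypothetical stand-ins for the labels curated in the PowerBI tool.

```python
# Engineer-supplied labels from the human-in-the-loop session
# (cluster numbers follow the Day 84 example; labels are illustrative).
major_labels = {240: "high flow, high torque"}
anomaly_labels = {87: "flow anomaly", 90: "flow anomaly"}

def describe_day(major_clusters, anomaly_clusters):
    """Translate the clusters observed on a given day into labelled states."""
    states = {major_labels.get(c, "unlabelled") for c in major_clusters}
    events = {anomaly_labels.get(c, "unlabelled") for c in anomaly_clusters}
    return states, events

# Day 84: only major cluster 240 is visible, alongside anomaly clusters 87, 90.
states, events = describe_day([240], [87, 90])
```

The same lookup, applied over the whole history, is what turns raw cluster numbers into the grouped PCP operating states used in the Results section.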

5 Results

This study aimed to demonstrate a streamlined and reproducible method of labelling time-series data gathered from CSG wells, so it may aid production engineers with identifying PCP performance profiles and abnormal production events. Our end-to-end approach is shown in Fig. 38, where saved cluster weights and labels help the streaming analytics process and allow operators to manage PCP wells by exception.
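Management by exception can be sketched as a filter over the labelled stream: only heatmap-group labels on a critical watch list raise an alert. The label names below are drawn from Table 14, while the watch list itself and the stream format are assumptions for illustration.

```python
# Labels that warrant immediate attention (an illustrative watch list;
# label names follow the heatmap groups in Table 14).
CRITICAL = {"high high torque", "erratic torque"}

def exceptions(stream):
    """Yield (timestamp, label) pairs only for events needing attention."""
    for ts, label in stream:
        if label in CRITICAL:
            yield ts, label

# A labelled stream where only Day 2 shows critical behaviour.
alerts = list(exceptions([(1, "ideal"),
                          (2, "high high torque"),
                          (3, "ideal")]))
```

Everything not on the watch list passes silently, so engineers see only the events that require action rather than the full labelled stream.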
We will highlight in this section how our methodology produces meaningful labelled cluster groups that can be visualized as a coloured sequential bar chart against time-series data. Moreover, the anomalous events detected by our method were consistent across the two operators, specifically the solids and gas through pump events, which are detrimental to PCP operational life. The streaming analytics approach captured not only the amplitude of an abnormal event but also its longevity.

5.1 I. Grouping cluster labels

Based on the observations made with the cluster labelling tool on 20 wells, the 996 major heatmap clusters and 98 anomaly heatmap clusters were segregated into groups, as shown in Table 14. The group labels were defined based on the experience of production and well surveillance engineers.
Table 14
Heatmap groups based on the observations made in the cluster labelling tool

Major heatmap groups | Anomaly heatmap groups
High torque | High flow
High high torque | Low flow
Erratic torque | High torque
Low flow, low torque | Low torque
Low low flow | Flow and torque
Low low flow, low low torque |
High high flow, high high torque |
Erratic flow |
Ideal |
Shutdown |
In Table 15, we see the groups with a sample set of the images they represent. For example, the major heatmap group labelled erratic torque shows that images within this group did not have a stable torque profile, as no green box was recorded in the centre column. This indicates that the torque fluctuated significantly in this period; hence, no SAX character persisted long enough to record a symbol count as a major event as per Table 1. Similarly, the image grouping for anomaly heatmaps in Table 16 describes states that the PCP enters momentarily, which may be considered abrupt changes.
Table 15
Sample images of major heatmap groups
https://static-content.springer.com/image/art%3A10.1007%2Fs00521-022-07995-8/MediaObjects/521_2022_7995_Tab15_HTML.png
Table 16
Sample images of anomaly heatmap groups
https://static-content.springer.com/image/art%3A10.1007%2Fs00521-022-07995-8/MediaObjects/521_2022_7995_Tab16_HTML.png
It is important to note that the characteristics of the major or anomaly heatmap groups can provide a performance profile for PCPs independently or in combination.

5.2 II. Cluster sequencing and visual analytics

To understand how heatmap groups define the performance of a PCP, we will use the colour code in Table 17 to represent each image group. These colour codes are then plotted along with the time-series data to understand the cluster sequencing and identify patterns in PCP performance.
Table 17
Color codes for major and anomaly heatmap groups
https://static-content.springer.com/image/art%3A10.1007%2Fs00521-022-07995-8/MediaObjects/521_2022_7995_Tab17_HTML.png
In Fig. 39, we see the heatmap groups plotted with the time-series trend over the lifespan of a PCP well. Figure 39a represents the major heatmap group, and Fig. 39c represents the anomaly heatmap group. The progression of the major heatmap groups matches the state of the PCP performance through the dewatering, stable-flow, and high-torque pumping regimes.
If we look closer at a one-week PCP performance window, as shown in Fig. 40, the details in the major and anomaly heatmap groups become more apparent. In this case, we see the major heatmap groups clearly marking the areas of solids through the pump where the PCP torque increases. At the same time, anomaly heatmap groups also present markers of change (primarily high torque, high flow, low flow) either during or before the solids through pump events occur.

5.3 III. Cluster group consistency for anomalous events

Another finding during this study was the repeatability of major and anomaly heatmap group sequencing for events of interest. For example, in Fig. 41, we see the solids through pump events on multiple wells, where the major heatmap groups diverge from ideal to either high torque or high high torque. The major heatmap sequencing is very similar for all such events, regardless of the event's intensity or duration.
Similarly, in Fig. 42, we see gas through pump events. In this case, the major heatmap groups fluctuate between Ideal and Erratic Torque, whereas the anomaly heatmap groups consistently present with low flow and flow and torque events.

5.4 IV. Streaming analytics application for PCP performance analysis

Putting the previous steps together, we provided two natural gas operators with a streaming analytics tool, which assists them with identifying early PCP performance issues and alerts when critical anomalous events are detected. An overview of the application is shown in Fig. 43.

6 Conclusion and future works

Based on the above methodology, we demonstrated that the human-in-the-loop cluster labelling method and the streaming analytics tools developed as part of this research provide a reliable and scalable approach to determining and evaluating the performance of PCP-operated wells.
We have shown that various performance patterns can be detected with this approach, and the repeatability of the heatmap patterns provides a better understanding of changing PCP behaviour. Furthermore, notifications of changes in performance profile and anomaly markers can be automated, so that only events requiring immediate attention are reported in real time. By doing so, production and surveillance engineers can manage their wells by exception, aided by informed insights from the method proposed in this study.
Most importantly, by allowing petroleum engineers to aid with the labelling of time-series data, we could gain their trust in a machine learning-driven approach and, in turn, capture their knowledge of assessing Artificial Lift systems.
During this study, it became evident that the level of granularity to detect performance changes could be improved with smaller expansion stride lengths. In a forthcoming paper, we will present the effect of expansion stride length on cluster groups. Moreover, there is work in progress to apply this method to electric submersible pumps, which are centrifugal pumps used as an Artificial Lift method in conventional oil reservoirs.

Declarations

Conflict of interest

The authors declare no conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literatur
2. Hoday JP et al (2013) Diagnosing PCP failure characteristics using exception based surveillance in CSG. In: SPE progressing cavity pumps conference. Society of Petroleum Engineers, Calgary, Alberta, Canada, p 13
3. Awaid A et al (2014) ESP well surveillance using pattern recognition analysis, oil wells, petroleum development Oman. In: International petroleum technology conference, Doha, Qatar, p 22
4. Thornhill DG, Zhu D (2009) Fuzzy analysis of ESP system performance. In: SPE annual technical conference and exhibition. Society of Petroleum Engineers, New Orleans, Louisiana, p 7
5. Al Sawafi M et al (2021) Intelligent operating envelope integrated with automated well models improves asset wide PCP surveillance and optimization. In: Abu Dhabi international petroleum exhibition & conference. OnePetro
6. Abdelaziz M, Lastra R, Xiao JJ (2017) ESP data analytics: predicting failures for improved production performance. In: Abu Dhabi international petroleum exhibition & conference. Society of Petroleum Engineers, Abu Dhabi, UAE, p 17
7. Ocanto L, Rojas A (2001) Artificial-lift systems pattern recognition using neural networks. In: SPE Latin American and Caribbean petroleum engineering conference. Society of Petroleum Engineers, Buenos Aires, Argentina, p 6
8. Liu S et al (2011) Automatic early fault detection for rod pump systems. In: SPE annual technical conference and exhibition. Society of Petroleum Engineers, Denver, Colorado, USA, p 11
9. Andrade Marin A et al (2021) Real time implementation of ESP predictive analytics—towards value realization from data science. In: Abu Dhabi international petroleum exhibition & conference
10. Liu Y et al (2011) Semi-supervised failure prediction for oil production wells. In: 2011 IEEE 11th international conference on data mining workshops
11. Javed A, Lee BS, Rizzo DM (2020) A benchmark study on time series clustering. Mach Learn Appl 1:100001
12. Saghir F, Gonzalez Perdomo ME, Behrenbruch P (2020) Application of machine learning methods to assess progressive cavity pumps (PCPs) performance in coal seam gas (CSG) wells. APPEA J 60(1):197–214
13. Saghir F, Gonzalez Perdomo ME, Behrenbruch P (2019) Application of exploratory data analytics (EDA) in coal seam gas wells with progressive cavity pumps (PCPs). In: SPE/IATMI Asia Pacific oil & gas conference and exhibition. Society of Petroleum Engineers, Bali, Indonesia, p 10
14. Saghir F, Gonzalez Perdomo ME, Behrenbruch P (2019) Converting time series data into images: an innovative approach to detect abnormal behavior of progressive cavity pumps deployed in coal seam gas wells. In: SPE annual technical conference and exhibition. Society of Petroleum Engineers, Calgary, Alberta, Canada, p 14
15. Alqahtani A et al (2021) Deep time-series clustering: a review. Electronics 10(23):3001
16. Huddlestone-Holmes CA, Elaheh KJ (2018) Decommissioning coal seam gas wells—final report of GISERA project S.9: decommissioning CSG wells. CSIRO
17. Commonwealth of Australia (2014) Coal seam gas extraction: modelling groundwater impacts. Department of the Environment
18. Matthews CM et al (2007) Production operations engineering. In: Petroleum engineering handbook. Society of Petroleum Engineers
19. Choi K et al (2021) Deep learning for anomaly detection in time-series data: review, analysis, and guidelines. IEEE Access 9:120043–120065
20. Wen T, Keyes R (2019) Time series anomaly detection using convolutional neural networks and transfer learning. arXiv:1905.13628
21. Zhang C et al (2018) A deep neural network for unsupervised anomaly detection and diagnosis in multivariate time series data
22. Tadayon M, Iwashita Y (2020) A clustering approach to time series forecasting using neural networks: a comparative study on distance-based vs. feature-based clustering methods. arXiv:2001.09547
23. Ienco D, Interdonato R (2020) Deep multivariate time series embedding clustering via attentive-gated autoencoder. Springer International Publishing, Cham
24. Xu C, Huang H, Yoo S (2021) A deep neural network for multivariate time series clustering with result interpretation. In: 2021 international joint conference on neural networks (IJCNN)
25. Freeman C, Beaver I (2019) Human-in-the-loop selection of optimal time series anomaly detection methods. In: 7th AAAI conference on human computation and crowdsourcing (HCOMP)
27. Lin J et al (2007) Experiencing SAX: a novel symbolic representation of time series. Data Min Knowl Disc 15(2):107–144
29. Yang C et al (2019) Multivariate time series data transformation for convolutional neural network. In: 2019 IEEE/SICE international symposium on system integration (SII)
31. van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605
32. McInnes L, Healy J, Melville J (2018) UMAP: uniform manifold approximation and projection for dimension reduction. arXiv:1802.03426
33. Agrawal A, Ali A, Boyd S (2021) Minimum-distortion embedding. Found Trends Mach Learn 14(3):211–378
34. Campello RJGB, Moulavi D, Sander J (2013) Density-based clustering based on hierarchical density estimates. Springer, Heidelberg, pp 160–172
35. Malzer C, Baum M (2020) A hybrid approach to hierarchical density-based cluster selection. In: 2020 IEEE international conference on multisensor fusion and integration for intelligent systems (MFI). IEEE, pp 223–228
Metadata
Title
Application of streaming analytics for Artificial Lift systems: a human-in-the-loop approach for analysing clustered time-series data from progressive cavity pumps
Authors
Fahd Saghir
M. E. Gonzalez Perdomo
Peter Behrenbruch
Publication date
11.11.2022
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 2/2023
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-022-07995-8
