Generative Adversarial Networks in Time Series: A Systematic Literature Review

Authors:
Eoin Brophy, Dublin City University, Dublin, Ireland (ORCID: 0000-0002-6486-5746)
Zhengwei Wang, Trinity College Dublin, Dublin, Ireland (ORCID: 0000-0001-7706-553X)
Qi She, ByteDance AI Lab, China (ORCID: 0000-0002-4490-2941)
Tomás Ward, Dublin City University, Dublin, Ireland (ORCID: 0000-0002-6173-6607)

ACM Computing Surveys, Volume 55, Issue 10, Article No. 199, pp. 1–31. https://doi.org/10.1145/3559540

Published: 02 February 2023

Abstract

Generative adversarial network (GAN) studies have grown exponentially in the past few years. Their impact has been seen mainly in the computer vision field with realistic image and video manipulation, especially generation, making significant advancements. Although these computer vision advances have garnered much attention, GAN applications have diversified across disciplines such as time series and sequence generation. As a relatively new niche for GANs, fieldwork is ongoing to develop high-quality, diverse, and private time series data. In this article, we review GAN variants designed for time series related applications. We propose a classification of discrete-variant GANs and continuous-variant GANs, in which GANs deal with discrete time series and continuous time series data. Here we showcase the latest and most popular literature in this field—their architectures, results, and applications. We also provide a list of the most popular evaluation metrics and their suitability across applications. Also presented is a discussion of privacy measures for these GANs and further protections and directions for dealing with sensitive data. We aim to frame clearly and concisely the latest and state-of-the-art research in this area and their applications to real-world technologies.

1 INTRODUCTION

This review article is designed for those interested in generative adversarial networks (GANs) applied to time series data generation. We provide a review of current state-of-the-art and novel time series GANs and their solutions to real-world problems with time series data. GANs have been gaining a lot of traction within the deep learning research community since their inception in 2014 [38]. Their ability to generate and manipulate high-quality data across multiple domains has contributed to their success. The main focus of GANs to date has been in the computer vision domain; however, they have also been successfully applied to others, such as natural language processing (NLP) and now time series.

A GAN is a generative model consisting of a generator and discriminator, typically two neural network (NN) models. In recent years GANs have demonstrated their ability to produce high-quality image and video generation, style transfer, and image completion. They have also been successfully used for audio generation, sequence forecasting, and imputation, with a movement toward using GANs for time series and sequential data generation and forecasting.

We define a time series as a sequence of vectors dependent on time \((t)\), which can be represented as \(x_t = \{x_1, \ldots, x_n\}\) for both continuous/real time and discrete time. The time series’ values can be either continuous or discrete and, depending on the number of values recorded, the series is univariate or multivariate. In most cases, the time series will take either an integer value or a real value. As Dorffner [25] states, a time series can be viewed, from a practical perspective, as a value sampled at discrete steps in time. This timestep can range from as long as years to as short as milliseconds, for example. We define a continuous time series as a signal sampled from a continuous process; that is, the function’s domain is an uncountable set. In contrast, a discrete time series has a countable domain.

The applicability of GANs to time series data can solve many issues that current dataset holders face and that cannot or have not been addressed by other machine learning or regressive techniques. Data shortage is an issue that many practitioners face, and GANs can augment smaller datasets by generating new, previously unseen data. Data can be missing or corrupted in some cases; GANs can impute data, for example by replacing artifacts with information representative of clean data. GANs are also capable of denoising signals in the case of corrupted data. Data protection, privacy, and sharing have become heavily regulated with the introduction of data protection measures; GANs can ensure an extra layer of data protection by generating differentially private datasets containing no risk of linkage from source to generated datasets.

Time series data generation is not a novel concept; it has long roots in regression. It initially began as the forecasting of individual timesteps rather than whole sequence generation. One of the most widely used time series forecasting methods is the autoregressive (AR) model. Aside from forecasting data points, AR models focus on preserving the temporal dynamics of a sequence. However, they are inherently deterministic in that, apart from the additive error term, no randomness is involved in the calculation of future states of the system. This means that AR models are not genuinely generative or probabilistic. For an AR model, the goal is to produce the next timestep (\(x_{t+1}\)) in a sequence as a function of the previous n timesteps, where n is the order of the model. The formula for a classic AR model, shown here for order \(n = 2\), is given in Equation (1):

\[ x_{t+1} = c + \theta_{1} x_{t} + \theta_{2} x_{t-1} + \epsilon \tag{1} \]

Here, \(x_{t}\) is the value of the sequence at time t, \(\theta_{1}\) and \(\theta_{2}\) are the model parameters, c is a constant, and \(\epsilon\) is the error term, usually chosen as normally distributed noise.
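As a rough illustration of Equation (1), the following sketch simulates an AR model of arbitrary order. The function name `simulate_ar`, the zero initial history, and the example coefficients are assumptions made for illustration only; they do not come from the reviewed works.

```python
import random

def simulate_ar(theta, c, n_steps, noise_std=1.0, seed=0):
    """Simulate x_{t+1} = c + sum_k theta[k] * x_{t-k} + eps, eps ~ N(0, noise_std^2)."""
    rng = random.Random(seed)
    order = len(theta)
    x = [0.0] * order  # assumed initial history of zeros
    for _ in range(n_steps):
        # theta[0] multiplies the most recent value, theta[1] the one before it, ...
        pred = c + sum(theta[k] * x[-1 - k] for k in range(order))
        x.append(pred + rng.gauss(0.0, noise_std))
    return x[order:]

# AR(2) example matching the form of Equation (1); coefficients are illustrative.
series = simulate_ar(theta=[0.6, -0.2], c=0.5, n_steps=200)

# With the noise switched off, the recursion is fully deterministic and
# settles at the fixed point c / (1 - theta_1 - theta_2).
noiseless = simulate_ar(theta=[0.6, -0.2], c=0.5, n_steps=200, noise_std=0.0)
```

Setting `noise_std=0.0` makes the deterministic core of the model visible: every future value is then a fixed function of the past.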

Autoregression was a step shy of time series synthesis. That ultimately came in the form of directed generative networks. By directed, we mean a model whose edges are directed and thus indicate which variable’s probability distribution is defined in terms of another. In other words, this is a structured probabilistic model with conditional probability distributions. These data-driven generative models offered researchers the option of generating full-length data sequences rather than forecasting single values, as regressive models do. They also required little domain knowledge of the time series signal morphology, which was often a necessity for other statistical modeling techniques. This propelled generative modeling forward in the machine learning community for data synthesis techniques.

Several generative methods have been used in the past to generate synthetic data. One such method is the autoencoder (AE), which is designed to efficiently learn an informative representation of an input in a low-dimensional space and to reconstruct the encoded data such that the reconstructed input is as similar as possible to the original one. The AE model is made of an encoder and a decoder NN, as shown in Figure 1. However, other generative models have emerged as front-runners due to the quality of the generated data and inherent privacy protection measures.

Generative models come in many shapes, from variational autoencoders (VAEs) and recurrent neural network (RNN) variants to GANs, all of which have their pros and cons. For example, VAEs use learned approximate inference to produce synthetic samples efficiently. An inference problem is simply using the value of some variables or probability distributions to predict other values or probability distributions. Approximate inference is when we seek to approximate a true distribution, say \(p(y|x)\), with an approximate distribution \(q(y|x)\). However, this approximation means that the data generated by VAEs can be of lower quality than samples generated by GANs. GANs, for all of their benefits, are not without their own downsides. They are a very useful technology that allows us to produce insightful and powerful datasets, but only if we can address the following challenges.

One of the significant challenges of GANs lies in their inherent instability, which makes them difficult to train. GAN models suffer from issues such as non-convergence, diminishing/vanishing gradients, and mode collapse. A non-converging model does not stabilize and continuously oscillates, causing it to diverge. Diminishing gradients prevent the generator from learning anything, as the discriminator becomes too successful. Mode collapse is when the generator collapses, producing only uniform samples with little to no variety.

The second challenge of GANs lies in their evaluation process. With image-based GANs, researchers have reached a loose consensus [8] surrounding the evaluation of the generated distribution estimated from the training data distribution. Unfortunately for time series GANs, due to the comparatively low number of papers published, no agreement has been reached on evaluation metrics for the generated data. Different approaches have been put forward, but none has yet been established as a front-runner.

In this review, we present the first complete review and categorization of time series GANs, namely discrete and continuous variants, their applications, architecture, loss functions and how they have improved on their predecessors in terms of variety and quality of their generated data. We also contribute by including experiments for the majority of time series GAN architectures applied to time series synthesis.

2 RELATED WORK

There has been a handful of high-quality GAN review papers published in the past few years. For example, Wang et al. [100] take a taxonomic approach to GANs in computer vision. The authors split GANs into architecture variants and loss variants. Although they include applications of GANs and mention their applicability to sequential data generation, the work is heavily focused on media manipulation and generation. Gui et al. [40] break down GANs into their constituent parts. They begin by discussing the algorithms and architecture of various GANs and their evaluation metrics, then list their surrounding theory and problems such as mode collapse, among others. Finally, they discuss the applications of GANs and provide a very brief account of GANs used for sequential data. Gonog and Zhou [36] provide a short introduction to GANs, their theory, and explore the variety of plausible models, again listing their applications in image and video manipulation with a mention of sequential data (NLP). In another review, Alqahtani et al. [3] give an overview of GAN fundamentals, variants, and applications. Sequential data applications are mentioned in the form of music and speech synthesis.

As with most review papers, Yinka-Banjo and Ugot [106] give an introduction and overview of GANs. However, they also review GANs as adversarial detectors and discuss their limitations applied to cybersecurity. Yi et al. [105] give a review of GANs and their applications in medical imaging, and explain how they can be used in clinical research and potentially deployed to help practicing clinicians. There is no mention of time series data use cases.

A recurring theme in these works is a focus on GAN variants that have mostly been applied to the computer vision domain. To the best of our knowledge, no review paper has been conducted with the main focus on time series GANs. Although these reviews have mentioned the application of these GANs in generating sequential data, they have only scratched the surface of what is becoming a growing body of research.

We contribute to lessening this gap by presenting our work, which seeks to provide the latest up-to-date research around time series GANs, their architecture, loss functions, evaluation metrics, trade-offs, and approaches to privacy preservation of their datasets.

3 GENERATIVE ADVERSARIAL NETWORKS

3.1 Background

The introduction of GANs facilitated a significant breakthrough in the generation of synthetic data. These deep learning models typically consist of two NNs: a generator and a discriminator. The generator G takes in random noise \(\mathbf{z} \in \mathbb{R}^{r}\) and attempts to generate synthetic data that is similar to the training data distribution. The discriminator D attempts to determine whether the data it receives is real or fake. The generator aims to maximize the failure rate of the discriminator, whereas the discriminator aims to minimize it. Figure 2 shows a simple example of the GAN architecture and the game that the NN models play. The two networks are locked in a two-player minimax game defined by the value function V(G,D) in Equation (2), where \(D(\mathbf{x})\) is the probability that \(\mathbf{x}\) comes from the real data rather than the generated data [38]:

\[ \min_{G} \max_{D} V(G,D) = \mathbb{E}_{\mathbf{x} \sim p_{data}(\mathbf{x})}[\log D(\mathbf{x})] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D(G(\mathbf{z})))] \tag{2} \]
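To make the value function in Equation (2) concrete, the sketch below estimates V(G,D) by Monte Carlo sampling in a toy one-dimensional setting. Everything specific here is an assumption for illustration: the real data is taken to be N(0, 1), the hypothetical generator is G(z) = z + shift with z ~ N(0, 1), and D is set to the closed-form optimal discriminator \(D^{*}(x) = p_{data}(x) / (p_{data}(x) + p_{g}(x))\) from the original GAN work [38] rather than being trained.

```python
import math
import random

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def value_function(shift, n=20000, seed=0):
    """Monte Carlo estimate of V(G, D*) for the toy generator G(z) = z + shift."""
    rng = random.Random(seed)

    def d_opt(x):
        # Optimal discriminator for p_data = N(0, 1) and p_g = N(shift, 1).
        p_data, p_g = normal_pdf(x, 0.0), normal_pdf(x, shift)
        return p_data / (p_data + p_g)

    real = [rng.gauss(0.0, 1.0) for _ in range(n)]
    fake = [rng.gauss(0.0, 1.0) + shift for _ in range(n)]
    term_real = sum(math.log(d_opt(x)) for x in real) / n
    term_fake = sum(math.log(1.0 - d_opt(x)) for x in fake) / n
    return term_real + term_fake

v_matched = value_function(shift=0.0)  # generator matches the data distribution
v_shifted = value_function(shift=3.0)  # generator is far from the data
```

When the generator matches the data, the optimal discriminator outputs 0.5 everywhere and the estimate equals \(-\log 4\), the minimum of the game; a mismatched generator gives a larger value, which is what G tries to reduce.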

GANs belong to the family of generative models and are an alternative method of generating synthetic data that do not require domain expertise. They were conceived in the work by Goodfellow et al. [38] in 2014, where a multi-layer perceptron was used for both the discriminator and the generator. In 2015, Radford et al. [85] subsequently developed the deep convolutional generative adversarial network (DCGAN) to generate synthetic images. Since then, researchers have continuously improved on the early GAN architectures, loss functions, and evaluation metrics while innovating on their potential contributions to real-world applications. To appreciate why there has been such concerted activity in the further development of GAN technologies, it is important to understand the limitations of early architectures and the challenges these presented. We describe these next, and in so doing, we prepare the reader for the particular manifestation of these challenges in the more specific context of time series.

3.2 Challenges

There are three main challenges in the area of time series GANs: training stability, evaluation, and privacy risk associated with synthetic data created by GANs. We explain these three challenges next.

Training stability. The original GAN work proved the global optimality and the convergence of GANs during training [38]. However, it also highlights the instability problem that can arise when training a GAN. Two problems are well studied in the literature: vanishing gradients and mode collapse. The vanishing gradient is caused by directly optimizing the loss presented in Equation (2). When D reaches optimality, optimizing Equation (2) for G can be converted to minimizing the Jensen-Shannon (JS) divergence (details of the derivation can be found in Section 5 of the work of Wang et al. [100]) between the real data distribution (\(p_{data}\)) and the generator’s distribution (\(p_g\)):

\[ \mathcal{L}_{G} = 2 \cdot JS(p_{data} \Vert p_{g}) - 2 \cdot \log 2. \tag{3} \]

When there is no overlap between \(p_{data}\) and \(p_g\), the JS divergence saturates at its maximum value (\(\log 2 \approx 0.693\)), so \(\mathcal{L}_{G}\) stays constant and the gradient for G using this loss is near 0. A non-zero gradient for G only exists when \(p_{data}\) and \(p_g\) have substantial overlap. In practice, the possibility that \(p_{data}\) and \(p_g\) do not intersect or have negligible overlap is quite high [4]. To avoid the vanishing gradient problem for G, the original GAN work [38] proposes minimizing

\[ \mathcal{L}_{G} = -\mathbb{E}_{\mathbf{x} \sim p_{g}} \log[D(\mathbf{x})] \tag{4} \]

when updating G. This strategy avoids the vanishing gradient problem but leads to the mode collapse issue. Optimizing Equation (4) can be converted to optimizing the reverse Kullback-Leibler (KL) divergence, that is, \(KL(p_g \Vert p_{data})\) (details can be found in the work of Wang et al. [100]). When \(p_{data}\) contains multiple modes, \(p_g\) tends to recover a single mode and ignore the others when optimizing the reverse KL divergence. In this case, G trained using Equation (4) might only be able to generate a few modes of the real data. These problems can be mitigated by changing the architecture or the loss function; both approaches are reviewed in detail by Wang et al. [100].
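The saturation behind the vanishing gradient can be checked numerically on small discrete distributions. The sketch below is illustrative only: the helper names and the four-bin example distributions are assumptions, and the loss follows Equation (3).

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence (natural log), bounded above by log 2."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def generator_loss(p_data, p_g):
    """Equation (3): L_G = 2 * JS(p_data || p_g) - 2 * log 2."""
    return 2.0 * js(p_data, p_g) - 2.0 * math.log(2.0)

p_data = [0.5, 0.5, 0.0, 0.0]
p_g_a = [0.0, 0.0, 0.5, 0.5]   # no overlap with p_data
p_g_b = [0.0, 0.0, 1.0, 0.0]   # a different generator, still no overlap

loss_a = generator_loss(p_data, p_g_a)
loss_b = generator_loss(p_data, p_g_b)
loss_matched = generator_loss(p_data, p_data)
```

Both non-overlapping generators receive the same loss, so the loss gives G no signal about which direction to move in; only the matched case reaches the minimum of \(-2 \log 2\).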

Evaluation. A wide range of evaluation metrics has been proposed to evaluate the performance of GANs [9, 10, 98, 99]. Current evaluations of GANs in computer vision are normally designed to consider two perspectives: the qualitative and the quantitative assessment of generated data. The most representative qualitative metric is human annotation to determine the visual quality of the generated images. Quantitative metrics compare statistical properties between generated and real images, for example, two-sample tests such as maximum mean discrepancy (MMD) [93], inception score (IS) [88], and Fréchet inception distance (FID) [51]. In contrast to images, time series data is difficult for humans to evaluate qualitatively by perception alone. To evaluate time series based GANs qualitatively, researchers normally conduct t-SNE [95] and PCA [13] analyses to visualize how well the generated distributions resemble the original distributions [107]. Quantitative evaluation for time series based GANs can be done by deploying two-sample tests similar to those used for image-based GANs.
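As a minimal sketch of a two-sample statistic for time series, the code below computes a biased estimate of the squared MMD with a Gaussian (RBF) kernel between two sets of equal-length sequences. The kernel choice, the bandwidth, and the toy sine-wave and noise data are assumptions made for illustration; the reviewed works may use different kernels and estimators.

```python
import math
import random

def rbf(a, b, bandwidth):
    """Gaussian kernel between two equal-length sequences."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(xs, ys, bandwidth=4.0):
    """Biased estimate of squared MMD between two sets of sequences."""
    def mean_k(us, vs):
        return sum(rbf(u, v, bandwidth) for u in us for v in vs) / (len(us) * len(vs))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

rng = random.Random(0)

def sine(phase, n=16):
    return [math.sin(2.0 * math.pi * t / n + phase) for t in range(n)]

# "Real" data: sine waves with random phase (toy stand-in for a dataset).
real = [sine(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(30)]
# A good synthetic set drawn from the same process, and a poor one (white noise).
good = [sine(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(30)]
bad = [[rng.gauss(0.0, 2.0) for _ in range(16)] for _ in range(30)]

score_good = mmd2(real, good)
score_bad = mmd2(real, bad)
```

A lower score indicates that the two sets are harder to tell apart under the chosen kernel; the bandwidth strongly affects sensitivity and would need tuning on real data.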

Privacy risk. Apart from evaluating the performance of GANs, a wide range of methods have been used to assess the privacy risk associated with synthetic data created by GANs. Choi et al. [17] performed tests for presence disclosure and attribute disclosure. In contrast, others utilized a three-sample test on the training, test, and synthetic data to identify if the synthetic data has overfitted to the training data [17, 31]. It has been shown that common methods of de-identifying data do not prevent attackers from re-identifying individuals using additional data [29, 72]. Sensitive data is usually de-identified by removing personally identifiable information. However, work is ongoing to create frameworks to link different sources of publicly available information together using alternative information to personally identifiable information. Malin and Sweeney [72] developed a software program, REID, to connect individuals contained in publicly available hospital discharge data with their unique DNA records. Culnane et al. [19] re-identified individuals in a de-identified open dataset of Australian medical billing records using unencrypted parts of the records and known information about individuals from other sources. Hejblum et al. [50] developed a probabilistic method to link de-identified electronic health record (EHR) data of patients with rheumatoid arthritis. The re-identification of individuals in publicly available datasets can lead to the exposure of their sensitive health information. Health data has been categorized as special personal data by General Data Protection Regulation (GDPR) and is subject to a higher level of protection under the Data Protection Act of 2018 (Section 36(2)) [32]. Consequently, concerned researchers must find alternative methods of protecting sensitive health data to minimize the risk of re-identification. This will be addressed in Section 7.

3.3 Popular Datasets

Unlike image-based datasets (CIFAR, MNIST, ImageNet [22, 61, 64]), there are no standardized or commonly used benchmarking datasets for time series generation. However, we have compiled a list of some of the more popular datasets implemented in the reviewed works, and they are listed in Table 1 along with their year of release/update, data type, and how many instances and attributes they contain. What makes these datasets interesting/applicable to time series GANs is that they are signals made up of highly complex waveforms (physiological and audio) and contain important temporal dynamics crucial to preserve when generating new samples. Furthermore, these signals are the exact data type that are highly regulated and can stand to benefit from being leveraged by GANs to generate further volumes of this kind of data.

Name (Year)	Data Type	Instances	Attributes
Oxford-Man Institute “realized library” (updated daily)	Real multivariate time series	>2,689,487	5
EEG Motor Movement/Imagery Dataset (2004)	Real multivariate time series	1,500	64
ECG 200 (2001)	Real univariate time series	200	1
Epileptic Seizure Recognition Dataset (2001)	Real multivariate time series	11,500	179
TwoLeadECG (2015)	Real multivariate time series	1,162	2
MIMIC-III (2016)	Real, integer, and categorical multivariate time series	–	–
EPILEPSIAE project database (2012)	Real multivariate time series	30	–
PhysioNet/CinC (2015)	Real multivariate time series	750	4
Wrist PPG During Exercise (2017)	Real multivariate time series	19	14
MIT-BIH Arrhythmia Database (2001)	Real multivariate time series	201	2
PhysioNet/CinC (2012)	Real, integer, and categorical multivariate time series	12,000	43
KDD Cup Dataset (2018)	Real, integer, and categorical multivariate time series	282	3
PeMS Database (updated daily)	Integer and categorical multivariate time series	–	8
Nottingham Music Database (2003)	Special text format time series	1,000	–

Table 1. Popular Datasets Used in the Reviewed Works

Two repositories, the UCR Time Series Classification/Clustering database [20] and the UCI Machine Learning Repository [26], make several time series datasets available. Despite this, there remains no consensus on a standardized dataset for benchmarking time series GANs, which may be due to the “continuous” nature of the architecture dimensions. GANs designed for continuous time series generation often differ in the length of their input sequence due to either author preference or the constraints placed on their architecture by the downstream tasks of the generated data.

4 CLASSIFICATION OF TIME SERIES BASED GANS

We propose a categorization of the following time series based GANs based on two distinct variant types: discrete variants (discrete time series) and continuous variants (continuous time series). A discrete time series consists of data points separated by time intervals. This type of data might have a data-reporting interval that is infrequent (e.g., 1 point per minute) or irregular (e.g., whenever a user logs in), and gaps where values are missing due to reporting interruptions (e.g., intermittent server or network downtime in a network traffic application). Discrete time series generation involves generating sequences that may have a temporal dependency but contain discrete tokens; these can be commonly found in EHRs (International Classification of Diseases 9 codes) and text generation. A continuous time series has a data value corresponding to every moment in time. Continuous data generation is concerned with generating a real-valued signal x with temporal dependencies where x \(\in \mathbb {R}\). Figure 3 presents examples of discrete and continuous time series signals.

Fig. 3. Example plots of discrete (left) and continuous time series (right).

Challenges with discrete time series generation. GANs struggle with discrete data generation because the gradient is zero nearly everywhere; that is, distributions over discrete objects are not differentiable with respect to their parameters [52, 108]. This limitation makes the generator untrainable using backpropagation alone. The generator starts with a random sample and applies a deterministic transform, guided by the gradient of the discriminator’s loss with respect to the output produced by G and the training dataset. This loss leads to a slight change in G’s output, pushing it closer to the desired output. Making slight changes to continuous numbers makes sense; adding 0.001 to a value of 10 in financial time series data will bring it to 10.001. However, a discrete token such as the word “penguin” cannot simply undergo the addition of 0.001, as the sum “penguin+0.001” makes no sense. The key point is that the generator cannot move from one discrete token to the next, because a small change gives the token a new value that does not correspond to any other token in that limited discrete space [37]. This is because there exists zero probability in the space between these tokens, unlike with continuous data.
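The zero-gradient problem can be seen with a finite-difference check. In the sketch below, the three-word vocabulary, the logits, and the helper names are hypothetical; the discrete output is modeled as an argmax over logits, and the continuous output as a plain sum, purely for contrast.

```python
VOCAB = ["cat", "dog", "penguin"]  # hypothetical vocabulary

def pick_token(logits):
    """Discrete output: index of the largest logit (piecewise constant in the logits)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def continuous_output(logits):
    """Continuous output used for contrast: responds to every small change."""
    return sum(logits)

def finite_difference(f, logits, index, eps=1e-3):
    """Approximate d f / d logits[index] by nudging one logit by eps."""
    bumped = list(logits)
    bumped[index] += eps
    return (f(bumped) - f(logits)) / eps

logits = [0.2, 0.5, 1.3]
chosen = VOCAB[pick_token(logits)]
discrete_grad = [finite_difference(pick_token, logits, i) for i in range(3)]
continuous_grad = [finite_difference(continuous_output, logits, i) for i in range(3)]
```

A small nudge to any logit leaves the chosen token unchanged, so the discrete path reports a gradient of zero in every direction, whereas the continuous path reports a usable slope.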

Challenges with continuous time series generation. Modeling continuous time series data presents a different problem for GANs, which are inherently designed to model continuous data, albeit most commonly in the form of images. The temporal nature of continuous data in time series presents an extra layer of difficulty. Complex correlations exist between the temporal features and their attributes—for example, if using multichannel biometric/physiological data, the electrocardiogram (ECG) characteristics will depend on the individual’s age and/or health. In addition, long-term correlations exist in the data, which are not necessarily fixed in dimension compared to image-based data under a fixed dimension. Transforming image dimensions may lead to a degradation in image quality, but it is a recognized practice. This operation becomes more difficult with continuous time series data, as there is no standardized dimension used across time series GAN architectures, which means that benchmarking their performances becomes difficult.

Since their inception in 2014, GANs have shown great success in generating high-quality synthetic images indistinguishable from real images [41, 65, 87]. Although the focus to date has been on developing GANs for improved media generation, there is a growing consensus that GANs can be used for more than image generation and manipulation, which has led to a movement toward generating time series data with GANs.

RNNs (Figure 4, left), due to their loop-like structure, are well suited to sequential data applications but by themselves lack the ability to learn long-term dependencies that might be crucial in forecasting future values based on the past. Long short-term memory (LSTM) networks (Figure 4, right) are a specific kind of RNN that can remember information for long periods of time and, in turn, learn the long-term dependencies that a standard RNN cannot. Most of the RNN-based architectures reviewed in this article use the LSTM cell.

Fig. 4. Block diagram of a standard RNN (left) and an LSTM cell (right).
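The gating mechanism that lets an LSTM cell retain information can be sketched with a scalar cell using the standard LSTM update equations. The weight layout (a dictionary of `(w_x, w_h, bias)` tuples) and the hand-picked weights are assumptions chosen for illustration; they are not taken from any reviewed model.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell. `w` maps each gate name to (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the new candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

# Hand-picked weights: forget gate nearly open, input gate nearly closed.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}

h, c = 0.0, 0.8
for _ in range(100):
    h, c = lstm_step(1.0, h, c, weights)
```

With these gate settings the cell state stays close to its initial value of 0.8 after 100 steps, which is the kind of long-range retention a standard RNN struggles to learn.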

RNNs can model sequential data such as financial data, medical data, text, and speech, and they have been the foundational architecture for time series GANs. A recurrent generative adversarial network (RGAN) was first proposed in 2016. The generator contained a recurrent feedback loop that used both the input and hidden states at each timestep to generate the final output [54]. RGANs often utilize LSTM NNs in their generative models to avoid the vanishing gradient problem associated with more traditional recurrent networks [53]. In the section that follows, we present time series GANs that have either contributed significantly to this space or have made some of the most recent novel advancements in addressing the challenges mentioned previously.

4.1 Discrete-Variant GANs

4.1.1 Sequence GAN (SeqGAN) (Sept. 2016).

Yu et al. [108] proposed a sequential data generation framework that could address the issues with generating discrete data mentioned previously in Section 4. This approach outperformed previous methods for generative modeling on real-world tasks, including a maximum likelihood estimation (MLE)-trained LSTM, scheduled sampling [6], and policy gradient with bilingual evaluation understudy (PG-BLEU) [79]. SeqGAN’s generative model comprises RNNs with LSTM cells, and its discriminative model is a convolutional neural network (CNN). Given a dataset of structured sequences, the authors train G to produce a synthetic sequence \(Y_{1:T} = (y_{1}, \ldots, y_{t}, \ldots, y_{T}), y_{t} \in \mathcal{Y}\), where \(\mathcal{Y}\) is defined as the vocabulary of candidate tokens. G is updated by a policy gradient and Monte Carlo (MC) search on the expected reward from D (Figure 5). The authors used two datasets for their experiments, a Chinese poem dataset [62] and a Barack Obama speech dataset [102], with Adam optimizers and a batch size of 64. Their experiments are available online.¹

Fig. 5. SeqGAN: D is trained over real and generated data (left), whereas G is trained by policy gradient where the final reward signal is provided by D and is passed back to the intermediate action value via MC search (right).

Although the purpose of SeqGAN is to generate discrete sequential data, it opened the door to other GANs in generating continuous sequential and time series data. The authors use a synthetic dataset whose distribution is generated from a randomly initialized LSTM following a normal distribution. They also compare the generated data to real-world examples of poems, speech-language, and music. SeqGAN showed competitive performance in generating the sequences and contributed heavily toward the further development of the continuous sequential GANs.
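The MC search step described for SeqGAN can be sketched as follows: an unfinished sequence is completed many times by a rollout policy, and the discriminator’s scores on the completed sequences are averaged to give an intermediate action value. The binary vocabulary, the uniform `toy_policy`, and the `toy_discriminator` below are hypothetical stand-ins for the LSTM generator and CNN discriminator, not the authors’ implementation.

```python
import random

VOCAB = [0, 1]  # hypothetical two-token vocabulary
T = 8           # full sequence length

def toy_policy(prefix, rng):
    """Hypothetical rollout policy: samples the next token uniformly."""
    return rng.choice(VOCAB)

def toy_discriminator(seq):
    """Hypothetical D: 'realness' is the fraction of 1-tokens in the sequence."""
    return sum(seq) / len(seq)

def mc_action_value(prefix, action, n_rollouts=200, seed=0):
    """Estimate the value of taking `action` after `prefix` by averaging D over rollouts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix) + [action]
        while len(seq) < T:
            seq.append(toy_policy(seq, rng))
        total += toy_discriminator(seq)
    return total / n_rollouts

prefix = [1, 0, 1]
q_one = mc_action_value(prefix, action=1)
q_zero = mc_action_value(prefix, action=0)
```

In a policy gradient update, such action values would weight the log-probability of each chosen token, so tokens that lead to sequences D rates as more realistic are reinforced without needing a gradient through the discrete sampling step.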

4.1.2 Quant GAN (July 2019).

Quant GAN is a data-driven model that aims to capture long-range dependencies in financial time series data such as volatility clusters. Both the generator and discriminator use temporal convolutional networks (TCNs) with skip connections [101], which are essentially dilated causal convolutional networks. They have the advantage of being suited to modeling long-range dependencies in continuous sequential data. The generator function is a novel stochastic volatility neural network that consists of a volatility and drift TCN. Temporal blocks are the modules used in the TCN; each consists of two dilated causal convolutional layers (Figure 6) and two parametric rectified linear units (PReLU) as activation functions. Data generated by G is passed to D to produce outputs, which can then be averaged to give an MC estimate of D’s loss function. The authors used a dataset of daily spot prices of the S&P 500 from May 2009 until December 2018.

Fig. 6. Dilated causal convolutional layer.
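A dilated causal convolution can be written in a few lines for a single channel. This sketch is a generic illustration under assumed zero padding on the left; the function names and example values are not taken from the Quant GAN implementation.

```python
def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], treating x as zero before t = 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # causal: only current and past samples are used
                acc += w * x[idx]
        y.append(acc)
    return y

def receptive_field(kernel_size, dilations):
    """Number of input steps seen by one output after stacking the given layers."""
    return 1 + (kernel_size - 1) * sum(dilations)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = dilated_causal_conv(x, kernel=[1.0, 1.0], dilation=2)  # y[t] = x[t] + x[t-2]

# Changing the last input must not change any earlier output.
x_changed = x[:-1] + [100.0]
y_changed = dilated_causal_conv(x_changed, kernel=[1.0, 1.0], dilation=2)

rf = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is why this structure suits long-range dependencies at a modest parameter count.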

The authors aim to capture long-range dependencies in financial time series; however, modeling the series in continuous time over these long time frames would blow up the models’ computational complexity. Therefore, this method models the time series in discrete time. The authors report that this approach is capable of outperforming more conventional models from mathematical finance (constrained stochastic volatility NN and generalized AR conditional heteroskedasticity (GARCH) [7]) but state that there remain issues that need to be resolved for this approach to become widely adopted. One such issue concerns the need for a unified metric for quantifying the performance of these GANs, which is a point we discuss further in Section 6.

4.2 Continuous-Variant GANs

Training Stability Developments

4.2.1 Continuous RNN-GAN (C-RNN-GAN) (Nov. 2016).

In previous works, RNNs have been applied to modeling music but have generally used a symbolic representation to model this type of sequential data. Mogren [74] proposed the C-RNN-GAN (Figure 7), one of the first examples of using GANs to generate continuous sequential data. The generator is an RNN, and the discriminator a bidirectional RNN, which allows the discriminator to take the sequence context in both directions. The RNNs used in this work were two stacked LSTM layers, with each cell containing 350 hidden units. The loss functions are given in Equations (5) and (6):

\[ L_{G} = \frac{1}{m} \sum_{i=1}^{m} \log(1 - D(G(z^{(i)}))) \tag{5} \]

\[ L_{D} = \frac{1}{m} \sum_{i=1}^{m} [-\log D(x^{(i)}) - \log(1 - D(G(z^{(i)})))] \tag{6} \]

Here, \(z^{(i)}\) is a sequence of uniform random vectors in \([0, 1]^{k}\), and \(x^{(i)}\) is a sequence from the training data. k is the dimensionality of the data in the random sequence.

Fig. 7. Structure of C-RNN-GAN’s generator and discriminator.
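
As a rough illustration of Equations (5) and (6), the sketch below evaluates both losses from lists of discriminator outputs on a mini-batch. It assumes that D returns a probability in (0, 1) for each sequence; the clipping constant `EPS` and the function names are our own additions for numerical safety and are not taken from the C-RNN-GAN code.

```python
import math

EPS = 1e-12  # assumed clipping constant that keeps log() finite


def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def generator_loss(d_fake):
    """Equation (5): mean of log(1 - D(G(z))) over the mini-batch."""
    return sum(math.log(1.0 - _clip(p)) for p in d_fake) / len(d_fake)


def discriminator_loss(d_real, d_fake):
    """Equation (6): mean of -log D(x) - log(1 - D(G(z))) over the mini-batch."""
    m = len(d_real)
    total = sum(
        -math.log(_clip(r)) - math.log(1.0 - _clip(f))
        for r, f in zip(d_real, d_fake)
    )
    return total / m


# Toy discriminator outputs for m = 2 real and 2 generated sequences.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print(generator_loss(d_fake), discriminator_loss(d_real, d_fake))
```

The generator loss decreases as D assigns higher probabilities to generated sequences, whereas the discriminator loss decreases as real and generated sequences are separated more confidently.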

The C-RNN-GAN is trained with backpropagation through time (BPTT) and mini-batch stochastic gradient descent with L2 regularization on the weights of both G and D. Freezing was applied to G or D whenever that network became too strong relative to the other. The dataset used was 3,697 MIDI files from 160 different composers of classical music with a batch size of 20. Adam and gradient descent optimizers were used during training; full implementation details are available online.² Overall, the C-RNN-GAN was capable of learning the characteristics of continuous sequential data and, in turn, of generating music. However, the author stated that the approach still needs work, particularly in rigorous evaluation of the generated data quality.

4.2.2 Noise Reduction GAN (NR-GAN) (Oct. 2019).

NR-GAN is designed for noise reduction in continuous time series signals but more specifically has been implemented for noise reduction in mice electroencephalogram (EEG) signals [90]. This dataset was provided by the International Institute for Integrative Sleep Medicine (IIIS). EEG is the measure of the brain’s electrical activity, and it commonly contains significant noise artifact. NR-GAN’s core idea is to reduce or remove the noise present in the frequency domain representation of an EEG signal. The architecture of G is a two-layer 1D CNN with a fully connected layer at the output. D contains almost the same two-layer 1D CNN structure with the fully connected layer replaced by a softmax layer to calculate the probability that the input belongs to the training set. The loss functions are defined in Equations (7) and (8) as (7) \(\begin{equation} {G}_{loss}= \sum _{x\in S_{ns}} [log(1-D(G(x))) + \alpha \Vert x - G(x)\Vert ^2], \end{equation}\) (8) \(\begin{equation} {D}_{loss}= \sum _{x\in S_{ns}} [log(D(G(x)))] + \sum _{y\in S_{cs}}[log(1-D(y))], \end{equation}\) where \(S_{ns}\) and \(S_{cs}\) are the noisy and clear EEG signals, respectively. \(\alpha\) is a hyperparameter that essentially controls the aggressiveness of noise reduction; the authors chose a value of \(\alpha = 0.0001\).

For this work, the generator does not sample from a latent space; rather, it attempts to generate the clear signal from the noisy EEG signal input (Figure 8). The authors found that the NR-GAN is competitive with classical frequency filters in terms of noise reduction. They also state that the experimental conditions may favor the NR-GAN and list some limitations in terms of the amount of noise NR-GAN can handle and the influence of \(\alpha\). However, this is a novel method for noise reduction in continuous sequential data using GANs.

Fig. 8. NR-GAN architecture with noisy EEG input \({S_{ns}}\) and clean input data \({S_{cs}}\) .

4.2.3 TimeGAN (Dec. 2019).

TimeGAN provides a framework that utilizes both the conventional unsupervised GAN training method and the more controllable supervised learning approach [107]. By combining an unsupervised GAN network with a supervised AR model, the network aims to generate time series with preserved temporal dynamics. The architecture of the TimeGAN framework is illustrated in Figure 9. The input to the framework is considered to consist of two elements: a static feature and a temporal feature. s represents a vector of static features and x of temporal features at the input to the encoder. The generator takes a tuple of static and temporal random feature vectors drawn from a known distribution. The real and synthetic latent codes \({\bf h}\) and \(\hat{{\bf h}}\) are used to calculate the supervised loss element of this network. The discriminator receives the tuple of real and synthetic latent codes and classifies them as either real (y) or synthetic (\(\hat{y}\)), and the \(\tilde{\cdot }\) operator denotes the sample as either real or fake.

The three losses used in TimeGAN are calculated as follows. (9) \(\begin{equation} {L}_{reconstruction} = \mathbb {E}_{s,x_{1:T} \sim p}\left[ \Vert s-\tilde{s}\Vert _{2} + \sum _{t} \Vert x_{t} - \tilde{x}_{t}\Vert _{2} \right] \end{equation}\) (10) \(\begin{equation} {L}_{unsupervised} = \mathbb {E}_{s,x_{1:T} \sim p}\left[\log (y_{S}) + \sum _{t} \log (y_{t}) \right] + \mathbb {E}_{s,x_{1:T} \sim {\hat{p}}} \left[\log (1- \hat{y}_{S}) + \sum _{t} \log (1- \hat{y}_{t}) \right] \end{equation}\) (11) \(\begin{equation} {L}_{supervised} = \mathbb {E}_{s,x_{1:T} \sim p}\left[ \sum _{t} \Vert h_{t} -g_{X}(h_{S}, h_{t-1}, z_{t})\Vert _{2} \right] \end{equation}\)
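
The reconstruction and supervised losses in Equations (9) and (11) can be sketched for a single sequence as follows; the expectation over the data distribution would then be approximated by averaging over a mini-batch. The function names are hypothetical, and `g_x` stands for any callable that plays the role of the generator's one-step-ahead function \(g_{X}\); the persistence function used in the example is only a placeholder and is not part of TimeGAN.

```python
import math


def l2_distance(u, v):
    """Euclidean norm of the difference between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def reconstruction_loss(s, s_rec, x, x_rec):
    """Inner term of Equation (9) for one sequence.

    s, s_rec: static features and their reconstruction.
    x, x_rec: lists of temporal feature vectors and their reconstructions.
    """
    temporal = sum(l2_distance(x_t, xr_t) for x_t, xr_t in zip(x, x_rec))
    return l2_distance(s, s_rec) + temporal


def supervised_loss(h_static, h, z, g_x):
    """Inner term of Equation (11) for one sequence.

    h: list of latent codes h_0, ..., h_T; z: list of noise vectors.
    g_x: callable (h_S, h_{t-1}, z_t) -> predicted h_t (assumed interface).
    """
    return sum(
        l2_distance(h[t], g_x(h_static, h[t - 1], z[t]))
        for t in range(1, len(h))
    )


# Placeholder one-step predictor: simply repeats the previous latent code.
persistence = lambda h_s, h_prev, z_t: h_prev

print(reconstruction_loss([0.0, 0.0], [3.0, 4.0], [[1.0], [2.0]], [[1.0], [4.0]]))
print(supervised_loss([0.0], [[0.0], [1.0], [3.0]], [[0.0]] * 3, persistence))
```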

The creators of TimeGAN conducted experiments on generating sine waves, stocks (daily historical Google stocks data from 2004 to 2019), energy (UCI Appliances energy prediction dataset) [26], and event (private lung cancer pathways dataset) datasets. A batch size of 128 and Adam optimizer were used for training, and implementation details are available online.³ The authors demonstrated improvements over other state-of-the-art time series GANs such as RCGAN, C-RNN-GAN, and WaveGAN.

4.2.4 Conditional Sig-Wasserstein GAN (SigCWGAN) (June 2020).

A problem addressed by Ni et al. [76] is that long time series data streams can greatly increase the dimensionality requirements of generative modeling, which may render such approaches infeasible. To counter this problem, the authors develop a metric named Signature Wasserstein-1 (Sig-\(W_1\)) that captures time series models’ temporal dependency and uses it as a discriminator in a time series GAN. It provides an abstract and universal description of complex data streams and does not require costly computation like the Wasserstein metric. A novel generator is also presented that is named conditional autoregressive feed-forward neural network (AR-FNN), which captures the AR nature of the time series. The generator is capable of mapping past series and noise into future series. For a rigorous mathematical description of their method, the interested reader should consult the work of Ni et al. [76].

For the AR-FNN generator, the idea is to obtain the step-q estimator \(\hat{X}^{(t)}_{t+1:t+q}\). The loss function for D is defined as (12) \(\begin{equation} L(\theta)= \sum _{t}\left|\mathbb {E}_{\mu }\left[S_M(X_{t+1:t+q})|X_{t-p+1:t} \right] - \mathbb {E}_{\nu }\left[S_M(\hat{X}_{t+1:t+q}^{(t)})|X_{t-p+1:t} \right]\right|, \end{equation}\) where \(\mu\) and \(\nu\) are the conditional distributions induced by the real data and the synthetic generator, respectively. \(X_{t-p+1:t}\) is the true past path, \(\hat{X}_{t+1:t+q}^{(t)}\) is the forecasted next path, and \(X_{t+1:t+q}\) is the true future path. \(S_{M}\) is the truncated signature of path X of degree M. Further details of the algorithm of Ni et al. can be found in the appendix of their original paper [76]. SigCWGAN eliminates the problem of approximating a costly D and simplifies training. It is reported to achieve state-of-the-art results on synthetic and empirical datasets compared to TimeGAN, RCGAN, and generative moment matching networks (GMMNs) [68]. The empirical dataset consists of the S&P 500 index (SPX) and Dow Jones index (DJI) and their realized volatility, which is retrieved from the Oxford-Man Institute’s “realized library” [55]. A batch size of 200 with the Adam optimizer was used for training.⁴
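
To give a feel for the truncated signature \(S_{M}\) in Equation (12), the sketch below computes it for degree M = 2 on a piecewise-linear path and compares the mean signatures of two sets of paths. This is a simplified illustration under our own assumptions: the paths are linearly interpolated between observations, the expectations are replaced by sample means for a single conditioning past, and the Euclidean norm is used for the difference. It is not the implementation of Ni et al. [76].

```python
import math


def truncated_signature_deg2(path):
    """Degree-2 truncated signature of a piecewise-linear path.

    path: list of points, each a list of d floats.
    Returns (level1, level2), where level1[i] is the total increment of
    channel i and level2[i][j] is the iterated integral of channels i and j.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                # Chen's identity for appending one linear segment.
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


def _flatten(signature):
    level1, level2 = signature
    return level1 + [v for row in level2 for v in row]


def expected_signature_distance(real_paths, fake_paths):
    """Euclidean distance between the mean signatures of two path samples."""
    def mean_signature(paths):
        sigs = [_flatten(truncated_signature_deg2(p)) for p in paths]
        return [sum(col) / len(sigs) for col in zip(*sigs)]

    mu, nu = mean_signature(real_paths), mean_signature(fake_paths)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu, nu)))


path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
print(truncated_signature_deg2(path))
```

Because the signature terms are computed directly from the paths, a distance of this kind can be evaluated without training a separate discriminator network, which is the motivation given above for replacing D with a signature-based metric.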

4.2.5 Decision-Aware Time Series Conditional GAN (DAT-CGAN) (Sept. 2020).

This framework is designed to provide support for end users’ decision processes, specifically in financial portfolio choices. It uses a multi-Wasserstein loss on structured decision-related quantities [91]. The discriminator loss and generator loss are defined in Equations (13) and (14), respectively. For further details on the loss functions, see Section 3 of the original paper [91] and Equations (15) through (18). (13) \(\begin{equation} \mathop {inf}_{\eta } \mathop {sup}_{\gamma _{k},\theta _{j,k}} \sum _{k=1}^{K} \omega _k\left(\mathbb {E}_{k}^r - \mathbb {E}_{k}^{G_{\eta }} \right)+ \sum _{k=1}^{K} \sum _{j=1}^{J} \lambda _{j,k} \left(\mathbb {E}_{j,k}^{f,R} - \mathbb {E}_{j,k}^{f, G_{\eta }} \right) \end{equation}\) (14) \(\begin{equation} \mathop {inf}_{\eta } - \sum _{k} \omega _k\mathbb {E}_{k}^{G_{\eta }} - \sum _{k,j} \lambda _{j,k} \mathbb {E}_{j,k}^{f, G_{\eta }} \end{equation}\) (15) \(\begin{gather} \mathbb {E}^{r}_{k} = \mathbb {E}_{r_{t+k} \sim P(r_{t+k} | x_{t})} [ D_{\gamma _{k}}(r_{t+k}, x_{t}) ] \end{gather}\) (16) \(\begin{gather} \mathbb {E}^{G_{\eta }}_{k} = \mathbb {E}_{z_{t,k} \sim P(z_{t,k})} [ D_{\gamma _{k}}(r^{\prime }_{t,k}, x_{t}) ] \end{gather}\) (17) \(\begin{gather} \mathbb {E}^{f,R}_{j,k} = \mathbb {E}_{R_{t,k} \sim P(R_{t,k} | x_{t})} [ D_{\theta _{j,k}}(f_{j,k}(R_{t,k}, x_{t}), x_{t})] \end{gather}\) (18) \(\begin{gather} \mathbb {E}^{f,G_{\eta }}_{j,k} = \mathbb {E}_{Z_{t,k} \sim P(Z_{t,k})} [ D_{\theta _{j,k}}(f_{j,k}(R^{\prime }_{t,k}, x_{t}), x_{t})] \end{gather}\)

We offer a full description of all terms used in Equations (13) through (18). \(D_{\gamma _{k}}\) is the discriminator for the data at look ahead period k with respect to parameters \(\gamma _{k}\). \(G_{\eta }\) is the generator with parameters \(\eta\). As this is the conditional case, \(x_{t}\) is the conditioning variable containing relevant information up to time t. \(r^{\prime }_{t,k} = G_{\eta }(z_{t,k}, x_{t})\) is defined as the synthetic data at look ahead point k where the noise is \(z_{t,k}\). The discriminator for decision-related quantity j at look ahead period k with respect to parameters \(\theta _{j,k}\) is defined as \(D_{\theta _{j,k}}\). These decision-related quantities may include mean and covariance, for example. \(f_{j,k}(R_{t,k},x_{t})\) represents the decision-related quantity. Finally, \(\omega _{k}\) and \(\lambda _{j,k}\) are weights, and \(\mathop {inf}\) and \(\mathop {sup}\) are the infimum (greatest lower bound) and supremum (least upper bound), respectively.

The generator is a two-layer feed-forward NN for each input—assets in this case. G outputs asset returns that are used to compute decision-related quantities. These quantities are fed into D, which is also a two-layer feed-forward NN. Further details about the architecture can be found in the appendix of the work of Sun et al. [91]. The dataset used is daily price data for each of four U.S. Exchange-traded funds (ETFs): Material (XLB), Energy (XLE), Financial (XLF), and Industrial (XLI) ETFs, from 1999 to 2016. The authors found this model capable of high-fidelity time series generation that supports decision processes by end users due to incorporating a decision-aware loss function. However, this approach’s limitation is that the computational complexity of this model is vast and requires 1 month of training time for a single generative model.

Privacy Developments

4.2.6 Recurrent Conditional GAN (RCGAN) (2017).

RCGAN for continuous data generation [31] differs architecturally from the C-RNN-GAN. Although the RNN LSTM is used, the discriminator is unidirectional, and the outputs of G are not fed back as inputs at the next timestep. There is also additional information that the model is conditioned on, which makes for a conditional RGAN; see the layout of the model in Figure 10. The purpose of the RCGAN and RGAN in this work is to generate continuous time series with a focus on medical data intended for use in downstream tasks, and this was one of the first works in this area. The loss functions can be seen in Equations (19) and (20), where CE is the average cross-entropy between two sequences. \(X_n\) are samples drawn from the training dataset. \(y_n\) is the adversarial ground truth; for real sequences, it is a vector of 1s, and conversely, for generated or synthetic sequences, it is a vector of 0s. \(Z_n\) is a sequence of points sampled from the latent space, and the valid adversarial ground truth is written here as 1. (19) \(\begin{equation} {D}_{loss}(X_n, y_n) = -CE(D(X_n), y_n) \end{equation}\) (20) \(\begin{equation} {G}_{loss}(Z_n) = D_{loss}(G(Z_n),{\bf 1}) = -CE(D(G(Z_n)), {\bf 1}) \end{equation}\)

Fig. 10. RCGAN architecture with conditional input c, input data x, and latent variable z.

In the conditional case, the inputs to D and G are concatenated with some conditional information \(c_n\). This variant of an RNN-GAN facilitates the generation of a synthetic continuous time series dataset with associated labels. Experiments were carried out on generated sine waves, smooth functions sampled from a Gaussian process with a zero-valued mean function, the MNIST dataset as a sequence, and the Philips eICU database [83]. A batch size of 28 with Adam and gradient descent optimizers was used for training. The authors propose a novel method for evaluating their model, which is discussed further in Section 6. Full experimental details can be found online.⁵

4.2.7 Sequentially Coupled GAN (SC-GAN) (April 2019).

SC-GAN aims to generate patient-centric medical data to inform of a patient’s current state and generate a recommended medication dosage based on the state [97]. It consists of two coupled generators tasked with producing two outcomes: one for the current state of an individual and the other for a recommended medication dosage based on the individual’s state. The discriminator is a two-layer bidirectional LSTM, and the coupled generators are both two-layer unidirectional LSTMs. Figure 11 presents further details of the architecture.

\(G_1\) generates the recommended medication dosage data (\({\bf a}_1,{\bf a}_2, \ldots ,{\bf a}_T\)) with a combined input of the sequential continuous patient state data (\({\bf s}_0,{\bf s}_1, \ldots ,{\bf s}_{T-1}\)) and a random noise sequence (\(\hat{{\bf z}}_{0}^{a},\hat{{\bf z}}_{1}^{a}, \ldots ,\hat{{\bf z}}_{T-1}^{a}\)) sampled from a uniform distribution. At each timestep \(t,\) the input \({\bf z}_{t}^{a}\) of \(G_1\) is the concatenation of \({\bf s}_t\) and \(\hat{{\bf z}}_{t}^{a}\).

Conversely, \(G_2\) is tasked with generating the patient state data \({\bf s}_{t}\) and at each timestep the input \({\bf z}_{t}^{s}\) is the concatenation of \({\bf s}_{t-1}\), \({\bf a}_{t-1}\) and \(\hat{{\bf z}}_{t}^{s}\). This means that the outputs from \(G_1\) and \(G_2\) are the inputs to one another. Combining the generators together leads to the following loss function: (21) \(\begin{equation} L_G = \frac{1}{N} \frac{1}{T} \sum _{i=1}^{N} \sum _{t=1}^{T} log(1-D(G({\bf z}_{i,t}))), \end{equation}\) (22) \(\begin{equation} G({\bf z}_{i,t}) = \left[G_1({\bf z}_{i,t}^{a});G_2({\bf z}_{i,t}^{s}) \right], \end{equation}\) where N is the number of patients and T is the time length of the patient record. To avoid an excessively strong D, SC-GAN first pre-trains the generators in a supervised step that uses the least-squares loss.

The discriminator is tasked with classifying the sequential patient-centric records as real or synthetic, and the loss function is defined as (23) \(\begin{equation} L_D = - \frac{1}{N} \frac{1}{T} \sum _{i=1}^{N} \sum _{t=1}^{T} (logD({\bf x}_{i,t}) + log(1-D(G({\bf z}_{i,t})))), \end{equation}\) where \({\bf x}_{i,t} = [{\bf s}_{t};{\bf a}_{t}]\). This model contains novel coupled generators that coordinate to generate patient state and medication dosage data. It has performance close to real data for the treatment recommendation task. The dataset used in this experiment is MIMIC-III [56]. The authors benchmark their SC-GAN against variants of SeqGAN, C-RNN-GAN, and RCGAN and observe their model to be the best performing for this specific use case.

4.2.8 Synthetic Biomedical Signals GAN (SynSigGAN) (Dec. 2020).

SynSigGAN is designed to generate different kinds of continuous physiological/biomedical signal data [49]. It is capable of generating ECG, EEG, electromyography (EMG), and photoplethysmography (PPG) from the MIT-BIH Arrhythmia database [75], Siena Scalp EEG database [23], and BIDMC PPG and Respiration dataset [82]. A novel GAN architecture is proposed here that uses a bidirectional grid long short-term memory (BiGridLSTM) for the generator (Figure 12) and a CNN for the discriminator. The BiGridLSTM combines two GridLSTMs (a version of LSTM that arranges LSTM cells in a multidimensional grid) running in opposite directions; it can mitigate the vanishing gradient phenomenon along two dimensions and has been found to work well in time sequence problems. The authors used the value function defined previously in Equation (2).

Fig. 12. Architecture of BiGridLSTM with LSTM blocks for the time and depth dimension. The prime \((^{\prime })\) symbol indicates reverse in the figure as in the work of Fei and Tan [34].

SynSigGAN is capable of capturing the different physiological characteristics associated with each of these signal types and has demonstrated an ability to generate biomedical time series data with a maximum sequence length of 191 data points. The authors also present a preprocessing stage to clean and refine the biomedical signals in this work. They compare their architecture to several variants (BiLSTM-GRU, BiLSTM-CNN GAN, RNN-AE GAN, Bi-RNN, LSTM-AE, BiLSTM-MLP, LSTM-VAE GAN, and RNN-VAE GAN) and find the BiGridLSTM to be the best-performing model.

Evaluation Developments

As evaluating GANs has been identified as one of their major challenges, we discuss standard evaluation metrics and novel developments formally in Section 6.

5 APPLICATIONS

We have discussed the two classes of time series GANs and their contribution to solving the challenges presented in Section 3.2. Now we will list the wide-ranging applications of time series GANs and the benefits of such to both research and industry.

5.1 Data Augmentation

It is common knowledge in the deep learning community that GANs are among the methods of choice when discussing data augmentation. Reasons for augmenting datasets range from increasing the size/variety of small and imbalanced datasets [2, 44, 59, 77] to reproducing restricted datasets for dissemination.

A well-defined solution to the data shortage problem is transfer learning, and it works well in domain adaptation, which has led to advancements in classification and recognition problems [78]. However, it has been found that augmenting datasets with GANs can lead to further improvements in certain classification and recognition tasks [110]. Data synthesized by a GAN can adhere to stricter privacy measures discussed in Section 7. This further demonstrates the advantages of augmenting your training dataset with GANs over implementing transfer learning with a pre-trained model from a different domain on a smaller dataset.

Many researchers find accessing datasets for their deep learning research and models to be time-consuming, laborious work, particularly when the research is concerned with sensitive personal data. Often medical and clinical data are presented as continuous sequential data that can only be accessed by a small contingent of researchers who are not at liberty to disseminate their research openly. This, in turn, may lead to stagnation in the research progress in these domains.

Fortunately, we are beginning to see the uptake of GANs applied to time series with these types of medical and physiological data [12, 21, 31, 49, 111]. Brophy [11] shows that the generation of dependent, multivariate, continuous, high-fidelity physiological signals is possible via GANs, demonstrating the impressive capability of these networks. Figure 13 presents an example of the real input and synthetic generated data.

Fig. 13. An example of dependent multichannel ECG data (left) and generated ECG from a multivariate GAN (right) [11]. NSR indicates the training dataset, which is the normal sinus rhythm. The generated data is produced by a GAN named by the authors as LSGAN-DTW.

Of course, this is not comprehensive coverage of the research using time series GANs for data synthesis and augmentation. GANs have been applied to time series data for a plethora of use cases.

Audio generation (both music and speech) and text-to-speech (TTS) [57] have been popular areas for researchers to explore with GANs. The C-RNN-GAN described in Section 4.2.1 was one of the seminal works to apply GANs to generating continuous sequential data in the form of music.

In the financial sector, GANs have been implemented to generate data and predict/forecast values. Wiese et al. [101] implemented a GAN to approximate financial time series in discrete time. Sun et al. [91] designed a decision-aware GAN that generates synthetic data and supports the decision processes of end users in financial portfolio selection.

Other time series generation/prediction methods range from estimating soil temperature [67] to predicting medicine expenditure based on the current state of patients [58].

5.2 Imputation

In real-world datasets, missing or corrupt data is an all too common problem that leads to downstream problems. These issues manifest themselves in further analytics of the dataset and can induce biases in the data. Common methods of dealing with missing or corrupted data in the past have been the deletion of data streams containing the missing segments, statistical modeling of the data, or machine learning imputation approaches. Looking at the latter, we review the work in imputing these data using GANs. Guo et al. [42] designed a GAN-based approach for multivariate time series imputation. Figure 14 presents an example of imputed data from a toy experiment [12].

Fig. 14. An example of the incomplete corrupted time series (top) and imputed signal (bottom).

5.3 Denoising

Artifacts induced in time series data often manifest themselves as noise in the signals. This has become an ever-present challenge in further processing and analytical applications. Corrupted data can cause biases in the datasets or lead to degradation in the performance of critical systems such as those used for health applications. Common methods for dealing with noise include the use of adaptive linear filtering. Another approach recently explored in the work of Sumiya et al. [90] used GANs as a noise-reduction technique in EEG data. Their experiments showed that their proposed NR-GAN (Section 4.2.2) was capable of competitive noise reduction performance compared to more traditional frequency filters.

5.4 Anomaly Detection

Detecting outliers or anomalies in time series data is an important part of many real-world systems and sectors. Whether it is detecting unusual patterns in physiological data that may be a precursor to a more serious condition or detecting irregular trading patterns on the stock exchange, anomaly detection can be vital to keeping us informed. Statistical measures of non-stationary time series signals may achieve good performance on the surface, but they might also miss some important outliers present in deeper features. They may also struggle to exploit large unlabeled datasets; this is where the unsupervised deep learning approaches can outperform the conventional methods. Zhu et al. designed an algorithm for anomaly detection in time series data (ECG and taxi dataset) with LSTMs and GANs, which achieved superior performance compared to conventional, more shallow approaches [112]. Similar approaches have been applied to detect cardiovascular diseases [69], in cyber-physical systems to detect nefarious players [66], and even irregular behaviors such as stock manipulation on the stock markets [63].

5.5 Other Applications

Some works have utilized image-based GANs for time series and sequential data generation by first converting their sequences to images via some transformation function and training the GAN on these images. Once the GAN converges, similar images can be generated; then, a sequence can be retrieved using the inverse of the original transformation function. For example, this approach has been implemented in audio generation with waveforms [16, 24, 60], anomaly detection [18], and physiological time series generation [12].

6 EVALUATION METRICS

As mentioned in Section 3, GANs can be difficult to evaluate, and researchers are yet to agree on what metrics reflect GAN performance best. There have been plenty of metrics proposed in the literature [8], with most of them suited to the computer vision domain. Work is still ongoing to suitably evaluate time series GANs. We can break down evaluation metrics into two categories: qualitative and quantitative. Qualitative evaluation is another term for human visual assessment via the inspection of generated samples from the GAN. However, this cannot be deemed a full evaluation of GAN performance due to the lack of a suitable objective evaluation metric. The quantitative evaluation includes the use of metrics associated with statistical measures used for time series analytics and similarity measures such as the Pearson correlation coefficient (PCC), percent root mean square difference (PRD), root mean squared error (RMSE) and mean squared error (MSE), mean relative error (MRE), and mean absolute error (MAE). These metrics are among the most commonly used for time series evaluation and, as such, are used as GAN performance metrics, as they can reflect the similarity between the training data and the synthetic generated data. We show some of these common formulas in Equations (24) through (27). (24) \(\begin{equation} PCC = \frac{\sum _{i=1}^{N}(x_{i}- \tilde{x})(y_{i}- \tilde{y})}{ \sqrt { \sum _{i=1}^{N}(x_{i}- \tilde{x})^2 \sum _{i=1}^{N}(y_{i}- \tilde{y})^2 } } \end{equation}\) (25) \(\begin{equation} PRD = \sqrt { \frac{\sum _{i=1}^{N}(x_{i}- y_{i})^2}{ \sum _{i=1}^{N}(x_{i})^2 } } \end{equation}\) (26) \(\begin{equation} RMSE = \sqrt { \frac{1}{N}\sum _{i=1}^{N}(x_{i} - y_{i})^2 } \end{equation}\) (27) \(\begin{equation} MRAE = \frac{1}{N} \sum _{i=1}^{N} \left| \frac{x_{i} - y_{i}}{x_{i}-f_{i}} \right| \end{equation}\)

Across these formulas, \(x_{i}\) is the actual value of the time series x at time/sample i, and \(y_{i}\) is the generated value of the time series y at time/sample i. \(\tilde{x}\) and \(\tilde{y}\) represent the mean values of x and y, respectively. \(f_{i}\) is used in the MRAE calculation for the forecast value at time i of a chosen benchmark model. In general, \(f_{i}\) can be chosen to be \(y_{i-1}\) for non-seasonal time series and \(y_{i-M}\) for seasonal time series, where M is the seasonal period of x.
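
Equations (24) through (27) translate directly into a few lines of code. The sketch below is a plain Python transcription in which `x` is the real series, `y` the generated series, and `f` the benchmark forecast; the function names are ours, and the benchmark values are passed in explicitly rather than derived from `y`, so that any of the choices for \(f_{i}\) described above can be used.

```python
import math


def pcc(x, y):
    """Pearson correlation coefficient, Equation (24)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def prd(x, y):
    """Percent root mean square difference as written in Equation (25)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / sum(a ** 2 for a in x))


def rmse(x, y):
    """Root mean squared error, Equation (26)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mrae(x, y, f):
    """Mean relative absolute error, Equation (27); requires x_i != f_i."""
    return sum(abs((a - b) / (a - c)) for a, b, c in zip(x, y, f)) / len(x)


real = [1.0, 2.0, 3.0, 4.0]
generated = [2.0, 4.0, 6.0, 8.0]
print(pcc(real, generated), prd(real, generated), rmse(real, generated))
```

In this toy case the generated series is perfectly correlated with the real one yet has a large RMSE, which illustrates why a single metric is rarely sufficient to judge generated data.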

Several metrics have become well-established choices in evaluating image-based GANs, and some of these, such as IS [88], Fréchet distance, and FID [51], have permeated through to sequential and time series GANs. The structural similarity index (SSIM) is a measure of similarity between two images. However, Parthasarathy et al. [80] use it with time series data, as SSIM can equally be applied to aligned sequences of fixed length.

Of course, some of these metrics are measures of similarities/dissimilarities between two probability distributions, suitable for many types of data, particularly MMD [39]. In the real world, we do not have access to the underlying distributions of data, and therefore we show an empirical estimate of MMD in Equation (28), which is a quite suitable metric for this task across domains: (28) \(\begin{equation} MMD[\mathcal {F},X,Y] = \left[ \frac{1}{m^2} \sum _{i,j=1}^{m} k(x_{i},x_{j}) - \frac{2}{mn} \sum _{i,j=1}^{m,n} k(x_{i},y_{j}) + \frac{1}{n^2}\sum _{i,j=1}^{n} k(y_{i},y_{j}) \right]^{\frac{1}{2}}, \end{equation}\) where \(\mathcal {F}\) is a class of smooth functions \(f: \mathcal {X} \rightarrow \mathbb {R}\). Two samples \(X:=\lbrace x_{1}, x_{2}, \ldots , x_{m}\rbrace\) and \(Y:=\lbrace y_{1}, y_{2}, \ldots , y_{n}\rbrace\) are drawn from two distributions p and q, with m points sampled from p and n from q. Last, k is the kernel function chosen by the user.
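
A direct transcription of the empirical estimate in Equation (28) is sketched below. The Gaussian (RBF) kernel and its bandwidth `sigma` are our assumptions for the purpose of illustration, since the kernel is left to the user; each sample is represented as a list of floats, for example a fixed-length time series.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """RBF kernel between two vectors (assumed kernel choice)."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd(X, Y, kernel=gaussian_kernel):
    """Empirical MMD between samples X (m items) and Y (n items), Equation (28)."""
    m, n = len(X), len(Y)
    xx = sum(kernel(a, b) for a in X for b in X) / (m * m)
    xy = sum(kernel(a, b) for a in X for b in Y) / (m * n)
    yy = sum(kernel(a, b) for a in Y for b in Y) / (n * n)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(xx - 2.0 * xy + yy, 0.0))


real = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
close = [[0.1, 0.0], [0.0, 0.2], [0.2, 0.1]]
far = [[5.0, 5.0], [6.0, 5.5], [5.5, 6.0]]
print(mmd(real, close), mmd(real, far))
```

A value near zero indicates that the two samples are hard to tell apart under the chosen kernel, and the result depends on the kernel bandwidth, so the same kernel settings should be used when comparing models.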

Another metric that generalizes well to the sequential data case is the Wasserstein distance. The Wasserstein-1, or Earth Mover, distance, which is the case \(p = 1\) of Equation (29), describes the cost it takes to move one cumulative distribution function to another while preserving the shape of the functions, which is done by optimizing the transport plan: (29) \(\begin{equation} W_{p}(\mu ,\nu) = \left(\inf _{\gamma \in \Gamma (\mu , \nu)} \int _{X \times Y} d^{p}(x,y)\, d\gamma (x,y)\right)^{\frac{1}{p}}, \end{equation}\) where \(\Gamma (\mu , \nu)\) is the set of all transport plans, \(d^{p}(x,y)\) is the distance function, and \(d\gamma (x,y)\) is the amount of “mass” to be moved.
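
In the special case of one-dimensional empirical distributions with the same number of equally weighted samples, the optimal transport plan in Equation (29) with \(p = 1\) simply matches the sorted samples, so no optimization is needed. The sketch below covers only this special case, for example for comparing the marginal distribution of values in real and generated series; the function name is our own.

```python
def wasserstein1_1d(x, y):
    """Wasserstein-1 distance between two equally sized 1D samples.

    Assumes equal sample counts and uniform weights, in which case the
    optimal plan pairs the i-th smallest value of x with that of y.
    """
    if len(x) != len(y):
        raise ValueError("this sketch requires samples of equal size")
    xs, ys = sorted(x), sorted(y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


real_values = [0.0, 1.0, 2.0]
generated_values = [3.0, 1.0, 2.0]
print(wasserstein1_1d(real_values, generated_values))  # prints 1.0
```

For multivariate sequences or unequal sample sizes, the infimum over transport plans has to be computed with a dedicated optimal transport solver or approximated.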

The data generated from GANs have been used in downstream classification tasks. Using the generated data together with the training data has led to the Train on Synthetic, Test on Real (TSTR) and Train on Real, Test on Synthetic (TRTS) evaluation methods, first proposed by Esteban et al. [31]. In scoring downstream classification applications that use both real and generated data, studies have adopted the precision, recall, and F1 scores to determine the classifier’s quality and, in turn, the quality of the generated data. Other accuracy measures of classifier performance include the weighted accuracy (WA) and unweighted average recall (UAR).
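
The TSTR and TRTS protocols can be sketched as follows. For brevity, a nearest-centroid classifier stands in for the downstream model; this choice, the function names, and the toy two-class data are our assumptions and are not the setup of Esteban et al. [31], where any supervised classifier could take its place.

```python
def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def accuracy(centroids, X, y):
    hits = sum(predict(centroids, f) == lab for f, lab in zip(X, y))
    return hits / len(y)


def tstr(synth_X, synth_y, real_X, real_y):
    """Train on Synthetic, Test on Real."""
    return accuracy(fit_centroids(synth_X, synth_y), real_X, real_y)


def trts(real_X, real_y, synth_X, synth_y):
    """Train on Real, Test on Synthetic."""
    return accuracy(fit_centroids(real_X, real_y), synth_X, synth_y)


# Toy feature vectors (e.g., summary statistics of each series) and labels.
synth_X = [[0.1, 0.0], [-0.2, 0.1], [5.0, 5.1], [4.9, 5.2]]
synth_y = [0, 0, 1, 1]
real_X = [[0.0, 0.2], [0.3, -0.1], [5.2, 4.8], [4.7, 5.0]]
real_y = [0, 0, 1, 1]
print(tstr(synth_X, synth_y, real_X, real_y), trts(real_X, real_y, synth_X, synth_y))
```

A TSTR score close to the score obtained when training on real data suggests that the synthetic data preserve the label-relevant structure; precision, recall, and F1 can be substituted for accuracy in the same way.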

Often used distance and similarity measures in time series data are the Euclidean distance (ED) and dynamic time warping (DTW) algorithms. Multivariate (in)dependent dynamic time warping (MVDTW), implemented in the work of Brophy [11], can determine similarity measures across both dependent and independent multichannel time series signals. The idea behind DTW is to find the minimum cost, or optimal alignment of the warping path via the cumulative distance function. The MVDTW cumulative distance function is given in Equation (30), which is used to find the path that minimizes the warping cost of multivariate time series signals. (30) \(\begin{equation} D(i,j) = \sum _{m=1}^{M} (q_{i,m} - c_{j,m})^2 + min \lbrace D(i-1,j-1), D(i-1,j), D(i,j-1)\rbrace \end{equation}\)
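
The recursion in Equation (30) can be implemented with a small dynamic programming table. The sketch below covers the dependent case, in which all M channels share a single warping path; the function name and the boundary conditions (an infinite cost outside the table and zero cost at the origin) are standard DTW conventions that we assume here, and the code is not taken from [11].

```python
def mvdtw_cost(q, c):
    """Cumulative warping cost of Equation (30) for multichannel series.

    q, c: lists of time steps, each time step a list of M channel values.
    Returns D(len(q), len(c)), the minimum cumulative squared distance.
    """
    n, k = len(q), len(c)
    inf = float("inf")
    # D[i][j] holds the cost of aligning the first i steps of q with the
    # first j steps of c; row and column 0 act as boundary conditions.
    D = [[inf] * (k + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            local = sum((a - b) ** 2 for a, b in zip(q[i - 1], c[j - 1]))
            D[i][j] = local + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][k]


real = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
stretched = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(mvdtw_cost(real, stretched))  # prints 0.0
```

The stretched copy has zero cost because warping absorbs the repeated first sample, whereas a point-by-point Euclidean distance cannot even be computed for series of different lengths.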

Other metrics used across different applications include:

Financial sector: Autocorrelation function (ACF) score and DY metric.
Temperature estimation: Nash-Sutcliffe model efficiency coefficient (NS), Willmott index of agreement (WI), and the Legates and McCabe index (LMI).
Audio generation: Normalized source-to-distortion ratio (NSDR), source-to-interference ratio (SIR), source-to-artifact ratio (SAR), and t-SNE [95].

For a full list of GAN architectures reviewed in this work, their applications, evaluation metrics, and datasets used in their respective experiments, see Table 2. Results for the sine wave and ECG generation using variants of GAN architectures can be found in Tables 3 and 4, respectively.

Table 2.

Application	GAN Architecture(s)	Dataset(s)	Evaluation Metrics
Medical/physiological generation	LSTM-LSTM [2, 31, 44, 45, 77, 97] LSTM-CNN [11, 21] BiLSTM-CNN [111] BiGridLSTM-CNN [49] CNN-CNN [33, 47] AE-CNN [81] FCNN [104]	EEG, ECG, EHRs, PPG, EMG, speech, NAF, MNIST, synthetic sets	TSTR, MMD, reconstruction error, DTW, PCC, IS, FID, ED, S-WD, RMSE, MAE, FD, PRD, averaging samples, WA, UAR, MV-DTW
Financial time series generation/prediction	TimeGAN [107] SigCWGAN [76] DAT-CGAN [91] Quant GAN [101]	S&P 500 index (SPX), Dow Jones index (DJI), ETFs	Marginal distributions, dependencies, TSTR, Wasserstein distance, EM distance, DY metric, ACF score, leverage effect score, discriminative score, predictive score
Time series estimation/prediction	LSTM-NN [67] LSTM-CNN [58] LSTM-MLP [58]	Meteorological data, Truven MarketScan dataset	RMSE, MAE, NS, WI, LMI
Audio generation	C-RNN-GAN [74] TGAN (variant) [16] RNN-FCN [109] DCGAN (variant) [60] CNN-CNN [57]	Nottingham dataset, midi music files, MIR-1K, TheSession, speech	Human perception, polyphony, scale consistency, tone span, repetitions, NSDR, SIR, SAR, FD, t-SNE, distribution of notes
Time series imputation/repairing	MTS-GAN [42] CNN-CNN [84] DCGAN (variant) [43] AE-GRUI [71] RGAN [92] FCN-FCN [15] GRUI-GRUI [70]	TEP, point machine, wind turbine data, PeMS, PhysioNet Challenge 2012, KDD CUP 2018, parking lot data,	Visually, MMD, MAE, MSE, RMSE, MRE, spatial similarity, AUC score
Anomaly detection	LSTM-LSTM [63] LSTM-(LSTM & CNN) [112] LSTM-LSTM (MAD-GAN) [66]	SET50, NYC taxi data, ECG, SWaT, WADI	Manipulated data used as a test set, ROC curve, precision, recall, F1, accuracy
Other time series generation	VAE-CNN [80]	Fixed length time series “vehicle and engine speed”	DTW, SSIM

For novel approaches, the GAN name is given as they have been covered already in Section 4.

Table 2. List of GAN Architectures, Their Applications, and Datasets Used in Their Experiments and Evaluation Metrics Used to Judge the Quality of the Respective GANs

Table 3.

Architecture	Loss Function	MMD (Toy Sine Dataset)	DTW (Toy Sine Dataset)	MSE (Toy Sine Dataset)
LSTM-LSTM	BCE	0.9527	91.1071	0.2308
LSTM-LSTM	MSE	0.0078	54.1644	0.1480
BiLSTM-LSTM	BCE	0.1215	428.4310	3.0700
BiLSTM-LSTM	MSE	0.9515	79.5607	0.2362
LSTM-CNN	BCE	0.006	55.3620	0.3154
LSTM-CNN	MSE	0.5757	86.7357	0.5643
BiLSTM-CNN	BCE	1.129E-05	129.9257	0.9193
BiLSTM-CNN	MSE	0.4891	43.2694	0.1869
GRU-CNN	BCE	0.0244	37.1630	0.2303
GRU-CNN	MSE	0.3727	42.7348	0.22823
FC-CNN	BCE	0.0039	58.3565	0.3048
FC-CNN	MSE	0.0117	43.3611	0.2972

Table 3. Experimental Results Comparing the Performance of Time Series GANs for Sinewave Generation

Table 4.

Architecture	Loss Function	MMD (MIT-BIH Arrhythmia Dataset)	DTW (MIT-BIH Arrhythmia Dataset)	MSE (MIT-BIH Arrhythmia Dataset)
LSTM-LSTM	BCE	0.9931	30.1816	0.0867
LSTM-LSTM	MSE	0.8842	44.4553	0.1389
BiLSTM-LSTM	BCE	0.9916	22.8634	0.0699
BiLSTM-LSTM	MSE	0.9737	23.5533	0.0806
LSTM-CNN	BCE	0.5519	13.0158	0.0151
LSTM-CNN	MSE	0.0005	24.7306	0.0457
BiLSTM-CNN	BCE	0.9246	117.3994	0.2272
BiLSTM-CNN	MSE	0.0687	22.6740	0.0586
GRU-CNN	BCE	0.0055	20.4845	0.0335
GRU-CNN	MSE	0.7704	108.4124	0.1948
FC-CNN	BCE	0.2068	23.9910	0.0309
FC-CNN	MSE	0.3082	18.2340	0.0212

Table 4. Experimental Results Comparing the Performance of Time Series GANs for ECG Generation on the MIT-BIH Dataset
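
To make the three columns of Tables 3 and 4 concrete, the following is a minimal sketch of how MMD, DTW, and MSE could be computed between real and generated series. It assumes an RBF kernel with a hand-picked `bandwidth` for MMD and an absolute-difference point cost for DTW; these choices, and all function names, are assumptions for illustration. The experiments behind the tables may have used different kernels, normalizations, or libraries, so this sketch is not expected to reproduce the reported values.

```python
import numpy as np


def mse(real, fake):
    """Mean squared error between two equally shaped arrays."""
    real, fake = np.asarray(real, dtype=float), np.asarray(fake, dtype=float)
    return float(np.mean((real - fake) ** 2))


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance between two 1-D series."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed pointwise cost
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def mmd_rbf(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x and y.

    x and y have shape (n_samples, sequence_length); bandwidth is an
    assumed RBF kernel width, not a value from the article.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)

    def kernel(p, q):
        sq = np.sum(p ** 2, axis=1)[:, None] + np.sum(q ** 2, axis=1)[None, :] - 2.0 * p @ q.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))

    return float(kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean())
```

Lower values indicate closer agreement for all three measures; MMD compares sets of sequences, whereas DTW and MSE compare individual sequences.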

7 PRIVACY

As well as evaluating the quality of the data, a wide range of methods have been used to evaluate and mitigate the privacy risk associated with synthetic data created by GANs.

7.1 Differential Privacy

The goal of differential privacy is to preserve the underlying privacy of a database. An algorithm or, more specifically, a GAN achieves differential privacy if, by looking at the generated samples, we cannot identify whether a given sample was included in the training set. As GANs attempt to model the training dataset, the problem of privacy lies in capturing and generating useful information about the training set population without the possibility of linkage from a generated sample to an individual’s data [27].

As discussed previously, one of the main goals of GANs is to augment existing under-resourced datasets for use in downstream applications, such as the upskilling of clinicians where healthcare data is involved. Such sensitive personal data must come with privacy guarantees, and the rigorous mathematical definition of differential privacy [28] offers this assurance.
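
For reference, the standard (ε, δ) formulation of this definition can be stated as follows. The symbols are introduced here for illustration only: \(\mathcal{M}\) is a randomized mechanism (for example, the procedure that trains a GAN), \(D\) and \(D'\) are any two datasets differing in a single record, and \(S\) is any set of possible outputs.

```latex
\Pr\left[\mathcal{M}(D) \in S\right] \;\le\; e^{\varepsilon}\,\Pr\left[\mathcal{M}(D') \in S\right] + \delta
```

Smaller values of ε and δ mean that the presence or absence of any single record changes the output distribution less, which corresponds to a stronger guarantee.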

Work is ongoing to develop machine learning methods with privacy-preserving mechanisms such as differential privacy. Abadi et al. [1] demonstrated the ability to train deep NNs with differential privacy and implemented a mechanism for tracking privacy loss. Xie et al. [103] proposed a differentially private GAN (DPGAN) that achieves differential privacy by adding noise to the gradients during the training phase.
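
The gradient-perturbation idea can be sketched as follows, in the general spirit of differentially private stochastic gradient descent: each per-example gradient is clipped to a maximum norm, the clipped gradients are summed, and Gaussian noise scaled to the clipping norm is added before averaging. This is a minimal sketch under stated assumptions rather than the exact procedure of [1] or of DPGAN; the function name and the `clip_norm` and `noise_multiplier` parameters are hypothetical, and privacy accounting is omitted.

```python
import numpy as np


def dp_average_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Return a noisy, clipped average of per-example gradients.

    per_example_grads: array of shape (batch_size, n_params).
    clip_norm and noise_multiplier are illustrative hyperparameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(per_example_grads, dtype=float)
    # Scale each gradient so that its L2 norm is at most clip_norm.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise with standard deviation proportional to the clipping norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]
```

In a GAN setting, such a step would typically be applied to the network that sees real data (the discriminator), although which parameters are perturbed is a design choice that varies between works.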

7.2 Decentralized/Federated Learning

Distributed or decentralized learning is another method for limiting the privacy risk associated with personal and sensitive personal data in machine learning. Standard approaches to machine learning require that all training data be kept on one server. Decentralized/distributed approaches to GAN algorithms require large communication bandwidth to ensure convergence [5, 46] and are also subject to strict privacy constraints. A newer method that enables communication-efficient collaborative learning on a shared model while keeping all of the training data decentralized is known as federated learning [73]. Rasouli et al. [86] applied a federated learning algorithm to a GAN for communication-efficient distributed learning and proved the convergence of their federated learning GAN (FedGAN). However, they did not experiment with differential privacy in this study, noting it instead as an avenue for future work.
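
The core aggregation step of federated learning can be sketched as a data-size-weighted average of locally trained parameters, in the style of federated averaging [73]. This is a minimal sketch, not the FedGAN algorithm itself: the function name is hypothetical, each client's model is assumed to be a list of NumPy arrays, and synchronization schedules and communication are left out.

```python
import numpy as np


def federated_average(client_weights, client_sizes):
    """Combine client models by averaging parameters weighted by local dataset size.

    client_weights: list (one item per client) of lists of arrays, one array per layer.
    client_sizes: number of training samples held by each client.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    fractions = sizes / sizes.sum()
    n_layers = len(client_weights[0])
    averaged = []
    for k in range(n_layers):
        layer = sum(f * np.asarray(w[k], dtype=float) for f, w in zip(fractions, client_weights))
        averaged.append(layer)
    return averaged
```

For a GAN, the same averaging could be applied separately to the generator and discriminator parameters, so that raw training data never leaves the clients.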

Combining the preceding techniques of federated learning and differential privacy in developing new GAN algorithms would lead to a fully decentralized private GAN capable of generating data without leakage of private information from the source data. This is clearly an open research avenue for the community.

7.3 Assessment of Privacy Preservation

We can also assess how well the generative model was able to protect our privacy through tests known as attribute and presence disclosure [17]. The latter test is more commonly known in the machine learning space as a membership inference attack. This has become a quantitative assessment of how machine learning models leak information about the individual data records on which they were trained [89]. Membership inference attacks attempt to detect the data that was used to train a target model without the attacker having access to the model’s parameters. A nefarious actor creates random records for a target machine learning model. The attacker then feeds each record into the model. The model returns a confidence score, and based on this score, the records are fine-tuned until a higher confidence score is returned. This process continues until the model returns a very high score, at which stage the record will be nearly identical to one of the examples used in the training dataset. These steps are repeated until enough dataset examples have been generated. The fake records are then used to train an ensemble of models to predict whether a data record was used in the training set of the target model.
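
The full attack described above requires training an ensemble of attack models. As a much simpler stand-in that still illustrates how presence disclosure can be quantified, the following sketch scores a confidence-threshold attack: given the target model's confidence on known training records and on held-out records, it computes the area under the ROC curve (AUC) of an attacker who guesses "member" whenever confidence is high. The function name and inputs are hypothetical, and this is a simplification rather than the procedure of [89].

```python
import numpy as np


def membership_auc(member_scores, nonmember_scores):
    """AUC of a threshold attack that labels high-confidence records as members.

    Equals the probability that a random training record receives a higher
    confidence score than a random held-out record (ties count as one half).
    """
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float(np.mean((m > n) + 0.5 * (m == n)))
```

An AUC near 0.5 suggests the attacker cannot distinguish training records from unseen ones, whereas values approaching 1 indicate that the model leaks membership information.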

Hayes et al. [48] carried out membership inference attacks on synthetic images and concluded that, for acceptable levels of privacy in the GAN, the quality of the generated data is sacrificed. Conversely, others have followed this approach and found that differentially private networks can successfully generate data that adheres to differential privacy and resists membership inference attacks without too much degradation in the quality of the generated data [11, 21, 31].

8 DISCUSSION

We have presented a survey of time series GAN variants that have made significant progress in addressing the primary challenges identified in Section 3.2. These GANs introduced the idea of both discrete and continuous sequential data generation and have made incremental improvements over one another via an architecture variant or a modified objective function capable of capturing the spatio-temporal dependencies present in these data types. The loss functions implemented in these works for some architectures will not necessarily generalize to others; hence, they become architecture specific. The architecture choices of the time series GANs affect both the quality and diversity of the data. However, there remain open problems in terms of the practical implementation of the generated data and GANs in real-world applications, particularly in health applications where the performance of these models can directly affect patients’ quality of care/treatment.

The “best” GAN architecture and objective function are yet to be determined. This is because humans have manually designed most architectures. As a result, there is growing interest in automated neural architecture search (NAS) methods [30], which automate the architecture engineering aspect of machine learning. NAS is a growing branch of automatic machine learning (AutoML) and automatic deep learning (AutoDL), which seek to optimize the processes around machine learning. Work has been done in the image domain with NAS and GANs [35]. This method, named AutoGAN, achieved highly competitive performance compared to state-of-the-art human-engineered GANs. This is a promising area for time series GANs; to the authors’ knowledge, it is as yet unexplored.

As it stands, GANs tend to be application specific; they perform well for their intended purpose but do not generalize well beyond their original domain. Furthermore, a major limitation of time series GANs is the restriction on the sequence length that a given architecture can manage; documented experiments validating how well a time series GAN can adapt to varying data lengths are notably absent at the time of writing. However, work in the NLP literature on Transformers [96] has demonstrated some applicability to varying sequence lengths, which may prove beneficial in addressing this issue and might emerge in time series generation given time.

Another aspect outside the scope of this survey article, but important to note, is how GANs can deal with issues such as scalability and real-time data. Given its importance, we present some draft ideas and direct the interested reader to further resources on full-stack machine learning in general. Thankfully, the emerging practice of machine learning operations (MLOps) addresses most concerns surrounding the retraining of models once real-time data begins to diverge from the dataset on which they were originally trained [14, 94]. This can be applied to GANs, whereby the datasets encountered in production are passed through a metric-based process that assesses their divergence from the original data and selects subsequent data for retraining, allowing for reliable machine learning solutions that scale. For a parallel computing approach, we would consider federated learning, as referenced previously, where the GAN is trained on subsets of the data and the resulting models are combined following training.
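
One way such a divergence check could look is sketched below, using the two-sample Kolmogorov-Smirnov statistic on the pooled values of the original and production data. The choice of this statistic, the function names, and the `threshold` value are all assumptions made for illustration; the article does not prescribe a particular metric, and alternatives such as MMD would fit the same pattern.

```python
import numpy as np


def ks_statistic(reference, production):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between empirical CDFs."""
    ref = np.sort(np.asarray(reference, dtype=float).ravel())
    prod = np.sort(np.asarray(production, dtype=float).ravel())
    grid = np.concatenate([ref, prod])
    cdf_ref = np.searchsorted(ref, grid, side="right") / ref.size
    cdf_prod = np.searchsorted(prod, grid, side="right") / prod.size
    return float(np.max(np.abs(cdf_ref - cdf_prod)))


def needs_retraining(reference, production, threshold=0.2):
    """Flag drift when the statistic exceeds an assumed, application-specific threshold."""
    return ks_statistic(reference, production) > threshold
```

In practice, the threshold would need to be calibrated per application, and the check would be run periodically on incoming batches.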

9 CONCLUSION

This article reviews a niche but growing use of GANs for time series data based mainly around architectural evolution and loss function variants. We see that each GAN provides application-specific performance and does not necessarily generalize well to other applications; for example, a GAN for generating high-quality physiological time series may not produce high-fidelity audio due to some limitation imposed by the architecture or loss function. A detailed review of the applications of time series GANs to real-world problems has been provided, along with their datasets and the evaluation metrics used for each domain. As stated in the work of Wang et al. [100], GAN-related research for time series lags that of computer vision both in terms of performance and defined rules for generalization of models. This review has highlighted the open challenges in this area and offers directions for future work and technological innovation, particularly for those GAN aspects related to evaluation, privacy, and decentralized learning.

Footnotes

¹ SeqGAN GitHub: https://github.com/LantaoYu/SeqGAN/.
² C-RNN-GAN GitHub: https://github.com/olofmogren/c-rnn-gan/.
³ TimeGAN GitHub: https://github.com/jsyoon0823/TimeGAN.
⁴ SigCWGAN GitHub: https://github.com/SigCGANs/Conditional-Sig-Wasserstein-GANs/.
⁵ RCGAN GitHub: https://github.com/ratschlab/RGAN/.

REFERENCES

[1] Abadi Martín, McMahan H. Brendan, Chu Andy, Mironov Ilya, Zhang Li, Goodfellow Ian, and Talwar Kunal. 2016. Deep learning with differential privacy. In Proceedings of the ACM Conference on Computer and Communications Security (CCS’16). 308–318.
[2] Abdelfattah Sherif M., Abdelrahman Ghodai M., and Wang Min. 2018. Augmenting the size of EEG datasets using generative adversarial networks. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN’18). IEEE, Los Alamitos, CA, 1–6.
[3] Alqahtani Hamed, Kavakli-Thorne Manolya, and Kumar Gulshan. 2019. Applications of generative adversarial networks (GANs): An updated review. Archives of Computational Methods in Engineering 28 (Dec. 2019), 525–552.
[4] Arjovsky Martin, Chintala Soumith, and Bottou Léon. 2017. Wasserstein GAN. arXiv preprint arXiv:1701.07875.
[5] Augenstein Sean, McMahan H. Brendan, Ramage Daniel, Ramaswamy Swaroop, Kairouz Peter, Chen Mingqing, Mathews Rajiv, and Arcas Blaise Aguera y. 2020. Generative models for effective ML on private, decentralized datasets. arXiv:1911.06679 [cs.LG].
[6] Bengio Samy, Vinyals Oriol, Jaitly Navdeep, and Shazeer Noam. 2015. Scheduled sampling for sequence prediction with recurrent neural networks. arXiv:1506.03099 [cs.LG].
[7] Bollerslev Tim. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 3 (1986), 307–327. https://EconPapers.repec.org/RePEc:eee:econom:v:31:y:1986:i:3:p:307-327.
[8] Borji Ali. 2018. Pros and cons of GAN evaluation measures. arXiv:1802.03446 [cs.CV].
[9] Borji Ali. 2019. Pros and cons of GAN evaluation measures. Computer Vision and Image Understanding 179 (2019), 41–65.
[10] Borji Ali. 2021. Pros and cons of GAN evaluation measures: New developments. arXiv:2103.09396 [cs.LG].
[11] Brophy Eoin. 2020. Synthesis of dependent multichannel ECG using generative adversarial networks. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM’20). ACM, New York, NY, 3229–3232.
[12] Brophy Eoin, Wang Zhengwei, and Ward Tomas E. 2019. Quick and easy time series generation with established image-based GANs. arXiv:1902.05624 [cs.LG].
[13] Bryant Fred B. and Yarnold Paul R. 1995. Principal-components analysis and exploratory and confirmatory factor analysis. In Reading and Understanding Multivariate Statistics. American Psychological Association, Washington, DC, 99–136.
[14] Burkov Andriy. 2020. Machine Learning Engineering. True Positive Inc.
[15] Chen Yuanyuan, Lv Yisheng, and Wang Fei-Yue. 2020. Traffic flow imputation using parallel data and generative adversarial networks. IEEE Transactions on Intelligent Transportation Systems 21, 4 (April 2020), 1624–1630.
[16] Cheng Ping-Sung, Lai Chieh-Ying, Chang Chun-Chieh, Chiou Shu-Fen, and Yang Yu-Chieh. 2020. A variant model of TGAN for music generation. In Proceedings of the 2020 Asia Service Sciences and Software Engineering Conference. ACM, New York, NY, 40–45.
[17] Choi Edward, Biswal Siddharth, Malin Bradley, Duke Jon, Stewart Walter F., and Sun Jimeng. 2017. Generating multi-label discrete patient records using generative adversarial networks. arXiv:1703.06490.
[18] Choi Yeji, Lim Hyunki, Choi Heeseung, and Kim Ig-Jae. 2020. GAN-based anomaly detection and localization of multivariate time series data for power plant. In Proceedings of the 2020 IEEE International Conference on Big Data and Smart Computing (BigComp’20). IEEE, Los Alamitos, CA, 71–74.
[19] Culnane Chris, Rubinstein Benjamin I. P., and Teague Vanessa. 2017. Health data in an open world. arXiv:1712.05627.
[20] Dau Hoang Anh, Keogh Eamonn, Kamgar Kaveh, Yeh Chin-Chia Michael, Zhu Yan, Gharghabi Shaghayegh, Ratanamahatana Chotirat Ann, et al. 2018. UCR Time Series Classification Archive. Retrieved September 7, 2022 from https://www.cs.ucr.edu/eamonn/time_series_data_2018/.
[21] Delaney Anne Marie, Brophy Eoin, and Ward Tomás E. 2019. Synthesis of realistic ECG using generative adversarial networks. arXiv:1909.09150.
[22] Deng Jia, Dong Wei, Socher Richard, Li Li-Jia, Li Kai, and Fei-Fei Li. 2009. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Los Alamitos, CA, 248–255.
[23] Detti Paolo, Vatti Giampaolo, and Lara Garazi Zabalo Manrique de. 2020. EEG synchronization analysis for seizure prediction: A study on data of noninvasive recordings. Processes 8, 7 (2020), 846.
[24] Donahue Chris, McAuley Julian, and Puckette Miller. 2019. Adversarial audio synthesis. arXiv:1802.04208 [cs.SD].
[25] Dorffner Georg. 1996. Neural networks for time series processing. Neural Network World 6 (1996), 447–468.
[26] Dua Dheeru and Graff Casey. 2017. UCI Machine Learning Repository. Retrieved September 7, 2022 from http://archive.ics.uci.edu/ml.
[27] Dwork Cynthia. 2006. Differential privacy. In Proceedings of the 33rd International Conference on Automata, Languages, and Programming—Volume Part II (ICALP’06). 1–12.
[28] Dwork Cynthia and Roth Aaron. 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9, 3–4 (Aug. 2014), 211–407.
[29] Emam Khaled El, Jonker Elizabeth, Arbuckle Luk, and Malin Bradley. 2011. A systematic review of re-identification attacks on health data. PLoS One 6 (2011), e28071.
[30] Elsken Thomas, Metzen Jan Hendrik, and Hutter Frank. 2019. Neural architecture search: A survey. Journal of Machine Learning Research 20, 1 (2019), 1997–2017.
[31] Esteban Cristóbal, Hyland Stephanie L., and Rätsch Gunnar. 2017. Real-valued (medical) time series generation with recurrent conditional GANs. arXiv:1706.02633.
[32] Union European. 2018. Data Protection Act 2018 (Section36(2)). Retrieved September 7, 2022 from http://www.irishstatutebook.ie/eli/2018/si/314/made/en/pdf.
[33] Fahimi Fatemeh, Zhang Zhuo, Goh Wooi Boon, Ang Kai Keng, and Guan Cuntai. 2019. Towards EEG generation using GANs for BCI applications. In Proceedings of the 2019 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI’19). IEEE, Los Alamitos, CA, 1–4.
[34] Fei Hongxiao and Tan Fengyun. 2018. Bidirectional grid long short-term memory (BiGridLSTM): A method to address context-sensitivity and vanishing gradient. Algorithms 11, 11 (2018), 172.
[35] Gong Xinyu, Chang Shiyu, Jiang Yifan, and Wang Zhangyang. 2019. AutoGAN: Neural architecture search for generative adversarial networks. arXiv:1908.03835.
[36] Gonog Liang and Zhou Yimin. 2019. A review: Generative adversarial networks. In Proceedings of the 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA’19). IEEE, Los Alamitos, CA, 505–510.
[37] Goodfellow Ian. 2016. Generative Adversarial Networks for Text. Retrieved September 7, 2022 from https://www.reddit.com/r/MachineLearning/comments/40ldq6/generative_adversarial_networks_for_text/.
[38] Goodfellow Ian, Pouget-Abadie Jean, Mirza Mehdi, Xu Bing, Warde-Farley David, Ozair Sherjil, Courville Aaron, and Bengio Yoshua. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems, Ghahramani Z., Welling M., Cortes C., Lawrence N., and Weinberger K. Q. (Eds.), Vol. 27. Curran Associates, Montréal, Canada. https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf.
[39] Gretton Arthur, Borgwardt Karsten M., Rasch Malte J., Schölkopf Bernhard, and Smola Alexander. 2012. A kernel two-sample test. Journal of Machine Learning Research 13 (March 2012), 723–773.
[40] Gui Jie, Sun Zhenan, Wen Yonggang, Tao Dacheng, and Ye Jieping. 2020. A review on generative adversarial networks: Algorithms, theory, and applications. arXiv:2001.06937 [cs.LG].
[41] Guibas John T., Virdi Tejpal S., and Li Peter S. 2017. Synthetic medical images from dual generative adversarial networks. arXiv:1709.01872.
[42] Guo Zijian, Wan Yiming, and Ye Hao. 2019. A data imputation method for multivariate time series based on generative adversarial network. Neurocomputing 360 (Sept. 2019), 185–197.
[43] Han Lingyi, Zheng Kan, Zhao Long, Wang Xianbin, and Wen Huimin. 2020. Content-aware traffic data completion in ITS based on generative adversarial nets. IEEE Transactions on Vehicular Technology 69, 10 (Oct. 2020), 11950–11962.
[44] Harada Shota, Hayashi Hideaki, and Uchida Seiichi. 2018. Biosignal data augmentation based on generative adversarial networks. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC’18). IEEE, Los Alamitos, CA, 368–371.
[45] Harada Shota, Hayashi Hideaki, and Uchida Seiichi. 2019. Biosignal generation and latent variable analysis with recurrent generative adversarial networks. IEEE Access 7 (2019), 144292–144302.
[46] Hardy Corentin, Merrer Erwan Le, and Sericola Bruno. 2019. MD-GAN: Multi-discriminator generative adversarial networks for distributed datasets. arXiv:1811.03850 [cs.LG].
[47] Hartmann Kay Gregor, Schirrmeister Robin Tibor, and Ball Tonio. 2018. EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signals. arXiv:1806.01875.
[48] Hayes Jamie, Melis Luca, Danezis George, and Cristofaro Emiliano De. 2019. LOGAN: Membership inference attacks against generative models. Proceedings on Privacy Enhancing Technologies 2019, 1 (2019), 133–152.
[49] Hazra Debapriya and Byun Yung-Cheol. 2020. SynSigGAN: Generative adversarial networks for synthetic biomedical signal generation. Biology 9, 12 (Dec. 2020), 441.
[50] Hejblum Boris P., Weber Griffin M., Liao Katherine P., Palmer Nathan P., Churchill Susanne, Shadick Nancy A., Szolovits Peter, Murphy Shawn N., Kohane Isaac S., and Cai Tianxi. 2019. Probabilistic record linkage of de-identified research datasets with discrepancies using diagnosis codes. Scientific Data 6 (2019), Article 180298, 11 pages.
[51] Heusel Martin, Ramsauer Hubert, Unterthiner Thomas, Nessler Bernhard, and Hochreiter Sepp. 2018. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. arXiv:1706.08500 [cs.LG].
[52] Hjelm R. Devon, Jacob Athul Paul, Che Tong, Trischler Adam, Cho Kyunghyun, and Bengio Yoshua. 2018. Boundary-seeking generative adversarial networks. arXiv:1702.08431 [stat.ML].
[53] Hochreiter Sepp and Schmidhuber Jurgen. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[54] Im Daniel Jiwoong, Kim Chris Dongjoo, Jiang Hui, and Memisevic Roland. 2016. Generating images with recurrent adversarial networks. arXiv:1602.05110.
[55] Institute Oxford-Man. 2021. Oxford-Man Institute of Quantitative Finance: Realized Library. Retrieved April 30, 2021 from https://realized.oxford-man.ox.ac.uk.
[56] Johnson Alistair E. W., Pollard Tom J., Shen Lu, Lehman LiWei H., Feng Mengling, Ghassemi Mohammad, Moody Benjamin, Szolovits Peter, Celi Leo Anthony, and Mark Roger G. 2016. MIMIC-III, a freely accessible critical care database. Scientific Data 3 (2016), 160035.
[57] Juvela Lauri, Bollepalli Bajibabu, Yamagishi Junichi, and Alku Paavo. 2019. Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks. In Proceedings of the 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’19). IEEE, Los Alamitos, CA, 6915–6919.
[58] Kaushik Shruti, Choudhury Abhinav, Natarajan Sayee, Pickett Larry A., and Dutt Varun. 2020. Medicine expenditure prediction via a variance-based generative adversarial network. IEEE Access 8 (2020), 110947–110958.
[59] Kiyasseh Dani, Tadesse Girmaw Abebe, Nhan Le Nguyen Thanh, Tan Le Van, Thwaites Louise, Zhu Tingting, and Clifton David. 2020. PlethAugment: GAN-based PPG augmentation for medical diagnosis in low-resource settings. IEEE Journal of Biomedical and Health Informatics 24, 11 (Nov. 2020), 3226–3235.
[60] Kolokolova Antonina, Billard Mitchell, Bishop Robert, Elsisy Moustafa, Northcott Zachary, Graves Laura, Nagisetty Vineel, and Patey Heather. 2020. GANs & reels: Creating Irish music using a generative adversarial network. arXiv:2010.15772 [cs.SD].
[61] Krizhevsky Alex and Hinton Geoffrey. 2009. Learning Multiple Layers of Features from Tiny Images. Technical Report. University of Toronto, Toronto, Ontario.
[62] Lapata Mirella. 2015. EMNLP14. Retrieved April 30, 2021 from http://homepages.inf.ed.ac.uk/mlap/Data/EMNLP14/.
[63] Leangarun Teema, Tangamchit Poj, and Thajchayapong Suttipong. 2018. Stock price manipulation detection using generative adversarial networks. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI’18). IEEE, Los Alamitos, CA, 2104–2111.
[64] LeCun Y., Bottou L., Bengio Y., and Haffner P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (1998), 2278–2324.
[65] Ledig Christian, Theis Lucas, Huszár Ferenc, Caballero Jose, Cunningham Andrew, Acosta Alejandro, Aitken Andrew, et al. 2017. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17). 105–114.
[66] Li Dan, Chen Dacheng, Jin Baihong, Shi Lei, Goh Jonathan, and Ng See-Kiong. 2019. MAD-GAN: Multivariate anomaly detection for time series data with generative adversarial networks. In Artificial Neural Networks and Machine Learning—ICANN 2019: Text and Time Series. Lecture Notes in Computer Science, Vol. 11730. Springer, 703–716.
[67] Li Qingliang, Hao Huibowen, Zhao Yang, Geng Qingtian, Liu Guangjie, Zhang Yu, and Yu Fanhua. 2020. GANs-LSTM model for soil temperature estimation from meteorological: A new approach. IEEE Access 8 (2020), 59427–59443.
[68] Li Yujia, Swersky Kevin, and Zemel Richard. 2015. Generative moment matching networks. arXiv:1502.02761 [cs.LG].
[69] Luer Fiete, Mautz Dominik, and Bohm Christian. 2019. Anomaly detection in time series using generative adversarial networks. In Proceedings of the 2019 International Conference on Data Mining Workshops (ICDMW’19). IEEE, Los Alamitos, CA, 1047–1048.
[70] Luo Yonghong, Cai Xiangrui, Zhang Ying, Xu Jun, and Xiaojie Yuan. 2018. Multivariate time series imputation with generative adversarial networks. In Advances in Neural Information Processing Systems, Bengio S., Wallach H., Larochelle H., Grauman K., Cesa-Bianchi N., and Garnett R. (Eds.), Vol. 31. Curran Associates, Montréal, Canada, 1596–1607. https://proceedings.neurips.cc/paper/2018/file/96b9bff013acedfb1d140579e2fbeb63-Paper.pdf.
[71] Luo Yonghong, Zhang Ying, Cai Xiangrui, and Yuan Xiaojie. 2019. E\(^2\)GAN: End-to-end generative adversarial network for multivariate time series imputation. In Proceedings of the 28th International Joint Conference on Artificial Intelligence. 3094–3100.
[72] Malin B. and Sweeney L. 2001. Re-identification of DNA through an automated linkage process. In Proceedings of the AMIA Symposium. 423–427. https://pubmed.ncbi.nlm.nih.gov/11825223.
[73] McMahan H. Brendan, Moore Eider, Ramage Daniel, Hampson Seth, and Arcas Blaise Agüera y. 2017. Communication-efficient learning of deep networks from decentralized data. arXiv:1602.05629 [cs.LG].
[74] Mogren Olof. 2016. C-RNN-GAN: Continuous recurrent neural networks with adversarial training. arXiv:1611.09904 [cs.AI].
[75] Moody G. B. and Mark R. G. 2001. The impact of the MIT-BIH arrhythmia database. IEEE Engineering in Medicine and Biology Magazine 20, 3 (2001), 45–50.
[76] Ni Hao, Szpruch Lukasz, Wiese Magnus, Liao Shujian, and Xiao Baoren. 2020. Conditional Sig-Wasserstein GANs for time series generation. arXiv:2006.05421 [cs.LG].
[77] Nikolaidis Konstantinos, Kristiansen Stein, Goebel Vera, Plagemann Thomas, Liestøl Knut, and Kankanhalli Mohan. 2019. Augmenting physiological time series data: A case study for sleep apnea detection. arXiv:1905.09068 [cs.LG].
[78] Pan S. J. and Yang Q. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22, 10 (2010), 1345–1359.
[79] Papineni Kishore, Roukos Salim, Ward Todd, and Zhu Wei-Jing. 2002. Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 311–318.
[80] Parthasarathy Dhasarathy, Bäckström Karl, Henriksson Jens, and Einarsdóttir Sólrún. 2020. Controlled time series generation for automotive software-in-the-loop testing using GANs. arXiv:2002.06611 [cs.LG].
[81] Pascual Damian, Amirshahi Alireza, Aminifar Amir, Atienza David, Ryvlin Philippe, and Wattenhofer Roger. 2020. EpilepsyGAN: Synthetic epileptic brain activities with privacy preservation. IEEE Transactions on Biomedical Engineering 67 (2020), 1. Google ScholarCross Ref
Reference
[82] Pimentel Marco A. F., Johnson Alistair E. W., Charlton Peter H., Birrenkott Drew, Watkinson Peter J., Tarassenko Lionel, and Clifton David A.. 2017. Toward a robust estimation of respiratory rate from pulse oximeters. IEEE Transactions on Biomedical Engineering 64, 8 (2017), 1914–1923. Google ScholarCross Ref
Reference
[83] Pollard Tom J., Johnson Alistair E. W., Raffa Jesse D., Celi Leo A., Mark Roger G., and Badawi Omar. 2018. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Scientific Data 5, 1 (Sept. 2018), 180178. Google ScholarCross Ref
Reference
[84] Qu Fuming, Liu Jinhai, Ma Yanjuan, Zang Dong, and Fu Mingrui. 2020. A novel wind turbine data imputation method with multiple optimizations based on GANs. Mechanical Systems and Signal Processing 139 (May2020), 106610. Google ScholarCross Ref
Reference
[85] Radford Alec, Metz Luke, and Chintala Soumith. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434.
[86] Rasouli Mohammad, Sun Tao, and Rajagopal Ram. 2020. FedGAN: Federated generative adversarial networks for distributed data. arXiv:2006.07228 [cs.LG].
[87] Reed Scott, Akata Zeynep, Yan Xinchen, Logeswaran Lajanugen, Schiele Bernt, and Lee Honglak. 2016. Generative adversarial text to image synthesis. In Proceedings of the 33rd International Conference on Machine Learning, M. F. Balcan and K. Q. Weinberger (Eds.). PMLR, New York, NY, 1681–1690.
[88] Salimans Tim, Goodfellow Ian, Zaremba Wojciech, Cheung Vicki, Radford Alec, and Chen Xi. 2016. Improved techniques for training GANs. In Advances in Neural Information Processing Systems, Lee D., Sugiyama M., Luxburg U., Guyon I., and Garnett R. (Eds.), Vol. 29. Curran Associates, Barcelona, Spain, 2234–2242. https://proceedings.neurips.cc/paper/2016/file/8a3363abe792db2d8761d6403605aeb7-Paper.pdf.
[89] Shokri Reza, Stronati Marco, Song Congzheng, and Shmatikov Vitaly. 2017. Membership inference attacks against machine learning models. arXiv:1610.05820 [cs.CR].
[90] Sumiya Yuki, Horie Kazumasa, Shiokawa Hiroaki, and Kitagawa Hiroyuki. 2019. NR-GAN: Noise reduction GAN for mice electroencephalogram signals. In Proceedings of the 2019 4th International Conference on Biomedical Imaging, Signal Processing. ACM, New York, NY, 94–101.
[91] Sun He, Deng Zhun, Chen Hui, and Parkes David C. 2020. Decision-aware conditional GANs for time series data. arXiv:2009.12682 [cs.LG].
[92] Sun Yuqiang, Peng Lei, Li Huiyun, and Sun Min. 2018. Exploration on spatiotemporal data repairing of parking lots based on recurrent GANs. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC’18). IEEE, Los Alamitos, CA, 467–472.
[93] Sutherland Dougal J., Tung Hsiao-Yu, Strathmann Heiko, De Soumyajit, Ramdas Aaditya, Smola Alex, and Gretton Arthur. 2016. Generative models and model criticism via optimized maximum mean discrepancy. arXiv:1611.04488.
[94] Treveil Mark and Team Dataiku. 2020. Introducing MLOps. O’Reilly.
[95] Maaten Laurens van der and Hinton Geoffrey. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, 86 (2008), 2579–2605. http://jmlr.org/papers/v9/vandermaaten08a.html.
[96] Vaswani Ashish, Shazeer Noam, Parmar Niki, Uszkoreit Jakob, Jones Llion, Gomez Aidan N., Kaiser Łukasz, and Polosukhin Illia. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17). Curran Associates Inc., Red Hook, NY, USA, 6000–6010.
[97] Wang Lu, Zhang Wei, and He Xiaofeng. 2019. Continuous patient-centric sequence generation via sequentially coupled adversarial learning. In Database Systems for Advanced Applications, Li Guoliang, Yang Jun, Gama Joao, Natwichai Juggapong, and Tong Yongxin (Eds.). Springer International, Cham, Switzerland, 36–52.
[98] Wang Zhengwei, Healy Graham, Smeaton Alan F., and Ward Tomas E. 2020. Use of neural signals to evaluate the quality of generative adversarial network performance in facial image generation. Cognitive Computation 12, 1 (2020), 13–24.
[99] Wang Zhengwei, She Qi, Smeaton Alan F., Ward Tomas E., and Healy Graham. 2020. Synthetic-neuroscore: Using a neuro-AI interface for evaluating generative adversarial networks. Neurocomputing 405 (2020), 26–36.
[100] Wang Zhengwei, She Qi, and Ward Tomás E. 2021. Generative adversarial networks in computer vision: A survey and taxonomy. ACM Computing Surveys 54, 2 (2021), 1–38.
[101] Wiese Magnus, Knobloch Robert, Korn Ralf, and Kretschmer Peter. 2020. Quant GANs: Deep generation of financial time series. Quantitative Finance 20, 9 (Sept. 2020), 1419–1440. arXiv:1907.06673.
[102] Winiger Samim. 2015. Obama Political Speech Generator—Recurrent Neural Network. Retrieved April 30, 2021 from https://github.com/samim23/obama-rnn.
[103] Xie Liyang, Lin Kaixiang, Wang Shu, Wang Fei, and Zhou Jiayu. 2018. Differentially private generative adversarial network. arXiv:1802.06739.
[104] Yi L. and Mak M. 2019. Adversarial data augmentation network for speech emotion recognition. In Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC’19). IEEE, Los Alamitos, CA, 529–534.
[105] Yi Xin, Walia Ekta, and Babyn Paul. 2019. Generative adversarial network in medical imaging: A review. Medical Image Analysis 58 (2019), 101552.
[106] Yinka-Banjo Chika and Ugot Ogban-Asuquo. 2020. A review of generative adversarial networks and its application in cybersecurity. Artificial Intelligence Review 53, 3 (March 2020), 1721–1736.
[107] Yoon Jinsung, Jarrett Daniel, and Schaar Mihaela van der. 2019. Time-series generative adversarial networks. In Advances in Neural Information Processing Systems, Wallach H., Larochelle H., Beygelzimer A., d’Alché-Buc F., Fox E., and Garnett R. (Eds.), Vol. 32. Curran Associates, Vancouver, Canada, 5508–5518. https://proceedings.neurips.cc/paper/2019/file/c9efe5f26cd17ba6216bbe2a7d26d490-Paper.pdf.
[108] Yu Lantao, Zhang Weinan, Wang Jun, and Yu Yong. 2017. SeqGAN: Sequence generative adversarial nets with policy gradient. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI’17). 2852–2858.
[109] Zhang Hui, Xiao Niannao, Liu Peishun, Wang Zhicheng, and Tang Ruichun. 2020. G-RNN-GAN for singing voice separation. In Proceedings of the 2020 5th International Conference on Multimedia Systems and Signal Processing. ACM, New York, NY, 69–73.
[110] Zhang Zixing, Han Jing, Qian Kun, Janott Christoph, Guo Yanan, and Schuller Bjoern. 2019. Snore-GANs: Improving automatic snore sound classification with synthesized data. arXiv:1903.12422 [cs.LG].
[111] Zhu Fei, Ye Fei, Fu Yuchen, Liu Quan, and Shen Bairong. 2019. Electrocardiogram generation with a bidirectional LSTM-CNN generative adversarial network. Scientific Reports 9, 1 (2019), Article 6734, 11 pages.
[112] Zhu Guangxuan, Zhao Hongbo, Liu Haoqiang, and Sun Hua. 2019. A novel LSTM-GAN algorithm for time series anomaly detection. In Proceedings of the 2019 Prognostics and System Health Management Conference (PHM-Qingdao’19). IEEE, Los Alamitos, CA, 1–6.

Index Terms

Generative Adversarial Networks in Time Series: A Systematic Literature Review
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches

Published in
ACM Computing Surveys Volume 55, Issue 10
October 2023
772 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3567475
Editor:
Albert Zomaya
University of Sydney, Australia
Copyright © 2023 Copyright held by the owner/author(s).
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 February 2023
- Online AM: 31 August 2022
- Accepted: 21 August 2022
- Revised: 12 August 2022
- Received: 18 June 2021

Author Tags
Generative adversarial networks
time series
discrete-variant GANs
continuous-variant GANs
Qualifiers
- survey
- Refereed