Market efficiency assumes that prices in financial markets are perfectly informative and, therefore, it is not possible to design trading strategies that outperform the market. The concept of efficiency has important implications for financial stability and, consequently, for financial policies. If asset returns exhibit persistent or anti-persistent behavior, then predictability based on past returns might be possible, which would be a clear violation of the weak form of efficiency. Many studies rely on the Hurst exponent to evaluate the level of memory of financial returns, and the purpose of this paper is to show that long memory or anti-persistence of financial returns is not incompatible with the random walk model or the efficient market hypothesis (EMH). The use of the Hurst exponent to demonstrate the inefficiency of financial markets using common estimators is troublesome, especially when applied to financial returns, since values of \(\hat{H} \ne 0.5\) are not evidence against the random walk model or the EMH. Moreover, the high variability of Hurst exponent estimates and their dependence on the chosen algorithm should motivate careful use of this tool. This study proposes a simple theoretical explanation and an extensive simulation study to show that \(\hat{H} \ne 0.5\) for financial returns is perfectly compatible with the random walk model. As a robustness check, both the traditional rescaled range and the wavelet lifting algorithms are used. Applications to real data are also discussed to show that the empirical values of the Hurst exponent are in the range suggested by the simulations, providing evidence that over-reliance on the Hurst exponent could lead to erroneous rejection of the random walk model. Specifically, the paper presents an application to the daily returns of stock market indices (DJIA and S&P 500) over a period of more than 30 years and cryptocurrencies (Bitcoin and Ethereum) over a period of more than 5 years.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction
The efficient market hypothesis (EMH) assumes that financial market prices are perfectly informative and, therefore, it is not possible to design trading strategies that can outperform the market. If the hypothesis were true, then ‘a blindfolded chimpanzee throwing darts at the stock pages of The Wall Street Journal could, according to EMH, select a portfolio that performs as well as one carefully chosen by the experts’ (Malkiel 1989, p. 1313). The hypothesis has been widely challenged in the literature (e.g. Shiller 1981; Yaes 1989; Lo 2004, 2017; Pernagallo and Torrisi 2020a, b; Caruso and Pernagallo 2021). The concept of efficiency is of fundamental importance in the study of financial markets for many reasons. Indeed, efficiency is closely linked to the predictability of financial returns and the success of trading strategies. Moreover, spreading the idea that markets are somehow inefficient is dangerous. Many untrained investors may think they can get rich by making poor investment decisions with consequences for financial stability.
In the literature, the random walk model (Fama 1965a, 1970) plays an important role. In fact, the EMH needs a testable model to be studied. According to the random walk model, changes in stock prices are identically and independently distributed, so they are not predictable. Clearly, this implies market efficiency (at least in its weak form), since it would mean that strategies based on past patterns of financial data, such as technical analysis, would be meaningless.
The idea of persistent return series is an old one. Mandelbrot (1971) was among the first to consider the possible presence and consequences of persistent statistical dependence in financial returns, and many others in the past have claimed to have found anomalous behavior in long-term stock returns (Greene and Fielitz 1977; Fama and French 1988; Poterba and Summers 1988; Jegadeesh 1990). The R/S statistic, originally proposed by Hurst (Hurst 1951) and still widely used in this type of study, is unable to distinguish between short-run and long-run dependence. To this end, Lo (1991) proposed a refined measure of the R/S statistic and found that there is no evidence of long-run dependence in stock returns when short-run dependence is taken into account.
Long memory in financial markets and its use in trading strategies have received increasing interest in recent years among both scholars (Lobato and Velasco 2000; Garzarelli et al. 2014; Caporale et al. 2018; Fister et al. 2021; Bui and Ślepaczuk 2022) and traders. Finding assets whose returns exhibit long memory has become a kind of ‘gold rush’ to secure good profits. In general, scholars and trading experts suggest the use of the Hurst exponent, H, to quantify the degree of memory in financial data, which can be computed using the R/S method (see Sect. 2.2). In particular, values of the Hurst exponent greater than 0.5 should indicate persistent behavior of the series, which makes past data exploitable for predicting future values of the series. Values of the Hurst exponent below 0.5 indicate anti-persistent behavior. If the returns were persistent or anti-persistent, we would have evidence against the random walk model and the EMH would not be confirmed (e.g., Caporale et al. 2018).
This article aims to show why this approach is insidious and can erroneously lead to a rejection of the random walk model. Through a theoretical explanation and an extensive simulation study, this paper shows that \(\hat{H} \ne 0.5\) is perfectly compatible with the random walk model and the EMH. It is already known that the Hurst exponent is biased in finite samples (Hamed 2007), less reliable when data are heavy-tailed (Sánchez et al. 2015), and that \(\hat{H} \ne 0.5\) is compatible with Markov processes with scaling solutions (Bassler et al. 2006). In addition, no asymptotic distribution theory has been derived for the Hurst exponent calculated by R/S analysis (Weron 2002), making it impossible to test statistical significance in the canonical way. This paper contributes to this literature in two ways: theoretically, by showing that some level of memory is expected even in a random walk model; empirically, by showing that the Hurst exponent estimated on real data takes values that are plausible under the random walk model. Therefore, if stock prices indeed follow a random walk, we cannot conclude that the memory evidenced by the Hurst exponent estimation in financial returns is evidence of inefficiency. To the best of my knowledge, this is the first study to evaluate the problem for financial returns. While other works (such as Lo (1991)) have been concerned with finding a robust R/S statistic, this paper illustrates why the search for memory in financial returns with Hurst exponent estimation is, in principle, an ill-defined problem and, a fortiori, should not be used to motivate market efficiency.
To provide simulations close to the observed data, the time series are generated by two random walk processes: one with the usual Gaussian error term and one with a Student’s t error term. The latter is less common in theoretical work, but it proves to be a very realistic model for financial data, as it reproduces many features of stock prices and return series, such as jumps and fat tails (Pernagallo 2023). In addition to the traditional R/S algorithm, a new approach based on wavelets is used to corroborate the results of the work. Finally, the paper also presents some applications to real data, using the daily returns of stock market indices and cryptocurrencies. The two stock market indices are the Dow Jones Industrial Average (DJIA) over the period 1985-2020 and the Standard & Poor’s 500 (S&P 500) over the period 1927-2020. The two cryptocurrencies analyzed are Bitcoin over the period 2014-2020 and Ethereum over the period 2015-2020.
The rest of the paper is structured as follows. Section 2 briefly presents the econometric framework for understanding the results of the simulations. Section 3 presents the simulations. Section 4 shows the Hurst exponent estimates on stock market indices and cryptocurrencies. The final section concludes the article.
2 Econometric framework
2.1 The random walk model
This section refers generally to ‘stocks’, but the same notions can be applied to any asset. Definition 2.1 is the traditional definition of market efficiency (Fama 1970).
Definition 2.1
(Efficient market) In an efficient market, stock prices always fully reflect available information.
Given the ambiguity of the definition, it is necessary to specify the price formation process or a testable model (Fama 1970). One of the most widely used models for empirical validation of the EMH is the random walk model, defined as follows (Fama 1965a, 1970).
Definition 2.2
(Random walk model) Changes in stock prices have the same distribution and are independent of each other.
$$\begin{aligned} \Delta P_t = c + \epsilon _t \end{aligned}$$
(1)
with \(t=1,2,...\), and where c is the drift parameter, \(\Delta P_t = P_{t} - P_{t-1}\) and \(\epsilon _t \overset{i.i.d.}{\sim } \mathcal {N}(0,\sigma ^2)\), \(\sigma ^2<\infty\). The distributional assumption on \(\epsilon _t\) can be relaxed (see Definition 2.3), while if we let \(c=0\) we have the model without drift. Without any loss of generality, let \(c=0\), then
It is clear from (7) that the returns at time t depend on the past states of the process. Since (7) was derived assuming that the random walk model is true, we should expect the returns to show some degree of memory, and using the same reasoning the same can be said for log-returns. In other words, if returns showed a long memory, we would have no evidence against the random walk model or market efficiency. Thus, if stock prices follow a random walk, we can observe \(\hat{H} \ne 0.5\) for returns.
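The dependence of returns on past states can be made explicit with a short derivation. What follows is only a sketch under the zero-drift assumption \(c=0\); the initial price \(P_0>0\) and the symbol \(r_t\) for the simple return are notation assumed here for illustration, and the steps need not coincide with the numbered equations referred to above.

$$\begin{aligned} P_t&= P_0 + \sum \limits _{i=1}^{t} \epsilon _i, \\ r_t&= \frac{\Delta P_t}{P_{t-1}} = \frac{\epsilon _t}{P_0 + \sum \limits _{i=1}^{t-1} \epsilon _i}. \end{aligned}$$

The numerator is the current shock alone, but the denominator accumulates every past shock, so \(r_t\) is a function of the whole history of the process even though the \(\epsilon _t\) are i.i.d.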
Note that computing \(\hat{H}\) on price changes rather than on returns would not change the key message of the paper, for two reasons. First, it is well known that when n is not large enough \(\hat{H}\) is biased, and even for white noise the estimate does not coincide with the theoretical value of 0.5 (e.g., Annis and Lloyd 1976). It therefore makes little sense to apply the exponent to simple price changes, because we already know that it would be biased in short-term analysis. Second, computing \(\hat{H}\) on returns is common practice in the financial literature, followed in hundreds of articles and by many practitioners. Thousands of simulations on simple price changes showing the same results as those reported in the article can be provided to the interested reader and are omitted here for the sake of brevity.
A convenient alternative form for the random walk model in (1) is the following (Pernagallo 2023).
$$\begin{aligned} \Delta P_t = c + \epsilon _t \end{aligned}$$
(8)
with \(t=1,2,...\), and where c is the drift parameter, \(\Delta P_t = P_{t} - P_{t-1}\) and \(\epsilon _t \overset{i.i.d.}{\sim } \text {Student's} \; t(df), df > 1\).
The extension of the random walk in (8) now describes stock price variations and stock returns via a Student’s t error term, which is a better description of financial data, as shown in many empirical works (e.g., Blattberg and Gonedes 1974; Pernagallo and Torrisi 2019). This alternative description of the random walk is also used in the simulations as a robustness check. An example of a simulated random walk with Student’s t error term and drift is shown in Fig. 1. It can be clearly seen that this random walk is very similar to the price series generally observed for financial data, and the log-returns series also seems to exhibit many of the characteristics of the real series (Chakraborti et al. 2011). Moreover, this construction allows us to reproduce jumps and trends in the price series, and fat tails of stock returns, as evidenced by the Q-Q plot and the high kurtosis of the simulated log-returns.
Fig. 1
A simulated random walk with Student’s t error term and drift (\(n=1000\)). Upper panels show time series of the random walk (left) and the log-returns (right) derived from it, while the bottom panels show the Q-Q plot (left) and histogram (right) of log-returns
2.2 The Hurst exponent
The Hurst exponent is one of the most popular tools for studying long memory and efficiency (e.g. Bassler et al. 2006; McCauley et al. 2007; Ramos-Requena et al. 2017; Caporale et al. 2018; Garnier and Sølna 2018; Tiwari et al. 2018; Wei 2018; Pernagallo and Torrisi 2019, 2020a; Bennedsen et al. 2021). The exponent was devised by the British hydrologist H.E. Hurst in 1951 for the practical problem of determining the optimal size of a dam given the volatile rainfall and drought conditions of the Nile River (Hurst 1951). The Hurst exponent is generally used to test market efficiency. In particular, a value of the Hurst exponent near 0.5 should be evidence of a memoryless series, which is intuitively what we expect for stock returns in an efficient market. For this reason, the exponent is generally calculated on the series of returns or log-returns. However, as pointed out in Sect. 2.1, the argument is improper because if prices follow a random walk, returns may exhibit memory.
2.3 The R/S algorithm
To compute the Hurst exponent, following Lillo and Farmer (2004), we can characterise a positively correlated long-memory process using the autocovariance function \(\rho (k)\). A process exhibits long-memory if in the limit \(k \rightarrow \infty\)
$$\begin{aligned} \rho (k) \sim k^{-\alpha } L(k) \end{aligned}$$
(9)
where \(0< \alpha < 1\) and L(k) is a slowly varying function at infinity, i.e. a measurable function \(L: (0, +\infty ) \rightarrow (0, +\infty )\) for which \(\lim _{x \rightarrow \infty } L(ax)/L(x)=1\), \(\forall a>0\), and where \(\sim\) means asymptotic equality. The degree of long-memory is provided by the exponent \(\alpha\); indeed, as \(\alpha\) decreases, the memory increases. The relationship between \(\alpha\) and the Hurst exponent is provided by the following equation
$$\begin{aligned} H = 1 - \frac{\alpha }{2} \end{aligned}$$
(10)
with \(0< H < 1\). Short-memory processes exhibit \(H = 1/2\), which is what we expect for a random series, and their autocorrelation function decays faster than \(k^{-1}\). A Hurst exponent in the range (0.5,1) signals a positively correlated long-memory process, which can be associated with persistent time-series behavior and is what we expect in inefficient markets. The same is true for anti-persistent series, such as pink noise, for which \(H<0.5\).
The rescaled range algorithm for calculating the Hurst exponent (for a review, see Pernagallo and Torrisi (2019)) is the prevalent method and is derived following these steps. Given a time series of financial returns of length T, we divide it into a number of shorter time series of length \(n= T/2, T/4,...\), and for each sub-series we compute the average rescaled range. Given the (partial) time series of length n, \(X=X_1,..., X_n\):
1. Calculate the mean \(m = \frac{1}{n} \sum \limits _{i=1}^n X_i\).
2. Compute the standard deviation \(S(n) = \sqrt{\frac{1}{n} \sum \limits _{i=1}^n (X_i-m)^2}\).
3. Create a mean-adjusted series \(Y_t = X_t - m\), for \(t=1,...,n\).
4. Create the cumulative series \(Z_t = \sum \limits _{i=1}^t Y_i\).
5. Compute the range R: \(R(n) = \textrm{max}(Z_1,..., Z_n) - \textrm{min}(Z_1,..., Z_n)\).
6. Rescale the range to obtain R(n)/S(n) and calculate the mean value \((R/S)_n\) of the rescaled range for all sub-series of length n.
Once \((R/S)_n\) is obtained, we consider the power law relation
$$\begin{aligned} (R/S)_n = c \, n^H \end{aligned}$$
(11)
where c is a constant and H is the Hurst exponent to be estimated. Finally, we can estimate H by running an ordinary least squares (OLS) regression and taking the slope of the equation
$$\begin{aligned} \textrm{log}(R/S)_n = \textrm{log} \, c + H \textrm{log} \,n \end{aligned}$$
(12)
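The six steps and the regression in (12) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the pracma routine used in the simulations: the function names `rescaled_range` and `hurst_rs`, the minimum sub-series length `min_len=8`, and the choice to drop a trailing remainder when T is not divisible by n are all assumptions of this sketch.

```python
import math
import random


def rescaled_range(x):
    """Steps 1-6 for one sub-series: return R(n)/S(n), or None if S(n) = 0."""
    n = len(x)
    m = sum(x) / n                                    # step 1: mean
    s = math.sqrt(sum((v - m) ** 2 for v in x) / n)   # step 2: standard deviation
    if s == 0.0:
        return None                                   # constant sub-series: R/S undefined
    z, z_min, z_max = 0.0, math.inf, -math.inf
    for v in x:
        z += v - m                                    # steps 3-4: cumulative mean-adjusted sum
        z_min, z_max = min(z_min, z), max(z_max, z)
    return (z_max - z_min) / s                        # steps 5-6: rescaled range


def hurst_rs(series, min_len=8):
    """Estimate H as the OLS slope of log (R/S)_n on log n, for n = T/2, T/4, ..."""
    total = len(series)
    log_n, log_rs = [], []
    n = total // 2
    while n >= min_len:  # min_len is an assumed cut-off, not taken from the text
        # Non-overlapping sub-series of length n (a trailing remainder is dropped).
        chunks = [series[i:i + n] for i in range(0, total - n + 1, n)]
        values = [rs for rs in map(rescaled_range, chunks) if rs is not None]
        if values:
            log_n.append(math.log(n))
            log_rs.append(math.log(sum(values) / len(values)))
        n //= 2
    if len(log_n) < 2:
        raise ValueError("series too short to fit the regression")
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_n, log_rs))
    sxx = sum((a - mean_x) ** 2 for a in log_n)
    return sxy / sxx                                  # slope = estimate of H


if __name__ == "__main__":
    rng = random.Random(42)
    white_noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
    print(f"H estimate for Gaussian white noise: {hurst_rs(white_noise):.3f}")
```

Different implementations make different choices about which sub-series lengths enter the regression, which is one reason why estimates from different packages need not agree.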
2.4 Long-memory parameter estimation using wavelet lifting
Recent work (Ramírez-Cobo et al. 2011; Knight et al. 2017) has exploited wavelet methods to estimate the Hurst exponent and the long-memory parameter \(\alpha\). Knight et al. (2017) develop a new wavelet algorithm for irregular data (i.e., irregularly sampled or with missing observations) based on the lifting one coefficient at a time (LOCAAT) transform proposed by Jansen et al. (2009). In particular, let r denote the stage of LOCAAT at which we obtain the wavelet coefficient \(d_{j_r}\), and let its corresponding artificial level be \(j^*\), then for some constant K
Further details on the LOCAAT transform can be found in other works (Jansen et al. 2009; Knight et al. 2017).
Following Knight et al. (2017), we let \(X = \{X_{t_i}\}_{i=0}^{N-1}\) denote a (zero-mean) long-memory stationary time series with finite variance and spectral density \(f_X(\omega ) \sim c_f|\omega |^{-\alpha }\) for frequencies \(\omega \rightarrow 0\), for some \(\alpha \in (0,1)\). The method applies also to irregularly spaced time series. If the series is observed at irregularly spaced times \(\{t_i\}_{i=0}^{N-1}\), we can transform the observed data X into a collection of lifting or detail coefficients, \(\{d_{j_r}\}_r\), using the LOCAAT transform.
The \(\alpha\) parameter (and consequently the Hurst exponent) can be estimated by following this five-step algorithm (Knight et al. 2017, p. 1460).
I. Apply LOCAAT to the observed process X using a particular lifting trajectory to obtain lifting coefficients \(\{d_{j_r}\}_r\). Then group the coefficients into a set of artificial scales.
II. Normalize the detail coefficients by dividing through by the square root of the corresponding diagonal entry of \(\tilde{W}\tilde{W}^T\), where \(\tilde{W}\) is the lifting transform matrix. For simplicity, we use \(d_{j_r}\) to denote the normalized details, \(d_{j_r}(\tilde{W}\tilde{W}^T)_{j_r,j_r}^{-1/2}\).
III. Estimate the wavelet coefficients’ variance within each artificial level \(j^*\) by
where \(n_{j^*}\) is the number of observations in artificial level \(j^*\).
IV. Fit a weighted linear regression to the points \(\log _2(\hat{\sigma }^2_{j^*})\) versus \(j^*\) and use its slope to estimate \(\alpha\).
V. Repeat steps I to IV for P bootstrapped trajectories, obtaining an estimate \(\hat{\alpha }_p\) for each trajectory \(p \in \overline{1,P}\). The final estimator is \(\bar{\alpha } = P^{-1}\sum _{p=1}^P \hat{\alpha }_p\). Finally, we can estimate H using the usual relationship between \(\alpha\) and H.
This alternative algorithm is especially useful for financial data, which are often irregularly sampled or have missing observations, although it can also be used on regularly sampled time series. The aim of using this second algorithm is to show that the paper’s key results do not depend on the choice of algorithm, whereas conclusions about financial data based on H may vary with the selected algorithm. The algorithm is implemented in the R package liftLRD.
3 Simulations
In this section, we confirm the points raised in the previous sections through simulations. The interested reader is referred to other works for a mathematical explanation of how some series without memory can exhibit \(H \ne 0.5\) (Bassler et al. 2006; McCauley et al. 2007). Figures 2, 3, 4, 5, 6 show the Hurst exponent computed via R/S analysis using the R package pracma for 10000 simulated random walks. The random walk generating processes considered are random walks of the form \(R_t=c+R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\mathcal {N}(0,1)\), and random walks of the form \(R_t=c+R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\text {Student's} \; t(3)\). The choice of \(df=3\) is justified by estimates obtained in empirical work (Pernagallo and Torrisi 2019), while the use of a standard Gaussian is common in literature. However, the choice of parameters does not alter the conclusions, and the interested reader can contact the author for further simulations. Regarding the drift, we consider a no-drift case \(c=0\), a case with drift \(c=0.3\) to reproduce an upward trend as common in financial series, and a case where the drift is randomly generated at each iteration from \(\mathcal {N}(0,1)\). In all the cases, the conclusions are identical. Note that the Hurst exponent is computed from the returns, not the random walks themselves.
The simulations consider two distinct sample sizes, namely \(n = 100\) and \(n = 1000\), to observe how the outcomes vary with increasing n. Intermediate sample sizes yield similar results. The exponent is computed on returns \(\Delta P_t/P_{t-1}\) and log-returns \(ln(P_t/P_{t-1})\). The results of the simulations are shown in Figs 2, 3, 4, 5, 6 and Tables 4, 5, 6 and 7 in Appendix B. For the random walk with randomly generated drift at each iteration, only the results for \(n=1000\) and \(\epsilon _t \overset{i.i.d.}{\sim }\text {Student's} \; t(3)\) are reported, as there is no difference from the other scenarios.
On average, the Hurst exponent indicates that the simulated returns do not exhibit long memory (\(\hat{H}\) close to 0.5), but with high variability even in large samples. In some instances, the maximum detected in the simulations exceeds 0.70, which should be associated with a series possessing long memory. These findings carry significant implications for the profession. If the Hurst exponent is used to ascertain efficiency, we might reject the random walk model even if the financial series adheres to a random walk. For this reason, efficiency analysis utilizing the Hurst exponent in financial applications ought to be undertaken with caution, at least with the algorithms currently in use.
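The design of this Monte Carlo experiment can be sketched as follows. The code is an illustration in plain Python rather than the pracma-based setup used for the figures, and several elements are assumptions made here: the starting price `p0=1000.0` (the text does not state one, and log-returns require strictly positive prices), the compact R/S estimator `hurst_rs` with its cut-off `min_len=8`, and the reduced number of replications.

```python
import math
import random
import statistics


def student_t(df, rng):
    """Draw from Student's t(df) as Z / sqrt(chi2_df / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape=df/2, scale=2)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def hurst_rs(x, min_len=8):
    """Compact R/S estimator: OLS slope of log (R/S)_n on log n, n = T/2, T/4, ..."""
    pts = []
    n = len(x) // 2
    while n >= min_len:  # min_len is an assumed cut-off
        ratios = []
        for i in range(0, len(x) - n + 1, n):
            sub = x[i:i + n]
            m = sum(sub) / n
            s = math.sqrt(sum((v - m) ** 2 for v in sub) / n)
            z, lo, hi = 0.0, math.inf, -math.inf
            for v in sub:
                z += v - m
                lo, hi = min(lo, z), max(hi, z)
            if s > 0:
                ratios.append((hi - lo) / s)
        if ratios:
            pts.append((math.log(n), math.log(statistics.fmean(ratios))))
        n //= 2
    mx = statistics.fmean([p[0] for p in pts])
    my = statistics.fmean([p[1] for p in pts])
    sxy = sum((a - mx) * (b - my) for a, b in pts)
    sxx = sum((a - mx) ** 2 for a, _ in pts)
    return sxy / sxx


def simulate_returns(n, drift=0.0, df=None, p0=1000.0, rng=random):
    """Return (simple returns, log-returns) of one walk P_t = c + P_{t-1} + eps_t."""
    while True:
        prices = [p0]  # p0 is a hypothetical starting price
        for _ in range(n):
            eps = rng.gauss(0.0, 1.0) if df is None else student_t(df, rng)
            prices.append(prices[-1] + drift + eps)
        if min(prices) > 0:  # redraw the rare path that touches zero
            break
    pairs = list(zip(prices, prices[1:]))
    simple = [(b - a) / a for a, b in pairs]
    logs = [math.log(b / a) for a, b in pairs]
    return simple, logs


def monte_carlo(n_sims=100, n=1000, drift=0.0, df=None, seed=0):
    """Hurst estimates on the log-returns of n_sims independent random walks."""
    rng = random.Random(seed)
    return [hurst_rs(simulate_returns(n, drift, df, rng=rng)[1]) for _ in range(n_sims)]


if __name__ == "__main__":
    for label, df in (("Gaussian", None), ("Student's t(3)", 3)):
        h = monte_carlo(df=df)
        print(f"{label}: mean={statistics.fmean(h):.3f} "
              f"min={min(h):.3f} max={max(h):.3f}")
```

The quantities of interest are the spread between the minimum and maximum estimates and how far the mean sits from 0.5, rather than any single value of \(\hat{H}\).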
The wavelet lifting-based procedure was implemented using the R package liftLRD. Given the computational complexity of the algorithm, it was necessary to reduce the number of simulations from 10000 to 1000, since already 1000 simulations are computationally intensive, requiring several hours on a quad-core laptop with parallel processing. For brevity, we show only graphs of simulations on the random walk process without drift and Gaussian innovations; the other processes produce figures similar to those in Fig. 7.
The simulations in Fig. 7 again show that, even though the original series is a random walk, the value of the Hurst exponent calculated on the returns may deviate from the theoretical value of 0.5. Moreover, in this case we get an average value of \(\hat{H}\) below 0.5, which points to anti-persistence. If we had trusted the Hurst exponent, we would have concluded that we are dealing with pink noise, when in fact it is white noise. This also shows that, in addition to being variable, Hurst exponent estimates depend strongly on the type of algorithm chosen, another reason to suggest careful use of the exponent.
In addition to estimating exponents, the liftLRD package also allows bootstrap confidence intervals to be calculated. This is not usual in applied work, where generally only exponent estimation is provided, due to the lack of an asymptotic theory of the Hurst exponent using the R/S method (Weron 2002) and the fact that programming packages do not always implement the calculation of bootstrap confidence intervals.
With this feature, we can also show that many of the calculated exponents are indeed significant. To give an idea, Table 1 reports, for the first 50 iterations of the simulations (\(n = 100\)), the estimates and bootstrap confidence intervals for which the interval does not include zero (in other words, the Hurst estimate is statistically significant). It can be seen that 19 estimates are significant, and many of them are considerably lower or higher than the true theoretical value. The situation does not improve if we increase the sample size (\(n=1000\)), as shown in Table 2. Note also that the estimates for which we would conclude \(H=0\) lead to an incorrect conclusion as well, since the true value is \(H=0.5\), the exponent being calculated on white noise.
The practical implication is that applied work demonstrating persistent or anti-persistent behavior of financial markets may actually erroneously reject the random walk model. Indeed, Hurst exponents are generally calculated over time windows of 50-1000 observations (Vogl 2023). This is perfectly understandable, since short-term analysis is particularly important for studying the profitability of financial markets. Thus, obtaining \(\hat{H} \ne 0.5\) (and significant) would not provide any conclusive evidence against the random walk model.
Fig. 2
Monte Carlo simulations (10000) to see the performance of the Hurst exponent (R/S algorithm) in detecting efficiency. In each simulation a random walk of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\mathcal {N}(0,1)\) is generated, and then the exponent is computed on returns (upper panels, left panel \(n = 100\), right panel \(n = 1000\)) and log-returns (bottom panels, left panel \(n = 100\), right panel \(n = 1000\))
Fig. 3
Monte Carlo simulations (10000) to see the performance of the Hurst exponent (R/S algorithm) in detecting efficiency. In each simulation a random walk of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\text {Student's} \; t(3)\) is generated, and then the exponent is computed on returns (upper panels, left panel \(n = 100\), right panel \(n = 1000\)) and log-returns (bottom panels, left panel \(n = 100\), right panel \(n = 1000\))
Fig. 4
Monte Carlo simulations (10000) to see the performance of the Hurst exponent (R/S algorithm) in detecting efficiency. In each simulation a random walk with drift of the form \(R_t= 0.3 + R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\mathcal {N}(0,1)\) is generated, and then the exponent is computed on returns (upper panels, left panel \(n = 100\), right panel \(n = 1000\)) and log-returns (bottom panels, left panel \(n = 100\), right panel \(n = 1000\))
Fig. 5
Monte Carlo simulations (10000) to see the performance of the Hurst exponent (R/S algorithm) in detecting efficiency. In each simulation a random walk with drift of the form \(R_t=0.3+R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\text {Student's} \; t(3)\) is generated, and then the exponent is computed on returns (upper panels, left panel \(n = 100\), right panel \(n = 1000\)) and log-returns (bottom panels, left panel \(n = 100\), right panel \(n = 1000\))
Fig. 6
Monte Carlo simulations (10000) to see the performance of the Hurst exponent (R/S algorithm) in detecting efficiency. In each simulation a random walk with drift of the form \(R_t=c+R_{t-1}+\epsilon _t\) is generated, with c extracted in each simulation from a standard normal, \(\epsilon _t \overset{i.i.d.}{\sim }\text {Student's} \; t(3)\), and then the exponent is computed on returns (upper panel, \(n = 1000\)) and log-returns (bottom panel, \(n = 1000\))
Fig. 7
Monte Carlo simulations (1000) to see the performance of the Hurst exponent (wavelet lifting algorithm) in detecting efficiency. In each simulation a random walk of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\mathcal {N}(0,1)\) is generated, and then the exponent is computed on returns (upper panels, left panel \(n = 100\), right panel \(n = 1000\)) and log-returns (bottom panels, left panel \(n = 100\), right panel \(n = 1000\))
Table 1
Hurst exponent and 95% bootstrap confidence interval for 50 random walks (\(n=100\)) of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\mathcal {N}(0,1)\). The Hurst exponents (wavelet lifting algorithm) are computed on \(\Delta P_t/P_{t-1}\) and the table shows only statistically significant estimates
| Iteration | Hurst exp | Lower bound | Upper bound |
|---|---|---|---|
| 3 | 0.4755 | 0.0521 | 0.7742 |
| 4 | 0.4630 | 0.0834 | 0.8097 |
| 6 | 0.4524 | 0.0046 | 0.9764 |
| 11 | 0.3691 | 0.0829 | 0.7239 |
| 16 | 0.5787 | 0.1656 | 1.2283 |
| 17 | 0.6937 | 0.4283 | 0.9127 |
| 18 | 0.3099 | 0.0209 | 0.5655 |
| 21 | 0.6281 | 0.3670 | 0.9801 |
| 25 | 0.3899 | 0.0086 | 0.7701 |
| 26 | 0.4757 | 0.1123 | 1.0363 |
| 27 | 0.4318 | 0.1907 | 0.7203 |
| 28 | 0.5583 | 0.2232 | 0.9075 |
| 30 | 0.3753 | 0.0293 | 0.7329 |
| 31 | 0.4911 | 0.2118 | 0.9291 |
| 36 | 0.5342 | 0.1946 | 0.7491 |
| 39 | 0.4111 | 0.0241 | 1.0041 |
| 42 | 0.5605 | 0.1390 | 1.0165 |
| 46 | 0.5086 | 0.1175 | 0.9599 |
| 49 | 0.4863 | 0.0533 | 0.9526 |
Table 2
Hurst exponent and 95% bootstrap confidence interval for 50 random walks (\(n=1000\)) of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \overset{i.i.d.}{\sim }\mathcal {N}(0,1)\). The Hurst exponents (wavelet lifting algorithm) are computed on \(\Delta P_t/P_{t-1}\) and the table shows only statistically significant estimates
| Iteration | Hurst exp | Lower bound | Upper bound |
|---|---|---|---|
| 1 | 0.3291 | 0.0135 | 1.0515 |
| 2 | 0.3281 | 0.0497 | 0.5586 |
| 3 | 0.3615 | 0.0091 | 0.8944 |
| 4 | 0.3904 | 0.1393 | 0.6196 |
| 7 | 0.3066 | 0.0528 | 0.7690 |
| 8 | 0.2971 | 0.0895 | 0.6173 |
| 9 | 0.4415 | 0.3585 | 0.5411 |
| 10 | 0.3901 | 0.1531 | 0.6201 |
| 11 | 0.3647 | 0.1495 | 0.6557 |
| 12 | 0.4188 | 0.2430 | 0.6106 |
| 13 | 0.4582 | 0.2832 | 0.6670 |
| 14 | 0.3033 | 0.0763 | 0.5478 |
| 16 | 0.2933 | 0.0309 | 0.5937 |
| 17 | 0.2711 | 0.0227 | 0.5684 |
| 20 | 0.3273 | 0.0439 | 0.6948 |
| 22 | 0.2953 | 0.0393 | 0.5330 |
| 23 | 0.3026 | 0.0485 | 0.7367 |
| 25 | 0.4030 | 0.2070 | 0.6731 |
| 28 | 0.3760 | 0.1009 | 0.6144 |
| 29 | 0.3784 | 0.1288 | 0.6627 |
| 32 | 0.4676 | 0.1506 | 0.7497 |
| 33 | 0.3779 | 0.1473 | 0.5887 |
| 34 | 0.3914 | 0.1907 | 0.6042 |
| 35 | 0.3339 | 0.1024 | 0.5216 |
| 37 | 0.5977 | 0.5081 | 0.6950 |
| 38 | 0.3772 | 0.1541 | 0.6294 |
| 41 | 0.4273 | 0.1705 | 0.7241 |
| 43 | 0.3030 | 0.0589 | 0.6101 |
| 44 | 0.4475 | 0.2615 | 0.6814 |
| 45 | 0.4071 | 0.2637 | 0.5644 |
| 46 | 0.3135 | 0.0898 | 0.5952 |
| 47 | 0.3689 | 0.1039 | 0.7545 |
| 48 | 0.3836 | 0.1685 | 0.5927 |
| 49 | 0.4149 | 0.1724 | 0.6318 |
| 50 | 0.3094 | 0.0396 | 0.9883 |
4 Some applications on real data
4.1 Data description
This section shows Hurst exponent estimates on stock market indices and cryptocurrencies to demonstrate that the simulated series closely reproduce the behavior of the real data. The purpose of this section is not to show that these assets are efficient or inefficient, but only that the empirical values obtained are theoretically plausible with a random walk model.
The data are freely available at Yahoo! Finance. The series downloaded are daily adjusted closing prices for stock indexes as suggested by Fama (1965b), and the time interval is from the first available observation in the database to the last one in December 2020. The stock market indexes are the S&P 500 (SP500) and the Dow Jones Industrial Average (DJIA), the two major U.S. stock market indices. The SP500 is observed from December 1927, whereas the DJIA is observed from January 1985. The cryptocurrency price series downloaded are Bitcoin and Ethereum, both in USD, the former observed since September 2014 and the latter since August 2015. The two indexes give us an overview of the stock market, while the analysis of cryptocurrencies is relevant because many papers have recently argued that they exhibit some degree of inefficiency and because they form an intricate network (e.g. Cheah et al. 2018; Tiwari et al. 2018; Pernagallo 2024).
First, we tested the price series for the presence of a unit root using the augmented Dickey-Fuller test (ADF test). This test is useful because, if we do not reject the null hypothesis for the price series, we have evidence that the four series can be modeled through a random walk. To understand this point, it is sufficient to recall the basic (stationary) AR(1) process
$$\begin{aligned} y_t = \rho y_{t-1} + v_t \end{aligned}$$
(15)
where \(y_t\) denotes the series under test and \(v_t\) are independent random errors with zero mean and constant variance \(\sigma _v^2\). To test nonstationarity, the null hypothesis \(\rho = 1\) can be tested against the alternative \(|\rho |<1\). If the null hypothesis is not rejected, (15) becomes a nonstationary random walk process. This is the logic behind the Dickey-Fuller tests. If asset prices follow a random walk, we know from the theory of Sects. 2 and 3 that returns can exhibit some degree of memory, so if \(\hat{H} \ne 0.5\) we have no conclusive evidence against the random walk model. For the four price series analyzed, the p-values of the ADF test are 0.99 for Bitcoin, 0.7364 for Ethereum, 0.99 for SP500, and 0.9159 for DJIA. Therefore, the unit-root null is not rejected for any of the series, which is consistent with the random walk model. The test was computed using the R function adf.test() of the tseries package, but the result does not change even if different types of tests are used.
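The logic of the test can be illustrated with a deliberately simplified Python sketch. It implements the plain Dickey-Fuller regression \(\Delta y_t = a + \gamma y_{t-1} + e_t\) with \(\gamma = \rho - 1\), without the lagged differences and trend term that an augmented test such as adf.test() includes, so it is not a substitute for that routine. The function name `dickey_fuller_stat` is hypothetical, and the 5% critical value of about \(-2.86\) is the standard asymptotic value for the constant-only case, quoted here as an assumption of the sketch.

```python
import math
import random

# Asymptotic 5% Dickey-Fuller critical value, constant and no trend (assumed here).
DF_CRIT_5PCT = -2.86


def dickey_fuller_stat(y):
    """t-statistic on gamma in the regression  dy_t = a + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                     # lagged level y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]    # first difference
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((v - mean_x) * (d - mean_dy) for v, d in zip(x, dy))
    gamma = sxy / sxx                              # OLS slope, estimate of rho - 1
    intercept = mean_dy - gamma * mean_x
    rss = sum((d - intercept - gamma * v) ** 2 for v, d in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)      # OLS standard error of the slope
    return gamma / se_gamma


if __name__ == "__main__":
    rng = random.Random(7)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    walk, level = [], 0.0
    for e in shocks:
        level += e
        walk.append(level)
    for name, series in (("white noise", shocks), ("random walk", walk)):
        stat = dickey_fuller_stat(series)
        verdict = "reject" if stat < DF_CRIT_5PCT else "do not reject"
        print(f"{name}: DF statistic = {stat:.2f} -> {verdict} the unit-root null")
```

A statistic below the critical value rejects the unit-root null; a statistic above it means only that the null cannot be rejected, which is the situation reported for the four price series.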
Descriptive statistics of the returns over the entire period are given in Table 3, from which it can be seen that they are leptokurtic. Also, as expected, cryptocurrency returns are much more variable. Returns in Table 3 are log-returns, calculated from the price series using formula (5). Estimating the Hurst exponent on log-returns is quite common in the literature (e.g., Caporale et al. 2018; Pernagallo and Torrisi 2020b).
Table 3
Descriptive statistics of log-returns

| | DJIA | SP500 | Bitcoin | Ethereum |
|---|---|---|---|---|
| Min | \(-\)0.2563 | \(-\)0.2289 | \(-\)0.4647 | \(-\)1.3028 |
| Mean | 0.0003 | 0.0002 | 0.0018 | 0.0028 |
| Median | 0.0006 | 0.0005 | 0.0019 | 0.0002 |
| Max | 0.1076 | 0.1536 | 0.2251 | 0.4103 |
| SD | 0.0114 | 0.0120 | 0.0388 | 0.0682 |
| Skewness | \(-\)1.5963 | \(-\)0.4832 | \(-\)0.9372 | \(-\)3.4708 |
| Kurtosis | 38.7458 | 19.1062 | 13.3083 | 72.2722 |
| Obs | 9056 | 23362 | 2298 | 1974 |
| Period | 1985–2020 | 1927–2020 | 2014–2020 | 2015–2020 |
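The quantities reported in Table 3 can be computed along the following lines. This is a sketch under stated assumptions: the price list is hypothetical, the function names are illustrative, and kurtosis is computed as the raw fourth standardized moment (equal to 3 for a Gaussian), since the text does not state whether Table 3 reports raw or excess kurtosis.

```python
import math
import statistics

def log_returns(prices):
    """r_t = ln(P_t / P_{t-1}) for a list of strictly positive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices[:-1], prices[1:])]

def describe(x):
    """Summary statistics in the layout of Table 3 (moment-based skewness and kurtosis)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    return {
        "min": min(x),
        "mean": mean,
        "median": statistics.median(x),
        "max": max(x),
        "sd": statistics.stdev(x),      # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,       # raw kurtosis (assumption, see above)
        "obs": n,
    }

# Hypothetical adjusted closing prices, used only to exercise the functions
prices = [100.0, 102.0, 101.0, 105.0, 104.0, 108.0]
stats = describe(log_returns(prices))
for key, value in stats.items():
    print(f"{key:>8}: {value}")
```

A kurtosis well above 3 on this scale, as for all four series in Table 3, is what the text refers to as leptokurtic returns.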
4.2 Results
In Figs. 8, 9, 10 and 11, the Hurst exponent of log-returns is computed over different time horizons using rolling windows of lengths 50, 100, 253 (the average trading year), and 500. For example, for Bitcoin in Fig. 10 we can clearly identify the period before the 2018 cryptocurrency crash. The exponent departs from the theoretical value under efficiency (\(H = 0.5\)), generally ranging between 0.4 and 0.7 for the assets analyzed. This result is consistent with empirical evidence (e.g., Cont 2001; Pernagallo and Torrisi 2019), but more importantly, it is perfectly in line with the simulation results. In other words, these values are perfectly consistent with the random walk model, so no conclusions about market efficiency should be drawn from them. Unfortunately, in many cases this kind of evidence is used as proof of the inefficiency of a specific asset.
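The rolling-window procedure can be sketched as follows. The estimator below is a plain, uncorrected R/S estimator (block sizes doubling from 8, non-overlapping blocks, OLS slope of \(\log (R/S)\) on \(\log n\)); it is an assumption for illustration and not necessarily the implementation behind Figs. 8 to 11. The simulated price path, the starting price of 1000, and the seed are likewise hypothetical.

```python
import math
import random

def rescaled_range(block):
    """R/S of one block: range of cumulative mean deviations over the block's std."""
    n = len(block)
    mean = sum(block) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in block) / n)
    cum, lo, hi = 0.0, 0.0, 0.0
    for v in block:
        cum += v - mean
        lo, hi = min(lo, cum), max(hi, cum)
    return (hi - lo) / sd if sd > 0 else None

def ols_slope(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def hurst_rs(x, min_block=8):
    """Slope of log(mean R/S) against log(block size), block sizes doubling."""
    n = len(x)
    log_size, log_rs = [], []
    size = min_block
    while size <= n // 2:
        ratios = [rescaled_range(x[s:s + size]) for s in range(0, n - size + 1, size)]
        ratios = [r for r in ratios if r is not None]
        if ratios:
            log_size.append(math.log(size))
            log_rs.append(math.log(sum(ratios) / len(ratios)))
        size *= 2
    if len(log_size) < 2:
        raise ValueError("series too short for the chosen block sizes")
    return ols_slope(log_size, log_rs)

def rolling_hurst(x, window):
    """Hurst estimate on each window of `window` consecutive observations."""
    return [hurst_rs(x[i - window:i]) for i in range(window, len(x) + 1)]

# Hypothetical random-walk prices; in the paper the inputs are index and crypto prices
rng = random.Random(7)
prices = [1000.0]
for _ in range(400):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))
returns = [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]

rolling = rolling_hurst(returns, window=100)
print(f"{len(rolling)} windows, min = {min(rolling):.3f}, max = {max(rolling):.3f}")
```

Even though the simulated prices follow a random walk by construction, the rolling estimates fluctuate from window to window, which is the point made in the text about reading such plots as evidence of inefficiency.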
Figure 12 shows the empirical estimates of the Hurst exponent of the four time series using the wavelet lifting algorithm. In this case, given the computational complexity of the algorithm, only the rolling window of length 50 is calculated. It is clearly seen that the results are not different from those obtained with the R/S algorithm. Moreover, these estimates are theoretically more reliable because the wavelet lifting algorithm can handle missing observations or irregular samples.
Fig. 8
Hurst exponent via R/S analysis on DJIA. In the upper panels the exponent is computed over 50 (left panel) and 100 (right panel) trading days rolling windows, whereas in the bottom panels the windows are 253 trading days (left panel) and 500 trading days (right panel)
Fig. 9
Hurst exponent via R/S analysis on S&P 500. In the upper panels the exponent is computed over 50 (left panel) and 100 (right panel) trading days rolling windows, whereas in the bottom panels the windows are 253 trading days (left panel) and 500 trading days (right panel)
Fig. 10
Hurst exponent via R/S analysis on Bitcoin. In the upper panels the exponent is computed over 50 (left panel) and 100 (right panel) trading days rolling windows, whereas in the bottom panels the windows are 253 trading days (left panel) and 500 trading days (right panel)
Fig. 11
Hurst exponent via R/S analysis on Ethereum. In the upper panels the exponent is computed over 50 (left panel) and 100 (right panel) trading days rolling windows, whereas in the bottom panels the windows are 253 trading days (left panel) and 500 trading days (right panel)
Fig. 12
Hurst exponent via wavelet lifting algorithm on DJIA, S&P 500, Bitcoin and Ethereum. The exponent is computed over 50 trading days rolling window
5 Conclusions
The objective of this article is to shed light on a very useful statistical tool that is often misused by the financial community. The use of the Hurst exponent computed with available algorithms to demonstrate the inefficiency of financial markets is improper, especially when applied to financial returns, since values of \(\hat{H} \ne 0.5\) are not evidence against the random walk model or the EMH. Moreover, the high variability of Hurst exponent estimates and their dependence on the type of algorithm used should motivate careful use of this tool.
This paper explicitly shows that financial returns can be rewritten to show that the process depends on past history, so some degree of memory can be expected even if the price process is a random walk. Using an extensive simulation study, we show that the empirical estimates of the Hurst exponent usually obtained in the empirical literature on financial data are compatible with the random walk model. These results are robust when using two different random walk models (with Gaussian or Student’s t innovations), different sample sizes, and different algorithms to compute the Hurst exponent. In the latter regard, the simulations were based on the traditional R/S algorithm and the newer wavelet lifting algorithm, and in both cases the results are consistent.
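As a brief illustration of this rewriting, assume (with notation chosen here, which may differ from that of the earlier sections) that prices follow \(P_t = P_{t-1} + \epsilon _t\) with initial value \(P_0\), so that \(P_{t-1} = P_0 + \sum _{i=1}^{t-1} \epsilon _i\). Then

\[ \frac{\Delta P_t}{P_{t-1}} = \frac{\epsilon _t}{P_0 + \sum _{i=1}^{t-1} \epsilon _i}, \qquad \ln \frac{P_t}{P_{t-1}} = \ln \left( 1 + \frac{\epsilon _t}{P_0 + \sum _{i=1}^{t-1} \epsilon _i} \right) , \]

so both definitions of returns involve the whole history of past innovations through the denominator, even though the innovations themselves are independent.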
Moreover, estimation on real data shows that, even for stock indices and cryptocurrencies, the values obtained are very close to those obtained in simulations. In particular, our application to the S&P 500, the DJIA, Bitcoin, and Ethereum over a long period uses different rolling windows to estimate the memory of the processes over different time intervals, and in all cases the estimates are close to the simulated ones. This does not mean that these assets are efficient, only that the Hurst exponent should not be used to reject the EMH.
This paper aims to raise awareness among the financial community about the proper use of statistical tools. Informed investment decisions are indeed the basis of well-functioning financial markets and are a primary goal of policy makers around the world. Two lines of future research are urgently needed. The first calls for the creation of a more efficient estimator of the Hurst exponent. The idea behind the Hurst exponent is sound; unfortunately, sample estimation with available algorithms is still affected by a degree of variability that undermines the reliability of this useful statistical tool. The second should focus on quantifying memory in financial returns using alternative tools to compare with the Hurst exponent, which is one of the most widely used tools for quantifying memory, but not necessarily the best. This paper could be useful to researchers who decide to pursue either of these two lines of research.
Acknowledgements
The author would like to thank three anonymous referees and the editor for their comments that helped improve the article.
Declarations
Conflict of interest
The author has no conflict of interest to declare.
Ethical statement
This study is compliant with the ethical standards of the scientific community.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Table 4
Summary statistics of Monte Carlo simulations in Sect. 3 for random walks of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \mathop {\sim }\limits ^{\tiny {i.i.d.}} \mathcal {N}(0,1)\). The Hurst exponents (R/S algorithm) are computed on \(\Delta P_t/P_{t-1}\) and \(ln(P_t/P_{t-1})\)

| Descriptive statistics | \(\Delta P_t/P_{t-1}\) | \(\Delta P_t/P_{t-1}\) | \(ln(P_t/P_{t-1})\) | \(ln(P_t/P_{t-1})\) |
|---|---|---|---|---|
| | Hurst exp | Hurst exp | Hurst exp | Hurst exp |
| No. of simulations | 10,000 | 10,000 | 10,000 | 10,000 |
| Sample size | 100 | 1,000 | 100 | 1,000 |
| Mean | 0.5144 | 0.5180 | 0.4893 | 0.4887 |
| Median | 0.5064 | 0.5121 | 0.4872 | 0.4863 |
| 1st Quartile | 0.4910 | 0.4986 | 0.4532 | 0.4632 |
| 3rd Quartile | 0.5390 | 0.5366 | 0.5230 | 0.5130 |
| St. Dev | 0.0400 | 0.0300 | 0.0508 | 0.0368 |
| Minimum | 0.3711 | 0.4217 | 0.3247 | 0.3659 |
| Maximum | 0.7108 | 0.6494 | 0.6612 | 0.6258 |
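A scaled-down version of this kind of Monte Carlo experiment can be sketched as follows. Several elements are assumptions made for illustration: the starting price of 100 (chosen large enough relative to the path length that prices stay positive), the 200 replications, the seed, and the plain uncorrected R/S estimator, which is known to be biased upward in short samples. The numbers it produces therefore need not match those reported in the table.

```python
import math
import random
import statistics

def hurst_rs(x, min_block=8):
    """Uncorrected R/S estimate: OLS slope of log(mean R/S) on log(block size)."""
    n = len(x)
    log_size, log_rs = [], []
    size = min_block
    while size <= n // 2:
        ratios = []
        for start in range(0, n - size + 1, size):
            block = x[start:start + size]
            mean = sum(block) / size
            sd = math.sqrt(sum((v - mean) ** 2 for v in block) / size)
            cum, lo, hi = 0.0, 0.0, 0.0
            for v in block:
                cum += v - mean
                lo, hi = min(lo, cum), max(hi, cum)
            if sd > 0:
                ratios.append((hi - lo) / sd)
        if ratios:
            log_size.append(math.log(size))
            log_rs.append(math.log(sum(ratios) / len(ratios)))
        size *= 2
    mx, my = sum(log_size) / len(log_size), sum(log_rs) / len(log_rs)
    sxx = sum((a - mx) ** 2 for a in log_size)
    return sum((a - mx) * (b - my) for a, b in zip(log_size, log_rs)) / sxx

def monte_carlo(n_sims, n_obs, start=100.0, seed=1):
    """Hurst estimates on simple and log returns of simulated Gaussian random walks."""
    rng = random.Random(seed)
    simple_h, log_h = [], []
    for _ in range(n_sims):
        p = [start]
        for _ in range(n_obs):
            p.append(p[-1] + rng.gauss(0.0, 1.0))
        if min(p) <= 0:
            continue  # returns are undefined for non-positive prices
        simple_h.append(hurst_rs([(b - a) / a for a, b in zip(p[:-1], p[1:])]))
        log_h.append(hurst_rs([math.log(b / a) for a, b in zip(p[:-1], p[1:])]))
    return simple_h, log_h

def summarise(h):
    q1, _, q3 = statistics.quantiles(h, n=4)
    return {"mean": statistics.mean(h), "median": statistics.median(h),
            "q1": q1, "q3": q3, "sd": statistics.stdev(h),
            "min": min(h), "max": max(h)}

simple_h, log_h = monte_carlo(n_sims=200, n_obs=100)
for label, h in [("simple returns", simple_h), ("log returns", log_h)]:
    print(label, {k: round(v, 4) for k, v in summarise(h).items()})
```

The spread between the minimum and maximum across replications shows how far individual estimates can fall from 0.5 even when the data-generating process is a random walk by construction.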
Table 5
Summary statistics of Monte Carlo simulations in Sect. 3 for random walks of the form \(R_t=R_{t-1}+\epsilon _t\), with \(\epsilon _t \mathop {\sim }\limits ^{\tiny {i.i.d.}} \text {Student's} \; t(3)\). The Hurst exponents (R/S algorithm) are computed on \(\Delta P_t/P_{t-1}\) and \(ln(P_t/P_{t-1})\)

| Descriptive statistics | \(\Delta P_t/P_{t-1}\) | \(\Delta P_t/P_{t-1}\) | \(ln(P_t/P_{t-1})\) | \(ln(P_t/P_{t-1})\) |
|---|---|---|---|---|
| | Hurst exp | Hurst exp | Hurst exp | Hurst exp |
| No. of simulations | 10,000 | 10,000 | 10,000 | 10,000 |
| Sample size | 100 | 1,000 | 100 | 1,000 |
| Mean | 0.5144 | 0.5180 | 0.4893 | 0.4887 |
| Median | 0.5064 | 0.5121 | 0.4872 | 0.4863 |
| 1st Quartile | 0.4910 | 0.4986 | 0.4532 | 0.4632 |
| 3rd Quartile | 0.5390 | 0.5366 | 0.5230 | 0.5130 |
| St. Dev | 0.0400 | 0.0300 | 0.0508 | 0.0368 |
| Minimum | 0.3711 | 0.4217 | 0.3247 | 0.3659 |
| Maximum | 0.7108 | 0.6494 | 0.6612 | 0.6258 |
Table 6
Summary statistics of Monte Carlo simulations in Sect. 3 for random walks with drift of the form \(R_t=0.3 + R_{t-1}+\epsilon _t\), with \(\epsilon _t \mathop {\sim }\limits ^{\tiny {i.i.d.}} \mathcal {N}(0,1)\). The Hurst exponents (R/S algorithm) are computed on \(\Delta P_t/P_{t-1}\) and \(ln(P_t/P_{t-1})\)

| Descriptive statistics | \(\Delta P_t/P_{t-1}\) | \(\Delta P_t/P_{t-1}\) | \(ln(P_t/P_{t-1})\) | \(ln(P_t/P_{t-1})\) |
|---|---|---|---|---|
| | Hurst exp | Hurst exp | Hurst exp | Hurst exp |
| No. of simulations | 10,000 | 10,000 | 10,000 | 10,000 |
| Sample size | 100 | 1,000 | 100 | 1,000 |
| Mean | 0.5333 | 0.5531 | 0.5270 | 0.5699 |
| Median | 0.5276 | 0.5466 | 0.5264 | 0.5715 |
| 1st Quartile | 0.4986 | 0.5136 | 0.4878 | 0.5376 |
| 3rd Quartile | 0.5641 | 0.5891 | 0.5665 | 0.6043 |
| St. Dev | 0.0449 | 0.0463 | 0.0558 | 0.0455 |
| Minimum | 0.3737 | 0.4412 | 0.3182 | 0.4105 |
| Maximum | 0.7037 | 0.6923 | 0.7046 | 0.6935 |
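The data-generating processes behind Tables 5 to 7 differ from the baseline only in the innovation distribution and the drift term. A minimal sketch of how such paths could be generated is given below; the function names, the starting value of 100, and the seed are assumptions, and the Student's t draw uses the standard construction \(Z/\sqrt{V/\nu }\) with \(Z\) standard Gaussian and \(V\) chi-squared with \(\nu \) degrees of freedom.

```python
import math
import random
import statistics

def student_t(df, rng):
    """One Student's t(df) draw as Z / sqrt(V / df), with V ~ chi-squared(df)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(df / 2.0, 2.0)  # chi-squared(df) = Gamma(shape=df/2, scale=2)
    return z / math.sqrt(v / df)

def random_walk(n, start, drift, draw):
    """Path of x_t = drift + x_{t-1} + e_t, where draw() returns one innovation e_t."""
    path = [start]
    for _ in range(n):
        path.append(drift + path[-1] + draw())
    return path

rng = random.Random(42)  # arbitrary seed, for reproducibility only

# Gaussian innovations without drift, and t(3) innovations with drift 0.3
gaussian_walk = random_walk(1000, 100.0, 0.0, lambda: rng.gauss(0.0, 1.0))
t3_drift_walk = random_walk(1000, 100.0, 0.3, lambda: student_t(3, rng))

t3_draws = [student_t(3, rng) for _ in range(5000)]
print("largest |t(3)| draw:", max(abs(d) for d in t3_draws))
print("sample median of t(3) draws:", statistics.median(t3_draws))
```

Passing the innovation generator as a callable keeps the path construction identical across designs, so that only the distribution and the drift change between experiments.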
Table 7
Summary statistics of Monte Carlo simulations in Sect. 3 for random walks with drift of the form \(R_t=0.3+R_{t-1}+\epsilon _t\), with \(\epsilon _t \mathop {\sim }\limits ^{\tiny {i.i.d.}} \text {Student's} \; t(3)\). The Hurst exponents (R/S algorithm) are computed on \(\Delta P_t/P_{t-1}\) and \(ln(P_t/P_{t-1})\)