Introduction

Individual investors participate in the stock market for various motives, including speculation (Oehler 1995; Shefrin and Statman 2000). Kumar (2009) demonstrates that individual investors in the USA are drawn to stocks which simultaneously have the following three lottery-like characteristics: high idiosyncratic volatility, high idiosyncratic skewness, and low price. Furthermore, Kumar (2009) shows that individual investors suffer from (over)investing in those so-called lottery-like stocks because these stocks tend to significantly underperform their counterparts.

Stocks with lottery-like characteristics have been the subject of a variety of academic studies (Bali et al. 2011; Kumar 2009). We add to this body of work in several respects. So far, the majority of studies that assess lottery-like stocks focus on the US and correspondingly US private investors’ preferences. By using the German stock market and aggregate portfolio data on private investors, we bring diversity to a widely discussed—yet still mostly US-focused—research area.

Studies have relied on discount brokerage data to examine private investors’ holdings regarding lottery-like stocks (Han and Kumar 2013; Kumar 2009; Meng and Pantzalis 2018). Dorn and Sengmueller (2009) provide evidence that investors who use discount brokers (partly) consider trading as entertainment and thus engage in it excessively. Dorn et al. (2015) provide further evidence that the clients of discount brokers substitute between participating in lottery games and financial market gambling. Hence, investors who access the stock market via discount brokers may use stocks to gamble (Barberis and Huang 2008; Statman 2002). In contrast, investors holding stocks via the deposit account of their house bank (i.e., retail broker) may follow a buy-and-hold approach that involves stocks for which lottery-like characteristics (Bali et al. 2011; Kumar 2009) are less common. Since discount brokerage data, as applied in other studies, only capture a fraction of all investor holdings, the results may be biased. We describe the holding preferences of individual investors for lottery-like stocks with data from the German central bank’s (Deutsche Bundesbank) Securities Holdings Statistics (SHS). This database (SHS-base) captures the aggregate holdings of the entire German private sector, that is, our results are not subject to preselection distortions. Thus far, studies that have covered lottery-like gambling in the stock market mostly use one distinct approach of characterizing these stocks. We apply Kumar’s (2009) lottery-like stock definition that comprises idiosyncratic volatility, idiosyncratic skewness, and low price as well as its extension by Bali et al. (2011) that provides a more comprehensive overview in which only extreme past daily returns are considered.

Moreover, regarding lottery-like characteristics, most studies focus on investors’ preferences for domestic stocks; however, respective preferences for foreign stocks may be substantially different. Since we assess German private investors’ preferences for German (domestic) as well as US (foreign) stocks, a comparison between domestic and foreign preferences for holdings is possible.

In contrast to Kumar (2009), we do not discover significant pricing differentials with regard to lottery-like stocks. This is not overly surprising as (sophisticated) investors acquire knowledge about mispricing through academic publications; as rational investors trade against mispricing, the effect decays or disappears (McLean and Pontiff 2016). However, evidence exists that lottery-like stocks as defined by Bali et al. (2011) continue to underperform their counterparts. As these stocks have high levels of idiosyncratic risk, repressed arbitrage may cause a more persistent mispricing as indicated in the literature (McLean and Pontiff 2016; Pontiff 2006, 1996; Treynor and Black 1973).

Analyzing aggregate holdings of the private sector, we find evidence that German private investors significantly overinvest in stocks with lottery-like characteristics as defined by Kumar (2009). Further, German private investors only overinvest in German lottery-like stocks as defined by Bali et al. (2011). We reconcile private investors’ preferences for a subgroup of domestic (i.e., German) stocks with seemingly different preferences for a similar subgroup in a foreign (i.e., US) equity market by pointing to the interrelation of familiarity and risk perception (Heath and Tversky 1991) as well as ambiguity aversion (Ahn et al. 2014; Baltzer et al. 2015; Bossaerts et al. 2010; Boyle et al. 2012; Fox and Tversky 1995).

Using the dataset with German and US stocks, the results from our regression analysis show that German private investors have preferences for low-priced stocks as well as for stocks with high (idiosyncratic) volatility. Furthermore, we find evidence that private investors gravitate toward stocks with high maximum daily returns.

In contrast to other studies, we do not find consistent evidence that (idiosyncratic) skewness drives private sector investments (Brunnermeier and Oehmke 2013; Kane 1982; Kumar et al. 2019; Mitton and Vorkink 2007). As private investors have limited capabilities with regard to perceiving and processing information (Kahneman 1973), they may struggle to identify higher distribution moments like skewness. However, they may more easily identify features like price or maximum daily returns, and thus these features are reflected in their aggregate holdings.

The remainder of this article is structured as follows. In the “Literature review” section, we provide a short review of the related literature. In the “Data and methodology” section, we describe our data and methodological approach. Subsequently, in the “Results and discussion” section, we present and discuss the results. The “Conclusion” section concludes.

Literature review

Standard neoclassical finance theory (Markowitz 1952; Sharpe 1964) fails to explain why investors engage in excessive trading, which impairs their performance (Barber and Odean 2001, 2000; Odean 1999), or why investors participate in negative-sum games such as lotteries (Ariyabuddhiphongs 2011; Davis et al. 1992). In this context, Statman (2002) discusses several behavioral effects which may explain why private investors display such seemingly irrational patterns.

Within the available investment universe, Kumar (2009) argues that stocks that simultaneously have a low share price, high (idiosyncratic) volatility, and high (idiosyncratic) skewness resemble lottery tickets and thus are especially appealing to private investors.

Other studies have addressed the importance of skewness for asset pricing (Arditti 1971, 1967; Barone-Adesi 1985; Kraus and Litzenberger 1976; Sears and Wei 1985). Prior to Kumar’s (2009) publication, Barberis and Huang (2008) postulated that investors who behaved according to Tversky and Kahneman’s (1992) cumulative prospect theory were inclined to overweight low probability events and thus had a corresponding preference for positively skewed stocks. Furthermore, investors’ preferences for skewness are addressed by Kane (1982), Brunnermeier et al. (2007), Mitton and Vorkink (2007), and Kumar et al. (2019). Directly addressing lotteries, Garrett and Sobel (1999) provide evidence that the skewness of prize distributions may explain why risk-averse individuals accept unfair gambles. Dorn and Huberman (2010) describe the preferences of private investors for volatile stocks. Evidence for retail investors’ preferences for low-priced stocks is provided by Kumar and Lee (2006).

Introducing a more viable definition, Bali et al. (2011) characterize lottery-like stocks in terms of extreme daily returns. Similar to Kumar (2009), Bali et al. (2011) show a statistically significant underperformance of lottery-like stocks in comparison to their counterparts.

The studies on private investors have widely discussed biases such as overconfidence (Barber and Odean 2001, 2000; Odean 1999) and attention (Barber and Odean 2008; Da et al. 2011) and their corresponding effects on performance.Footnote 1

If private investors preferred lottery-like payoffs and thus overinvested in lottery-like stocks, their overall portfolio performance would suffer. First, by (substantially) overweighting a subgroup of assets, investors deviate from the market portfolio—the optimal investment choice in neoclassical finance (Sharpe 1964)—and thus forgo diversification benefits. Second, overinvestment in stocks that significantly underperform their counterparts deteriorates performance (Bali et al. 2011; Kumar 2009).

Fong (2013) argues that individuals are drawn to lottery-like stocks because they seek risk and are prone to sentiment. While risk-averse investors generally avoid lottery-like stocks, risk seekers are strongly attracted to this category when their sentiment is positive. When sentiment wanes, the preference reverses. Investigating the characteristics of stocks with a high proportion of retail trading, Han and Kumar (2013) find strong lottery-like features. Furthermore, Han and Kumar (2013) indicate that collectively speculative retail trading has an effect on stock prices.

Regarding institutional investors, Kumar (2009) shows a collective underinvestment in lottery-like stocks. Agarwal et al. (2019) show that certain institutional investors, that is, actively managed US equity funds, might be prone to investing in lottery-like stocks, the reasons being their catering to investor preferences as well as shifting risk. Hsu et al. (2016) examine whether lottery-like characteristics affect institutional participation in share allocation around seasoned equity offerings (SEO) as well as the issuing firms' post-issue long-run performance: firms with lottery-like characteristics have lower pre-SEO levels of institutional ownership. However, regarding these particular firms, SEOs result in a sharp increase in institutional ownership. Moreover, lottery-like characteristics are negatively associated with long-run performance after the SEO’s issue.

Data and methodology

Stock market data

As the basis for the empirical analysis, we built a dataset containing German and US equities. Only stocks with data for at least 7 months were considered.

We use the CDAX, a broad German stock index that comprises all prime and general standard equities, as a proxy for the German stock market. To create a dataset (relatively) free of survivorship bias, we obtained data on monthly index compositions from Thomson Reuters Datastream (Datastream) for the period from July 2000 to August 2020.Footnote 2 Subsequently, we consolidated all International Securities Identification Numbers (ISINs) and removed any duplicates. The consolidation led to 1059 different ISINs for the period from January 1990 to August 2020, corresponding to individual companies that were in the CDAX between July 2000 and August 2020. Daily and monthly returns were calculated with Datastream’s total return index; time series data were queried for the period from January 1990 to August 2020. To calculate monthly returns, we used each stock’s daily total return index on the first and the last day of the respective month. Monthly values for share price and market capitalization were obtained by averaging the corresponding daily values.
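As a minimal sketch of the monthly aggregation described above, the following Python snippet derives a monthly return from the first and last daily total return index values of a month and a monthly price as the mean of the daily prices. The function name `monthly_stats` and the input layout are hypothetical and only serve as an illustration; the sketch assumes that the daily observations are already sorted by date.

```python
from collections import defaultdict
from statistics import mean


def monthly_stats(daily):
    """Aggregate daily observations of one stock to monthly values.

    daily: list of (date 'YYYY-MM-DD', total_return_index, price),
           assumed to be sorted by date (illustrative layout).
    Returns {'YYYY-MM': (monthly_return, mean_price)}.
    """
    by_month = defaultdict(list)
    for date, tri, price in daily:
        by_month[date[:7]].append((tri, price))

    out = {}
    for month, rows in by_month.items():
        first_tri = rows[0][0]   # index level on the first trading day
        last_tri = rows[-1][0]   # index level on the last trading day
        monthly_return = last_tri / first_tri - 1.0
        out[month] = (monthly_return, mean(p for _, p in rows))
    return out
```

The same averaging step could be applied to daily market capitalization.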

We merged this dataset with the market, size, and book-to-market factors (Fama and French 1993) as well as the momentum factor (Carhart 1997) and obtained the daily and monthly factors from the Kenneth French Data Library (KFDL); as factor data are geographically based and thus provided for different regions, we applied the corresponding factors for Europe.

For US stocks, we used daily as well as monthly data from the Center for Research in Security Prices (CRSP). We included all stocks in the CRSP universe that were listed on one of the three major US exchanges: NYSE, AMEX, or NASDAQ. As CRSP data have the advantage of being free of survivorship bias, no further adjustments were necessary. Factors for North America are, in turn, obtained from the KFDL.

The market, size, and book-to-market factors (Fama and French 1993) are available from July 1990; data for the momentum factor (Carhart 1997) starts in November 1990.

Portfolio sorts

In this subsection we describe the construction of different portfolios which are subsequently used to assess private sector holding preferences. As CRSP data were provided in USD, we converted all CDAX data into USD to eliminate currency effects.

As in Kumar (2009), each month we form three distinct portfolios based on idiosyncratic volatility, idiosyncratic skewness, and price: \(Lottery\), \(NonLottery\), and \(Others\). To compute monthly idiosyncratic volatility, we follow Kumar (2009) and use the standard deviation of the residuals from applying the four-factor model (Carhart 1997) to the time series of the respective daily stock returns. We run the regression on the daily stock returns of the previous 6 months (i.e., months \(t - 6\) to \(t - 1\)). For the computation of monthly idiosyncratic skewness, we likewise follow Kumar (2009) and apply the Harvey and Siddique (2000) method. In this context, idiosyncratic skewness is measured as the third moment of the residuals obtained by regressing daily stock returns on a two-factor model of the market excess return and the square of the market excess return. As before, we obtain the residuals by running the regression on the daily stock returns of the previous 6 months.

The \(Lottery\) portfolio contains stocks in the lowest \(k{\text{th}}\) price percentile (measured as the average price in the previous month), the highest \(k{\text{th}}\) idiosyncratic volatility percentile, and the highest \(k{\text{th}}\) idiosyncratic skewness percentile. As in Kumar (2009), for the major part of the analysis, we choose \(k = 50\), so that these stocks have above median idiosyncratic volatility, above median idiosyncratic skewness, and below median price. We identify these stocks as lottery-like. In contrast to the \(Lottery\) portfolio, the \(NonLottery\) portfolio is composed of stocks that are assigned to the highest \(k{\text{th}}\) stock price percentile, the lowest \(k{\text{th}}\) idiosyncratic volatility percentile, and the lowest \(k{\text{th}}\) idiosyncratic skewness percentile, that is, stocks featuring below median idiosyncratic volatility, below median idiosyncratic skewness, and above median price. The portfolio labeled \(Others\) comprises all stocks that are neither in the \(Lottery\) nor in the \(NonLottery\) portfolio.
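The three-way classification for \(k = 50\) can be illustrated with a short Python sketch. It assumes that price, idiosyncratic volatility, and idiosyncratic skewness have already been computed for each stock; the function name `classify_kumar`, the input layout, and the treatment of values exactly at the median (assigned to \(Others\)) are illustrative assumptions.

```python
from statistics import median


def classify_kumar(stocks):
    """Assign each stock to 'Lottery', 'NonLottery', or 'Others' (k = 50).

    stocks: dict ticker -> (price, ivol, iskew), all measured over the
            respective look-back window (illustrative layout).
    """
    price_med = median(v[0] for v in stocks.values())
    ivol_med = median(v[1] for v in stocks.values())
    iskew_med = median(v[2] for v in stocks.values())

    labels = {}
    for ticker, (price, ivol, iskew) in stocks.items():
        if price < price_med and ivol > ivol_med and iskew > iskew_med:
            labels[ticker] = "Lottery"      # cheap, volatile, positively skewed
        elif price > price_med and ivol < ivol_med and iskew < iskew_med:
            labels[ticker] = "NonLottery"   # the mirror image
        else:
            labels[ticker] = "Others"       # everything in between
    return labels
```

Because a stock must satisfy all three conditions at once, the \(Lottery\) and \(NonLottery\) portfolios are typically much smaller than half of the universe.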

Furthermore, we use another definition of lottery-like stocks: We follow Bali et al. (2011) who define stocks with extreme past daily returns as lottery-like. Stocks are sorted based on the constituent maximum daily return over the previous month. Stocks in the highest \(k{\text{th}}\) percentile, that is, stocks with the highest daily return over the previous month, are categorized as lottery-like. Similarly, stocks in the lowest \(k{\text{th}}\) percentile are classified as nonlottery-like. The corresponding portfolios are labeled \(Max\) and \(NonMax\). As a variation, decile portfolios are formed based on the average of the five highest daily returns in the previous month. Accordingly, stocks in the highest and lowest \(k{\text{th}}\) percentiles are categorized as lottery-like (\(Max5\)) and nonlottery-like (\(NonMax5\)). In accordance with Bali et al. (2011), we set \(k = 10\).
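The two sorting measures and the percentile assignment can be sketched in Python as follows. The function names `max_measures` and `percentile_labels`, the label strings, and the simple rank-based cutoff are illustrative assumptions rather than a description of the exact procedure applied.

```python
def max_measures(daily_returns):
    """Return (maximum daily return, mean of the five highest daily returns)
    for one stock's daily returns in the previous month."""
    ordered = sorted(daily_returns, reverse=True)
    top5 = ordered[:5]
    return ordered[0], sum(top5) / len(top5)


def percentile_labels(values, k=10):
    """Label the top k percent 'Max' and the bottom k percent 'NonMax'.

    values: dict ticker -> sorting measure (e.g., maximum daily return).
    """
    ranked = sorted(values, key=values.get)      # ascending by measure
    n = max(1, len(ranked) * k // 100)           # stocks per extreme group
    labels = {ticker: "Neither" for ticker in ranked}
    for ticker in ranked[:n]:
        labels[ticker] = "NonMax"
    for ticker in ranked[-n:]:
        labels[ticker] = "Max"
    return labels
```

Passing the second element of `max_measures` to `percentile_labels` would yield the analogous assignment for the \(Max5\) and \(NonMax5\) portfolios.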

In order to analyze the preferences of the German private sector regarding lottery-like characteristics on a broader level, we construct several more portfolios. In this context, we sort portfolios on Kumar’s (2009) constituent characteristics of lottery-like stocks. The resulting portfolios are as follows: low/high price (\(LPrice\)/\(HPrice\)), high/low total volatility (\(HTVol/LTVol\)), high/low idiosyncratic volatility (\(HIVol/LIVol\)), high/low total skewness (\(HTSkew/LTSkew\)), and high/low idiosyncratic skewness (\(HISkew/LISkew\)). Stocks in the highest/lowest \(k{\text{th}}\) percentile of each sorting criterion are assigned to the corresponding portfolio. When sorting portfolios on one criterion, we set \(k = 10\).

Furthermore, portfolios are simultaneously sorted by using various combinations of the (constituent) characteristics of lottery-like stocks. Hence, we construct additional portfolios based on low/high price and high/low total volatility (\(LPrice\& HTVol/HPrice\& LTVol\)), low/high price and high/low idiosyncratic volatility (\(LPrice\& HIVol/HPrice\& LIVol\)), low/high price and high/low total skewness (\(LPrice\& HTSkew/HPrice\& LTSkew\)), low/high price and high/low idiosyncratic skewness (\(LPrice\& HISkew/HPrice\& LISkew\)), high/low total volatility and high/low total skewness (\(HTVol\& HTSkew/LTVol\& LTSkew\)), and high/low idiosyncratic volatility and high/low idiosyncratic skewness (\(HIVol\& HISkew/LIVol\& LISkew\)). Stocks in the highest or lowest \(k{\text{th}}\) percentile are assigned to the corresponding portfolio. When sorting portfolios on two criteria, we set \(k = 25\).

Given this methodology, there are overlaps among several of the constructed portfolios in which stocks may be assigned to various portfolios at the same time. Summary statistics for all portfolios are displayed in Table 1.

Table 1 Summary statistics

Performance analysis

We conduct a performance analysis on all the portfolios. In this context, we compute the mean monthly raw returns by averaging the value-weighted monthly returns for each portfolio. Additionally, the performance is measured via risk-adjusted returns, which are calculated as the regression intercept (\(\alpha\)) from Carhart’s (1997) four-factor model:

$$R_{i,t} = RF_{t} + \beta_{1} \times RMRF_{t} + \beta_{2} \times SMB_{t} + \beta_{3} \times HML_{t} + \beta_{4} \times WML_{t} + \alpha.$$
(1)

\(R_{i,t}\) denotes the value-weighted return of portfolio \(i\), \(RF_{t}\) is the risk-free return, and \(RMRF_{t}\) represents the return of the market portfolio net of the risk-free return for month \(t\). \(SMB\) and \(HML\) reflect the size and book-to-market factors as described by Fama and French (1993). \(WML\) is a factor that captures momentum as identified by Jegadeesh and Titman (1993). As described, stock market data were obtained from January 1990. Since the Carhart (1997) factors were used to compose some of the portfolio sorting criteria, factor data availability marks the inception of the respective conducted analyses (see “Stock market data” section).
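For illustration, the intercept of such a four-factor regression can be estimated by ordinary least squares on portfolio returns net of the risk-free return. The following Python sketch is a minimal, dependency-free version; the helper names `ols` and `carhart_alpha` are hypothetical, and solving the normal equations by Gauss-Jordan elimination is a simplification chosen for self-containment, not a statement about the estimation software actually used.

```python
def ols(y, X):
    """Ordinary least squares with an intercept.

    y: list of floats; X: list of rows of regressors.
    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal equations: (Z'Z | Z'y).
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def carhart_alpha(portfolio_ret, rf, rmrf, smb, hml, wml):
    """Monthly alpha: intercept of excess returns on the four factors."""
    excess = [r - f for r, f in zip(portfolio_ret, rf)]
    factors = list(zip(rmrf, smb, hml, wml))
    return ols(excess, factors)[0]
```

The residuals of the same type of regression, run on daily returns, would be the input for an idiosyncratic volatility measure.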

The results for the portfolios sorted according to Kumar (2009) and Bali et al. (2011) are displayed in Table 2. The results for the remaining portfolios described in the previous subsection are displayed in Tables 5 and 6 of Appendix.

Table 2 Value-weighted portfolio returns

Considering all the constructed portfolios, statistically significant mispricing is rare. In contrast to Kumar (2009), we do not find consistent evidence that the \(Lottery\) portfolio significantly underperforms. The \(Lottery\) portfolio of the US market shows an alpha of −0.39% per month, albeit with only weak statistical significance at the 5% level (see Table 2 Panel B (1)). For German lottery-like stocks, we do not find evidence of any underperformance (see Table 2 Panel A, (1) to (3)). Pricing differentials are insignificant in both markets. Regarding the \(Max\) and \(Max5\) portfolios, the underperformance found by Bali et al. (2011) still prevails in both the US and the German stock markets.Footnote 3 Further evidence of mispricing for stocks with extreme maximum daily returns is provided by Annaert et al. (2013). Yet, in the US market the economic magnitude of the effect, as well as its statistical significance, is weaker than reported by Bali et al. (2011). To explain why mispricing decays rather than disappears after its publication, McLean and Pontiff (2016) point to frictions that hinder arbitrage from completely eliminating the effect.Footnote 4 \(Max\) and \(Max5\) stocks have very high levels of idiosyncratic risk represented by idiosyncratic volatility (see Table 1). As reported in the research, idiosyncratic risk restrains the amount investors are willing to invest in mispriced assets, thereby inhibiting arbitrage (McLean and Pontiff 2016; Pontiff 2006, 1996; Treynor and Black 1973). Thus, regarding the \(Max\) and \(Max5\) portfolios, mispricing may be fairly persistent (Annaert et al. 2013).

Considering all other portfolios sorted, we report evidence of a statistically significant underperformance of high-risk stocks, that is, stocks with simultaneously high (idiosyncratic) volatility and high (idiosyncratic) skewness, in both markets. In this context, significant performance differentials may be attributed to the widely known low-volatility anomaly, which can be traced back to Black (1972) and Haugen and Heins (1975). Contradicting the Capital Asset Pricing Model (Sharpe 1964), the low-volatility anomaly states that low-risk assets, irrespective of the applied risk measure, have superior returns.Footnote 5

Private sector holdings

Securities holdings statistics data

Data on private sector holdings come from the SHS-base, which is a reasonable indicator for the distribution of listed securities among German households. The SHS-base is a collection of obligatory reports filed by all financial institutions domiciled in Germany to Deutsche Bundesbank (Bade et al. 2017). The available data contain quarterly observations from the fourth quarter of 2005 to the fourth quarter of 2012; each observation is from the end of the last month of the quarter. Starting in January 2013, the SHS-base changed to monthly observations. Monthly data are obtained up until June 2017. Accordingly, they reflect end-of-month security holdings. In the following, SHS-base data points are labeled security-month observations.

The reports comprise data on all debt securities, shares, and mutual funds stored at the reporting institutions that correspond to German households. Security holdings are reported by their ISIN. For each security-month observation, Deutsche Bundesbank provides the aggregated market value of the shares owned by German households that comprises the aggregated number of shares multiplied by the corresponding end-of-month market price in EUR. In contrast to discount and retail brokerage data which mirror portfolios of a corresponding client base, SHS-base aggregated market values are based on the shares owned by the entirety of German households. Hence, the SHS-base dataset, as applied in this analysis, gives information about the actual (unbiased) distribution of German private sector funds across the considered securities (Oehler and Wanger 2020).

In order to assess the holdings of the German private sector with regard to the previously described portfolios, we merge the SHS-base with the applied proxies for the German (CDAX) and the US (CRSP) stock markets. The CRSP data do not have security ISINs. Hence, SHS-base data cannot be directly merged with the CRSP dataset. Applying ticker symbols as common identifiers, we access Datastream to obtain ISINs for the corresponding CRSP securities. Matching CRSP securities with ISINs proves to be rather difficult. When merging the CRSP dataset (supplemented by all accessible ISINs) with SHS-base aggregated market values for the private sector, only about half of all security-month observations can be matched. The poor matching results are explained by the difficulties in acquiring ISINs for CRSP securities as well as the particular composition of the CRSP database. Regarding the latter, CRSP has a variety of securities that correspond to relatively unknown US companies that are unlikely to be a pertinent part of German private sector holdings.Footnote 6 Hence, we address this issue by using the S&P1500 as an alternative proxy for the US stock market, which leads to vastly superior matching results. S&P1500 data are in turn obtained from Datastream.Footnote 7

Unexpected portfolio weights

In this subsection, we assess if the German private sector, as mirrored by SHS-base data, disproportionally invests in any of the previously described portfolios. In this context, we construct the unexpected portfolio weights (\(EW_{p,t}^{h}\)) which are composed as follows:

$$EW_{p,t}^{h} = \frac{{w_{p,t}^{h} - w_{p,t}^{m} }}{{w_{p,t}^{m} }} \times 100 ,$$
(2)

where \(w_{p,t}^{h}\) is the relative weight of portfolio \(p\) held by the private sector in month \(t\) in relation to all corresponding private sector holdings; accordingly, \(w_{p,t}^{m}\) is the relative market weight of portfolio \(p\) in month \(t\). The unexpected portfolio weights are computed separately for the proxies for the German (CDAX) and the US (S&P1500) stock markets. The relative private sector weight is constructed as the funds assigned to the respective portfolio divided by all funds assigned to German and US stocks for which SHS-base data are available. Accordingly, the relative market weight is constructed as the market value of the respective portfolio divided by the total market value of all German and US stocks; stock-month observations that cannot be matched to SHS-base data are excluded when constructing the relative portfolio market weights. The results are displayed in Table 3.
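To make the measure concrete: if a portfolio accounts for 10% of the matched market value but for 30% of the matched private sector holdings, the unexpected weight is \((0.30 - 0.10)/0.10 \times 100 = 200\). A minimal Python sketch of this computation for one month follows; the function name `excess_weight` and the dictionary-based input layout are illustrative assumptions, and securities without a match in both data sources are dropped before the weights are formed.

```python
def excess_weight(private_value, market_value, portfolio):
    """Unexpected portfolio weight (in percent) for one month.

    private_value: dict ISIN -> aggregated private sector holdings
    market_value:  dict ISIN -> market capitalization (same currency)
    portfolio:     set of ISINs assigned to the portfolio of interest
    """
    # Only securities present in both sources enter either weight.
    matched = private_value.keys() & market_value.keys()
    in_portfolio = matched & portfolio

    w_h = (sum(private_value[i] for i in in_portfolio)
           / sum(private_value[i] for i in matched))
    w_m = (sum(market_value[i] for i in in_portfolio)
           / sum(market_value[i] for i in matched))
    return (w_h - w_m) / w_m * 100.0
```

A positive value indicates overweighting relative to market capitalization, a negative value underweighting.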

Table 3 Weighting

Regression analysis

Furthermore, we use a regression analysis to assess the preferences of the private sector for lottery-like characteristics. Following Goetzmann and Kumar (2008) and Kumar (2009), we apply the unexpected weight allocated to each stock as the dependent variable. The measure is constructed as follows:

$$EW_{i,t}^{h} = \frac{{w_{i,t}^{h} - w_{i,t}^{m} }}{{w_{i,t}^{m} }} \times 100 ,$$
(3)

where \(w_{i,t}^{h}\) is the relative weight of stock \(i\) held by the private sector in month \(t\) in relation to all corresponding private sector holdings; \(w_{i,t}^{m}\) depicts the relative market weight of stock \(i\) in month \(t\).Footnote 8 The baseline model for the regression analysis is as follows:

$$\begin{aligned} EW_{i,t}^{h} & = \alpha + \beta_{1} \times Vol + \beta_{2} \times Skew + \beta_{3} \times Price + \beta_{4} \times DDomestic + \beta_{5} \times lnMCap \\ & \quad + \beta_{6} \times SSkew + \beta_{7} \times RMax + \beta_{8} \times R + \varepsilon \\ \end{aligned}$$
(4)

All variables refer to stock-month observations. \(Vol/Skew\) depicts (idiosyncratic) volatility/skewness that is measured using the daily returns of the previous month and previous 6 months; \(Price\) is the stock price during the previous month; \(DDomestic\) is a dummy variable that equals one if the corresponding stock is listed in the CDAX; \(lnMCap\) is the natural logarithm of the corresponding firm’s market capitalization during the previous month; \(SSkew\) is the systematic skewness that is measured by using the daily returns of the previous month and previous 6 months; \(RMax\) is the maximum daily return attained in the previous month; and \(R\) is the monthly return over the previous month.

Furthermore, we report the results for the following regression model:

$$\begin{aligned} EW_{i,t}^{h} & = \alpha + \beta_{1} \times DVol + \beta_{2} \times DSkew + \beta_{3} \times DVolSkew + \beta_{4} \times DPrice + \beta_{5} \times DPriceVol \\ & \quad + \beta_{6} \times DPriceSkew + \beta_{7} \times DPriceVolSkew + \beta_{8} \times DDomestic + \beta_{9} \times DRMax + \varepsilon . \\ \end{aligned}$$
(5)

where \(DVol/DSkew\) is a dummy variable that equals one if the corresponding stock’s (idiosyncratic) volatility/skewness, measured by using the daily returns of the previous month and previous 6 months, is within the highest \(k{\text{th}}\) percentile of its domestic market; \(DPrice\) is a dummy variable that equals one if the corresponding stock’s price during the previous month is within the lowest \(k{\text{th}}\) percentile. \(DVolSkew\) is a dummy variable that equals one if the corresponding stock is simultaneously within the highest \(k{\text{th}}\) percentile with regard to both the volatility and the skewness measures. \(DPriceVol\) and \(DPriceSkew\) are dummy variables that equal one if the corresponding stock is simultaneously within the lowest \(k{\text{th}}\) price percentile and the highest \(k{\text{th}}\) percentile with regard to the volatility or the skewness measure, respectively. \(DPriceVolSkew\) is a dummy variable that equals one if the corresponding stock is simultaneously in the lowest \(k{\text{th}}\) price percentile, the highest \(k{\text{th}}\) (idiosyncratic) volatility percentile, and the highest \(k{\text{th}}\) (idiosyncratic) skewness percentile. \(DRMax\) is a dummy variable equal to one if the stock is within the highest \(k{\text{th}}\) percentile with regard to the maximum daily return of the previous month.
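The construction of the characteristic dummies and their interactions can be sketched in Python as follows. The function names `percentile_cutoff` and `build_dummies`, the default \(k = 50\), and the rank-based cutoff rule are illustrative assumptions; the sketch treats the supplied stocks as one domestic market in one month and represents each interaction dummy as the product of its component dummies.

```python
def percentile_cutoff(values, k, high=True):
    """Threshold such that roughly the top (or bottom) k percent qualify."""
    ordered = sorted(values)
    n = max(1, len(ordered) * k // 100)
    return ordered[-n] if high else ordered[n - 1]


def build_dummies(stocks, k=50):
    """stocks: dict ticker -> {'vol': ..., 'skew': ..., 'price': ...}."""
    vol_cut = percentile_cutoff([s["vol"] for s in stocks.values()], k)
    skew_cut = percentile_cutoff([s["skew"] for s in stocks.values()], k)
    price_cut = percentile_cutoff(
        [s["price"] for s in stocks.values()], k, high=False)

    out = {}
    for ticker, s in stocks.items():
        d_vol = int(s["vol"] >= vol_cut)        # high volatility
        d_skew = int(s["skew"] >= skew_cut)     # high skewness
        d_price = int(s["price"] <= price_cut)  # low price
        out[ticker] = {
            "DVol": d_vol,
            "DSkew": d_skew,
            "DPrice": d_price,
            "DVolSkew": d_vol * d_skew,
            "DPriceVol": d_price * d_vol,
            "DPriceSkew": d_price * d_skew,
            "DPriceVolSkew": d_price * d_vol * d_skew,
        }
    return out
```

With this encoding, the coefficient on \(DPriceVolSkew\) captures the additional unexpected weight of stocks that combine all three lottery-like characteristics, beyond what the single and pairwise dummies explain.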

All variables are displayed and summarized in Table 8 of Appendix. The results are reported in Table 4.

Table 4 Regression analysis

Results and discussion

Weighting

Our results presented in Table 3 show that German private investors overinvest in stocks with lottery-like characteristics. The results are in line with the research that has reported that private investors have a strong preference for stocks with lottery-like features (Bali et al. 2017; Doran et al. 2012; Han and Kumar 2013; Kumar 2009; Kumar and Lee 2006). The German private sector overweights both domestic and foreign lottery-like stocks as defined by Kumar (2009). The exposure to domestic lottery-like stocks is 107% higher (see column (7)) and the exposure to US lottery-like stocks is 25% higher (see column (14)) than justified by the stocks’ market capitalization. However, the households only overinvest in the domestic \(Max\) and \(Max5\) portfolios as defined by Bali et al. (2011).

German private investors marginally overweight the domestic \(NonLottery\) portfolio but seem to underinvest in the foreign \(NonLottery\) portfolio. Furthermore, they underweight the domestic \(NonMax\) and \(NonMax5\) portfolios. Stocks with relatively low maximum daily returns, that is, stocks without large (positive) outliers, are unlikely to capture (extra) attention from private investors (Barber and Odean 2008; Odean 1999). Thus, the underinvestment in stocks with low maximum daily returns may be driven by this lack of attention. As argued by Dorn and Sengmueller (2009), private investors to some extent consider trading as entertainment. Therefore, stocks assigned to the \(NonMax\) and \(NonMax5\) portfolios may be unpopular choices as they do not trigger investors’ excitement. However, in contrast to their domestic equivalents, foreign \(NonMax\) and \(NonMax5\) stocks appear to be overweighted by private investors. For the German \(Max\) and \(Max5\) portfolios, the mean of the relative market weight (\(w_{p,t}^{m}\)) exceeds the mean of the relative household portfolio weight (\(w_{p,t}^{h}\)), yet the mean of the excess weight (\(EW_{p,t}^{h}\)) indicates an average overinvestment (see Table 3, Panel A). This can be attributed to two positive outliers in the excess weight, and the pattern therefore does not appear to be robust.

Differences with regard to relative weights assigned to a domestic portfolio and its foreign counterpart may be driven by the interrelation of familiarity and risk perception (Heath and Tversky 1991). It is well documented that investors are subject to ambiguity aversion (Ahn et al. 2014; Baltzer et al. 2015; Bossaerts et al. 2010; Boyle et al. 2012; Fox and Tversky 1995). Due to their geographic remoteness, distant stocks correspond to a greater sense of unfamiliarity and thus investors perceive them as being riskier (Baltzer et al. 2015; Goetzmann and Kumar 2008; Huberman 2001).Footnote 9 In this context, when investing abroad, investors may be drawn to stocks which have low levels of idiosyncratic risk.

Moreover, in light of Shefrin and Statman’s (2000) Behavioral Portfolio Theory, a simultaneous preference for certain high-risk stocks and their low-risk counterparts is not contradictory; as investors segregate their portfolios into mental accounts that correspond to different aspirations, assets at both ends of the risk spectrum may appear as suitable investment choices (Oehler et al. 2018a; Oehler and Horn 2021, 2019).

Furthermore, our results with regard to US stocks may be partly driven by the market proxy. As described in the “Portfolio sorts” section, lottery-like stocks are defined in relative terms (Bali et al. 2011; Kumar 2009). Although it captures a large portion of US market capitalization, our proxy for the US market, the S&P1500, only includes a fraction of available US equities. We acknowledge that the applied benchmark may distort the classification of US lottery-like stocks.

Regarding disproportional investments, private investors substantially overinvest in low-priced stocks as well as in stocks with high levels of (idiosyncratic) volatility. In contrast, they underweight stocks with a high level of idiosyncratic skewness. The results are displayed in Table 7 (Panel A) of the Appendix.

Further, private investors overweight the portfolio that contains low-priced stocks which simultaneously have high levels of (idiosyncratic) volatility. Moreover, private investors overinvest in the portfolio that contains low-priced stocks which simultaneously have high levels of (idiosyncratic) skewness. They also overweight the domestic portfolio that contains high (idiosyncratic) volatility and high (idiosyncratic) skewness stocks; regarding its foreign counterpart, there is no evidence of a significant disproportional investment. The results are displayed in Table 7 (Panel B) of the Appendix.

Regression analysis

The results of the regression analyses are displayed in Table 4. In line with Kumar (2009), we find evidence that private investors prefer low-priced stocks and stocks with high (idiosyncratic) volatility.

The regression model depicted in Eq. (5) yields significantly positive coefficients for \(PriceVolSkew\) and \(DRMax\), which reflect lottery-like characteristics according to Kumar (2009) and Bali et al. (2011), respectively. These coefficients indicate that private investors prefer stocks that meet the established definitions of lottery-like stocks.

Surprisingly, we do not find consistent evidence that (idiosyncratic) skewness drives overinvestment in the private sector; the results are very inconsistent across the applied regression models. This is in contrast to the theoretical and empirical literature which highlights the importance of skewness with regard to investors’ preferences (Brunnermeier and Oehmke 2013; Kane 1982; Kraus and Litzenberger 1976; Kumar et al. 2019; Mitton and Vorkink 2007).

There are several factors which may drive the obtained results.Footnote 10 As individuals have inherently limited capabilities to perceive and process information (Kahneman 1973), the assumption that private investors are sufficiently able to assess a stock’s (idiosyncratic) skewness appears to be rather strong. Even when financial literacy among investors is generally high, identifying and evaluating skewness may pose a challenge. In line with this argument, van Rooij et al. (2011) find that financial literacy is predominantly limited to basic knowledge.Footnote 11 Share price and (idiosyncratic) volatility are features that may be identified much more easily by private investors. Accordingly, regarding the price and the idiosyncratic volatility feature, the conducted regression analysis yields unambiguous results. Furthermore, investors’ expected skewness may not exactly match the applied skewness measures which are based on past daily returns.Footnote 12 Drerup et al. (2022) assess heterogeneity in skewness expectations, providing evidence that individuals disagree on the magnitude of skewness as well as on its sign.Footnote 13 In this study, we extrapolate past return skewness (\(Skew_{t - 1} /Skew_{t - 6}^{t - 1}\)) into the future (Barberis et al. 2016; Kumar 2009). While this is a reasonable proxy, the approach may not directly capture private investors’ skewness expectations. Moreover, as skewness may not be persistent over time (Adcock and Shutes 2005; DeFusco et al. 1996; Harvey and Siddique 1999; Singleton and Wingender 1986), investors may exhibit preferences for skewness when choosing stocks, but (at the aggregate level) may not rebalance their portfolios when stock and/or portfolio characteristics change (Calvet et al. 2009). The latter behavior might even be beneficial for households since excessive trading and rebalancing might considerably hamper their investment performance (Anderson 2005; Barber and Odean 2000; Bauer et al. 2007; Horn and Oehler 2020). Finally, the observation period coincides with the emergence of financial market innovations that are popular among private investors, which may affect the reported results. These innovations include Contracts for Difference (CFDs) as well as various forms of Social Trading. CFDs are leveraged financial instruments which enjoy popularity among private investors. As they allow investors to take highly levered positions in financial instruments without taking actual physical positions, their nature is highly speculative (Brown et al. 2010; Corbet and Twomey 2014; Lee and Choy 2014; Twomey and Corbet 2014). Social Trading is a social network-based innovation where private investors may delegate their investment decisions to other private investors (Horn et al. 2020; Oehler et al. 2016). A first attempt to study gambling behavior in the context of Social Trading is made by Schneider and Oehler (2021). Popular Social Trading platforms like eToro (www.etoro.com) and ZuluTrade (www.zulutrade.com) additionally offer CFD trading. Given these new possibilities, private investors may no longer rely on stocks in order to add skewness to their overall portfolios.

Economic significance

As in other studies, we find that private investors on an aggregate level overinvest in stocks with lottery-like features. The statistical significance of this disproportional investment is high. Yet, due to the small overall size of the \(Lottery\), \(Max\), and \(Max5\) portfolios, the economic effect is limited. From October 2005 until June 2017, the mean market value of the German \(Lottery\) portfolio is 6.1 billion EUR or 7.8 billion USD. The German \(Lottery\) portfolio, on average, accounts for 0.5% of the total market capitalization of the CDAX. Thus, on an aggregate level, investors would be expected to assign 0.5% of their funds designated for domestic equities to the \(Lottery\) portfolio. Yet, the average weight assigned to the domestic \(Lottery\) portfolio is 1.0%. As German private investors hold 145.2 billion EUR in domestic stocks on the aggregate level, the expected aggregate investment in the German \(Lottery\) portfolio is 726 million EUR. As the actual funds assigned to the \(Lottery\) portfolio are about twice as high (about 1,452 million EUR), the average aggregate overinvestment is also about 726 million EUR. Considering the entirety of German private investors, this aggregate overinvestment of 726 million EUR does not seem to be particularly relevant.
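The arithmetic behind these figures can be retraced with a short sketch. It uses only the rounded values reported in the text (0.5%, 1.0%, and 145.2 billion EUR), so the results are approximations; the variable names are illustrative.

```python
# Rounded inputs as reported in the text.
domestic_holdings_eur = 145.2e9  # aggregate domestic stock holdings of German private investors
market_weight = 0.005            # share of the Lottery portfolio in CDAX market capitalization
household_weight = 0.010         # average weight private investors assign to the Lottery portfolio

# Investment that would match the market weight versus the actual investment.
expected_investment = domestic_holdings_eur * market_weight
actual_investment = domestic_holdings_eur * household_weight
overinvestment = actual_investment - expected_investment

# Overinvestment relative to all domestic stock holdings.
share_of_holdings = overinvestment / domestic_holdings_eur

print(f"expected: {expected_investment / 1e6:,.0f} million EUR")
print(f"actual:   {actual_investment / 1e6:,.0f} million EUR")
print(f"excess:   {overinvestment / 1e6:,.0f} million EUR")
print(f"excess as share of domestic holdings: {share_of_holdings:.2%}")
```

With these rounded inputs, the household weight is twice the market weight, while the overinvestment amounts to roughly half a percent of aggregate domestic stock holdings. This is the contrast between a large relative effect and a small absolute effect.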

Considering our results, one could make the argument that German private investors hold substantial parts of their public equity investments in foreign lottery-like stocks which are listed in a country other than the USA. However, considering the previously discussed home bias phenomenon and the associated overall overinvestment in domestic assets (Cooper and Kaplanis 1994; French and Poterba 1991; Tesar and Werner 1995), this does not seem to be likely.

Thus, while in relative terms the aggregate overinvestment in stocks with lottery-like features may appear to be large, when considering the absolute invested funds, the effect appears relatively minor.

Conclusion

Since Kumar’s (2009) fundamental publication, a growing body of research has addressed stocks with lottery-like characteristics (Bali et al. 2017, 2011; Blau et al. 2016; Doran et al. 2012; Fong 2013; Gao and Lin 2015; Han and Kumar 2013; Hsu et al. 2016; Kumar and Page 2014; Meng and Pantzalis 2018).

When assessing lottery-like stocks as defined by Kumar (2009), we do not discover significant pricing differentials. This is not overly surprising as sophisticated investors acquire knowledge about mispricing through academic publications; as rational investors trade against mispricing, the effect decays or disappears (McLean and Pontiff 2016). In contrast, there is evidence that lottery-like stocks as defined by Bali et al. (2011) still tend to underperform their counterparts. As these stocks have high levels of idiosyncratic risk, repressed arbitrage and thus a more persistent mispricing are in line with prior studies (McLean and Pontiff 2016; Pontiff 2006, 1996; Treynor and Black 1973).

Taking into account aggregate private sector holdings (SHS-base), we find evidence that German private investors overinvest in stocks with lottery-like characteristics as defined by Kumar (2009). Further, German private investors only overinvest in domestic lottery-like stocks as defined by Bali et al. (2011). We attribute the preferences for a subgroup of domestic stocks and the seemingly differing preferences for a similar subgroup in a foreign equity market to the interrelation of familiarity and risk perception (Heath and Tversky 1991) and ambiguity aversion (Ahn et al. 2014; Baltzer et al. 2015; Bossaerts et al. 2010; Boyle et al. 2012; Fox and Tversky 1995).

We conduct a regression analysis and find evidence that private investors prefer low-priced stocks and those with high (idiosyncratic) volatility. Furthermore, our results show that private investors gravitate to stocks with high maximum daily returns. In contrast to the literature, we do not find evidence that (idiosyncratic) skewness drives the (over)investments of the private sector (Brunnermeier and Oehmke 2013; Kane 1982; Kumar et al. 2019; Mitton and Vorkink 2007). Taking into account limited capabilities to perceive and process information (Kahneman 1973), we argue that private investors may struggle to identify higher distribution moments like skewness. Features like price, (idiosyncratic) volatility, or maximum daily returns may be identified more easily and thus are reflected in the aggregate holdings of the private sector. Moreover, private investors may be subject to heterogeneous skewness expectations which are not captured by the applied proxies and/or may be reluctant to rebalance their portfolios when skewness characteristics change. In addition, given the rise of financial innovations like CFDs and Social Trading which enjoy great popularity, private investors may no longer rely on stocks to add skewness to their overall portfolios.

Finally, while in relative terms the aggregate overinvestment in stocks with lottery-like features may appear to be large, it has a relatively minor effect with regard to the absolute invested funds. Nonetheless, German private investors may still engage in excessive gambling in the financial market.