Can investor sentiment be used to predict the stock price? Dynamic analysis based on China stock market

https://doi.org/10.1016/j.physa.2016.11.114

Highlights

  • Investor sentiment data is obtained through user comments.

  • TOP methods are used to explore the dynamic lead–lag relationship.

  • Investor sentiment can be used to predict the stock price sometimes.

Abstract

With the development of social networks, interaction between investors in the stock market has become faster and more convenient. Thus, investor sentiment, which can influence investment decisions, may spread quickly and be magnified through the network, and to a certain extent the stock market can be affected. This paper collects user comment data from Xueqiu, a popular professional social networking site for the China stock market, and obtains investor sentiment data through semantic analysis. A dynamic analysis of the relationship between investor sentiment and the stock market is proposed based on the Thermal Optimal Path (TOP) method. The results show that the sentiment data do not always lead the stock market price, and that sentiment can be used to predict the stock price only when the stock has high investor attention.

Introduction

Under the Efficient Market Hypothesis (EMH), all relevant information is incorporated in the stock price, as each market participant is perfectly rational [1]. With the rapid development of financial markets in recent decades, more and more kinds of investors have participated in the stock market, and many financial anomalies that do not conform to the EMH have emerged. A growing body of evidence shows that investors are not fully rational. For example, they may be overconfident about the precision of private information [2]. As behavioral finance has matured, the influence of investors’ irrational factors on the stock market has attracted more attention [3]. The classical theoretical models of behavioral finance, such as DSSW, BSV, DHS and HS, studied the effect of investors’ irrational factors such as overconfidence, herd behavior and information asymmetry in the stock market from different perspectives [4], [5], [6], [7].

The stock market is a typical complex system with different types of agents [8]. Individual investors are one important kind of agent in the market, and their decisions may affect the movements of market price and volatility. With the development of social networks, interaction between investors has become faster and more convenient. Thus, investor sentiment, which can influence investment decisions, may spread quickly and be magnified through the network, and to a certain extent the stock market can be affected.

The traditional theoretical models of behavioral finance cannot fully explain how investor sentiment affects the stock market in a social network environment [4], [5], [6], [7]. However, much of the literature shows that investor sentiment, as represented by traditional exchange indicators, survey data and internet data, may play an important role in stock market investment. Baker and Wurgler constructed a sentiment index based on six proxies (trading volume, dividend premium, closed-end fund discount, the number of and first-day returns on IPOs, and the equity share in new issues) and found that waves of sentiment had effects on individual firms and on the stock market as a whole [9], [10]. Tetlock used daily content from a Wall Street Journal column to indicate investor sentiment and found that high media pessimism predicted downward pressure on market prices followed by a reversion to fundamentals, and that high or low pessimism predicted high market trading volume [11]. Brown and Cliff found that indirect measures of sentiment are related to direct measures such as surveys. Also, although sentiment levels were strongly correlated with contemporaneous market returns, sentiment had little predictive power for the near-term future stock market [12]. Schmeling used consumer confidence as a proxy for individual investor sentiment and studied its effect on stock returns in 18 industrialized countries. The results showed that, unlike the evidence from the US, sentiment negatively forecasted aggregate stock market returns on average across countries, and that the impact of sentiment on stock returns was higher for countries with less market integrity [13].
Zhang and Yang used an indirect investor sentiment index to study the relationship between investor sentiment and stock returns in China, and the results showed that investor sentiment was a systematic factor in forming stock prices through the impact of positive and negative changes in investor sentiment [14]. Joseph et al. found that online ticker searches served as a valid proxy for investor sentiment, and that in a sample of S&P 500 firms online search intensity reliably predicted abnormal stock returns and trading volume [15]. Bollen et al. analyzed the text content of daily Twitter feeds and measured mood along 6 dimensions as an indicator of public mood. They found that the accuracy of DJIA predictions could be significantly improved by the inclusion of some specific public mood dimensions but not others [16]. Preis et al. analyzed changes in Google query volumes related to finance and found patterns that may be interpreted as “early warning signs” of stock market moves [17]. Kristoufek proposed an approach to portfolio diversification using information on searched items from Google Trends, as search queries are correlated with stock riskiness [18].

There are more similar studies that use indirect or direct indicators to construct an investor sentiment index. With the rapid development of internet and big data technology, direct indicators are no longer confined to surveys. More direct indicators from web data such as online search, news and social networking sites are used to represent investor sentiment. However, the data used in previous studies, such as Twitter data, may not reflect only mood related to the stock market, since Twitter users may not be stock market investors. Also, almost all methods used in previous studies of the relationship between investor sentiment and the stock market are traditional econometric models, which may not be suitable for a dynamic and nonlinear complex system such as the stock market.

In this paper, based on user comment data from Xueqiu, a popular professional social networking site for the China stock market, the investor sentiment of each stock is calculated through semantic analysis. To answer the question of whether investor sentiment can be used to predict the stock market in China, a dynamic analysis of the relationship between investor sentiment and the stock market is proposed using the Thermal Optimal Path (TOP) method. The structure of this paper is as follows: Section 1 is the introduction, including a review of the literature on the relationship between investor sentiment and the stock market. Section 2 introduces the construction of the investor sentiment indicators. In Section 3, using the TOP method, a dynamic lead–lag analysis is carried out between the sentiment indicator and stock index returns for the whole market and some classical industries. Section 4 is the conclusion.

Section snippets

Original data

The original investor sentiment data come from Xueqiu, a social networking site for the China stock market whose name means snowball in English. The idea behind the name is to collect the thinking of individual investors. Xueqiu is a representative vertical financial community in China that was set up in 2011. It provides comprehensive financial services such as real-time quotes, news, investment strategy and transaction services. The users of Xueqiu are individual

Thermal optimal path (TOP) method

The thermal optimal path (TOP) method was proposed by Sornette and Zhou in 2005 to identify and quantify the time-varying lead–lag structure between two time series [19]. With the globalization of financial markets and the prevalence of quantitative trading, more and more features of complex systems have emerged in the stock market, such as nonlinearity, dynamics and self-organization. Traditional linear econometric models are not suitable for studying such a complex system. The past literature which
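
To make the idea of a time-varying lead–lag structure concrete, the following is a minimal sketch of a TOP-style computation, not the implementation used in this paper. It assumes a mismatch matrix of absolute differences between the two standardized series, Boltzmann weights accumulated over monotone paths, and a thermal average of the lag on each anti-diagonal. The function name `top_mean_lag`, the default `temperature` and the sign convention (a positive lag means the first series leads) are all illustrative assumptions.

```python
import numpy as np


def top_mean_lag(x, y, temperature=0.5):
    """Thermally averaged lag between two equal-length series (illustrative)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize both series so their mismatch is scale free.
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)

    # eps[i, j]: mismatch between x at time i and y at time j.
    eps = np.abs(x[:, None] - y[None, :])

    # log_g[i, j]: log of the summed Boltzmann weights of all monotone
    # paths from (0, 0) to (i, j). Working in log space avoids underflow.
    log_g = np.full((n, n), -np.inf)
    log_g[0, 0] = -eps[0, 0] / temperature
    for i in range(n):
        for j in range(n):
            if i == 0 and j == 0:
                continue
            prev = []
            if i > 0:
                prev.append(log_g[i - 1, j])
            if j > 0:
                prev.append(log_g[i, j - 1])
            if i > 0 and j > 0:
                prev.append(log_g[i - 1, j - 1])
            log_g[i, j] = np.logaddexp.reduce(prev) - eps[i, j] / temperature

    # Thermal average of the lag j - i over each anti-diagonal i + j = s.
    steps = np.arange(2 * n - 1)
    mean_lag = np.empty(len(steps))
    for s in steps:
        i = np.arange(max(0, s - n + 1), min(s, n - 1) + 1)
        j = s - i
        w = np.exp(log_g[i, j] - log_g[i, j].max())
        mean_lag[s] = np.sum((j - i) * w) / np.sum(w)
    return steps, mean_lag
```

In this sketch a low temperature makes the averaged path follow the single best-matching alignment closely, while a higher temperature smooths over noise at the cost of a less sharply resolved lag. With a synthetic pair in which the second series is a delayed copy of the first, the averaged lag away from the boundaries would be expected to settle near the delay.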

Conclusion

In this paper, an investor sentiment index is first constructed based on comment data from Xueqiu, a social networking site for the China stock market. Then the TOP method is used to study the dynamic lead–lag structure between investor sentiment and stock price for the whole market and two industries. The results show that the sentiment data do not always lead the stock price. This answers the question posed in the title: investor sentiment is not useful for predicting the stock price all the time. Only if the

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Nos. 71501175 and 71203218), Shandong Independent Innovation and Achievement Transformation Special Fund of China (2014ZZCX03302), and the Open Project of Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences.
