
Textual analysis of stock market prediction using breaking financial news: The AZFin text system

Authors:
Robert P. Schumaker, Iona College, New Rochelle, NY
Hsinchun Chen, University of Arizona, Tucson, AZ

ACM Transactions on Information Systems, Volume 27, Issue 2, Article No. 12, pp. 1–19
https://doi.org/10.1145/1462198.1462204

Published: 09 March 2009

Abstract

Our research examines a predictive machine learning approach to financial news article analysis using several different textual representations: bag of words, noun phrases, and named entities. Through this approach, we investigated 9,211 financial news articles and 10,259,042 stock quotes covering the S&P 500 stocks during a five-week period. We applied our analysis to estimate a discrete stock price twenty minutes after a news article was released. Using a support vector machine (SVM) derivative specially tailored for discrete numeric prediction and models containing different stock-specific variables, we show that the model containing both article terms and stock price at the time of article release performed best on three metrics: closeness to the actual future stock price (MSE 0.04261), agreement with the direction of the future price movement (57.1% directional accuracy), and return from a simulated trading engine (2.06% return). We further investigated the different textual representations and found that a Proper Noun scheme performs better than the de facto standard of Bag of Words in all three metrics.
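The first two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the sample prices are hypothetical, and directional accuracy is assumed here to mean that the predicted move and the actual move, both measured from the price at article release, share the same sign. The simulated trading engine is omitted because the abstract does not describe its rules.

```python
from typing import Sequence


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average squared gap between predicted and actual future prices."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def sign(x: float) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(
    release: Sequence[float],
    predicted: Sequence[float],
    actual: Sequence[float],
) -> float:
    """Share of articles whose predicted move has the same sign as the actual move.

    Assumption: both moves are measured relative to the price at article release.
    """
    if not (len(release) == len(predicted) == len(actual)) or not release:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        sign(p - r) == sign(a - r)
        for r, p, a in zip(release, predicted, actual)
    )
    return hits / len(release)


# Hypothetical prices for four articles (not data from the study).
release = [10.00, 20.00, 30.00, 40.00]    # price when the article appeared
predicted = [10.10, 19.90, 30.20, 40.10]  # model estimate for +20 minutes
actual = [10.20, 20.10, 30.10, 39.90]     # observed price at +20 minutes

mse = mean_squared_error(predicted, actual)
acc = directional_accuracy(release, predicted, actual)
print(f"MSE: {mse:.5f}, directional accuracy: {acc:.1%}")
```

In this toy data the prediction gets the direction right for two of the four articles, which shows why closeness (MSE) and direction are reported separately: a prediction can be numerically close to the future price and still point the wrong way.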

Reviews

Reviewer: Jonathan P. E. Hodgson

"Information from quarterly reports or breaking news stories can dramatically affect the share price of a security." Previous attempts to use machine learning techniques to exploit such information to predict price movements have relied on using a pre-identified set of keywords. Here, Schumaker and Chen experiment with using other linguistic elements for prediction, specifically bags of words, noun phrases, and named entities. Their system first extracts the attributes from news articles, and then uses various models for prediction: a regression model and "three models [that] use supervised learning of support vector machines (SVM) regression." The first of these models uses only the terms extracted from the article; the second model uses both "terms and the stock price at the time the article was released"; and the third uses the "terms and a regressed estimate of the [future] stock price." In all cases, the future meant 20 minutes later. For each of these models, each of the three different entities was used, giving 12 different prediction systems.

The experiments were performed using data for the period of October 26th to November 28th, 2005. The authors found that the second model (the one using terms and the current stock price) performed best in all cases. Noun phrases performed best in predicting direction, whereas named entities gave better results when closeness of prediction was sought. Schumaker and Chen performed additional experiments, employing a representation that used noun phrases tagged as proper nouns, a hybrid of noun phrases and named entities. This model had the best performance. It seems to be worth exploring the degree to which this insight applies to other systems that analyze text.

Online Computing Reviews Service

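The experimental grid described in the review (four models crossed with three textual representations) and the input to the second model (article terms plus the price at release) can be sketched as follows. This is an illustration under stated assumptions, not the AZFinText implementation: the model and representation labels, the tokenizer, the binary term encoding, the sample headlines, and the sample price are all hypothetical, and the SVM regression step itself is left out.

```python
import re
from itertools import product
from typing import Iterable, List, Optional, Set

# Labels are hypothetical names for the models and representations in the review.
MODELS = (
    "regression",
    "svm_terms_only",
    "svm_terms_plus_release_price",
    "svm_terms_plus_regressed_estimate",
)
REPRESENTATIONS = ("bag_of_words", "noun_phrases", "named_entities")

# Four models x three representations = the 12 prediction systems.
systems = list(product(MODELS, REPRESENTATIONS))


def bag_of_words(text: str) -> Set[str]:
    """Assumed tokenizer: lowercase alphabetic tokens, duplicates dropped."""
    return set(re.findall(r"[a-z]+", text.lower()))


def build_vocabulary(articles: Iterable[str]) -> List[str]:
    """Sorted list of every term seen in the article collection."""
    return sorted(set().union(*(bag_of_words(a) for a in articles)))


def feature_vector(
    text: str,
    vocabulary: List[str],
    release_price: Optional[float] = None,
) -> List[float]:
    """Binary term indicators (an assumption), optionally followed by the
    stock price at article release, as in the second model."""
    terms = bag_of_words(text)
    vector = [1.0 if term in terms else 0.0 for term in vocabulary]
    if release_price is not None:
        vector.append(release_price)
    return vector


# Hypothetical headlines and price, used only to show the shapes involved.
articles = ["Schwab to miss forecast", "Schwab fined by NYSE"]
vocabulary = build_vocabulary(articles)
terms_only = feature_vector(articles[0], vocabulary)
terms_and_price = feature_vector(articles[0], vocabulary, release_price=15.65)

print(len(systems), vocabulary)
print(terms_and_price)
```

Vectors like `terms_and_price` would then be paired with the observed price 20 minutes later as the regression target. Swapping `bag_of_words` for a noun-phrase or named-entity extractor would change only the term set, not the rest of the pipeline.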
Published in

ACM Transactions on Information Systems Volume 27, Issue 2
February 2009
184 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/1462198

Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 March 2009
- Accepted: 1 September 2008
- Revised: 1 May 2008
- Received: 1 May 2006
Author Tags
SVM
prediction
stock market
Qualifiers
- research-article
- Research
- Refereed