DOI: 10.1145/3397271.3401427
research-article

Knowledge Graph-based Event Embedding Framework for Financial Quantitative Investments

Published: 25 July 2020

ABSTRACT

Event representation learning aims to embed news events into continuous vector spaces that capture syntactic and semantic information from a text corpus, which benefits event-driven quantitative investment. However, the financial market's reaction to events is also influenced by the lead-lag effect, which is driven by internal relationships. In this paper, we therefore present a knowledge graph-based event embedding framework for quantitative investment. In particular, we first extract structured events from raw text and simultaneously construct a knowledge graph from the mentioned entities and relations. We then use a joint model to merge the knowledge graph information into the objective function of an event embedding learning model. The learned representations are fed as inputs to downstream quantitative trading methods. Extensive experiments on a real-world dataset demonstrate the effectiveness of the event embeddings learned from financial news and knowledge graphs. We also deploy the framework for quantitative algorithmic trading. The accumulated portfolio return contributed by our method significantly outperforms the baselines.

Published in

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN: 9781450380164
DOI: 10.1145/3397271

Copyright © 2020 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
